diff --git a/docs/blog/2021/02.01-happy-anniversary-gardener/index.html b/docs/blog/2021/02.01-happy-anniversary-gardener/index.html index 8bd9e886070..5fa4d3f5cec 100644 --- a/docs/blog/2021/02.01-happy-anniversary-gardener/index.html +++ b/docs/blog/2021/02.01-happy-anniversary-gardener/index.html @@ -154,7 +154,7 @@ environments (as a service) for all teams within the company. Later that same year, SAP also joined the CNCF as a platinum member.

We first deliberated intensively on the BUY options (including acquisitions, due to the size and estimated volume needed at SAP). There were some early products from commercial vendors and startups available that did not bind exclusively to one of the hyperscalers, but these products did not cover many of our crucial and immediate requirements for a multi-cloud environment.

Ultimately, we opted to BUILD ourselves. This decision was not made lightly, because right from the start, we knew that we would have to cover thousands of clusters, across the globe, on all kinds of infrastructures. We would have to be able to create them at scale as well as manage them 24x7. And thus, we predicted the need to invest in automation of all aspects, to keep the service TCO at a minimum, and to offer an enterprise-worthy SLA early on. This endeavor grew into the Gardener project, launched first internally and, once all checks were fulfilled, externally as open source. Its mission statement, in a nutshell, is “Universal Kubernetes at scale”. Now, that’s quite bold. But we also had a nifty innovation that helped us tremendously along the way. And we can openly reveal the secret here: Gardener was built not only for creating Kubernetes at scale, but also (recursively) in Kubernetes itself.

What Do You Get with Gardener?

Gardener offers managed and homogeneous Kubernetes clusters on IaaS providers like AWS, Azure, GCP, AliCloud, Open Telekom Cloud, SCS, OVH and more, but also covers versatile infrastructures like OpenStack, VMware or bare metal. Day-1 and Day-2 operations are an integral part of a cluster’s feature set. This means that Gardener is not only capable of provisioning or de-provisioning thousands of clusters, but also of monitoring your cluster’s health state, upgrading components in a rolling fashion, or scaling the control plane as well as worker nodes up and down depending on the current resource demand.

Some features mentioned above might sound familiar to you, simply because they’re squarely derived from Kubernetes. Concretely, if you explore a Gardener-managed end-user cluster, you’ll never see the so-called “control plane components” (Kube-Apiserver, Kube-Controller-Manager, Kube-Scheduler, etc.). The reason is that they run as Pods inside another, hosting/seeding Kubernetes cluster. Speaking in Gardener terms, the latter is called a Seed cluster, and the end-user cluster is called a Shoot cluster; and thus the botanical naming scheme for Gardener was born. Further assets like infrastructure components or worker machines are modelled as managed Kubernetes objects too. This allows Gardener to leverage all the great and production-proven features of Kubernetes for managing Kubernetes clusters. Our blog post on Kubernetes.io reveals more details about the architectural refinements.

Figure 1: Gardener architecture overview

End-users directly benefit from Gardener’s recursive architecture. Many of the requirements that we identified for the Gardener service turned out to be highly convenient for shoot owners. For instance, Seed clusters are usually equipped with DNS and x509 services. At the same time, these service offerings can be extended to requests coming from the Shoot clusters, i.e., end-users get domain names and certificates for their applications out of the box.

Recognizing the Power of Open Source

The Gardener team immediately profited from open source: from Kubernetes obviously, and all its ecosystem projects. That all facilitated our project’s very fast and robust development. But it does not answer:

“Why would SAP open source a tool that clearly solves a monetizable enterprise requirement?”

Short spoiler alert: it initially involved a leap of faith. If we just look at our own decision path, it is undeniable that developers, and with them entire industries, gravitate towards open source. We chose Linux, Containers, and Kubernetes exactly because they are open, and we could bet on network effects, especially around skills. The same decision process is currently replicated in thousands of companies, with the same results. Why? Because all companies are digitally transforming. They are becoming software companies as well to a certain extent. Many of them are also our customers and in many discussions, we recognized that they have the same challenges that we are solving with Gardener. This, in essence, was a key eye opener. We were confident that if we developed Gardener as open source, we’d not only seize the opportunity to shape a Kubernetes management tool that finds broad interest and adoption outside of our use case at SAP, but we could solve common challenges faster with the help of a community, and that in consequence would sustain continuous feature development.

Coincidentally, that was also when the SAP Open Source Program Office (OSPO) was launched. It supported us in making a case to develop Gardener completely as open source. -Today, we can witness that this strategy has unfolded. It opened the gates not only for adoption, but for co-innovation, investment security, and user feedback directly in code. Below you can see an example of how the Gardener project benefits from this external community power as contributions are submitted right away.

Figure 2: Example immediate community contribution

Differentiating Gardener from Other Kubernetes Management Solutions

Imagine that you have created a modern solid cloud native app or service, fully scalable, in containers. And the business case requires you to run the service on multiple clouds, like AWS, AliCloud, Azure, … maybe even on-premises like OpenStack or VMware. Your development team has done everything to ensure that the workload is highly portable. But they would need to qualify each providers’ managed Kubernetes offering and their custom Bill-of-Material (BoM), their versions, their deprecation plan, roadmap etc. Your TCD would explode and this is exactly what teams at SAP experienced. Now, with Gardener you can, instead, roll out homogeneous clusters and stay in control of your versions and a single roadmap. Across all supported providers!

Also, teams that have serious, or say, more demanding workloads running on Kubernetes will come to the same conclusion: They require the full management control of the Kubernetes underlay. Not only that, they need access, visibility, and all the tuning options for the control plane to safeguard their service. This is a conclusion not only from teams at SAP, but also from our community members, like PingCap, who use Gardener to serve TiDB Cloud service. Whenever you need to get serious and need more than one or two clusters, Gardener is your friend.

Who Is Using Gardener?

Well, there is SAP itself of course, but also the number of Gardener adopters and companies interested in Gardener is growing (~1700 GitHub stars), as more are challenged by multi-cluster and multi-cloud requirements.

Flant, PingCap, StackIT, T-Systems, Sky, or b’nerd are among these companies, to name a few. They use Gardener to either run products they sell on top or offer managed Kubernetes clusters directly to their clients, or even only components that are re-usable from Gardener.

An interesting journey in the open source space started with Finanz Informatik Technologie Service (FI-TS), an European Central Bank regulated and certified hoster for banks. They operate in very restricted environments, as you can imagine, and as such, they re-designed their datacenter for cloud native workloads from scratch, that is from cabling, racking and stacking to an API that serves bare metal servers. +Today, we can witness that this strategy has unfolded. It opened the gates not only for adoption, but for co-innovation, investment security, and user feedback directly in code. Below you can see an example of how the Gardener project benefits from this external community power as contributions are submitted right away.

Figure 2: Example immediate community contribution

Differentiating Gardener from Other Kubernetes Management Solutions

Imagine that you have created a modern solid cloud native app or service, fully scalable, in containers. And the business case requires you to run the service on multiple clouds, like AWS, AliCloud, Azure, … maybe even on-premises like OpenStack or VMware. Your development team has done everything to ensure that the workload is highly portable. But they would need to qualify each providers’ managed Kubernetes offering and their custom Bill-of-Material (BoM), their versions, their deprecation plan, roadmap etc. Your TCD would explode and this is exactly what teams at SAP experienced. Now, with Gardener you can, instead, roll out homogeneous clusters and stay in control of your versions and a single roadmap. Across all supported providers!

Also, teams that have serious, or say, more demanding workloads running on Kubernetes will come to the same conclusion: They require the full management control of the Kubernetes underlay. Not only that, they need access, visibility, and all the tuning options for the control plane to safeguard their service. This is a conclusion not only from teams at SAP, but also from our community members, like PingCap, who use Gardener to serve TiDB Cloud service. Whenever you need to get serious and need more than one or two clusters, Gardener is your friend.

Who Is Using Gardener?

Well, there is SAP itself of course, but also the number of Gardener adopters and companies interested in Gardener is growing (~1700 GitHub stars), as more are challenged by multi-cluster and multi-cloud requirements.

Flant, PingCap, STACKIT, T-Systems, Sky, or b’nerd are among these companies, to name a few. They use Gardener to either run products they sell on top or offer managed Kubernetes clusters directly to their clients, or even only components that are re-usable from Gardener.

An interesting journey in the open source space started with Finanz Informatik Technologie Service (FI-TS), an European Central Bank regulated and certified hoster for banks. They operate in very restricted environments, as you can imagine, and as such, they re-designed their datacenter for cloud native workloads from scratch, that is from cabling, racking and stacking to an API that serves bare metal servers. For Kubernetes-as-a-Service, they evaluated and chose Gardener because it was open and a perfect candidate. With Gardener’s extension capabilities, it was possible to bring managed Kubernetes clusters to their very own bare metal stack, metal-stack.io. Of course, this meant implementation effort. But by reusing the Gardener project, FI-TS was able to leverage our standard with minimal adjustments for their special use-case. Subsequently, with their contributions, SAP was able to make Gardener more open for the community.

Full Speed Ahead with the Community in 2021

Some of the current and most active topics are about the installer (Landscaper), control plane migration, diff --git a/docs/blog/2021/_print/index.html b/docs/blog/2021/_print/index.html index ff9b7daeaaf..c63d99f7f69 100644 --- a/docs/blog/2021/_print/index.html +++ b/docs/blog/2021/_print/index.html @@ -14,7 +14,7 @@ environments (as a service) for all teams within the company. Later that same year, SAP also joined the CNCF as a platinum member.

We first deliberated intensively on the BUY options (including acquisitions, due to the size and estimated volume needed at SAP). There were some early products from commercial vendors and startups available that did not bind exclusively to one of the hyperscalers, but these products did not cover many of our crucial and immediate requirements for a multi-cloud environment.

Ultimately, we opted to BUILD ourselves. This decision was not made lightly, because right from the start, we knew that we would have to cover thousands of clusters, across the globe, on all kinds of infrastructures. We would have to be able to create them at scale as well as manage them 24x7. And thus, we predicted the need to invest in automation of all aspects, to keep the service TCO at a minimum, and to offer an enterprise-worthy SLA early on. This endeavor grew into the Gardener project, launched first internally and, once all checks were fulfilled, externally as open source. Its mission statement, in a nutshell, is “Universal Kubernetes at scale”. Now, that’s quite bold. But we also had a nifty innovation that helped us tremendously along the way. And we can openly reveal the secret here: Gardener was built not only for creating Kubernetes at scale, but also (recursively) in Kubernetes itself.

What Do You Get with Gardener?

Gardener offers managed and homogeneous Kubernetes clusters on IaaS providers like AWS, Azure, GCP, AliCloud, Open Telekom Cloud, SCS, OVH and more, but also covers versatile infrastructures like OpenStack, VMware or bare metal. Day-1 and Day-2 operations are an integral part of a cluster’s feature set. This means that Gardener is not only capable of provisioning or de-provisioning thousands of clusters, but also of monitoring your cluster’s health state, upgrading components in a rolling fashion, or scaling the control plane as well as worker nodes up and down depending on the current resource demand.

Some features mentioned above might sound familiar to you, simply because they’re squarely derived from Kubernetes. Concretely, if you explore a Gardener-managed end-user cluster, you’ll never see the so-called “control plane components” (Kube-Apiserver, Kube-Controller-Manager, Kube-Scheduler, etc.). The reason is that they run as Pods inside another, hosting/seeding Kubernetes cluster. Speaking in Gardener terms, the latter is called a Seed cluster, and the end-user cluster is called a Shoot cluster; and thus the botanical naming scheme for Gardener was born. Further assets like infrastructure components or worker machines are modelled as managed Kubernetes objects too. This allows Gardener to leverage all the great and production-proven features of Kubernetes for managing Kubernetes clusters. Our blog post on Kubernetes.io reveals more details about the architectural refinements.

Figure 1: Gardener architecture overview

End-users directly benefit from Gardener’s recursive architecture. Many of the requirements that we identified for the Gardener service turned out to be highly convenient for shoot owners. For instance, Seed clusters are usually equipped with DNS and x509 services. At the same time, these service offerings can be extended to requests coming from the Shoot clusters, i.e., end-users get domain names and certificates for their applications out of the box.

Recognizing the Power of Open Source

The Gardener team immediately profited from open source: from Kubernetes obviously, and all its ecosystem projects. That all facilitated our project’s very fast and robust development. But it does not answer:

“Why would SAP open source a tool that clearly solves a monetizable enterprise requirement?”

Short spoiler alert: it initially involved a leap of faith. If we just look at our own decision path, it is undeniable that developers, and with them entire industries, gravitate towards open source. We chose Linux, Containers, and Kubernetes exactly because they are open, and we could bet on network effects, especially around skills. The same decision process is currently replicated in thousands of companies, with the same results. Why? Because all companies are digitally transforming. They are becoming software companies as well to a certain extent. Many of them are also our customers and in many discussions, we recognized that they have the same challenges that we are solving with Gardener. This, in essence, was a key eye opener. We were confident that if we developed Gardener as open source, we’d not only seize the opportunity to shape a Kubernetes management tool that finds broad interest and adoption outside of our use case at SAP, but we could solve common challenges faster with the help of a community, and that in consequence would sustain continuous feature development.

Coincidentally, that was also when the SAP Open Source Program Office (OSPO) was launched. It supported us in making a case to develop Gardener completely as open source. -Today, we can witness that this strategy has unfolded. It opened the gates not only for adoption, but for co-innovation, investment security, and user feedback directly in code. Below you can see an example of how the Gardener project benefits from this external community power as contributions are submitted right away.

Figure 2: Example immediate community contribution

Differentiating Gardener from Other Kubernetes Management Solutions

Imagine that you have created a modern solid cloud native app or service, fully scalable, in containers. And the business case requires you to run the service on multiple clouds, like AWS, AliCloud, Azure, … maybe even on-premises like OpenStack or VMware. Your development team has done everything to ensure that the workload is highly portable. But they would need to qualify each providers’ managed Kubernetes offering and their custom Bill-of-Material (BoM), their versions, their deprecation plan, roadmap etc. Your TCD would explode and this is exactly what teams at SAP experienced. Now, with Gardener you can, instead, roll out homogeneous clusters and stay in control of your versions and a single roadmap. Across all supported providers!

Also, teams that have serious, or say, more demanding workloads running on Kubernetes will come to the same conclusion: They require the full management control of the Kubernetes underlay. Not only that, they need access, visibility, and all the tuning options for the control plane to safeguard their service. This is a conclusion not only from teams at SAP, but also from our community members, like PingCap, who use Gardener to serve TiDB Cloud service. Whenever you need to get serious and need more than one or two clusters, Gardener is your friend.

Who Is Using Gardener?

Well, there is SAP itself of course, but also the number of Gardener adopters and companies interested in Gardener is growing (~1700 GitHub stars), as more are challenged by multi-cluster and multi-cloud requirements.

Flant, PingCap, StackIT, T-Systems, Sky, or b’nerd are among these companies, to name a few. They use Gardener to either run products they sell on top or offer managed Kubernetes clusters directly to their clients, or even only components that are re-usable from Gardener.

An interesting journey in the open source space started with Finanz Informatik Technologie Service (FI-TS), an European Central Bank regulated and certified hoster for banks. They operate in very restricted environments, as you can imagine, and as such, they re-designed their datacenter for cloud native workloads from scratch, that is from cabling, racking and stacking to an API that serves bare metal servers. +Today, we can witness that this strategy has unfolded. It opened the gates not only for adoption, but for co-innovation, investment security, and user feedback directly in code. Below you can see an example of how the Gardener project benefits from this external community power as contributions are submitted right away.

Figure 2: Example immediate community contribution

Differentiating Gardener from Other Kubernetes Management Solutions

Imagine that you have created a modern solid cloud native app or service, fully scalable, in containers. And the business case requires you to run the service on multiple clouds, like AWS, AliCloud, Azure, … maybe even on-premises like OpenStack or VMware. Your development team has done everything to ensure that the workload is highly portable. But they would need to qualify each providers’ managed Kubernetes offering and their custom Bill-of-Material (BoM), their versions, their deprecation plan, roadmap etc. Your TCD would explode and this is exactly what teams at SAP experienced. Now, with Gardener you can, instead, roll out homogeneous clusters and stay in control of your versions and a single roadmap. Across all supported providers!

Also, teams that have serious, or say, more demanding workloads running on Kubernetes will come to the same conclusion: They require the full management control of the Kubernetes underlay. Not only that, they need access, visibility, and all the tuning options for the control plane to safeguard their service. This is a conclusion not only from teams at SAP, but also from our community members, like PingCap, who use Gardener to serve TiDB Cloud service. Whenever you need to get serious and need more than one or two clusters, Gardener is your friend.

Who Is Using Gardener?

Well, there is SAP itself of course, but also the number of Gardener adopters and companies interested in Gardener is growing (~1700 GitHub stars), as more are challenged by multi-cluster and multi-cloud requirements.

Flant, PingCap, STACKIT, T-Systems, Sky, or b’nerd are among these companies, to name a few. They use Gardener to either run products they sell on top or offer managed Kubernetes clusters directly to their clients, or even only components that are re-usable from Gardener.

An interesting journey in the open source space started with Finanz Informatik Technologie Service (FI-TS), an European Central Bank regulated and certified hoster for banks. They operate in very restricted environments, as you can imagine, and as such, they re-designed their datacenter for cloud native workloads from scratch, that is from cabling, racking and stacking to an API that serves bare metal servers. For Kubernetes-as-a-Service, they evaluated and chose Gardener because it was open and a perfect candidate. With Gardener’s extension capabilities, it was possible to bring managed Kubernetes clusters to their very own bare metal stack, metal-stack.io. Of course, this meant implementation effort. But by reusing the Gardener project, FI-TS was able to leverage our standard with minimal adjustments for their special use-case. Subsequently, with their contributions, SAP was able to make Gardener more open for the community.

Full Speed Ahead with the Community in 2021

Some of the current and most active topics are about the installer (Landscaper), control plane migration, diff --git a/docs/blog/2021/index.xml b/docs/blog/2021/index.xml index 82092eb5d4e..39f8774f544 100644 --- a/docs/blog/2021/index.xml +++ b/docs/blog/2021/index.xml @@ -37,7 +37,7 @@ Today, we can witness that this strategy has unfolded. It opened the gates not o <p>Also, teams that have serious, or say, more demanding workloads running on Kubernetes will come to the same conclusion: They require the full management control of the Kubernetes underlay. Not only that, they need access, visibility, and all the tuning options for the control plane to safeguard their service. This is a conclusion not only from teams at SAP, but also from our community members, like <em>PingCap</em>, who use Gardener to serve <em>TiDB Cloud service</em>. Whenever you need to get serious and need more than one or two clusters, Gardener is your friend.</p> <h2 id="who-is-using-gardener">Who Is Using Gardener?</h2> <p>Well, there is SAP itself of course, but also the number of Gardener adopters and companies interested in Gardener is growing (~1700 GitHub stars), as more are challenged by multi-cluster and multi-cloud requirements.</p> -<p><em>Flant, PingCap, StackIT, T-Systems, Sky</em>, or <em>b’nerd</em> are among these companies, to name a few. They use Gardener to either run products they sell on top or offer managed Kubernetes clusters directly to their clients, or even only components that are re-usable from Gardener.</p> +<p><em>Flant, PingCap, STACKIT, T-Systems, Sky</em>, or <em>b’nerd</em> are among these companies, to name a few. 
They use Gardener to either run products they sell on top or offer managed Kubernetes clusters directly to their clients, or even only components that are re-usable from Gardener.</p> <p>An interesting journey in the open source space started with <em>Finanz Informatik Technologie Service (FI-TS)</em>, an European Central Bank regulated and certified hoster for banks. They operate in very restricted environments, as you can imagine, and as such, they re-designed their datacenter for cloud native workloads from scratch, that is from cabling, racking and stacking to an API that serves bare metal servers. For Kubernetes-as-a-Service, they evaluated and chose Gardener because it was open and a perfect candidate. With Gardener’s extension capabilities, it was possible to bring managed Kubernetes clusters to their very own bare metal stack, <a href="https://metal-stack.io/">metal-stack.io</a>. Of course, this meant implementation effort. But by reusing the Gardener project, <em>FI-TS</em> was able to leverage our standard with minimal adjustments for their special use-case. Subsequently, with their contributions, SAP was able to make Gardener more open for the community.</p> diff --git a/docs/blog/2024/04-05-kubecon-cloudnativecon-europe-2024-highlights/index.html b/docs/blog/2024/04-05-kubecon-cloudnativecon-europe-2024-highlights/index.html index 139ce57bf31..bcfe09a6545 100644 --- a/docs/blog/2024/04-05-kubecon-cloudnativecon-europe-2024-highlights/index.html +++ b/docs/blog/2024/04-05-kubecon-cloudnativecon-europe-2024-highlights/index.html @@ -139,7 +139,7 @@


KubeCon / CloudNativeCon Europe 2024 Highlights

KubeCon EU 2024 Keynote Room

KubeCon + CloudNativeCon Europe 2024, recently held in Paris, was a testament to the robustness of the open-source community and its pivotal role in driving advancements in AI and cloud-native technologies. With a record attendance of over 12,000 participants, the conference underscored the ubiquity of cloud-native architectures and the business opportunities they provide.

AI Everywhere

LLMs and GenAI took center stage at the event, with discussions on challenges such as security, data management, and energy consumption. A popular quote stated, “If #inference is the new web application, #kubernetes is the new web server”. The conference emphasized the need for more open data models for AI to democratize the technology. Cloud-native platforms offer advantages for AI innovation, such as packaging models and dependencies as Docker packages and enhancing resource management for proper model execution. The community is exploring AI workload management, including using CPUs for inferencing and preprocessing data before handing it over to GPUs. CNCF took the initiative and put together an AI whitepaper outlining the apparent synergy between cloud-native technologies and AI.

Cluster Autopilot

The conference showcased popular projects in the cloud-native ecosystem, including Kubernetes, Istio, and OpenTelemetry. Kubernetes was highlighted as a platform for running massive AI workloads. The UXL Foundation aims to enable multi-vendor AI workloads on Kubernetes, allowing developers to move AI workloads without being locked into a specific infrastructure. Every vendor we interacted with has assembled an AI-powered chatbot, which performs various functions – from assessing cluster health through analyzing cost efficiency and proposing workload optimizations to troubleshooting issues and alerting for potential challenges with upcoming Kubernetes version upgrades. Sysdig went even further with a chatbot, which answers the popular question, “Do any of my products have critical CVEs in production?” and analyzes workloads’ structure and configuration. Some chatbots leveraged the k8sgpt project, which joined the CNCF sandbox earlier this year.

Sophisticated Fleet Management

The ecosystem showcased maturity in observability, platform engineering, security, and optimization, which will help operationalize AI workloads. Data demands and costs were also in focus, touching on data observability and cloud-cost management. Cloud-native technologies, also going beyond Kubernetes, are expected to play a crucial role in managing the increasing volume of data and scaling AI. Google showcased fleet management in their Google Hosted Cloud offering (ex-Anthos). It allows for defining teams and policies at the fleet level, later applied to all the Kubernetes clusters in the fleet, irrespective of the infrastructure they run on (GCP and beyond).

WASM Everywhere

The conference also highlighted the growing interest in WebAssembly (WASM) as a portable binary instruction format for executable programs and its integration with Kubernetes and other functions. The topic here started with a dedicated WASM pre-conference day, the sessions of which are available in the following playlist. WASM is positioned as the smoother approach to software distribution and modularity, providing more lightweight runtime execution options and an easier way for app developers to enter.

Rust on the Rise

Several talks promoted Rust as an ideal programming language for cloud-native workloads. It was even presented as suitable for writing Kubernetes controllers.

Internal Developer Platforms

The event showcased the importance of Internal Developer Platforms (IDPs), both commercial and open-source, in facilitating the development process across all types of organizations – from Allianz to Mercedes. Backstage leads the pack by a large margin, with all relevant sessions being full or at capacity. Much effort goes into the modularization of Backstage, which was also a notable highlight at the conference.

Sustainability

Sustainability was a key theme, with discussions on the role of cloud-native technologies in promoting green practices. The KubeCost application folks put a lot of effort into emphasizing the large amount of wasted money, which hyperscalers benefit from. In parallel – the kube-green project emphasized optimizing your cluster footprint to minimize CO2 emissions. The conference also highlighted the importance of open source in creating a level playing field for multiple players to compete, fostering diverse participation, and solving global challenges.

Customer Stories

In contrast to the Chicago KubeCon in 2023, the one in Paris outlined multiple case studies, best practices, and reference scenarios. Many enterprises and their IT teams were well represented at KubeCon - regarding sessions, sponsorships, and participation. These companies strive to excel forward, reaping the efficiency and flexibility benefits cloud-native architectures provide. -We came across multiple companies using Gardener as their Kubernetes management underlay – including FUGA Cloud, StackIT, and metal-stack Cloud. We eagerly anticipate more companies embracing Gardener at future events. The consistent feedback from these companies has been overwhelmingly positive—they absolutely love using Gardener and our shared excitement grows as the community thrives!

Notable Talks

Notable talks from leaders in the cloud-native world, including Solomon Hykes, Bob Wise, and representatives from KCP for Platforms and the United Nations, provided valuable insights into the future of AI and cloud-native technologies. All the talks are now uploaded to YouTube in the following playlist. Those do not include the various pre-conference days, available as separate playlists by CNCF.

In Conclusion…

In conclusion, KubeCon 2024 showcased the intersection of AI and cloud-native technologies, the open-source community’s growth, and the cloud-native ecosystem’s maturity. Many enterprises are actively engaged there, innovating, trying, and growing their internal expertise. They’re using KubeCon as a recruiting event, expanding their internal talent pool and taking more of their internal operations and processes into their own hands. The event served as a platform for global collaboration, cross-company alignments, innovation, and the exchange of ideas, setting the stage for the future of cloud-native computing.


  4 minute read  

Problem Space

Let’s discuss the problem space first. Why does anyone need something like Gardener?

Running Software

The starting point is this rather simple question: Why would you want to run some software?

Typically, software is run with a purpose and not just for the sake of running it. Whether it is a digital ledger, a company’s inventory or a blog - software provides a service to its user.

Which brings us to the way this software is being consumed. Traditionally, software has been shipped on physical / digital media to the customer or end user. There, someone had to install, configure, and operate it. In recent times, the pattern has shifted. More and more solutions are operated by the vendor or a hosting partner and sold as a service ready to be used.

But still, someone needs to install, configure, and maintain it, regardless of where it is installed. And of course, it is expected to run forever once started and to be resilient to any kind of failure.

For smaller installations, things like maintenance, scaling, debugging, or configuration can be done in a semi-automatic way. It’s probably no fun and, most importantly, only a limited number of instances can be taken care of, similar to how one would take care of a pet.

But when hosting services at scale, there is no way someone can do all this manually at acceptable costs. So we need some vehicle to easily spin up new instances, do lifecycle operations, get some basic failure resilience, and more. How can we achieve that?

Solution Space 1 - Kubernetes

Let’s start solving some of the problems described earlier with Container technology and Kubernetes.

Containers

Container technology is at the core of the solution space. A container forms a vehicle that is shippable, can easily run in any supported environment and generally adds a powerful abstraction layer to the infrastructure.

However, plain containers do not help with resilience or scaling. Therefore, we need another system for orchestration.

Orchestration

“Classical” orchestration that just follows the “notes” and moves from state A to state B doesn’t solve all of our problems. We need something else.

Kubernetes operates on the principle of “desired state”. With it, you write a construction plan, then have controllers cycle through “observe -> analyze -> act” and transition the actual state to the desired state. These reconciliations ensure that, whatever breaks, there is a path back to a healthy state.
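
The “observe -> analyze -> act” cycle can be illustrated with a minimal sketch. This is not Kubernetes code: the `State` class, the `reconcile` and `converge` functions, and the replica counter are hypothetical names chosen purely for illustration, assuming a system whose only property is a number of running instances.

```python
from dataclasses import dataclass


@dataclass
class State:
    # Hypothetical stand-in for a system state: a number of running instances.
    replicas: int


def reconcile(desired: State, actual: State) -> str:
    """Run one observe -> analyze -> act cycle and return the action taken."""
    # Observe: `actual` holds what is currently running.
    # Analyze: compare it with the desired state.
    diff = desired.replicas - actual.replicas
    # Act: take one small step towards the desired state.
    if diff > 0:
        actual.replicas += 1
        return "scale-up"
    if diff < 0:
        actual.replicas -= 1
        return "scale-down"
    return "noop"


def converge(desired: State, actual: State, max_cycles: int = 100) -> list:
    """Repeat the cycle until nothing is left to do; return the actions taken."""
    actions = []
    for _ in range(max_cycles):
        action = reconcile(desired, actual)
        if action == "noop":
            break
        actions.append(action)
    return actions


if __name__ == "__main__":
    desired = State(replicas=3)
    actual = State(replicas=1)
    print(converge(desired, actual))

    # Simulate a failure: all instances disappear. The same loop repairs it.
    actual.replicas = 0
    print(converge(desired, actual))
```

The loop never needs to know what went wrong. It only compares the two states, which is why the same code handles the initial rollout and the recovery after a failure.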

Summary

Containers (famously brought to the mainstream as “Docker”) and Kubernetes are the ingredients of a fundamental shift in IT. Similar to how the Operating System layer enabled the decoupling of software and hardware, container-related technologies provide an abstract interface to any kind of infrastructure platform for the next-generation of applications.

Solution Space 2 - Gardener

operating-apps

So, Kubernetes solves a lot of problems. But how do you get a Kubernetes cluster?

Either:

Essentially, it was a “make or buy” decision that led to the founding of Gardener.

The Reason Why We Choose to “Make It”

Gardener allows you to run Kubernetes clusters on various hyperscalers. It offers the same set of basic configuration options independent of the chosen infrastructure. This kind of harmonization supports any multi-vendor strategy while reducing adoption costs for the individual teams. Just imagine having to deal with multiple vendors all offering vastly different Kubernetes clusters.

Of course, there are plenty more reasons - from acquiring operational knowledge to having influence on the developed features - that made the pendulum swing towards “make it”.

What exactly is Gardener?

universal-kubernetes

Gardener is a system to manage Kubernetes clusters. It is driven by the same “desired state” pattern as Kubernetes itself. In fact, it is using Kubernetes to run Kubernetes.

A user may “desire” clusters with a specific configuration on infrastructures such as GCP, AWS, Azure, Alicloud, OpenStack, vSphere, … and Gardener will make sure to create such a cluster and keep it running.
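
To make the idea of “desiring” clusters more tangible, here is a toy sketch of a fleet-level reconciliation. It does not use Gardener’s actual API or resource types: `ClusterSpec`, `Fleet`, and `reconcile_fleet` are hypothetical names, the provider is a plain string, and “creating” a cluster is just an entry in an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ClusterSpec:
    # Hypothetical description of a desired cluster (not a Gardener resource).
    name: str
    provider: str  # e.g. "aws" or "gcp"; plain strings for illustration
    workers: int


@dataclass
class Fleet:
    # In-memory stand-in for the clusters that actually exist: name -> spec.
    running: Dict[str, ClusterSpec] = field(default_factory=dict)


def reconcile_fleet(desired: List[ClusterSpec], fleet: Fleet) -> List[str]:
    """Bring the fleet in line with the desired specs; return the actions taken."""
    actions = []
    wanted = {spec.name: spec for spec in desired}

    for name, spec in wanted.items():
        current = fleet.running.get(name)
        if current is None:
            # The cluster is missing (never created, or lost): create it.
            fleet.running[name] = spec
            actions.append(f"create {name} on {spec.provider}")
        elif current != spec:
            # The cluster exists but drifted from its spec: update it.
            fleet.running[name] = spec
            actions.append(f"update {name}")

    for name in list(fleet.running):
        if name not in wanted:
            # Nobody desires this cluster any more: remove it.
            del fleet.running[name]
            actions.append(f"delete {name}")

    return actions


if __name__ == "__main__":
    fleet = Fleet()
    desired = [
        ClusterSpec(name="dev", provider="aws", workers=2),
        ClusterSpec(name="prod", provider="gcp", workers=5),
    ]
    print(reconcile_fleet(desired, fleet))  # first run creates both clusters
    print(reconcile_fleet(desired, fleet))  # second run finds nothing to do
```

The same specification shape is used regardless of the provider, which mirrors the harmonization argument made earlier: the user states what is wanted, and the reconciliation decides how to get there.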

If you take this rather simplistic principle of reconciliation and add the feature-richness of Gardener to it, you end up with universal Kubernetes at scale.

Whether you need fleet management at minimal TCO or are looking for a highly customizable control plane, we have it all.

On top of that, Gardener-managed Kubernetes clusters fulfill the conformance standard set out by the CNCF and we submit our test results for certification.

Have a look at the CNCF map for more information or dive into the testgrid directly.

Gardener itself is open-source. Under the umbrella of github.com/gardener we develop the core functionalities as well as the extensions, and you are welcome to contribute (by opening issues, filing feature requests, or submitting code).

Last time we counted, there were already 131 projects. That’s actually more projects than members of the organization.

As of today, Gardener is mainly developed by SAP employees and SAP is an “adopter” as well, among StackIT, Telekom, Finanz Informatik Technologie Services GmbH and others. For a full list of adopters, see the Adopters page.


  4 minute read  

Problem Space

Let’s discuss the problem space first. Why does anyone need something like Gardener?

Running Software

The starting point is this rather simple question: Why would you want to run some software?

Typically, software is run with a purpose and not just for the sake of running it. Whether it is a digital ledger, a company’s inventory or a blog - software provides a service to its user.

Which brings us to the way this software is being consumed. Traditionally, software has been shipped on physical / digital media to the customer or end user. There, someone had to install, configure, and operate it. In recent times, the pattern has shifted. More and more solutions are operated by the vendor or a hosting partner and sold as a service ready to be used.

But still, someone needs to install, configure, and maintain it, regardless of where it is installed. And of course, it is expected to run forever once started and to be resilient to any kind of failure.

For smaller installations, things like maintenance, scaling, debugging, or configuration can be done in a semi-automatic way. It’s probably no fun and, most importantly, only a limited number of instances can be taken care of, similar to how one would take care of a pet.

But when hosting services at scale, there is no way someone can do all this manually at acceptable costs. So we need some vehicle to easily spin up new instances, do lifecycle operations, get some basic failure resilience, and more. How can we achieve that?

Solution Space 1 - Kubernetes

Let’s start solving some of the problems described earlier with Container technology and Kubernetes.

Containers

Container technology is at the core of the solution space. A container forms a vehicle that is shippable, can easily run in any supported environment and generally adds a powerful abstraction layer to the infrastructure.

However, plain containers do not help with resilience or scaling. Therefore, we need another system for orchestration.

Orchestration

“Classical” orchestration that just follows the “notes” and moves from state A to state B doesn’t solve all of our problems. We need something else.

Kubernetes operates on the principle of “desired state”. With it, you write a construction plan, then have controllers cycle through “observe -> analyze -> act” and transition the actual state to the desired state. These reconciliations ensure that, whatever breaks, there is a path back to a healthy state.

Summary

Containers (famously brought to the mainstream as “Docker”) and Kubernetes are the ingredients of a fundamental shift in IT. Similar to how the Operating System layer enabled the decoupling of software and hardware, container-related technologies provide an abstract interface to any kind of infrastructure platform for the next-generation of applications.

Solution Space 2 - Gardener

operating-apps

So, Kubernetes solves a lot of problems. But how do you get a Kubernetes cluster?

Either:

Essentially, it was a “make or buy” decision that led to the founding of Gardener.

The Reason Why We Choose to “Make It”

Gardener allows you to run Kubernetes clusters on various hyperscalers. It offers the same set of basic configuration options independent of the chosen infrastructure. This kind of harmonization supports any multi-vendor strategy while reducing adoption costs for the individual teams. Just imagine having to deal with multiple vendors all offering vastly different Kubernetes clusters.

Of course, there are plenty more reasons - from acquiring operational knowledge to having influence on the developed features - that made the pendulum swing towards “make it”.

What exactly is Gardener?

universal-kubernetes

Gardener is a system to manage Kubernetes clusters. It is driven by the same “desired state” pattern as Kubernetes itself. In fact, it is using Kubernetes to run Kubernetes.

A user may “desire” clusters with a specific configuration on infrastructures such as GCP, AWS, Azure, Alicloud, OpenStack, vSphere, … and Gardener will make sure to create such a cluster and keep it running.

If you take this rather simplistic principle of reconciliation and add the feature-richness of Gardener to it, you end up with universal Kubernetes at scale.

Whether you need fleet management at minimal TCO or are looking for a highly customizable control plane, we have it all.

On top of that, Gardener-managed Kubernetes clusters fulfill the conformance standard set out by the CNCF and we submit our test results for certification.

Have a look at the CNCF map for more information or dive into the testgrid directly.

Gardener itself is open-source. Under the umbrella of github.com/gardener we develop the core functionalities as well as the extensions, and you are welcome to contribute (by opening issues, filing feature requests, or submitting code).

Last time we counted, there were already 131 projects. That’s actually more projects than members of the organization.

As of today, Gardener is mainly developed by SAP employees and SAP is an “adopter” as well, among STACKIT, Telekom, Finanz Informatik Technologie Services GmbH and others. For a full list of adopters, see the Adopters page.

diff --git a/docs/docs/index.xml b/docs/docs/index.xml index 51f102e7fd8..52337d32b40 100644 --- a/docs/docs/index.xml +++ b/docs/docs/index.xml @@ -2203,7 +2203,7 @@ Prometheus itself scrapes various targets for metrics, as seen in the diagram be <p>Have a look at the <a href="https://cncf.landscape2.io/?item=platform--certified-kubernetes--installer--gardener">CNCF map</a> for more information or dive into the <a href="https://testgrid.k8s.io/conformance-gardener">testgrid</a> directly.</p> <p>Gardener itself is open-source. Under the umbrella of <a href="https://github.com/gardener">github.com/gardener</a> we develop the core functionalities as well as the extensions and you are welcome to contribute (by opening issues, feature requests or submitting code).</p> <p>Last time we counted, there were already 131 projects. That&rsquo;s actually more projects than members of the organization.</p> -<p>As of today, Gardener is mainly developed by SAP employees and SAP is an &ldquo;adopter&rdquo; as well, among StackIT, Telekom, Finanz Informatik Technologie Services GmbH and others. For a full list of adopters, see the <a href="https://gardener.cloud/adopter/">Adopters page</a>.</p>Docs: Tutorialshttps://gardener.cloud/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/tutorials/kubernetes-cluster-on-aws-with-gardener/kubernetes-cluster-on-aws-with-gardener/Mon, 01 Jan 0001 00:00:00 +0000https://gardener.cloud/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/tutorials/kubernetes-cluster-on-aws-with-gardener/kubernetes-cluster-on-aws-with-gardener/ +<p>As of today, Gardener is mainly developed by SAP employees and SAP is an &ldquo;adopter&rdquo; as well, among STACKIT, Telekom, Finanz Informatik Technologie Services GmbH and others. 
For a full list of adopters, see the <a href="https://gardener.cloud/adopter/">Adopters page</a>.</p>Docs: Tutorialshttps://gardener.cloud/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/tutorials/kubernetes-cluster-on-aws-with-gardener/kubernetes-cluster-on-aws-with-gardener/Mon, 01 Jan 0001 00:00:00 +0000https://gardener.cloud/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/tutorials/kubernetes-cluster-on-aws-with-gardener/kubernetes-cluster-on-aws-with-gardener/ <h3 id="overview">Overview</h3> <p>Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on AWS.</p> <h3 id="prerequisites">Prerequisites</h3> diff --git a/docs/offline-search-index.2a6392954410f8fd04518d369fbbb421.json b/docs/offline-search-index.2a6392954410f8fd04518d369fbbb421.json index 47fa25fe4fa..bbc58fea809 100644 --- a/docs/offline-search-index.2a6392954410f8fd04518d369fbbb421.json +++ b/docs/offline-search-index.2a6392954410f8fd04518d369fbbb421.json @@ -1 +1 @@ -[{"body":"Gardener API Reference authentication.gardener.cloud API Group core.gardener.cloud API Group extensions.gardener.cloud API Group operations.gardener.cloud API Group resources.gardener.cloud API Group security.gardener.cloud API Group seedmanagement.gardener.cloud API Group settings.gardener.cloud API Group ","categories":"","description":"","excerpt":"Gardener API Reference authentication.gardener.cloud API Group …","ref":"/docs/gardener/api-reference/","tags":"","title":"API Reference"},{"body":"Packages:\n druid.gardener.cloud/v1alpha1 druid.gardener.cloud/v1alpha1 Package v1alpha1 is the v1alpha1 version of the etcd-druid API.\nResource Types: BackupSpec (Appears on: EtcdSpec) BackupSpec defines parameters associated with the full and delta snapshots of etcd.\n Field Description port int32 (Optional) Port define the port on which etcd-backup-restore server will be 
exposed.\n tls TLSConfig (Optional) image string (Optional) Image defines the etcd container image and tag\n store StoreSpec (Optional) Store defines the specification of object store provider for storing backups.\n resources Kubernetes core/v1.ResourceRequirements (Optional) Resources defines compute Resources required by backup-restore container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/\n compactionResources Kubernetes core/v1.ResourceRequirements (Optional) CompactionResources defines compute Resources required by compaction job. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/\n fullSnapshotSchedule string (Optional) FullSnapshotSchedule defines the cron standard schedule for full snapshots.\n garbageCollectionPolicy GarbageCollectionPolicy (Optional) GarbageCollectionPolicy defines the policy for garbage collecting old backups\n garbageCollectionPeriod Kubernetes meta/v1.Duration (Optional) GarbageCollectionPeriod defines the period for garbage collecting old backups\n deltaSnapshotPeriod Kubernetes meta/v1.Duration (Optional) DeltaSnapshotPeriod defines the period after which delta snapshots will be taken\n deltaSnapshotMemoryLimit k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) DeltaSnapshotMemoryLimit defines the memory limit after which delta snapshots will be taken\n compression CompressionSpec (Optional) SnapshotCompression defines the specification for compression of Snapshots.\n enableProfiling bool (Optional) EnableProfiling defines if profiling should be enabled for the etcd-backup-restore-sidecar\n etcdSnapshotTimeout Kubernetes meta/v1.Duration (Optional) EtcdSnapshotTimeout defines the timeout duration for etcd FullSnapshot operation\n leaderElection LeaderElectionSpec (Optional) LeaderElection defines parameters related to the LeaderElection configuration.\n ClientService (Appears on: EtcdConfig) ClientService defines the parameters of 
the client service that a user can specify\n Field Description annotations map[string]string (Optional) Annotations specify the annotations that should be added to the client service\n labels map[string]string (Optional) Labels specify the labels that should be added to the client service\n CompactionMode (string alias)\n (Appears on: SharedConfig) CompactionMode defines the auto-compaction-mode: ‘periodic’ or ‘revision’. ‘periodic’ for duration based retention and ‘revision’ for revision number based retention.\nCompressionPolicy (string alias)\n (Appears on: CompressionSpec) CompressionPolicy defines the type of policy for compression of snapshots.\nCompressionSpec (Appears on: BackupSpec) CompressionSpec defines parameters related to compression of Snapshots(full as well as delta).\n Field Description enabled bool (Optional) policy CompressionPolicy (Optional) Condition (Appears on: EtcdCopyBackupsTaskStatus, EtcdStatus) Condition holds the information about the state of a resource.\n Field Description type ConditionType Type of the Etcd condition.\n status ConditionStatus Status of the condition, one of True, False, Unknown.\n lastTransitionTime Kubernetes meta/v1.Time Last time the condition transitioned from one status to another.\n lastUpdateTime Kubernetes meta/v1.Time Last time the condition was updated.\n reason string The reason for the condition’s last transition.\n message string A human-readable message indicating details about the transition.\n ConditionStatus (string alias)\n (Appears on: Condition) ConditionStatus is the status of a condition.\nConditionType (string alias)\n (Appears on: Condition) ConditionType is the type of condition.\nCrossVersionObjectReference (Appears on: EtcdStatus) CrossVersionObjectReference contains enough information to let you identify the referred resource.\n Field Description kind string Kind of the referent\n name string Name of the referent\n apiVersion string (Optional) API version of the referent\n Etcd Etcd is 
the Schema for the etcds API\n Field Description metadata Kubernetes meta/v1.ObjectMeta Refer to the Kubernetes API documentation for the fields of the metadata field. spec EtcdSpec selector Kubernetes meta/v1.LabelSelector selector is a label query over pods that should match the replica count. It must match the pod template’s labels. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors\n labels map[string]string annotations map[string]string (Optional) etcd EtcdConfig backup BackupSpec sharedConfig SharedConfig (Optional) schedulingConstraints SchedulingConstraints (Optional) replicas int32 priorityClassName string (Optional) PriorityClassName is the name of a priority class that shall be used for the etcd pods.\n storageClass string (Optional) StorageClass defines the name of the StorageClass required by the claim. More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#class-1\n storageCapacity k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) StorageCapacity defines the size of persistent volume.\n volumeClaimTemplate string (Optional) VolumeClaimTemplate defines the volume claim template to be created\n status EtcdStatus EtcdConfig (Appears on: EtcdSpec) EtcdConfig defines parameters associated etcd deployed\n Field Description quota k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) Quota defines the etcd DB quota.\n defragmentationSchedule string (Optional) DefragmentationSchedule defines the cron standard schedule for defragmentation of etcd.\n serverPort int32 (Optional) clientPort int32 (Optional) image string (Optional) Image defines the etcd container image and tag\n authSecretRef Kubernetes core/v1.SecretReference (Optional) metrics MetricsLevel (Optional) Metrics defines the level of detail for exported metrics of etcd, specify ‘extensive’ to include histogram metrics.\n resources Kubernetes core/v1.ResourceRequirements (Optional) Resources defines the compute Resources 
required by etcd container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/\n clientUrlTls TLSConfig (Optional) ClientUrlTLS contains the ca, server TLS and client TLS secrets for client communication to ETCD cluster\n peerUrlTls TLSConfig (Optional) PeerUrlTLS contains the ca and server TLS secrets for peer communication within ETCD cluster Currently, PeerUrlTLS does not require client TLS secrets for gardener implementation of ETCD cluster.\n etcdDefragTimeout Kubernetes meta/v1.Duration (Optional) EtcdDefragTimeout defines the timeout duration for etcd defrag call\n heartbeatDuration Kubernetes meta/v1.Duration (Optional) HeartbeatDuration defines the duration for members to send heartbeats. The default value is 10s.\n clientService ClientService (Optional) ClientService defines the parameters of the client service that a user can specify\n EtcdCopyBackupsTask EtcdCopyBackupsTask is a task for copying etcd backups from a source to a target store.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Refer to the Kubernetes API documentation for the fields of the metadata field. spec EtcdCopyBackupsTaskSpec sourceStore StoreSpec SourceStore defines the specification of the source object store provider for storing backups.\n targetStore StoreSpec TargetStore defines the specification of the target object store provider for storing backups.\n maxBackupAge uint32 (Optional) MaxBackupAge is the maximum age in days that a backup must have in order to be copied. 
By default all backups will be copied.\n maxBackups uint32 (Optional) MaxBackups is the maximum number of backups that will be copied starting with the most recent ones.\n waitForFinalSnapshot WaitForFinalSnapshotSpec (Optional) WaitForFinalSnapshot defines the parameters for waiting for a final full snapshot before copying backups.\n status EtcdCopyBackupsTaskStatus EtcdCopyBackupsTaskSpec (Appears on: EtcdCopyBackupsTask) EtcdCopyBackupsTaskSpec defines the parameters for the copy backups task.\n Field Description sourceStore StoreSpec SourceStore defines the specification of the source object store provider for storing backups.\n targetStore StoreSpec TargetStore defines the specification of the target object store provider for storing backups.\n maxBackupAge uint32 (Optional) MaxBackupAge is the maximum age in days that a backup must have in order to be copied. By default all backups will be copied.\n maxBackups uint32 (Optional) MaxBackups is the maximum number of backups that will be copied starting with the most recent ones.\n waitForFinalSnapshot WaitForFinalSnapshotSpec (Optional) WaitForFinalSnapshot defines the parameters for waiting for a final full snapshot before copying backups.\n EtcdCopyBackupsTaskStatus (Appears on: EtcdCopyBackupsTask) EtcdCopyBackupsTaskStatus defines the observed state of the copy backups task.\n Field Description conditions []Condition (Optional) Conditions represents the latest available observations of an object’s current state.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this resource.\n lastError string (Optional) LastError represents the last occurred error.\n EtcdMemberConditionStatus (string alias)\n (Appears on: EtcdMemberStatus) EtcdMemberConditionStatus is the status of an etcd cluster member.\nEtcdMemberStatus (Appears on: EtcdStatus) EtcdMemberStatus holds information about a etcd cluster membership.\n Field Description name string Name is the name of the etcd 
member. It is the name of the backing Pod.\n id string (Optional) ID is the ID of the etcd member.\n role EtcdRole (Optional) Role is the role in the etcd cluster, either Leader or Member.\n status EtcdMemberConditionStatus Status of the condition, one of True, False, Unknown.\n reason string The reason for the condition’s last transition.\n lastTransitionTime Kubernetes meta/v1.Time LastTransitionTime is the last time the condition’s status changed.\n EtcdRole (string alias)\n (Appears on: EtcdMemberStatus) EtcdRole is the role of an etcd cluster member.\nEtcdSpec (Appears on: Etcd) EtcdSpec defines the desired state of Etcd\n Field Description selector Kubernetes meta/v1.LabelSelector selector is a label query over pods that should match the replica count. It must match the pod template’s labels. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors\n labels map[string]string annotations map[string]string (Optional) etcd EtcdConfig backup BackupSpec sharedConfig SharedConfig (Optional) schedulingConstraints SchedulingConstraints (Optional) replicas int32 priorityClassName string (Optional) PriorityClassName is the name of a priority class that shall be used for the etcd pods.\n storageClass string (Optional) StorageClass defines the name of the StorageClass required by the claim. 
More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#class-1\n storageCapacity k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) StorageCapacity defines the size of persistent volume.\n volumeClaimTemplate string (Optional) VolumeClaimTemplate defines the volume claim template to be created\n EtcdStatus (Appears on: Etcd) EtcdStatus defines the observed state of Etcd.\n Field Description observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this resource.\n etcd CrossVersionObjectReference (Optional) conditions []Condition (Optional) Conditions represents the latest available observations of an etcd’s current state.\n serviceName string (Optional) ServiceName is the name of the etcd service.\n lastError string (Optional) LastError represents the last occurred error.\n clusterSize int32 (Optional) Cluster size is the size of the etcd cluster.\n currentReplicas int32 (Optional) CurrentReplicas is the current replica count for the etcd cluster.\n replicas int32 (Optional) Replicas is the replica count of the etcd resource.\n readyReplicas int32 (Optional) ReadyReplicas is the count of replicas being ready in the etcd cluster.\n ready bool (Optional) Ready is true if all etcd replicas are ready.\n updatedReplicas int32 (Optional) UpdatedReplicas is the count of updated replicas in the etcd cluster.\n labelSelector Kubernetes meta/v1.LabelSelector (Optional) LabelSelector is a label query over pods that should match the replica count. 
It must match the pod template’s labels.\n members []EtcdMemberStatus (Optional) Members represents the members of the etcd cluster\n peerUrlTLSEnabled bool (Optional) PeerUrlTLSEnabled captures the state of peer url TLS being enabled for the etcd member(s)\n GarbageCollectionPolicy (string alias)\n (Appears on: BackupSpec) GarbageCollectionPolicy defines the type of policy for snapshot garbage collection.\nLeaderElectionSpec (Appears on: BackupSpec) LeaderElectionSpec defines parameters related to the LeaderElection configuration.\n Field Description reelectionPeriod Kubernetes meta/v1.Duration (Optional) ReelectionPeriod defines the Period after which leadership status of corresponding etcd is checked.\n etcdConnectionTimeout Kubernetes meta/v1.Duration (Optional) EtcdConnectionTimeout defines the timeout duration for etcd client connection during leader election.\n MetricsLevel (string alias)\n (Appears on: EtcdConfig) MetricsLevel defines the level ‘basic’ or ‘extensive’.\nSchedulingConstraints (Appears on: EtcdSpec) SchedulingConstraints defines the different scheduling constraints that must be applied to the pod spec in the etcd statefulset. Currently supported constraints are Affinity and TopologySpreadConstraints.\n Field Description affinity Kubernetes core/v1.Affinity (Optional) Affinity defines the various affinity and anti-affinity rules for a pod that are honoured by the kube-scheduler.\n topologySpreadConstraints []Kubernetes core/v1.TopologySpreadConstraint (Optional) TopologySpreadConstraints describes how a group of pods ought to spread across topology domains, that are honoured by the kube-scheduler.\n SecretReference (Appears on: TLSConfig) SecretReference defines a reference to a secret.\n Field Description SecretReference Kubernetes core/v1.SecretReference (Members of SecretReference are embedded into this type.) 
dataKey string (Optional) DataKey is the name of the key in the data map containing the credentials.\n SharedConfig (Appears on: EtcdSpec) SharedConfig defines parameters shared and used by Etcd as well as backup-restore sidecar.\n Field Description autoCompactionMode CompactionMode (Optional) AutoCompactionMode defines the auto-compaction-mode:‘periodic’ mode or ‘revision’ mode for etcd and embedded-Etcd of backup-restore sidecar.\n autoCompactionRetention string (Optional) AutoCompactionRetention defines the auto-compaction-retention length for etcd as well as for embedded-Etcd of backup-restore sidecar.\n StorageProvider (string alias)\n (Appears on: StoreSpec) StorageProvider defines the type of object store provider for storing backups.\nStoreSpec (Appears on: BackupSpec, EtcdCopyBackupsTaskSpec) StoreSpec defines parameters related to ObjectStore persisting backups\n Field Description container string (Optional) Container is the name of the container the backup is stored at.\n prefix string Prefix is the prefix used for the store.\n provider StorageProvider (Optional) Provider is the name of the backup provider.\n secretRef Kubernetes core/v1.SecretReference (Optional) SecretRef is the reference to the secret which used to connect to the backup store.\n TLSConfig (Appears on: BackupSpec, EtcdConfig) TLSConfig hold the TLS configuration details.\n Field Description tlsCASecretRef SecretReference serverTLSSecretRef Kubernetes core/v1.SecretReference clientTLSSecretRef Kubernetes core/v1.SecretReference (Optional) WaitForFinalSnapshotSpec (Appears on: EtcdCopyBackupsTaskSpec) WaitForFinalSnapshotSpec defines the parameters for waiting for a final full snapshot before copying backups.\n Field Description enabled bool Enabled specifies whether to wait for a final full snapshot before copying backups.\n timeout Kubernetes meta/v1.Duration (Optional) Timeout is the timeout for waiting for a final full snapshot. 
When this timeout expires, the copying of backups will be performed anyway. No timeout or 0 means wait forever.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n druid.gardener.cloud/v1alpha1 …","ref":"/docs/other-components/etcd-druid/api-reference/","tags":"","title":"API Reference"},{"body":"Dashboard Architecture Overview Overview The dashboard frontend is a Single Page Application (SPA) built with Vue.js. The dashboard backend is a web server built with Express and Node.js. The backend serves the bundled frontend as static content. The dashboard uses Socket.IO to enable real-time, bidirectional and event-based communication between the frontend and the backend. For the communication from the backend to different kube-apiservers the http/2 network protocol is used. Authentication at the apiserver of the garden cluster is done via JWT tokens. These can either be an ID Token issued by an OpenID Connect Provider or the token of a Kubernetes Service Account.\nFrontend The dashboard frontend consists of many Vue.js single file components that manage their state via a centralized store. The store defines mutations to modify the state synchronously. If several mutations have to be combined or the state in the backend has to be modified at the same time, the store provides asynchronous actions to do this job. 
The synchronization of the data with the backend is done by plugins that also use actions.\nBackend The backend is currently a monolithic Node.js application, but it performs several tasks that are actually independent.\n Static web server for the frontend single page application Forward real time events of the apiserver to the frontend Provide an HTTP API Initiate and manage the end user login flow in order to obtain an ID Token Bidirectional integration with the GitHub issue management It is planned to split the backend into several independent containers to increase stability and performance.\nAuthentication The following diagram shows the authorization code flow in the Gardener dashboard. When the user clicks the login button, they are redirected to the authorization endpoint of the OpenID Connect provider. In the case of Dex IDP, authentication is delegated to the connected IDP. After a successful login, the OIDC provider redirects back to the dashboard backend with a one-time authorization code. With this code, the dashboard backend can now request an ID token for the logged-in user. The ID token is encrypted and stored as a secure httpOnly session cookie.\n","categories":"","description":"","excerpt":"Dashboard Architecture Overview Overview The dashboard frontend is a …","ref":"/docs/dashboard/architecture/","tags":"","title":"Architecture"},{"body":"Core Components The core Observability components which Gardener offers out-of-the-box are:\n Prometheus - for Metrics and Alerting Vali - a Loki fork for Logging Plutono - a Grafana fork for Dashboard visualization Both forks were created from the last version released under an Apache license.\nControl Plane Components on the Seed Prometheus, Plutono, and Vali are all located in the seed cluster. They run next to the control plane of your cluster.\nThe next sections will explore those components in detail.\nNote Gardener only provides monitoring for Gardener-deployed components. 
If you need logging or monitoring for your workload, then you need to deploy your own monitoring stack into your shoot cluster. Note Gardener only provides a monitoring stack if the cluster is not of purpose: testing. For more information, see Shoot Cluster Purpose. Logging into Plutono Let us start by giving some visual hints on how to access Plutono. Plutono allows us to query logs and metrics and visualise those in the form of dashboards. Plutono is shipped ready-to-use with a Gardener shoot cluster.\nIn order to access the Gardener provided dashboards, open the Plutono link provided in the Gardener dashboard and use the username and password provided next to it.\nThe password you can use to log in can be retrieved as shown below:\nAccessing the Dashboards After logging in, you will be greeted with a Plutono welcome screen. Navigate to General/Home, as depicted with the red arrow in the next picture:\nThen you will be able to select the dashboards. Some interesting ones to look at are:\n The Kubernetes Control Plane Status dashboard allows you to check control plane availability during a certain time frame. The API Server dashboard gives you an overview of which requests are made to your apiserver and how long they take. With the Node Details dashboard you can analyze CPU/Network pressure or memory usage for nodes. The Network Problem Detector dashboard illustrates the results of periodic networking checks between nodes and to the APIServer. Here is a picture with the Kubernetes Control Plane Status dashboard.\nPrometheus Prometheus is a monitoring system and a time series database. It can be queried using PromQL, the Prometheus Query Language.\nThis example query describes the current uptime status of the kube apiserver.\nPrometheus and Plutono Time series data from Prometheus can be made visible with Plutono. 
Here we see how the query above which describes the uptime of a Kubernetes cluster is visualized with a Plutono dashboard.\nVali Logs via Plutono Vali is our logging solution. In order to access the logs provided by Vali, you need to:\n Log into Plutono.\n Choose Explore, which is depicted as the little compass symbol:\n Select Vali at the top left, as shown here: There you can browse logs or events of the control plane components.\nHere are some examples of helpful queries:\n {container_name=\"cluster-autoscaler\" } to get cluster-autoscaler logs and see why certain node groups were scaled up. {container_name=\"kube-apiserver\"} |~ \"error\" to get the logs of the kube-apiserver container and filter for errors. {unit=\"kubelet.service\", nodename=\"ip-123\"} to get the kubelet logs of a specific node. {unit=\"containerd.service\", nodename=\"ip-123\"} to retrieve the containerd logs for a specific node. Choose Help \u003e in order to see what options exist to filter the results.\nFor more information on how to retrieve K8s events from the past, see How to Access Logs.\nDetailed View Data Flow Our monitoring and logging solutions Vali and Prometheus both run next to the control plane of the shoot cluster.\nData Flow - Logging The following diagram allows a more detailed look at Vali and the data flow.\nOn the very left, we see Plutono as it displays the logs. Vali is aggregating the logs from different sources.\nValitail and Fluentbit send the logs to Vali, which in turn stores them.\nValitail\nValitail is a systemd service that runs on each node. It scrapes kubelet, containerd, kernel logs, and the logs of the pods in the kube-system namespace.\nFluentbit\nFluentbit runs as a daemonset on each seed node. It scrapes logs of the kubernetes control plane components, like apiserver or etcd.\nIt also scrapes logs of the Gardener deployed components which run next to the control plane of the cluster, like the machine-controller-manager or the cluster autoscaler. 
Debugging those components, for example, would be helpful when finding out why certain worker groups got scaled up or why nodes were replaced.\nData Flow - Monitoring Next to each shoot’s control plane, we deploy an instance of Prometheus in the seed.\nGardener uses Prometheus for storing and accessing shoot-related metrics and alerting.\nThe diagram below shows the data flow of metrics. Plutono uses PromQL queries to query data from Prometheus. It then visualises those metrics in dashboards. Prometheus itself scrapes various targets for metrics, as seen in the diagram below by the arrows pointing to the Prometheus instance.\nLet us have a look what metrics we scrape for debugging purposes:\nContainer performance metrics\ncAdvisor is an open-source agent integrated into the kubelet binary that monitors resource usage and analyzes the performance of containers. It collects statistics about the CPU, memory, file, and network usage for all containers running on a given node. We use it to scrape data for all pods running in the kube-system namespace in the shoot cluster.\nHardware and kernel-related metrics\nThe Prometheus Node Exporter runs as a daemonset in the kube-system namespace of your shoot cluster. It exposes a wide variety of hardware and kernel-related metrics. Some of the metrics we scrape are, for example, the current usage of the filesystem (node_filesystem_free_bytes) or current CPU usage (node_cpu_seconds_total). Both can help you identify if nodes are running out of hardware resources, which could lead to your workload experiencing downtimes.\nControl plane component specific metrics\nThe different control plane pods (for example, etcd, API server, and kube-controller-manager) emit metrics over the /metrics endpoint. 
This includes metrics like how long webhooks take, the request count of the apiserver and storage information, like how many and what kind of objects are stored in etcd.\nMetrics about the state of Kubernetes objects\nkube-state-metrics is a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects. It is not concerned with metrics about the Kubernetes components, but rather it exposes metrics calculated from the status of Kubernetes objects (for example, resource requests or health of pods).\nThe following image lists a few example metrics exposed by the various components: We only store metrics for Gardener deployed components. Those include the Kubernetes control plane, Gardener managed system components (e.g., pods) in the kube-system namespace of the shoot cluster or systemd units on the nodes. We do not gather metrics for workload deployed in the shoot cluster. This is also shown in the picture below.\nThis means that for any workload you deploy into your shoot cluster, you need to deploy monitoring and logging yourself.\nLogs and metrics are kept for up to 14 days or until a configured space limit is reached.\n","categories":"","description":"","excerpt":"Core Components The core Observability components which Gardener …","ref":"/docs/getting-started/observability/components/","tags":"","title":"Components"},{"body":"Dependency Watchdog \nOverview A watchdog which actively looks out for disruption and recovery of critical services. If there is a disruption then it will prevent cascading failure by conservatively scaling down dependent configured resources and if a critical service has just recovered then it will expedite the recovery of dependent services/pods.\nAvoiding cascading failure is handled by the Prober component and expediting recovery of dependent services/pods is handled by the Weeder component. 
These are separately deployed as individual pods.\nCurrent Limitation \u0026 Future Scope In the current offering, the Prober is tailored to handle a single use case, kube-apiserver connectivity. However, its usage can be extended to solve similar needs for other scenarios where the components involved might be different.\nStart using or developing the Dependency Watchdog See our documentation in the /docs repository; the index is available here.\nFeedback and Support We always look forward to active community engagement.\nPlease report bugs or suggestions on how we can enhance dependency-watchdog to address additional recovery scenarios via GitHub issues.\n","categories":"","description":"A watchdog which actively looks out for disruption and recovery of critical services","excerpt":"A watchdog which actively looks out for disruption and recovery of …","ref":"/docs/other-components/dependency-watchdog/","tags":"","title":"Dependency Watchdog"},{"body":"Welcome to the Gardener Getting Started section! Here you will be able to get accustomed to the way Gardener functions and learn how its components work together in order to seamlessly run Kubernetes clusters on various hyperscalers.\nThe following topics aim to be useful to both complete beginners and those already somewhat familiar with Gardener. While the content is structured, with Introduction serving as the starting point, if you’re feeling confident in your knowledge, feel free to skip to a topic you’re more interested in.\n","categories":"","description":"Gardener onboarding materials","excerpt":"Gardener onboarding materials","ref":"/docs/getting-started/","tags":"","title":"Getting Started"},{"body":"Hibernation Some clusters need to be up all the time - typically, they would be hosting some kind of production workload. Others might be used for development purposes or testing during business hours only. Keeping them up and running all the time is a waste of money. 
Gardener can help you here with its “hibernation” feature. Essentially, hibernation means to shut down all components of a cluster.\nHow Hibernation Works The hibernation flow for a shoot attempts to reduce the resources consumed as much as possible. Hence everything not state-related is being decommissioned.\nData Plane All nodes will be drained and the VMs will be deleted. As a result, all pods will be “stuck” in a Pending state since no new nodes are added. Of course, PVC / PV holding data is not deleted.\nServices of type LoadBalancer will keep their external IP addresses.\nControl Plane All components will be scaled down and no pods will remain running. ETCD data is kept safe on the disk.\nThe DNS records routing traffic for the API server are also destroyed. Trying to connect to a hibernated cluster via kubectl will result in a DNS lookup failure / no-such-host message.\nWhen waking up a cluster, all control plane components will be scaled up again and the DNS records will be re-created. Nodes will be created again and pods scheduled to run on them.\nHow to Configure / Trigger Hibernation The easiest way to configure hibernation schedules is via the dashboard. Of course, this is reflected in the shoot’s spec and can also be maintained there. Before a cluster is hibernated, constraints in the shoot’s status will be evaluated. There might be conditions (mostly revolving around mutating / validating webhooks) that would block a successful wake-up. 
In such a case, the constraint will block hibernation in the first place.\nTo wake-up or hibernate a shoot immediately, the dashboard can be used or a patch to the shoot’s spec can be applied directly.\n","categories":"","description":"","excerpt":"Hibernation Some clusters need to be up all the time - typically, they …","ref":"/docs/getting-started/features/hibernation/","tags":"","title":"Hibernation"},{"body":"","categories":"","description":"Gardener extension controllers for the different infrastructures","excerpt":"Gardener extension controllers for the different infrastructures","ref":"/docs/extensions/infrastructure-extensions/","tags":"","title":"Infrastructure Extensions"},{"body":"Problem Space Let’s discuss the problem space first. Why does anyone need something like Gardener?\nRunning Software The starting point is this rather simple question: Why would you want to run some software?\nTypically, software is run with a purpose and not just for the sake of running it. Whether it is a digital ledger, a company’s inventory or a blog - software provides a service to its user.\nWhich brings us to the way this software is being consumed. Traditionally, software has been shipped on physical / digital media to the customer or end user. There, someone had to install, configure, and operate it. In recent times, the pattern has shifted. More and more solutions are operated by the vendor or a hosting partner and sold as a service ready to be used.\nBut still, someone needs to install, configure, and maintain it - regardless of where it is installed. And of course, it will run forever once started and is generally resilient to any kind of failures.\nFor smaller installations things like maintenance, scaling, debugging or configuration can be done in a semi-automatic way. 
It’s probably no fun and most importantly, only a limited amount of instances can be taken care of - similar to how one would take care of a pet.\nBut when hosting services at scale, there is no way someone can do all this manually at acceptable costs. So we need some vehicle to easily spin up new instances, do lifecycle operations, get some basic failure resilience, and more. How can we achieve that?\nSolution Space 1 - Kubernetes Let’s start solving some of the problems described earlier with Container technology and Kubernetes.\nContainers Container technology is at the core of the solution space. A container forms a vehicle that is shippable, can easily run in any supported environment and generally adds a powerful abstraction layer to the infrastructure.\nHowever, plain containers do not help with resilience or scaling. Therefore, we need another system for orchestration.\nOrchestration “Classical” orchestration that just follows the “notes” and moves from state A to state B doesn’t solve all of our problems. We need something else.\nKubernetes operates on the principle of “desired state”. With it, you write a construction plan, then have controllers cycle through “observe -\u003e analyze -\u003e act” and transition the actual to the desired state. Those reconciliations ensure that whatever breaks there is a path back to a healthy state.\nSummary Containers (famously brought to the mainstream as “Docker”) and Kubernetes are the ingredients of a fundamental shift in IT. Similar to how the Operating System layer enabled the decoupling of software and hardware, container-related technologies provide an abstract interface to any kind of infrastructure platform for the next-generation of applications.\nSolution Space 2 - Gardener So, Kubernetes solves a lot of problems. 
But how do you get a Kubernetes cluster?\nEither:\n Buy a cluster as a service from an external vendor Run a Gardener instance and host a cluster yourself with its help Essentially, it was a “make or buy” decision that led to the founding of Gardener.\nThe Reason Why We Choose to “Make It” Gardener allows you to run Kubernetes clusters on various hyperscalers. It offers the same set of basic configuration options independent of the chosen infrastructure. This kind of harmonization supports any multi-vendor strategy while reducing adoption costs for the individual teams. Just imagine having to deal with multiple vendors all offering vastly different Kubernetes clusters.\nOf course, there are plenty more reasons - from acquiring operational knowledge to having influence on the developed features - that made the pendulum swing towards “make it”.\nWhat exactly is Gardener? Gardener is a system to manage Kubernetes clusters. It is driven by the same “desired state” pattern as Kubernetes itself. In fact, it is using Kubernetes to run Kubernetes.\nA user may “desire” clusters with specific configuration on infrastructures such as GCP, AWS, Azure, Alicloud, OpenStack, vSphere, … and Gardener will make sure to create such a cluster and keep it running.\nIf you take this rather simplistic principle of reconciliation and add the feature-richness of Gardener to it, you end up with universal Kubernetes at scale.\nWhether you need fleet management at minimal TCO or are looking for a highly customizable control plane - we have it all.\nOn top of that, Gardener-managed Kubernetes clusters fulfill the conformance standard set out by the CNCF and we submit our test results for certification.\nHave a look at the CNCF map for more information or dive into the testgrid directly.\nGardener itself is open-source. 
Under the umbrella of github.com/gardener we develop the core functionalities as well as the extensions and you are welcome to contribute (by opening issues, feature requests or submitting code).\nLast time we counted, there were already 131 projects. That’s actually more projects than members of the organization.\nAs of today, Gardener is mainly developed by SAP employees and SAP is an “adopter” as well, alongside StackIT, Telekom, Finanz Informatik Technologie Services GmbH and others. For a full list of adopters, see the Adopters page.\n","categories":"","description":"","excerpt":"Problem Space Let’s discuss the problem space first. Why does anyone …","ref":"/docs/getting-started/introduction/","tags":"","title":"Introduction to Gardener"},{"body":"machine-controller-manager \nNote One can add support for a new cloud provider by following Adding support for new provider.\nOverview Machine Controller Manager aka MCM is a group of cooperative controllers that manage the lifecycle of the worker machines. It is inspired by the design of Kube Controller Manager in which various sub controllers manage their respective Kubernetes Clients. MCM gives you the following benefits:\n seamlessly manage machines/nodes with a declarative API (of course, across different cloud providers) integrate generically with the cluster autoscaler plugin with tools such as the node-problem-detector transport the immutability design principle to machine/nodes implement e.g. rolling upgrades of machines/nodes MCM supports the following providers. The provider code is maintained externally (out-of-tree), and the respective repositories are linked below:\n Alicloud AWS Azure Equinix Metal GCP KubeVirt Metal Stack Openstack V Sphere Yandex It can easily be extended to support other cloud providers as well.\nExample of managing a machine:\nkubectl create/get/delete machine vm1 Key terminologies Nodes/Machines/VMs are different terminologies used to represent similar things. 
We use these terms in the following way:\n VM: A virtual machine running on any cloud provider. It could also refer to a physical machine (PM) in case of a bare metal setup. Node: Native Kubernetes node objects. The objects you get to see when you do a “kubectl get nodes”. Although nodes can be either physical/virtual machines, for the purposes of our discussions it refers to a VM. Machine: A VM that is provisioned/managed by the Machine Controller Manager. Design of Machine Controller Manager The design of the Machine Controller Manager is influenced by the Kube Controller Manager, wherein multiple sub-controllers are used to manage the Kubernetes clients.\nDesign Principles It’s designed to run in the master plane of a Kubernetes cluster. It follows the best principles and practices of writing controllers, including, but not limited to:\n Reusing code from kube-controller-manager leader election to allow HA deployments of the controller workqueues and multiple thread-workers SharedInformers that keep network calls and de-serialization to a minimum and provide helpful create/update/delete events for resources rate-limiting to allow back-off in case of network outages and general instability of other cluster components sending events to respective resources for easy debugging and overview Prometheus metrics, health and (optional) profiling endpoints Objects of Machine Controller Manager Machine Controller Manager reconciles a set of Custom Resources namely MachineDeployment, MachineSet and Machines which are managed \u0026 monitored by their controllers MachineDeployment Controller, MachineSet Controller, Machine Controller respectively along with another cooperative controller called the Safety Controller.\nMachine Controller Manager makes use of 4 CRD objects and 1 Kubernetes secret object to manage machines. 
They are as follows:\n Custom Resource Object Description MachineClass A MachineClass represents a template that contains cloud provider specific details used to create machines. Machine A Machine represents a VM which is backed by the cloud provider. MachineSet A MachineSet ensures that the specified number of Machine replicas are running at a given point of time. MachineDeployment A MachineDeployment provides a declarative update for MachineSet and Machines. Secret A Secret here is a Kubernetes secret that stores cloudconfig (initialization scripts used to create VMs) and cloud specific credentials. See here for CRD API Documentation\nComponents of Machine Controller Manager Controller Description MachineDeployment controller Machine Deployment controller reconciles the MachineDeployment objects and manages the lifecycle of MachineSet objects. MachineDeployment consumes provider specific MachineClass in its spec.template.spec which is the template of the VM spec that would be spawned on the cloud by MCM. MachineSet controller MachineSet controller reconciles the MachineSet objects and manages the lifecycle of Machine objects. Safety controller There is a Safety Controller responsible for handling the unidentified or unknown behaviours from the cloud providers. Safety Controller: freezes the MachineDeployment controller and MachineSet controller if the number of Machine objects goes beyond a certain threshold on top of Spec.replicas. It can be configured by the flag --safety-up or --safety-down and also --machine-safety-overshooting-period. freezes the functionality of the MCM if either of the target-apiserver or the control-apiserver is not reachable. unfreezes the MCM automatically once the situation has returned to normal. A freeze label is applied on MachineDeployment/MachineSet to enforce the freeze condition. 
Along with the above Custom Controllers and Resources, MCM requires the MachineClass to use a K8s Secret that stores cloudconfig (initialization scripts used to create VMs) and cloud specific credentials. All these controllers work in a cooperative manner. They form a parent-child relationship with MachineDeployment Controller being the grandparent, MachineSet Controller being the parent, and Machine Controller being the child.\nDevelopment To start using or developing the Machine Controller Manager, see the documentation in the /docs repository.\nFAQ An FAQ is available here.\ncluster-api Implementation The cluster-api branch of machine-controller-manager implements the machine-api aspect of the cluster-api project. Link: https://github.com/gardener/machine-controller-manager/tree/cluster-api Once the cluster-api project becomes stable, we may make the master branch of MCM cluster-api compliant as well, with well-defined migration notes. ","categories":"","description":"Declarative way of managing machines for Kubernetes cluster","excerpt":"Declarative way of managing machines for Kubernetes cluster","ref":"/docs/other-components/machine-controller-manager/","tags":"","title":"Machine Controller Manager"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/tutorials/","tags":"","title":"Tutorials"},{"body":"Overview Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on AWS.\nPrerequisites You have created an AWS account. You have access to the Gardener dashboard and have permissions to create projects. 
Steps Go to the Gardener dashboard and create a Project.\n Choose Secrets, then the plus icon and select AWS.\n To copy the policy for AWS from the Gardener dashboard, click on the help icon for AWS secrets, and choose copy.\n Create a new policy in AWS:\n Choose Create policy.\n Paste the policy that you copied from the Gardener dashboard to this custom policy.\n Choose Next until you reach the Review section.\n Fill in the name and description, then choose Create policy.\n Create a new technical user in AWS:\n Type in a username and select the access key credential type.\n Choose Attach an existing policy.\n Select GardenerAccess from the policy list.\n Choose Next until you reach the Review section.\n Note After the user is created, the Access key ID and Secret access key are generated and displayed. Remember to save them. Both values are used later to create the secret for Gardener. On the Gardener dashboard, choose Secrets and then the plus sign. Select AWS from the drop down menu to add a new AWS secret.\n Create your secret.\n Type the name of your secret. Copy and paste the Access Key ID and Secret Access Key you saved when you created the technical user on AWS. Choose Add secret. After completing these steps, you should see your newly created secret in the Infrastructure Secrets section.\n To create a new cluster, choose Clusters and then the plus sign in the upper right corner.\n In the Create Cluster section:\n Select AWS in the Infrastructure tab. Type the name of your cluster in the Cluster Details tab. Choose the secret you created before in the Infrastructure Details tab. Choose Create. 
Wait for your cluster to get created.\n Result After completing the steps in this tutorial, you will be able to see and download the kubeconfig of your cluster.\n","categories":"","description":"","excerpt":"Overview Gardener allows you to create a Kubernetes cluster on …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/tutorials/kubernetes-cluster-on-aws-with-gardener/kubernetes-cluster-on-aws-with-gardener/","tags":"","title":"Tutorials"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/tutorials/","tags":"","title":"Tutorials"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/tutorials/","tags":"","title":"Tutorials"},{"body":"","categories":"","description":"Walkthroughs of common activities","excerpt":"Walkthroughs of common activities","ref":"/docs/guides/","tags":"","title":"Guides"},{"body":"Overview In this overview, we want to present two ways to receive alerts for control plane and Gardener managed system-components:\n Predefined Gardener alerts Custom alerts Predefined Control Plane Alerts In the shoot spec it is possible to configure emailReceivers. On this email address you will automatically receive email notifications for predefined alerts of your control plane. Such alerts are deployed in the shoot Prometheus and have visibility owner or all. 
For more alert details, shoot owners can use this visibility to find these alerts in their shoot Prometheus UI.\nspec: monitoring: alerting: emailReceivers: - john.doe@example.com For more information, see Alerting.\nCustom Alerts - Federation If you need more customization for alerts for control plane metrics, you have the option to deploy your own Prometheus into your shoot control plane.\nThen you can use federation, which is a Prometheus feature, to forward the metrics from the Gardener managed Prometheus to your custom deployed Prometheus. Since as a shoot owner you do not have access to the control plane pods, this is the only way to get those metrics.\nThe credentials and endpoint for the Gardener managed Prometheus are exposed over the Gardener dashboard or programmatically in the garden project as a secret (\u003cshoot-name\u003e.monitoring).\n","categories":"","description":"","excerpt":"Overview In this overview, we want to present two ways to receive …","ref":"/docs/getting-started/observability/alerts/","tags":"","title":"Alerts"},{"body":"Kubeception Kubeception - Kubernetes in Kubernetes in Kubernetes\nIn the classic setup, there is a dedicated host / VM to host the master components / control plane of a Kubernetes cluster. However, these are just normal programs that can easily be put into containers. Once in containers, Kubernetes Deployments and StatefulSets (for the etcd) can be made to watch over them. And by putting all that into a separate, dedicated Kubernetes cluster you get Kubernetes on Kubernetes, aka Kubeception (named after the famous movie Inception with Leonardo DiCaprio).\nBut what are the advantages of running Kubernetes on Kubernetes? For one, it makes use of resources more reasonably. 
Instead of providing a dedicated computer or virtual machine for the control plane of a Kubernetes cluster - which will probably never be the right size but either too small or too big - you can dynamically scale the individual control plane components based on demand and maximize resource usage by combining the control planes of multiple Kubernetes clusters.\nIn addition to that, it helps introduce a first layer of high availability. What happens if the API server suddenly stops responding to requests? In a traditional setup, someone would have to find out and manually restart the API server. In the Kubeception model, the API server is a Kubernetes Deployment and of course, it has sophisticated liveness and readiness probes. Should the API server fail, its liveness probe will fail too and the pod in question simply gets restarted automatically - sometimes even before anybody notices that the API server is unresponsive.\nIn Gardener’s terminology, the cluster hosting the control plane components is called a seed cluster. The cluster that end users actually use (and whose control plane is hosted in the seed) is called a shoot cluster.\nThe worker nodes of a shoot cluster are plain, simple virtual machines in a hyperscaler (EC2 instances in AWS, GCE instances in GCP or ECS instances in Alibaba Cloud). They run an operating system, a container runtime (e.g., containerd), and the kubelet that gets configured during node bootstrap to connect to the shoot’s API server. The API server in turn runs in the seed cluster and is exposed through an ingress. This connection happens over the public internet and is - of course - TLS encrypted.\nIn other words: you use Kubernetes to run Kubernetes.\nCluster Hierarchy in Gardener Gardener uses many Kubernetes clusters to eventually provide you with your very own shoot cluster.\nAt the heart of Gardener’s cluster hierarchy is the garden cluster. 
Since Gardener is 100% Kubernetes native, a Kubernetes cluster is needed to store all Gardener related resources. The garden cluster is actually nodeless - it only consists of a control plane, an API server (actually two), an etcd, and a bunch of controllers. The garden cluster is the central brain of a Gardener landscape and the one you connect to in order to create, modify or delete shoot clusters - either with kubectl and a dedicated kubeconfig or through the Gardener dashboard.\nThe seed clusters are next in the hierarchy - they are the clusters which will host the “kubeceptioned” control planes of the shoot clusters. For every hyperscaler supported in a Gardener landscape, there would be at least one seed cluster. However, to reduce latencies as well as for scaling, Gardener landscapes have several different seeds in different regions across the globe to keep the distance between control planes and actual worker nodes small.\nFinally, there are the shoot clusters - what Gardener is all about. Shoot clusters are the clusters which you create through Gardener and which your workload gets deployed to.\nGardener Components Overview From a very high level point of view, the important components of Gardener are:\nThe Gardener API Endpoint You can connect to the Gardener API Endpoint (i.e., the API server in the garden cluster) either through the dashboard or with kubectl, given that you have a proper kubeconfig for it.\nThe Seeds Running the Shoot Cluster Control Planes Inside each seed is one of the most important controllers in Gardener - the gardenlet. 
It spawns many other controllers, which will eventually create all resources for a shoot cluster, including all resources on the cloud providers such as virtual networks, security groups, and virtual machines.\nGardener’s API Endpoint Kubernetes’ API can be extended - either by CRDs or by API aggregation.\nAPI aggregation involves setting up a so-called extension-API-server and registering it with the main Kubernetes API server. The extension API server will then serve resources of custom-defined API groups on its own. While the main Kubernetes API server is still used to handle RBAC, authorization, namespacing, quotas, limits, etc., all custom resources will be delegated to the extension-API-server. This is done through an APIService resource in the main API server - it specifies that, e.g., the API group core.gardener.cloud is served by a dedicated extension-API-server and all requests concerning this API group should be forwarded to the specified IP address or Kubernetes service name. Extension API servers can persist their resources in their very own etcd but they do not have to - instead, they can use the main API server’s etcd as well.\nGardener uses its very own extension API server for its resources like Shoot, Seed, CloudProfile, SecretBinding, etc. However, Gardener does not set up a dedicated etcd for its own extension API server - instead, it reuses the existing etcd of the main Kubernetes API server. This is absolutely possible since the resources of Gardener’s API are part of the API group gardener.cloud and thus will not interfere with any resources of the main Kubernetes API in etcd.\nIn case you are interested, you can read more on:\n API Extension API Aggregation APIService Resource Gardener API Resources Since Gardener’s API endpoint is a regular Kubernetes cluster, it would theoretically serve all resources from the Kubernetes core API, including Pods, Deployments, etc. 
However, Gardener implements RBAC rules and disables certain controllers that make these resources inaccessible. Objects like Secrets, Namespaces, and ResourceQuotas are still available, though, as they play a vital role in Gardener.\nIn addition, through Gardener’s extension API server, the API endpoint also serves Gardener’s custom resources like Projects, Shoots, CloudProfiles, Seeds, SecretBindings (those are relevant for users), ControllerRegistrations, ControllerDeployments, BackupBuckets, BackupEntries (those are relevant to an operator), etc.\n","categories":"","description":"","excerpt":"Kubeception Kubeception - Kubernetes in Kubernetes in Kubernetes\nIn …","ref":"/docs/getting-started/architecture/","tags":"","title":"Architecture"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/gardener/concepts/","tags":"","title":"Concepts"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/other-components/dependency-watchdog/concepts/","tags":"","title":"Concepts"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/other-components/machine-controller-manager/documents/","tags":"","title":"Documents"},{"body":"","categories":"","description":"Gardener extension controllers for the supported operating systems","excerpt":"Gardener extension controllers for the supported operating systems","ref":"/docs/extensions/os-extensions/","tags":"","title":"Operating System Extensions"},{"body":"Controlplane as a Service Sometimes, there may be use cases for Kubernetes clusters that don’t require pods but only features of the control plane. Gardener can create the so-called “workerless” shoots, which are exactly that. 
These are Kubernetes clusters without nodes (and without any controllers related to them).\nIn a scenario where you already have multiple clusters, you can use it for orchestration (leases) or factor out components that require many CRDs.\nAs part of the control plane, the following components are deployed in the seed cluster for a workerless shoot:\n etcds kube-apiserver kube-controller-manager gardener-resource-manager Logging and monitoring components Extension components (to find out if they support workerless shoots, see the Extensions documentation) ","categories":"","description":"","excerpt":"Controlplane as a Service Sometimes, there may be use cases for …","ref":"/docs/getting-started/features/workerless-shoots/","tags":"","title":"Workerless Shoots"},{"body":"","categories":"","description":"Make sure that your clusters are compliant and secure","excerpt":"Make sure that your clusters are compliant and secure","ref":"/docs/security-and-compliance/","tags":"","title":"Security and Compliance"},{"body":"Keys There are plenty of keys in Gardener. The ETCD needs one to store resources like secrets encrypted at rest. Gardener generates certificate authorities (CAs) to ensure secured communication between the various components and actors and service account tokens are signed with a dedicated key. There is also an SSH key pair to allow debugging of nodes and the observability stack has its own passwords too.\nAll of these keys share a common property: they are managed by Gardener. Rotating them, however, is potentially very disruptive. Hence, Gardener does not do it automatically, but offers you means to perform these tasks easily. For a single cluster, you may conveniently use the dashboard. 
Of course, it is also possible to do the same by annotating the shoot resource accordingly:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-credentials-start kubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-credentials-complete Where possible, the rotation happens in two phases - Preparing and Completing. The Preparing phase introduces new keys while the old ones are still valid. Users can safely exchange keys / CA bundles wherever they are used. Afterwards, the Completing phase will invalidate the old keys / CA bundles.\nRotation Phases At the beginning, only the old set of credentials exists. By triggering the rotation, new credentials are created in the Preparing phase and both sets are valid. Now, all clients have to update and start using the new credentials. Only afterwards is it safe to trigger the Completing phase, which invalidates the old credentials.\nThe shoot’s status will always show the current status / phase of the rotation.\nFor more information, see Credentials Rotation for Shoot Clusters.\nUser-Provided Credentials You grant Gardener permissions to create resources by handing over cloud provider keys. These keys are stored in a secret and referenced by a shoot via a SecretBinding. Gardener uses the keys to create the network for the cluster resources, routes, VMs, disks, and IP addresses.\nWhen you rotate credentials, the new keys have to be stored in the same secret and the shoot needs to reconcile successfully to ensure the replication to every controller. Afterwards, the old keys can be deleted safely from Gardener’s perspective.\nWhile the reconciliation can be triggered manually, there is no need for it (if you’re not in a hurry). 
Each shoot reconciles once within 24h and the new keys will be picked up during the next maintenance window.\nNote It is not possible to move a shoot to a different infrastructure account (at all!). ","categories":"","description":"","excerpt":"Keys There are plenty of keys in Gardener. The ETCD needs one to store …","ref":"/docs/getting-started/features/credential-rotation/","tags":"","title":"Credential Rotation"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/other-components/dependency-watchdog/deployment/","tags":"","title":"Deployment"},{"body":"Extensibility Overview Initially, everything was developed in-tree in the Gardener project. All cloud providers and the configuration for all the supported operating systems were released together with the Gardener core itself. But as the project grew, it got more and more difficult to add new providers and maintain the existing code base. As a consequence and in order to become agile and flexible again, we proposed GEP-1 (Gardener Enhancement Proposal). The document describes an out-of-tree extension architecture that keeps the Gardener core logic independent of provider-specific knowledge (similar to what Kubernetes has achieved with out-of-tree cloud providers or with CSI volume plugins).\nBasic Concepts Gardener keeps running in the “garden cluster” and implements the core logic of shoot cluster reconciliation / deletion. Extensions are Kubernetes controllers themselves (like Gardener) and run in the seed clusters. As usual, we try to use Kubernetes wherever applicable. We rely on Kubernetes extension concepts in order to enable extensibility for Gardener. The main ideas of GEP-1 are the following:\n During the shoot reconciliation process, Gardener will write CRDs into the seed cluster that are watched and managed by the extension controllers. 
They will reconcile (based on the .spec) and report whether everything went well or errors occurred in the CRD’s .status field.\n Gardener keeps deploying the provider-independent control plane components (etcd, kube-apiserver, etc.). However, some of these components might still need a little customization by providers, e.g., additional configuration, flags, etc. In this case, the extension controllers register webhooks in order to manipulate the manifests.\n Example 1:\nGardener creates a new AWS shoot cluster and requires the preparation of infrastructure in order to proceed (networks, security groups, etc.). It writes the following CRD into the seed cluster:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--core--aws-01 spec: type: aws providerConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 internal: - 10.250.112.0/22 public: - 10.250.96.0/22 workers: - 10.250.0.0/19 zones: - eu-west-1a dns: apiserver: api.aws-01.core.example.com region: eu-west-1 secretRef: name: my-aws-credentials sshPublicKey: | base64(key) Please note that the .spec.providerConfig is a raw blob and not evaluated or known in any way by Gardener. Instead, it was specified by the user (in the Shoot resource) and just “forwarded” to the extension controller. Only the AWS controller understands this configuration and will now start provisioning/reconciling the infrastructure. It reports in the .status field the result:\nstatus: observedGeneration: ... state: ... lastError: .. lastOperation: ... 
providerStatus: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus vpc: id: vpc-1234 subnets: - id: subnet-acbd1234 name: workers zone: eu-west-1 securityGroups: - id: sg-xyz12345 name: workers iam: nodesRoleARN: \u003csome-arn\u003e instanceProfileName: foo ec2: keyName: bar Gardener waits until the .status.lastOperation / .status.lastError indicates that the operation reached a final state and either continues with the next step, or stops and reports the potential error. The extension-specific output in .status.providerStatus is - similar to .spec.providerConfig - not evaluated, and simply forwarded to CRDs in subsequent steps.\nExample 2:\nGardener deploys the control plane components into the seed cluster, e.g. the kube-controller-manager deployment with the following flags:\napiVersion: apps/v1 kind: Deployment ... spec: template: spec: containers: - command: - /usr/local/bin/kube-controller-manager - --allocate-node-cidrs=true - --attach-detach-reconcile-sync-period=1m0s - --controllers=*,bootstrapsigner,tokencleaner - --cluster-cidr=100.96.0.0/11 - --cluster-name=shoot--core--aws-01 - --cluster-signing-cert-file=/srv/kubernetes/ca/ca.crt - --cluster-signing-key-file=/srv/kubernetes/ca/ca.key - --concurrent-deployment-syncs=10 - --concurrent-replicaset-syncs=10 ... The AWS controller requires some additional flags in order to make the cluster functional. It needs to provide a Kubernetes cloud-config and also some cloud-specific flags. 
Consequently, it registers a MutatingWebhookConfiguration on Deployments and adds these flags to the container:\n - --cloud-provider=external - --external-cloud-volume-plugin=aws - --cloud-config=/etc/kubernetes/cloudprovider/cloudprovider.conf Of course, it would have needed to create a ConfigMap containing the cloud config and to add the proper volume and volumeMounts to the manifest as well.\n(Please note for this special example: The Kubernetes community is also working on making the kube-controller-manager provider-independent. However, there will most probably still be components other than the kube-controller-manager which need to be adapted by extensions.)\nIf you are interested in writing an extension, or generally in digging deeper to find out the nitty-gritty details of the extension concepts, please read GEP-1. We are truly looking forward to your feedback!\nCurrent Status Meanwhile, the out-of-tree extension architecture of Gardener is in place and has been productively validated. 
We are tracking all internal and external extensions of Gardener in the Gardener Extensions Library repo.\n","categories":"","description":"","excerpt":"Extensibility Overview Initially, everything was developed in-tree in …","ref":"/docs/gardener/extensions/","tags":"","title":"Extensions"},{"body":"Documentation Index Overview General Architecture Gardener landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Concepts Components Gardener API server In-Tree admission plugins Gardener Controller Manager Gardener Scheduler Gardener Admission Controller Gardener Resource Manager Gardener Operator Gardener Node Agent Gardenlet Backup Restore etcd Relation between Gardener API and Cluster API Usage Audit a Kubernetes cluster Cleanup of Shoot clusters in deletion containerd Registry Configuration Custom containerd configuration Custom CoreDNS configuration (Custom) CSI components Default Seccomp Profile DNS Autoscaling DNS Search Path Optimization Endpoints and Ports of a Shoot Control-Plane ETCD Encryption Config ExposureClasses Hibernate a Cluster IPv6 in Gardener Clusters Logging NodeLocalDNS feature OpenIDConnect presets Projects Service Account Manager Readiness of Shoot Worker Nodes Reversed Cluster VPN Shoot Cluster Purposes Shoot Scheduling Profiles Shoot Credentials Rotation Shoot Kubernetes and Operating System Versioning Shoot KUBERNETES_SERVICE_HOST Environment Variable Injection Shoot Networking Shoot Maintenance Shoot ServiceAccount Configurations Shoot Status Shoot Info ConfigMap Shoot Updates and Upgrades Shoot Auto-Scaling Configuration Shoot Pod Auto-Scaling Best Practices Shoot High-Availability Control Plane Shoot High-Availability Best Practices Shoot Workers Settings Accessing Shoot Clusters Supported Kubernetes versions Tolerations Trigger shoot operations Trusted TLS certificate for shoot control planes Trusted TLS certificate for garden runtime cluster Controlling 
the Kubernetes versions for specific worker pools Admission Configuration for the PodSecurity Admission Plugin Supported CPU Architectures for Shoot Worker Nodes Workerless Shoots API Reference authentication.gardener.cloud API Group core.gardener.cloud API Group extensions.gardener.cloud API Group operations.gardener.cloud API Group resources.gardener.cloud API Group security.gardener.cloud API Group seedmanagement.gardener.cloud API Group settings.gardener.cloud API Group Proposals GEP: Gardener Enhancement Proposal Description GEP: Template GEP-1: Gardener extensibility and extraction of cloud-specific/OS-specific knowledge GEP-2: BackupInfrastructure CRD and Controller Redesign GEP-3: Network extensibility GEP-4: New core.gardener.cloud/v1beta1 APIs required to extract cloud-specific/OS-specific knowledge out of Gardener core GEP-5: Gardener Versioning Policy GEP-6: Integrating etcd-druid with Gardener GEP-7: Shoot Control Plane Migration GEP-8: SNI Passthrough proxy for kube-apiservers GEP-9: Gardener integration test framework GEP-10: Support additional container runtimes GEP-11: Utilize API Server Network Proxy to Invert Seed-to-Shoot Connectivity GEP-12: OIDC Webhook Authenticator GEP-13: Automated Seed Management GEP-14: Reversed Cluster VPN GEP-15: Manage Bastions and SSH Key Pair Rotation GEP-16: Dynamic kubeconfig generation for Shoot clusters GEP-17: Shoot Control Plane Migration “Bad Case” Scenario GEP-18: Automated Shoot CA Rotation GEP-19: Observability Stack - Migrating to the prometheus-operator and fluent-bit operator GEP-20: Highly Available Shoot Control Planes GEP-21: IPv6 Single-Stack Support in Local Gardener GEP-22: Improved Usage of the ShootState API GEP-23: Autoscaling Shoot kube-apiserver via Independently Driven HPA and VPA GEP-24: Shoot OIDC Issuer GEP-25: Namespaced Cloud Profiles GEP-26: Workload Identity - Trust Based Authentication GEP-27: Add Optional Bastion Section To CloudProfile Development Getting started locally (using the 
local provider) Setting up a development environment (using a cloud provider) Testing (Unit, Integration, E2E Tests) Test Machinery Tests Dependency Management Kubernetes Clients in Gardener Logging in Gardener Components Changing the API Secrets Management for Seed and Shoot Clusters Releases, Features, Hotfixes Adding New Cloud Providers Adding Support For A New Kubernetes Version Extending the Monitoring Stack How to create log parser for container into fluent-bit PriorityClasses in Gardener Clusters High Availability Of Deployed Components Checklist For Adding New Components Defaulting Strategy and Developer Guideline Extensions Extensibility overview Extension controller registration Cluster resource Extension points General conventions Trigger for reconcile operations Deploy resources into the shoot cluster Shoot resource customization webhooks Logging and monitoring for extensions Contributing to shoot health status conditions Health Check Library CA Rotation in Extensions Blob storage providers BackupBucket resource BackupEntry resource DNS providers DNSRecord resources IaaS/Cloud providers Control plane customization webhooks Bastion resource ControlPlane resource ControlPlane exposure resource Infrastructure resource Worker resource Network plugin providers Network resource Operating systems OperatingSystemConfig resource Container runtimes ContainerRuntime resource Generic (non-essential) extensions Extension resource Extension Admission Heartbeat controller Provider Local machine-controller-manager-provider-local Access to the Garden Cluster Control plane migration Force Deletion Extending project roles Referenced resources Deployment Getting started locally Getting started locally with extensions Setup Gardener on a Kubernetes cluster Version Skew Policy Deploying Gardenlets Automatic Deployment of Gardenlets Deploy a Gardenlet Manually Scoped API Access for Gardenlets Overwrite image vector Migration from Gardener v0 to v1 Feature Gates in Gardener 
Configuring the Logging stack SecretBinding Provider Controller Operations Gardener configuration and usage Control Plane Migration Istio ManagedSeeds: Register Shoot as Seed NetworkPolicys In Garden, Seed, Shoot Clusters Seed Bootstrapping Seed Settings Topology-Aware Traffic Routing Monitoring Alerting Connectivity Profiling Gardener Components ","categories":"","description":"The core component providing the extension API server of your Kubernetes cluster","excerpt":"The core component providing the extension API server of your …","ref":"/docs/gardener/","tags":"","title":"Gardener"},{"body":"Overview Gardener is all about Kubernetes clusters, which we call shoots. However, Gardener also does user management, delicate permission management and offers technical accounts to integrate its services into other infrastructures. It allows you to create several quotas and it needs credentials to connect to cloud providers. All of these are arranged in multiple fully contained projects, each of which belongs to a dedicated user and / or group.\nProjects on YAML Level Projects are a Kubernetes resource which can be expressed by YAML. The resource specification can be found in the API reference documentation.\nA project’s specification defines a name, a description (which is a free-text field), a purpose (again, a free-text field), an owner, and members. In Gardener, user management is done on a project level. Therefore, projects can have different members with certain roles.\nIn Gardener, a user can have one of five different roles: owner, admin, viewer, UAM, and service account manager. A member with the viewer role can see and list all clusters but cannot create, delete or modify them. For that, a member would need the admin role. Another important role would be the uam role - members with that role are allowed to manage members and technical users for a project. 
The owner of a project is allowed to do all of that, regardless of what other roles might be assigned to them.\nProjects are getting reconciled by Gardener’s project-controller, a component of Gardener’s controller manager. The status of the last reconciliation, along with any potential failures, will be recorded in the project’s status field.\nFor more information, see Projects.\nIn case you are interested, you can also view the source code for:\n The structure of a project API object Reconciling a project Gardener Projects and Kubernetes Namespaces Note Each Gardener project corresponds to a Kubernetes namespace and all project specific resources are placed into it. Even though projects are a dedicated Kubernetes resource, every project also corresponds to a dedicated namespace in the garden cluster. All project resources - including shoots - are placed into this namespace.\nYou can ask Gardener to use a specific namespace name in the project manifest but usually, this field should be left empty. The namespace then gets created automatically by Gardener’s project-controller, with its name getting generated from the project’s name, prefixed by “garden-”.\nResourceQuotas - if any - will be enforced on the project namespace.\nQuotas Since all Gardener resources are custom Kubernetes resources, the usual and well established concept of resourceQuotas in Kubernetes can also be applied to Gardener resources. With a resourceQuota that sets a hard limit on, e.g., count/shoots.core.gardener.cloud, you can restrict the number of shoot clusters that can be created in a project. Infrastructure Secrets For Gardener to create all relevant infrastructure that a shoot cluster needs inside a cloud provider, it needs to know how to authenticate to the cloud provider’s API. 
This is done through regular secrets.\nThrough the Gardener dashboard, secrets can be created for each supported cloud provider (using the dashboard is the preferred way, as it provides interactive help on what information needs to be placed into the secret and how the corresponding user account on the cloud provider should be configured). All of that is stored in a standard, opaque Kubernetes secret.\nInside of a shoot manifest, a reference to that secret is given so that Gardener knows which secret to use for a given shoot. Consequently, different shoots, even though they are in the same project, can be created on multiple different cloud provider accounts. However, instead of referring to the secret directly, Gardener introduces another layer of indirection called a SecretBinding.\nIn the shoot manifest, we refer to a SecretBinding and the SecretBinding in turn refers to the actual secret.\nSecretBindings With SecretBindings, it is possible to reference the same infrastructure secret in different projects across namespaces. This has the following advantages:\n Infrastructure secrets can be kept in one project (and thus namespace) with limited access. Through SecretBindings, the secrets can be used in other projects (and thus namespaces) without being able to read their contents. Infrastructure secrets can be kept at one central place (a dedicated project) and be used by many other projects. This way, if a credential rotation is required, they only need to be changed in the secrets at that central place and not in all projects that reference them. Service Accounts Since Gardener is 100% Kubernetes, it can be easily used in a programmatic way - by just sending the resource manifest of a Gardener resource to its API server. To do so, a kubeconfig file and a (technical) user that the kubeconfig maps to are required.\nNext to project members, a project can have several service accounts - simple Kubernetes service accounts that are created in a project’s namespace. 
Consequently, every service account will also have its own, dedicated kubeconfig and they can be granted different roles through RoleBindings.\nTo integrate Gardener with other infrastructure or CI/CD platforms, one can create a service account, obtain its kubeconfig and then automatically send shoot manifests to the Gardener API server. With that, Kubernetes clusters can be created, modified or deleted on the fly whenever they are needed.\n","categories":"","description":"","excerpt":"Overview Gardener is all about Kubernetes clusters, which we call …","ref":"/docs/getting-started/project/","tags":"","title":"Gardener Projects"},{"body":"","categories":"","description":"Gardener extension controllers for the supported container network interfaces","excerpt":"Gardener extension controllers for the supported container network …","ref":"/docs/extensions/network-extensions/","tags":"","title":"Network Extensions"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/other-components/machine-controller-manager/proposals/","tags":"","title":"Proposals"},{"body":"Overview In this topic you can see various shoot statuses and how you can use them to monitor your shoot cluster.\nShoot Status - Conditions You can retrieve the shoot status by using kubectl get shoot -oyaml\nIt contains conditions, which give you information about the healthiness of your cluster. Those conditions are also forwarded to the Gardener dashboard and show your cluster as healthy or unhealthy.\nShoot Status - Constraints The shoot status also contains constraints. If these constraints are met, your cluster operations are impaired and the cluster is likely to fail at some point. Please watch them and act accordingly.\nShoot Status - Last Operation The lastOperation, lastErrors, and lastMaintenance give you information on what was last happening in your clusters. 
This is especially useful when you are facing an error.\nIn this example, nodes are being recreated and not all machines have reached the desired state yet.\nShoot Status - Credentials Rotation You can also see the status of the last credentials rotation. Here you can also programmatically derive when the last rotation was done in order to trigger the next rotation.\n","categories":"","description":"","excerpt":"Overview In this topic you can see various shoot statuses and how you …","ref":"/docs/getting-started/observability/shoot-status/","tags":"","title":"Shoot Status"},{"body":"","categories":"","description":"The infrastructure, networking, OS and other extension components for Gardener","excerpt":"The infrastructure, networking, OS and other extension components for …","ref":"/docs/extensions/","tags":"","title":"List of Extensions"},{"body":"","categories":"","description":"Gardener extensions for the supported container runtime interfaces","excerpt":"Gardener extensions for the supported container runtime interfaces","ref":"/docs/extensions/container-runtime-extensions/","tags":"","title":"Container Runtime Extensions"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/gardener/deployment/","tags":"","title":"Deployment"},{"body":"External DNS Management When you deploy to Kubernetes, there is no native management of external DNS. Instead, the cloud-controller-manager requests (mostly IPv4) addresses for every service of type LoadBalancer. Of course, the Ingress resource helps here, but how is the external DNS entry for the ingress controller managed?\nEssentially, some sort of automation for DNS management is missing.\nAutomating DNS Management From a user’s perspective, it is desirable to work with already known resources and concepts. 
Hence, the DNS management offered by Gardener plugs seamlessly into Kubernetes resources and you do not need to “leave” the context of the shoot cluster.\nTo request a DNS record creation / update, a Service or Ingress resource is annotated accordingly. The shoot-dns-service extension will (if configured) pick up the request, create a DNSEntry resource, and reconcile it to have an actual DNS record created at a configured DNS provider. Gardener supports the following providers:\n aws-route53 azure-dns azure-private-dns google-clouddns openstack-designate alicloud-dns cloudflare-dns For more information, see DNS Names.\nDNS Provider For the above to work, we need some ingredients. Primarily, this is implemented via a so-called DNSProvider. Every shoot has a default provider that is used to set up the API server’s public DNS record. It can be used to request sub-domains as well.\nIn addition, a shoot can reference credentials to a DNS provider. Those can be used to manage custom domains.\nPlease have a look at the documentation for further details.\n","categories":"","description":"","excerpt":"External DNS Management When you deploy to Kubernetes, there is no …","ref":"/docs/getting-started/features/dns-management/","tags":"","title":"External DNS Management"},{"body":"Overview A Kubernetes cluster consists of a control plane and a data plane. The data plane runs the actual containers on worker nodes (which translate to physical or virtual machines). For the control and data plane to work together properly, lots of components need matching configuration.\nSome configurations are standardized but some are also very specific to the needs of a cluster’s user / workload. Ideally, you want a properly configured cluster with the possibility to fine-tune some settings.\nConcept of a “Shoot” In Gardener, Kubernetes clusters (with their control plane and their data plane) are called shoot clusters or simply shoots. 
For Gardener, a shoot is just another Kubernetes resource. Gardener components watch it and act upon changes (e.g., creation). It comes with reasonable default settings but also allows fine-tuned configuration. And on top of it, you get a status providing health information, information about ongoing operations, and so on.\nLuckily there is a dashboard to get started.\nBasic Configuration Options Every cluster needs a name - after all, it is a Kubernetes resource and therefore unique within a namespace.\nThe Kubernetes version will be used as a starting point. Once a newer version is available, you can always update your existing clusters (but not downgrade, as this is not supported by Kubernetes in general).\nThe “purpose” affects some configuration (like automatic deployment of a monitoring stack or setting up certain alerting rules) and generally indicates the importance of a cluster.\nStart by selecting the infrastructure you want to use. The choice will be mapped to a cloud profile that contains provider specific information like the available (actual) OS images, zones and regions or machine types.\nEach data plane runs in an infrastructure account owned by the end user. By selecting the infrastructure secret containing the accounts credentials, you are granting Gardener access to the respective account to create / manage resources.\nNote Changing the account after the creation of a cluster is not possible. The credentials can be updated with a new key or even user but have to stay within the same account.\nCurrently, there is no way to move a single cluster to a different account. You would rather have to re-create a cluster and migrate workloads by different means.\n As part of the infrastructure you chose, the region for data plane has to be chosen as well. The Gardener scheduler will try to place the control plane on a seed cluster based on a minimal distance strategy. 
See Gardener Scheduler for more details.\nUp next, the networking provider (CNI) for the cluster has to be selected. At the point of writing, it is possible to choose between Calico and Cilium. If not specified in the shoot’s manifest, default CIDR ranges for nodes, services, and pods will be used.\nIn order to run any workloads in your cluster, you need nodes. The worker section lets you specify the most important configuration options. For beginners, the machine type is probably the most relevant field, together with the machine image (operating system).\nThe machine type is provider-specific and configured in the cloud profile. Check your respective cloud profile if you’re missing a machine type. Maybe it is available in general but unavailable in your selected region.\nThe operating system your machines will run is the next thing to choose. Debian-based GardenLinux is the best choice for most use cases.\nOther specifications for the workers include the volume type and size. These settings affect the root disk of each node. Therefore we would always recommend to use an SSD-based type to avoid i/o issues.\nCaveat Some machine types (e.g., bare-metal machine types on OpenStack) require you to omit the volume type and volume size settings. The autoscaler parameter defines the initial elasticity / scalability of your cluster. The cluster-autoscaler will add more nodes up to the maximum defined here when your workload grows and remove nodes in case your workload shrinks. The minimum number of nodes should be equal to or higher than the number of zones. You can distribute the nodes of a worker pool among all zones available to your cluster. This is the first step in running HA workloads.\nOnce per day, all clusters reconcile. This means all controllers will check if there are any updates they have to apply (e.g., new image version for ETCD). The maintenance window defines when this daily operation will be triggered. 
It is important to understand that there is no opt-out for reconciliation.\nIt is also possible to confine updates to the shoot spec to be applied only during this time. This can come in handy when you want to bundle changes or prevent changes to be applied outside a well-known time window.\nYou can allow Gardener to automatically update your cluster’s Kubernetes patch version and/or OS version (of the nodes). Take this decision consciously! Whenever a new Kubernetes patch version or OS version is set to supported in the respective cloud profile, auto update will upgrade your cluster during the next maintenance window. If you fail to (manually) upgrade the Kubernetes or OS version before they expire, force-upgrades will take place during the maintenance window.\nResult The result of your provided inputs and a set of conscious default values is a shoot resource that, once applied, will be acted upon by various Gardener components. The status section represents the intermediate steps / results of these operations. A typical shoot creation flow would look like this:\n Assign control plane to a seed. Create infrastructure resources in the data plane account (e.g., VPC, gateways, …) Deploy control plane incl. DNS records. Create nodes (VMs) and bootstrap kubelets. Deploy kube-system components to nodes. How to Access a Shoot Static credentials for shoots were discontinued in Gardener with Kubernetes v1.27. Short lived credentials need to be used instead. You can create/request tokens directly via Gardener or delegate authentication to an identity provider.\nA short-lived admin kubeconfig can be requested by using kubectl. If this is something you do frequently, consider switching to gardenlogin, which helps you with it.\nAn alternative is to use an identity provider and issue OIDC tokens.\nWhat can you configure? With the basic configuration options having been introduced, it is time to discuss more possibilities. 
Gardener offers a variety of options to tweak the control plane’s behavior - like defining an event TTL (default 1h), adding an OIDC configuration or activating some feature gates. You could alter the scheduling profile and define an audit logging policy. In addition, the control plane can be configured to run in HA mode (applied on a node or zone level), but keep in mind that once you enable HA, you cannot go back.\nIn case you have specific requirements for the cluster internal DNS, Gardener offers a plugin mechanism for custom core DNS rules or optimization with node-local DNS. For more information, see Custom DNS Configuration and NodeLocalDNS Configuration.\nAnother category of configuration options is dedicated to the nodes and the infrastructure they are running on. Every provider has their own perks and some of them are exposed. Check the detailed documentation of the relevant extension for your infrastructure provider.\nYou can fine-tune the cluster-autoscaler or help the kubelet to cope better with your workload.\nWorker Pools There are a couple of ways to configure a worker pool. One of them is to set everything in the Gardener dashboard. However, only a subset of options is presented there.\nA slightly more complex way is to set the configuration through the yaml file itself.\nThis allows you to configure much more properties of a worker pool, like the timeout after which an unhealthy machine is getting replaced. For more options, see the Worker API reference.\nHow to Change Things Since a shoot is just another Kubernetes resource, changes can be applied via kubectl. For convenience, the basic settings are configurable via the dashboard’s UI. It also has a “yaml” tab where you can alter all of the shoot’s specification in your browser. 
Once applied, the cluster will reconcile eventually and your changes become active (or cause an error).\nImmutability in a Shoot While Gardener allows you to modify existing shoot clusters, it is important to remember that not all properties of a shoot can be changed after it is created.\nFor example, it is not possible to move a shoot to a different infrastructure account. This is mainly rooted in the fact that disks and network resources are bound to your account.\nAnother set of options that become immutable are most of the network aspects of a cluster. On an infrastructure level the VPC cannot be changed and on a cluster level things like the pod / service CIDR ranges, together with the nodeCIDRmask, are set for the lifetime of the cluster.\nSome other things can be changed, but not reverted. While it is possible to add more zones to a cluster on an infrastructure level (assuming that an appropriate CIDR range is available), removing zones is not supported. Similarly, upgrading Kubernetes versions is comparable to a one-way ticket. As of now, Kubernetes does not support downgrading. Lastly, the HA setting of the control plane is immutable once specified.\nCrazy Botany Since remembering all these options can be quite challenging, here is a very helpful resource - an example shoot with all the latest options 🎉\n","categories":"","description":"","excerpt":"Overview A Kubernetes cluster consists of a control plane and a data …","ref":"/docs/getting-started/shoots/","tags":"","title":"Gardener Shoots"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/other-components/machine-controller-manager/todo/","tags":"","title":"ToDo"},{"body":"","categories":"","description":"Other Gardener extensions","excerpt":"Other Gardener extensions","ref":"/docs/extensions/others/","tags":"","title":"Others"},{"body":"Certificate Management For proper consumption, any service should present a TLS certificate to its consumers. 
However, self-signed certificates are not fit for this purpose - the certificate should be signed by a CA trusted by an application’s userbase. Luckily, Issuers like Let’s Encrypt and others help here by offering a signing service that issues certificates based on the ACME challenge (Automatic Certificate Management Environment).\nThere are plenty of tools you can use to perform the challenge. For Kubernetes, cert-manager certainly is the most common, however its configuration is rather cumbersome and error prone. So let’s see how a Gardener extension can help here.\nManage Certificates with Gardener You may annotate a Service or Ingress resource to trigger the cert-manager to request a certificate from any configured issuer (e.g. Let’s Encrypt) and perform the challenge. A Gardener operator can add a default issuer for convenience. With the DNS extension discussed previously, setting up the DNS TXT record for the ACME challenge is fairly easy. The requested certificate can be customized by means of several other annotations known to the controller. Most notably, it is possible to specify SANs via cert.gardener.cloud/dnsnames to accommodate domain names that have more than 64 characters (the limit for the CN field).\nThe user’s request for a certificate manifests as a certificate resource. The status, issuer, and other properties can be checked there.\nOnce successful, the resulting certificate will be stored in a secret and is ready for usage.\nWith additional configuration, it is also possible to define custom issuers of certificates.\nFor more information, see the Manage certificates with Gardener for public domain topic and the cert-management repository.\n","categories":"","description":"","excerpt":"Certificate Management For proper consumption, any service should …","ref":"/docs/getting-started/features/certificate-management/","tags":"","title":"Certificate Management"},{"body":"Overview A cluster has a data plane and a control plane. 
The data plane is like a space station. It has certain components which keep everyone / everything alive and can operate autonomously to a certain extent. However, without mission control (and the occasional delivery of supplies) it cannot share information or receive new instructions.\nSo let’s see what the mission control (control plane) of a Kubernetes cluster looks like.\nKubeception Kubeception - Kubernetes in Kubernetes in Kubernetes\nIn the classic setup, there is a dedicated host / VM to host the master components / control plane of a Kubernetes cluster. However, these are just normal programs that can easily be put into containers. Once in containers, we can make Kubernetes Deployments and StatefulSets (for the etcd) watch over them. And now we put all that into a separate, dedicated Kubernetes cluster - et voilà, we have Kubernetes in Kubernetes, aka Kubeception (named after the famous movie Inception with Leonardo DiCaprio).\nIn Gardener’s terminology, the cluster hosting the control plane components is called a seed cluster. The cluster that end users actually use (and whose control plane is hosted in the seed) is called a shoot cluster.\nControl Plane Components on the Seed All control-plane components of a shoot cluster run in a dedicated namespace on the seed.\nA control plane has lots of components:\n Everything needed to run vanilla Kubernetes etcd main \u0026 events (split for performance reasons) Kube-.*-manager CSI driver Additionally, we deploy components needed to manage the cluster:\n Gardener Resource Manager (GRM) Machine Controller Manager (MCM) DNS Management VPN There is also a set of components making our life easier (logging, monitoring) or adding additional features (cert manager).\nCore Components Let’s take a close look at the API server as well as etcd.\nSecrets are encrypted at rest. When asking etcd for the data, the reply is still encrypted. 
Decryption is done by the API server which knows the necessary key.\nFor non-HA clusters etcd has only 1 replica, while for HA clusters there are 3 replicas.\nOne special remark is needed for Gardener’s deployment of etcd. The pods coming from the etcd-main StatefulSet contain two containers - one runs etcd, the other runs a program that periodically backs up etcd’s contents to an object store that is set up per seed cluster to make sure no data is lost. After all, etcd is the Achilles heel of all Kubernetes clusters. The backup container is also capable of performing a restore from the object store as well as defragment and compact the etcd datastore. For performance reasons, Gardener stores Kubernetes events in a separate etcd instance. By default, events are retained for 1h but can be kept longer if defined in the shoot.spec.\nThe kube API server (often called “kapi”) scales both horizontally and vertically.\nThe kube API server is not directly exposed / reachable via its public hostname. Instead, Gardener runs a single LoadBalancer service backed by an istio gateway / envoy, which uses SNI to forward traffic.\nThe kube-controller-manager (aka KCM) is the component that contains all the controllers for the core Kubernetes objects such as Deployments, Services, PVCs, etc.\nThe Kubernetes scheduler will assign pods to nodes.\nThe Cloud Controller Manager (aka CCM) is the component that contains all functionality to talk to Cloud environments (e.g., create LoadBalancer services).\nThe CSI driver is the storage subsystem of Kubernetes. It provisions and manages anything related to persistence.\nWithout the cluster autoscaler, nodes could not be added or removed based on current pressure on the cluster resources. 
Without the VPA, pods would have fixed resource limits that could not change on demand.\nGardener-Specific Components Shoot DNS service: External DNS management for resources within the cluster.\nMachine Controller Manager: Responsible for managing VMs which will become nodes in the cluster.\nVirtual Private Network deployments (aka VPN): Almost every communication between Kubernetes controllers and the API server is unidirectional - the controllers are given a kubeconfig and will establish a connection to the API server, which is exposed to all nodes of the cluster through a LoadBalancer. However, there are a few operations that require the API server to connect to the kubelet instead (e.g., for every webhook, when using kubectl exec or kubectl logs). Since every good Kubernetes cluster will have its worker nodes shielded behind firewalls to reduce the attack surface, Gardener establishes a VPN connection from the shoot’s internal network to the API server in the seed. For that, every shoot, as well as every control plane namespace in the seed, have openVPN pods in them that connect to each other (with the connection being established from the shoot to the seed).\nGardener Resource Manager: Tooling to deploy and manage Kubernetes resources required for cluster functionality.\nMachines Machine Controller Manager (aka MCM):\nThe machine controller manager, which lives on the seed in a shoot’s control plane namespace, is the key component responsible for provisioning and removing worker nodes for a Kubernetes cluster. It acts on MachineClass, MachineDeployment, and MachineSet resources in the seed (think of them as the equivalent of Deployments and ReplicaSets) and controls the lifecycle of machine objects. 
Through a system of plugins, the MCM is the component that phones to the cloud provider’s API and bootstraps virtual machines.\nFor more information, see MCM and Cluster-autoscaler.\nManagedResources Gardener Resource Manager (aka GRM):\nGardener not only deploys components into the control plane namespace of the seed but also to the shoot (e.g., the counterpart of the VPN). Together with the components in the seed, Gardener needs to have a way to reconcile them.\nEnter the GRM - it reconciles on ManagedResources objects, which are descriptions of Kubernetes resources which are deployed into the seed or shoot by GRM. If any of these resources are modified or deleted by accident, the usual observe-analyze-act cycle will revert these potentially malicious changes back to the values that Gardener envisioned. In fact, all the components found in a shoot’s kube-system namespace are ManagedResources governed by the GRM. The actual resource definition is contained in secrets (as they may contain “secret” data), while the ManagedResources contain a reference to the secret containing the actual resource to be deployed and reconciled.\nDNS Records - “Internal” and “External” The internal domain name is used by all Gardener components to talk to the API server. Even though it is called “internal”, it is still publicly routable.\nBut most importantly, it is pre-defined and not configurable by the end user.\nTherefore, the “external” domain name exists. It is either a user owned domain or can be pre-defined for a Gardener landscape. It is used by any end user accessing the cluster’s API server.\nFor more information, see Contract: DNSRecord Resources.\nFeatures and Observability Gardener runs various health checks to ensure that the cluster works properly. 
The Network Problem Detector gives information about connectivity within the cluster and to the API server.\nCertificate Management: allows to request certificates via the ACME protocol (e.g., issued by Let’s Encrypt) from within the cluster. For detailed information, have a look at the cert-manager project.\nObservability stack: Gardener deploys observability components and gathers logs and metrics for the control-plane \u0026 kube-system namespace. Also provided out-of-the-box is a UI based on Plutono (fork of Grafana) with pre-defined dashboards to access and query the monitoring data. For more information, see Observability.\nHA Control Plane As the title indicates, the HA control plane feature is only about the control plane. Setting up the data plane to span multiple zones is part of the worker spec of a shoot.\nHA control planes can be configured as part of the shoot’s spec. The available types are:\n Node Zone Both work similarly and just differ in the failure domain the concepts are applied to.\nFor detailed guidance and more information, see the High Availability Guides.\nZonal HA Control Planes Zonal HA is the most likely setup for shoots with purpose: production.\nThe starting point is a regular (non-HA) control plane. etcd and most controllers are singletons and the kube-apiserver might have been scaled up to several replicas.\nTo get to an HA setup we need:\n A minimum of 3 replicas of the API server 3 replicas for etcd (both main and events) A second instance for each controller (e.g., controller manager, csi-driver, scheduler, etc.) that can take over in case of failure (active / passive). To distribute those pods across zones, well-known concepts like PodTopologySpreadConstraints or Affinities are applied.\nkube-system Namespace For a fully functional cluster, a few components need to run on the data plane side of the diagram. They all exist in the kube-system namespace. 
Let’s have a closer look at them.\nNetworking On each node we need a CNI (container network interface) plugin. Gardener offers Calico or Cilium as network provider for a shoot. When using Calico, a kube-proxy is deployed. Cilium does not need a kube-proxy, as it takes care of its tasks as well.\nThe CNI plugin ensures pod-to-pod communication within the cluster. As part of it, it assigns cluster-internal IP addresses to the pods and manages the network devices associated with them. When an overlay network is enabled, calico will also manage the routing of pod traffic between different nodes.\nOn the other hand, kube-proxy implements the actual service routing (cilium can do this as well and no kube-proxy is needed). Whenever packets go to a service’s IP address, they are re-routed based on IPtables rules maintained by kube-proxy to reach the actual pods backing the service. kube-proxy operates on endpoint-slices and manages IPtables on EVERY node. In addition, kube-proxy provides a health check endpoint for services with externalTrafficPolicy=local, where traffic only gets to nodes that run a pod matching the selector of the service.\nThe egress filter implements basic filtering of outgoing traffic to be compliant with SAP’s policies.\nAnd what happens if the pods crashloop, are missing or otherwise broken?\nWell, in case kube-proxy is broken, service traffic will degrade over time (depending on the pod churn rate and how many kube-proxy pods are broken).\nWhen calico is failing on a node, no new pods can start there as they don’t get any IP address assigned. It might also fail to add routes to newly added nodes. Depending on the error, deleting the pod might help.\nDNS System For a normal service in Kubernetes, a cluster-internal DNS record that resolves to the service’s ClusterIP address is being created. In Gardener (similar to most other Kubernetes offerings) CoreDNS takes care of this aspect. 
To reduce the load when it comes to upstream DNS queries, Gardener deploys a DNS cache to each node by default. It will also forward queries outside the cluster’s search domain directly to the upstream DNS server. For more information, see NodeLocalDNS Configuration and DNS autoscaling.\nIn addition to this optimization, Gardener allows custom DNS configuration to be added to CoreDNS via a dedicated ConfigMap.\nIn case this customization is related to non-Kubernetes entities, you may configure the shoot’s NodeLocalDNS to forward to CoreDNS instead of upstream (disableForwardToUpstreamDNS: true).\nA broken DNS system on any level will cause disruption / service degradation for applications within the cluster.\nHealth Checks and Metrics Gardener deploys probes checking the health of individual nodes. In a similar fashion, a network health check probes connectivity within the cluster (node to node, pod to pod, pod to api-server, …).\nThey provide the data foundation for Gardener’s monitoring stack together with the metrics collecting / exporting components.\nConnectivity Components From the perspective of the data plane, the shoot’s API server is reachable via the cluster-internal service kubernetes.default.svc.cluster.local. The apiserver-proxy intercepts connections to this destination and changes it so that the traffic is forwarded to the kube-apiserver service in the seed cluster. For more information, see kube-apiserver via apiserver-proxy.\nThe second component here is the VPN shoot. It initiates a VPN connection to its counterpart in the seed. This way, there is no open port / Loadbalancer needed on the data plane. The VPN connection is used for any traffic flowing from the control plane to the data plane. If the VPN connection is broken, port-forwarding or log querying with kubectl will not work. In addition, webhooks will stop functioning properly.\ncsi-driver The last component to mention here is the csi-driver that is deployed as a Daemonset to all nodes. 
It registers with the kubelet and takes care of the mounting of volume types it is responsible for.\n","categories":"","description":"","excerpt":"Overview A cluster has a data plane and a control plane. The data …","ref":"/docs/getting-started/ca-components/","tags":"","title":"Control Plane Components"},{"body":"Frequently Asked Questions The answers in this FAQ apply to the newest (HEAD) version of Machine Controller Manager. If you’re using an older version of MCM please refer to corresponding version of this document. A few of the answers assume that MCM is being used in conjunction with cluster-autoscaler:\nTable of Contents: Basics\n What is Machine Controller Manager? Why is my machine deleted? What are the different sub-controllers in MCM? What is Safety Controller in MCM? How to?\n How to install MCM in a Kubernetes cluster? How to better control the rollout process of the worker nodes? How to scale down MachineDeployment by selective deletion of machines? How to force delete a machine? How to pause the ongoing rolling-update of the machinedeployment? How to delete machine object immediately if I don’t have access to it? How to avoid garbage collection of your node? How to trigger rolling update of a machinedeployment? Internals\n What is the high level design of MCM? What are the different configuration options in MCM? What are the different timeouts/configurations in a machine’s lifecycle? How is the drain of a machine implemented? How are the stateful applications drained during machine deletion? How does maxEvictRetries configuration work with drainTimeout configuration? What are the different phases of a machine? What health checks are performed on a machine? How does rate limiting replacement of machine work in MCM? How is it related to meltdown protection? How MCM responds when scale-out/scale-in is done during rolling update of a machinedeployment? How some unhealthy machines are drained quickly? 
How does MCM prioritize the machines for deletion on scale-down of machinedeployment? Troubleshooting\n My machine is stuck in deletion for 1 hr, why? My machine is not joining the cluster, why? Developer\n How should I test my code before submitting a PR? I need to change the APIs, what are the recommended steps? How can I update the dependencies of MCM? In the context of Gardener\n How can I configure MCM using Shoot resource? How is my worker-pool spread across zones? Basics What is Machine Controller Manager? Machine Controller Manager aka MCM is a bunch of controllers used for the lifecycle management of the worker machines. It reconciles a set of CRDs such as Machine, MachineSet, MachineDeployment which depicts the functionality of Pod, Replicaset, Deployment of the core Kubernetes respectively. Read more about it at README.\n Gardener uses MCM to manage its Kubernetes nodes of the shoot cluster. However, by design, MCM can be used independent of Gardener. Why is my machine deleted? A machine is deleted by MCM generally for 2 reasons-\n Machine is unhealthy for at least MachineHealthTimeout period. The default MachineHealthTimeout is 10 minutes.\n By default, a machine is considered unhealthy if any of the following node conditions - DiskPressure, KernelDeadlock, FileSystem, Readonly is set to true, or KubeletReady is set to false. However, this is something that is configurable using the following flag. Machine is scaled down by the MachineDeployment resource.\n This is very usual when an external controller cluster-autoscaler (aka CA) is used with MCM. CA deletes the under-utilized machines by scaling down the MachineDeployment. Read more about cluster-autoscaler’s scale down behavior here. What are the different sub-controllers in MCM? MCM mainly contains the following sub-controllers:\n MachineDeployment Controller: Responsible for reconciling the MachineDeployment objects. It manages the lifecycle of the MachineSet objects. 
MachineSet Controller: Responsible for reconciling the MachineSet objects. It manages the lifecycle of the Machine objects. Machine Controller: responsible for reconciling the Machine objects. It manages the lifecycle of the actual VMs/machines created in cloud/on-prem. This controller has been moved out of tree. Please refer an AWS machine controller for more info - link. Safety-controller: Responsible for handling the unidentified/unknown behaviors from the cloud providers. Please read more about its functionality below. What is Safety Controller in MCM? Safety Controller contains following functions:\n Orphan VM handler: It lists all the VMs in the cloud matching the tag of given cluster name and maps the VMs with the machine objects using the ProviderID field. VMs without any backing machine objects are logged and deleted after confirmation. This handler runs every 30 minutes and is configurable via machine-safety-orphan-vms-period flag. Freeze mechanism: Safety Controller freezes the MachineDeployment and MachineSet controller if the number of machine objects goes beyond a certain threshold on top of Spec.Replicas. It can be configured by the flag –safety-up or –safety-down and also machine-safety-overshooting-period. Safety Controller freezes the functionality of the MCM if either of the target-apiserver or the control-apiserver is not reachable. Safety Controller unfreezes the MCM automatically once situation is resolved to normal. A freeze label is applied on MachineDeployment/MachineSet to enforce the freeze condition. How to? How to install MCM in a Kubernetes cluster? MCM can be installed in a cluster with following steps:\n Apply all the CRDs from here\n Apply all the deployment, role-related objects from here.\n Control cluster is the one where the machine-* objects are stored. Target cluster is where all the node objects are registered. How to better control the rollout process of the worker nodes? 
MCM allows configuring the rollout of the worker machines using maxSurge and maxUnavailable fields. These fields are applicable only during the rollout process and mean nothing in general scale up/down scenarios. The overall process is very similar to how the Deployment Controller manages pods during RollingUpdate.\n maxSurge refers to the number of additional machines that can be added on top of the Spec.Replicas of MachineDeployment during rollout process. maxUnavailable refers to the number of machines that can be deleted from Spec.Replicas field of the MachineDeployment during rollout process. How to scale down MachineDeployment by selective deletion of machines? During scale down, triggered via MachineDeployment/MachineSet, MCM prefers to delete the machine/s which have the least priority set. Each machine object has an annotation machinepriority.machine.sapcloud.io set to 3 by default. Admin can reduce the priority of the given machines by changing the annotation value to 1. The next scale down by MachineDeployment shall delete the machines with the least priority first.\nHow to force delete a machine? A machine can be force deleted by adding the label force-deletion: \"True\" on the machine object before executing the actual delete command. During force deletion, MCM skips the drain function and simply triggers the deletion of the machine. This label should be used with caution as it can violate the PDBs for pods running on the machine.\nHow to pause the ongoing rolling-update of the machinedeployment? An ongoing rolling-update of the machine-deployment can be paused by using spec.paused field. See the example below:\napiVersion: machine.sapcloud.io/v1alpha1 kind: MachineDeployment metadata: name: test-machine-deployment spec: paused: true It can be unpaused again by removing the Paused field from the machine-deployment.\nHow to delete machine object immediately if I don’t have access to it? 
If the user doesn’t have access to the machine objects (like in case of Gardener clusters) and they would like to replace a node immediately then they can place the annotation node.machine.sapcloud.io/trigger-deletion-by-mcm: \"true\" on their node. This will start the replacement of the machine with a new node.\nOn the other hand if the user deletes the node object immediately then replacement will start only after MachineHealthTimeout.\nThis annotation can also be used if the user wants to expedite the replacement of unhealthy nodes\nNOTE:\n node.machine.sapcloud.io/trigger-deletion-by-mcm: \"false\" annotation is NOT acted upon by MCM, neither does it mean that MCM will not replace this machine. This annotation would delete the desired machine but another machine would be created to maintain desired replicas specified for the machineDeployment/machineSet. Currently if the user doesn’t have access to machineDeployment/machineSet then they cannot remove a machine without replacement. How to avoid garbage collection of your node? MCM provides an in-built safety mechanism to garbage collect VMs which have no corresponding machine object. This is done to save costs and is one of the key features of MCM. However, sometimes users might like to add nodes directly to the cluster without the help of MCM and would prefer MCM to not garbage collect such VMs. To do so they should remove/not-use tags on their VMs containing the following strings:\n kubernetes.io/cluster/ kubernetes.io/role/ kubernetes-io-cluster- kubernetes-io-role- How to trigger rolling update of a machinedeployment? Rolling update can be triggered for a machineDeployment by updating one of the following:\n .spec.template.annotations .spec.template.spec.class.name Internals What is the high level design of MCM? Please refer to the following document.\nWhat are the different configuration options in MCM? MCM allows configuring many knobs to fine-tune its behavior according to the user’s need. 
Please refer to the link to check the exact configuration options.\nWhat are the different timeouts/configurations in a machine’s lifecycle? A machine’s lifecycle is mainly governed by the following timeouts, which can be configured here.\n MachineDrainTimeout: Amount of time after which drain times out and the machine is force deleted. Default ~2 hours. MachineHealthTimeout: Amount of time after which an unhealthy machine is declared Failed and the machine is replaced by MachineSet controller. MachineCreationTimeout: Amount of time after which a machine creation is declared Failed and the machine is replaced by the MachineSet controller. NodeConditions: List of node conditions which if set to true for MachineHealthTimeout period, the machine is declared Failed and replaced by MachineSet controller. MaxEvictRetries: An integer number depicting the number of times a failed eviction should be retried on a pod during drain process. A pod is deleted after max-retries. How is the drain of a machine implemented? MCM imports the functionality from the upstream Kubernetes-drain library, although a few parts have been modified to make it work best in the context of MCM. Drain is executed before machine deletion for graceful migration of the applications. Drain internally uses the EvictionAPI to evict the pods and triggers the Deletion of pods after MachineDrainTimeout. Please note:\n Stateless pods are evicted in parallel. Stateful applications (with PVCs) are serially evicted. Please find more info in this answer below. How are the stateful applications drained during machine deletion? Drain function serially evicts the stateful-pods. It is observed that serial eviction of stateful pods yields better overall availability of pods as the underlying cloud in most cases detaches and reattaches disks serially anyways. It is implemented in the following manner:\n Drain lists all the pods with attached volumes. 
It evicts the very first stateful-pod and waits for its related entry in Node object’s .status.volumesAttached to be removed by KCM. It does the same for all the stateful-pods. It waits for PvDetachTimeout (default 2 minutes) for a given pod’s PVC to be removed, else moves forward. How does maxEvictRetries configuration work with drainTimeout configuration? It is recommended to only set MachineDrainTimeout. It satisfies the related requirements. MaxEvictRetries is auto-calculated based on MachineDrainTimeout, if maxEvictRetries is not provided. Following will be the overall behavior of both configurations together:\n If maxEvictRetries isn’t set and only maxDrainTimeout is set: MCM auto calculates the maxEvictRetries based on the drainTimeout. If drainTimeout isn’t set and only maxEvictRetries is set: Default drainTimeout and user provided maxEvictRetries for each pod is considered. If both maxEvictRetries and drainTimeout are set: Then both will be respected. If none are set: Defaults are respected. What are the different phases of a machine? A phase of a machine can be identified with Machine.Status.CurrentStatus.Phase. Following are the possible phases of a machine object:\n Pending: Machine creation call has succeeded. MCM is waiting for machine to join the cluster.\n CrashLoopBackOff: Machine creation call has failed. MCM will retry the operation after a minor delay.\n Running: Machine creation call has succeeded. Machine has joined the cluster successfully and corresponding node doesn’t have node.gardener.cloud/critical-components-not-ready taint.\n Unknown: Machine health checks are failing, e.g. kubelet has stopped posting the status.\n Failed: Machine health checks have failed for a prolonged time. Hence it is declared failed by Machine controller in a rate limited fashion. Failed machines get replaced immediately.\n Terminating: Machine is being terminated. Terminating state is set immediately when the deletion is triggered for the machine object. 
It also includes time when it’s being drained.\n NOTE: No phase means the machine is being created on the cloud-provider.\nBelow is a simple phase transition diagram: What health checks are performed on a machine? Health checks performed on a machine are:\n Existence of corresponding node obj Status of certain user-configurable node conditions. These conditions can be specified using the flag --node-conditions for OOT MCM provider or can be specified per machine object. The default user configurable node conditions can be found here True status of NodeReady condition. This condition shows kubelet’s status If any of the above checks fails, the machine turns to Unknown phase.\nHow does rate limiting replacement of machine work in MCM? How is it related to meltdown protection? Currently MCM replaces only 1 Unknown machine at a time per machinedeployment. This means until the particular Unknown machine gets terminated and its replacement joins, no other Unknown machine would be removed.\nThe above is achieved by enabling Machine controller to turn machine from Unknown -\u003e Failed only if the above condition is met. MachineSet controller on the other hand marks Failed machine as Terminating immediately.\nOne reason for this rate limited replacement was to ensure that in case of network failures, where node’s kubelet can’t reach out to kube-apiserver, all nodes are not removed together i.e. meltdown protection. In gardener context however, DWD is deployed to deal with this scenario, but to stay protected from corner cases, this mechanism has been introduced in MCM.\nNOTE: Rate limiting replacement is not yet configurable\nHow MCM responds when scale-out/scale-in is done during rolling update of a machinedeployment? Machinedeployment controller executes the logic of scaling BEFORE logic of rollout. It identifies scaling by comparing the deployment.kubernetes.io/desired-replicas of each machineset under the machinedeployment with machinedeployment’s .spec.replicas. 
If the difference is found for any machineSet, a scaling event is detected.\nCase scale-out -\u003e ONLY New machineSet is scaled out Case scale-in -\u003e ALL machineSets(new or old) are scaled in, in proportion to their replica count, any leftover is adjusted in the largest machineSet.\nDuring update for scaling event, a machineSet is updated if any of the below is true for it:\n .spec.Replicas needs update deployment.kubernetes.io/desired-replicas needs update Once scaling is achieved, rollout continues.\nHow does MCM prioritize the machines for deletion on scale-down of machinedeployment? There could be many machines under a machinedeployment with different phases, creationTimestamp. When a scale down is triggered, MCM decides to remove the machine using the following logic:\n Machine with least value of machinepriority.machine.sapcloud.io annotation is picked up. If all machines have equal priorities, then following precedence is followed: Terminating \u003e Failed \u003e CrashloopBackoff \u003e Unknown \u003e Pending \u003e Available \u003e Running If still there is no match, the machine with oldest creation time (i.e. creationTimestamp) is picked up. How are some unhealthy machines drained quickly? If a node is unhealthy for more than the machine-health-timeout specified for the machine-controller, the controller health-check moves the machine phase to Failed. By default, the machine-health-timeout is 10 minutes.\nFailed machines have their deletion timestamp set and the machine then moves to the Terminating phase. The node drain process is initiated. The drain process is invoked either gracefully or forcefully.\nThe usual drain process is graceful. Pods are evicted from the node and the drain process waits until any existing attached volumes are mounted on the new node. However, if the node Ready is False or the ReadonlyFilesystem is True for greater than 5 minutes (non-configurable), then a forceful drain is initiated. 
In a forceful drain, pods are deleted and VolumeAttachment objects associated with the old node are also marked for deletion. This is followed by the deletion of the cloud provider VM associated with the Machine and then finally ending with the Node object deletion.\nDuring the deletion of the VM we only delete the local data disks and boot disks associated with the VM. The disks associated with persistent volumes are left un-touched as their attach/detach, mount/unmount processes are handled by k8s attach-detach controller in conjunction with the CSI driver.\nTroubleshooting My machine is stuck in deletion for 1 hr, why? In most cases, the Machine.Status.LastOperation provides information around why a machine can’t be deleted. The following could be the reasons, among others:\n Pod/s with misconfigured PDBs block the drain operation. PDBs with maxUnavailable set to 0 don’t allow the eviction of the pods. Hence, drain/eviction is retried till MachineDrainTimeout. Default MachineDrainTimeout could be as large as ~2 hours. Hence, blocking the machine deletion. Short term: User can manually delete the pod in question, with caution. Long term: Please set more appropriate PDBs which allow disruption of at least one pod. Expired cloud credentials can block the deletion of the machine from infrastructure. Cloud provider can’t delete the machine due to internal errors. Such situations are best debugged by using cloud provider specific CLI or cloud console. My machine is not joining the cluster, why? In most cases, the Machine.Status.LastOperation provides information around why a machine can’t be created. It could possibly be debugged with the following steps:\n Firstly make sure all the relevant controllers like kube-controller-manager, cloud-controller-manager are running. Verify if the machine is actually created in the cloud. User can use the Machine.Spec.ProviderId to query the machine in cloud. 
A Kubernetes node is generally bootstrapped with the cloud-config. Please verify, if MachineDeployment is pointing to the correct MachineClass, and MachineClass is pointing to the correct Secret. The secret object contains the actual cloud-config in base64 format which will be used to boot the machine. User must also check the logs of the MCM pod to understand any broken logical flow of reconciliation. My rolling update is stuck, why? The following can be the reason:\n Insufficient capacity for the new instance type the machineClass mentions. Old machines are stuck in deletion If you are using Gardener for setting up kubernetes cluster, then machine object won’t turn to Running state until node-critical-components are ready. Refer to this for more details. Developer How should I test my code before submitting a PR? Developer can locally setup the MCM using the following guide Developer must also enhance the unit tests related to the incoming changes. Developer can locally run the unit test by executing: make test-unit Developer can locally run integration tests to ensure basic functionality of MCM is not altered. I need to change the APIs, what are the recommended steps? Developer should add/update the API fields at both of the following places:\n https://github.com/gardener/machine-controller-manager/blob/master/pkg/apis/machine/types.go https://github.com/gardener/machine-controller-manager/tree/master/pkg/apis/machine/v1alpha1 Once API changes are done, auto-generate the code using the following command:\nmake generate Please ignore the API-violation errors for now.\nHow can I update the dependencies of MCM? MCM uses gomod for dependency management. Developer should add/update dependency in the go.mod file. Please run the following command to automatically tidy the dependencies.\nmake tidy In the context of Gardener How can I configure MCM using Shoot resource? 
All of the knobs of MCM can be configured by the workers section of the shoot resource.\n Gardener creates a MachineDeployment per zone for each worker-pool under workers section. workers.dataVolumes allows attaching multiple disks to a machine during creation. Refer to the link. workers.machineControllerManager allows configuration of multiple knobs of the MachineDeployment from the shoot resource. How is my worker-pool spread across zones? Shoot resource allows the worker-pool to spread across multiple zones using the field workers.zones. Refer to the link.\n Gardener creates one MachineDeployment per zone. Each MachineDeployment is initiated with the following replica: MachineDeployment.Spec.Replicas = (Workers.Minimum)/(Number of availability zones) ","categories":"","description":"Frequently Asked Questions","excerpt":"Frequently Asked Questions","ref":"/docs/other-components/machine-controller-manager/faq/","tags":"","title":"FAQ"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/gardener/monitoring/","tags":"","title":"Monitoring"},{"body":"","categories":"","description":"Other components included in the Gardener project","excerpt":"Other components included in the Gardener project","ref":"/docs/other-components/","tags":"","title":"Other Components"},{"body":"Gardener Dashboard \n \nDemo Documentation Gardener Dashboard Documentation\nLicense Apache License 2.0\nCopyright 2020 The Gardener Authors\n","categories":"","description":"The web UI for managing your projects and clusters","excerpt":"The web UI for managing your projects and clusters","ref":"/docs/dashboard/","tags":"","title":"Dashboard"},{"body":"Reconciliation in Kubernetes and Gardener The starting point of all reconciliation cycles is the constant observation of both the desired and actual state. A component would analyze any differences between the two states and try to converge the actual towards the desired state using appropriate actions. 
Typically, a component is responsible for a single resource type but it also watches others that have an implication on it.\nAs an example, the Kubernetes controller for ReplicaSets will watch pods belonging to it in order to ensure that the specified replica count is fulfilled. If one pod gets deleted, the controller will create a new pod to enforce the desired over the actual state.\nThis is all standard behaviour, as Gardener is following the native Kubernetes approach. All elements of a shoot cluster have a representation in Kubernetes resources and controllers are watching / acting upon them.\nIf we pick up the example of the ReplicaSet - a user typically creates a deployment resource and the ReplicaSet is implicitly generated on the way to create the pods. Similarly, Gardener takes the user’s intent (shoot) and creates lots of domain specific resources on the way. They all reconcile and make sure their actual and desired states match.\nUpdating the Desired State of a Shoot Based on the shoot’s specifications, Gardener will create network resources on a hyperscaler, backup resources for the ETCD, credentials, and other resources, but also representations of the worker pools. Eventually, this process will result in a fully functional Kubernetes cluster.\nIf you change the desired state, Gardener will reconcile the shoot and run through the same cycle to ensure the actual state matches the desired state.\nFor example, the (infrastructure-specific) machine type can be changed within the shoot resource. The following reconciliation will pick up the change and initiate the creation of new nodes with a different machine type and the removal of the old nodes.\nMaintenance Window and Daily Reconciliation EVERY shoot cluster reconciles once per day during the so-called “maintenance window”. You can confine the rollout of spec changes to this window.\nAdditionally, the daily reconciliation will help pick up all kinds of version changes. 
When a new Gardener version is rolled out to the landscape, shoot clusters will pick up any changes during their next reconciliation. For example, if a new Calico version is introduced to fix some bug, it will automatically reach all shoots.\nImpact of a Change It is important to be aware of the impacts that a change can have on a cluster and the workloads within it.\nAn operator pushing a new Gardener version with a new calico image to a landscape will cause all calico pods to be re-created. Another example would be the rollout of a new etcd backup-restore image. This would cause etcd pods to be re-created, rendering a non-HA control plane unavailable until etcd is up and running again.\nWhen you change the shoot spec, it can also have significant impact on the cluster. Imagine that you have changed the machine type of a worker pool. This will cause new machines to be created and old machines to be deleted. Or in other words: all nodes will be drained, the pods will be evicted and then re-created on newly created nodes.\nKubernetes Version Update (Minor + Patch) Some operations are rather common and have to be performed on a regular basis. Updating the Kubernetes version is one of them. Patch updates cause relatively little disruption, as only the control-plane pods will be re-created with new images and the kubelets on all nodes will restart.\nA minor version update is more impactful - it will cause all nodes to be recreated and roll components of the control plane.\nOS Version Update The OS version is defined for each worker pool and can be changed per worker pool. You can freely switch back and forth. However, as there is no in-place update, each change will cause the entire worker pool to roll and nodes will be replaced. For OS versions different update strategies can be configured. 
Please check the documentation for details.\nAvailable Versions Gardener has a dedicated resource to maintain a list of available versions – the so-called cloudProfile.\nA cloudProfile provides information about supported:\n Kubernetes versions OS versions (and where to find those images) Regions (and their zones) Machine types Each shoot references a cloudProfile in order to obtain information about available / possible versions and configurations.\nVersion Classifications Gardener has the following classifications for Kubernetes and OS image versions:\n preview: still in testing phase (several versions can be in preview at the same time)\n supported: recommended version\n deprecated: a new version has been set to “supported”, updating is recommended (might have an expiration date)\n expired: cannot be used anymore, clusters using this version will be force-upgraded\n Version information is maintained in the relevant cloud profile resource. There might be circumstances where a version will never become supported but instead move to deprecated directly. Similarly, a version might be directly introduced as supported.\nAutoUpdate / Forced Updates AutoUpdate for a machine image version will update all node pools to the latest supported version based on the defined update strategy. Whenever a new version is set to supported, the cluster will pick it up during its next maintenance window.\nFor Kubernetes versions the mechanism is the same, but only applied to patch version. This means that the cluster will be kept on the latest supported patch version of a specific minor version.\nIn case a version used in a cluster expires, there is a force update during the next maintenance window. In a worst case scenario, 2 minor versions expire simultaneously. 
Then there will be two consecutive minor updates enforced.\nFor more information, see Shoot Kubernetes and Operating System Versioning in Gardener.\nApplying Changes to a Seed It is important to keep in mind that a seed is just another Kubernetes cluster. As such, it has its own lifecycle (daily reconciliation, maintenance, etc.) and is also subject to change.\nFrom time to time changes need to be applied to the seed as well. Some (like updating the OS version) cause the node pool to roll. In turn, this will cause the eviction of ALL pods running on the affected node. If your etcd is evicted and you don’t have a highly available control plane, it will cause downtime for your cluster. Your workloads will continue to run, of course, but your cluster’s API server will not function until the etcd is up and running again.\n","categories":"","description":"","excerpt":"Reconciliation in Kubernetes and Gardener The starting point of all …","ref":"/docs/getting-started/lifecycle/","tags":"","title":"Shoot Lifecycle"},{"body":"Vertical Pod Autoscaler When a pod’s resource CPU or memory grows, it will hit a limit eventually. Either the pod has resource limits specified or the node will run short of resources. In both cases, the workload might be throttled or even terminated. When this happens, it is often desirable to increase the request or limits. To do this autonomously within certain boundaries is the goal of the Vertical Pod Autoscaler project.\nSince it is not part of the standard Kubernetes API, you have to install the CRDs and controller manually. 
With Gardener, you can simply flip the switch in the shoot’s spec and start creating your VPA objects.\nPlease be aware that VPA and HPA operate in similar domains and might interfere.\nA controller \u0026 CRDs for vertical pod auto-scaling can be activated via the shoot’s spec.\n","categories":"","description":"","excerpt":"Vertical Pod Autoscaler When a pod’s resource CPU or memory grows, it …","ref":"/docs/getting-started/features/vpa/","tags":"","title":"Vertical Pod Autoscaler"},{"body":"Obtaining Additional Nodes The scheduler will assign pods to nodes, as long as they have capacity (CPU, memory, Pod limit, # attachable disks, …). But what happens when all nodes are fully utilized and the scheduler does not find any suitable target?\nOption 1: Evict other pods based on priority. However, this has the downside that other workloads with lower priority might become unschedulable.\nOption 2: Add more nodes. There is an upstream Cluster Autoscaler project that does exactly this. It simulates the scheduling and reacts to pods that cannot be scheduled. Gardener has forked it to make it work with the machine-controller-manager abstraction of how node (groups) are defined in Gardener. The cluster autoscaler respects the limits (min / max) of any worker pool in a shoot’s spec. It can also scale down nodes based on utilization thresholds. For more details, see the autoscaler documentation.\nScaling by Priority For clusters with more than one node pool, the cluster autoscaler has to decide which group to scale up. By default, it randomly picks from the available / applicable. 
However, this behavior is customizable by the use of so-called expanders.\nThis section will focus on the priority based expander.\nEach worker pool gets a priority and the cluster autoscaler will scale up the one with the highest priority until it reaches its limit.\nTo get more information on the current status of the autoscaler, you can check a “status” configmap in the kube-system namespace with the following command:\nkubectl get cm -n kube-system cluster-autoscaler-status -oyaml\nTo obtain information about the decision making, you can check the logs of the cluster-autoscaler pod by using the shoot’s monitoring stack.\nFor more information, see the cluster-autoscaler FAQ and the Priority based expander for cluster-autoscaler topic.\n","categories":"","description":"","excerpt":"Obtaining Additional Nodes The scheduler will assign pods to nodes, as …","ref":"/docs/getting-started/features/cluster-autoscaler/","tags":"","title":"Cluster Autoscaler"},{"body":"gardenctl-v2 \nWhat is gardenctl? gardenctl is a command-line client for the Gardener. It facilitates the administration of one or many garden, seed and shoot clusters. Use this tool to configure access to clusters and configure cloud provider CLI tools. It also provides support for accessing cluster nodes via ssh.\nInstallation Install the latest release from Homebrew, Chocolatey or GitHub Releases.\nInstall using Package Managers # Homebrew (macOS and Linux) brew install gardener/tap/gardenctl-v2 # Chocolatey (Windows) # default location C:\\ProgramData\\chocolatey\\bin\\gardenctl-v2.exe choco install gardenctl-v2 Attention brew users: gardenctl-v2 uses the same binary name as the legacy gardenctl (gardener/gardenctl) CLI. If you have an existing installation you should remove it with brew uninstall gardenctl before attempting to install gardenctl-v2. Alternatively, you can choose to link the binary using a different name. 
If you try to install without removing or relinking the old installation, brew will run into an error and provide instructions on how to resolve it.\nInstall from GitHub Release If you install via GitHub releases, you need to\n put the gardenctl binary on your path and install gardenlogin. The other install methods do this for you.\n# Example for macOS # set operating system and architecture os=darwin # choose between darwin, linux, windows arch=amd64 # choose between amd64, arm64 # Get latest version. Alternatively set your desired version version=$(curl -s https://raw.githubusercontent.com/gardener/gardenctl-v2/master/LATEST) # Download gardenctl curl -LO \"https://github.com/gardener/gardenctl-v2/releases/download/${version}/gardenctl_v2_${os}_${arch}\" # Make the gardenctl binary executable chmod +x \"./gardenctl_v2_${os}_${arch}\" # Move the binary into your PATH sudo mv \"./gardenctl_v2_${os}_${arch}\" /usr/local/bin/gardenctl Configuration gardenctl requires a configuration file. The default location is in ~/.garden/gardenctl-v2.yaml.\nYou can modify this file directly using the gardenctl config command. It allows adding, modifying and deleting gardens.\nExample config command:\n# Adapt the path to your kubeconfig file for the garden cluster (not to be mistaken with your shoot cluster) export KUBECONFIG=~/relative/path/to/kubeconfig.yaml # Fetch cluster-identity of garden cluster from the configmap cluster_identity=$(kubectl -n kube-system get configmap cluster-identity -ojsonpath={.data.cluster-identity}) # Configure garden cluster gardenctl config set-garden $cluster_identity --kubeconfig $KUBECONFIG This command will create or update a garden with the provided identity and kubeconfig path of your garden cluster.\nExample Config gardens: - identity: landscape-dev # Unique identity of the garden cluster. 
See cluster-identity ConfigMap in kube-system namespace of the garden cluster kubeconfig: ~/relative/path/to/kubeconfig.yaml # name: my-name # An alternative, unique garden name for targeting # context: different-context # Overrides the current-context of the garden cluster kubeconfig # patterns: ~ # List of regex patterns for pattern targeting Note: You need to have gardenlogin installed as kubectl plugin in order to use the kubeconfigs for Shoot clusters provided by gardenctl.\nConfig Path Overwrite The gardenctl config path can be overwritten with the environment variable GCTL_HOME. The gardenctl config name can be overwritten with the environment variable GCTL_CONFIG_NAME. export GCTL_HOME=/alternate/garden/config/dir export GCTL_CONFIG_NAME=myconfig # without extension! # config is expected to be under /alternate/garden/config/dir/myconfig.yaml Shell Session The state of gardenctl is bound to a shell session and is not shared across windows, tabs or panes. A shell session is defined by the environment variable GCTL_SESSION_ID. If this is not defined, the value of the TERM_SESSION_ID environment variable is used instead. If neither is defined, this leads to an error and gardenctl cannot be executed. The target.yaml and temporary kubeconfig.*.yaml files are stored in the following directory: ${TMPDIR}/garden/${GCTL_SESSION_ID}.\nYou can make sure that GCTL_SESSION_ID or TERM_SESSION_ID is always present by adding the following code to your terminal profile ~/.profile, ~/.bashrc or comparable file.\nbash and zsh: [ -n \"$GCTL_SESSION_ID\" ] || [ -n \"$TERM_SESSION_ID\" ] || export GCTL_SESSION_ID=$(uuidgen) fish: [ -n \"$GCTL_SESSION_ID\" ] || [ -n \"$TERM_SESSION_ID\" ] || set -gx GCTL_SESSION_ID (uuidgen) powershell: if ( !(Test-Path Env:GCTL_SESSION_ID) -and !(Test-Path Env:TERM_SESSION_ID) ) { $Env:GCTL_SESSION_ID = [guid]::NewGuid().ToString() } Completion Gardenctl supports completion that will help you work with the CLI and save you typing effort. 
It will also help you find clusters by providing suggestions for gardener resources such as shoots or projects. Completion is supported for bash, zsh, fish and powershell. You will find more information on how to configure your shell completion for gardenctl by executing the help for your shell completion command. Example:\ngardenctl completion bash --help Usage Targeting You can set a target to use it in subsequent commands. You can also overwrite the target for each command individually.\nNote that this will not affect your KUBECONFIG env variable. To update the KUBECONFIG env for your current target see Configure KUBECONFIG section\nExample:\n# target control plane gardenctl target --garden landscape-dev --project my-project --shoot my-shoot --control-plane Find more information in the documentation.\nConfigure KUBECONFIG for Shoot Clusters Generate a script that points KUBECONFIG to the targeted cluster for the specified shell. Use together with eval to configure your shell. Example for bash:\neval $(gardenctl kubectl-env bash) Configure Cloud Provider CLIs Generate the cloud provider CLI configuration script for the specified shell. Use together with eval to configure your shell. Example for bash:\neval $(gardenctl provider-env bash) SSH Establish an SSH connection to a Shoot cluster’s node.\ngardenctl ssh my-node ","categories":"","description":"The command line interface to control your clusters","excerpt":"The command line interface to control your clusters","ref":"/docs/gardenctl-v2/","tags":"","title":"Gardenctl V2"},{"body":"Overview Gardener offers out-of-the-box observability for the control plane, Gardener managed system-components, and the nodes of a shoot cluster.\nHaving your workload survive on day 2 can be a challenge. The goal of this topic is to give you the tools with which to observe, analyze, and alert when the control plane or system components of your cluster become unhealthy. 
This will let you guide your containers through the storm of operating in a production environment.\n","categories":"","description":"","excerpt":"Overview Gardener offers out-of-the-box observability for the control …","ref":"/docs/getting-started/observability/","tags":"","title":"Observability"},{"body":"","categories":"","description":"Commonly asked questions about Gardener","excerpt":"Commonly asked questions about Gardener","ref":"/docs/faq/","tags":"","title":"FAQ"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/getting-started/features/","tags":"","title":"Features"},{"body":"Architecture Containers will NOT fix a broken architecture! Running a highly distributed system has advantages, but of course, those come at a cost. In order to succeed, one would need:\n Logging Tracing No singleton Tolerance to failure of individual instances Automated config / change management Kubernetes knowledge Scalability Most scalability dimensions are interconnected with others. If a cluster grows beyond reasonable defaults, it can still function very well. But tuning it comes at the cost of time and can influence stability negatively.\nTake the number of nodes and pods, for example. Both are connected and you cannot grow both towards their individual limits, as you would face issues way before reaching any theoretical limits.\nReading the Scalability of Gardener Managed Kubernetes Clusters guide is strongly recommended in order to understand the topic of scalability within Kubernetes and Gardener.\nA Small Sample of Things That Can Grow Beyond Reasonable Limits When scaling a cluster, there are plenty of resources that can be exhausted or reach a limit:\n The API server will be scaled horizontally and vertically by Gardener. However, it can still consume too many resources to fit onto a single node on the seed. In this case, you can only reduce the load on the API server. This should not happen with regular usage patterns though. 
ETCD disk space: 8GB is the limit. If you have too many resources or a high churn rate, a cluster can run out of ETCD capacity. In such a scenario it will stop working until defragmented, compacted, and cleaned up. The number of nodes is limited by the network configuration (pod cidr range \u0026 node cidr mask). Also, there is a reasonable number of nodes (300) that most workloads should not exceed. It is possible to go beyond but doing so requires careful tuning and consideration of connected scaling dimensions (like the number of pods per node). The availability of your cluster is directly impacted by the way you use it.\nInfrastructure Capacity and Quotas Sometimes requests cannot be fulfilled due to shortages on the infrastructure side. For example, a certain instance type might not be available and new Kubernetes nodes of this type cannot be added. It is a good practice to use the cluster-autoscaler’s priority expander and have a secondary node pool.\nSometimes, it is not the physical capacity but exhausted quotas within an infrastructure account that result in limits. Obviously, there should be sufficient quota to create as many VMs as needed. But there are also other resources that are created in the infrastructure that need proper quotas:\n Loadbalancers VPC Disks Routes (often forgotten, but very important for clusters without overlay network; typically defaults to around 50 routes, meaning that 50 nodes is the maximum a cluster can have) … NodeCIDRMaskSize Upon cluster creation, there are several settings that are network related. For example, the address space for Pods has to be defined. In this case, it is a /16 subnet that includes a total of 65,536 addresses. However, that does not imply that you can easily use all addresses at the same point in time.\nAs part of the Kubernetes network setup, the /16 network is divided into smaller subnets and each node gets a distinct subnet. The size of this subnet defaults to /24. 
It can also be specified (but not changed later).\nNow, as you create more nodes, you have a total of 256 subnets that can be assigned to nodes, thus limiting the total number of nodes of this cluster to 256.\nFor more information, see Shoot Networking.\nOverlapping VPCs Avoid Overlapping CIDR Ranges in VPCs Gardener can create shoot cluster resources in an existing / user-created VPC. However, you have to make sure that the CIDR ranges used by the shoot’s nodes or subnets for zones do not overlap with other shoots deployed to the same VPC.\nIn case of an overlap, there might be strange routing effects, and packets may end up at the wrong location.\nExpired Credentials Credentials expire or get revoked. When this happens to the actively used infrastructure credentials of a shoot, the cluster will stop working after a while. New nodes cannot be added, LoadBalancers cannot be created, and so on.\nYou can update the credentials stored in the project namespace and reconcile the cluster to replicate the new keys to all relevant controllers. Similarly, when doing a planned rotation one should wait until the shoot has reconciled successfully before invalidating the old credentials.\nAutoUpdate Breaking Clusters Gardener can automatically update a shoot’s Kubernetes patch version, when a new patch version is labeled as “supported”. Automatic updating of the OS images works in a similar way. Both are triggered by the “supported” classification in the respective cloud profile and can be enabled / disabled as part of a shoot’s spec.\nAdditionally, when a minor Kubernetes / OS version expires, Gardener will force-update the shoot to the next supported version.\nTurning on AutoUpdate for a shoot may be convenient but comes at the risk of potentially unwanted changes. 
While it is possible to switch to another OS version, updates to the Kubernetes version are a one way operation and cannot be reverted.\nRecommendation Control the version lifecycle separately for any cluster that hosts important workload. Node Draining Node Draining and Pod Disruption Budget Typically, nodes are drained when:\n There is an update of the OS / Kubernetes minor version An Operator cordons \u0026 drains a node The cluster-autoscaler wants to scale down Without a PodDisruptionBudget, pods will be terminated as fast as possible. If an application has 2 out of 2 replicas running on the drained node, this will probably cause availability issues.\nNode Draining with PDB PodDisruptionBudgets can help to manage a graceful node drain. However, if no disruptions are allowed there, the node drain will be blocked until it reaches a timeout. Only then will the nodes be terminated but without respecting PDB thresholds.\nRecommendation Configure PDBs and allow disruptions. Pod Resource Requests and Limits Resource Consumption Pods consume resources and, of course, there are only so many resources available on a single node. Setting requests will make the scheduling much better, as the scheduler has more information available.\nSpecifying limits can help, but can also limit an application in unintended ways. A recommendation to start with:\n Do not set CPU limits (CPU is compressible and throttling is really hard to detect) Set memory limits and monitor OOM kills / restarts of workload (typically detectable by container status exit code 137 and corresponding events). This will decrease the likelihood of OOM situations on the node itself. However, for critical workloads it might be better to have uncapped growth and rather risk a node going OOM. Next, consider if assigning the workload to quality of service class guaranteed is needed. Again - this can help or be counterproductive. It is important to be aware of its implications. 
For more information, see Pod Quality of Service Classes.\nTune shoot.spec.Kubernetes.kubeReserved to protect the node (kubelet) in case of a workload pod consuming too many resources. It is very helpful to ensure a high level of stability.\nIf the usage profile changes over time, the VPA can help a lot to adapt the resource requests / limits automatically.\nWebhooks User-Deployed Webhooks in Kubernetes By default, any request to the API server will go through a chain of checks. Let’s take the example of creating a pod.\nWhen the resource is submitted to the API server, it will be checked against the following validations:\n Is the user authorized to perform this action? Is the pod definition actually valid? Are the specified values allowed? Additionally, there is the defaulting - like the injection of the default service account’s name, if nothing else is specified.\nThis chain of admission control and mutation can be enhanced by the user. Read about dynamic admission control for more details.\nValidatingWebhookConfiguration: allow or deny requests based on custom rules\nMutatingWebhookConfiguration: change a resource before it is actually stored in etcd (that is, before any other controller acts upon it)\nBoth ValidatingWebhookConfiguration as well as MutatingWebhookConfiguration resources:\n specify for which resources and operations these checks should be executed. specify how to reach the webhook server (typically a service running on the data plane of a cluster) rely on a webhook server performing a review and reply to the admissionReview request What could possibly go wrong? Due to the separation of control plane and data plane in Gardener’s architecture, webhooks have the potential to break a cluster. If the webhook server is not responding in time with a valid answer, the request should time out and the failure policy is invoked. Depending on the scope of the webhook, frequent failures may cause downtime for applications. 
Common causes for failure are:\n The call to the webhook is made through the VPN tunnel. VPN / connection issues can happen both on the side of the seed as well as the shoot and would render the webhook unavailable from the perspective of the control plane. The traffic cannot reach the pod (network issue, pod not available) The pod is processing too slowly (e.g., because there are too many requests) Timeout Webhooks are a very helpful feature of Kubernetes. However, they can easily be configured to break a shoot cluster. Take the timeout, for example. High timeouts (\u003e15s) can lead to blocking requests of control plane components. That’s because most control-plane API calls are made with a client-side timeout of 30s, so if a webhook has timeoutSeconds=30, the overall request might still fail as there is overhead in communication with the API server and other potential webhooks.\nRecommendation Webhooks (esp. mutating) may be called sequentially, and thus their individual timeouts add up. Even with a failurePolicy=ignore the timeout will stop the request. Recommendations Problematic webhooks are reported as part of a shoot’s status. In addition to timeouts, it is crucial to exclude the kube-system namespace and (potentially non-namespaced) resources that are necessary for the cluster to function properly. Those should not be subject to a user-defined webhook.\nIn particular, a webhook should not operate on:\n the kube-system namespace Endpoints or EndpointSlices Nodes PodSecurityPolicies ClusterRoles ClusterRoleBindings CustomResourceDefinitions ApiServices CertificateSigningRequests PriorityClasses Example:\nA webhook checks node objects upon creation and has a failurePolicy: fail. If the webhook does not answer in time (either due to latency or because there is no pod serving it), new nodes cannot join the cluster.\nFor more information, see Shoot Status.\nConversion Webhooks Who installs a conversion webhook? 
If you have written your own CustomResourceDefinition (CRD) and made a version upgrade, you will also have consciously written \u0026 deployed the conversion webhook.\nHowever, sometimes, you simply use helm or kustomize to install a (third-party) dependency that contains CRDs. Of course, those can contain conversion webhooks as well. As a user of a cluster, please make sure to be aware of what you deploy.\nCRD with a Conversion Webhook Conversion webhooks are tricky. Similarly to regular webhooks, they should have a low timeout. However, they cannot be remediated automatically and can cause errors in the control plane. For example, if a webhook is invoked but not available, it can block the garbage collection run by the kube-controller-manager.\nIn turn, when deleting something like a deployment, dependent resources like pods will not be deleted automatically.\nRecommendation Try to avoid conversion webhooks. They are valid and can be used, but should not stay in place forever. Complete the upgrade to a new version of the CRD as soon as possible. For more information, see the Webhook Conversion, Upgrade Existing Objects to a New Stored Version, and Version Priority topics in the Kubernetes documentation.\n","categories":"","description":"","excerpt":"Architecture Containers will NOT fix a broken architecture! Running a …","ref":"/docs/getting-started/common-pitfalls/","tags":"","title":"Common Pitfalls"},{"body":"Purpose Synonyms and inconsistent writing style make it hard for beginners to get into a new topic. This glossary aims to help users to get a better understanding of Gardener and authors to use the right terminology.\nContributions are most welcome!\nIf you would like to contribute please check first if your new term is already part of the Standardized Kubernetes Glossary, and if so refrain from adding it here. 
Whenever you see the need to explain Kubernetes terminology or to refer to Kubernetes concepts it is recommended that you link to the official Kubernetes documentation in your section.\nGardener Glossary If you add anything to the list please keep it in alphabetical order.\n Term Definition Related Term cloud provider secret A resource storing confidential data used to authenticate Gardener and Kubernetes components for infrastructure operations. When a new cluster is created in a Gardener project, the project admin who creates the cluster specification must select the infrastructure secret that will be used to manage IaaS resources required for the new cluster. secret Gardener API server An API server designed to run inside a Kubernetes cluster whose API it wants to extend. After registration, it is used to expose resources native to Gardener such as cloud profiles, shoots, seeds and secret bindings. kube-apiserver garden cluster control plane A control plane that manages the overall creation, modification, and deletion of clusters. control plane Gardener controller manager A component that runs next to the Gardener API server which runs several control loops that do not require talking to any seed or shoot cluster. kube-controller-manager Gardener project A consolidation of project members, clusters, and secrets of the underlying IaaS provider used to organize teams and clusters in a meaningful way. none Gardener scheduler A controller that watches newly created shoots and assigns a seed cluster to them. kube-scheduler gardenlet An agent that manages seed clusters decentrally; reads the desired state from the Gardener API Server and updates the current state. The gardenlet has a similar role as the kubelet in Kubernetes, which manages the workload of a node decentrally; gardenlet manages the shoot clusters (workload) of a seed cluster instead. More information: gardenlet. 
kubelet garden cluster A dedicated Kubernetes cluster that the Gardener control plane runs in. cluster project “Gardener” An open source project that focuses on operating, monitoring, and managing Kubernetes clusters. none physical garden cluster A physical cluster of the IaaS provider that is used to install Gardener in. none secretBinding A resource that makes it possible for shoot clusters to connect to the cloud provider secret. none seed cluster A cluster that hosts shoot cluster control planes as pods in order to manage shoot clusters. node shoot cluster A Kubernetes runtime for the actual applications or services consisting of a shoot control plane running on the seed cluster and worker nodes hosting the actual workload. pod shoot cluster control plane A Kubernetes control plane used to run the actual end-user workload. It is hosted in the form of pods on a seed cluster. control plane soil cluster A cluster that is created manually and is used as host for other seeds. Sometimes it is technically impossible that Gardener can install shoot clusters on an infrastructure, for example, because the infrastructure is not supported or protected by a firewall. In such cases you can create a soil cluster on that infrastructure manually as a host for seed clusters. From inside the firewall, seed clusters can reach the garden cluster outside the firewall. This is possible since Gardener delegated cluster management to the Gardenlet. none virtual garden cluster A cluster without any nodes that runs the Kubernetes API server, etcd, and stores Gardener metadata like projects, shoot resources, seed resources, secrets, and others. The virtual garden cluster is installed on the physical garden cluster (base cluster of IaaS provider) during the installation of Gardener. Thanks to the virtual garden cluster, Gardener has full control over all Gardener metadata. 
This full control simplifies the support for the backup, restore, recovery, migration, relocation, or recreation of this data, because it can be implemented independently from the underlying physical garden cluster. none ","categories":"","description":"Commonly used terms in Gardener","excerpt":"Commonly used terms in Gardener","ref":"/docs/glossary/","tags":"","title":"Glossary"},{"body":"Overview The Gardener team takes security seriously, which is why we mandate the Security Technical Implementation Guide (STIG) for Kubernetes as published by the Defense Information Systems Agency (DISA) here. We offer Gardener adopters the opportunity to show compliance with DISA Kubernetes STIG via the compliance checker tool diki. The latest release in machine readable format can be found in the STIGs Document Library by searching for Kubernetes.\nKubernetes Clusters Security Requirements DISA Kubernetes STIG version 1 release 11 contains 91 rules overall. Only the following rules, however, apply to you. Some of them are secure-by-default, so your responsibility is to make sure that they are not changed. 
For your convenience, the requirements are grouped logically and per role:\nRules Relevant for Cluster Admins Control Plane Configuration ID Description Secure By Default Comments 242390 Kubernetes API server must have anonymous authentication disabled ✅ Disabled unless you enable it via enableAnonymousAuthentication 245543 Kubernetes API Server must disable token authentication to protect information in transit ✅ Disabled unless you enable it via enableStaticTokenKubeconfig 242400 Kubernetes API server must have Alpha APIs disabled ✅ Disabled unless you enable it via featureGates 242436 Kubernetes API server must have the ValidatingAdmissionWebhook enabled ✅ Enabled unless you disable it explicitly via admissionPlugins 242393 Kubernetes Worker Nodes must not have sshd service running ❌ Active to allow debugging of network issues, but it is possible to deactivate via the sshAccess setting 242394 Kubernetes Worker Nodes must not have the sshd service enabled ❌ Enabled to allow debugging of network issues, but it is possible to deactivate via the sshAccess setting 242434 Kubernetes Kubelet must enable kernel protection ✅ Enabled for Kubernetes v1.26 or later unless disabled explicitly via protectKernelDefaults 245541 Kubernetes Kubelet must not disable timeouts ✅ Enabled for Kubernetes v1.26 or later unless disabled explicitly via streamingConnectionIdleTimeout Audit Configuration ID Description Secure By Default Comments 242402 The Kubernetes API Server must have an audit log path set ❌ It is the user’s responsibility to configure an audit extension that meets the requirements of their organization. Depending on the audit extension implementation the audit logs do not always need to be written on the filesystem, i.e. when --audit-webhook-config-file is set and logs are sent to an audit backend. 
242403 Kubernetes API Server must generate audit records that identify what type of event has occurred, identify the source of the event, contain the event results, identify any users, and identify any containers associated with the event ❌ Users should set an audit policy that meets the requirements of their organization. Please consult the Shoot Audit Policy documentation. 242461 Kubernetes API Server audit logs must be enabled ❌ Users should set an audit policy that meets the requirements of their organization. Please consult the Shoot Audit Policy documentation. 242462 The Kubernetes API Server must be set to audit log max size ❌ It is the user’s responsibility to configure an audit extension that meets the requirements of their organization. Depending on the audit extension implementation the audit logs do not always need to be written on the filesystem, i.e. when --audit-webhook-config-file is set and logs are sent to an audit backend. 242463 The Kubernetes API Server must be set to audit log maximum backup ❌ It is the user’s responsibility to configure an audit extension that meets the requirements of their organization. Depending on the audit extension implementation the audit logs do not always need to be written on the filesystem, i.e. when --audit-webhook-config-file is set and logs are sent to an audit backend. 242464 The Kubernetes API Server audit log retention must be set ❌ It is the user’s responsibility to configure an audit extension that meets the requirements of their organization. Depending on the audit extension implementation the audit logs do not always need to be written on the filesystem, i.e. when --audit-webhook-config-file is set and logs are sent to an audit backend. 242465 The Kubernetes API Server audit log path must be set ❌ It is the user’s responsibility to configure an audit extension that meets the requirements of their organization. 
Depending on the audit extension implementation the audit logs do not always need to be written on the filesystem, i.e. when --audit-webhook-config-file is set and logs are sent to an audit backend. End User Workload ID Description Secure By Default Comments 242395 Kubernetes dashboard must not be enabled ✅ Not installed unless you install it via kubernetesDashboard. 242414 Kubernetes cluster must use non-privileged host ports for user pods ❌ Do not use any ports below 1024 for your own workload. 242415 Secrets in Kubernetes must not be stored as environment variables ❌ Always mount secrets as volumes and never as environment variables. 242383 User-managed resources must be created in dedicated namespaces ❌ Create and use your own/dedicated namespaces and never place anything into the default, kube-system, kube-public, or kube-node-lease namespace. The default namespace is never to be used while the other above listed namespaces are only to be used by the Kubernetes provider (here Gardener). 242417 Kubernetes must separate user functionality ❌ While 242383 is about all resources, this rule is specifically about pods. Create and use your own/dedicated namespaces and never place pods into the default, kube-system, kube-public, or kube-node-lease namespace. The default namespace is never to be used while the other above listed namespaces are only to be used by the Kubernetes provider (here Gardener). 242437 Kubernetes must have a pod security policy set ✅ Set, but Gardener can only set default pod security policies (PSP) and does so only until v1.24 as with v1.25 PSPs were removed (deprecated since v1.21) and replaced with Pod Security Standards (see this blog for more information). Whatever the technology, you are responsible for configuring and using appropriately custom-tailored PSPs or PSSs, depending on your own workload and security needs (only you know what a pod should be allowed to do). 
242442 Kubernetes must remove old components after updated versions have been installed ❌ While Gardener manages all its components in its system namespaces (automated), you are naturally responsible for your own workload. 254800 Kubernetes must have a Pod Security Admission control file configured ❌ Gardener ensures that the pod security configuration allows system components to be deployed in the kube-system namespace but does not set configurations that can affect user namespaces. It is recommended that users enforce a minimum of baseline pod security level for their workload via PodSecurity admission plugin. Rules Relevant for Service Providers ID Description 242376 The Kubernetes Controller Manager must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination. 242377 The Kubernetes Scheduler must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination. 242378 The Kubernetes API Server must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination. 242379 The Kubernetes etcd must use TLS to protect the confidentiality of sensitive data during electronic dissemination. 242380 The Kubernetes etcd must use TLS to protect the confidentiality of sensitive data during electronic dissemination. 242381 The Kubernetes Controller Manager must create unique service accounts for each work payload. 242382 The Kubernetes API Server must enable Node,RBAC as the authorization mode. 242384 The Kubernetes Scheduler must have secure binding. 242385 The Kubernetes Controller Manager must have secure binding. 242386 The Kubernetes API server must have the insecure port flag disabled. 242387 The Kubernetes Kubelet must have the “readOnlyPort” flag disabled. 242388 The Kubernetes API server must have the insecure bind address not set. 242389 The Kubernetes API server must have the secure port set. 
242391 The Kubernetes Kubelet must have anonymous authentication disabled. 242392 The Kubernetes kubelet must enable explicit authorization. 242396 Kubernetes Kubectl cp command must give expected access and results. 242397 The Kubernetes kubelet staticPodPath must not enable static pods. 242398 Kubernetes DynamicAuditing must not be enabled. 242399 Kubernetes DynamicKubeletConfig must not be enabled. 242404 Kubernetes Kubelet must deny hostname override. 242405 The Kubernetes manifests must be owned by root. 242406 The Kubernetes KubeletConfiguration file must be owned by root. 242407 The Kubernetes KubeletConfiguration files must have file permissions set to 644 or more restrictive. 242408 The Kubernetes manifest files must have least privileges. 242409 Kubernetes Controller Manager must disable profiling. 242410 The Kubernetes API Server must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL). 242411 The Kubernetes Scheduler must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL). 242412 The Kubernetes Controllers must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL). 242413 The Kubernetes etcd must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL). 242418 The Kubernetes API server must use approved cipher suites. 242419 Kubernetes API Server must have the SSL Certificate Authority set. 242420 Kubernetes Kubelet must have the SSL Certificate Authority set. 242421 Kubernetes Controller Manager must have the SSL Certificate Authority set. 242422 Kubernetes API Server must have a certificate for communication. 242423 Kubernetes etcd must enable client authentication to secure service. 
242424 Kubernetes Kubelet must enable tlsPrivateKeyFile for client authentication to secure service. 242425 Kubernetes Kubelet must enable tlsCertFile for client authentication to secure service. 242426 Kubernetes etcd must enable client authentication to secure service. 242427 Kubernetes etcd must have a key file for secure communication. 242428 Kubernetes etcd must have a certificate for communication. 242429 Kubernetes etcd must have the SSL Certificate Authority set. 242430 Kubernetes etcd must have a certificate for communication. 242431 Kubernetes etcd must have a key file for secure communication. 242432 Kubernetes etcd must have peer-cert-file set for secure communication. 242433 Kubernetes etcd must have a peer-key-file set for secure communication. 242438 Kubernetes API Server must configure timeouts to limit attack surface. 242443 Kubernetes must contain the latest updates as authorized by IAVMs, CTOs, DTMs, and STIGs. 242444 The Kubernetes component manifests must be owned by root. 242445 The Kubernetes component etcd must be owned by etcd. 242446 The Kubernetes conf files must be owned by root. 242447 The Kubernetes Kube Proxy must have file permissions set to 644 or more restrictive. 242448 The Kubernetes Kube Proxy must be owned by root. 242449 The Kubernetes Kubelet certificate authority file must have file permissions set to 644 or more restrictive. 242450 The Kubernetes Kubelet certificate authority must be owned by root. 242451 The Kubernetes component PKI must be owned by root. 242452 The Kubernetes kubelet KubeConfig must have file permissions set to 644 or more restrictive. 242453 The Kubernetes kubelet KubeConfig file must be owned by root. 242454 The Kubernetes kubeadm.conf must be owned by root. 242455 The Kubernetes kubeadm.conf must have file permissions set to 644 or more restrictive. 242456 The Kubernetes kubelet config must have file permissions set to 644 or more restrictive. 242457 The Kubernetes kubelet config must be owned by root. 
242459 The Kubernetes etcd must have file permissions set to 644 or more restrictive. 242460 The Kubernetes admin.conf must have file permissions set to 644 or more restrictive. 242466 The Kubernetes PKI CRT must have file permissions set to 644 or more restrictive. 242467 The Kubernetes PKI keys must have file permissions set to 600 or more restrictive. 245542 Kubernetes API Server must disable basic authentication to protect information in transit. 245544 Kubernetes endpoints must use approved organizational certificate and key pair to protect information in transit. 254801 Kubernetes must enable PodSecurity admission controller on static pods and Kubelets. ","categories":"","description":"Compliant user management of your Gardener projects","excerpt":"Compliant user management of your Gardener projects","ref":"/docs/security-and-compliance/kubernetes-hardening/","tags":["task"],"title":"Kubernetes Cluster Hardening Procedure"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/client-tools/","tags":"","title":"Set Up Client Tools"},{"body":"A curated list of awesome Kubernetes sources. 
Inspired by @sindresorhus’ awesome\nSetup Install Docker for Mac Install Docker for Windows Run a Kubernetes Cluster on your local machine A Place That Marks the Beginning of a Journey Read the kubernetes.io documentation Take an online Udemy course Kubernetes Community Overview and Contributions Guide by Ihor Dvoretskyi Kubernetes: The Future of Cloud Hosting by Meteorhacks Kubernetes by Google by Gaston Pantana Application Containers: Kubernetes and Docker from Scratch by Keith Tenzer Learn the Kubernetes Key Concepts in 10 Minutes by Omer Dawelbeit The Children’s Illustrated Guide to Kubernetes by Deis :-) Docker Kubernetes Lab Handbook by Peng Xiao Interactive Learning Environments Learn Kubernetes using an interactive environment without requiring downloads or configuration\n Interactive Kubernetes Tutorials Kubernetes: From Basics to Guru Kubernetes Bootcamp Massive Open Online Courses / Tutorials List of available free online courses(MOOC) and tutorials\n DevOps with Kubernetes Introduction to Kubernetes Courses Scalable Microservices with Kubernetes at Udacity Introduction to Kubernetes at edX Tutorials Kubernetes Tutorials by Kubernetes Team Kubernetes By Example by OpenShift Team Kubernetes Tutorial by Tutorialspoint Package Managers Helm KPM RPC gRPC RBAC Kubernetes RBAC: Role-Based Access Control Secret Generation and Management Vault auth plugin backend: Kubernetes Vault controller kube-lego k8sec kubernetes-vault kubesec - Secure Secret management Machine Learning TensorFlow k8s mxnet-operator - Tools for ML/MXNet on Kubernetes. kubeflow - Machine Learning Toolkit for Kubernetes. seldon-core - Open source framework for deploying machine learning models on Kubernetes Raspberry Pi Some of the awesome findings and experiments on using Kubernetes with Raspberry Pi.\n Kubecloud Setting up a Kubernetes on ARM cluster Setup Kubernetes on a Raspberry Pi Cluster easily the official way! 
by Mathias Renner and Lucas Käldström How to Build a Kubernetes Cluster with ARM Raspberry Pi then run .NET Core on OpenFaas by Scott Hanselman Contributing Contributions are most welcome!\nThis list is just getting started, please contribute to make it super awesome.\n","categories":"","description":"","excerpt":"A curated list of awesome Kubernetes sources. Inspired by …","ref":"/curated-links/","tags":"","title":"Curated Links"},{"body":" ","categories":"","description":"Gardener - Kubernetes automation including day 2 operations","excerpt":"Gardener - Kubernetes automation including day 2 operations","ref":"/docs/resources/videos/gardener-teaser/","tags":"","title":"Gardener Teaser"},{"body":"","categories":"","description":"Interesting and useful content on Kubernetes","excerpt":"Interesting and useful content on Kubernetes","ref":"/docs/resources/","tags":"","title":"Resources"},{"body":"Contributing to Gardener Welcome Welcome to the Contributor section of Gardener. Here you can learn how it is possible for you to contribute your ideas and expertise to the project and have it grow even more.\nPrerequisites Before you begin contributing to Gardener, there are a couple of things you should become familiar with and complete first.\nCode of Conduct All members of the Gardener community must abide by the Contributor Covenant. Only by respecting each other can we develop a productive, collaborative community. Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting gardener.opensource@sap.com and/or a Gardener project maintainer.\nDeveloper Certificate of Origin For legal reasons, contributors will be asked to accept a Developer Certificate of Origin (DCO) before they submit their first pull request to this project. This happens in an automated fashion during the submission process. 
We use the standard DCO text of the Linux Foundation.\nLicense Your contributions to Gardener must be licensed properly:\n Code contributions must be licensed under the Apache 2.0 License Documentation contributions must be licensed under the Creative Commons Attribution 4.0 International License Contributing Gardener uses GitHub to manage reviews of pull requests.\n If you are a new contributor, see: Steps to Contribute\n If you have a trivial fix or improvement, go ahead and create a pull request.\n If you plan to do something more involved, first discuss your ideas on our mailing list. This will avoid unnecessary work and surely give you and us a good deal of inspiration.\n Relevant coding style guidelines are the Go Code Review Comments and the Formatting and style section of Peter Bourgon’s Go: Best Practices for Production Environments.\n Steps to Contribute Should you wish to work on an issue, please claim it first by commenting on the GitHub issue that you want to work on it. This is to prevent duplicated efforts from contributors on the same issue.\nIf you have questions about one of the issues, with or without the tag, please comment on them and one of the maintainers will clarify it.\nWe kindly ask you to follow the Pull Request Checklist to ensure reviews can happen accordingly.\nPull Request Checklist Branch from the master branch and, if needed, rebase to the current master branch before submitting your pull request. If it doesn’t merge cleanly with master you may be asked to rebase your changes.\n Commits should be as small as possible, while ensuring that each commit is correct independently (i.e., each commit should compile and pass tests).\n Test your changes as thoroughly as possible before you commit them. Preferably, automate your testing with unit / integration tests. 
If tested manually, provide information about the test scope in the PR description (e.g., “Test passed: Upgrade K8s version from 1.14.5 to 1.15.2 on AWS, Azure, GCP, Alicloud, Openstack.”).\n When creating the PR, make your Pull Request description as detailed as possible to help out the reviewers.\n Create Work In Progress [WIP] pull requests only if you need a clarification or an explicit review before you can continue your work item.\n If your patch is not getting reviewed or you need a specific person to review it, you can @-reply a reviewer asking for a review in the pull request or a comment, or you can ask for a review on our mailing list.\n If you add new features, make sure that they are documented in the Gardener documentation.\n If your changes are relevant for operators, consider updating the ops toolbelt image.\n Post review:\n If a review requires you to change your commit(s), please test the changes again. Amend the affected commit(s) and force push onto your branch. Set respective comments in your GitHub review to resolved. Create a general PR comment to notify the reviewers that your amendments are ready for another round of review. Contributing Bigger Changes If you want to contribute bigger changes to Gardener, such as when introducing new API resources and their corresponding controllers, or implementing an approved Gardener Enhancement Proposal, follow the guidelines outlined in Contributing Bigger Changes.\nAdding Already Existing Documentation If you want to add documentation that already exists on GitHub to the website, you should update the central manifest instead of duplicating the content. To find out how to do that, see Adding Already Existing Documentation.\nIssues and Planning We use GitHub issues to track bugs and enhancement requests. Please provide as much context as possible when you open an issue. The information you provide must be comprehensive enough to reproduce that issue for the assignee. 
Therefore, contributors may use but aren’t restricted to the issue template provided by the Gardener maintainers.\nZenHub is used for planning:\n Install the ZenHub Chrome plugin Log in to ZenHub Open the Gardener ZenHub workspace Security Release Process See Security Release Process.\nCommunity Slack Channel #gardener, sign up here.\nMailing List gardener@googlegroups.com\nThe mailing list is hosted through Google Groups. To receive the list’s emails, join the group as you would any other Google Group.\nOther For additional channels where you can reach us, as well as links to our bi-weekly meetings, visit the Community page.\n","categories":"","description":"Contributors guides for code and documentation","excerpt":"Contributors guides for code and documentation","ref":"/docs/contribute/","tags":"","title":"Contribute"},{"body":"Images on the website have to contribute to the aesthetics and comprehensibility of the materials, without compromising the experience of loading and browsing pages. That means crisp, clear images, a consistent layout and color scheme, suitable dimensions and aspect ratios, and flicker-free and fast loading or the feeling of it, even on unreliable mobile networks and devices.\nImage Production Guidelines A good, detailed reference for optimal use of images for the web can be found at web.dev’s Fast Load Times topic. The following summarizes some key points plus suggestions for tools support.\nYou are strongly encouraged to use vector images (SVG) as much as possible. They scale seamlessly without compromising the quality and are easier to maintain.\nIf you are just now starting with SVG authoring, here are some tool suggestions: Figma (online/Win/Mac), Sketch (Mac only).\nFor raster images (JPG, PNG, GIF), consider the following requirements and choose a tool that enables you to conform to them:\n Be mindful about image size, the total page size and loading times. Larger images (\u003e10K) need to support progressive rendering. 
Consult your favorite authoring tool’s documentation to find out if and how it supports that. The site delivers the optimal media content format and size depending on the device screen size. You need to provide several variants (large screen, laptop, tablet, phone). Your authoring tool should be able to resize and resample images. Always save the largest size first and then downscale from it to avoid image quality loss. If you are looking for a tool that conforms to those guidelines, IrfanView is a very good option.\nScreenshots can be taken with whatever tool you have available. A simple Alt+PrtSc (Win) and paste into an image processing tool to save it does the job. If you need to add emphasized steps (1,2,3) when you describe a process on a screenshot, you can use Snagit. Use red color and numbers. Mind the requirements for raster images laid out above.\nDiagrams can be exported as PNG/JPG from a diagramming tool such as Visio or even PowerPoint. Pick whichever you are comfortable with to design the diagram and make sure you comply with the requirements for the raster images production above. Diagrams produced as SVG are welcome too if your authoring tool supports exporting in that format. In any case, ensure that your diagrams “blend” with the content on the site - use the same color scheme and geometry style. Do not complicate diagrams too much. The site also supports Mermaid diagrams produced with markdown and rendered as SVG. You don’t need special tools for them, but for more complex ones you might want to prototype your diagram with Mermaid’s online live editor, before encoding it in your markdown. More tips on using Mermaid can be found in the Shortcodes documentation.\nUsing Images in Markdown The standard for adding images to a topic is to use markdown’s ![caption](image-path). 
If the image is not showing properly, or if you wish to serve images close to their natural size and avoid scaling, then you can use HTML5’s \u003cpicture\u003e tag.\nExample:\n\u003cpicture\u003e \u003c!-- default, laptop-width-L max 1200px --\u003e \u003csource srcset=\"https://github.tools.sap/kubernetes/documentation/tree/master/website/documentation/015-tutorials/my-guide/images/overview-XL.png\" media=\"(min-width: 1000px)\"\u003e \u003c!-- default, laptop-width max 1000px --\u003e \u003csource srcset=\"https://github.tools.sap/kubernetes/documentation/tree/master/website/documentation/015-tutorials/my-guide/images/overview-L.png\" media=\"(min-width: 1400px)\"\u003e \u003c!-- default, tablets-width max 750px --\u003e \u003csource srcset=\"https://github.tools.sap/kubernetes/documentation/tree/master/website/documentation/015-tutorials/my-guide/images/overview-M.png\" media=\"(min-width: 750px)\"\u003e \u003c!-- default, phones-width max 450px --\u003e \u003cimg src=\"https://github.tools.sap/kubernetes/documentation/tree/master/website/documentation/015-tutorials/my-guide/images/overview.png\" /\u003e \u003c/picture\u003e When deciding on image sizes, consider the breakpoints in the example above as maximum widths for each image variant you provide. Note that the site is designed for maximum width 1200px. 
There is no point in creating images larger than that, since they will be scaled down.\nFor a nice overview on making the best use of responsive images with HTML5, please refer to the Responsive Images guide.\n","categories":"","description":"","excerpt":"Using images on the website has to contribute to the aesthetics and …","ref":"/docs/contribute/documentation/images/","tags":"","title":"Working with Images"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/high-availability/","tags":"","title":"High Availability"},{"body":"Run Partial DISA K8s STIGs Ruleset Against a Gardener Shoot Cluster Introduction This part shows how to run the DISA K8s STIGs ruleset against a Gardener shoot cluster. The guide features the managedk8s provider, which does not implement all of the DISA K8s STIG rules since it assumes that the user running the ruleset does not have access to the environment (the seed in this particular case) in which the control plane components reside.\nPrerequisites Make sure you have diki installed and have a running Gardener shoot cluster.\nConfiguration We will be using the sample Partial DISA K8s STIG for Shoots configuration file for this run. You will need to set the provider.args.kubeconfigPath field to point to a shoot admin kubeconfig.\nIn case you need instructions on how to generate such a kubeconfig, please read Accessing Shoot Clusters.\nAdditional metadata such as the shoot’s name can also be included in the provider.metadata section. The metadata section can be used to add additional context to different diki runs.\nThe provided configuration contains the recommended rule options for running the managedk8s provider ruleset against a shoot cluster, but you can modify the rule option parameters according to your requirements. 
All available options can be found in the managedk8s example configuration.\nRunning the DISA K8s STIGs Ruleset To run diki against a Gardener shoot cluster, run the following command:\ndiki run \\ --config=./example/guides/partial-disa-k8s-stig-shoot.yaml \\ --provider=managedk8s \\ --ruleset-id=disa-kubernetes-stig \\ --ruleset-version=v2r1 \\ --output=disa-k8s-stigs-report.json Generating a Report We can use the file generated in the previous step to create an html report by using the following command:\ndiki report generate \\ --output=disa-k8s-stigs-report.html \\ disa-k8s-stigs-report.json ","categories":"","description":"How can I check whether my shoot cluster fulfills the DISA STIGs security requirements?","excerpt":"How can I check whether my shoot cluster fulfills the DISA STIGs …","ref":"/docs/security-and-compliance/partial-disa-k8s-stig-shoot/","tags":"","title":"Run DISA K8s STIGs Ruleset"},{"body":" ","categories":"","description":"The Illustrated Children's Guide to Kubernetes. Written and performed by Matt Butcher Illustrated by Bailey Beougher","excerpt":"The Illustrated Children's Guide to Kubernetes. Written and performed …","ref":"/docs/resources/videos/fairy-tail/","tags":"","title":"The Illustrated Guide to Kubernetes"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/resources/videos/","tags":"","title":"Videos"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/administer-shoots/","tags":"","title":"Administer Client (Shoot) Clusters"},{"body":"Overview Gardener aims to comply with public security standards and guidelines, such as the Security Technical Implementation Guide (STIG) for Kubernetes from Defense Information Systems Agency (DISA). The DISA Kubernetes STIG is a set of rules that provide recommendations for secure deployment and operation of Kubernetes. 
It covers various aspects of Kubernetes security, including the configurations of the Kubernetes API server and other components, cluster management, certificate management, and the handling of updates and patches.\nWhile Gardener aims to follow this guideline, we also recognize that not all of the rules may be directly applicable or optimal for Gardener’s specific environment. Therefore, some of the requirements are adjusted. Rules that are not applicable to Gardener are skipped with an appropriate justification.\nFor every release, we check that Gardener is capable of creating security hardened shoot clusters, reconfirming that the configurations which are not secure by default (as per Gardener Kubernetes Cluster Hardening Procedure) are still possible and work as expected.\nIn order to automate and ease this process, Gardener uses a tool called diki.\nSecurity Hardened Shoot Configurations The following security hardened shoot configurations were used in order to generate the compliance report.\n AWS kind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: aws spec: cloudProfileName: aws kubernetes: kubeAPIServer: admissionPlugins: - name: PodSecurity config: apiVersion: pod-security.admission.config.k8s.io/v1beta1 kind: PodSecurityConfiguration defaults: enforce: baseline audit: baseline warn: baseline disabled: false auditConfig: auditPolicy: configMapRef: name: audit-policy version: \"1.28\" enableStaticTokenKubeconfig: false networking: type: calico pods: 100.64.0.0/12 nodes: 10.180.0.0/16 services: 100.104.0.0/13 ipFamilies: - IPv4 provider: type: aws controlPlaneConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.180.0.0/16 zones: - internal: 10.180.48.0/20 name: eu-west-1c public: 10.180.32.0/20 workers: 10.180.0.0/19 workers: - cri: name: containerd name: worker-kkfk1 machine: type: 
m5.large image: name: gardenlinux architecture: amd64 maximum: 2 minimum: 2 maxSurge: 1 maxUnavailable: 0 volume: type: gp3 size: 50Gi zones: - eu-west-1c workersSettings: sshAccess: enabled: false purpose: evaluation region: eu-west-1 secretBindingName: secretBindingName Azure kind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: azure spec: cloudProfileName: az kubernetes: kubeAPIServer: admissionPlugins: - name: PodSecurity config: apiVersion: pod-security.admission.config.k8s.io/v1beta1 kind: PodSecurityConfiguration defaults: enforce: baseline audit: baseline warn: baseline disabled: false auditConfig: auditPolicy: configMapRef: name: audit-policy version: \"1.28\" enableStaticTokenKubeconfig: false networking: type: calico pods: 100.64.0.0/12 nodes: 10.180.0.0/16 services: 100.104.0.0/13 ipFamilies: - IPv4 provider: type: azure controlPlaneConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig infrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.180.0.0/16 workers: 10.180.0.0/16 zoned: true workers: - cri: name: containerd name: worker-g7p4p machine: type: Standard_A4_v2 image: name: gardenlinux architecture: amd64 maximum: 2 minimum: 2 maxSurge: 1 maxUnavailable: 0 volume: type: StandardSSD_LRS size: 50Gi zones: - '3' workersSettings: sshAccess: enabled: false purpose: evaluation region: westeurope secretBindingName: secretBindingName GCP kind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: gcp spec: cloudProfileName: gcp kubernetes: kubeAPIServer: admissionPlugins: - name: PodSecurity config: apiVersion: pod-security.admission.config.k8s.io/v1beta1 kind: PodSecurityConfiguration defaults: enforce: baseline audit: baseline warn: baseline disabled: false auditConfig: auditPolicy: configMapRef: name: audit-policy version: \"1.28\" enableStaticTokenKubeconfig: false networking: type: calico pods: 100.64.0.0/12 
nodes: 10.180.0.0/16 services: 100.104.0.0/13 ipFamilies: - IPv4 provider: type: gcp controlPlaneConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig zone: europe-west1-b infrastructureConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: workers: 10.180.0.0/16 workers: - cri: name: containerd name: worker-bex82 machine: type: n1-standard-2 image: name: gardenlinux architecture: amd64 maximum: 2 minimum: 2 maxSurge: 1 maxUnavailable: 0 volume: type: pd-balanced size: 50Gi zones: - europe-west1-b workersSettings: sshAccess: enabled: false purpose: evaluation region: europe-west1 secretBindingName: secretBindingName OpenStack kind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: openstack spec: cloudProfileName: converged-cloud-cp kubernetes: kubeAPIServer: admissionPlugins: - name: PodSecurity config: apiVersion: pod-security.admission.config.k8s.io/v1beta1 kind: PodSecurityConfiguration defaults: enforce: baseline audit: baseline warn: baseline disabled: false auditConfig: auditPolicy: configMapRef: name: audit-policy version: \"1.28\" enableStaticTokenKubeconfig: false networking: type: calico pods: 100.64.0.0/12 nodes: 10.180.0.0/16 services: 100.104.0.0/13 ipFamilies: - IPv4 provider: type: openstack controlPlaneConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerProvider: f5 infrastructureConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: workers: 10.180.0.0/16 floatingPoolName: FloatingIP-external-cp workers: - cri: name: containerd name: worker-dqty2 machine: type: g_c2_m4 image: name: gardenlinux architecture: amd64 maximum: 2 minimum: 2 maxSurge: 1 maxUnavailable: 0 zones: - eu-de-1b workersSettings: sshAccess: enabled: false purpose: evaluation region: eu-de-1 secretBindingName: secretBindingName Diki Configuration The following diki 
configuration was used in order to test each of the shoot clusters described above. Mind that the rules regarding audit logging are skipped because organizations have different requirements and Gardener can integrate with different audit logging solutions.\n Configuration metadata: ... providers: - id: gardener name: Gardener metadata: ... args: ... rulesets: - id: disa-kubernetes-stig name: DISA Kubernetes Security Technical Implementation Guide version: v1r11 args: maxRetries: 5 ruleOptions: - ruleID: \"242402\" skip: enabled: true justification: \"Gardener can integrate with different audit logging solutions\" - ruleID: \"242403\" skip: enabled: true justification: \"Gardener can integrate with different audit logging solutions\" - ruleID: \"242414\" args: acceptedPods: - podMatchLabels: k8s-app: node-local-dns namespaceMatchLabels: kubernetes.io/metadata.name: kube-system justification: \"node local dns requires port 53 in order to operate properly\" ports: - 53 - ruleID: \"242445\" args: expectedFileOwner: users: [\"0\", \"65532\"] groups: [\"0\", \"65532\"] - ruleID: \"242446\" args: expectedFileOwner: users: [\"0\", \"65532\"] groups: [\"0\", \"65532\"] - ruleID: \"242451\" args: expectedFileOwner: users: [\"0\", \"65532\"] groups: [\"0\", \"65532\"] - ruleID: \"242462\" skip: enabled: true justification: \"Gardener can integrate with different audit logging solutions\" - ruleID: \"242463\" skip: enabled: true justification: \"Gardener can integrate with different audit logging solutions\" - ruleID: \"242464\" skip: enabled: true justification: \"Gardener can integrate with different audit logging solutions\" - ruleID: \"245543\" args: acceptedTokens: - user: \"health-check\" uid: \"health-check\" - ruleID: \"254800\" args: minPodSecurityLevel: \"baseline\" output: minStatus: Passed Security Compliance Report for Hardened Shoot Clusters The report can be reviewed directly or downloaded by clicking here.\n *,:after,:before{border:0 solid 
#e5e7eb;box-sizing:border-box}:after,:before{--tw-content:\"\"}html{-webkit-text-size-adjust:100%;font-feature-settings:normal;font-family:ui-sans-serif,system-ui,-apple-system,BlinkMacSystemFont,Segoe UI,Roboto,Helvetica Neue,Arial,Noto Sans,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol,Noto Color Emoji;font-variation-settings:normal;line-height:1.5;-moz-tab-size:4;-o-tab-size:4;tab-size:4}body{line-height:inherit;margin:0}hr{border-top-width:1px;color:inherit;height:0}abbr:where([title]){-webkit-text-decoration:underline dotted;text-decoration:underline dotted}h1,h2,h3,h4,h5,h6{font-size:inherit;font-weight:inherit}b,strong{font-weight:bolder}code,kbd,pre,samp{font-family:ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,monospace;font-size:1em}small{font-size:80%}sub,sup{font-size:75%;line-height:0;position:relative;vertical-align:initial}sub{bottom:-.25em}sup{top:-.5em}table{border-collapse:collapse;border-color:inherit;text-indent:0}button,input,optgroup,select,textarea{font-feature-settings:inherit;color:inherit;font-family:inherit;font-size:100%;font-variation-settings:inherit;font-weight:inherit;line-height:inherit;margin:0;padding:0}button,select{text-transform:none}[type=button],[type=reset],[type=submit],button{-webkit-appearance:button;background-color:initial;background-image:none}:-moz-focusring{outline:auto}:-moz-ui-invalid{box-shadow:none}progress{vertical-align:initial}::-webkit-inner-spin-button,::-webkit-outer-spin-button{height:auto}[type=search]{-webkit-appearance:textfield;outline-offset:-2px}::-webkit-search-decoration{-webkit-appearance:none}::-webkit-file-upload-button{-webkit-appearance:button;font:inherit}summary{display:list-item}blockquote,dd,dl,figure,h1,h2,h3,h4,h5,h6,hr,p,pre{margin:0}fieldset{margin:0}fieldset,legend{padding:0}menu,ol,ul{list-style:none;margin:0;padding:0}dialog{padding:0}textarea{resize:vertical}input::-moz-placeholder,textarea::-moz-placeholder{color:#9ca3af;opacity:1}input
::placeholder,textarea::placeholder{color:#9ca3af;opacity:1}[role=button],button{cursor:pointer}:disabled{cursor:default}img,video{height:auto;max-width:100%}[hidden]{display:none}*,::backdrop,:after,:before{--tw-border-spacing-x:0;--tw-border-spacing-y:0;--tw-translate-x:0;--tw-translate-y:0;--tw-rotate:0;--tw-skew-x:0;--tw-skew-y:0;--tw-scale-x:1;--tw-scale-y:1;--tw-pan-x: ;--tw-pan-y: ;--tw-pinch-zoom: ;--tw-scroll-snap-strictness:proximity;--tw-gradient-from-position: ;--tw-gradient-via-position: ;--tw-gradient-to-position: ;--tw-ordinal: ;--tw-slashed-zero: ;--tw-numeric-figure: ;--tw-numeric-spacing: ;--tw-numeric-fraction: ;--tw-ring-inset: ;--tw-ring-offset-width:0px;--tw-ring-offset-color:#fff;--tw-ring-color:#3b82f680;--tw-ring-offset-shadow:0 0 #0000;--tw-ring-shadow:0 0 #0000;--tw-shadow:0 0 #0000;--tw-shadow-colored:0 0 #0000;--tw-blur: ;--tw-brightness: ;--tw-contrast: ;--tw-grayscale: ;--tw-hue-rotate: ;--tw-invert: ;--tw-saturate: ;--tw-sepia: ;--tw-drop-shadow: ;--tw-backdrop-blur: ;--tw-backdrop-brightness: ;--tw-backdrop-contrast: ;--tw-backdrop-grayscale: ;--tw-backdrop-hue-rotate: ;--tw-backdrop-invert: ;--tw-backdrop-opacity: ;--tw-backdrop-saturate: ;--tw-backdrop-sepia: }.tw-absolute{position:absolute}.tw-relative{position:relative}.tw-right-3{right:.75rem}.tw-top-3{top:.75rem}.tw-flex{display:flex}.tw-hidden{display:none}.tw-list-inside{list-style-position:inside}.tw-list-disc{list-style-type:disc}.tw-list-none{list-style-type:none}.tw-flex-col{flex-direction:column}.tw-justify-center{justify-content:center}.tw-overflow-x-auto{overflow-x:auto}.tw-rounded{border-radius:.25rem}.tw-rounded-lg{border-radius:.5rem}.tw-bg-gray-200{--tw-bg-opacity:1;background-color:rgb(229 231 
235/var(--tw-bg-opacity))}.tw-p-1{padding:.25rem}.tw-p-4{padding:1rem}.tw-px-6{padding-left:1.5rem;padding-right:1.5rem}.tw-pb-5{padding-bottom:1.25rem}.tw-pl-2{padding-left:.5rem}.tw-pl-5{padding-left:1.25rem}.tw-pr-2{padding-right:.5rem}.tw-pt-2{padding-top:.5rem}.tw-text-2xl{font-size:1.5rem;line-height:2rem}.tw-text-3xl{font-size:1.875rem;line-height:2.25rem}.tw-text-lg{font-size:1.125rem;line-height:1.75rem}.tw-text-xl{font-size:1.25rem;line-height:1.75rem}.tw-font-bold{font-weight:700}.tw-font-medium{font-weight:500}.tw-font-semibold{font-weight:600}.hover\\:tw-bg-gray-100:hover{--tw-bg-opacity:1;background-color:rgb(243 244 246/var(--tw-bg-opacity))} .arrow { border: solid black; border-width: 0px 3px 3px 0px; display: inline-block; padding: 4px; } .right { transform: rotate(-45deg); -webkit-transform: rotate(-45deg); } .left { transform: rotate(135deg); -webkit-transform: rotate(135deg); } .up { transform: rotate(-135deg); -webkit-transform: rotate(-135deg); } .down { transform: rotate(45deg); -webkit-transform: rotate(45deg); } function collapse(event) { const parent = event.currentTarget.parentElement const list = parent.getElementsByTagName('ul')[0] const arrow = event.currentTarget.getElementsByTagName('i')[0] if (list.classList.contains('tw-hidden') === true) { list.classList.remove('tw-hidden') arrow.classList.replace('right', 'down') return } list.classList.add('tw-hidden') arrow.classList.replace('down', 'right') } function cpCode(event) { const parent = event.currentTarget.parentElement const code = parent.getElementsByTagName('pre')[0].innerText navigator.clipboard.writeText(code); } Compliance Run (07-25-2024) Diki Version: v0.10.0\nGlossary 🟢 Passed: Rule check has been fulfilled. 🔵 Skipped: Rule check has been considered irrelevant for the specific scenario and will not be run. 🔵 Accepted: Rule check may or may not have been run, but it was decided by the user that the check is not a finding. 
🟠 Warning: Rule check has encountered an ambiguous condition or configuration preventing the ability to determine if the check is fulfilled or not. 🔴 Failed: Rule check has been unfulfilled, can be considered a finding. 🔴 Errored: Rule check has errored during runtime. It cannot be determined whether the check is fulfilled or not. 🟠 Not Implemented: Rule check has not been implemented yet. Provider Gardener\n Evaluated targets aws (gardenVirtualCloudProvider: gcp, gardenerVersion: v1.99.2, projectName: diki-comp, seedCloudProvider: aws, seedKubernetesVersion: v1.29.4, shootCloudProvider: aws, shootKubernetesVersion: v1.28.10, time: 07-25-2024 13:20:33) azure (gardenVirtualCloudProvider: gcp, gardenerVersion: v1.99.2, projectName: diki-comp, seedCloudProvider: azure, seedKubernetesVersion: v1.29.4, shootCloudProvider: azure, shootKubernetesVersion: v1.28.10, time: 07-25-2024 13:21:30) gcp (gardenVirtualCloudProvider: gcp, gardenerVersion: v1.99.2, projectName: diki-comp, seedCloudProvider: gcp, seedKubernetesVersion: v1.29.4, shootCloudProvider: gcp, shootKubernetesVersion: v1.28.10, time: 07-25-2024 13:22:14) openstack (gardenVirtualCloudProvider: gcp, gardenerVersion: v1.99.2, projectName: diki-comp, seedCloudProvider: openstack, seedKubernetesVersion: v1.29.4, shootCloudProvider: openstack, shootKubernetesVersion: v1.28.10, time: 07-25-2024 13:24:21) v1r11 DISA Kubernetes Security Technical Implementation Guide (61x Passed 🟢, 24x Skipped 🔵, 7x Accepted 🔵, 7x Warning 🟠, 3x Failed 🔴) 🟢 Passed The Kubernetes Controller Manager must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination (MEDIUM 242376) Option tls-min-version has not been set. 
aws kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--aws azure kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--openstack Kubernetes Scheduler must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination (MEDIUM 242377) Option tls-min-version has not been set. aws cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--aws azure cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--azure gcp cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--gcp openstack cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--openstack The Kubernetes API Server must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination (MEDIUM 242378) Option tls-min-version has not been set. aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack The Kubernetes etcd must use TLS to protect the confidentiality of sensitive data during electronic dissemination (MEDIUM 242379) Option client-transport-security.auto-tls set to allowed value. 
aws kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws azure kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure gcp kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp openstack kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack The Kubernetes Controller Manager must create unique service accounts for each work payload(HIGH 242381) Option use-service-account-credentials set to allowed value. aws kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--aws azure kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--openstack The Kubernetes API Server must enable Node,RBAC as the authorization mode (MEDIUM 242382) Option authorization-mode set to expected value. aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack Kubernetes must separate user functionality (MEDIUM 242383) System resource in system namespaces. aws kind: Service name: kubernetes namespace: default azure kind: Service name: kubernetes namespace: default gcp kind: Service name: kubernetes namespace: default openstack kind: Service name: kubernetes namespace: default The Kubernetes API server must have the insecure port flag disabled (HIGH 242386) Option insecure-port not set. 
aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack The Kubernetes Kubelet must have the \"readOnlyPort\" flag disabled (HIGH 242387) Option readOnlyPort not set. aws kind: node name: ip-IP-Address.eu-west-1.compute.internal kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb The Kubernetes API server must have the insecure bind address not set (HIGH 242388) Option insecure-bind-address not set. aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack The Kubernetes API server must have the secure port set (MEDIUM 242389) Option secure-port set to allowed value. aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack The Kubernetes API server must have anonymous authentication disabled (HIGH 242390) Option anonymous-auth set to allowed value. 
aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack The Kubernetes Kubelet must have anonymous authentication disabled (HIGH 242391) Option authentication.anonymous.enabled set to allowed value. aws kind: node name: ip-IP-Address.eu-west-1.compute.internal kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb The Kubernetes kubelet must enable explicit authorization (HIGH 242392) Option authorization.mode set to allowed value. 
aws kind: node name: ip-IP-Address.eu-west-1.compute.internal kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb Kubernetes Worker Nodes must not have sshd service running (MEDIUM 242393) SSH daemon service not installed aws kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs Kubernetes Worker Nodes must not have the sshd service enabled (MEDIUM 242394) SSH daemon disabled (or could not be probed) aws kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs Kubernetes dashboard must not be enabled (MEDIUM 242395) Kubernetes dashboard not installed aws azure gcp openstack The Kubernetes kubelet staticPodPath must not enable static pods (HIGH 242397) Option staticPodPath not set. 
aws kind: node name: ip-IP-Address.eu-west-1.compute.internal kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb The Kubernetes API server must have Alpha APIs disabled (MEDIUM 242400) Option featureGates.AllAlpha not set. aws cluster: seed kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws cluster: seed kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--aws cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--aws cluster: shoot kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system azure cluster: seed kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure cluster: seed kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--azure cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--azure cluster: shoot kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system gcp cluster: seed kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp cluster: seed kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--gcp cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--gcp cluster: shoot 
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system openstack cluster: seed kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack cluster: seed kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--openstack cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--openstack cluster: shoot kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system Kubernetes Kubelet must deny hostname override (MEDIUM 242404) Flag hostname-override not set. aws kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs The Kubernetes kubelet configuration file must be owned by root (MEDIUM 242406) File has expected owners aws details: fileName: /etc/systemd/system/kubelet.service, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal azure details: fileName: /etc/systemd/system/kubelet.service, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp details: fileName: /etc/systemd/system/kubelet.service, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack details: fileName: /etc/systemd/system/kubelet.service, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs The Kubernetes kubelet configuration files must have file permissions set to 
644 or more restrictive (MEDIUM 242407) File has expected permissions aws details: fileName: /etc/systemd/system/kubelet.service, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal azure details: fileName: /etc/systemd/system/kubelet.service, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp details: fileName: /etc/systemd/system/kubelet.service, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack details: fileName: /etc/systemd/system/kubelet.service, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs Kubernetes Controller Manager must disable profiling (MEDIUM 242409) Option profiling set to allowed value. aws kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--aws azure kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--openstack The Kubernetes cluster must use non-privileged host ports for user pods (MEDIUM 242414) Container does not use hostPort \u003c 1024. 
aws cluster: seed kind: pod name: aws-custom-route-controller-7856476fd4-hsq29 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: blackbox-exporter-5d75c47dcd-2v7cs namespace: shoot--diki-comp--aws cluster: seed kind: pod name: blackbox-exporter-5d75c47dcd-d7bpd namespace: shoot--diki-comp--aws cluster: seed kind: pod name: cert-controller-manager-755dbd646b-hgxzx namespace: shoot--diki-comp--aws cluster: seed kind: pod name: cloud-controller-manager-769c9b45dd-c5vxq namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-controller-7669f6bfc4-nscqb namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-validation-654f9b49d7-xfjxn namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-validation-654f9b49d7-xs2pt namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: event-logger-7cdddb58d8-65h7q namespace: shoot--diki-comp--aws cluster: seed kind: pod name: 
extension-shoot-lakom-service-6df659477c-28tts namespace: shoot--diki-comp--aws cluster: seed kind: pod name: extension-shoot-lakom-service-6df659477c-5q5st namespace: shoot--diki-comp--aws cluster: seed kind: pod name: gardener-resource-manager-6d957ff4b4-56mqn namespace: shoot--diki-comp--aws cluster: seed kind: pod name: gardener-resource-manager-6d957ff4b4-b2lbj namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-apiserver-76d9c64f5b-7gwf4 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-state-metrics-68dfcd5d48-5mdnv namespace: shoot--diki-comp--aws cluster: seed kind: pod name: machine-controller-manager-7454c6df68-z77xw namespace: shoot--diki-comp--aws cluster: seed kind: pod name: machine-controller-manager-7454c6df68-z77xw namespace: shoot--diki-comp--aws cluster: seed kind: pod name: network-problem-detector-controller-5f458c7579-82tns namespace: shoot--diki-comp--aws cluster: seed kind: pod name: plutono-567d7c946b-7xgjl namespace: shoot--diki-comp--aws cluster: seed kind: pod name: plutono-567d7c946b-7xgjl namespace: shoot--diki-comp--aws cluster: seed kind: pod name: plutono-567d7c946b-7xgjl namespace: shoot--diki-comp--aws cluster: seed kind: pod name: plutono-567d7c946b-7xgjl namespace: shoot--diki-comp--aws cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: shoot-dns-service-645f556cf4-7xc4r namespace: shoot--diki-comp--aws 
cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-admission-controller-59bc4d9d8f-hxrh7 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-admission-controller-59bc4d9d8f-vf58j namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-recommender-6f499cfd88-lnbrx namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-updater-746fb98848-8zzf8 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpn-seed-server-547576865c-x6fr2 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpn-seed-server-547576865c-x6fr2 namespace: shoot--diki-comp--aws cluster: shoot kind: pod name: apiserver-proxy-kx2mw namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-kx2mw namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-wtlv2 namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-wtlv2 namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-82dwq namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-wh7rj namespace: kube-system cluster: shoot kind: pod name: calico-node-9nlzv namespace: kube-system cluster: shoot kind: pod name: calico-node-9nlzv namespace: kube-system cluster: shoot kind: pod name: calico-node-l94hn namespace: kube-system cluster: shoot kind: pod name: calico-node-l94hn namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-x9rl9 namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-6rlcn namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-g7k2t namespace: kube-system cluster: shoot kind: pod name: 
calico-typha-horizontal-autoscaler-586ff75c6b-vtvrw namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-7gf59 namespace: kube-system cluster: shoot kind: pod name: coredns-5cc8785ccd-x8bs2 namespace: kube-system cluster: shoot kind: pod name: coredns-5cc8785ccd-xwwgh namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-mrv64 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-mrv64 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-mrv64 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-s74n2 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-s74n2 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-s74n2 namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-nd86n namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-vjfwc namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-4lhcz namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-4lhcz namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot kind: pod name: metrics-server-5776b47bc7-g7qjf namespace: kube-system cluster: shoot kind: pod name: metrics-server-5776b47bc7-rfmd5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-s5286 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-x5rm5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-5kv4k namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-s4wlg namespace: kube-system cluster: shoot kind: pod name: node-exporter-fkdwq namespace: kube-system cluster: shoot kind: pod name: node-exporter-xhh5n namespace: 
kube-system cluster: shoot kind: pod name: node-problem-detector-7nhkg namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-vngln namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-664f9946cc-cgkvj namespace: kube-system azure cluster: seed kind: pod name: blackbox-exporter-86c7645696-lpf4t namespace: shoot--diki-comp--azure cluster: seed kind: pod name: blackbox-exporter-86c7645696-wk9l5 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: cert-controller-manager-7bd977469b-gj7zt namespace: shoot--diki-comp--azure cluster: seed kind: pod name: cloud-controller-manager-678c6d74d6-9n8dm namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed 
kind: pod name: csi-snapshot-controller-54b4bcd846-mlxgq namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-snapshot-validation-797f668744-685cb namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-snapshot-validation-797f668744-t64t4 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: event-logger-5d8496f566-jbqv7 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: extension-shoot-lakom-service-c79868bf8-mkrs9 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: extension-shoot-lakom-service-c79868bf8-tddc6 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: gardener-resource-manager-78754877d5-k6cl8 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: gardener-resource-manager-78754877d5-ml2z8 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-apiserver-86b5d6dbc4-fqmls namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-apiserver-86b5d6dbc4-thd52 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-controller-manager-86f5fc4fc7-fx4b5 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-scheduler-9df464f49-fswpk namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-state-metrics-85b5bf77b4-mxf42 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: machine-controller-manager-68b74c776d-msnzv namespace: shoot--diki-comp--azure cluster: seed kind: pod name: machine-controller-manager-68b74c776d-msnzv namespace: shoot--diki-comp--azure cluster: seed kind: pod name: network-problem-detector-controller-66989c7547-j6rgc namespace: shoot--diki-comp--azure cluster: seed 
kind: pod name: plutono-6fc5d56577-9h64s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: plutono-6fc5d56577-9h64s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: plutono-6fc5d56577-9h64s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: plutono-6fc5d56577-9h64s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: remedy-controller-azure-57f7db994-gv467 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: shoot-dns-service-55f4885d86-85jgc namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-admission-controller-6ccd6fc589-fxmch namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-admission-controller-6ccd6fc589-s822t namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-recommender-56bbfc87c8-lbv2s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-updater-6f4b5fb546-xb778 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpn-seed-server-576f5cc-rttdc namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpn-seed-server-576f5cc-rttdc namespace: shoot--diki-comp--azure cluster: shoot kind: pod name: apiserver-proxy-kbgdp namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-kbgdp namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-ptvb8 namespace: kube-system 
cluster: shoot kind: pod name: apiserver-proxy-ptvb8 namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-gx79p namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-qhbs2 namespace: kube-system cluster: shoot kind: pod name: calico-node-4wmbt namespace: kube-system cluster: shoot kind: pod name: calico-node-8wlvp namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hf2jw namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-98jwl namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-j82pt namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-gq6ml namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-jg9nf namespace: kube-system cluster: shoot kind: pod name: cloud-node-manager-rzc7h namespace: kube-system cluster: shoot kind: pod name: cloud-node-manager-svm6w namespace: kube-system cluster: shoot kind: pod name: coredns-58fd58b4f6-kbbdp namespace: kube-system cluster: shoot kind: pod name: coredns-58fd58b4f6-pvvrz namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-hjxlx namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-hjxlx namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-hjxlx namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-nsmlq namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-nsmlq namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-nsmlq namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-5ln94 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-5ln94 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-5ln94 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-qv8rp 
namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-qv8rp namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-qv8rp namespace: kube-system cluster: shoot kind: pod name: diki-242449-m2wpk64dps namespace: kube-system cluster: shoot kind: pod name: diki-242451-0r3a1mudxn namespace: kube-system cluster: shoot kind: pod name: diki-242466-syzgrb0nhu namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-bbbbr namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-qb8t6 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-kpksf namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-kpksf namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7655f847b-4kzt2 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7655f847b-8v894 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-6b9mc namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-kbzqs namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-k22pr namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-kx6jn namespace: kube-system cluster: shoot kind: pod name: node-exporter-nbkkr namespace: kube-system cluster: shoot kind: pod name: node-exporter-ph9sx namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-8mw8p namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-p9jp4 namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-56dcf9cf9d-99tfc namespace: kube-system gcp cluster: seed kind: pod name: blackbox-exporter-c7cc77fbf-db9kq namespace: shoot--diki-comp--gcp cluster: seed kind: pod 
name: blackbox-exporter-c7cc77fbf-t667q namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: cert-controller-manager-6946674f78-9dsg6 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: cloud-controller-manager-6f67b6df64-9svgn namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-controller-fd9587fdf-2mvdf namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-validation-79df8f8c66-6kzb7 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-validation-79df8f8c66-qggvf namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: event-logger-69576b5c95-hjbwj namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: extension-shoot-lakom-service-86596f55f8-qlhnp namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: extension-shoot-lakom-service-86596f55f8-z7rjv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: gardener-resource-manager-ff5bf7fb4-4r2tv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: gardener-resource-manager-ff5bf7fb4-szjgd 
namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-apiserver-6f5746f87-5mfhz namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-apiserver-6f5746f87-mjzj9 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-controller-manager-856b7c9889-dzsbv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-scheduler-5d4c7456bd-mvv6x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-state-metrics-64d5994f8-rfzmh namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: machine-controller-manager-67b97665c9-m54jw namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: machine-controller-manager-67b97665c9-m54jw namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: network-problem-detector-controller-66cc54677c-kvq75 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: plutono-69866c8cdb-n2c8x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: plutono-69866c8cdb-n2c8x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: plutono-69866c8cdb-n2c8x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: plutono-69866c8cdb-n2c8x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: shoot-dns-service-575bcd459-79s4m namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-admission-controller-9cffc8f78-jl676 namespace: 
shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-admission-controller-9cffc8f78-s8flk namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-recommender-56645d8bdb-2lcmb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-updater-f79b6fc6b-4rlg5 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpn-seed-server-67c8474dc7-blfcl namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpn-seed-server-67c8474dc7-blfcl namespace: shoot--diki-comp--gcp cluster: shoot kind: pod name: apiserver-proxy-rmcnj namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-rmcnj namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-v88dp namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-v88dp namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-gmfnj namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-jjtfq namespace: kube-system cluster: shoot kind: pod name: calico-node-5bzc2 namespace: kube-system cluster: shoot kind: pod name: calico-node-5bzc2 namespace: kube-system cluster: shoot kind: pod name: calico-node-cnwrp namespace: kube-system cluster: shoot kind: pod name: calico-node-cnwrp namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hjg6k namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-frk7j namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-rlc2z namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-5cbl7 namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-scbqx namespace: kube-system cluster: shoot kind: pod name: coredns-679b67f9f7-m46pm namespace: kube-system cluster: shoot kind: pod name: coredns-679b67f9f7-t8f7n namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-z298z namespace: 
kube-system cluster: shoot kind: pod name: csi-driver-node-z298z namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-z298z namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-zgp8f namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-zgp8f namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-zgp8f namespace: kube-system cluster: shoot kind: pod name: diki-242404-z1nu9wom0m namespace: kube-system cluster: shoot kind: pod name: diki-242449-8z89s24f3f namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-2blsk namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-mwnd5 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-bb9x9 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-bb9x9 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot kind: pod name: metrics-server-7db8b88958-dz2h9 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7db8b88958-rwnwc namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-x6g88 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-zl466 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-n8k2n namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-nnqtf namespace: kube-system cluster: shoot kind: pod name: node-exporter-8frqb namespace: kube-system cluster: shoot kind: pod name: node-exporter-xq6cg namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-mhj4m namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-rn6hv namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-59f4dbd8cd-bwf8w namespace: 
kube-system openstack cluster: seed kind: pod name: blackbox-exporter-6b8d699d98-46wrb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: blackbox-exporter-6b8d699d98-v88mn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: cert-controller-manager-5df68f6f5d-sgc7d namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: cloud-controller-manager-b4857486b-2h6jb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-controller-5d4fc5c479-dmrwv namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-validation-5fc8f5bb4b-66245 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-validation-5fc8f5bb4b-c924q namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: event-logger-6469658865-tbjft namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: extension-shoot-lakom-service-844c5dcfd6-j9wdx namespace: shoot--diki-comp--openstack cluster: seed kind: pod 
name: extension-shoot-lakom-service-844c5dcfd6-wrpcb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: gardener-resource-manager-7b4747c958-pg654 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: gardener-resource-manager-7b4747c958-rfqn2 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-apiserver-7fb7b9b4cd-m7mmg namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-state-metrics-7f54fbdbdb-jpq78 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: machine-controller-manager-85cbdc979-mptqt namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: machine-controller-manager-85cbdc979-mptqt namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: network-problem-detector-controller-78bbfd4757-tf8f2 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: plutono-694bff49d4-px76r namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: plutono-694bff49d4-px76r namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: plutono-694bff49d4-px76r namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: plutono-694bff49d4-px76r namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: shoot-dns-service-867b566fc5-ct8wj namespace: 
shoot--diki-comp--openstack cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-admission-controller-b99c554c8-7j9lc namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-admission-controller-b99c554c8-rhbmx namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-recommender-5df469cbf4-kngl8 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-updater-5dfd58d478-ph8mz namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpn-seed-server-69d5794bb7-s7vkf namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpn-seed-server-69d5794bb7-s7vkf namespace: shoot--diki-comp--openstack cluster: shoot kind: pod name: apiserver-proxy-qw9pr namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-qw9pr namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-qzdcp namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-qzdcp namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-2nt8f namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-6tqbq namespace: kube-system cluster: shoot kind: pod name: calico-kube-controllers-7fbfb84c54-2lsh5 namespace: kube-system cluster: shoot kind: pod name: calico-node-7xv9t namespace: kube-system cluster: shoot kind: pod name: calico-node-k2pc6 namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-przgw namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-bwkdh namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-hkdc5 namespace: kube-system cluster: shoot kind: pod name: 
calico-typha-horizontal-autoscaler-586ff75c6b-htlcp namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-9zp9f namespace: kube-system cluster: shoot kind: pod name: coredns-56d45984c9-f6xtf namespace: kube-system cluster: shoot kind: pod name: coredns-56d45984c9-zgq2w namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-gcsc7 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-gcsc7 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-gcsc7 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-pmml4 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-pmml4 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-pmml4 namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-t965v namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-vsrrl namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-xx9v6 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-xx9v6 namespace: kube-system cluster: shoot kind: pod name: metrics-server-586dcd8bff-7n7nm namespace: kube-system cluster: shoot kind: pod name: metrics-server-586dcd8bff-sjjfv namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-55ptw namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-lp4n6 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-ftcw5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-zt596 namespace: kube-system cluster: shoot kind: pod name: node-exporter-rnbv9 namespace: kube-system cluster: shoot kind: pod name: node-exporter-trqtg namespace: 
kube-system cluster: shoot kind: pod name: node-problem-detector-k79bs namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-pdtdj namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-697b676499-jkgvw namespace: kube-system
Secrets in Kubernetes must not be stored as environment variables (HIGH 242415)
Pod does not use environment to inject secret.
aws cluster: seed kind: pod name: aws-custom-route-controller-7856476fd4-hsq29 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: blackbox-exporter-5d75c47dcd-2v7cs namespace: shoot--diki-comp--aws cluster: seed kind: pod name: blackbox-exporter-5d75c47dcd-d7bpd namespace: shoot--diki-comp--aws cluster: seed kind: pod name: cert-controller-manager-755dbd646b-hgxzx namespace: shoot--diki-comp--aws cluster: seed kind: pod name: cloud-controller-manager-769c9b45dd-c5vxq namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-controller-7669f6bfc4-nscqb namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-validation-654f9b49d7-xfjxn namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-validation-654f9b49d7-xs2pt namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: event-logger-7cdddb58d8-65h7q namespace: shoot--diki-comp--aws cluster: seed kind: pod name: extension-shoot-lakom-service-6df659477c-28tts namespace: shoot--diki-comp--aws cluster: seed kind: pod name: extension-shoot-lakom-service-6df659477c-5q5st namespace: shoot--diki-comp--aws cluster: seed kind: pod name: gardener-resource-manager-6d957ff4b4-56mqn namespace: shoot--diki-comp--aws cluster: seed kind: pod name: gardener-resource-manager-6d957ff4b4-b2lbj namespace: shoot--diki-comp--aws 
cluster: seed kind: pod name: kube-apiserver-76d9c64f5b-7gwf4 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-state-metrics-68dfcd5d48-5mdnv namespace: shoot--diki-comp--aws cluster: seed kind: pod name: machine-controller-manager-7454c6df68-z77xw namespace: shoot--diki-comp--aws cluster: seed kind: pod name: network-problem-detector-controller-5f458c7579-82tns namespace: shoot--diki-comp--aws cluster: seed kind: pod name: plutono-567d7c946b-7xgjl namespace: shoot--diki-comp--aws cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: shoot-dns-service-645f556cf4-7xc4r namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-admission-controller-59bc4d9d8f-hxrh7 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-admission-controller-59bc4d9d8f-vf58j namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-recommender-6f499cfd88-lnbrx namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-updater-746fb98848-8zzf8 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpn-seed-server-547576865c-x6fr2 namespace: shoot--diki-comp--aws cluster: shoot kind: pod name: apiserver-proxy-kx2mw namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-wtlv2 namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-82dwq namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-wh7rj namespace: kube-system cluster: shoot kind: pod name: calico-node-9nlzv namespace: kube-system cluster: shoot kind: pod name: calico-node-l94hn 
namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-x9rl9 namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-6rlcn namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-g7k2t namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-vtvrw namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-7gf59 namespace: kube-system cluster: shoot kind: pod name: coredns-5cc8785ccd-x8bs2 namespace: kube-system cluster: shoot kind: pod name: coredns-5cc8785ccd-xwwgh namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-mrv64 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-s74n2 namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-nd86n namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-vjfwc namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-4lhcz namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot kind: pod name: metrics-server-5776b47bc7-g7qjf namespace: kube-system cluster: shoot kind: pod name: metrics-server-5776b47bc7-rfmd5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-s5286 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-x5rm5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-5kv4k namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-s4wlg namespace: kube-system cluster: shoot kind: pod name: node-exporter-fkdwq namespace: kube-system cluster: shoot kind: pod name: node-exporter-xhh5n namespace: kube-system cluster: shoot kind: pod name: node-local-dns-6kjdw namespace: kube-system cluster: shoot kind: pod name: node-local-dns-ws9mx namespace: 
kube-system cluster: shoot kind: pod name: node-problem-detector-7nhkg namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-vngln namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-664f9946cc-cgkvj namespace: kube-system azure cluster: seed kind: pod name: blackbox-exporter-86c7645696-lpf4t namespace: shoot--diki-comp--azure cluster: seed kind: pod name: blackbox-exporter-86c7645696-wk9l5 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: cert-controller-manager-7bd977469b-gj7zt namespace: shoot--diki-comp--azure cluster: seed kind: pod name: cloud-controller-manager-678c6d74d6-9n8dm namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-snapshot-controller-54b4bcd846-mlxgq namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-snapshot-validation-797f668744-685cb namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-snapshot-validation-797f668744-t64t4 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: event-logger-5d8496f566-jbqv7 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: extension-shoot-lakom-service-c79868bf8-mkrs9 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: extension-shoot-lakom-service-c79868bf8-tddc6 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: gardener-resource-manager-78754877d5-k6cl8 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: gardener-resource-manager-78754877d5-ml2z8 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-apiserver-86b5d6dbc4-fqmls namespace: shoot--diki-comp--azure 
cluster: seed kind: pod name: kube-apiserver-86b5d6dbc4-thd52 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-controller-manager-86f5fc4fc7-fx4b5 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-scheduler-9df464f49-fswpk namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-state-metrics-85b5bf77b4-mxf42 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: machine-controller-manager-68b74c776d-msnzv namespace: shoot--diki-comp--azure cluster: seed kind: pod name: network-problem-detector-controller-66989c7547-j6rgc namespace: shoot--diki-comp--azure cluster: seed kind: pod name: plutono-6fc5d56577-9h64s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: remedy-controller-azure-57f7db994-gv467 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: shoot-dns-service-55f4885d86-85jgc namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-admission-controller-6ccd6fc589-fxmch namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-admission-controller-6ccd6fc589-s822t namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-recommender-56bbfc87c8-lbv2s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-updater-6f4b5fb546-xb778 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpn-seed-server-576f5cc-rttdc namespace: shoot--diki-comp--azure cluster: shoot kind: pod name: apiserver-proxy-kbgdp namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-ptvb8 namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-gx79p namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-qhbs2 namespace: kube-system cluster: shoot kind: pod name: calico-node-4wmbt namespace: kube-system cluster: shoot 
kind: pod name: calico-node-8wlvp namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hf2jw namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-98jwl namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-j82pt namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-gq6ml namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-jg9nf namespace: kube-system cluster: shoot kind: pod name: cloud-node-manager-rzc7h namespace: kube-system cluster: shoot kind: pod name: cloud-node-manager-svm6w namespace: kube-system cluster: shoot kind: pod name: coredns-58fd58b4f6-kbbdp namespace: kube-system cluster: shoot kind: pod name: coredns-58fd58b4f6-pvvrz namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-hjxlx namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-nsmlq namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-5ln94 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-qv8rp namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-bbbbr namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-qb8t6 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-kpksf namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7655f847b-4kzt2 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7655f847b-8v894 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-6b9mc namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-kbzqs namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-k22pr namespace: kube-system cluster: shoot kind: 
pod name: network-problem-detector-pod-kx6jn namespace: kube-system cluster: shoot kind: pod name: node-exporter-nbkkr namespace: kube-system cluster: shoot kind: pod name: node-exporter-ph9sx namespace: kube-system cluster: shoot kind: pod name: node-local-dns-s2lvs namespace: kube-system cluster: shoot kind: pod name: node-local-dns-zs2sb namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-8mw8p namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-p9jp4 namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-56dcf9cf9d-99tfc namespace: kube-system gcp cluster: seed kind: pod name: blackbox-exporter-c7cc77fbf-db9kq namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: blackbox-exporter-c7cc77fbf-t667q namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: cert-controller-manager-6946674f78-9dsg6 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: cloud-controller-manager-6f67b6df64-9svgn namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-controller-fd9587fdf-2mvdf namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-validation-79df8f8c66-6kzb7 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-validation-79df8f8c66-qggvf namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: event-logger-69576b5c95-hjbwj namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: extension-shoot-lakom-service-86596f55f8-qlhnp namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: extension-shoot-lakom-service-86596f55f8-z7rjv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: gardener-resource-manager-ff5bf7fb4-4r2tv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: gardener-resource-manager-ff5bf7fb4-szjgd 
namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-apiserver-6f5746f87-5mfhz namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-apiserver-6f5746f87-mjzj9 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-controller-manager-856b7c9889-dzsbv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-scheduler-5d4c7456bd-mvv6x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-state-metrics-64d5994f8-rfzmh namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: machine-controller-manager-67b97665c9-m54jw namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: network-problem-detector-controller-66cc54677c-kvq75 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: plutono-69866c8cdb-n2c8x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: shoot-dns-service-575bcd459-79s4m namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-admission-controller-9cffc8f78-jl676 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-admission-controller-9cffc8f78-s8flk namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-recommender-56645d8bdb-2lcmb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-updater-f79b6fc6b-4rlg5 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpn-seed-server-67c8474dc7-blfcl namespace: shoot--diki-comp--gcp cluster: shoot kind: pod name: apiserver-proxy-rmcnj namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-v88dp namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-gmfnj namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-jjtfq namespace: kube-system cluster: shoot kind: pod name: calico-node-5bzc2 namespace: kube-system cluster: shoot kind: pod 
name: calico-node-cnwrp namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hjg6k namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-frk7j namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-rlc2z namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-5cbl7 namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-scbqx namespace: kube-system cluster: shoot kind: pod name: coredns-679b67f9f7-m46pm namespace: kube-system cluster: shoot kind: pod name: coredns-679b67f9f7-t8f7n namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-z298z namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-zgp8f namespace: kube-system cluster: shoot kind: pod name: diki-242393-ot4eirqfni namespace: kube-system cluster: shoot kind: pod name: diki-242406-uphz6x02zf namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-2blsk namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-mwnd5 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-bb9x9 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot kind: pod name: metrics-server-7db8b88958-dz2h9 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7db8b88958-rwnwc namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-x6g88 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-zl466 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-n8k2n namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-nnqtf namespace: kube-system cluster: shoot kind: pod name: node-exporter-8frqb namespace: kube-system cluster: shoot kind: pod name: 
node-exporter-xq6cg namespace: kube-system cluster: shoot kind: pod name: node-local-dns-cl4xr namespace: kube-system cluster: shoot kind: pod name: node-local-dns-kz9nr namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-mhj4m namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-rn6hv namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-59f4dbd8cd-bwf8w namespace: kube-system openstack cluster: seed kind: pod name: blackbox-exporter-6b8d699d98-46wrb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: blackbox-exporter-6b8d699d98-v88mn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: cert-controller-manager-5df68f6f5d-sgc7d namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: cloud-controller-manager-b4857486b-2h6jb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-controller-5d4fc5c479-dmrwv namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-validation-5fc8f5bb4b-66245 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-validation-5fc8f5bb4b-c924q namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: event-logger-6469658865-tbjft namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: extension-shoot-lakom-service-844c5dcfd6-j9wdx namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: extension-shoot-lakom-service-844c5dcfd6-wrpcb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: gardener-resource-manager-7b4747c958-pg654 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: gardener-resource-manager-7b4747c958-rfqn2 
namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-apiserver-7fb7b9b4cd-m7mmg namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-state-metrics-7f54fbdbdb-jpq78 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: machine-controller-manager-85cbdc979-mptqt namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: network-problem-detector-controller-78bbfd4757-tf8f2 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: plutono-694bff49d4-px76r namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: shoot-dns-service-867b566fc5-ct8wj namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-admission-controller-b99c554c8-7j9lc namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-admission-controller-b99c554c8-rhbmx namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-recommender-5df469cbf4-kngl8 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-updater-5dfd58d478-ph8mz namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpn-seed-server-69d5794bb7-s7vkf namespace: shoot--diki-comp--openstack cluster: shoot kind: pod name: apiserver-proxy-qw9pr namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-qzdcp namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-2nt8f namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-6tqbq namespace: 
kube-system cluster: shoot kind: pod name: calico-kube-controllers-7fbfb84c54-2lsh5 namespace: kube-system cluster: shoot kind: pod name: calico-node-7xv9t namespace: kube-system cluster: shoot kind: pod name: calico-node-k2pc6 namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-przgw namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-bwkdh namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-hkdc5 namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-htlcp namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-9zp9f namespace: kube-system cluster: shoot kind: pod name: coredns-56d45984c9-f6xtf namespace: kube-system cluster: shoot kind: pod name: coredns-56d45984c9-zgq2w namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-gcsc7 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-pmml4 namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-t965v namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-vsrrl namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-xx9v6 namespace: kube-system cluster: shoot kind: pod name: metrics-server-586dcd8bff-7n7nm namespace: kube-system cluster: shoot kind: pod name: metrics-server-586dcd8bff-sjjfv namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-55ptw namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-lp4n6 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-ftcw5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-zt596 namespace: kube-system cluster: shoot kind: pod name: node-exporter-rnbv9 namespace: 
kube-system cluster: shoot kind: pod name: node-exporter-trqtg namespace: kube-system cluster: shoot kind: pod name: node-local-dns-jdng7 namespace: kube-system cluster: shoot kind: pod name: node-local-dns-r8z88 namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-k79bs namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-pdtdj namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-697b676499-jkgvw namespace: kube-system Kubernetes must separate user functionality (MEDIUM 242417) Gardener managed pods are not user pods aws kind: pod name: apiserver-proxy-kx2mw namespace: kube-system kind: pod name: apiserver-proxy-wtlv2 namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-82dwq namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-wh7rj namespace: kube-system kind: pod name: calico-node-9nlzv namespace: kube-system kind: pod name: calico-node-l94hn namespace: kube-system kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-x9rl9 namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-6rlcn namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-g7k2t namespace: kube-system kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-vtvrw namespace: kube-system kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-7gf59 namespace: kube-system kind: pod name: coredns-5cc8785ccd-x8bs2 namespace: kube-system kind: pod name: coredns-5cc8785ccd-xwwgh namespace: kube-system kind: pod name: csi-driver-node-mrv64 namespace: kube-system kind: pod name: csi-driver-node-s74n2 namespace: kube-system kind: pod name: egress-filter-applier-nd86n namespace: kube-system kind: pod name: egress-filter-applier-vjfwc namespace: kube-system kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-4lhcz namespace: kube-system kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system kind: pod name: metrics-server-5776b47bc7-g7qjf namespace: kube-system 
kind: pod name: metrics-server-5776b47bc7-rfmd5 namespace: kube-system kind: pod name: network-problem-detector-host-s5286 namespace: kube-system kind: pod name: network-problem-detector-host-x5rm5 namespace: kube-system kind: pod name: network-problem-detector-pod-5kv4k namespace: kube-system kind: pod name: network-problem-detector-pod-s4wlg namespace: kube-system kind: pod name: node-exporter-fkdwq namespace: kube-system kind: pod name: node-exporter-xhh5n namespace: kube-system kind: pod name: node-local-dns-6kjdw namespace: kube-system kind: pod name: node-local-dns-ws9mx namespace: kube-system kind: pod name: node-problem-detector-7nhkg namespace: kube-system kind: pod name: node-problem-detector-vngln namespace: kube-system kind: pod name: vpn-shoot-664f9946cc-cgkvj namespace: kube-system azure kind: pod name: apiserver-proxy-kbgdp namespace: kube-system kind: pod name: apiserver-proxy-ptvb8 namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-gx79p namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-qhbs2 namespace: kube-system kind: pod name: calico-node-4wmbt namespace: kube-system kind: pod name: calico-node-8wlvp namespace: kube-system kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hf2jw namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-98jwl namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-j82pt namespace: kube-system kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-gq6ml namespace: kube-system kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-jg9nf namespace: kube-system kind: pod name: cloud-node-manager-rzc7h namespace: kube-system kind: pod name: cloud-node-manager-svm6w namespace: kube-system kind: pod name: coredns-58fd58b4f6-kbbdp namespace: kube-system kind: pod name: coredns-58fd58b4f6-pvvrz namespace: kube-system kind: pod name: csi-driver-node-disk-hjxlx namespace: kube-system kind: pod name: csi-driver-node-disk-nsmlq namespace: 
kube-system kind: pod name: csi-driver-node-file-5ln94 namespace: kube-system kind: pod name: csi-driver-node-file-qv8rp namespace: kube-system kind: pod name: egress-filter-applier-bbbbr namespace: kube-system kind: pod name: egress-filter-applier-qb8t6 namespace: kube-system kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-kpksf namespace: kube-system kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system kind: pod name: metrics-server-7655f847b-4kzt2 namespace: kube-system kind: pod name: metrics-server-7655f847b-8v894 namespace: kube-system kind: pod name: network-problem-detector-host-6b9mc namespace: kube-system kind: pod name: network-problem-detector-host-kbzqs namespace: kube-system kind: pod name: network-problem-detector-pod-k22pr namespace: kube-system kind: pod name: network-problem-detector-pod-kx6jn namespace: kube-system kind: pod name: node-exporter-nbkkr namespace: kube-system kind: pod name: node-exporter-ph9sx namespace: kube-system kind: pod name: node-local-dns-s2lvs namespace: kube-system kind: pod name: node-local-dns-zs2sb namespace: kube-system kind: pod name: node-problem-detector-8mw8p namespace: kube-system kind: pod name: node-problem-detector-p9jp4 namespace: kube-system kind: pod name: vpn-shoot-56dcf9cf9d-99tfc namespace: kube-system gcp kind: pod name: apiserver-proxy-rmcnj namespace: kube-system kind: pod name: apiserver-proxy-v88dp namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-gmfnj namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-jjtfq namespace: kube-system kind: pod name: calico-node-5bzc2 namespace: kube-system kind: pod name: calico-node-cnwrp namespace: kube-system kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hjg6k namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-frk7j namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-rlc2z namespace: kube-system kind: pod name: 
calico-typha-horizontal-autoscaler-586ff75c6b-5cbl7 namespace: kube-system kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-scbqx namespace: kube-system kind: pod name: coredns-679b67f9f7-m46pm namespace: kube-system kind: pod name: coredns-679b67f9f7-t8f7n namespace: kube-system kind: pod name: csi-driver-node-z298z namespace: kube-system kind: pod name: csi-driver-node-zgp8f namespace: kube-system kind: pod name: egress-filter-applier-2blsk namespace: kube-system kind: pod name: egress-filter-applier-mwnd5 namespace: kube-system kind: pod name: kube-proxy-worker-bex82-v1.28.10-bb9x9 namespace: kube-system kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system kind: pod name: metrics-server-7db8b88958-dz2h9 namespace: kube-system kind: pod name: metrics-server-7db8b88958-rwnwc namespace: kube-system kind: pod name: network-problem-detector-host-x6g88 namespace: kube-system kind: pod name: network-problem-detector-host-zl466 namespace: kube-system kind: pod name: network-problem-detector-pod-n8k2n namespace: kube-system kind: pod name: network-problem-detector-pod-nnqtf namespace: kube-system kind: pod name: node-exporter-8frqb namespace: kube-system kind: pod name: node-exporter-xq6cg namespace: kube-system kind: pod name: node-local-dns-cl4xr namespace: kube-system kind: pod name: node-local-dns-kz9nr namespace: kube-system kind: pod name: node-problem-detector-mhj4m namespace: kube-system kind: pod name: node-problem-detector-rn6hv namespace: kube-system kind: pod name: vpn-shoot-59f4dbd8cd-bwf8w namespace: kube-system openstack kind: pod name: apiserver-proxy-qw9pr namespace: kube-system kind: pod name: apiserver-proxy-qzdcp namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-2nt8f namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-6tqbq namespace: kube-system kind: pod name: calico-kube-controllers-7fbfb84c54-2lsh5 namespace: kube-system kind: pod name: calico-node-7xv9t namespace: kube-system 
kind: pod name: calico-node-k2pc6 namespace: kube-system kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-przgw namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-bwkdh namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-hkdc5 namespace: kube-system kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-htlcp namespace: kube-system kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-9zp9f namespace: kube-system kind: pod name: coredns-56d45984c9-f6xtf namespace: kube-system kind: pod name: coredns-56d45984c9-zgq2w namespace: kube-system kind: pod name: csi-driver-node-gcsc7 namespace: kube-system kind: pod name: csi-driver-node-pmml4 namespace: kube-system kind: pod name: egress-filter-applier-t965v namespace: kube-system kind: pod name: egress-filter-applier-vsrrl namespace: kube-system kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system kind: pod name: kube-proxy-worker-dqty2-v1.28.10-xx9v6 namespace: kube-system kind: pod name: metrics-server-586dcd8bff-7n7nm namespace: kube-system kind: pod name: metrics-server-586dcd8bff-sjjfv namespace: kube-system kind: pod name: network-problem-detector-host-55ptw namespace: kube-system kind: pod name: network-problem-detector-host-lp4n6 namespace: kube-system kind: pod name: network-problem-detector-pod-ftcw5 namespace: kube-system kind: pod name: network-problem-detector-pod-zt596 namespace: kube-system kind: pod name: node-exporter-rnbv9 namespace: kube-system kind: pod name: node-exporter-trqtg namespace: kube-system kind: pod name: node-local-dns-jdng7 namespace: kube-system kind: pod name: node-local-dns-r8z88 namespace: kube-system kind: pod name: node-problem-detector-k79bs namespace: kube-system kind: pod name: node-problem-detector-pdtdj namespace: kube-system kind: pod name: vpn-shoot-697b676499-jkgvw namespace: kube-system The Kubernetes API server must use approved cipher suites (MEDIUM 242418) Option tls-cipher-suites set to 
allowed values.
aws
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes API Server must have the SSL Certificate Authority set (MEDIUM 242419)
Option client-ca-file set.
aws
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes Kubelet must have the SSL Certificate Authority set (MEDIUM 242420)
Option authentication.x509.clientCAFile set.
aws
  kind: node name: ip-IP-Address.eu-west-1.compute.internal
  kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
  kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
  kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v
gcp
  kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
  kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r
openstack
  kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
  kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb

Kubernetes Controller Manager must have the SSL Certificate Authority set (MEDIUM 242421)
Option root-ca-file set.
aws
  kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--aws
azure
  kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--azure
gcp
  kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--gcp
openstack
  kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--openstack

Kubernetes API Server must have a certificate for communication (MEDIUM 242422)
Option tls-cert-file set.
aws
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack
Option tls-private-key-file set.
aws
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes etcd must enable client authentication to secure service (MEDIUM 242423)
Option client-transport-security.client-cert-auth set to allowed value.
aws
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws
azure
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure
gcp
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp
openstack
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack

Kubernetes Kubelet must enable tlsPrivateKeyFile for client authentication to secure service (MEDIUM 242424)
Kubelet rotates server certificates automatically itself.
aws
  kind: node name: ip-IP-Address.eu-west-1.compute.internal
  kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
  kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
  kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v
gcp
  kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
  kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r
openstack
  kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
  kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb

Kubernetes Kubelet must enable tlsCertFile for client authentication to secure service (MEDIUM 242425)
Kubelet rotates server certificates automatically itself.
aws
  kind: node name: ip-IP-Address.eu-west-1.compute.internal
  kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
  kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
  kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v
gcp
  kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
  kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r
openstack
  kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
  kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb

Kubernetes etcd must have a key file for secure communication (MEDIUM 242427)
Option client-transport-security.key-file set to allowed value.
aws
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws
azure
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure
gcp
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp
openstack
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack

Kubernetes etcd must have a certificate for communication (MEDIUM 242428)
Option client-transport-security.cert-file set to allowed value.
aws
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws
azure
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure
gcp
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp
openstack
  kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack
  kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack

Kubernetes etcd must have the SSL Certificate Authority set (MEDIUM 242429)
Option etcd-cafile set.
aws
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes etcd must have a certificate for communication (MEDIUM 242430)
Option etcd-certfile set.
aws
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes etcd must have a key file for secure communication (MEDIUM 242431)
Option etcd-keyfile set.
aws
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes Kubelet must enable kernel protection (HIGH 242434)
Option protectKernelDefaults set to allowed value.
aws
  kind: node name: ip-IP-Address.eu-west-1.compute.internal
  kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
  kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
  kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v
gcp
  kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
  kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r
openstack
  kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
  kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb

The Kubernetes API server must have the ValidatingAdmissionWebhook enabled (HIGH 242436)
Option enable-admission-plugins set to allowed value.
aws
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
  kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes API Server must configure timeouts to limit attack surface (MEDIUM 242438)
Option request-timeout has not been set.
aws
  details: defaults to 1m0s kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
  details: defaults to 1m0s kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
  details: defaults to 1m0s kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
  details: defaults to 1m0s kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes must remove old components after updated versions have been installed (MEDIUM 242442)
All found images use current versions.
aws azure gcp openstack The Kubernetes component etcd must be owned by etcd (MEDIUM 242445) File has expected owners aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws 
containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: 
pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_31.3632059657/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: 
fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod 
name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/region, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/secretAccessKey, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: 
fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/accessKeyID, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/bucketName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_34.2074945830/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws
azure
containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/bucketName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/storageAccount, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/storageKey, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, 
ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_30.2940324903/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure
gcp
containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod 
name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod 
name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_39.2305215472/serviceaccount.json, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_39.2305215472/bucketName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_39.3264256653/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, ownerUser: 0, ownerGroup: 0 
kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp
openstack
containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: 
etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialSecret, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/authURL, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/bucketName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/domainName, 
ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/region, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/tenantName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialID, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: 
backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_27.791977657/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0.tmp, 
ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_26.760285163/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack The Kubernetes conf files must be owned by root (MEDIUM 242446) File has expected owners aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_03_32.2849634808/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: 
/var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~configmap/kube-scheduler-config/..2024_07_25_13_03_32.3178977814/config.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_32.4108013154/kubeconfig, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_32.4108013154/token, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca/..2024_07_25_13_03_07.1368478840/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_03_07.736850249/id_rsa, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_23.915608683/kubeconfig, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_23.915608683/token, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_02_10.2226241370/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_02_10.933493267/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_02_10.3965564115/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/audit-policy-config/..2024_07_25_13_02_10.919451044/audit-policy.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_02_10.557863803/podsecurity.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_02_10.557863803/admission-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca/..2024_07_25_13_02_10.662489473/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-encryption-secret/..2024_07_25_13_02_10.226502613/encryption-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_02_10.2933211119/id_rsa, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_02_10.2581373418/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_02_10.978118440/bundle.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/egress-selection-config/..2024_07_25_13_02_10.2023717197/egress-selector-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/static-token/..2024_07_25_13_02_10.1624455993/static_tokens.csv, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/ca.crt, ownerUser: 0, 
ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_02_10.2158392424/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca/..2024_07_25_13_01_59.991544212/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: 
fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_01_59.3581293990/id_rsa, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_03.3923270535/kubeconfig, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_03.3923270535/token, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_02_16.1569774485/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: 
/var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~configmap/kube-scheduler-config/..2024_07_25_13_02_16.2132886517/config.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_19.2500005201/token, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_19.2500005201/kubeconfig, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_00_42.2442118241/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_00_42.3330985798/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_00_42.3182125229/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/audit-policy-config/..2024_07_25_13_00_42.2870882805/audit-policy.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_00_42.3675300062/podsecurity.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_00_42.3675300062/admission-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca/..2024_07_25_13_00_42.3474913291/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-encryption-secret/..2024_07_25_13_00_42.531503639/encryption-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_00_42.322496126/id_rsa, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_00_42.1762643519/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_00_42.1536609594/bundle.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/egress-selection-config/..2024_07_25_13_00_42.3637718223/egress-selector-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/static-token/..2024_07_25_13_00_42.2571933157/static_tokens.csv, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_00_42.232080975/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack The Kubernetes Kube Proxy kubeconfig must have file permissions set to 644 or more restrictive (MEDIUM 242447) File has expected permissions aws details: fileName: 
/var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, permissions: 644 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system
details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, permissions: 644 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system
azure
details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, permissions: 644 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, permissions: 644 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
gcp
details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, permissions: 644 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, permissions: 644 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
openstack
details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, permissions: 644 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system
details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, permissions: 644 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system

The Kubernetes Kube Proxy kubeconfig must be owned by root (MEDIUM 242448)
File has expected owners
aws
details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system
details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system
azure
details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
gcp
details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
openstack
details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system
details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system

The Kubernetes Kubelet certificate authority file must have file permissions set to 644 or more restrictive (MEDIUM 242449)
File has expected permissions
aws
details: fileName: /var/lib/kubelet/ca.crt, permissions: 644 kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
details: fileName: /var/lib/kubelet/ca.crt, permissions: 644 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
gcp
details: fileName: /var/lib/kubelet/ca.crt, permissions: 644 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
openstack
details: fileName: /var/lib/kubelet/ca.crt, permissions: 644 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs

The Kubernetes Kubelet certificate authority must be owned by root (MEDIUM 242450)
File has expected owners
aws
details: fileName: /var/lib/kubelet/ca.crt, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
details: fileName: /var/lib/kubelet/ca.crt, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
gcp
details: fileName: /var/lib/kubelet/ca.crt, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
openstack
details: fileName: /var/lib/kubelet/ca.crt, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs

The Kubernetes component PKI must be owned by root (MEDIUM 242451)
File has expected owners
aws
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019, ownerUser: 0, 
ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, ownerUser: 0, ownerGroup: 0 kind: 
pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: 
etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca/..2024_07_25_13_03_07.1368478840/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca/..2024_07_25_13_03_07.1368478840, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_03_32.2849634808/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: 
/var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_03_32.2849634808, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_02_10.2226241370/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_02_10.933493267/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_02_10.3965564115/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca/..2024_07_25_13_02_10.662489473/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_02_10.2581373418/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_02_10.978118440/bundle.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_02_10.2158392424/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_02_10.2581373418, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_02_10.2158392424, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_02_10.3965564115, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca/..2024_07_25_13_02_10.662489473, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_02_10.978118440, 
ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_02_10.2226241370, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_02_10.933493267, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address5-24.pem, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address5-26.pem, ownerUser: 0, ownerGroup: 0 kind: node name: 
ip-IP-Address.eu-west-1.compute.internal cluster: shoot details: fileName: /var/lib/kubelet/pki, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, ownerUser: 0, 
ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, ownerUser: 0, 
ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-02.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-00.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot details: 
fileName: /var/lib/kubelet/pki, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
gcp
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421, 
ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address3-43.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address3-45.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot details: fileName: /var/lib/kubelet/pki, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot 
containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
openstack
cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_02_16.1569774485/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack
cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack
cluster: seed containerName: kube-scheduler details: 
fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_02_16.1569774485, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, ownerUser: 0, ownerGroup: 0 kind: 
pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, ownerUser: 0, 
ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca/..2024_07_25_13_01_59.991544212/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: 
kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca/..2024_07_25_13_01_59.991544212, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_00_42.2442118241/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_00_42.3330985798/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: 
fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_00_42.3182125229/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca/..2024_07_25_13_00_42.3474913291/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_00_42.1762643519/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_00_42.1536609594/bundle.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_00_42.232080975/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_00_42.2442118241, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_00_42.3330985798, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_00_42.3182125229, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_00_42.1536609594, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_00_42.232080975, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca/..2024_07_25_13_00_42.3474913291, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_00_42.1762643519, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-55.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-53.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot details: fileName: /var/lib/kubelet/pki, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115, ownerUser: 0, ownerGroup: 
0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system

The Kubernetes kubelet KubeConfig must have file permissions set to 644 or more restrictive (MEDIUM 242452)
File has expected permissions
aws
details: fileName: /var/lib/kubelet/kubeconfig-real, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal
details: fileName: /var/lib/kubelet/config/kubelet, permissions: 644 kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
details: fileName: /var/lib/kubelet/kubeconfig-real, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
details: fileName: /var/lib/kubelet/config/kubelet, permissions: 644 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
gcp
details: fileName: /var/lib/kubelet/kubeconfig-real, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
details: fileName: /var/lib/kubelet/config/kubelet, permissions: 644 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
openstack
details: fileName: /var/lib/kubelet/kubeconfig-real, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
details: fileName: /var/lib/kubelet/config/kubelet, permissions: 644 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs

The Kubernetes kubelet KubeConfig file must be owned by root (MEDIUM 242453)
File has expected owners
aws
details: fileName: /var/lib/kubelet/kubeconfig-real, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal
details: fileName: /var/lib/kubelet/config/kubelet, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
details: fileName: /var/lib/kubelet/kubeconfig-real, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
details: fileName: /var/lib/kubelet/config/kubelet, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
gcp
details: fileName: /var/lib/kubelet/kubeconfig-real, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
details: fileName: /var/lib/kubelet/config/kubelet, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
openstack
details: fileName: /var/lib/kubelet/kubeconfig-real, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
details: fileName: /var/lib/kubelet/config/kubelet, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs

The Kubernetes etcd must have file permissions set to 644 or more restrictive (MEDIUM 242459)
File has expected permissions
aws
containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws
containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws
containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws
containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: 
/var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/safe_guard, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/safe_guard, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws azure containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/safe_guard, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/safe_guard, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack

The Kubernetes admin.conf must have file permissions set to 644 or more restrictive (MEDIUM 242460)
File has expected permissions
aws
containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_03_32.2849634808/bundle.crt, permissions: 644 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws
containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.crt, permissions: 640 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws
containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.key, permissions: 640 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws
containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~configmap/kube-scheduler-config/..2024_07_25_13_03_32.3178977814/config.yaml, permissions: 644 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws
containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_32.4108013154/kubeconfig, permissions: 644 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws
containerName: kube-scheduler details: fileName: 
/var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_32.4108013154/token, permissions: 644 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_02_10.2226241370/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_02_10.933493267/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_02_10.3965564115/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/audit-policy-config/..2024_07_25_13_02_10.919451044/audit-policy.yaml, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws 
containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_02_10.557863803/podsecurity.yaml, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_02_10.557863803/admission-configuration.yaml, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca/..2024_07_25_13_02_10.662489473/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-encryption-secret/..2024_07_25_13_02_10.226502613/encryption-configuration.yaml, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_02_10.2933211119/id_rsa, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_02_10.2581373418/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_02_10.978118440/bundle.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/egress-selection-config/..2024_07_25_13_02_10.2023717197/egress-selector-configuration.yaml, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/static-token/..2024_07_25_13_02_10.1624455993/static_tokens.csv, permissions: 644 kind: pod name: 
kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/ca.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.key, permissions: 640 
kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_02_10.2158392424/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca/..2024_07_25_13_03_07.1368478840/bundle.crt, permissions: 644 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_03_07.736850249/id_rsa, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_23.915608683/kubeconfig, permissions: 644 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_23.915608683/token, permissions: 644 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca/..2024_07_25_13_01_59.991544212/bundle.crt, permissions: 644 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.crt, permissions: 640 
kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_01_59.3581293990/id_rsa, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.crt, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: 
kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_03.3923270535/kubeconfig, permissions: 644 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_03.3923270535/token, permissions: 644 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_02_16.1569774485/bundle.crt, permissions: 644 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.crt, permissions: 640 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.key, permissions: 640 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~configmap/kube-scheduler-config/..2024_07_25_13_02_16.2132886517/config.yaml, permissions: 644 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: 
/var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_19.2500005201/token, permissions: 644 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_19.2500005201/kubeconfig, permissions: 644 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_00_42.2442118241/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_00_42.3330985798/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_00_42.3182125229/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: 
shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/audit-policy-config/..2024_07_25_13_00_42.2870882805/audit-policy.yaml, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_00_42.3675300062/podsecurity.yaml, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_00_42.3675300062/admission-configuration.yaml, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca/..2024_07_25_13_00_42.3474913291/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-encryption-secret/..2024_07_25_13_00_42.531503639/encryption-configuration.yaml, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_00_42.322496126/id_rsa, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_00_42.1762643519/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_00_42.1536609594/bundle.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/egress-selection-config/..2024_07_25_13_00_42.3637718223/egress-selector-configuration.yaml, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.key, permissions: 
640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/static-token/..2024_07_25_13_00_42.2571933157/static_tokens.csv, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/ca.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack
containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack
containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_00_42.232080975/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack
The Kubernetes API Server audit logs must be enabled (MEDIUM 242461)
Option audit-policy-file set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack
The Kubernetes PKI CRT must have file permissions set to 644 or more restrictive (MEDIUM 242466)
File has expected permissions
aws
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws
cluster: seed containerName: etcd details:
fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca/..2024_07_25_13_03_07.1368478840/bundle.crt, permissions: 644 kind: pod name: 
kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_03_32.2849634808/bundle.crt, permissions: 644 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.crt, permissions: 640 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_02_10.2226241370/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver 
details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_02_10.933493267/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_02_10.3965564115/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca/..2024_07_25_13_02_10.662489473/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_02_10.2581373418/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.crt, permissions: 640 kind: 
pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/ca.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_02_10.2158392424/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws 
cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address5-24.pem, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address5-26.pem, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-02.pem, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-00.pem, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot containerName: conntrack-fix details: fileName: 
/var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address3-43.pem, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address3-45.pem, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system openstack cluster: seed containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca/..2024_07_25_13_01_59.991544212/bundle.crt, permissions: 644 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.crt, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_02_16.1569774485/bundle.crt, permissions: 644 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.crt, permissions: 640 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_00_42.2442118241/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_00_42.3330985798/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_00_42.3182125229/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.crt, 
permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca/..2024_07_25_13_00_42.3474913291/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_00_42.1762643519/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver 
details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/ca.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_00_42.232080975/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-55.pem, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-53.pem, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system The Kubernetes PKI keys must have file permissions set to 600 or more restrictive (MEDIUM 242467) File has expected 
permissions aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.key, permissions: 640 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_02_10.978118440/bundle.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address5-24.pem, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address5-26.pem, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal azure cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-02.pem, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot 
details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-00.pem, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address3-43.pem, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address3-45.pem, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.key, permissions: 640 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_00_42.1536609594/bundle.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack
cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-55.pem, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-53.pem, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs

Kubernetes Kubelet must not disable timeouts (MEDIUM 245541)

Option streamingConnectionIdleTimeout set to allowed value.
aws
kind: node name: ip-IP-Address.eu-west-1.compute.internal
kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v
gcp
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r
openstack
kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb

Kubernetes API Server must disable basic authentication to protect information in transit (HIGH 245542)

Option basic-auth-file has not been set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes endpoints must use approved organizational certificate and key pair to protect information in transit (HIGH 245544)

Option kubelet-client-certificate set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Option kubelet-client-key set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes must have a Pod Security Admission control file configured (HIGH 254800)

PodSecurity is properly configured
aws
kind: PodSecurityConfiguration
azure
kind: PodSecurityConfiguration
gcp
kind: PodSecurityConfiguration
openstack
kind: PodSecurityConfiguration

🔵 Skipped

The Kubernetes etcd must use TLS to protect the confidentiality of sensitive data during electronic dissemination (MEDIUM 242380)

ETCD runs as a single instance, peer communication options are not used.
aws
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws
azure
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure
gcp
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp
openstack
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack

The Kubernetes Scheduler must have secure binding (MEDIUM 242384)

The Kubernetes Scheduler runs in a container which already has limited access to network interfaces. In addition ingress traffic to the Kubernetes Scheduler is restricted via network policies, making an unintended exposure less likely.
aws azure gcp openstack

The Kubernetes Controller Manager must have secure binding (MEDIUM 242385)

The Kubernetes Controller Manager runs in a container which already has limited access to network interfaces. In addition ingress traffic to the Kubernetes Controller Manager is restricted via network policies, making an unintended exposure less likely.
aws azure gcp openstack

Kubernetes Kubectl cp command must give expected access and results (MEDIUM 242396)

"kubectl" is not installed into control plane pods or worker nodes and Gardener does not offer Kubernetes v1.12 or older.
aws azure gcp openstack

Kubernetes DynamicAuditing must not be enabled (MEDIUM 242398)

Option feature-gates.DynamicAuditing removed in Kubernetes v1.19.
aws azure gcp openstack

Kubernetes DynamicKubeletConfig must not be enabled (MEDIUM 242399)

Option featureGates.DynamicKubeletConfig removed in Kubernetes v1.26.
aws
details: Used Kubernetes version 1.28.10.
azure
details: Used Kubernetes version 1.28.10.
gcp
details: Used Kubernetes version 1.28.10.
openstack
details: Used Kubernetes version 1.28.10.

Kubernetes manifests must be owned by root (MEDIUM 242405)

Gardener does not deploy any control plane component as systemd processes or static pod.
aws azure gcp openstack

The Kubernetes manifest files must have least privileges (MEDIUM 242408)

Gardener does not deploy any control plane component as systemd processes or static pod.
aws azure gcp openstack

The Kubernetes API Server must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL) (MEDIUM 242410)

Cannot be tested and should be enforced organizationally. Gardener uses a minimum of known and automatically opened/used/created ports/protocols/services (PPSM stands for Ports, Protocols, Service Management).
aws azure gcp openstack

The Kubernetes Scheduler must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL) (MEDIUM 242411)

Cannot be tested and should be enforced organizationally. Gardener uses a minimum of known and automatically opened/used/created ports/protocols/services (PPSM stands for Ports, Protocols, Service Management).
aws azure gcp openstack

The Kubernetes Controllers must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL) (MEDIUM 242412)

Cannot be tested and should be enforced organizationally. Gardener uses a minimum of known and automatically opened/used/created ports/protocols/services (PPSM stands for Ports, Protocols, Service Management).
aws azure gcp openstack

The Kubernetes etcd must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL) (MEDIUM 242413)

Cannot be tested and should be enforced organizationally. Gardener uses a minimum of known and automatically opened/used/created ports/protocols/services (PPSM stands for Ports, Protocols, Service Management).
aws azure gcp openstack

Kubernetes etcd must enable client authentication to secure service (MEDIUM 242426)

ETCD runs as a single instance, peer communication options are not used.
aws
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws
azure
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure
gcp
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp
openstack
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack

Kubernetes etcd must have peer-cert-file set for secure communication (MEDIUM 242432)

ETCD runs as a single instance, peer communication options are not used.
aws
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws
azure
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure
gcp
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp
openstack
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack

Kubernetes etcd must have a peer-key-file set for secure communication (MEDIUM 242433)

ETCD runs as a single instance, peer communication options are not used.
aws kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws azure kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure gcp kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp openstack kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack Kubernetes must have a pod security policy set (HIGH 242437) PSPs are removed in K8s version 1.25. aws azure gcp openstack Kubernetes must contain the latest updates as authorized by IAVMs, CTOs, DTMs, and STIGs (MEDIUM 242443) Scanning/patching security vulnerabilities should be enforced organizationally. Security vulnerability scanning should be automated and maintainers should be informed automatically. aws azure gcp openstack Kubernetes component manifests must be owned by root (MEDIUM 242444) Rule is duplicate of \"242405\" aws azure gcp openstack Kubernetes kubeadm.conf must be owned by root (MEDIUM 242454) Gardener does not use \"kubeadm\" and also does not store any \"main config\" anywhere in seed or shoot (flow/component logic built-in/in-code). aws azure gcp openstack Kubernetes kubeadm.conf must have file permissions set to 644 or more restrictive (MEDIUM 242455) Gardener does not use \"kubeadm\" and also does not store any \"main config\" anywhere in seed or shoot (flow/component logic built-in/in-code). aws azure gcp openstack Kubernetes kubelet config must have file permissions set to 644 or more restrictive (MEDIUM 242456) Rule is duplicate of \"242452\". aws azure gcp openstack Kubernetes kubelet config must be owned by root (MEDIUM 242457) Rule is duplicate of \"242453\".
aws azure gcp openstack Kubernetes API Server audit log path must be set (MEDIUM 242465) Rule is duplicate of \"242402\" aws azure gcp openstack Kubernetes must enable PodSecurity admission controller on static pods and Kubelets (HIGH 254801) Option featureGates.PodSecurity was made GA in v1.25 and removed in v1.28. aws azure gcp openstack 🔵 Accepted The Kubernetes API Server must have an audit log path set (MEDIUM 242402) Gardener can integrate with different audit logging solutions aws azure gcp openstack The Kubernetes API Server must generate audit records that identify what type of event has occurred, identify the source of the event, contain the event results, identify any users, and identify any containers associated with the event (MEDIUM 242403) Gardener can integrate with different audit logging solutions aws azure gcp openstack The Kubernetes cluster must use non-privileged host ports for user pods (MEDIUM 242414) node local dns requires port 53 in order to operate properly aws cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-6kjdw namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-6kjdw namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-ws9mx namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-ws9mx namespace: kube-system azure cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-s2lvs namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-s2lvs namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-zs2sb namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-zs2sb namespace: kube-system gcp cluster: shoot details: 
containerName: node-cache, port: 53 kind: pod name: node-local-dns-cl4xr namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-cl4xr namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-kz9nr namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-kz9nr namespace: kube-system openstack cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-jdng7 namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-jdng7 namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-r8z88 namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-r8z88 namespace: kube-system The Kubernetes API Server must be set to audit log max size (MEDIUM 242462) Gardener can integrate with different audit logging solutions aws azure gcp openstack The Kubernetes API Server must be set to audit log maximum backup (MEDIUM 242463) Gardener can integrate with different audit logging solutions aws azure gcp openstack The Kubernetes API Server audit log retention must be set (MEDIUM 242464) Gardener can integrate with different audit logging solutions aws azure gcp openstack Kubernetes API Server must disable token authentication to protect information in transit (HIGH 245543) All defined tokens are accepted. 
aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack 🟠 Warning The Kubernetes component etcd must be owned by etcd (MEDIUM 242445) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. azure kind: referenceGroup name: etcd-events uid: f0537c21-2987-42a5-a15b-7cf16beff82f gcp kind: referenceGroup name: etcd-events uid: 223e03f1-a5ad-49da-b569-9e365eda153f The Kubernetes conf files must be owned by root (MEDIUM 242446) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. azure kind: referenceGroup name: kube-controller-manager-86f5fc4fc7 uid: 7fc87649-c1aa-4488-b276-446d96bc0e35 kind: referenceGroup name: kube-scheduler-9df464f49 uid: ef24775f-39b0-451e-bcd8-e577b834455b kind: referenceGroup name: kube-apiserver-86b5d6dbc4 uid: 1edbd5e8-2dc0-4081-b956-ac2faa06d320 gcp kind: referenceGroup name: kube-controller-manager-856b7c9889 uid: 60d0e948-ed0c-455a-8ce6-79099a09059d kind: referenceGroup name: kube-scheduler-5d4c7456bd uid: 1f098851-17d6-4bdd-b223-7ac36ff06508 kind: referenceGroup name: kube-apiserver-6f5746f87 uid: 886baf48-5fcd-4a34-9d81-3c3445552745 The Kubernetes component PKI must be owned by root (MEDIUM 242451) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. 
azure cluster: seed kind: referenceGroup name: etcd-events uid: f0537c21-2987-42a5-a15b-7cf16beff82f cluster: seed kind: referenceGroup name: kube-controller-manager-86f5fc4fc7 uid: 7fc87649-c1aa-4488-b276-446d96bc0e35 cluster: seed kind: referenceGroup name: kube-scheduler-9df464f49 uid: ef24775f-39b0-451e-bcd8-e577b834455b cluster: seed kind: referenceGroup name: kube-apiserver-86b5d6dbc4 uid: 1edbd5e8-2dc0-4081-b956-ac2faa06d320 gcp cluster: seed kind: referenceGroup name: etcd-events uid: 223e03f1-a5ad-49da-b569-9e365eda153f cluster: seed kind: referenceGroup name: kube-controller-manager-856b7c9889 uid: 60d0e948-ed0c-455a-8ce6-79099a09059d cluster: seed kind: referenceGroup name: kube-scheduler-5d4c7456bd uid: 1f098851-17d6-4bdd-b223-7ac36ff06508 cluster: seed kind: referenceGroup name: kube-apiserver-6f5746f87 uid: 886baf48-5fcd-4a34-9d81-3c3445552745 The Kubernetes etcd must have file permissions set to 644 or more restrictive (MEDIUM 242459) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. azure kind: referenceGroup name: etcd-events uid: f0537c21-2987-42a5-a15b-7cf16beff82f gcp kind: referenceGroup name: etcd-events uid: 223e03f1-a5ad-49da-b569-9e365eda153f The Kubernetes admin.conf must have file permissions set to 644 or more restrictive (MEDIUM 242460) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. 
azure kind: referenceGroup name: kube-controller-manager-86f5fc4fc7 uid: 7fc87649-c1aa-4488-b276-446d96bc0e35 kind: referenceGroup name: kube-scheduler-9df464f49 uid: ef24775f-39b0-451e-bcd8-e577b834455b kind: referenceGroup name: kube-apiserver-86b5d6dbc4 uid: 1edbd5e8-2dc0-4081-b956-ac2faa06d320 gcp kind: referenceGroup name: kube-controller-manager-856b7c9889 uid: 60d0e948-ed0c-455a-8ce6-79099a09059d kind: referenceGroup name: kube-scheduler-5d4c7456bd uid: 1f098851-17d6-4bdd-b223-7ac36ff06508 kind: referenceGroup name: kube-apiserver-6f5746f87 uid: 886baf48-5fcd-4a34-9d81-3c3445552745 The Kubernetes PKI CRT must have file permissions set to 644 or more restrictive (MEDIUM 242466) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. azure cluster: seed kind: referenceGroup name: kube-controller-manager-86f5fc4fc7 uid: 7fc87649-c1aa-4488-b276-446d96bc0e35 cluster: seed kind: referenceGroup name: kube-scheduler-9df464f49 uid: ef24775f-39b0-451e-bcd8-e577b834455b cluster: seed kind: referenceGroup name: etcd-events uid: f0537c21-2987-42a5-a15b-7cf16beff82f cluster: seed kind: referenceGroup name: kube-apiserver-86b5d6dbc4 uid: 1edbd5e8-2dc0-4081-b956-ac2faa06d320 gcp cluster: seed kind: referenceGroup name: etcd-events uid: 223e03f1-a5ad-49da-b569-9e365eda153f cluster: seed kind: referenceGroup name: kube-controller-manager-856b7c9889 uid: 60d0e948-ed0c-455a-8ce6-79099a09059d cluster: seed kind: referenceGroup name: kube-scheduler-5d4c7456bd uid: 1f098851-17d6-4bdd-b223-7ac36ff06508 cluster: seed kind: referenceGroup name: kube-apiserver-6f5746f87 uid: 886baf48-5fcd-4a34-9d81-3c3445552745 The Kubernetes PKI keys must have file permissions set to 600 or more restrictive (MEDIUM 242467) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. 
azure cluster: seed kind: referenceGroup name: etcd-events uid: f0537c21-2987-42a5-a15b-7cf16beff82f cluster: seed kind: referenceGroup name: kube-controller-manager-86f5fc4fc7 uid: 7fc87649-c1aa-4488-b276-446d96bc0e35 cluster: seed kind: referenceGroup name: kube-scheduler-9df464f49 uid: ef24775f-39b0-451e-bcd8-e577b834455b cluster: seed kind: referenceGroup name: kube-apiserver-86b5d6dbc4 uid: 1edbd5e8-2dc0-4081-b956-ac2faa06d320 gcp cluster: seed kind: referenceGroup name: etcd-events uid: 223e03f1-a5ad-49da-b569-9e365eda153f cluster: seed kind: referenceGroup name: kube-controller-manager-856b7c9889 uid: 60d0e948-ed0c-455a-8ce6-79099a09059d cluster: seed kind: referenceGroup name: kube-scheduler-5d4c7456bd uid: 1f098851-17d6-4bdd-b223-7ac36ff06508 cluster: seed kind: referenceGroup name: kube-apiserver-6f5746f87 uid: 886baf48-5fcd-4a34-9d81-3c3445552745 🔴 Failed Secrets in Kubernetes must not be stored as environment variables (HIGH 242415) Pod uses environment to inject secret. gcp cluster: seed details: containerName: backup-restore, variableName: GOOGLE_STORAGE_API_ENDPOINT, keyRef: storageAPIEndpoint kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp The Kubernetes etcd must have file permissions set to 644 or more restrictive (MEDIUM 242459) File has too wide permissions aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/region, permissions: 644, 
expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/secretAccessKey, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/accessKeyID, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/bucketName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_34.2074945830/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: 
/var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/token, permissions: 644, 
expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_31.3632059657/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore 
details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/bucketName, permissions: 644, 
expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/storageAccount, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/storageKey, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_30.2940324903/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: 
/var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/token, permissions: 644, 
expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_39.2305215472/serviceaccount.json, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_39.2305215472/bucketName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_39.3264256653/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialSecret, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/authURL, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/bucketName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/domainName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/region, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/tenantName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialID, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_27.791977657/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_26.760285163/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack The Kubernetes PKI keys must have file permissions set to 600 or more restrictive (MEDIUM 242467) File has too wide permissions aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: 
etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: 
/var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack ","categories":"","description":"The latest compliance report generated against security hardened shoot clusters","excerpt":"The latest compliance report generated against security hardened shoot …","ref":"/docs/security-and-compliance/report/","tags":"","title":"Gardener Compliance 
Report"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/networking/","tags":"","title":"Networking"},{"body":" ","categories":"","description":"Red Hatter Jamie Duncan gives a technical overview of Kubernetes, an open source container orchestration system, in just five minutes.","excerpt":"Red Hatter Jamie Duncan gives a technical overview of Kubernetes, an …","ref":"/docs/resources/videos/why-kubernetes/","tags":"","title":"Why Kubernetes"},{"body":"Shared Responsibility Model Gardener, like most cloud providers’ Kubernetes offerings, is designed for a global setup. Just as most cloud providers offer means to fulfil regional restrictions, Gardener has some means built in for this purpose. Gardener also follows a shared responsibility model in which users are obliged to use the provided Gardener means in a way that results in compliance with regional restrictions.\nRegions Gardener users need to understand that Gardener is a generic tool and has no built-in knowledge about regions as geographical or political conglomerates. For Gardener, regions are only strings. Creating regional restrictions is the responsibility of Gardener users, who orchestrate existing Gardener functionality to produce evidence that can be audited later on.\nSupport for Regional Restrictions Gardener offers functionality to support the most important kinds of regional restrictions in its global setup:\n No Restriction: All seeds in all regions can be allowed to host the control plane of all shoots. Restriction by Dedication: Shoots running in a region can be configured so that only dedicated seeds in dedicated regions are allowed to host the shoot’s control plane. This can be achieved by adding labels to a seed and subsequently restricting shoot control plane placement to appropriately labeled seeds by using the field spec.seedSelector (example). 
Restriction by Tainting: Some seeds running in some dedicated regions are not allowed to host the control plane of any shoots unless explicitly allowed. This can be achieved by tainting seeds appropriately (example) which in turn requires explicit tolerations if a shoot’s control plane should be placed on such tainted seeds (example). ","categories":"","description":"How Gardener supports regional restrictions","excerpt":"How Gardener supports regional restrictions","ref":"/docs/security-and-compliance/regional-restrictions/","tags":["task"],"title":"Regional Restrictions"},{"body":" ","categories":"","description":"In this talk Andrew Jessup walks through the essential elements of building a performant, secure and well factored micro-service in Go and how to deploy it to Google Container Engine. You'll also learn how to use Google Stackdriver to monitor, instrument, trace and even debug a production service in real time.","excerpt":"In this talk Andrew Jessup walks through the essential elements of …","ref":"/docs/resources/videos/microservices-in_kubernetes/","tags":"","title":"High Performance Microservices with Kubernetes, Go, and gRPC"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/monitoring-and-troubleshooting/","tags":"","title":"Monitor and Troubleshoot"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/applications/","tags":"","title":"Applications"},{"body":" ","categories":"","description":"Sandeep Dinesh shows how you can build small containers to make your Kubernetes deployments faster and more secure.","excerpt":"Sandeep Dinesh shows how you can build small containers to make your …","ref":"/docs/resources/videos/small-container/","tags":"","title":"Building Small Containers"},{"body":"Contributing Bigger Changes Here are the guidelines you should follow when contributing larger changes to Gardener:\n Avoid proposing a big change in one single PR. 
Instead, split your work into multiple stages which are independently mergeable and create one PR for each stage. For example, if introducing a new API resource and its controller, these stages could be:\n API resource types, including defaults and generated code. API resource validation. API server storage. Admission plugin(s), if any. Controller(s), including changes to existing controllers. Split this phase further into different functional subsets if appropriate. If you realize later that changes to artifacts introduced in a previous stage are required, by all means make them and explain in the PR why they were needed.\n Consider splitting a big PR further into multiple commits to allow for more focused reviews. For example, you could add unit tests / documentation in separate commits from the rest of the code. If you have to adapt your PR to review feedback, prefer doing that also in a separate commit to make it easier for reviewers to check how their feedback has been addressed.\n To make the review process more efficient and avoid too many long discussions in the PR itself, ask for a “main reviewer” to be assigned to your change, then work with this person to make sure he or she understands it in detail, and agree together on any improvements that may be needed. If you can’t reach an agreement on certain topics, comment on the PR and invite other people to join the discussion.\n Even if you have a “main reviewer” assigned, you may still get feedback from other reviewers. In general, these “non-main reviewers” are advised to focus more on the design and overall approach rather than the implementation details. 
Make sure that you address any concerns on this level appropriately.\n ","categories":"","description":"","excerpt":"Contributing Bigger Changes Here are the guidelines you should follow …","ref":"/docs/contribute/code/contributing-bigger-changes/","tags":"","title":"Contributing Bigger Changes"},{"body":" ","categories":"","description":"In this episode of Kubernetes Best Practices, Sandeep Dinesh shows how to work with Namespaces and how they can help you manage your Kubernetes resources.","excerpt":"In this episode of Kubernetes Best Practices, Sandeep Dinesh shows how …","ref":"/docs/resources/videos/namespace/","tags":"","title":"Organizing with Namespaces"},{"body":" ","categories":"","description":"How to make your Kubernetes deployments more robust by using Liveness and Readiness probes.","excerpt":"How to make your Kubernetes deployments more robust by using Liveness …","ref":"/docs/resources/videos/livecheck-readiness/","tags":"","title":"Readiness != Liveness"},{"body":" ","categories":"","description":"Smoothly handling Google Container Engine and networking can take some practice. 
In this video, Tim Hockin and Michael Rubin discuss migrating applications to Container Engine, networking in Container Engine, use of overlays, segmenting traffic between pods and services, and the variety of options available to you.","excerpt":"Smoothly handling Google Container Engine and networking can take some …","ref":"/docs/resources/videos/in-out-networking/","tags":"","title":"The Ins and Outs of Networking"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2024/","tags":"","title":"2024"},{"body":"Overview Here you can find a variety of articles related to Gardener and keep up to date with the latest community calls, features, and highlights!\nHow to Contribute If you’d like to create a new blog post, simply follow the steps outlined in the Documentation Contribution Guide and add the topic to the corresponding folder.\n","categories":"","description":"","excerpt":"Overview Here you can find a variety of articles related to Gardener …","ref":"/blog/","tags":"","title":"Blogs"},{"body":"","categories":"","description":"","excerpt":"","ref":"/","tags":"","title":"Gardener"},{"body":"The Gardener community recently concluded its 5th Hackathon, a week-long event that brought together multiple companies to collaborate on common topics of interest. The Hackathon, held at Schlosshof Freizeitheim in Schelklingen, Germany, was a testament to the power of collective effort and open-source, producing a tremendous number of results in a short time and moving the Gardener project forward with innovative solutions.\nA Week of Collaboration and Innovation The Hackathon addressed a wide range of topics, from improving the maturity of the Gardener API to harmonizing development setups and automating additional preparation tasks for Gardener installations. 
The event also saw the introduction of new resources and configurations, the rewriting of VPN components from Bash to Golang, and the exploration of a Tailscale-based VPN to secure shoot clusters.\nKey Achievements 🗃️ OCI Helm Release Reference for ControllerDeployment: The Hackathon introduced the core.gardener.cloud/v1 API, which supports OCI repository-based Helm chart references. This innovation reduces operational complexity and enables reusability for other scenarios. 👨🏼‍💻 Local gardener-operator Development Setup with gardenlet: A new Skaffold configuration was created to harmonize the development setups for Gardener. This configuration deploys gardener-operator and its Garden CRD together with a deployment of gardenlet to register a seed cluster, allowing for a full-fledged Gardener setup. 👨🏻‍🌾 Extensions for Garden Cluster via gardener-operator: The Hackathon focused on automating additional preparation tasks for Gardener installations. The Garden controller was augmented to deploy extensions as part of its reconciliation flow, reducing operational complexity. 🪄 Gardenlet Self-Upgrades for Unmanaged Seeds: A new Gardenlet resource was introduced, allowing for the specification of deployment values and component configurations. A new controller within gardenlet watches these resources and updates the gardenlet’s Helm chart and configuration accordingly, effectively implementing self-upgrades. 🦺 Type-Safe Configurability in OperatingSystemConfig: The Hackathon improved the configurability of the OperatingSystemConfig for containerd, DNS, NTP, etc. The OperatingSystemConfig API was augmented to support containerd-config related use-cases. 👮 Expose Shoot API Server in Tailscale VPN: The Hackathon explored the use of a Tailscale-based VPN to secure shoot clusters. A document was compiled explaining how shoot owners can expose their API server within a Tailscale VPN. 
⌨️ Rewrite gardener/vpn2 from Bash to Golang: The Hackathon improved the VPN components by rewriting them in Golang. All functionality was successfully rewritten, and the pull requests have been opened for gardener/vpn2 and the integration into gardener/gardener. 🕳️ Pure IPv6-Based VPN Tunnel: The Hackathon addressed the restriction of the VPN network CIDR by switching the VPN tunnel to a pure IPv6-based network (follow-up of gardener/gardener#9597). This allows for more flexibility in network design. 👐 Harmonize Local VPN Setup with Real-World Scenario: The Hackathon aimed to align the local VPN setup with real-world scenarios regarding the VPN connection. provider-local was augmented to dynamically create Calico’s IPPool resources to emulate the real-world’s networking situation. 🐝 Support Cilium v1.15+ for HA Shoots: The Hackathon addressed the issue of Cilium v1.15+ not considering StatefulSet labels in NetworkPolicys. A prototype was developed to make the Service resources for vpn-seed-server headless. 🍞 Compression for ManagedResource Secrets: The Hackathon focused on reducing the size of Secret related to ManagedResources by leveraging the Brotli compression algorithm. This reduces network I/O and related costs, improving scalability and reducing load on the ETCD cluster. 🚛 Making Shoot Flux Extension Production-Ready: The Hackathon aimed to promote the Flux extension to “production-ready” status. Features such as reconciliation sync mode, and the option to provide additional Secret resources were added. 🧹 Move machine-controller-manager-provider-local Repository into gardener/gardener: The Hackathon focused on moving the machine-controller-manager-provider-local repository content into the gardener/gardener repository. This simplifies maintenance and development tasks. 🗄️ Stop Vendoring Third-Party Code in OS Extensions: The Hackathon aimed to avoid vendoring third-party code in the OS extensions. Two out of the four OS extensions have been adapted. 
📦 Consider Embedded Files for Local Image Builds: The Hackathon addressed the issue that changes to embedded files don’t lead to automatic rebuilds of the Gardener images by Skaffold for local development. The related hack script was augmented to detect embedded files and make them part of the list of dependencies. Note that a significant portion of the above topics builds on the achievements of previous Hackathons. This continuity, with each Hackathon building on the achievements of the last, is a testament to the power of sustained collaborative effort.\nLooking Ahead As we look towards the future, the Gardener community is already gearing up for the next Hackathon slated for the end of 2024. The anticipation is palpable, as these events have consistently proven to be a hotbed of creativity, innovation, and collaboration. The 5th Gardener Community Hackathon has once again demonstrated the remarkable outcomes that can be achieved when diverse minds unite to work on shared interests. The event has not only yielded an impressive array of results in a short span but has also sparked innovations that promise to propel the Gardener project to new heights. The community eagerly awaits the next Hackathon, ready to tackle new challenges and continue the journey of innovation and growth.\n","categories":"","description":"","excerpt":"The Gardener community recently concluded its 5th Hackathon, a …","ref":"/blog/2024/05-21-innovation-unleashed-a-deep-dive-into-the-5th-gardener-community-hackathon/","tags":"","title":"Innovation Unleashed: A Deep Dive into the 5th Gardener Community Hackathon"},{"body":"Use Cases In Kubernetes, on every Node the container runtime daemon pulls the container images that are configured in the Pods’ specifications running on the corresponding Node. 
Although these container images are cached on the Node’s file system after the initial pull operation, there are imperfections with this setup.\nNew Nodes are often created due to events such as auto-scaling (scale up), rolling updates, or replacements of unhealthy Nodes. A new Node would need to pull the images running on it from the container registry because the Node’s cache is initially empty. Pulling an image from a registry incurs network traffic and registry costs.\nTo reduce network traffic and registry costs for your Shoot cluster, it is recommended to enable the Gardener’s Registry Cache extension to run a registry as pull-through cache in the Shoot cluster.\nThe use cases of using a pull-through cache are not only limited to cost savings. Using a pull-through cache makes the Kubernetes cluster resilient to failures with the upstream registry - outages, failures due to rate limiting.\nSolution Gardener’s Registry Cache extension deploys and manages a pull-through cache registry in the Shoot cluster.\nA pull-through cache registry is a registry that caches container images in its storage. The first time when an image is requested from the pull-through cache, it pulls the image from the upstream registry, returns it to the client, and stores it in its local storage. On subsequent requests for the same image, the pull-through cache serves the image from its storage, avoiding network traffic to the upstream registry.\nImagine that you have a DaemonSet in your Kubernetes cluster. In a cluster without a pull-through cache, every Node must pull the same container image from the upstream registry. 
In a cluster with a pull-through cache, the image is pulled once from the upstream registry and then served to all Nodes.\nA Shoot cluster setup with a registry cache for Docker Hub (docker.io).\nCost Considerations An image pull represents ingress traffic for a virtual machine (data is entering the system from outside) and egress traffic for the upstream registry (data is leaving the system).\nIngress traffic from the internet to a virtual machine is free of charge on AWS, GCP, and Azure. However, the cloud providers charge NAT gateway costs for inbound and outbound data processed by the NAT gateway based on the processed data volume (per GB). The container registry offerings on the cloud providers charge for egress traffic - again, based on the data volume (per GB).\nHaving all of this in mind, the Registry Cache extension reduces NAT gateway costs for the Shoot cluster and container registry costs.\nTry It Out! We would also like to encourage you to try it! As a Gardener user, you can also reduce your infrastructure costs and increase resilience by enabling the Registry Cache for your Shoot clusters. The Registry Cache extension is a great fit for long-running Shoot clusters that have a high image pull rate.\nFor more information, refer to the Registry Cache extension documentation!\n","categories":"","description":"","excerpt":"Use Cases In Kubernetes, on every Node the container runtime daemon …","ref":"/blog/2024/04-22-gardeners-registry-cache-extension-another-cost-saving-win-and-more/","tags":"","title":"Gardener's Registry Cache Extension: Another Cost Saving Win and More"},{"body":"With the rising popularity of WebAssembly (WASM) and WebAssembly System Interface (WASI) comes a variety of integration possibilities. WASM is now not only suitable for the browser, but can also be used for running workloads on the server. In this post we will explore how you can get started writing serverless applications powered by SpinKube on a Gardener Shoot cluster. 
This post is inspired by a similar tutorial that goes through the steps of Deploying the Spin Operator on Azure Kubernetes Service. Keep in mind that this post does not aim to define a production environment. It is meant to show that Gardener Shoot clusters are able to run WebAssembly workloads, giving users the chance to experiment and explore this cutting-edge technology.\nPrerequisites kubectl - the Kubernetes command line tool helm - the package manager for Kubernetes A running Gardener Shoot cluster Gardener Shoot Cluster For this showcase I am using a Gardener Shoot cluster on AWS infrastructure with nodes powered by Garden Linux, although the steps should be applicable to other infrastructures as well, since Gardener aims to provide a homogeneous Kubernetes experience.\nAs a prerequisite for the next steps, verify that you have access to your Gardener Shoot cluster.\n# Verify the access to the Gardener Shoot cluster kubectl get ns NAME STATUS AGE default Active 4m1s kube-node-lease Active 4m1s kube-public Active 4m1s kube-system Active 4m1s If you are having trouble accessing the Gardener Shoot cluster, please consult the Accessing Shoot Clusters documentation page.\nDeploy the Spin Operator As a first step, we will install the Spin Operator Custom Resource Definitions and the Runtime Class needed by wasmtime-spin-v2.\n# Install Spin Operator CRDs kubectl apply -f https://github.com/spinkube/spin-operator/releases/download/v0.1.0/spin-operator.crds.yaml # Install the Runtime Class kubectl apply -f https://github.com/spinkube/spin-operator/releases/download/v0.1.0/spin-operator.runtime-class.yaml Next, we will install cert-manager, which is required for provisioning TLS certificates used by the admission webhook of the Spin Operator. 
If you face issues installing cert-manager, please consult the cert-manager installation documentation.\n# Add and update the Jetstack repository helm repo add jetstack https://charts.jetstack.io helm repo update # Install the cert-manager chart alongside with CRDs needed by cert-manager helm install \\ cert-manager jetstack/cert-manager \\ --namespace cert-manager \\ --create-namespace \\ --version v1.14.4 \\ --set installCRDs=true In order to install the containerd-wasm-shim on the Kubernetes nodes we will use the kwasm-operator. There is also a successor of kwasm-operator - runtime-class-manager which aims to address some of the limitations of kwasm-operator and provide a production grade implementation for deploying containerd shims on Kubernetes nodes. Since kwasm-operator is easier to install, for the purpose of this post we will use it instead of the runtime-class-manager.\n# Add the kwasm helm repository helm repo add kwasm http://kwasm.sh/kwasm-operator/ helm repo update # Install KWasm operator helm install \\ kwasm-operator kwasm/kwasm-operator \\ --namespace kwasm \\ --create-namespace \\ --set kwasmOperator.installerImage=ghcr.io/spinkube/containerd-shim-spin/node-installer:v0.13.1 # Annotate all nodes in the cluster so kwasm can select them and provision the required containerd shim kubectl annotate node --all kwasm.sh/kwasm-node=true We can see that a pod has started and completed in the kwasm namespace.\nkubectl -n kwasm get pod NAME READY STATUS RESTARTS AGE ip-10-180-7-60.eu-west-1.compute.internal-provision-kwasm-qhr8r 0/1 Completed 0 8s kwasm-operator-6c76c5f94b-8zt4s 1/1 Running 0 15s The logs of the kwasm-operator also indicate that the node was provisioned with the required shim.\nkubectl -n kwasm logs kwasm-operator-6c76c5f94b-8zt4s {\"level\":\"info\",\"node\":\"ip-10-180-7-60.eu-west-1.compute.internal\",\"time\":\"2024-04-18T05:44:25Z\",\"message\":\"Trying to Deploy on ip-10-180-7-60.eu-west-1.compute.internal\"} 
{\"level\":\"info\",\"time\":\"2024-04-18T05:44:31Z\",\"message\":\"Job ip-10-180-7-60.eu-west-1.compute.internal-provision-kwasm is still Ongoing\"} {\"level\":\"info\",\"time\":\"2024-04-18T05:44:31Z\",\"message\":\"Job ip-10-180-7-60.eu-west-1.compute.internal-provision-kwasm is Completed. Happy WASMing\"} Finally we can deploy the spin-operator alongside with a shim-executor.\nhelm install spin-operator \\ --namespace spin-operator \\ --create-namespace \\ --version 0.1.0 \\ --wait \\ oci://ghcr.io/spinkube/charts/spin-operator kubectl apply -f https://github.com/spinkube/spin-operator/releases/download/v0.1.0/spin-operator.shim-executor.yaml Deploy a Spin App Let’s deploy a sample Spin application using the following command:\nkubectl apply -f https://raw.githubusercontent.com/spinkube/spin-operator/main/config/samples/simple.yaml After the CRD has been picked up by the spin-operator, a pod will be created running the sample application. Let’s explore its logs.\nkubectl logs simple-spinapp-56687588d9-nbrtq Serving http://0.0.0.0:80 Available Routes: hello: http://0.0.0.0:80/hello go-hello: http://0.0.0.0:80/go-hello We can see the available routes served by the application. Let’s port forward to the application service and test them out.\nkubectl port-forward services/simple-spinapp 8000:80 Forwarding from 127.0.0.1:8000 -\u003e 80 Forwarding from [::1]:8000 -\u003e 80 In another terminal, we can verify that the application returns a response.\ncurl http://localhost:8000/hello Hello world from Spin!% This sets the ground for further experimentation and testing. 
The capabilities and API that the SpinApp CRD provides can be explored through the SpinApp CRD reference.\nCleanup Let’s clean up all resources deployed so far.\n# Delete the spin app and its executor kubectl delete spinapp simple-spinapp kubectl delete spinappexecutors.core.spinoperator.dev containerd-shim-spin # Uninstall the spin-operator chart helm -n spin-operator uninstall spin-operator # Remove the kwasm.sh/kwasm-node annotation from nodes kubectl annotate node --all kwasm.sh/kwasm-node- # Uninstall the kwasm-operator chart helm -n kwasm uninstall kwasm-operator # Uninstall the cert-manager chart helm -n cert-manager uninstall cert-manager # Delete the runtime class and SpinApp CRDs kubectl delete runtimeclass wasmtime-spin-v2 kubectl delete crd spinappexecutors.core.spinoperator.dev kubectl delete crd spinapps.core.spinoperator.dev Conclusion In my opinion, WASM on the server is here to stay. Communities are expressing more and more interest in integrating Kubernetes with WASM workloads. As shown, Gardener clusters are perfectly capable of supporting this use case. This setup is a great way to start exploring the capabilities that WASM can bring to the server. As stated in the introduction, bear in mind that this post does not define a production environment, but is rather meant to define a playground suitable for exploring and trying out ideas.\n","categories":"","description":"","excerpt":"With the rising popularity of WebAssembly (WASM) and WebAssembly …","ref":"/blog/2024/04-18-spinkube-gardener-shoot-cluster/","tags":"","title":"SpinKube on Gardener - Serverless WASM on Kubernetes"},{"body":"KubeCon + CloudNativeCon Europe 2024, recently held in Paris, was a testament to the robustness of the open-source community and its pivotal role in driving advancements in AI and cloud-native technologies. 
With a record attendance of over 12,000 participants, the conference underscored the ubiquity of cloud-native architectures and the business opportunities they provide.\nAI Everywhere LLMs and GenAI took center stage at the event, with discussions on challenges such as security, data management, and energy consumption. A popular quote stated, “If #inference is the new web application, #kubernetes is the new web server”. The conference emphasized the need for more open data models for AI to democratize the technology. Cloud-native platforms offer advantages for AI innovation, such as packaging models and dependencies as Docker packages and enhancing resource management for proper model execution. The community is exploring AI workload management, including using CPUs for inferencing and preprocessing data before handing it over to GPUs. CNCF took the initiative and put together an AI whitepaper outlining the apparent synergy between cloud-native technologies and AI.\nCluster Autopilot The conference showcased popular projects in the cloud-native ecosystem, including Kubernetes, Istio, and OpenTelemetry. Kubernetes was highlighted as a platform for running massive AI workloads. The UXL Foundation aims to enable multi-vendor AI workloads on Kubernetes, allowing developers to move AI workloads without being locked into a specific infrastructure. Every vendor we interacted with has assembled an AI-powered chatbot, which performs various functions – from assessing cluster health through analyzing cost efficiency and proposing workload optimizations to troubleshooting issues and alerting for potential challenges with upcoming Kubernetes version upgrades. Sysdig went even further with a chatbot, which answers the popular question, “Do any of my products have critical CVEs in production?” and analyzes workloads’ structure and configuration. 
Some chatbots leveraged the k8sgpt project, which joined the CNCF sandbox earlier this year.\nSophisticated Fleet Management The ecosystem showcased maturity in observability, platform engineering, security, and optimization, which will help operationalize AI workloads. Data demands and costs were also in focus, touching on data observability and cloud-cost management. Cloud-native technologies, also going beyond Kubernetes, are expected to play a crucial role in managing the increasing volume of data and scaling AI. Google showcased fleet management in their Google Hosted Cloud offering (ex-Anthos). It allows for defining teams and policies at the fleet level, later applied to all the Kubernetes clusters in the fleet, irrespective of the infrastructure they run on (GCP and beyond).\nWASM Everywhere The conference also highlighted the growing interest in WebAssembly (WASM) as a portable binary instruction format for executable programs and its integration with Kubernetes and other functions. The topic here started with a dedicated WASM pre-conference day, the sessions of which are available in the following playlist. WASM is positioned as the smoother approach to software distribution and modularity, providing more lightweight runtime execution options and an easier way for app developers to enter.\nRust on the Rise Several talks were promoting Rust as an ideal programming language for cloud-native workloads. It was even promoted as suitable for writing Kubernetes controllers.\nInternal Developer Platforms The event showcased the importance of Internal Developer Platforms (IDPs), both commercial and open-source, in facilitating the development process across all types of organizations – from Allianz to Mercedes. Backstage leads the pack by a large margin, with all relevant sessions being full or at capacity. 
Much effort goes into the modularization of Backstage, which was also a notable highlight at the conference.\nSustainability Sustainability was a key theme, with discussions on the role of cloud-native technologies in promoting green practices. The KubeCost application folks put a lot of effort into emphasizing the large amount of wasted money, which hyperscalers benefit from. In parallel – the kube-green project emphasized optimizing your cluster footprint to minimize CO2 emissions. The conference also highlighted the importance of open source in creating a level playing field for multiple players to compete, fostering diverse participation, and solving global challenges.\nCustomer Stories In contrast to the Chicago KubeCon in 2023, the one in Paris outlined multiple case studies, best practices, and reference scenarios. Many enterprises and their IT teams were well represented at KubeCon - regarding sessions, sponsorships, and participation. These companies strive to excel forward, reaping the efficiency and flexibility benefits cloud-native architectures provide. We came across multiple companies using Gardener as their Kubernetes management underlay – including FUGA Cloud, StackIT, and metal-stack Cloud. We eagerly anticipate more companies embracing Gardener at future events. The consistent feedback from these companies has been overwhelmingly positive—they absolutely love using Gardener and our shared excitement grows as the community thrives!\nNotable Talks Notable talks from leaders in the cloud-native world, including Solomon Hykes, Bob Wise, and representatives from KCP for Platforms and the United Nations, provided valuable insights into the future of AI and cloud-native technologies. All the talks are now uploaded to YouTube in the following playlist. 
Those do not include the various pre-conference days, available as separate playlists by CNCF.\nIn Conclusion… In conclusion, KubeCon 2024 showcased the intersection of AI and cloud-native technologies, the open-source community’s growth, and the cloud-native ecosystem’s maturity. Many enterprises are actively engaged there, innovating, trying, and growing their internal expertise. They’re using KubeCon as a recruiting event, expanding their internal talent pool and taking more of their internal operations and processes into their own hands. The event served as a platform for global collaboration, cross-company alignments, innovation, and the exchange of ideas, setting the stage for the future of cloud-native computing.\n","categories":"","description":"","excerpt":"KubeCon + CloudNativeCon Europe 2024, recently held in Paris, was a …","ref":"/blog/2024/04-05-kubecon-cloudnativecon-europe-2024-highlights/","tags":"","title":"KubeCon / CloudNativeCon Europe 2024 Highlights"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/","tags":"","title":"Docs"},{"body":"Gardener Extension for certificate services \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nConfiguration Example configuration for this extension controller:\napiVersion: shoot-cert-service.extensions.config.gardener.cloud/v1alpha1 kind: Configuration issuerName: gardener restrictIssuer: true # restrict issuer to any sub-domain of shoot.spec.dns.domain (default) acme: email: john.doe@example.com server: https://acme-v02.api.letsencrypt.org/directory # privateKey: | # Optional key for Let's Encrypt account. # -----BEGIN RSA PRIVATE KEY----- # ... # -----END RSA PRIVATE KEY----- Extension-Resources Example extension resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: \"extension-certificate-service\" namespace: shoot--project--abc spec: type: shoot-cert-service When an extension resource is reconciled, the extension controller will create an instance of Cert-Management as well as an Issuer with the ACME information provided in the configuration above. These resources are placed inside the shoot namespace on the seed. Also, the controller takes care of generating the necessary RBAC resources for the seed as well as for the shoot.\nPlease note, this extension controller relies on the Gardener-Resource-Manager to deploy k8s resources to seed and shoot clusters, i.e. it never deploys them directly.\nHow to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for certificate services for shoot clusters","excerpt":"Gardener extension controller for certificate services for shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/","tags":"","title":"Certificate services"},{"body":"Changing alerting settings Certificates are normally renewed automatically 30 days before they expire. As a second line of defense, a Prometheus alert is triggered if a certificate is only a few days away from expiration. By default, the alert is triggered 15 days before expiration.\nYou can configure the days in the providerConfig of the extension. Setting it to 0 disables the alerting.\nIn this example, the alert threshold is changed to 3 days before expiration.\nkind: Shoot ... spec: extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 kind: CertConfig alerting: certExpirationAlertDays: 3 ","categories":"","description":"How to change the alerting on expiring certificates","excerpt":"How to change the alerting on expiring certificates","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/alerting/","tags":["task"],"title":"Changing alerting settings"},{"body":"Have you ever wondered how much more your Kubernetes cluster can scale before it breaks down?\nOf course, the answer is heavily dependent on your workloads. But be assured, any cluster will break eventually. Therefore, the best mitigation is to plan for sharding early and run multiple clusters instead of trying to optimize everything hoping to survive with a single cluster. 
Still, it is helpful to know when the time has come to scale out. This document aims at giving you the basic knowledge to keep a Gardener-managed Kubernetes cluster up and running while it scales according to your needs.\nWelcome to Planet Scale, Please Mind the Gap! For a complex, distributed system like Kubernetes it is impossible to give absolute thresholds for its scalability. Instead, the limit of a cluster’s scalability is a combination of various, interconnected dimensions.\nLet’s take a rather simple example of two dimensions - the number of Pods per Node and number of Nodes in a cluster. According to the scalability thresholds documentation, Kubernetes can scale up to 5000 Nodes and with default settings accommodate a maximum of 110 Pods on a single Node. Pushing only a single dimension towards its limit will likely harm the cluster. But if both are pushed simultaneously, any cluster will break way before reaching one dimension’s limit.\nWhat sounds rather straightforward in theory can be a bit trickier in reality. While 110 Pods is the default limit, we successfully pushed beyond that and in certain cases run up to 200 Pods per Node without breaking the cluster. This is possible in an environment where one knows and controls all workloads and cluster configurations. It still requires careful testing, though, and comes at the cost of limiting the scalability of other dimensions, like the number of Nodes.\nOf course, a Kubernetes cluster has a plethora of dimensions. 
Thus, when looking at a simple question like “How many resources can I store in ETCD?”, the only meaningful answer must be: “it depends”.\nThe following sections will help you to identify relevant dimensions and how they affect a Gardener-managed Kubernetes cluster’s scalability.\n“Official” Kubernetes Thresholds and Scalability Considerations To get started with the topic, please check the basic guidance provided by the Kubernetes community (specifically SIG Scalability):\n How we define scalability? Kubernetes Scalability Thresholds Furthermore, the problem space has been discussed in a KubeCon talk, the slides for which can be found here. You should at least read the slides before continuing.\nEssentially, it comes down to this:\n If you promise to:\n correctly configure your cluster use extensibility features “reasonably” keep the load in the cluster within recommended limits Then we promise that your cluster will function properly.\n With that knowledge in mind, let’s look at Gardener and eventually pick up the question about the number of objects in ETCD raised above.\nGardener-Specific Considerations The following considerations are based on experience with various large clusters that scaled in different dimensions. Just as explained above, pushing beyond even one of the limits is likely (though not guaranteed) to cause issues at some point in time. Depending on the setup of your workloads, however, it might work unexpectedly well. Nevertheless, we urge you to make conscious decisions and rather think about sharding your workloads. Please keep in mind - your workload affects the overall stability and scalability of a cluster significantly.\nETCD The following section is based on a setup where ETCD Pods run on a dedicated Node pool and each Node has at least 8 vCPU and 32GB memory.\nETCD has a practical space limit of 8 GB. 
It caps the number of objects one can technically have in a Kubernetes cluster.\nOf course, the number is heavily influenced by each object’s size, especially when considering that secrets and configmaps may store up to 1MB of data. Another dimension is a cluster’s churn rate. Since ETCD stores a history of the keyspace, a higher churn rate reduces the number of objects. Gardener runs compaction every 30min and defragmentation once per day during a cluster’s maintenance window to ensure proper ETCD operations. However, it is still possible to overload ETCD. If the space limit is reached, ETCD will only accept READ or DELETE requests, and manual interaction by a Gardener operator is needed to disarm the alarm once usage is back below the threshold.\nTo avoid such a situation, you can monitor the current ETCD usage via the “ETCD” dashboard of the monitoring stack. It gives you the current DB size, as well as historical data for the past 2 weeks. While there are improvements planned to trigger compaction and defragmentation based on DB size, an ETCD should not grow up to this threshold. A typical, healthy DB size is less than 3 GB.\nFurthermore, the dashboard has a panel called “Memory”, which indicates the memory usage of the etcd pod(s). Using more than 16GB memory is a clear red flag, and you should reduce the load on ETCD.\nAnother dimension you should be aware of is the object count in ETCD. You can check it via the “API Server” dashboard, which features an “ETCD Object Counts By Resource” panel. The overall number of objects (excluding events, as they are stored in a different etcd instance) should not exceed 100k for most use cases.\nKube API Server The following section is based on a setup where kube-apiserver instances run as Pods and are scheduled to Nodes with at least 8 vCPU and 32GB memory.\nGardener can scale the Deployment of a kube-apiserver horizontally and vertically. 
Horizontal scaling is limited to a certain number of replicas and should not concern a stakeholder much. However, the CPU / memory consumption of an individual kube-apiserver pod poses a potential threat to the overall availability of your cluster. The vertical scaling of any kube-apiserver is limited by the amount of resources available on a single Node. Outgrowing the resources of a Node will cause a downtime and render the cluster unavailable.\nIn general, continuous CPU usage of up to 3 cores and 16 GB memory per kube-apiserver pod is considered to be safe. This gives some room to absorb spikes, for example when the caches are initialized. You can check the resource consumption by selecting kube-apiserver Pods in the “Kubernetes Pods” dashboard. If these boundaries are exceeded constantly, you need to investigate and derive measures to lower the load.\nFurther information is also recorded and made available through the monitoring stack. The dashboard “API Server Request Duration and Response Size” provides insights into the request processing time of kube-apiserver Pods. Related information like request rates, dropped requests or termination codes (e.g., 429 for too many requests) can be obtained from the dashboards “API Server” and “Kubernetes API Server Details”. They provide a good indicator for how well the system is dealing with its current load.\nReducing the load on the API servers can become a challenge. To get started, you may try to:\n Use immutable secrets and configmaps where possible to save watches. This pays off, especially when you have a high number of Nodes or just lots of secrets in general. Applications interacting with the K8s API: If you know an object by its name, use it. Using label selector queries is expensive, as the filtering happens only within the kube-apiserver and not etcd, hence all resources must first pass completely from etcd to kube-apiserver. Use (single object) caches within your controllers. 
Check the “Use cache for ShootStates in Gardenlet” issue for an example. Nodes When talking about the scalability of a Kubernetes cluster, Nodes are probably mentioned in the first place… well, obviously not in this guide. While vanilla Kubernetes lists 5000 Nodes as its upper limit, pushing that dimension is not feasible. Most clusters should run with fewer than 300 Nodes. But of course, the actual limit depends on the workloads deployed and can be lower or higher. As you scale your cluster, be extra careful and closely monitor ETCD and kube-apiserver.\nThe scalability of Nodes is subject to a range of limiting factors. Some of them can only be defined upon cluster creation and remain immutable during a cluster lifetime. So let’s discuss the most important dimensions.\nCIDR:\nUpon cluster creation, you have to specify or use the default values for several network segments. There are dedicated CIDRs for services, Pods, and Nodes. Each defines a range of IP addresses available for the individual resource type. Obviously, the maximum number of possible Nodes is capped by the CIDR for Nodes. However, there is a second limiting factor, which is the pod CIDR combined with the nodeCIDRMaskSize. This mask is used to divide the pod CIDR into smaller subnets, where each block gets assigned to a node. With a /16 pod network and a /24 nodeCIDRMaskSize, the pod CIDR splits into 2^(24-16) = 256 blocks, so a cluster can scale up to 256 Nodes. Please check Shoot Networking for details.\nEven though a /24 nodeCIDRMaskSize translates to a theoretical 256 pod IP addresses per Node, the maxPods setting should be less than 1/2 of this value. This gives the system some breathing room for churn and minimizes the risk of strange effects like misrouted packets caused by immediate re-use of IPs.\nCloud provider capacity:\nMost of the time, Nodes in Kubernetes translate to virtual machines on a hyperscaler. 
An attempt to add more Nodes to a cluster might fail due to capacity issues resulting in an error message like this:\nCloud provider message - machine codes error: code = [Internal] message = [InsufficientInstanceCapacity: We currently do not have sufficient \u003cinstance type\u003e capacity in the Availability Zone you requested. Our system will be working on provisioning additional capacity. In heavily utilized regions, individual clusters are competing for scarce resources. So before choosing a region / zone, try to ensure that the hyperscaler supports your anticipated growth. This might be done through quota requests or by contacting the respective support teams. To mitigate such a situation, you may configure a worker pool with a different Node type and a corresponding priority expander as part of a shoot’s autoscaler section. Please consult the Autoscaler FAQ for more details.\nRolling of Node pools:\nThe overall number of Nodes is affecting the duration of a cluster’s maintenance. When upgrading a Node pool to a new OS image or Kubernetes version, all machines will be drained and deleted, and replaced with new ones. The more Nodes a cluster has, the longer this process will take, given that workloads are typically protected by PodDisruptionBudgets. Check Shoot Updates and Upgrades for details. Be sure to take this into consideration when planning maintenance.\nRoot disk:\nYou should be aware that the Node configuration impacts your workload’s performance too. Take the root disk of a Node, for example. While most hyperscalers offer the usage of HDD and SSD disks, it is strongly recommended to use SSD volumes as root disks. When there are lots of Pods on a Node or workloads making extensive use of emptyDir volumes, disk throttling becomes an issue. When a disk hits its IOPS limits, processes are stuck in IO-wait and slow down significantly. 
This can lead to a slow-down in the kubelet’s heartbeat mechanism and result in Nodes being replaced automatically, as they appear to be unhealthy. To analyze such a situation, you might have to run tools like iostat, sar or top directly on a Node.\nSwitching to an I/O optimized instance type (if offered for your infrastructure) can help to resolve the issue. Please keep in mind that disks used via PersistentVolumeClaims have I/O limits as well. Sometimes these limits are related to the size and/or can be increased for individual disks.\nCloud Provider (Infrastructure) Limits In addition to the already mentioned capacity restrictions, a cloud provider may impose other limitations on a Kubernetes cluster’s scalability. One category is the account quota defining the number of resources allowed globally or per region. Make sure to request appropriate values that suit your needs and contain a buffer, for example for having more Nodes during a rolling update.\nAnother dimension is the network throughput per VM or network interface. While you may be able to choose a network-optimized Node type for your workload to mitigate issues, you cannot influence the available bandwidth for control plane components. Therefore, please ensure that the traffic on ETCD does not exceed 100MB/s. The ETCD dashboard provides data for monitoring this metric.\nIn some environments the upstream DNS might become an issue too and make your workloads subject to rate limiting. Given the heterogeneity of cloud providers incl. private data centers, it is not possible to give any thresholds. Still, the “CoreDNS” and “NodeLocalDNS” dashboards can help to derive a workload’s usage pattern. Check the DNS autoscaling and NodeLocalDNS documentation for available configuration options.\nWebhooks While webhooks provide powerful means to manage a cluster, they are equally powerful in breaking a cluster upon a malfunction or unavailability. 
Imagine using a policy enforcing system like Kyverno or Open Policy Agent Gatekeeper. As part of the stack, both will deploy webhooks which are invoked for almost everything that happens in a cluster. Now, if such a webhook either gets overloaded or is simply not available, the cluster will stop functioning properly.\nHence, you have to ensure proper sizing, quick processing time, and availability of the webhook serving Pods when deploying webhooks. Please consult Dynamic Admission Control (Availability and Timeouts sections) for details. You should also be aware of the time added to any request that has to go through a webhook, as the kube-apiserver sends the request for mutation / validation to another pod and waits for the response. The more resources are subject to an external webhook, the more likely this will become a bottleneck when having a high churn rate on resources. Within the Gardener monitoring stack, you can check the extra time per webhook via the “API Server (Admission Details)” dashboard, which has a panel for “Duration per Webhook”.\nIn Gardener, any webhook timeout should be less than 15 seconds. Due to the separation of Kubernetes data-plane (shoot) and control-plane (seed) in Gardener, the extra hop from kube-apiserver (control-plane) to webhook (data-plane) is more expensive. Please check Shoot Status for more details.\nCustom Resource Definitions Using Custom Resource Definitions (CRD) to extend a cluster’s API is a common Kubernetes pattern and so is writing an operator to act upon custom resources. Writing an efficient controller reduces the load on the kube-apiserver and allows for better scaling. As a starting point, you might want to read Gardener’s Kubernetes Clients Guide.\nAnother problematic dimension is the usage of conversion webhooks when having resources stored in different versions. Not only do they add latency (see Webhooks), but they can also block the kube-controller-manager’s garbage collection. 
If a conversion webhook is unavailable, the garbage collector fails to list all resources and does not perform any cleanup. In order to avoid such a situation, it is highly recommended to use conversion webhooks only when necessary and complete the migration to a new version as soon as possible.\nConclusion As outlined by SIG Scalability, it is quite impossible to give limits or even recommendations fitting every individual use case. Instead, this guide outlines relevant dimensions and gives rather conservative recommendations based on usage patterns observed. By combining this information, it is possible to operate and scale a cluster in a stable manner.\nWhile going beyond is certainly possible for some dimensions, it significantly increases the risk of instability. Typically, limits on the control-plane are introduced by the availability of resources like CPU or memory on a single machine and can hardly be influenced by any user. Therefore, utilizing the existing resources efficiently is key. Other parameters are controlled by a user. In these cases, careful testing may reveal actual limits for a specific use case.\nPlease keep in mind that all aspects of a workload greatly influence the stability and scalability of a Kubernetes cluster.\n","categories":"","description":"Know the boundary conditions when scaling your workloads","excerpt":"Know the boundary conditions when scaling your workloads","ref":"/docs/guides/administer-shoots/scalability/","tags":"","title":"Scalability of Gardener Managed Kubernetes Clusters"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2023/","tags":"","title":"2023"},{"body":"Developing a highly available workload that can tolerate a zone outage is no trivial task. In this blog, we will explore various recommendations to get closer to that goal. 
While many recommendations are general enough, the examples are specific in how to achieve this in a Gardener-managed cluster and where/how to tweak the different control plane components. If you do not use Gardener, it may still be a worthwhile read as most settings can be influenced with most of the Kubernetes providers.\nFirst however, what is a zone outage? It sounds like a clear-cut “thing”, but it isn’t. There are many things that can go haywire. Here are some examples:\n Elevated cloud provider API error rates for individual or multiple services Network bandwidth reduced or latency increased, usually also affecting storage subsystems as they are network attached No networking at all, no DNS, machines shutting down or restarting, … Functional issues, of either the entire service (e.g., all block device operations) or only parts of it (e.g., LB listener registration) All services down, temporarily or permanently (the proverbial burning down data center 🔥) This and everything in between make it hard to prepare for such events, but you can still do a lot. The most important recommendation is to not target specific issues exclusively - tomorrow another service will fail in an unanticipated way. Also, focus more on meaningful availability than on internal signals (useful, but not as relevant as the former). Always prefer automation over manual intervention (e.g., leader election is a pretty robust mechanism, auto-scaling may be required as well, etc.).\nAlso remember that HA is costly - you need to balance it against the cost of an outage, as silly as this may sound, e.g., running all this excess capacity “just in case” vs. “going down” vs. a risk-based approach in between where you have means that will kick in, but they are not guaranteed to work (e.g., if the cloud provider is out of resource capacity). 
Maybe some of your components must run at the highest possible availability level, but others not - that’s a decision only you can make.\nControl Plane The Kubernetes cluster control plane is managed by Gardener (as pods in separate infrastructure clusters to which you have no direct access) and can be set up with no failure tolerance (control plane pods will be recreated best-effort when resources are available) or one of the failure tolerance types node or zone.\nStrictly speaking, static workload does not depend on the (high) availability of the control plane, but static workload doesn’t rhyme with Cloud and Kubernetes. It also means that when you possibly need it the most, e.g., during a zone outage, critical self-healing or auto-scaling functionality won’t be available to you and your workload if your control plane is down as well. That’s why it’s generally recommended to use the failure tolerance type zone for the control planes of productive clusters, at least in all regions that have 3+ zones. Regions that have only 1 or 2 zones don’t support the failure tolerance type zone and then your second best option is the failure tolerance type node, which means a zone outage can still take down your control plane, but individual node outages won’t.\nIn the shoot resource, this is all you need to add:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: zone # valid values are `node` and `zone` (only available if your control plane resides in a region with 3+ zones) This setting will scale out all control plane components for a Gardener cluster as necessary, so that no single zone outage can take down the control plane for longer than just a few seconds for the fail-over to take place (e.g., lease expiration and new leader election or readiness probe failure and endpoint removal). 
Components run highly available in either active-active (servers) or active-passive (controllers) mode at all times, the persistence (ETCD), which is consensus-based, will tolerate the loss of one zone and still maintain quorum and therefore remain operational. These are all patterns that we will revisit down below also for your own workload.\nWorker Pools Now that you have configured your Kubernetes cluster control plane in HA, i.e. spread it across multiple zones, you need to do the same for your own workload, but in order to do so, you need to spread your nodes across multiple zones first.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: provider: workers: - name: ... minimum: 6 maximum: 60 zones: - ... Prefer regions with at least 2, better 3+ zones and list the zones in the zones section for each of your worker pools. Whether you need 2 or 3 zones at a minimum depends on your fail-over concept:\n Consensus-based software components (like ETCD) depend on maintaining a quorum of (n/2)+1, so you need at least 3 zones to tolerate the outage of 1 zone. Primary/Secondary-based software components need just 2 zones to tolerate the outage of 1 zone. Then there are software components that can scale out horizontally. They are probably fine with 2 zones, but you also need to think about the load-shift and that the remaining zone must then pick up the work of the unhealthy zone. With 2 zones, the remaining zone must cope with an increase of 100% load. With 3 zones, the remaining zones must only cope with an increase of 50% load (per zone). In general, the question is also whether you have the fail-over capacity already up and running or not. If not, i.e. you depend on re-scheduling to a healthy zone or auto-scaling, be aware that during a zone outage, you will see a resource crunch in the healthy zones. If you have no automation, i.e. only human operators (a.k.a. 
“red button approach”), you probably will not get the machines you need and even with automation, it may be tricky. But holding the capacity available at all times is costly. In the end, that’s a decision only you can make. If you made that decision, please adapt the minimum and maximum settings for your worker pools accordingly.\nAlso, consider fall-back worker pools (with different/alternative machine types) and cluster autoscaler expanders using a priority-based strategy.\nGardener-managed clusters deploy the cluster autoscaler or CA for short and you can tweak the general CA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: clusterAutoscaler: expander: \"least-waste\" scanInterval: 10s scaleDownDelayAfterAdd: 60m scaleDownDelayAfterDelete: 0s scaleDownDelayAfterFailure: 3m scaleDownUnneededTime: 30m scaleDownUtilizationThreshold: 0.5 If you want to be ready for a sudden spike or have some buffer in general, over-provision nodes by means of “placeholder” pods with low priority and appropriate resource requests. This way, they will demand nodes to be provisioned for them, but if any pod comes up with a regular/higher priority, the low priority pods will be evicted to make space for the more important ones. Strictly speaking, this is not related to HA, but it may be important to keep this in mind as you generally want critical components to be rescheduled as fast as possible and if there is no node available, it may take 3 minutes or longer to do so (depending on the cloud provider). Besides, not only zones can fail, but also individual nodes.\nReplicas (Horizontal Scaling) Now let’s talk about your workload. In most cases, this will mean to run multiple replicas. If you cannot do that (a.k.a. you have a singleton), that’s a bad situation to be in. Maybe you can run a spare (secondary) as backup? 
If you cannot, you depend on quick detection and rescheduling of your singleton (more on that below).\nObviously, things get messier with persistence. If you have persistence, you should ideally replicate your data, i.e. let your spare (secondary) “follow” your main (primary). If your software doesn’t support that, you have to deploy other means, e.g., volume snapshotting or side-backups (specific to the software you deploy; keep the backups regional, so that you can switch to another zone at all times). If you have to do those, your HA scenario becomes more a DR scenario and terms like RPO and RTO become relevant to you:\n Recovery Point Objective (RPO): Potential data loss, i.e. how much data will you lose at most (time between backups) Recovery Time Objective (RTO): Time until recovery, i.e. how long does it take you to be operational again (time to restore) Also, keep in mind that your persistent volumes are usually zonal, i.e. once you have a volume in one zone, it’s bound to that zone and you cannot get up your pod in another zone w/o first recreating the volume yourself (Kubernetes won’t help you here directly).\nAnyway, best avoid that, if you can (from technical and cost perspective). The best solution (and also the most costly one) is to run multiple replicas in multiple zones and keep your data replicated at all times, so that your RPO is always 0 (best). That’s what we do for Gardener-managed cluster HA control planes (ETCD) as any data loss may be disastrous and lead to orphaned resources (in addition, we deploy side cars that do side-backups for disaster recovery, with full and incremental snapshots with an RPO of 5m).\nSo, how to run with multiple replicas? That’s the easiest part in Kubernetes and the two most important resources, Deployments and StatefulSet, support that out of the box:\napiVersion: apps/v1 kind: Deployment | StatefulSet spec: replicas: ... The problem comes with the number of replicas. 
It’s easy only if the number is static, e.g., 2 for active-active/passive or 3 for consensus-based software components, but what about software components that can scale out horizontally? Here you usually do not set the number of replicas statically, but make use of the horizontal pod autoscaler or HPA for short (built-in; part of the kube-controller-manager). There are also other options like the cluster proportional autoscaler, but while the former works based on metrics, the latter is more of a guesstimate approach that derives the number of replicas from the number of nodes/cores in a cluster. Sometimes useful, but often blind to the actual demand.\nSo, HPA it is then for most of the cases. However, what is the resource (e.g., CPU or memory) that drives the number of desired replicas? Again, this is up to you, but CPU or memory are not always the best choices. In some cases, custom metrics may be more appropriate, e.g., requests per second (as was the case for us).\nYou will have to create specific HorizontalPodAutoscaler resources for your scale target and can tweak the general HPA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: kubeControllerManager: horizontalPodAutoscaler: syncPeriod: 15s tolerance: 0.1 downscaleStabilization: 5m0s initialReadinessDelay: 30s cpuInitializationPeriod: 5m0s Resources (Vertical Scaling) While it is important to set a sufficient number of replicas, it is also important to give the pods sufficient resources (CPU and memory). This is especially true when you think about HA. When a zone goes down, you might need to bring up replacement pods, if you don’t have them running already, to take over the load from the impacted zone. Likewise, e.g., with active-active software components, you can expect the remaining pods to receive more load. If you cannot scale them out horizontally to serve the load, you will probably need to scale them out (or rather up) vertically. 
This is done by the vertical pod autoscaler or VPA for short (not built-in; part of the kubernetes/autoscaler repository).\nA few caveats though:\n You cannot use HPA and VPA on the same metrics as they would influence each other, which would lead to pod thrashing (more replicas require fewer resources; fewer resources require more replicas) Scaling horizontally doesn’t cause downtimes (at least not when out-scaling and only one replica is affected when in-scaling), but scaling vertically does (if the pod runs OOM anyway, but also when new recommendations are applied, resource requests for existing pods may be changed, which causes the pods to be rescheduled). Although the discussion has been going on for a very long time, changing resources in-place is still not supported (see KEP 1287, implementation in Kubernetes, implementation in VPA). VPA is a useful tool and Gardener-managed clusters deploy a VPA by default for you (HPA is supported anyway as it’s built into the kube-controller-manager). You will have to create specific VerticalPodAutoscaler resources for your scale target and can tweak the general VPA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: verticalPodAutoscaler: enabled: true evictAfterOOMThreshold: 10m0s evictionRateBurst: 1 evictionRateLimit: -1 evictionTolerance: 0.5 recommendationMarginFraction: 0.15 updaterInterval: 1m0s recommenderInterval: 1m0s While horizontal pod autoscaling is relatively straight-forward, it takes a long time to master vertical pod autoscaling. 
We saw performance issues, hard-coded behavior (on OOM, memory is bumped by +20% and it may take a few iterations to reach a good level), unintended pod disruptions by applying new resource requests (after 12h all targeted pods will receive new requests even though individually they would be fine without, which also drives active-passive resource consumption up), difficulties to deal with spiky workload in general (due to the algorithmic approach it takes), recommended requests may exceed node capacity, limit scaling is proportional and therefore often questionable, and more. VPA is a double-edged sword: useful and necessary, but not easy to handle.\nFor the Gardener-managed components, we mostly removed limits. Why?\n CPU limits have almost always only downsides. They cause needless CPU throttling, which is not even easily visible. CPU requests turn into cpu shares, so if the node has capacity, the pod may consume the freely available CPU, but not if you have set limits, which curtail the pod by means of cpu quota. There are only certain scenarios in which they may make sense, e.g., if you set requests=limits and thereby define a pod with guaranteed QoS, which influences your cgroup placement. However, that is difficult to do for the components you implement yourself and practically impossible for the components you just consume, because what’s the correct value for requests/limits and will it hold true also if the load increases and what happens if a zone goes down or with the next update/version of this component? If anything, CPU limits caused outages, not helped prevent them. As for memory limits, they are slightly more useful, because CPU is compressible and memory is not, so if one pod runs berserk, it may take others down (with CPU, cpu shares make it as fair as possible), depending on which OOM killer strikes (a complicated topic by itself). You don’t want the operating system OOM killer to strike as the result is unpredictable. 
It is better if the cgroup OOM killer strikes, or even the kubelet’s eviction if the consumption is slow enough, as the latter even takes priorities into consideration. If your component is critical and a singleton (e.g., node daemon set pods), you are also better off without memory limits, because letting the pod go OOM because of artificial/wrong memory limits can mean that the node becomes unusable. Hence, such components are better run with no memory limit or only a very high one, so that you can still catch the occasional memory leak (bug) eventually. Under normal operation, if you cannot decide on a true upper limit, it is better to have no limits than to cause endless outages through them, especially when you need the pods the most (during a zone outage) and all your assumptions have gone out of the window. The downside of having poor or no limits and poor or no requests is that nodes may “die” more often. Contrary to expectation, even a managed service is not responsible for, and cannot guarantee, the health of a node under all circumstances, since the end user defines what is run on the nodes (shared responsibility). If the workload exhausts any resource, it will be the end of the node, e.g., by compressing the CPU too much (so that the kubelet fails to do its work), or by exhausting the main memory too fast, disk space, file handles, or any other resource.\nThe kubelet allows for explicit reservation of resources for operating system daemons (system-reserved) and Kubernetes daemons (kube-reserved) that are subtracted from the actual node resources and become the allocatable node resources for your workload/pods. All managed services configure these settings “by rule of thumb” (a balancing act), but cannot guarantee that the values won’t waste resources or always will be sufficient. You will have to fine-tune them eventually and adapt them to your needs. 
In addition, you can configure soft and hard eviction thresholds to give the kubelet some headroom to evict “greedy” pods in a controlled way. These settings can be configured for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: kubelet: systemReserved: # explicit resource reservation for operating system daemons cpu: 100m memory: 1Gi ephemeralStorage: 1Gi pid: 1000 kubeReserved: # explicit resource reservation for Kubernetes daemons cpu: 100m memory: 1Gi ephemeralStorage: 1Gi pid: 1000 evictionSoft: # soft, i.e. graceful eviction (used if the node is about to run out of resources, avoiding hard evictions) memoryAvailable: 200Mi imageFSAvailable: 10% imageFSInodesFree: 10% nodeFSAvailable: 10% nodeFSInodesFree: 10% evictionSoftGracePeriod: # caps pod's `terminationGracePeriodSeconds` value during soft evictions (specific grace periods) memoryAvailable: 1m30s imageFSAvailable: 1m30s imageFSInodesFree: 1m30s nodeFSAvailable: 1m30s nodeFSInodesFree: 1m30s evictionHard: # hard, i.e. 
immediate eviction (used if the node is out of resources, avoiding that the OS runs out of resources and fails processes indiscriminately) memoryAvailable: 100Mi imageFSAvailable: 5% imageFSInodesFree: 5% nodeFSAvailable: 5% nodeFSInodesFree: 5% evictionMinimumReclaim: # additional resources to reclaim after hitting the hard eviction thresholds to not hit the same thresholds soon after again memoryAvailable: 0Mi imageFSAvailable: 0Mi imageFSInodesFree: 0Mi nodeFSAvailable: 0Mi nodeFSInodesFree: 0Mi evictionMaxPodGracePeriod: 90 # caps pod's `terminationGracePeriodSeconds` value during soft evictions (general grace periods) evictionPressureTransitionPeriod: 5m0s # stabilization time window to avoid flapping of node eviction state You can tweak these settings also individually per worker pool (spec.provider.workers.kubernetes.kubelet...), which makes sense especially with different machine types (and also workload that you may want to schedule there).\nPhysical memory is not compressible, but you can overcome this issue to some degree (alpha since Kubernetes v1.22 in combination with the feature gate NodeSwap on the kubelet) with swap memory. You can read more in this introductory blog and the docs. 
If you choose to use it (still only alpha at the time of this writing) you may want to consider also the risks associated with swap memory:\n Reduced performance predictability Reduced performance up to page thrashing Reduced security as secrets, normally held only in memory, could be swapped out to disk That said, the various options mentioned above are only remotely related to HA and will not be further explored throughout this document, but just to remind you: if a zone goes down, load patterns will shift, existing pods will probably receive more load and will require more resources (especially because it is often practically impossible to set “proper” resource requests, which drive node allocation - limits are always ignored by the scheduler) or more pods will/must be placed on the existing and/or new nodes and then these settings, which are generally critical (especially if you switch on bin-packing for Gardener-managed clusters as a cost saving measure), will become even more critical during a zone outage.\nProbes Before we go down the rabbit hole even further and talk about how to spread your replicas, we need to talk about probes first, as they will become relevant later. Kubernetes supports three kinds of probes: startup, liveness, and readiness probes. If you are a visual thinker, also check out this slide deck by Tim Hockin (Kubernetes networking SIG chair).\nBasically, the startupProbe and the livenessProbe help you restart the container, if it’s unhealthy for whatever reason, by letting the kubelet that orchestrates your containers on a node know that it’s unhealthy. The former is a special case of the latter and only applied at the startup of your container, if you need to handle the startup phase differently (e.g., with very slow starting containers) from the rest of the lifetime of the container.\nNow, the readinessProbe helps you manage the ready status of your container and thereby pod (any container that is not ready turns the pod not ready). 
This again has impact on endpoints and pod disruption budgets:\n If the pod is not ready, the endpoint will be removed and the pod will not receive traffic anymore If the pod is not ready, the pod counts into the pod disruption budget and if the budget is exceeded, no further voluntary pod disruptions will be permitted for the remaining ready pods (e.g., no eviction, no voluntary horizontal or vertical scaling, if the pod runs on a node that is about to be drained or in draining, draining will be paused until the max drain timeout passes) As you can see, all of these probes are (also) related to HA (mostly the readinessProbe, but depending on your workload, you can also leverage livenessProbe and startupProbe into your HA strategy). If Kubernetes doesn’t know about the individual status of your container/pod, it won’t do anything for you (right away). That said, later/indirectly something might/will happen via the node status that can also be ready or not ready, which influences the pods and load balancer listener registration (a not ready node will not receive cluster traffic anymore), but this process is worker pool global and reacts delayed and also doesn’t discriminate between the containers/pods on a node.\nIn addition, Kubernetes also offers pod readiness gates to amend your pod readiness with additional custom conditions (normally, only the sum of the container readiness matters, but pod readiness gates additionally count into the overall pod readiness). This may be useful if you want to block (by means of pod disruption budgets that we will talk about next) the roll-out of your workload/nodes in case some (possibly external) condition fails.\nPod Disruption Budgets One of the most important resources that help you on your way to HA are pod disruption budgets or PDB for short. 
They tell Kubernetes how to deal with voluntary pod disruptions, e.g., during the deployment of your workload, when the nodes are rolled, or just in general when a pod shall be evicted/terminated. Basically, if the budget is reached, they block all voluntary pod disruptions (at least for a while until possibly other timeouts act or things happen that leave Kubernetes no choice anymore, e.g., the node is forcefully terminated). You should always define them for your workload.\nVery important to note is that they are based on the readinessProbe, i.e. even if all of your replicas are lively, but not enough of them are ready, this blocks voluntary pod disruptions, so they are very critical and useful. Here an example (you can specify either minAvailable or maxUnavailable in absolute numbers or as percentage):\napiVersion: policy/v1 kind: PodDisruptionBudget spec: maxUnavailable: 1 selector: matchLabels: ... And please do not specify a PDB of maxUnavailable being 0 or similar. That’s pointless, even detrimental, as it blocks then even useful operations, forces always the hard timeouts that are less graceful and it doesn’t make sense in the context of HA. You cannot “force” HA by preventing voluntary pod disruptions, you must work with the pod disruptions in a resilient way. Besides, PDBs are really only about voluntary pod disruptions - something bad can happen to a node/pod at any time and PDBs won’t make this reality go away for you.\nPDBs will not always work as expected and can also get in your way, e.g., if the PDB is violated or would be violated, it may possibly block whatever you are trying to do to salvage the situation, e.g., drain a node or deploy a patch version (if the PDB is or would be violated, not even unhealthy pods would be evicted as they could theoretically become healthy again, which Kubernetes doesn’t know). 
In order to overcome this issue, it is now possible (alpha since Kubernetes v1.26 in combination with the feature gate PDBUnhealthyPodEvictionPolicy on the API server) to configure the so-called unhealthy pod eviction policy. The default is still IfHealthyBudget as a change in default would have changed the behavior (as described above), but you can now also set AlwaysAllow at the PDB (spec.unhealthyPodEvictionPolicy). For more information, please check out this discussion, the PR and this document and balance the pros and cons for yourself. In short, the new AlwaysAllow option is probably the better choice in most of the cases while IfHealthyBudget is useful only if you have frequent temporary transitions or for special cases where you have already implemented controllers that depend on the old behavior.\nPod Topology Spread Constraints Pod topology spread constraints or PTSC for short (no official abbreviation exists, but we will use this in the following) are enormously helpful to distribute your replicas across multiple zones, nodes, or any other user-defined topology domain. They complement and improve on pod (anti-)affinities that still exist and can be used in combination.\nPTSCs are an improvement, because they allow for maxSkew and minDomains. You can steer the “level of tolerated imbalance” with maxSkew, e.g., you probably want that to be at least 1, so that you can perform a rolling update, but this all depends on your deployment (maxUnavailable and maxSurge), etc. Stateful sets are a bit different (maxUnavailable) as they are bound to volumes and depend on them, so there usually cannot be 2 pods requiring the same volume. 
minDomains is a hint to tell the scheduler how far to spread, e.g., if all nodes in one zone disappeared because of a zone outage, it may “appear” as if there are only 2 zones in a 3 zones cluster and the scheduling decisions may end up wrong, so a minDomains of 3 will tell the scheduler to spread to 3 zones before adding another replica in one zone. Be careful with this setting as it also means, if one zone is down the “spread” is already at least 1, if pods run in the other zones. This is useful where you have exactly as many replicas as you have zones and you do not want any imbalance. Imbalance is critical as if you end up with one, nobody is going to do the (active) re-balancing for you (unless you deploy and configure additional non-standard components such as the descheduler). So, for instance, if you have something like a DBMS that you want to spread across 2 zones (active-passive) or 3 zones (consensus-based), you better specify minDomains of 2 respectively 3 to force your replicas into at least that many zones before adding more replicas to another zone (if supported).\nAnyway, PTSCs are critical to have, but not perfect, so we saw (unsurprisingly, because that’s how the scheduler works), that the scheduler may block the deployment of new pods because it takes the decision pod-by-pod (see for instance #109364).\nPod Affinities and Anti-Affinities As said, you can combine PTSCs with pod affinities and/or anti-affinities. Especially inter-pod (anti-)affinities may be helpful to place pods apart, e.g., because they are fall-backs for each other or you do not want multiple potentially resource-hungry “best-effort” or “burstable” pods side-by-side (noisy neighbor problem), or together, e.g., because they form a unit and you want to reduce the failure domain, reduce the network latency, and reduce the costs.\nTopology Aware Hints While topology aware hints are not directly related to HA, they are very relevant in the HA context. 
Spreading your workload across multiple zones may increase network latency and cost significantly, if the traffic is not shaped. Topology aware hints (beta since Kubernetes v1.23, replacing the now deprecated topology aware traffic routing with topology keys) help to route the traffic within the originating zone, if possible. Basically, they tell kube-proxy how to set up your routing information, so that clients can talk to endpoints that are located within the same zone.\nBe aware however, that there are some limitations. Those are called safeguards and if they strike, the hints are off and traffic is routed again randomly. Especially controversial is the balancing limitation as there is the assumption that the load that hits an endpoint is determined by the allocatable CPUs in that topology zone, but that’s not always, or even often, the case (see for instance #113731 and #110714). So, this limitation hits far too often and your hints are off, but then again, it’s about network latency and cost optimization first, so it’s better than nothing.\nNetworking We have talked about networking only to some small degree so far (readiness probes, pod disruption budgets, topology aware hints). The most important component is probably your ingress load balancer - everything else is managed by Kubernetes. AWS, Azure, GCP, and also OpenStack offer multi-zonal load balancers, so make use of them. 
In Azure and GCP, LBs are regional whereas in AWS and OpenStack, they need to be bound to a zone, which the cloud-controller-manager does by observing the zone labels at the nodes (please note that this behavior is not always working as expected, see #570 where the AWS cloud-controller-manager is not readjusting to newly observed zones).\nPlease be reminded that even if you use a service mesh like Istio, the off-the-shelf installation/configuration usually never comes with productive settings (to simplify first-time installation and improve first-time user experience) and you will have to fine-tune your installation/configuration, much like the rest of your workload.\nRelevant Cluster Settings Following now a summary/list of the more relevant settings you may like to tune for Gardener-managed clusters:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: zone # valid values are `node` and `zone` (only available if your control plane resides in a region with 3+ zones) kubernetes: kubeAPIServer: defaultNotReadyTolerationSeconds: 300 defaultUnreachableTolerationSeconds: 300 kubelet: ... kubeScheduler: featureGates: MinDomainsInPodTopologySpread: true kubeControllerManager: nodeMonitorPeriod: 10s nodeMonitorGracePeriod: 40s horizontalPodAutoscaler: syncPeriod: 15s tolerance: 0.1 downscaleStabilization: 5m0s initialReadinessDelay: 30s cpuInitializationPeriod: 5m0s verticalPodAutoscaler: enabled: true evictAfterOOMThreshold: 10m0s evictionRateBurst: 1 evictionRateLimit: -1 evictionTolerance: 0.5 recommendationMarginFraction: 0.15 updaterInterval: 1m0s recommenderInterval: 1m0s clusterAutoscaler: expander: \"least-waste\" scanInterval: 10s scaleDownDelayAfterAdd: 60m scaleDownDelayAfterDelete: 0s scaleDownDelayAfterFailure: 3m scaleDownUnneededTime: 30m scaleDownUtilizationThreshold: 0.5 provider: workers: - name: ... minimum: 6 maximum: 60 maxSurge: 3 maxUnavailable: 0 zones: - ... 
# list of zones you want your worker pool nodes to be spread across, see above kubernetes: kubelet: ... # similar to `kubelet` above (cluster-wide settings), but here per worker pool (pool-specific settings), see above machineControllerManager: # optional, it allows to configure the machine-controller settings. machineCreationTimeout: 20m machineHealthTimeout: 10m machineDrainTimeout: 60h systemComponents: coreDNS: autoscaling: mode: horizontal # valid values are `horizontal` (driven by CPU load) and `cluster-proportional` (driven by number of nodes/cores) On spec.controlPlane.highAvailability.failureTolerance.type If set, determines the degree of failure tolerance for your control plane. zone is preferred, but only available if your control plane resides in a region with 3+ zones. See above and the docs.\nOn spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds and defaultNotReadyTolerationSeconds This is a very interesting API server setting that lets Kubernetes decide how fast to evict pods from nodes whose status condition of type Ready is either Unknown (node status unknown, a.k.a unreachable) or False (kubelet not ready) (see node status conditions; please note that kubectl shows both values as NotReady which is a somewhat “simplified” visualization).\nYou can also override the cluster-wide API server settings individually per pod:\nspec: tolerations: - key: \"node.kubernetes.io/unreachable\" operator: \"Exists\" effect: \"NoExecute\" tolerationSeconds: 0 - key: \"node.kubernetes.io/not-ready\" operator: \"Exists\" effect: \"NoExecute\" tolerationSeconds: 0 This will evict pods on unreachable or not-ready nodes immediately, but be cautious: 0 is very aggressive and may lead to unnecessary disruptions. 
Again, you must decide for your own workload and balance out the pros and cons (e.g., long startup time).\nPlease note, these settings replace spec.kubernetes.kubeControllerManager.podEvictionTimeout that was deprecated with Kubernetes v1.26 (and acted as an upper bound).\nOn spec.kubernetes.kubeScheduler.featureGates.MinDomainsInPodTopologySpread Required to be enabled for minDomains to work with PTSCs (beta since Kubernetes v1.25, but off by default). See above and the docs. This tells the scheduler how many topology domains to expect (=zones in the context of this document).\nOn spec.kubernetes.kubeControllerManager.nodeMonitorPeriod and nodeMonitorGracePeriod This is another very interesting kube-controller-manager setting that can help you speed up or slow down how fast a node shall be considered Unknown (node status unknown, a.k.a unreachable) when the kubelet is not updating its status anymore (see node status conditions), which affects eviction (see spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds and defaultNotReadyTolerationSeconds above). The shorter the time window, the faster Kubernetes will act, but the higher the chance of flapping behavior and pod thrashing, so you may want to balance that out according to your needs, otherwise stick to the default which is a reasonable compromise.\nOn spec.kubernetes.kubeControllerManager.horizontalPodAutoscaler... This configures horizontal pod autoscaling in Gardener-managed clusters. See above and the docs for the detailed fields.\nOn spec.kubernetes.verticalPodAutoscaler... This configures vertical pod autoscaling in Gardener-managed clusters. See above and the docs for the detailed fields.\nOn spec.kubernetes.clusterAutoscaler... This configures node auto-scaling in Gardener-managed clusters. 
See above and the docs for the detailed fields, especially about expanders, which may become life-saving in case of a zone outage when a resource crunch is setting in and everybody rushes to get machines in the healthy zones.\nIn case of a zone outage, it may be interesting to understand how the cluster autoscaler will put a worker pool in one zone into “back-off”. Unfortunately, the official cluster autoscaler documentation does not explain these details, but you can find hints in the source code:\nIf a node fails to come up, the node group (worker pool in that zone) will go into “back-off”, at first 5m, then exponentially longer until the maximum of 30m is reached. The “back-off” is reset after 3 hours. This in turn means, that nodes must be first considered Unknown, which happens when spec.kubernetes.kubeControllerManager.nodeMonitorPeriod.nodeMonitorGracePeriod lapses. Then they must either remain in this state until spec.provider.workers.machineControllerManager.machineHealthTimeout lapses for them to be recreated, which will fail in the unhealthy zone, or spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds lapses for the pods to be evicted (usually faster than node replacements, depending on your configuration), which will trigger the cluster autoscaler to create more capacity, but very likely in the same zone as it tries to balance its node groups at first, which will also fail in the unhealthy zone. It will be considered failed only when maxNodeProvisionTime lapses (usually close to spec.provider.workers.machineControllerManager.machineCreationTimeout) and only then put the node group into “back-off” and not retry for 5m at first and then exponentially longer. It’s critical to keep that in mind and accommodate for it. 
If you already have capacity up and running, the reaction time is usually much faster with leases (whatever you set) or endpoints (spec.kubernetes.kubeControllerManager.nodeMonitorPeriod.nodeMonitorGracePeriod), but if you depend on new/fresh capacity, the above should inform you how long you will have to wait for it.\nOn spec.provider.workers.minimum, maximum, maxSurge, maxUnavailable, zones, and machineControllerManager Each worker pool in Gardener may be configured differently. Among many other settings like machine type, root disk, Kubernetes version, kubelet settings, and many more you can also specify the lower and upper bound for the number of machines (minimum and maximum), how many machines may be added additionally during a rolling update (maxSurge) and how many machines may be in termination/recreation during a rolling update (maxUnavailable), and of course across how many zones the nodes shall be spread (zones).\nInteresting is also the configuration for Gardener’s machine-controller-manager or MCM for short that provisions, monitors, terminates, replaces, or updates machines that back your nodes:\n The shorter machineCreationTimeout is, the faster MCM will retry to create a machine/node, if the process is stuck on cloud provider side. It is set to useful/practical timeouts for the different cloud providers and you probably don’t want to change those (in the context of HA at least). Please align with the cluster autoscaler’s maxNodeProvisionTime. The shorter machineHealthTimeout is, the faster MCM will replace machines/nodes in case the kubelet isn’t reporting back, which translates to Unknown, or reports back with NotReady, or the node-problem-detector that Gardener deploys for you reports a non-recoverable issue/condition (e.g., read-only file system). If it is too short however, you risk node and pod thrashing, so be careful. 
The shorter machineDrainTimeout is, the faster you can get rid of machines/nodes that MCM decided to remove, but this puts a cap on the grace periods and PDBs. They are respected up until the drain timeout lapses - then the machine/node will be forcefully terminated, whether or not the pods are still in termination or not even terminated because of PDBs. Those PDBs will then be violated, so be careful here as well. Please align with the cluster autoscaler’s maxGracefulTerminationSeconds. Especially the last two settings may help you recover faster from cloud provider issues.\nOn spec.systemComponents.coreDNS.autoscaling DNS is critical, in general and also within a Kubernetes cluster. Gardener-managed clusters deploy CoreDNS, a graduated CNCF project. Gardener supports 2 auto-scaling modes for it, horizontal (using HPA based on CPU) and cluster-proportional (using cluster proportional autoscaler that scales the number of pods based on the number of nodes/cores, not to be confused with the cluster autoscaler that scales nodes based on their utilization). Check out the docs, especially the trade-offs why you would choose one over the other (cluster-proportional gives you more configuration options, if CPU-based horizontal scaling is insufficient for your needs). Consider also Gardener’s feature node-local DNS to decouple you further from the DNS pods and stabilize DNS. Again, that’s not strictly related to HA, but may become important during a zone outage, when load patterns shift and pods start to initialize/resolve DNS records more frequently in bulk.\nMore Caveats Unfortunately, there are a few more things of note when it comes to HA in a Kubernetes cluster that may be “surprising” and hard to mitigate:\n If the kubelet restarts, it will report all pods as NotReady on startup until it reruns its probes (#100277), which leads to temporary endpoint and load balancer target removal (#102367). This topic is somewhat controversial. 
Gardener uses rolling updates and a jitter to spread necessary kubelet restarts as well as possible. If a kube-proxy pod on a node turns NotReady, all load balancer traffic to all pods (on this node) under services with externalTrafficPolicy local will cease as the load balancer will then take this node out of serving. This topic is somewhat controversial as well. So, please remember that externalTrafficPolicy local not only has the disadvantage of imbalanced traffic spreading, but also a dependency on the kube-proxy pod that may and will be unavailable during updates. Gardener uses rolling updates to spread necessary kube-proxy updates as well as possible. These are just a few additional considerations. They may or may not affect you, but other intricacies may. It’s a reminder to be watchful as Kubernetes may have one or two relevant quirks that you need to consider (and will probably only find out over time and with extensive testing).\nMeaningful Availability Finally, let’s go back to where we started. We recommended measuring meaningful availability. For instance, in Gardener, we do not trust only internal signals, but track also whether Gardener or the control planes that it manages are externally available through the external DNS records and load balancers, SNI-routing Istio gateways, etc. (the same path all users must take). It’s a huge difference whether the API server’s internal readiness probe passes or the user can actually reach the API server and it does what it’s supposed to do. Most likely, you will be in a similar spot and can do the same.\nWhat you do with these signals is another matter. Maybe there are some actionable metrics and you can trigger some active fail-over, maybe you can only use it to improve your HA setup altogether. 
In our case, we also use it to deploy mitigations, e.g., via our dependency-watchdog that watches, for instance, Gardener-managed API servers and shuts down components like the controller managers to avert cascading knock-off effects (e.g., melt-down if the kubelets cannot reach the API server, but the controller managers can and start taking down nodes and pods).\nEither way, understanding how users perceive your service is key to the improvement process as a whole. Even if you are not struck by a zone outage, the measures above and tracking the meaningful availability will help you improve your service.\nThank you for your interest and we wish you no or a “successful” zone outage next time. 😊\nWant to know more about Gardener? The Gardener project is Open Source and hosted on GitHub.\nFeedback and contributions are always welcome!\nAll channels for getting in touch or learning about the project are listed on our landing page. We are cordially inviting interested parties to join our bi-weekly meetings.\n","categories":"","description":"","excerpt":"Developing highly available workload that can tolerate a zone outage …","ref":"/blog/2023/03-27-high-availability-and-zone-outage-toleration/","tags":"","title":"High Availability and Zone Outage Toleration"},{"body":"Implementing High Availability and Tolerating Zone Outages Developing highly available workload that can tolerate a zone outage is no trivial task. You will find here various recommendations to get closer to that goal. While many recommendations are general enough, the examples are specific in how to achieve this in a Gardener-managed cluster and where/how to tweak the different control plane components. If you do not use Gardener, it may be still a worthwhile read.\nFirst however, what is a zone outage? It sounds like a clear-cut “thing”, but it isn’t. There are many things that can go haywire. 
Here are some examples:\n Elevated cloud provider API error rates for individual or multiple services Network bandwidth reduced or latency increased, usually also affecting storage subsystems as they are network attached No networking at all, no DNS, machines shutting down or restarting, … Functional issues, of either the entire service (e.g. all block device operations) or only parts of it (e.g. LB listener registration) All services down, temporarily or permanently (the proverbial burning down data center 🔥) This and everything in between make it hard to prepare for such events, but you can still do a lot. The most important recommendation is to not target specific issues exclusively - tomorrow another service will fail in an unanticipated way. Also, focus more on meaningful availability than on internal signals (useful, but not as relevant as the former). Always prefer automation over manual intervention (e.g. leader election is a pretty robust mechanism, auto-scaling may be required as well, etc.).\nAlso remember that HA is costly - you need to balance it against the cost of an outage as silly as this may sound, e.g. running all this excess capacity “just in case” vs. “going down” vs. a risk-based approach in between where you have means that will kick in, but they are not guaranteed to work (e.g. if the cloud provider is out of resource capacity). 
Maybe some of your components must run at the highest possible availability level, but others not - that’s a decision only you can make.\nControl Plane The Kubernetes cluster control plane is managed by Gardener (as pods in separate infrastructure clusters to which you have no direct access) and can be set up with no failure tolerance (control plane pods will be recreated best-effort when resources are available) or one of the failure tolerance types node or zone.\nStrictly speaking, static workload does not depend on the (high) availability of the control plane, but static workload doesn’t rhyme with Cloud and Kubernetes and also means, that when you possibly need it the most, e.g. during a zone outage, critical self-healing or auto-scaling functionality won’t be available to you and your workload, if your control plane is down as well. That’s why, even though the resource consumption is significantly higher, we generally recommend to use the failure tolerance type zone for the control planes of productive clusters, at least in all regions that have 3+ zones. Regions that have only 1 or 2 zones don’t support the failure tolerance type zone and then your second best option is the failure tolerance type node, which means a zone outage can still take down your control plane, but individual node outages won’t.\nIn the shoot resource it’s merely only this what you need to add:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: zone # valid values are `node` and `zone` (only available if your control plane resides in a region with 3+ zones) This setting will scale out all control plane components for a Gardener cluster as necessary, so that no single zone outage can take down the control plane for longer than just a few seconds for the fail-over to take place (e.g. lease expiration and new leader election or readiness probe failure and endpoint removal). 
Components run highly available in either active-active (servers) or active-passive (controllers) mode at all times, the persistence (ETCD), which is consensus-based, will tolerate the loss of one zone and still maintain quorum and therefore remain operational. These are all patterns that we will revisit down below also for your own workload.\nWorker Pools Now that you have configured your Kubernetes cluster control plane in HA, i.e. spread it across multiple zones, you need to do the same for your own workload, but in order to do so, you need to spread your nodes across multiple zones first.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: provider: workers: - name: ... minimum: 6 maximum: 60 zones: - ... Prefer regions with at least 2, better 3+ zones and list the zones in the zones section for each of your worker pools. Whether you need 2 or 3 zones at a minimum depends on your fail-over concept:\n Consensus-based software components (like ETCD) depend on maintaining a quorum of (n/2)+1, so you need at least 3 zones to tolerate the outage of 1 zone. Primary/Secondary-based software components need just 2 zones to tolerate the outage of 1 zone. Then there are software components that can scale out horizontally. They are probably fine with 2 zones, but you also need to think about the load-shift and that the remaining zone must then pick up the work of the unhealthy zone. With 2 zones, the remaining zone must cope with an increase of 100% load. With 3 zones, the remaining zones must only cope with an increase of 50% load (per zone). In general, the question is also whether you have the fail-over capacity already up and running or not. If not, i.e. you depend on re-scheduling to a healthy zone or auto-scaling, be aware that during a zone outage, you will see a resource crunch in the healthy zones. If you have no automation, i.e. only human operators (a.k.a. 
“red button approach”), you probably will not get the machines you need and even with automation, it may be tricky. But holding the capacity available at all times is costly. In the end, that’s a decision only you can make. If you made that decision, please adapt the minimum, maximum, maxSurge and maxUnavailable settings for your worker pools accordingly (visit this section for more information).\nAlso, consider fall-back worker pools (with different/alternative machine types) and cluster autoscaler expanders using a priority-based strategy.\nGardener-managed clusters deploy the cluster autoscaler or CA for short and you can tweak the general CA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: clusterAutoscaler: expander: \"least-waste\" scanInterval: 10s scaleDownDelayAfterAdd: 60m scaleDownDelayAfterDelete: 0s scaleDownDelayAfterFailure: 3m scaleDownUnneededTime: 30m scaleDownUtilizationThreshold: 0.5 If you want to be ready for a sudden spike or have some buffer in general, over-provision nodes by means of “placeholder” pods with low priority and appropriate resource requests. This way, they will demand nodes to be provisioned for them, but if any pod comes up with a regular/higher priority, the low priority pods will be evicted to make space for the more important ones. Strictly speaking, this is not related to HA, but it may be important to keep this in mind as you generally want critical components to be rescheduled as fast as possible and if there is no node available, it may take 3 minutes or longer to do so (depending on the cloud provider). Besides, not only zones can fail, but also individual nodes.\nReplicas (Horizontal Scaling) Now let’s talk about your workload. In most cases, this will mean to run multiple replicas. If you cannot do that (a.k.a. you have a singleton), that’s a bad situation to be in. Maybe you can run a spare (secondary) as backup? 
If you cannot, you depend on quick detection and rescheduling of your singleton (more on that below).\nObviously, things get messier with persistence. If you have persistence, you should ideally replicate your data, i.e. let your spare (secondary) “follow” your main (primary). If your software doesn’t support that, you have to deploy other means, e.g. volume snapshotting or side-backups (specific to the software you deploy; keep the backups regional, so that you can switch to another zone at all times). If you have to do those, your HA scenario becomes more a DR scenario and terms like RPO and RTO become relevant to you:\n Recovery Point Objective (RPO): Potential data loss, i.e. how much data will you lose at most (time between backups) Recovery Time Objective (RTO): Time until recovery, i.e. how long does it take you to be operational again (time to restore) Also, keep in mind that your persistent volumes are usually zonal, i.e. once you have a volume in one zone, it’s bound to that zone and you cannot get up your pod in another zone w/o first recreating the volume yourself (Kubernetes won’t help you here directly).\nAnyway, best avoid that, if you can (from technical and cost perspective). The best solution (and also the most costly one) is to run multiple replicas in multiple zones and keep your data replicated at all times, so that your RPO is always 0 (best). That’s what we do for Gardener-managed cluster HA control planes (ETCD) as any data loss may be disastrous and lead to orphaned resources (in addition, we deploy side cars that do side-backups for disaster recovery, with full and incremental snapshots with an RPO of 5m).\nSo, how to run with multiple replicas? That’s the easiest part in Kubernetes and the two most important resources, Deployments and StatefulSet, support that out of the box:\napiVersion: apps/v1 kind: Deployment | StatefulSet spec: replicas: ... The problem comes with the number of replicas. It’s easy only if the number is static, e.g. 
2 for active-active/passive or 3 for consensus-based software components, but what about software components that can scale out horizontally? Here you usually do not set the number of replicas statically, but make use of the horizontal pod autoscaler or HPA for short (built-in; part of the kube-controller-manager). There are also other options like the cluster proportional autoscaler, but while the former works based on metrics, the latter is more a guesstimate approach that derives the number of replicas from the number of nodes/cores in a cluster. Sometimes useful, but often blind to the actual demand.\nSo, HPA it is then for most of the cases. However, what is the resource (e.g. CPU or memory) that drives the number of desired replicas? Again, this is up to you, but CPU or memory are not always the best choices. In some cases, custom metrics may be more appropriate, e.g. requests per second (as was the case for us).\nYou will have to create specific HorizontalPodAutoscaler resources for your scale target and can tweak the general HPA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: kubeControllerManager: horizontalPodAutoscaler: syncPeriod: 15s tolerance: 0.1 downscaleStabilization: 5m0s initialReadinessDelay: 30s cpuInitializationPeriod: 5m0s Resources (Vertical Scaling) While it is important to set a sufficient number of replicas, it is also important to give the pods sufficient resources (CPU and memory). This is especially true when you think about HA. When a zone goes down, you might need to bring up replacement pods, if you don’t have them running already to take over the load from the impacted zone. Likewise, e.g. with active-active software components, you can expect the remaining pods to receive more load. If you cannot scale them out horizontally to serve the load, you will probably need to scale them out (or rather up) vertically. 
This is done by the vertical pod autoscaler or VPA for short (not built-in; part of the kubernetes/autoscaler repository).\nA few caveats though:\n You cannot use HPA and VPA on the same metrics as they would influence each other, which would lead to pod thrashing (more replicas require fewer resources; fewer resources require more replicas) Scaling horizontally doesn’t cause downtimes (at least not when out-scaling and only one replica is affected when in-scaling), but scaling vertically does (if the pod runs OOM anyway, but also when new recommendations are applied, resource requests for existing pods may be changed, which causes the pods to be rescheduled). Although the discussion has been going on for a very long time now, that is still not supported in-place yet (see KEP 1287, implementation in Kubernetes, implementation in VPA). VPA is a useful tool and Gardener-managed clusters deploy a VPA by default for you (HPA is supported anyway as it’s built into the kube-controller-manager). You will have to create specific VerticalPodAutoscaler resources for your scale target and can tweak the general VPA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: verticalPodAutoscaler: enabled: true evictAfterOOMThreshold: 10m0s evictionRateBurst: 1 evictionRateLimit: -1 evictionTolerance: 0.5 recommendationMarginFraction: 0.15 updaterInterval: 1m0s recommenderInterval: 1m0s While horizontal pod autoscaling is relatively straight-forward, it takes a long time to master vertical pod autoscaling. 
We saw performance issues, hard-coded behavior (on OOM, memory is bumped by +20% and it may take a few iterations to reach a good level), unintended pod disruptions by applying new resource requests (after 12h all targeted pods will receive new requests even though individually they would be fine without, which also drives active-passive resource consumption up), difficulties to deal with spiky workload in general (due to the algorithmic approach it takes), recommended requests may exceed node capacity, limit scaling is proportional and therefore often questionable, and more. VPA is a double-edged sword: useful and necessary, but not easy to handle.\nFor the Gardener-managed components, we mostly removed limits. Why?\n CPU limits have almost always only downsides. They cause needless CPU throttling, which is not even easily visible. CPU requests turn into cpu shares, so if the node has capacity, the pod may consume the freely available CPU, but not if you have set limits, which curtail the pod by means of cpu quota. There are only certain scenarios in which they may make sense, e.g. if you set requests=limits and thereby define a pod with guaranteed QoS, which influences your cgroup placement. However, that is difficult to do for the components you implement yourself and practically impossible for the components you just consume, because what’s the correct value for requests/limits and will it hold true also if the load increases and what happens if a zone goes down or with the next update/version of this component? If anything, CPU limits caused outages, not helped prevent them. As for memory limits, they are slightly more useful, because CPU is compressible and memory is not, so if one pod runs berserk, it may take others down (with CPU, cpu shares make it as fair as possible), depending on which OOM killer strikes (a complicated topic by itself). You don’t want the operating system OOM killer to strike as the result is unpredictable. 
It is better if the cgroup OOM killer strikes, or even the kubelet’s eviction if the consumption is slow enough, as the latter even takes priorities into consideration. If your component is critical and a singleton (e.g. node daemon set pods), you are better off also without memory limits, because letting the pod go OOM because of artificial/wrong memory limits can mean that the node becomes unusable. Hence, such components are better run with no or only a very high memory limit, so that you can still catch the occasional memory leak (bug) eventually. Under normal operation, if you cannot decide on a true upper limit, it is better to have no limits than to cause endless outages through them, especially when you need the pods the most (during a zone outage) and all your assumptions have gone out of the window. The downside of having poor or no limits and poor or no requests is that nodes may “die” more often. Contrary to the expectation, even for managed services, the managed service is not responsible for and cannot guarantee the health of a node under all circumstances, since the end user defines what is run on the nodes (shared responsibility). If the workload exhausts any resource, it will be the end of the node, e.g. by compressing the CPU too much (so that the kubelet fails to do its work), exhausting the main memory too fast, disk space, file handles, or any other resource.\nThe kubelet allows for explicit reservation of resources for operating system daemons (system-reserved) and Kubernetes daemons (kube-reserved) that are subtracted from the actual node resources and become the allocatable node resources for your workload/pods. All managed services configure these settings “by rule of thumb” (a balancing act), but cannot guarantee that the values won’t waste resources or always will be sufficient. You will have to fine-tune them eventually and adapt them to your needs. 
In addition, you can configure soft and hard eviction thresholds to give the kubelet some headroom to evict “greedy” pods in a controlled way. These settings can be configured for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: kubelet: kubeReserved: # explicit resource reservation for Kubernetes daemons cpu: 100m memory: 1Gi ephemeralStorage: 1Gi pid: 1000 evictionSoft: # soft, i.e. graceful eviction (used if the node is about to run out of resources, avoiding hard evictions) memoryAvailable: 200Mi imageFSAvailable: 10% imageFSInodesFree: 10% nodeFSAvailable: 10% nodeFSInodesFree: 10% evictionSoftGracePeriod: # caps pod's `terminationGracePeriodSeconds` value during soft evictions (specific grace periods) memoryAvailable: 1m30s imageFSAvailable: 1m30s imageFSInodesFree: 1m30s nodeFSAvailable: 1m30s nodeFSInodesFree: 1m30s evictionHard: # hard, i.e. immediate eviction (used if the node is out of resources, to keep the OS from running out of resources and failing processes indiscriminately) memoryAvailable: 100Mi imageFSAvailable: 5% imageFSInodesFree: 5% nodeFSAvailable: 5% nodeFSInodesFree: 5% evictionMinimumReclaim: # additional resources to reclaim after hitting the hard eviction thresholds to not hit the same thresholds soon after again memoryAvailable: 0Mi imageFSAvailable: 0Mi imageFSInodesFree: 0Mi nodeFSAvailable: 0Mi nodeFSInodesFree: 0Mi evictionMaxPodGracePeriod: 90 # caps pod's `terminationGracePeriodSeconds` value during soft evictions (general grace periods) evictionPressureTransitionPeriod: 5m0s # stabilization time window to avoid flapping of node eviction state You can tweak these settings also individually per worker pool (spec.provider.workers.kubernetes.kubelet...), which makes sense especially with different machine types (and also workload that you may want to schedule there).\nPhysical memory is not compressible, but you can overcome this issue to some degree (alpha since Kubernetes v1.22 
in combination with the feature gate NodeSwap on the kubelet) with swap memory. You can read more in this introductory blog and the docs. If you choose to use it (still only alpha at the time of this writing), you may also want to consider the risks associated with swap memory:\n Reduced performance predictability Reduced performance up to page thrashing Reduced security as secrets, normally held only in memory, could be swapped out to disk That said, the various options mentioned above are only remotely related to HA and will not be further explored throughout this document, but just to remind you: if a zone goes down, load patterns will shift, existing pods will probably receive more load and will require more resources (especially because it is often practically impossible to set “proper” resource requests, which drive node allocation - limits are always ignored by the scheduler) or more pods will/must be placed on the existing and/or new nodes and then these settings, which are generally critical (especially if you switch on bin-packing for Gardener-managed clusters as a cost saving measure), will become even more critical during a zone outage.\nProbes Before we go down the rabbit hole even further and talk about how to spread your replicas, we need to talk about probes first, as they will become relevant later. Kubernetes supports three kinds of probes: startup, liveness, and readiness probes. If you are a visual thinker, also check out this slide deck by Tim Hockin (Kubernetes networking SIG chair).\nBasically, the startupProbe and the livenessProbe help you restart the container, if it’s unhealthy for whatever reason, by letting the kubelet that orchestrates your containers on a node know that it’s unhealthy. The former is a special case of the latter and only applied at the startup of your container, if you need to handle the startup phase differently (e.g. 
with very slow starting containers) from the rest of the lifetime of the container.\nNow, the readinessProbe helps you manage the ready status of your container and thereby pod (any container that is not ready turns the pod not ready). This again has impact on endpoints and pod disruption budgets:\n If the pod is not ready, the endpoint will be removed and the pod will not receive traffic anymore If the pod is not ready, the pod counts into the pod disruption budget and if the budget is exceeded, no further voluntary pod disruptions will be permitted for the remaining ready pods (e.g. no eviction, no voluntary horizontal or vertical scaling, if the pod runs on a node that is about to be drained or in draining, draining will be paused until the max drain timeout passes) As you can see, all of these probes are (also) related to HA (mostly the readinessProbe, but depending on your workload, you can also leverage livenessProbe and startupProbe into your HA strategy). If Kubernetes doesn’t know about the individual status of your container/pod, it won’t do anything for you (right away). That said, later/indirectly something might/will happen via the node status that can also be ready or not ready, which influences the pods and load balancer listener registration (a not ready node will not receive cluster traffic anymore), but this process is worker pool global and reacts delayed and also doesn’t discriminate between the containers/pods on a node.\nIn addition, Kubernetes also offers pod readiness gates to amend your pod readiness with additional custom conditions (normally, only the sum of the container readiness matters, but pod readiness gates additionally count into the overall pod readiness). 
This may be useful if you want to block (by means of pod disruption budgets that we will talk about next) the roll-out of your workload/nodes in case some (possibly external) condition fails.\nPod Disruption Budgets One of the most important resources that help you on your way to HA are pod disruption budgets or PDB for short. They tell Kubernetes how to deal with voluntary pod disruptions, e.g. during the deployment of your workload, when the nodes are rolled, or just in general when a pod shall be evicted/terminated. Basically, if the budget is reached, they block all voluntary pod disruptions (at least for a while until possibly other timeouts act or things happen that leave Kubernetes no choice anymore, e.g. the node is forcefully terminated). You should always define them for your workload.\nVery important to note is that they are based on the readinessProbe, i.e. even if all of your replicas are lively, but not enough of them are ready, this blocks voluntary pod disruptions, so they are very critical and useful. Here an example (you can specify either minAvailable or maxUnavailable in absolute numbers or as percentage):\napiVersion: policy/v1 kind: PodDisruptionBudget spec: maxUnavailable: 1 selector: matchLabels: ... And please do not specify a PDB of maxUnavailable being 0 or similar. That’s pointless, even detrimental, as it blocks then even useful operations, forces always the hard timeouts that are less graceful and it doesn’t make sense in the context of HA. You cannot “force” HA by preventing voluntary pod disruptions, you must work with the pod disruptions in a resilient way. Besides, PDBs are really only about voluntary pod disruptions - something bad can happen to a node/pod at any time and PDBs won’t make this reality go away for you.\nPDBs will not always work as expected and can also get in your way, e.g. if the PDB is violated or would be violated, it may possibly block whatever you are trying to do to salvage the situation, e.g. 
drain a node or deploy a patch version (if the PDB is or would be violated, not even unhealthy pods would be evicted as they could theoretically become healthy again, which Kubernetes doesn’t know). In order to overcome this issue, it is now possible (alpha since Kubernetes v1.26 in combination with the feature gate PDBUnhealthyPodEvictionPolicy on the API server, beta and enabled by default since Kubernetes v1.27) to configure the so-called unhealthy pod eviction policy. The default is still IfHealthyBudget as a change in default would have changed the behavior (as described above), but you can now also set AlwaysAllow at the PDB (spec.unhealthyPodEvictionPolicy). For more information, please check out this discussion, the PR and this document and balance the pros and cons for yourself. In short, the new AlwaysAllow option is probably the better choice in most of the cases while IfHealthyBudget is useful only if you have frequent temporary transitions or for special cases where you have already implemented controllers that depend on the old behavior.\nPod Topology Spread Constraints Pod topology spread constraints or PTSC for short (no official abbreviation exists, but we will use this in the following) are enormously helpful to distribute your replicas across multiple zones, nodes, or any other user-defined topology domain. They complement and improve on pod (anti-)affinities that still exist and can be used in combination.\nPTSCs are an improvement, because they allow for maxSkew and minDomains. You can steer the “level of tolerated imbalance” with maxSkew, e.g. you probably want that to be at least 1, so that you can perform a rolling update, but this all depends on your deployment (maxUnavailable and maxSurge), etc. Stateful sets are a bit different (maxUnavailable) as they are bound to volumes and depend on them, so there usually cannot be 2 pods requiring the same volume. minDomains is a hint to tell the scheduler how far to spread, e.g. 
if all nodes in one zone disappeared because of a zone outage, it may “appear” as if there are only 2 zones in a 3 zones cluster and the scheduling decisions may end up wrong, so a minDomains of 3 will tell the scheduler to spread to 3 zones before adding another replica in one zone. Be careful with this setting as it also means, if one zone is down the “spread” is already at least 1, if pods run in the other zones. This is useful where you have exactly as many replicas as you have zones and you do not want any imbalance. Imbalance is critical as if you end up with one, nobody is going to do the (active) re-balancing for you (unless you deploy and configure additional non-standard components such as the descheduler). So, for instance, if you have something like a DBMS that you want to spread across 2 zones (active-passive) or 3 zones (consensus-based), you better specify minDomains of 2 respectively 3 to force your replicas into at least that many zones before adding more replicas to another zone (if supported).\nAnyway, PTSCs are critical to have, but not perfect, so we saw (unsurprisingly, because that’s how the scheduler works), that the scheduler may block the deployment of new pods because it takes the decision pod-by-pod (see for instance #109364).\nPod Affinities and Anti-Affinities As said, you can combine PTSCs with pod affinities and/or anti-affinities. Especially inter-pod (anti-)affinities may be helpful to place pods apart, e.g. because they are fall-backs for each other or you do not want multiple potentially resource-hungry “best-effort” or “burstable” pods side-by-side (noisy neighbor problem), or together, e.g. because they form a unit and you want to reduce the failure domain, reduce the network latency, and reduce the costs.\nTopology Aware Hints While topology aware hints are not directly related to HA, they are very relevant in the HA context. 
Spreading your workload across multiple zones may increase network latency and cost significantly, if the traffic is not shaped. Topology aware hints (beta since Kubernetes v1.23, replacing the now deprecated topology aware traffic routing with topology keys) help to route the traffic within the originating zone, if possible. Basically, they tell kube-proxy how to setup your routing information, so that clients can talk to endpoints that are located within the same zone.\nBe aware however, that there are some limitations. Those are called safeguards and if they strike, the hints are off and traffic is routed again randomly. Especially controversial is the balancing limitation as there is the assumption, that the load that hits an endpoint is determined by the allocatable CPUs in that topology zone, but that’s not always, if even often, the case (see for instance #113731 and #110714). So, this limitation hits far too often and your hints are off, but then again, it’s about network latency and cost optimization first, so it’s better than nothing.\nNetworking We have talked about networking only to some small degree so far (readiness probes, pod disruption budgets, topology aware hints). The most important component is probably your ingress load balancer - everything else is managed by Kubernetes. AWS, Azure, GCP, and also OpenStack offer multi-zonal load balancers, so make use of them. 
In Azure and GCP, LBs are regional whereas in AWS and OpenStack, they need to be bound to a zone, which the cloud-controller-manager does by observing the zone labels at the nodes (please note that this behavior is not always working as expected, see #570 where the AWS cloud-controller-manager is not readjusting to newly observed zones).\nPlease be reminded that even if you use a service mesh like Istio, the off-the-shelf installation/configuration usually never comes with productive settings (to simplify first-time installation and improve first-time user experience) and you will have to fine-tune your installation/configuration, much like the rest of your workload.\nRelevant Cluster Settings Following now a summary/list of the more relevant settings you may like to tune for Gardener-managed clusters:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: zone # valid values are `node` and `zone` (only available if your control plane resides in a region with 3+ zones) kubernetes: kubeAPIServer: defaultNotReadyTolerationSeconds: 300 defaultUnreachableTolerationSeconds: 300 kubelet: ... kubeScheduler: featureGates: MinDomainsInPodTopologySpread: true kubeControllerManager: nodeMonitorGracePeriod: 40s horizontalPodAutoscaler: syncPeriod: 15s tolerance: 0.1 downscaleStabilization: 5m0s initialReadinessDelay: 30s cpuInitializationPeriod: 5m0s verticalPodAutoscaler: enabled: true evictAfterOOMThreshold: 10m0s evictionRateBurst: 1 evictionRateLimit: -1 evictionTolerance: 0.5 recommendationMarginFraction: 0.15 updaterInterval: 1m0s recommenderInterval: 1m0s clusterAutoscaler: expander: \"least-waste\" scanInterval: 10s scaleDownDelayAfterAdd: 60m scaleDownDelayAfterDelete: 0s scaleDownDelayAfterFailure: 3m scaleDownUnneededTime: 30m scaleDownUtilizationThreshold: 0.5 provider: workers: - name: ... minimum: 6 maximum: 60 maxSurge: 3 maxUnavailable: 0 zones: - ... 
# list of zones you want your worker pool nodes to be spread across, see above kubernetes: kubelet: ... # similar to `kubelet` above (cluster-wide settings), but here per worker pool (pool-specific settings), see above machineControllerManager: # optional, it allows to configure the machine-controller settings. machineCreationTimeout: 20m machineHealthTimeout: 10m machineDrainTimeout: 60h systemComponents: coreDNS: autoscaling: mode: horizontal # valid values are `horizontal` (driven by CPU load) and `cluster-proportional` (driven by number of nodes/cores) On spec.controlPlane.highAvailability.failureTolerance.type If set, determines the degree of failure tolerance for your control plane. zone is preferred, but only available if your control plane resides in a region with 3+ zones. See above and the docs.\nOn spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds and defaultNotReadyTolerationSeconds This is a very interesting API server setting that lets Kubernetes decide how fast to evict pods from nodes whose status condition of type Ready is either Unknown (node status unknown, a.k.a unreachable) or False (kubelet not ready) (see node status conditions; please note that kubectl shows both values as NotReady which is a somewhat “simplified” visualization).\nYou can also override the cluster-wide API server settings individually per pod:\nspec: tolerations: - key: \"node.kubernetes.io/unreachable\" operator: \"Exists\" effect: \"NoExecute\" tolerationSeconds: 0 - key: \"node.kubernetes.io/not-ready\" operator: \"Exists\" effect: \"NoExecute\" tolerationSeconds: 0 This will evict pods on unreachable or not-ready nodes immediately, but be cautious: 0 is very aggressive and may lead to unnecessary disruptions. Again, you must decide for your own workload and balance out the pros and cons (e.g. 
long startup time).\nPlease note, these settings replace spec.kubernetes.kubeControllerManager.podEvictionTimeout that was deprecated with Kubernetes v1.26 (and acted as an upper bound).\nOn spec.kubernetes.kubeScheduler.featureGates.MinDomainsInPodTopologySpread Required to be enabled for minDomains to work with PTSCs (beta since Kubernetes v1.25, but off by default). See above and the docs. This tells the scheduler how many topology domains to expect (=zones in the context of this document).\nOn spec.kubernetes.kubeControllerManager.nodeMonitorGracePeriod This is another very interesting kube-controller-manager setting that can help you speed up or slow down how fast a node shall be considered Unknown (node status unknown, a.k.a unreachable) when the kubelet is not updating its status anymore (see node status conditions), which affects eviction (see spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds and defaultNotReadyTolerationSeconds above). The shorter the time window, the faster Kubernetes will act, but the higher the chance of flapping behavior and pod thrashing, so you may want to balance that out according to your needs, otherwise stick to the default which is a reasonable compromise.\nOn spec.kubernetes.kubeControllerManager.horizontalPodAutoscaler... This configures horizontal pod autoscaling in Gardener-managed clusters. See above and the docs for the detailed fields.\nOn spec.kubernetes.verticalPodAutoscaler... This configures vertical pod autoscaling in Gardener-managed clusters. See above and the docs for the detailed fields.\nOn spec.kubernetes.clusterAutoscaler... This configures node auto-scaling in Gardener-managed clusters. 
See above and the docs for the detailed fields, especially about expanders, which may become life-saving in case of a zone outage when a resource crunch is setting in and everybody rushes to get machines in the healthy zones.\nIn case of a zone outage, it is critical to understand how the cluster autoscaler will put a worker pool in one zone into “back-off” and what the consequences for your workload will be. Unfortunately, the official cluster autoscaler documentation does not explain these details, but you can find hints in the source code:\nIf a node fails to come up, the node group (worker pool in that zone) will go into “back-off”, at first 5m, then exponentially longer until the maximum of 30m is reached. The “back-off” is reset after 3 hours. This in turn means, that nodes must be first considered Unknown, which happens when spec.kubernetes.kubeControllerManager.nodeMonitorGracePeriod lapses (e.g. at the beginning of a zone outage). Then they must either remain in this state until spec.provider.workers.machineControllerManager.machineHealthTimeout lapses for them to be recreated, which will fail in the unhealthy zone, or spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds lapses for the pods to be evicted (usually faster than node replacements, depending on your configuration), which will trigger the cluster autoscaler to create more capacity, but very likely in the same zone as it tries to balance its node groups at first, which will fail in the unhealthy zone. It will be considered failed only when maxNodeProvisionTime lapses (usually close to spec.provider.workers.machineControllerManager.machineCreationTimeout) and only then put the node group into “back-off” and not retry for 5m (at first and then exponentially longer). 
Only then you can expect new node capacity to be brought up somewhere else.\nDuring the time of ongoing node provisioning (before a node group goes into “back-off”), the cluster autoscaler may have “virtually scheduled” pending pods onto those new upcoming nodes and will not reevaluate these pods anymore unless the node provisioning fails (which will fail during a zone outage, but the cluster autoscaler cannot know that and will therefore reevaluate its decision only after it has given up on the new nodes).\nIt’s critical to keep that in mind and accommodate for it. If you have already capacity up and running, the reaction time is usually much faster with leases (whatever you set) or endpoints (spec.kubernetes.kubeControllerManager.nodeMonitorGracePeriod), but if you depend on new/fresh capacity, the above should inform you how long you will have to wait for it and for how long pods might be pending (because capacity is generally missing and pending pods may have been “virtually scheduled” to new nodes that won’t come up until the node group goes eventually into “back-off” and nodes in the healthy zones come up).\nOn spec.provider.workers.minimum, maximum, maxSurge, maxUnavailable, zones, and machineControllerManager Each worker pool in Gardener may be configured differently. Among many other settings like machine type, root disk, Kubernetes version, kubelet settings, and many more you can also specify the lower and upper bound for the number of machines (minimum and maximum), how many machines may be added additionally during a rolling update (maxSurge) and how many machines may be in termination/recreation during a rolling update (maxUnavailable), and of course across how many zones the nodes shall be spread (zones).\nGardener divides minimum, maximum, maxSurge, maxUnavailable values by the number of zones specified for this worker pool. This fact must be considered when you plan the sizing of your worker pools.\nExample:\n provider: workers: - name: ... 
minimum: 6 maximum: 60 maxSurge: 3 maxUnavailable: 0 zones: [\"a\", \"b\", \"c\"] The resulting MachineDeployments per zone will get minimum: 2, maximum: 20, maxSurge: 1, maxUnavailable: 0. If another zone is added, all values will be divided by 4, resulting in: Fewer workers per zone. ⚠️ One MachineDeployment with maxSurge: 0, i.e. there will be a replacement of nodes without rolling updates. Also interesting is the configuration for Gardener’s machine-controller-manager (MCM for short), which provisions, monitors, terminates, replaces, or updates the machines that back your nodes:\n The shorter machineCreationTimeout is, the faster MCM will retry creating a machine/node if the process is stuck on the cloud provider side. It is set to useful/practical timeouts for the different cloud providers and you probably don’t want to change those (in the context of HA at least). Please align with the cluster autoscaler’s maxNodeProvisionTime. The shorter machineHealthTimeout is, the faster MCM will replace machines/nodes in case the kubelet isn’t reporting back, which translates to Unknown, or reports back with NotReady, or the node-problem-detector that Gardener deploys for you reports a non-recoverable issue/condition (e.g. read-only file system). If it is too short, however, you risk node and pod thrashing, so be careful. The shorter machineDrainTimeout is, the faster you can get rid of machines/nodes that MCM decided to remove, but this puts a cap on the grace periods and PDBs. They are respected up until the drain timeout lapses - then the machine/node will be forcefully terminated, whether or not the pods are still in termination or not even terminated because of PDBs. Those PDBs will then be violated, so be careful here as well. Please align with the cluster autoscaler’s maxGracefulTerminationSeconds. 
Especially the last two settings may help you recover faster from cloud provider issues.\nOn spec.systemComponents.coreDNS.autoscaling DNS is critical, in general and also within a Kubernetes cluster. Gardener-managed clusters deploy CoreDNS, a graduated CNCF project. Gardener supports 2 auto-scaling modes for it, horizontal (using HPA based on CPU) and cluster-proportional (using cluster proportional autoscaler that scales the number of pods based on the number of nodes/cores, not to be confused with the cluster autoscaler that scales nodes based on their utilization). Check out the docs, especially the trade-offs that explain why you would choose one over the other (cluster-proportional gives you more configuration options, if CPU-based horizontal scaling is insufficient for your needs). Consider also Gardener’s feature node-local DNS to decouple you further from the DNS pods and stabilize DNS. Again, that’s not strictly related to HA, but may become important during a zone outage, when load patterns shift and pods start to initialize/resolve DNS records more frequently in bulk.\nMore Caveats Unfortunately, there are a few more things of note when it comes to HA in a Kubernetes cluster that may be “surprising” and hard to mitigate:\n If the kubelet restarts, it will report all pods as NotReady on startup until it reruns its probes (#100277), which leads to temporary endpoint and load balancer target removal (#102367). This topic is somewhat controversial. Gardener uses rolling updates and a jitter to spread necessary kubelet restarts as well as possible. If a kube-proxy pod on a node turns NotReady, all load balancer traffic to all pods (on this node) under services with externalTrafficPolicy local will cease as the load balancer will then take this node out of serving. This topic is somewhat controversial as well. 
So, please remember that externalTrafficPolicy local not only has the disadvantage of imbalanced traffic spreading, but also a dependency on the kube-proxy pod that may and will be unavailable during updates. Gardener uses rolling updates to spread necessary kube-proxy updates as well as possible. These are just a few additional considerations. They may or may not affect you, but other intricacies may. It’s a reminder to be watchful as Kubernetes may have one or two relevant quirks that you need to consider (and will probably only find out over time and with extensive testing).\nMeaningful Availability Finally, let’s go back to where we started. We recommended measuring meaningful availability. For instance, in Gardener, we do not rely on internal signals alone, but also track whether Gardener or the control planes that it manages are externally available through the external DNS records and load balancers, SNI-routing Istio gateways, etc. (the same path all users must take). It’s a huge difference whether the API server’s internal readiness probe passes or the user can actually reach the API server and it does what it’s supposed to do. Most likely, you will be in a similar spot and can do the same.\nWhat you do with these signals is another matter. Maybe there are some actionable metrics and you can trigger some active fail-over, maybe you can only use it to improve your HA setup altogether. In our case, we also use it to deploy mitigations, e.g. via our dependency-watchdog that watches, for instance, Gardener-managed API servers and shuts down components like the controller managers to avert cascading knock-on effects (e.g. melt-down if the kubelets cannot reach the API server, but the controller managers can and start taking down nodes and pods).\nEither way, understanding how users perceive your service is key to the improvement process as a whole. 
Even if you are not struck by a zone outage, the measures above and tracking the meaningful availability will help you improve your service.\nThank you for your interest.\n","categories":"","description":"","excerpt":"Implementing High Availability and Tolerating Zone Outages Developing …","ref":"/docs/guides/high-availability/best-practices/","tags":"","title":"Best Practices"},{"body":"Overview Gardener provides chaostoolkit modules to simulate compute and network outages for various cloud providers such as AWS, Azure, GCP, OpenStack/Converged Cloud, and VMware vSphere, as well as pod disruptions for any Kubernetes cluster.\nThe API, parameterization, and implementation are as homogeneous as possible across the different cloud providers, which keeps your effort minimal. As a Gardener user, you benefit from an additional garden module that leverages the generic modules, but exposes their functionality in the simplest, most homogeneous, and most secure way (no need to specify cloud provider credentials, cluster credentials, or filters explicitly; retrieves credentials and stores them in memory only).\nInstallation The name of the package is chaosgarden and it was developed and tested with Python 3.9+. It’s being published to PyPI, so that you can comfortably install it via Python’s package installer pip (you may want to create a virtual environment before installing it):\npip install chaosgarden ℹ️ If you want to use the VMware vSphere module, please note the remarks in requirements.txt for vSphere. 
Those are not contained in the published PyPI package.\nThe package can be used directly from Python scripts and supports this usage scenario with additional convenience that helps launch actions and probes in background (more on actions and probes later), so that you can also compose complex scenarios with ease.\nIf this technology is new to you, you will probably prefer the chaostoolkit CLI in combination with experiment files, so we need to install the CLI next:\npip install chaostoolkit Please verify that it was installed properly by running:\nchaos --help Usage ℹ️ We assume you are using Gardener and run Gardener-managed shoot clusters. You can also use the generic cloud provider and Kubernetes chaosgarden modules, but configuration and secrets will then differ. Please see the module docs for details.\nA Simple Experiment The most important command is the run command, but before we can use it, we need to compile an experiment file first. Let’s start with a simple one, invoking only a read-only 📖 action from chaosgarden that lists cloud provider machines and networks (depends on cloud provider) for the “first” zone of one of your shoot clusters.\nLet’s assume your project is called my-project and your shoot is called my-shoot, then we need to create the following experiment:\n{ \"title\": \"assess-filters-impact\", \"description\": \"assess-filters-impact\", \"method\": [ { \"type\": \"action\", \"name\": \"assess-filters-impact\", \"provider\": { \"type\": \"python\", \"module\": \"chaosgarden.garden.actions\", \"func\": \"assess_cloud_provider_filters_impact\", \"arguments\": { \"zone\": 0 } } } ], \"configuration\": { \"garden_project\": \"my-project\", \"garden_shoot\": \"my-shoot\" } } We are not there yet; there is one more thing to do before we can run it: We need to “target” the Gardener landscape or, more precisely, the Gardener API server where you have created your shoot cluster (not to be confused with your shoot cluster API server). 
If you do not know what this is or how to download the Gardener API server kubeconfig, please follow these instructions. You can either download your personal credentials or project credentials (see creation of a serviceaccount) to interact with Gardener. For now (fastest and most convenient way, but generally not recommended), let’s use your personal credentials, but if you later plan to automate your experiments, please use proper project credentials (a serviceaccount is not bound to your person, but to the project, and can be restricted using RBAC roles and role bindings, which is why we recommend this for production).\nTo download your personal credentials, open the Gardener Dashboard and click on your avatar in the upper right corner of the page. Click “My Account”, then look for the “Access” pane, then “Kubeconfig”, then press the “Download” button and save the kubeconfig to disk. Run the following command next:\nexport KUBECONFIG=path/to/kubeconfig We are now set and you can run your first experiment:\nchaos run path/to/experiment You should see output like this (depends on cloud provider):\n[INFO] Validating the experiment's syntax [INFO] Installing signal handlers to terminate all active background threads on involuntary signals (note that SIGKILL cannot be handled). [INFO] Experiment looks valid [INFO] Running experiment: assess-filters-impact [INFO] Steady-state strategy: default [INFO] Rollbacks strategy: default [INFO] No steady state hypothesis defined. That's ok, just exploring. [INFO] Playing your experiment's method now... 
[INFO] Action: assess-filters-impact [INFO] Validating client credentials and listing probably impacted instances and/or networks with the given arguments zone='world-1a' and filters={'instances': [{'Name': 'tag-key', 'Values': ['kubernetes.io/cluster/shoot--my-project--my-shoot']}], 'vpcs': [{'Name': 'tag-key', 'Values': ['kubernetes.io/cluster/shoot--my-project--my-shoot']}]}: [INFO] 1 instance(s) would be impacted: [INFO] - i-aabbccddeeff0000 [INFO] 1 VPC(s) would be impacted: [INFO] - vpc-aabbccddeeff0000 [INFO] Let's rollback... [INFO] No declared rollbacks, let's move on. [INFO] Experiment ended with status: completed 🎉 Congratulations! You successfully ran your first chaosgarden experiment.\nA Destructive Experiment Now let’s break 🪓 your cluster. Be advised that this experiment will be destructive in the sense that we will temporarily network-partition all nodes in one availability zone (machine termination or restart is available with chaosgarden as well). That means, these nodes and their pods won’t be able to “talk” to other nodes, pods, and services. Also, the API server will become unreachable for them and the API server will report them as unreachable (confusingly shown as NotReady when you run kubectl get nodes and Unknown in the status Ready condition when you run kubectl get nodes --output yaml).\nBeing unreachable will trigger service endpoint and load balancer de-registration (when the node’s grace period lapses) as well as eventually pod eviction and machine replacement (which will continue to fail under test). We won’t run the experiment long enough for all of these effects to materialize, but the longer you run it, the more will happen, up to temporarily giving up/going into “back-off” for the affected worker pool in that zone. 
You will also see that the Kubernetes cluster autoscaler will try to create a new machine almost immediately, if pods are pending for the affected zone (which will initially fail under test, but may succeed later, which again depends on the runtime of the experiment and whether the cluster autoscaler goes into “back-off”).\nBut for now, all of this doesn’t matter as we want to start “small”. You can later read up more on the various settings and effects in our best practices guide on high availability.\nPlease create a new experiment file, this time with this content:\n{ \"title\": \"run-network-failure-simulation\", \"description\": \"run-network-failure-simulation\", \"method\": [ { \"type\": \"action\", \"name\": \"run-network-failure-simulation\", \"provider\": { \"type\": \"python\", \"module\": \"chaosgarden.garden.actions\", \"func\": \"run_cloud_provider_network_failure_simulation\", \"arguments\": { \"mode\": \"total\", \"zone\": 0, \"duration\": 60 } } } ], \"rollbacks\": [ { \"type\": \"action\", \"name\": \"rollback-network-failure-simulation\", \"provider\": { \"type\": \"python\", \"module\": \"chaosgarden.garden.actions\", \"func\": \"rollback_cloud_provider_network_failure_simulation\", \"arguments\": { \"mode\": \"total\", \"zone\": 0 } } } ], \"configuration\": { \"garden_project\": { \"type\": \"env\", \"key\": \"GARDEN_PROJECT\" }, \"garden_shoot\": { \"type\": \"env\", \"key\": \"GARDEN_SHOOT\" } } } ℹ️ There is an even more destructive action that terminates or alternatively restarts machines in a given zone 🔥 (immediately or delayed with some randomness/chaos for maximum inconvenience for the nodes and pods). You can find links to all these examples at the end of this tutorial.\nThis experiment is very similar, but this time we will break 🪓 your cluster - for 60s. If that’s too short to even see a node or pod transition from Ready to NotReady (actually Unknown), then increase the duration. 
Depending on the workload that your cluster runs, you may already see effects of the network partitioning, because it is effective immediately. It’s just that Kubernetes cannot know immediately and rather assumes that something is failing only after the node’s grace period lapses, but the actual workload is impacted immediately.\nMost notably, this experiment also has a rollbacks section, which is invoked even if you abort the experiment or it fails unexpectedly, but only if you run the CLI with the option --rollback-strategy always, which we will do soon. Any chaosgarden action that can undo its activity will do so implicitly when the duration lapses, but it is a best practice to always configure a rollbacks section in case something unexpected happens. Should you be in a panic and just want to run the rollbacks section, you can remove all other actions and the CLI will execute the rollbacks section immediately.\nOne other thing is different in the second experiment as well. We now read the name of the project and the shoot from the environment, i.e. a configuration section can automatically expand environment variables. Also useful to know (not shown here): chaostoolkit supports variable substitution too, so that you have to define variables only once. Please note that you can also add a secrets section that can likewise expand environment variables automatically. For instance, instead of targeting the Gardener API server via $KUBECONFIG, which is supported by our chaosgarden package natively, you can also explicitly refer to it in a secrets section (for brevity reasons not shown here either).\nLet’s now run your second experiment (please watch your nodes and pods in parallel, e.g. by running watch kubectl get nodes,pods --output wide in another terminal):\nexport GARDEN_PROJECT=my-project export GARDEN_SHOOT=my-shoot chaos run --rollback-strategy always path/to/experiment The output of the run command will be similar to the one above, but longer. 
It will mention either machines or networks that were network-partitioned (depends on cloud provider), but should revert everything back to normal.\nNormally, you would not only run actions in the method section, but also probes as part of a steady state hypothesis. Such steady state hypothesis probes are run before and after the actions to validate that the “system” was in a healthy state before and gets back to a healthy state after the actions ran, hence show that the “system” is in a steady state when not under test. Eventually, you will write your own probes that don’t even have to be part of a steady state hypothesis. We at Gardener run multi-zone (multiple zones at once) and rolling-zone (strike each zone once) outages with continuous custom probes all within the method section to validate our KPIs continuously under test (e.g. how long do the individual fail-overs take/how long is the actual outage). The most complex scenarios are even run via Python scripts as all actions and probes can also be invoked directly (which is what the CLI does).\nHigh Availability Developing highly available workload that can tolerate a zone outage is no trivial task. 
You can find more information on how to achieve this goal in our best practices guide on high availability.\nThank you for your interest in Gardener chaos engineering and making your workload more resilient.\nFurther Reading Here are some links for further reading:\n Examples: Experiments, Scripts Gardener Chaos Engineering: GitHub, PyPI, Module Docs for Gardener Users Chaos Toolkit Core: Home Page, Installation, Concepts, GitHub ","categories":"","description":"","excerpt":"Overview Gardener provides chaostoolkit modules to simulate compute …","ref":"/docs/guides/high-availability/chaos-engineering/","tags":"","title":"Chaos Engineering"},{"body":"Highly Available Shoot Control Plane The Shoot resource offers a way to request a highly available control plane.\nFailure Tolerance Types A highly available shoot control plane can be set up with a failure tolerance of either zone or node.\nNode Failure Tolerance The failure tolerance of a node will have the following characteristics:\n Control plane components will be spread across different nodes within a single availability zone. There will not be more than one replica per node for each control plane component which has more than one replica. Worker pool should have a minimum of 3 nodes. A multi-node etcd (quorum size of 3) will be provisioned, offering zero-downtime capabilities with each member in a different node within a single availability zone. Zone Failure Tolerance The failure tolerance of a zone will have the following characteristics:\n Control plane components will be spread across different availability zones. There will be at least one replica per zone for each control plane component which has more than one replica. Gardener scheduler will automatically select a seed which has a minimum of 3 zones to host the shoot control plane. A multi-node etcd (quorum size of 3) will be provisioned, offering zero-downtime capabilities with each member in a different zone. 
Shoot Spec To request a highly available shoot control plane, Gardener provides the following configuration in the shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: \u003cnode | zone\u003e Allowed Transitions\nIf you already have a shoot cluster with a non-HA control plane, then the following upgrades are possible:\n Upgrade of non-HA shoot control plane to HA shoot control plane with node failure tolerance. Upgrade of non-HA shoot control plane to HA shoot control plane with zone failure tolerance. However, it is essential that the seed which is currently hosting the shoot control plane is multi-zonal. If it is not, then the request to upgrade will be rejected. Note: There will be a small downtime during the upgrade, especially for etcd, which will transition from a single node etcd cluster to a multi-node etcd cluster.\n Disallowed Transitions\nIf you already have a shoot cluster with an HA control plane, then the following transitions are not possible:\n Upgrade of HA shoot control plane from node failure tolerance to zone failure tolerance is currently not supported, mainly because already existing volumes are bound to the zone they were created in originally. Downgrade of HA shoot control plane with zone failure tolerance to node failure tolerance is currently not supported, mainly for the same reason as above: already existing volumes are bound to the respective zones they were created in originally. Downgrade of HA shoot control plane with either node or zone failure tolerance to a non-HA shoot control plane is currently not supported, mainly because etcd-druid does not currently support scaling down of a multi-node etcd cluster to a single-node etcd cluster. Zone Outage Situation Implementing highly available software that can tolerate even a zone outage unscathed is no trivial task. You may find our HA Best Practices helpful to get closer to that goal. 
In this document, we collected many options and settings for you that Gardener also uses internally to provide a highly available service.\nDuring a zone outage, you may be forced to change your cluster setup on short notice in order to compensate for failures and shortages resulting from the outage. For instance, if the shoot cluster has worker nodes across three zones where one zone goes down, the computing power from these nodes is also gone during that time. Changing the worker pool (shoot.spec.provider.workers[]) and infrastructure (shoot.spec.provider.infrastructureConfig) configuration can eliminate this imbalance, so that there are enough machines in healthy availability zones to cope with the requests of your applications.\nGardener relies on a sophisticated reconciliation flow with several dependencies for which various flow steps wait for the readiness of prior ones. During a zone outage, this can block the entire flow, e.g., because all three etcd replicas can never be ready when a zone is down, and required changes mentioned above can never be accomplished. For this, a special one-off annotation shoot.gardener.cloud/skip-readiness helps to skip any readiness checks in the flow.\n The shoot.gardener.cloud/skip-readiness annotation serves as a last resort if reconciliation is stuck because of important changes during an AZ outage. Use it with caution, only in exceptional cases and after a case-by-case evaluation with your Gardener landscape administrator. If used together with other operations like Kubernetes version upgrades or credential rotation, the annotation may lead to a severe outage of your shoot control plane.\n ","categories":"","description":"Failure tolerance types `node` and `zone`. Possible mitigations for zone or node outages","excerpt":"Failure tolerance types `node` and `zone`. 
Possible mitigations for …","ref":"/docs/guides/high-availability/control-plane/","tags":"","title":"Control Plane"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2022/","tags":"","title":"2022"},{"body":"Presenters This community call was led by Pawel Palucki and Alexander D. Kanevskiy.\nTopics Alexander Kanevskiy begins the community call by giving an overview of CRI-resource-manager, describing it as a “hardware aware container runtime”, and also going over what it brings to the user in terms of features and policies.\nPawel Palucki continues by giving details on the policy that will later be used in the demo and the use case demonstrated in it. He then goes over the “must have” features of any extension - observability and the ability to deploy and configure objects with it.\nThe demo then begins, mixed with slides giving further information at certain points regarding the installation process, static and dynamic configuration flow, healthchecks and recovery mode, and access to logs, among others.\nThe presentation is concluded by Pawel showcasing the new features coming to CRI-resource-manager with its next releases and sharing some tips for other extension developers.\nIf you are left with any questions regarding the content, you might find the answers at the Q\u0026A session and discussion held at the end, as well as the questions asked and answered throughout the meeting.\nRecording ","categories":"","description":"","excerpt":"Presenters This community call was led by Pawel Palucki and Alexander …","ref":"/blog/2022/10.20-gardener-community-meeting-october-2/","tags":"","title":"Community Call - Get more computing power in Gardener by overcoming Kubelet limitations with CRI-resource-manager"},{"body":"Presenters This community call was led by Raymond de Jong.\nTopics This meeting explores the uses of Cilium, an open source software used to secure the network connectivity between application services deployed using Kubernetes, and 
Hubble, the networking and security observability platform built on top of it.\nRaymond de Jong begins the meeting by giving an introduction to Cilium and eBPF and how they are both used in Kubernetes networking and services. He then goes over the ways of running Cilium - either by using a supported cloud provider or by CNI chaining.\nThe next topic introduced is the Cluster Mesh and the different use cases for it, offering high availability, shared services, local and remote service affinity, and the ability to split services.\nIn regards to security, being an identity-based security solution utilizing API-aware authorization, Cilium implements Hubble in order to increase its observability. Hubble combines hubble UI, hubble API and hubble Metrics - Grafana and Prometheus, in order to provide service dependency maps, detailed flow visibility and built-in metrics for operations and applications stability.\nThe final topic covered is the Service Mesh, offering service maps and the ability to integrate Cluster Mesh features.\nIf you are left with any questions regarding the content, you might find the answers at the Q\u0026A session and discussion held at the end, as well as the questions asked and answered throughout the meeting.\nRecording ","categories":"","description":"","excerpt":"Presenters This community call was led by Raymond de Jong.\nTopics This …","ref":"/blog/2022/10.06-gardener-community-meeting-october/","tags":"","title":"Community Call - Cilium / Isovalent Presentation"},{"body":"Manage certificates with Gardener for default domain Introduction Applications on Kubernetes that offer secure service endpoints (e.g. HTTPS) also require you to enable secure communication via SSL/TLS. With the certificate extension enabled, Gardener can manage a commonly trusted X.509 certificate for your application endpoint. 
Beyond the initial certificate request, it also handles the timely renewal of certificates using the free Let’s Encrypt API.\nThere are two scenarios in which you can use the certificate extension\n You want to use a certificate for a subdomain of the shoot’s default DNS (see .spec.dns.domain of your shoot resource, e.g. short.ingress.shoot.project.default-domain.gardener.cloud). If this is your case, please keep reading this article. You want to use a certificate for a custom domain. If this is your case, please see Manage certificates with Gardener for public domain Prerequisites Before you start this guide there are a few requirements you need to fulfill:\n You have an existing shoot cluster Since you are using the default DNS name, all DNS configuration should already be done and ready.\nIssue a certificate Every X.509 certificate is represented by a Kubernetes custom resource certificate.cert.gardener.cloud in your cluster. A Certificate resource may be used to initiate a new certificate request as well as to manage its lifecycle. 
Gardener’s certificate service regularly checks the expiration timestamp of Certificates, triggers a renewal process if necessary and replaces the existing X.509 certificate with a new one.\n Your application should be able to reload replaced certificates in a timely manner to avoid service disruptions.\n Certificates can be requested via 3 resource types\n Ingress Service (type LoadBalancer) Certificate (Gardener CRD) If either of the first 2 is used, a corresponding Certificate resource will automatically be created.\nUsing an ingress Resource apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" spec: tls: - hosts: # Must not exceed 64 characters. - short.ingress.shoot.project.default-domain.gardener.cloud # Certificate and private key reside in this secret. 
secretName: tls-secret rules: - host: short.ingress.shoot.project.default-domain.gardener.cloud http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Using a service type LoadBalancer apiVersion: v1 kind: Service metadata: annotations: cert.gardener.cloud/purpose: managed # Certificate and private key reside in this secret. cert.gardener.cloud/secretname: tls-secret # You may add more domains separated by commas (e.g. \"service.shoot.project.default-domain.gardener.cloud, amazing.shoot.project.default-domain.gardener.cloud\") dns.gardener.cloud/dnsnames: \"service.shoot.project.default-domain.gardener.cloud\" dns.gardener.cloud/ttl: \"600\" #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" name: test-service namespace: default spec: ports: - name: http port: 80 protocol: TCP targetPort: 8080 type: LoadBalancer Using the custom Certificate resource apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: commonName: short.ingress.shoot.project.default-domain.gardener.cloud secretRef: name: tls-secret namespace: default # Optional if using the default issuer issuerRef: name: garden If you’re interested in the current progress of your request, 
you’re advised to consult the description, more specifically the status attribute in case the issuance failed.\nRequest a wildcard certificate In order to avoid creating a separate certificate for every single endpoint, you may want to create a wildcard certificate for your shoot’s default domain.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/commonName: \"*.ingress.shoot.project.default-domain.gardener.cloud\" spec: tls: - hosts: - amazing.ingress.shoot.project.default-domain.gardener.cloud secretName: tls-secret rules: - host: amazing.ingress.shoot.project.default-domain.gardener.cloud http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Please note that this can also be achieved by directly adding an annotation to a Service of type LoadBalancer. You could also create a Certificate object with a wildcard domain.\nMore information For more information and more examples about using the certificate extension, please see Manage certificates with Gardener for public domain\n","categories":"","description":"Use the Gardener cert-management to get fully managed, publicly trusted TLS certificates","excerpt":"Use the Gardener cert-management to get fully managed, publicly …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/request_default_domain_cert/","tags":["task"],"title":"Manage certificates with Gardener for default domain"},{"body":"Manage certificates with Gardener for public domain Introduction Applications on Kubernetes that offer secure service endpoints (e.g. HTTPS) also require you to enable secure communication via SSL/TLS. With the certificate extension enabled, Gardener can manage a commonly trusted X.509 certificate for your application endpoint. 
Beyond initially requesting certificates, it also handles their timely renewal using the free Let’s Encrypt API.\nThere are two scenarios in which you can use the certificate extension:\n You want to use a certificate for a subdomain of the shoot’s default DNS domain (see .spec.dns.domain of your shoot resource, e.g. short.ingress.shoot.project.default-domain.gardener.cloud). If this is your case, please see Manage certificates with Gardener for default domain You want to use a certificate for a custom domain. If this is your case, please keep reading this article. Prerequisites Before you start this guide there are a few requirements you need to fulfill:\n You have an existing shoot cluster Your custom domain is under a public top level domain (e.g. .com) Your custom zone is resolvable with a public resolver via the internet (e.g. 8.8.8.8) You have a custom DNS provider configured and working (see “DNS Providers”) As part of the Let’s Encrypt ACME challenge validation process, Gardener sets a DNS TXT entry and Let’s Encrypt checks if it can both resolve and authenticate it. Therefore, it’s important that your DNS entries are publicly resolvable. You can check this by querying e.g. Google’s public DNS server; if it returns an entry, your DNS is publicly visible:\n# returns the A record for cert-example.example.com using Google’s DNS server (8.8.8.8) dig cert-example.example.com @8.8.8.8 A DNS provider In order to issue certificates for a custom domain you need to specify a DNS provider which is permitted to create DNS records for subdomains of your requested domain in the certificate. For example, if you request a certificate for host.example.com your DNS provider must be capable of managing subdomains of host.example.com.\nDNS providers are normally specified in the shoot manifest. 
To learn more on how to configure one, please see the DNS provider documentation.\nIssue a certificate Every X.509 certificate is represented by a Kubernetes custom resource certificate.cert.gardener.cloud in your cluster. A Certificate resource may be used to initiate a new certificate request as well as to manage its lifecycle. Gardener’s certificate service regularly checks the expiration timestamp of Certificates, triggers a renewal process if necessary and replaces the existing X.509 certificate with a new one.\n Your application should be able to reload replaced certificates in a timely manner to avoid service disruptions.\n Certificates can be requested via four resource types:\n Ingress Service (type LoadBalancer) Gateways (both Istio gateways and from the Gateway API) Certificate (Gardener CRD) If any of the first three is used, a corresponding Certificate resource will be created automatically.\nUsing an Ingress Resource apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed # Optional but recommended, this is going to create the DNS entry at the same time dns.gardener.cloud/class: garden dns.gardener.cloud/ttl: \"600\" #cert.gardener.cloud/commonname: \"*.example.com\" # optional, if not specified the first name from spec.tls[].hosts is used as common name #cert.gardener.cloud/dnsnames: \"\" # optional, if not specified the names from spec.tls[].hosts are used #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to 
specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" spec: tls: - hosts: # Must not exceed 64 characters. - amazing.example.com # Certificate and private key reside in this secret. secretName: tls-secret rules: - host: amazing.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Replace the hosts and rules[].host values with your own domain and adjust the remaining Ingress attributes in accordance with your deployment (e.g. the above is for an Istio Ingress controller and forwards traffic to the service amazing-svc on port 8080).\nUsing a Service of type LoadBalancer apiVersion: v1 kind: Service metadata: annotations: cert.gardener.cloud/secretname: tls-secret dns.gardener.cloud/dnsnames: example.example.com dns.gardener.cloud/class: garden # Optional dns.gardener.cloud/ttl: \"600\" cert.gardener.cloud/commonname: \"*.example.example.com\" cert.gardener.cloud/dnsnames: \"\" #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" name: test-service namespace: default spec: ports: - name: http port: 80 protocol: 
TCP targetPort: 8080 type: LoadBalancer Using a Gateway resource Please see Istio Gateways or Gateway API for details.\nUsing the custom Certificate resource apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: commonName: amazing.example.com secretRef: name: tls-secret namespace: default # Optional if using the default issuer issuerRef: name: garden # If delegated domain for DNS01 challenge should be used. This has only an effect if a CNAME record is set for # '_acme-challenge.amazing.example.com'. # For example: If a CNAME record exists '_acme-challenge.amazing.example.com' =\u003e '_acme-challenge.writable.domain.com', # the DNS challenge will be written to '_acme-challenge.writable.domain.com'. #followCNAME: true # optionally set labels for the secret #secretLabels: # key1: value1 # key2: value2 # Optionally specify the preferred certificate chain: if the CA offers multiple certificate chains, prefer the chain with an issuer matching this Subject Common Name. If no match, the default offered chain will be used. #preferredChain: \"ISRG Root X1\" # Optionally specify algorithm and key size for private key. Allowed algorithms: \"RSA\" (allowed sizes: 2048, 3072, 4096) and \"ECDSA\" (allowed sizes: 256, 384) # If not specified, RSA with 2048 is used. #privateKey: # algorithm: ECDSA # size: 384 Supported attributes Here is a list of all supported annotations regarding the certificate extension:\n Path Annotation Value Required Description N/A cert.gardener.cloud/purpose: managed Yes when using annotations Flag for Gardener that this specific Ingress or Service requires a certificate spec.commonName cert.gardener.cloud/commonname: E.g. “*.demo.example.com” or “special.example.com” Certificate and Ingress : No Service: Yes, if DNS names unset Specifies for which domain the certificate request will be created. If not specified, the names from spec.tls[].hosts are used. 
This entry must comply with the 64 character limit. spec.dnsNames cert.gardener.cloud/dnsnames: E.g. “special.example.com” Certificate and Ingress : No Service: Yes, if common name unset Additional domains the certificate should be valid for (Subject Alternative Name). If not specified, the names from spec.tls[].hosts are used. Entries in this list can be longer than 64 characters. spec.secretRef.name cert.gardener.cloud/secretname: any-name Yes for certificate and Service Specifies the secret which contains the certificate/key pair. If the secret is not available yet, it’ll be created automatically as soon as the certificate has been issued. spec.issuerRef.name cert.gardener.cloud/issuer: E.g. gardener No Specifies the issuer you want to use. Only necessary if you request certificates for custom domains. N/A cert.gardener.cloud/revoked: true otherwise always false No Use only to revoke a certificate, see reference for more details spec.followCNAME cert.gardener.cloud/follow-cname E.g. true No Specifies that the usage of a delegated domain for DNS challenges is allowed. Details see Follow CNAME. spec.preferredChain cert.gardener.cloud/preferred-chain E.g. ISRG Root X1 No Specifies the Common Name of the issuer for selecting the certificate chain. Details see Preferred Chain. spec.secretLabels cert.gardener.cloud/secret-labels for annotation use e.g. key1=value1,key2=value2 No Specifies labels for the certificate secret. spec.privateKey.algorithm cert.gardener.cloud/private-key-algorithm RSA, ECDSA No Specifies algorithm for private key generation. The default value is depending on configuration of the extension (default of the default is RSA). You may request a new certificate without privateKey settings to find out the concrete defaults in your Gardener. spec.privateKey.size cert.gardener.cloud/private-key-size \"256\", \"384\", \"2048\", \"3072\", \"4096\" No Specifies size for private key generation. Allowed values for RSA are 2048, 3072, and 4096. 
For ECDSA allowed values are 256 and 384. The default values depend on the configuration of the extension (the built-in defaults are 3072 for RSA and 384 for ECDSA). Request a wildcard certificate In order to avoid the creation of multiple certificates, one for every single endpoint, you may want to create a wildcard certificate for your custom domain.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/commonname: \"*.example.com\" spec: tls: - hosts: - amazing.example.com secretName: tls-secret rules: - host: amazing.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Please note that this can also be achieved by directly adding an annotation to a Service of type LoadBalancer. You could also create a Certificate object with a wildcard domain.\nUsing a custom Issuer Most Gardener deployments with the certificate extension enabled have a preconfigured garden issuer. It is also usually configured to use Let’s Encrypt as the certificate provider.\nIf you need a custom issuer for a specific cluster, please see Using a custom Issuer\nQuotas For security reasons there may be a default quota on the certificate requests per day set globally in the controller registration of the shoot-cert-service.\nThe default quota only applies if there is no explicit quota defined for the issuer itself with the field requestsPerDayQuota, e.g.:\nkind: Shoot ... 
spec: extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 kind: CertConfig issuers: - email: your-email@example.com name: custom-issuer # issuer name must be specified in every custom issuer request, must not be \"garden\" server: 'https://acme-v02.api.letsencrypt.org/directory' requestsPerDayQuota: 10 DNS Propagation As stated before, cert-management uses the ACME challenge protocol to verify that you own the domain for which you are requesting a certificate. This works by creating a DNS TXT record in your DNS provider under _acme-challenge.example.example.com containing a token to compare with. The TXT record is only applied during the domain validation. Typically, the record is propagated within a few minutes. But if the record is not visible to the ACME server for any reason, the certificate request is retried after several minutes. This means you may have to wait up to one hour after the propagation problem has been resolved before the certificate request is retried. 
For troubleshooting, take a look at the events with kubectl describe ingress example.\nCharacter Restrictions Because the common name is restricted to 64 characters, you may have to leave the common name unset for longer domain names.\nFor example, the following request is invalid:\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-invalid namespace: default spec: commonName: morethan64characters.ingress.shoot.project.default-domain.gardener.cloud But it is valid to request a certificate for this domain if you have left the common name unset:\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: dnsNames: - morethan64characters.ingress.shoot.project.default-domain.gardener.cloud References Gardener cert-management Managing DNS with Gardener ","categories":"","description":"Use the Gardener cert-management to get fully managed, publicly trusted TLS certificates","excerpt":"Use the Gardener cert-management to get fully managed, publicly …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/request_cert/","tags":["task"],"title":"Manage certificates with Gardener for public domain"},{"body":"Using a custom Issuer Another possibility to request certificates for custom domains is a dedicated issuer.\n Note: This is only needed if the default issuer provided by Gardener is restricted to shoot related domains or you are using domain names not visible to public DNS servers. This means that your scenario most likely doesn’t require you to add an issuer.\n Custom issuers are normally specified in the shoot manifest. If the shootIssuers feature is enabled, they can alternatively be defined in the shoot cluster.\nCustom issuer in the shoot manifest kind: Shoot ... 
spec: extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 kind: CertConfig issuers: - email: your-email@example.com name: custom-issuer # issuer name must be specified in every custom issuer request, must not be \"garden\" server: 'https://acme-v02.api.letsencrypt.org/directory' privateKeySecretName: my-privatekey # referenced resource, the private key must be stored in the secret at `data.privateKey` (optionally, only needed as alternative to auto registration) #precheckNameservers: # to provide special set of nameservers to be used for prechecking DNSChallenges for an issuer #- dns1.private.company-net:53 #- dns2.private.company-net:53 #shootIssuers: # if true, allows to specify issuers in the shoot cluster #enabled: true resources: - name: my-privatekey resourceRef: apiVersion: v1 kind: Secret name: custom-issuer-privatekey # name of secret in Gardener project If you are using an ACME provider for private domains, you may need to change the nameservers used for checking the availability of the DNS challenge’s TXT record before the certificate is requested from the ACME provider. By default, only public DNS servers may be used for this purpose. At least one of the precheckNameservers must be able to resolve the private domain names.\nUsing the custom issuer To use the custom issuer in a certificate, just specify its name in the spec.\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate spec: ... issuerRef: name: custom-issuer ... For source resources like Ingress or Service use the cert.gardener.cloud/issuer annotation.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/issuer: custom-issuer ... Custom issuer in the shoot cluster Prerequisite: The shootIssuers feature has to be enabled. It is either enabled globally in the ControllerDeployment or in the shoot manifest with:\nkind: Shoot ... 
spec: extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 kind: CertConfig shootIssuers: enabled: true # if true, allows to specify issuers in the shoot cluster ... Example for specifying an Issuer resource and its Secret directly in any namespace of the shoot cluster:\napiVersion: cert.gardener.cloud/v1alpha1 kind: Issuer metadata: name: my-own-issuer namespace: my-namespace spec: acme: domains: include: - my.own.domain.com email: some.user@my.own.domain.com privateKeySecretRef: name: my-own-issuer-secret namespace: my-namespace server: https://acme-v02.api.letsencrypt.org/directory --- apiVersion: v1 kind: Secret metadata: name: my-own-issuer-secret namespace: my-namespace type: Opaque data: privateKey: ... # replace '...' with the value encoded as base64 Using the custom shoot issuer To use the custom issuer in a certificate, just specify its name and namespace in the spec.\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate spec: ... issuerRef: name: my-own-issuer namespace: my-namespace ... For source resources like Ingress or Service use the cert.gardener.cloud/issuer annotation.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/issuer: my-namespace/my-own-issuer ... ","categories":"","description":"How to define a custom issuer for a shoot cluster","excerpt":"How to define a custom issuer for a shoot cluster","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/custom_shoot_issuer/","tags":["task"],"title":"Using a custom Issuer"},{"body":"Introduction of Disruptions We need to understand that some kind of voluntary disruptions can happen to pods. For example, they can be caused by cluster administrators who want to perform automated cluster actions, like upgrading and autoscaling clusters. 
Typical application owner actions include:\n deleting the deployment or other controller that manages the pod updating a deployment’s pod template causing a restart directly deleting a pod (e.g., by accident) Setup Pod Disruption Budgets Kubernetes offers a feature called PodDisruptionBudget (PDB) for each application. A PDB limits the number of pods of a replicated application that are down simultaneously from voluntary disruptions.\nThe most common use case is when you want to protect an application specified by one of the built-in Kubernetes controllers:\n Deployment ReplicationController ReplicaSet StatefulSet A PodDisruptionBudget has three fields:\n A label selector .spec.selector to specify the set of pods to which it applies. .spec.minAvailable which is a description of the number of pods from that set that must still be available after the eviction, even in the absence of the evicted pod. minAvailable can be either an absolute number or a percentage. .spec.maxUnavailable which is a description of the number of pods from that set that can be unavailable after the eviction. It can be either an absolute number or a percentage. Cluster Upgrade or Node Deletion Failed due to PDB Violation Misconfiguration of the PDB could block the cluster upgrade or node deletion processes. There are two main cases that can cause a misconfiguration.\nCase 1: The replica of Kubernetes controllers is 1 Only 1 replica is running: there is no replicaCount setup or replicaCount for the Kubernetes controllers is set to 1\n PDB configuration\n spec: minAvailable: 1 To fix this PDB misconfiguration, you need to change the value of replicaCount for the Kubernetes controllers to a number greater than 1\n Case 2: HPA configuration violates PDB In Kubernetes, a HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand. 
The HorizontalPodAutoscaler manages the replicas field of the Kubernetes controllers.\n There is no replicaCount setup or replicaCount for the Kubernetes controllers is set to 1\n PDB configuration\n spec: minAvailable: 1 HPA configuration\n spec: minReplicas: 1 To fix this PDB misconfiguration, you need to change the value of HPA minReplicas to be greater than 1\n Related Links Specifying a Disruption Budget for Your Application Horizontal Pod Autoscaling ","categories":"","description":"","excerpt":"Introduction of Disruptions We need to understand that some kind of …","ref":"/docs/guides/applications/pod-disruption-budget/","tags":"","title":"Specifying a Disruption Budget for Kubernetes Controllers"},{"body":"Presenters This community call was led by Jens Schneider and Lothar Gesslein.\nOverview Starting the development of a new Gardener extension can be challenging, when you are not an expert in the Gardener ecosystem yet. Therefore, the first half of this community call led by Jens Schneider aims to provide a “getting started tutorial” at a beginner level. 23Technologies have developed a minimal working example for Gardener extensions, gardener-extension-mwe, hosted in a Github repository. Jens is following the Getting started with Gardener extension development tutorial, which aims to provide exactly that.\nIn the second part of the community call, Lothar Gesslein introduces the gardener-extension-shoot-flux, which allows for the automated installation of arbitrary Kubernetes resources into shoot clusters. 
As this extension relies on Flux, an overview of Flux’s capabilities is also provided.\nIf you are left with any questions regarding the content, you might find the answers at the Q\u0026A session and discussion held at the end.\nYou can find the tutorials in this community call at:\n Getting started with Gardener extension development A Gardener Extension for universal Shoot Configuration\nRecording ","categories":"","description":"","excerpt":"Presenters This community call was led by Jens Schneider and Lothar …","ref":"/blog/2022/06.17-gardener-community-meeting-june/","tags":"","title":"Community Call - Gardener Extension Development"},{"body":"Presenters This community call was led by Tim Ebert and Rafael Franzke.\nOverview So far, deploying Gardener locally was not possible end-to-end. While you certainly could run the Gardener components in a minikube or kind cluster, creating shoot clusters always required registering seeds backed by cloud provider infrastructure like AWS, Azure, etc.\nConsequently, developing Gardener locally was similarly complicated, and the entry barrier for new contributors was way too high.\nIn a previous community call (Hackathon “Hack The Metal”), we already presented a new approach for overcoming these hurdles and complexities.\nNow we would like to present the Local Provider Extension for Gardener and show how it can be used to deploy Gardener locally, allowing you to quickly get your feet wet with the project.\nIn this session, Tim Ebert goes through the process of setting up a local Gardener cluster. 
After his demonstration, Rafael Franzke showcases a different approach to building your clusters locally, which, while more complicated, offers a much faster build time.\nYou can find the tutorials in this community call at:\n Deploying Gardener locally Running Gardener locally If you are left with any questions regarding the content, you might find the answers in the questions asked and answered throughout the meeting.\nRecording ","categories":"","description":"","excerpt":"Presenters This community call was led by Tim Ebert and Rafael …","ref":"/blog/2022/03.23-gardener-community-meeting-march/","tags":"","title":"Community Call - Deploying and Developing Gardener Locally"},{"body":"Presenters This community call was led by Holger Kosser, Lukas Gross and Peter Sutter.\nOverview Watch the recording of our February 2022 Community call to see how to get started with gardenctl-v2 and watch a walkthrough of gardenctl-v2 features. You’ll learn about targeting, secure shoot cluster access, SSH, and how to use cloud provider CLIs natively.\nThe session is led by Lukas Gross, who begins by giving some information on the motivations behind creating a new version of gardenctl - providing secure access to shoot clusters, enabling direct usage of kubectl and cloud provider CLIs and managing cloud provider resources for SSH access.\nHolger Kosser then takes over in order to delve deeper into the concepts behind the implementation of gardenctl-v2, going over Targeting, Gardenlogin and Cloud Provider CLIs. After that, Peter Sutter does the first demo, where he presents the main features in gardenctl-v2.\nThe next part details how to get started with gardenctl, followed by another demo. 
The landscape requirements are also discussed, as well as future plans and enhancement requests.\nYou can find the slides for this community call at Google Slides.\nIf you are left with any questions regarding the content, you might find the answers at the Q\u0026A session and discussion held at the end, as well as the questions asked and answered throughout the meeting.\nRecording ","categories":"","description":"","excerpt":"Presenters This community call was led by Holger Kosser, Lukas Gross …","ref":"/blog/2022/02.17-gardener-community-meeting-february/","tags":"","title":"Community Call - Gardenctl-v2"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2021/","tags":"","title":"2021"},{"body":"Happy New Year Gardeners! As we greet 2021, we also celebrate Gardener’s third anniversary. Gardener was born with its first open source commit on January 10, 2018 (its inception within SAP was of course some 9 months earlier):\ncommit d9619d01845db8c7105d27596fdb7563158effe1 Author: Gardener Development Community \u003cgardener.opensource@sap.com\u003e Date: Wed Jan 10 13:07:09 2018 +0100 Initial version of gardener This is the initial contribution to the Open Source Gardener project. ... Looking back, three years down the line, the project initiators were working towards a special goal: Publishing Gardener as an open source project on GitHub.com. Join us as we look back at how it all began, the challenges Gardener aims to solve, and why open source and the community were and are the project’s key enablers.\nGardener Kick-Off: “We opted to BUILD ourselves” Early 2017, SAP put together a small, jelled team of experts with a clear mission: work out how SAP could serve Kubernetes-based environments (as a service) for all teams within the company. Later that same year, SAP also joined the CNCF as a platinum member.\nWe first deliberated intensively on the BUY options (including acquisitions, due to the size and estimated volume needed at SAP). 
There were some early products from commercial vendors and startups available that did not bind exclusively to one of the hyperscalers, but these products did not cover many of our crucial and immediate requirements for a multi-cloud environment.\nUltimately, we opted to BUILD ourselves. This decision was not made lightly, because right from the start, we knew that we would have to cover thousands of clusters, across the globe, on all kinds of infrastructures. We would have to be able to create them at scale as well as manage them 24x7. And thus, we predicted the need to invest into automation of all aspects, to keep the service TCO at a minimum, and to offer an enterprise worthy SLA early on. This particular endeavor grew into launching the project Gardener, first internally, and ultimately fulfilling all checks, externally based on open source. Its mission statement, in a nutshell, is “Universal Kubernetes at scale”. Now, that’s quite bold. But we also had a nifty innovation that helped us tremendously along the way. And we can openly reveal the secret here: Gardener was built, not only for creating Kubernetes at scale, but it was built (recursively) in Kubernetes itself.\nWhat Do You Get with Gardener? Gardener offers managed and homogenous Kubernetes clusters on IaaS providers like AWS, Azure, GCP, AliCloud, Open Telekom Cloud, SCS, OVH and more, but also covers versatile infrastructures like OpenStack, VMware or bare metal. Day-1 and Day-2 operations are an integral part of a cluster’s feature set. This means that Gardener is not only capable of provisioning or de-provisioning thousands of clusters, but also of monitoring your cluster’s health state, upgrading components in a rolling fashion, or scaling the control plane as well as worker nodes up and down depending on the current resource demand.\nSome features mentioned above might sound familiar to you, simply because they’re squarely derived from Kubernetes. 
Concretely, if you explore a Gardener managed end-user cluster, you’ll never see the so-called “control plane components” (Kube-Apiserver, Kube-Controller-Manager, Kube-Scheduler, etc.). The reason is that they run as Pods inside another, hosting/seeding Kubernetes cluster. Speaking in Gardener terms, the latter is called a Seed cluster, and the end-user cluster is called a Shoot cluster; and thus the botanical naming scheme for Gardener was born. Further assets like infrastructure components or worker machines are modelled as managed Kubernetes objects too. This allows Gardener to leverage all the great and production proven features of Kubernetes for managing Kubernetes clusters. Our blog post on Kubernetes.io reveals more details about the architectural refinements.\nFigure 1: Gardener architecture overview End-users directly benefit from Gardener’s recursive architecture. Many of the requirements that we identified for the Gardener service turned out to be highly convenient for shoot owners. For instance, Seed clusters are usually equipped with DNS and x509 services. At the same time, these service offerings can be extended to requests coming from the Shoot clusters i.e., end-users get domain names and certificates for their applications out of the box.\nRecognizing the Power of Open Source The Gardener team immediately profited from open source: from Kubernetes obviously, and all its ecosystem projects. That all facilitated our project’s very fast and robust development. But it does not answer:\n“Why would SAP open source a tool that clearly solves a monetizable enterprise requirement?”\nShort spoiler alert: it initially involved a leap of faith. If we just look at our own decision path, it is undeniable that developers, and with them entire industries, gravitate towards open source. We chose Linux, Containers, and Kubernetes exactly because they are open, and we could bet on network effects, especially around skills. 
The same decision process is currently replicated in thousands of companies, with the same results. Why? Because all companies are digitally transforming. They are becoming software companies as well to a certain extent. Many of them are also our customers and in many discussions, we recognized that they have the same challenges that we are solving with Gardener. This, in essence, was a key eye opener. We were confident that if we developed Gardener as open source, we’d not only seize the opportunity to shape a Kubernetes management tool that finds broad interest and adoption outside of our use case at SAP, but we could solve common challenges faster with the help of a community, and that in consequence would sustain continuous feature development.\nCoincidentally, that was also when the SAP Open Source Program Office (OSPO) was launched. It supported us making a case to develop Gardener completely as open source. Today, we can witness that this strategy has unfolded. It opened the gates not only for adoption, but for co-innovation, investment security, and user feedback directly in code. Below you can see an example of how the Gardener project benefits from this external community power as contributions are submitted right away.\nFigure 2: Example immediate community contribution Differentiating Gardener from Other Kubernetes Management Solutions Imagine that you have created a modern solid cloud native app or service, fully scalable, in containers. And the business case requires you to run the service on multiple clouds, like AWS, AliCloud, Azure, … maybe even on-premises like OpenStack or VMware. Your development team has done everything to ensure that the workload is highly portable. But they would need to qualify each provider’s managed Kubernetes offering and their custom Bill-of-Material (BoM), their versions, their deprecation plan, roadmap etc. Your TCO would explode and this is exactly what teams at SAP experienced. 
Now, with Gardener you can, instead, roll out homogeneous clusters and stay in control of your versions and a single roadmap. Across all supported providers!\nAlso, teams that have serious, or say, more demanding workloads running on Kubernetes will come to the same conclusion: They require the full management control of the Kubernetes underlay. Not only that, they need access, visibility, and all the tuning options for the control plane to safeguard their service. This is a conclusion not only from teams at SAP, but also from our community members, like PingCap, who use Gardener to serve their TiDB Cloud service. Whenever you need to get serious and need more than one or two clusters, Gardener is your friend.\nWho Is Using Gardener? Well, there is SAP itself of course, but the number of Gardener adopters and companies interested in Gardener is also growing (~1700 GitHub stars), as more are challenged by multi-cluster and multi-cloud requirements.\nFlant, PingCap, StackIT, T-Systems, Sky, or b’nerd are among these companies, to name a few. They use Gardener either to run products they sell on top, to offer managed Kubernetes clusters directly to their clients, or even to reuse only individual Gardener components.\nAn interesting journey in the open source space started with Finanz Informatik Technologie Service (FI-TS), a European Central Bank regulated and certified hoster for banks. They operate in very restricted environments, as you can imagine, and as such, they re-designed their datacenter for cloud native workloads from scratch, that is from cabling, racking and stacking to an API that serves bare metal servers. For Kubernetes-as-a-Service, they evaluated and chose Gardener because it was open and a perfect candidate. With Gardener’s extension capabilities, it was possible to bring managed Kubernetes clusters to their very own bare metal stack, metal-stack.io. Of course, this meant implementation effort. 
But by reusing the Gardener project, FI-TS was able to leverage our standard with minimal adjustments for their special use-case. Subsequently, with their contributions, SAP was able to make Gardener more open for the community.\nFull Speed Ahead with the Community in 2021 Some of the current and most active topics are about the installer (Landscaper), control plane migration, automated seed management and documentation. Once you are into Kubernetes and then Gardener, all complexity falls into place and you can make all the semantic connections yourself. But beginners who join the community without much prior knowledge should experience a gentler ramp-up. And that is currently a pain point. Experts directly ask questions about documentation not being up-to-date or clear enough. We prioritized the functionality of what you get with Gardener at the outset and need to catch up. But here is the good part: Now that we are tackling the installation topic, we will later have a much broader picture of what we need to install and maintain Gardener, and how we will build it.\n In a community call last summer, we gave an overview of what we are building: The Landscaper. With this tool, we will be able to not only install a full Gardener landscape, but also streamline patches, updates and upgrades. Gardener adopters can then attach to a release train from the project and deploy Gardener into a dev, canary and multiple production environments sequentially. Like we do at SAP.\nKey Takeaways in Three Years of Gardener #1 Open Source is Strategic Open Source is not just about using freely available libraries, components, or tools to optimize your own software production anymore. 
It is strategic, it unfolds its value for projects like Gardener, and in the meantime it has also reached the Board Room.\n#2 Solving Concrete Challenges by Co-Innovation Users of a particular product or service increasingly vote/decide for open source variants, such as project Gardener, because that allows them to freely innovate and solve concrete challenges by developing exactly what they require (see FI-TS example). This user-centric process has tremendous advantages. It clears out the middleman and other vested interests. You have access to the full code. And lastly, if others start using and contributing to your innovation, it allows enterprises to secure their investments for the long term. And that reinforces point #1 for enterprises that have yet to create a strategic Open Source Program Office.\n#3 Cloud Native Skills Gardener solves problems by applying Kubernetes and Kubernetes principles itself. Developers and operators who are familiar with Kubernetes will immediately notice and appreciate our concept and can contribute intuitively. The Gardener maintainers feel responsible for supporting community members and contributors. Barriers will further be reduced by our ongoing landscaper and documentation efforts. This is why we are so confident about Gardener adoption.\nThe Gardener team gladly welcomes new community members, especially regarding adoption and contribution. Feel invited to try out your very own Gardener installation, join our Slack channel or community calls. We’re looking forward to seeing you there!\n","categories":"","description":"","excerpt":"Happy New Year Gardeners! As we greet 2021, we also celebrate …","ref":"/blog/2021/02.01-happy-anniversary-gardener/","tags":"","title":"Happy Anniversary, Gardener! Three Years of Open Source Kubernetes Management"},{"body":"Kubernetes is a cloud-native enabler built around the principles for a resilient, manageable, observable, highly automated, loosely coupled system. 
We know that Kubernetes is infrastructure agnostic with the help of a provider specific Cloud Controller Manager. But Kubernetes has explicitly externalized the management of the nodes. Once they appear - correctly configured - in the cluster, Kubernetes can use them. If nodes fail, Kubernetes can’t do anything about it, external tooling is required. But every tool, every provider is different. So, why not elevate node management to a first class Kubernetes citizen? Why not create a Kubernetes native resource that manages machines just like pods? Such an approach is brought to you by the Machine Controller Manager (aka MCM), which, of course, is an open sourced project. MCM gives you the following benefits:\n seamlessly manage machines/nodes with a declarative API (of course, across different cloud providers) integrate generically with the cluster autoscaler plugin with tools such as the node-problem-detector transport the immutability design principle to machine/nodes implement e.g. rolling upgrades of machines/nodes Machine Controller Manager aka MCM Machine Controller Manager is a group of cooperative controllers that manage the lifecycle of the worker machines. It is inspired by the design of Kube Controller Manager in which various sub controllers manage their respective Kubernetes Clients.\nMachine Controller Manager reconciles a set of Custom Resources namely MachineDeployment, MachineSet and Machines which are managed \u0026 monitored by their controllers MachineDeployment Controller, MachineSet Controller, Machine Controller respectively along with another cooperative controller called the Safety Controller.\nUnderstanding the sub-controllers and Custom Resources of MCM The Custom Resources MachineDeployment, MachineSet and Machines are very much analogous to the native K8s resources of Deployment, ReplicaSet and Pods respectively. So, in the context of MCM:\n MachineDeployment provides a declarative update for MachineSet and Machines. 
MachineDeployment Controller reconciles the MachineDeployment objects and manages the lifecycle of MachineSet objects. MachineDeployment consumes a provider specific MachineClass in its spec.template.spec, which is the template of the VM spec that would be spawned on the cloud by MCM. MachineSet ensures that the specified number of Machine replicas are running at a given point of time. MachineSet Controller reconciles the MachineSet objects and manages the lifecycle of Machine objects. Machines are the actual VMs running on the cloud platform provided by one of the supported cloud providers. Machine Controller is the controller that actually communicates with the cloud provider to create/update/delete machines on the cloud. There is a Safety Controller responsible for handling the unidentified or unknown behaviours from the cloud providers. Along with the above Custom Controllers and Resources, MCM requires the MachineClass to use K8s Secret that stores cloudconfig (initialization scripts used to create VMs) and cloud specific credentials. Workings of MCM Figure 1: In-Tree Machine Controller Manager In MCM, there are two K8s clusters in the scope — a Control Cluster and a Target Cluster. The Control Cluster is the K8s cluster where the MCM is installed to manage the machine lifecycle of the Target Cluster. In other words, the Control Cluster is the one where the machine-* objects are stored. The Target Cluster is where all the node objects are registered. These clusters can be two distinct clusters or the same cluster, whichever fits.\nWhen a MachineDeployment object is created, the MachineDeployment Controller creates the corresponding MachineSet object. The MachineSet Controller in-turn creates the Machine objects. 
The Machine Controller then talks to the cloud provider API and actually creates the VMs on the cloud.\nThe cloud initialization script that is introduced into the VMs via the K8s Secret consumed by the MachineClasses talks to the KCM (K8s Controller Manager) and creates the node objects. After registering themselves to the Target Cluster, nodes start sending health signals to the machine objects. That is when MCM updates the status of the machine object from Pending to Running.\nMore on Safety Controller Safety Controller contains the following functions:\nOrphan VM Handling It lists all the VMs in the cloud that match the tag of the given cluster name and maps the VMs to the Machine objects using the ProviderID field. VMs without any backing Machine objects are logged and deleted after confirmation. This handler runs every 30 minutes and is configurable via the --machine-safety-orphan-vms-period flag. Freeze Mechanism Safety Controller freezes the MachineDeployment and MachineSet controller if the number of Machine objects goes beyond a certain threshold on top of the Spec.Replicas. It can be configured by the flag --safety-up or --safety-down and also --machine-safety-overshooting-period. Safety Controller freezes the functionality of the MCM if either the target-apiserver or the control-apiserver is not reachable. Safety Controller unfreezes the MCM automatically once the situation is back to normal. A freeze label is applied on MachineDeployment/MachineSet to enforce the freeze condition. Evolution of MCM from In-Tree to Out-of-Tree (OOT) MCM supports declarative management of machines in a K8s Cluster on various cloud providers like AWS, Azure, GCP, AliCloud, OpenStack, Metal-stack, Packet, KubeVirt, VMWare, Yandex. 
It can, of course, be easily extended to support other cloud providers.\nGoing ahead, having the implementation of the Machine Controller Manager supporting too many cloud providers would be too much upkeep from both a development and a maintenance point of view. Which is why the Machine Controller component of MCM has been moved to Out-of-Tree design, where the Machine Controller for each respective cloud provider runs as an independent executable, even though typically packaged under the same deployment.\nFigure 2: Out-Of-Tree (OOT) Machine Controller Manager This OOT Machine Controller will implement a common interface to manage the VMs on the respective cloud provider. Now, while the Machine Controller deals with the Machine objects, the Machine Controller Manager (MCM) deals with higher level objects such as the MachineSet and MachineDeployment objects.\nA lot of contributions are already being made towards an OOT Machine Controller Manager for various cloud providers. Below are the links to the repositories:\n Out of Tree Machine Controller Manager for AliCloud Out of Tree Machine Controller Manager for AWS Out of Tree Machine Controller Manager for Azure Out of Tree Machine Controller Manager for GCP Out of Tree Machine Controller Manager for KubeVirt Out of Tree Machine Controller Manager for Metal Out of Tree Machine Controller Manager for vSphere Out of Tree Machine Controller Manager for Yandex Watch the Out of Tree Machine Controller Manager video on our Gardener Project YouTube channel to understand more about OOT MCM.\nWho Uses MCM? Gardener\nMCM is originally developed and employed by a K8s Control Plane as a Service called Gardener. However, the MCM’s design is elegant enough to be employed when managing the machines of any independent K8s clusters, without having to necessarily associate it with Gardener.\nMetal Stack\nMetal-stack is a set of microservices that implements Metal as a Service (MaaS). 
It enables you to turn your hardware into elastic cloud infrastructure. Metal-stack employs the adopted Machine Controller Manager to their Metal API. Check out an introduction to it in metal-stack - kubernetes on bare metal.\nSky UK Limited\nSky UK Limited (a broadcaster) migrated their Kubernetes node management from Ansible to Machine Controller Manager. Check out the How Sky is using Machine Controller Manager (MCM) and autoscaler video on our Gardener Project YouTube channel.\nAlso, other interesting use cases with MCM are implemented by Kubernetes enthusiasts, who for example adjusted the Machine Controller Manager to provision machines in the cloud to extend a local Raspberry-Pi K3s cluster. This topic is covered in detail in the 2020-07-03 Gardener Community Meeting on our Gardener Project YouTube channel.\nConclusion Machine Controller Manager is the leading automation tool for machine management for, and in, Kubernetes. And the best part is that it is open sourced. It is freely (and easily) usable and extensible, and the community more than welcomes contributions.\nIf you want to know more about Machine Controller Manager or find out about a similar scope for your solutions, feel free to visit the GitHub page machine-controller-manager. We are so excited to see what you achieve with Machine Controller Manager.\n","categories":"","description":"","excerpt":"Kubernetes is a cloud-native enabler built around the principles for a …","ref":"/blog/2021/01.25-machine-controller-manager/","tags":"","title":"Machine Controller Manager"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2020/","tags":"","title":"2020"},{"body":"STACKIT is a digital brand of Europe’s biggest retailer, the Schwarz Group, which consists of Lidl, Kaufland, as well as production and recycling companies. Following the industry trend, the Schwarz Group is in the process of a digital transformation. 
STACKIT enables this transformation by helping to modernize the internal IT of the company branches.\nWhat is STACKIT and the STACKIT Kubernetes Engine (SKE)? STACKIT started with colocation solutions for internal and external customers in Europe-based data centers, which was then expanded to a full cloud platform stack providing an IaaS layer with VMs, storage and network, as well as a PaaS layer including Cloud Foundry and a growing set of cloud services, like databases, messaging, etc.\nWith containers and Kubernetes becoming the lingua franca of the cloud, we are happy to announce the STACKIT Kubernetes Engine (SKE), which was released as Beta in November this year. We decided to use Gardener as the cluster management engine underneath SKE – for good reasons, as you will see – and we would like to share our experiences with Gardener when working on the SKE Beta release, and serve as a testimonial for this technology.\nFigure 1: STACKIT Component Diagram Why We Chose Gardener as a Cluster Management Tool We started with the Kubernetes endeavor in the beginning of 2020 with a newly formed agile team that consisted of software engineers, highly experienced in IT operations and development. After some exploration and a short conceptual phase, we had a clear-cut opinion on what the cluster management for STACKIT should look like: we were looking for a highly customizable tool that could be adapted to the specific needs of STACKIT and the Schwarz Group, e.g. in terms of network setup or the infrastructure layer it should be running on. Moreover, the tool should be scalable to a high number of managed Kubernetes clusters and should therefore provide a fully automated operation experience. With an open source project, contributing to and influencing the tool, as well as collaborating with a larger community, were important aspects that motivated us. Furthermore, we aimed to offer cluster management as a self-service in combination with an excellent user experience. 
Our objective was to have the managed clusters come with enterprise-grade SLAs – i.e. with “batteries included”, as some say.\nWith this mission, we started our quest through the world of Kubernetes and soon found Gardener to be a hot candidate of cluster management tools that seemed to fulfill our demands. We quickly got in contact and received a warm welcome from the Gardener community. As an interested potential adopter, but in the early days of the COVID-19 lockdown, we managed to organize an online workshop during which we got an introduction and deep dive into Gardener and discussed the STACKIT use cases. We learned that Gardener is extensible in many dimensions, and that contributions are always welcome and encouraged. Once we understood the basic Gardener concepts of Garden, Shoot and Seed clusters, its inception design and how this extends Kubernetes concepts in a natural way, we were eager to evaluate this tool in more detail.\nAfter this evaluation, we were convinced that this tool fulfilled all our requirements - a decision was made and off we went.\nHow Gardener was Adapted and Extended by SKE After becoming familiar with Gardener, we started to look into its code base to adapt it to the specific needs of the STACKIT OpenStack environment. Changes and extensions were made in order to get it integrated into the STACKIT environment, and whenever reasonable, we contributed those changes back:\n To run smoothly with the STACKIT OpenStack layer, the Gardener configuration was adapted in different places, e.g. to support CSI driver or to configure the domains of a shoot API server or ingress. Gardener was extended to support shoots and shooted seeds in dual stack and dual home setup. This is used in SKE for the communication between shooted seeds and the Garden cluster. SKE uses a private image registry for the Gardener installation in order to resolve dependencies to public image registries and to have more control over the used Gardener versions. 
To install and run Gardener with the private image registry, some new configurations had to be introduced into Gardener. Gardener is a first-class API based service, which allowed us to smoothly integrate it into the STACKIT User Interface. We were also able to jump-start and utilize the Gardener Dashboard for our Beta release by merely adjusting the look-\u0026-feel, i.e. colors, labels and icons. Figure 2: Gardener Dashboard adapted to STACKIT UI style Experience with Gardener Operations As no two OpenStack installations are identical, getting Gardener to run stably on the STACKIT IaaS layer revealed some operational challenges. For instance, it was challenging to find the right configuration for Cinder CSI.\nTo test its resilience, we tried to break the managed clusters with a Chaos Monkey test, e.g. by deleting services or components needed by Kubernetes and Gardener to work properly. The reconciliation feature of Gardener fixed all those problems automatically, so that damaged Shoot clusters became operational again after a short period of time. Thus, we were not able to permanently break Shoot clusters from an end user perspective, despite our efforts. This again speaks for Gardener’s first-class cloud native design.\nWe also benefited from fruitful community support: For several challenges we contacted the community channel and help was provided in a timely manner. A lesson learned was that raising an issue in the community early on, before getting stuck too long on your own with an unresolved problem, is essential and efficient.\nSummary Gardener is used by SKE to provide a managed Kubernetes offering for internal use cases of the Schwarz Group as well as for the public cloud offering of STACKIT. Thanks to Gardener, it was possible to get from zero to a Beta release in only about half a year’s time – this speaks for itself. Within this period, we were able to integrate Gardener into the STACKIT environment, i.e. 
in its OpenStack IaaS layer, its management tools and its identity provisioning solution.\nGardener has become a vital building block in STACKIT’s cloud native platform offering. For the future, the possibility to manage clusters also on other infrastructures and hyperscalers is seen as another great opportunity for extended use cases. The open co-innovation exchange with the Gardener community member companies has also opened the door to commercial co-operation.\n","categories":"","description":"","excerpt":"STACKIT is a digital brand of Europe’s biggest retailer, the Schwarz …","ref":"/blog/2020/12.03-stackit-kubernetes-engine-with-gardener/","tags":"","title":"STACKIT Kubernetes Engine with Gardener"},{"body":"Prerequisites Please read the following background material on Authenticating.\nOverview Kubernetes on its own doesn’t provide any user management. In other words, users aren’t managed through Kubernetes resources. Whenever you refer to a human user it’s sufficient to use a unique ID, for example, an email address. Nevertheless, Gardener project owners can use an identity provider to authenticate user access for shoot clusters in the following way:\n Configure an Identity Provider using OpenID Connect (OIDC). Configure a local kubectl oidc-login to enable oidc-login. Configure the shoot cluster to share details of the OIDC-compliant identity provider with the Kubernetes API Server. Authorize an authenticated user using role-based access control (RBAC). Verify the result Note Gardener allows administrators to modify aspects of the control plane setup. It gives administrators full control of how the control plane is parameterized. While this offers much flexibility, administrators need to ensure that they don’t configure a control plane that goes beyond the service level agreements of the responsible operators team. Configure an Identity Provider Create a tenant in an OIDC compatible Identity Provider. 
For simplicity, we use Auth0, which has a free plan.\n In your tenant, create a client application to use authentication with kubectl:\n Provide a Name, choose Native as application type, and choose CREATE.\n In the tab Settings, copy the following parameters to a local text file:\n Domain\nCorresponds to the issuer in OIDC. It must be an https-secured endpoint (Auth0 requires a trailing / at the end). For more information, see Issuer Identifier.\n Client ID\n Client Secret\n Configure the client to have a callback url of http://localhost:8000. This callback connects to your local kubectl oidc-login plugin:\n Save your changes.\n Verify that https://\u003cAuth0 Domain\u003e/.well-known/openid-configuration is reachable.\n Choose Users \u0026 Roles \u003e Users \u003e CREATE USERS to create a user with a user and password:\n Note Users must have a verified email address. Configure a Local kubectl oidc-login Install the kubectl plugin oidc-login. We highly recommend the krew installation tool, which also makes other plugins easily available.\nkubectl krew install oidc-login The response looks like this:\nUpdated the local copy of plugin index. Installing plugin: oidc-login CAVEATS: \\ | You need to setup the OIDC provider, Kubernetes API server, role binding and kubeconfig. | See https://github.com/int128/kubelogin for more. / Installed plugin: oidc-login Prepare a kubeconfig for later use:\ncp ~/.kube/config ~/.kube/config-oidc Modify the configuration of ~/.kube/config-oidc as follows:\napiVersion: v1 kind: Config ... contexts: - context: cluster: shoot--project--mycluster user: my-oidc name: shoot--project--mycluster ... 
users: - name: my-oidc user: exec: apiVersion: client.authentication.k8s.io/v1beta1 command: kubectl args: - oidc-login - get-token - --oidc-issuer-url=https://\u003cIssuer\u003e/ - --oidc-client-id=\u003cClient ID\u003e - --oidc-client-secret=\u003cClient Secret\u003e - --oidc-extra-scope=email,offline_access,profile To test our OIDC-based authentication, the context shoot--project--mycluster of ~/.kube/config-oidc is used in a later step. For now, continue to use the configuration ~/.kube/config with administration rights for your cluster.\nConfigure the Shoot Cluster Modify the shoot cluster YAML as follows, using the client ID and the domain (as issuer) from the settings of the client application you created in Auth0:\nkind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: mycluster namespace: garden-project ... spec: kubernetes: kubeAPIServer: oidcConfig: clientID: \u003cClient ID\u003e issuerURL: \"https://\u003cIssuer\u003e/\" usernameClaim: email This change of the Shoot manifest triggers a reconciliation. Once the reconciliation is finished, your OIDC configuration is applied. It doesn’t invalidate other certificate-based authentication methods. Wait for Gardener to reconcile the change. It can take up to 5 minutes.\nAuthorize an Authenticated User In Auth0, you created a user with a verified email address, test@test.com in our example. 
For simplicity, we authorize a single user identified by this email address with the cluster role view:\napiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: viewer-test roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: view subjects: - apiGroup: rbac.authorization.k8s.io kind: User name: test@test.com As administrator, apply the cluster role binding in your shoot cluster.\nVerify the Result To step into the shoes of your user, use the prepared kubeconfig file ~/.kube/config-oidc, and switch to the context that uses oidc-login:\ncd ~/.kube export KUBECONFIG=$(pwd)/config-oidc kubectl config use-context shoot--project--mycluster kubectl delegates the authentication to the oidc-login plugin the first time the user uses kubectl to contact the API server, for example:\nkubectl get all The plugin opens a browser for an interactive authentication session with Auth0, and in parallel serves a local webserver for the configured callback.\n Enter your login credentials.\nYou should get a successful response from the API server:\nOpening in existing browser session. NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 100.64.0.1 \u003cnone\u003e 443/TCP 86m Note After a successful login, kubectl uses a token for authentication so that you don’t have to provide user and password for every new kubectl command. How long the token is valid can be configured. If you want to log in again earlier, reset the oidc-login plugin:\n Delete the directory ~/.kube/cache/oidc-login. Delete the browser cache. 
To see if your user uses the cluster role view, do some checks with kubectl auth can-i.\n The response for the following commands should be no:\nkubectl auth can-i create clusterrolebindings kubectl auth can-i get secrets kubectl auth can-i describe secrets The response for the following commands should be yes:\nkubectl auth can-i list pods kubectl auth can-i get pods If the last step is successful, you’ve configured your cluster to authenticate against an identity provider using OIDC.\nRelated Links Auth0 Pricing ","categories":"","description":"Use OpenID Connect to authenticate users to access shoot clusters","excerpt":"Use OpenID Connect to authenticate users to access shoot clusters","ref":"/docs/guides/administer-shoots/oidc-login/","tags":"","title":"Authenticating with an Identity Provider"},{"body":"Dear community, we’re happy to announce a new minor release of Gardener, in fact, the 16th in 2020! v1.13 came out just today after a couple of weeks of code improvements and feature implementations. As usual, this blog post provides brief summaries for the most notable changes that we introduce with this version. Behind the scenes (and not explicitly highlighted below) we are progressing on internal code restructurings and refactorings to ease further extensions and to enhance development productivity. Speaking of those: You might be interested in watching the recording of the last Gardener Community Meeting which includes a detailed session for v2 of Terraformer, a complete rewrite in Golang, and improved state handling.\nNotable Changes in v1.13 The main themes of Gardener’s v1.13 release are increments for feature gate promotions, scalability and robustness, and cleanups and refactorings. 
The community plans to continue on those and wants to deliver at least one more release in 2020.\nAutomatic Quotas for Gardener Resources (gardener/gardener#3072) Gardener has supported ResourceQuotas since the last release; however, it was still up to operators/administrators to create these objects in project namespaces. Obviously, in large Gardener installations with thousands of projects, this is quite a challenging task. With this release, we are shipping an improvement in the Project controller in the gardener-controller-manager that allows operators to automatically create ResourceQuotas based on configuration. Operators can distinguish via project label selectors which default quotas shall be defined for various projects. Please find more details at Gardener Controller Manager!\nResource Capacity and Reservations for Seeds (gardener/gardener#3075) The larger the Gardener landscape, the more seed clusters you require. Naturally, they have limits on how many shoots they can accommodate (based on constraints of the underlying infrastructure provider and/or seed cluster configuration). Until this release, there were no means to prevent a seed cluster from becoming overloaded (and potentially dying due to this load). Now you can define resource capacity and reservations in the gardenlet’s component configuration, similar to how the kubelet announces allocatable resources for Node objects. We are defaulting this to 250 shoots, but you might want to adapt this value for your own environment.\nDistributed Gardenlet Rollout for Shooted Seeds (gardener/gardener#3135) With the same motivation, i.e., to cater better to large landscapes, we allow operators to configure distributed rollouts of gardenlets for shooted seeds. When a new Gardener version was deployed in landscapes with a high number of shooted seeds, gardenlets of earlier versions immediately re-deployed copies of themselves into the shooted seeds they manage. 
This leads to a large number of new gardenlet pods that all roughly start at the same time. Depending on the size of the landscape, this may trouble the gardener-apiservers as all of them are starting to fill their caches and create watches at the same time. By default, this rollout is now randomized within a 5m time window, i.e., it may take up to 5m until all gardenlets in all seeds have been updated.\nProgressing on Beta-Promotion for APIServerSNI Feature Gate (gardener/gardener#3082, gardener/gardener#3143) The alpha APIServerSNI feature will drastically reduce the costs for load balancers in the seed clusters, thus, it is effectively contributing to Gardener’s “minimal TCO” goal. In this release we are introducing an important improvement that optimizes the connectivity when pods talk to their control plane by avoiding an extra network hop. This is realized by a MutatingWebhookConfiguration whose server runs as a sidecar container in the kube-apiserver pod in the seed (only when the APIServerSNI feature gate is enabled). The webhook injects a KUBERNETES_SERVICE_HOST environment variable into pods in the shoot which prevents the additional network hop to the apiserver-proxy on all worker nodes. You can read more about it in APIServerSNI environment variable injection.\nMore Control Plane Configurability (gardener/gardener#3141, gardener/gardener#3139) A main capability beloved by Gardener users is its openness when it comes to configurability and fine-tuning of the Kubernetes control plane components. Most managed Kubernetes offerings are not exposing options of the master components, but Gardener’s Shoot API offers a selected set of settings. With this release we are allowing to change the maximum number of (non-)mutating requests for the kube-apiserver of shoot clusters. 
Similarly, the grace period before deleting pods on failed nodes can now be fine-tuned for the kube-controller-manager.\nImproved Project Resource Handling (gardener/gardener#3137, gardener/gardener#3136, gardener/gardener#3179) Projects are an important resource in the Gardener ecosystem as they enable collaboration with team members. A couple of improvements have landed in this release. Firstly, duplicates in the member list were not validated so far. With this release, the gardener-apiserver is automatically merging them, and in future releases requests with duplicates will be denied. Secondly, specific Projects may now be excluded from the stale checks if desired. Lastly, namespaces for Projects that were adopted (i.e., those that already existed before the Project) will now no longer be deleted when the Project is being deleted. Please note that this only applies to newly created Projects.\nRemoval of Deprecated Labels and Annotations (gardener/gardener#3094) The core.gardener.cloud API group succeeded the old garden.sapcloud.io API group at the beginning of 2020; however, a lot of labels and annotations with the old API group name were still supported. We have continued with the process of removing those deprecated (but replaced with the new API group name) names. Concretely, the project labels garden.sapcloud.io/role=project and project.garden.sapcloud.io/name=\u003cproject-name\u003e are no longer supported. Similarly, the shoot.garden.sapcloud.io/use-as-seed and shoot.garden.sapcloud.io/ignore-alerts annotations have been removed. We are not finished yet, but we do small increments and plan to progress on the topic until we finally get rid of all artifacts with the old API group name.\nNodeLocalDNS Network Policy Rules Adapted (gardener/gardener#3184) The alpha NodeLocalDNS feature was already introduced and explained with Gardener v1.8 with the motivation to overcome certain bottlenecks with the horizontally auto-scaled CoreDNS in all shoot clusters. 
Unfortunately, due to a bug in the network policy rules, it was not working in all environments. We have fixed this one now, so it should be ready for further tests and investigations. Come give it a try!\nPlease bear in mind that this blog post only highlights the most noticeable changes and improvements, but there is a whole bunch more, including a ton of bug fixes in older versions! Come check out the full release notes and share your feedback in our #gardener Slack channel!\n","categories":"","description":"","excerpt":"Dear community, we’re happy to announce a new minor release of …","ref":"/blog/2020/11.23-gardener-v1.13-released/","tags":"","title":"Gardener v1.13 Released"},{"body":" This is a guest commentary from metal-stack.\nmetal-stack is software that provides an API for provisioning and managing physical servers in the data center. To categorize this product, the terms “Metal-as-a-Service” (MaaS) or “bare metal cloud” are commonly used. One reason that you stumbled upon this blog post could be that you saw errors like the following in your ETCD instances:\netcd-main-0 etcd 2020-09-03 06:00:07.556157 W | etcdserver: read-only range request \"key:\\\"/registry/deployments/shoot--pwhhcd--devcluster2/kube-apiserver\\\" \" with result \"range_response_count:1 size:9566\" took too long (13.95374909s) to execute As it turns out, 14 seconds is way too slow for running Kubernetes API servers. It makes them go into a crash loop (leader election fails). Even worse, this whole thing is self-amplifying: The longer a response takes, the more requests queue up, leading to response times increasing further and further. The system is very unlikely to recover. 😞\nOn GitHub, you can easily find the reason for this problem. Most probably your disks are too slow (see etcd-io/etcd#10860). So, when you are (like in our case) on GKE and run your ETCD on their default persistent volumes, consider moving from standard disks to SSDs and the error messages should disappear. 
A guide on how to use SSD volumes on GKE can be found at Using SSD persistent disks.\nCase closed? Well. For some people it might be. But when you are seeing this in your Gardener infrastructure, it’s likely that there is something going wrong. ETCD management is fully handled by Gardener, which makes the problem a bit more interesting to look at. This blog post strives to cover topics such as:\n Gardener operating principles Gardener architecture and ETCD management Pitfalls with multi-cloud environments Migrating GCP volumes to a new storage class We from metal-stack learned quite a lot about the capabilities of Gardener through this problem. We are happy to share this experience with a broader audience. Gardener adopters and operators, read on.\nHow Gardener Manages ETCDs In our infrastructure, we use Gardener to provision Kubernetes clusters on bare metal machines in our own data centers using metal-stack. Even though the entire stack could run on-premises, our initial seed cluster and the metal control plane are hosted on GKE. This way, we do not need to manage a single Kubernetes cluster in our entire landscape manually. As soon as we have Gardener deployed on this initial cluster, we can spin up further Seeds in our own data centers through the concept of ManagedSeeds.\nTo make this easier to understand, let us give you a simplified picture of what our Gardener production setup looks like:\nFigure 1: Simplified View on Our Production Setup For every shoot cluster, Gardener deploys an individual, standalone ETCD as a stateful set into a shoot namespace. The deployment of the ETCD stateful set is managed by a controller called etcd-druid, which reconciles a special resource of the kind etcds.druid.gardener.cloud. This Etcd resource is deployed during the shoot provisioning flow in the gardenlet.\nFor failure-safety, the etcd-druid deploys the official ETCD container image along with a sidecar project called etcd-backup-restore. 
The sidecar automatically takes backups of the ETCD and stores them at a cloud provider, e.g. in S3 Buckets, Google Buckets, or similar. In case the ETCD comes up without or with corrupted data, the sidecar looks into the backup buckets and automatically restores the latest backup before ETCD starts up. This entire approach basically takes away the pain for operators to manually have to restore data in the event of data loss.\nNote We found the etcd-backup-restore project very intriguing. It was the inspiration for us to come up with a similar sidecar for the databases we use with metal-stack. This project is called backup-restore-sidecar. We can cope with postgres and rethinkdb database at the moment and more to come. Feel free to check it out when you are interested. As it’s the nature for multi-cloud applications to act upon a variety of cloud providers, with a single installation of Gardener, it is easily possible to spin up new Kubernetes clusters not only on GCP, but on other supported cloud platforms, too.\nWhen the Gardenlet deploys a resource like the Etcd resource into a shoot namespace, a provider-specific extension-controller has the chance to manipulate it through a mutating webhook. This way, a cloud provider can adjust the generic Gardener resource to fit the provider-specific needs. For every cloud that Gardener supports, there is such an extension-controller. For metal-stack, we also maintain one, called gardener-extension-provider-metal.\nNote A side note for cloud providers: Meanwhile, new cloud providers can be added fully out-of-tree, i.e. without touching any of Gardener’s sources. This works through API extensions and CRDs. Gardener handles generic resources and backpacks provider-specific configuration through raw extensions. When you are a cloud provider on your own, this is really encouraging because you can integrate with Gardener without any burdens. 
You can find documentation on how to integrate your cloud into Gardener at Adding Cloud Providers and Extensibility Overview. The Mistake Is in the Deployment This section contains code examples from Gardener v1.8. Now that we know how the ETCDs are managed by Gardener, we can come back to the original problem from the beginning of this article. It turned out that the real problem was a misconfiguration in our deployment. Gardener actually does use SSD-backed storage on GCP for ETCDs by default. During reconciliation, the gardener-extension-provider-gcp deploys a storage class called gardener.cloud-fast that enables accessing SSDs on GCP.\nBut for some reason, in our cluster we did not find such a storage class. And even more interesting, we did not use the gardener-extension-provider-gcp for any shoot reconciliation, only for ETCD backup purposes. And that was the big mistake we made: We reconciled the shoot control plane completely with gardener-extension-provider-metal even though our initial Seed actually runs on GKE and specific parts of the shoot control plane should be reconciled by the GCP extension-controller instead!\nThis is what the initial Seed resource looked like:\napiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: initial-seed spec: ... provider: region: gke type: metal ... ... Surprisingly, this configuration was working pretty well for a long time. The initial seed properly produced the Kubernetes control planes of our managed seeds that looked like this:\n$ kubectl get controlplanes.extensions.gardener.cloud NAME TYPE PURPOSE STATUS AGE fra-equ01 metal Succeeded 85d fra-equ01-exposure metal exposure Succeeded 85d And this is another interesting observation: There are two ControlPlane resources. One regular resource and one with an exposure purpose. Gardener distinguishes between two types for this exact reason: Environments where the shoot control plane runs on a different cloud provider than the Kubernetes worker nodes. 
The regular ControlPlane resource gets reconciled by the provider configured in the Shoot resource, and the exposure type ControlPlane by the provider configured in the Seed resource.\nWith the existing configuration the gardener-extension-provider-gcp does not kick in and hence, it neither deploys the gardener.cloud-fast storage class nor does it mutate the Etcd resource to point to it. And in the end, we are left with ETCD volumes using the default storage class (which is what we do for ETCD stateful sets in the metal-stack seeds, because our default storage class uses csi-lvm that writes into logical volumes on the SSD disks in our physical servers).\nThe correction we had to make was a one-liner: Setting the provider type of the initial Seed resource to gcp.\n$ kubectl get seed initial-seed -o yaml apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: initial-seed spec: ... provider: region: gke type: gcp # \u003c-- here ... ... This change moved over the control plane exposure reconciliation to the gardener-extension-provider-gcp:\n$ kubectl get -n \u003cshoot-namespace\u003e controlplanes.extensions.gardener.cloud NAME TYPE PURPOSE STATUS AGE fra-equ01 metal Succeeded 85d fra-equ01-exposure gcp exposure Succeeded 85d And boom, after some time of waiting for all sorts of magic reconciliations taking place in the background, the missing storage class suddenly appeared:\n$ kubectl get sc NAME PROVISIONER gardener.cloud-fast kubernetes.io/gce-pd standard (default) kubernetes.io/gce-pd Also, the Etcd resource was now configured properly to point to the new storage class:\n$ kubectl get -n \u003cshoot-namespace\u003e etcd etcd-main -o yaml apiVersion: druid.gardener.cloud/v1alpha1 kind: Etcd metadata: ... name: etcd-main spec: ... storageClass: gardener.cloud-fast # \u003c-- was pointing to default storage class before! volumeClaimTemplate: main-etcd ... Note Only the etcd-main storage class gets changed to gardener.cloud-fast. 
The etcd-events configuration will still point to standard disk storage because this ETCD is much less occupied than the etcd-main stateful set. The Migration Even though the deployment was now in place such that this mistake would not repeat in the future, we still had the ETCDs running on the default storage class. The reconciliation does not delete the existing persistent volumes (PVs) on its own.\nTo bring production back up quickly, we temporarily moved the ETCD pods to other nodes in the GKE cluster. These were nodes which were less occupied, such that the disk throughput was a little higher than before. But surely that was not a final solution.\nFor a proper solution we had to move the ETCD data out of the standard disk PV into an SSD-based PV.\nEven though we had the etcd-backup-restore sidecar, we did not want to fully rely on the restore mechanism to do the migration. The backup should only be there for emergency situations when something goes wrong. Thus, we came up with another approach to introduce the SSD volume: GCP disk snapshots. This is how we did the migration:\n Scale down etcd-druid to zero in order to prevent it from disturbing your migration Scale down the kube-apiserver deployment to zero, then wait for the ETCD stateful set to take another clean snapshot Scale down the ETCD stateful set to zero as well (in order to prevent Gardener from trying to bring up the downscaled resources, we used small shell constructs like while true; do kubectl scale deploy etcd-druid --replicas 0 -n garden; sleep 1; done) Take a drive snapshot in GCP from the volume that is referenced by the ETCD PVC Create a new disk in GCP from the snapshot on an SSD disk Delete the existing PVC and PV of the ETCD (oops, data is now gone!) 
Manually deploy a PV into your Kubernetes cluster that references this new SSD disk Manually deploy a PVC with the name of the original PVC and let it reference the PV that you have just created Scale up the ETCD stateful set and check that ETCD is running properly (if something went terribly wrong, you still have the backup from the etcd-backup-restore sidecar, delete the PVC and PV again and let the sidecar bring up ETCD instead) Scale up the kube-apiserver deployment again Scale up etcd-druid again (stop your shell hacks ;D) This approach worked very well for us and we were able to fix our production deployment issue. And what happened: We have never seen any crashing kube-apiservers again. 🎉\nConclusion As bad as problems in production are, they are the best way for learning from your mistakes. For new users of Gardener it can be pretty overwhelming to understand the rich configuration possibilities that Gardener brings. However, once you get a hang of how Gardener works, the application offers an exceptional versatility that makes it very much suitable for production use-cases like ours.\nThis example has shown how Gardener:\n Can handle arbitrary layers of infrastructure hosted by different cloud providers. Allows provider-specific tweaks to gain ideal performance for every cloud you want to support. Leverages Kubernetes core principles across the entire project architecture, making it vastly extensible and resilient. Brings useful disaster recovery mechanisms to your infrastructure (e.g. with etcd-backup-restore). We hope that you could take away something new through this blog post. With this article we also want to thank the SAP Gardener team for helping us to integrate Gardener with metal-stack. It’s been a great experience so far. 
😄 😍\n","categories":"","description":"In this case study, our friends from metal-stack lead you through their journey of migrating Gardener ETCD volumes in their production environment.","excerpt":"In this case study, our friends from metal-stack lead you through …","ref":"/blog/2020/11.20-case-study-migrating-etcd-volumes-in-production/","tags":"","title":"Case Study: Migrating ETCD Volumes in Production"},{"body":"Two months after our last Gardener release update, we are happy again to present release v1.11 and v1.12 in this blog post. Control plane migration, load balancer consolidation, and new security features are just a few topics we progressed with. As always, a detailed list of features, improvements, and bug fixes can be found in the release notes of each release. If you are going to update from a previous Gardener version, please take the time to go through the action items in the release notes.\nNotable Changes in v1.12 Release v1.12, fresh from the oven, is shipped with plenty of improvements, features, and some API changes we want to pick up in the next sections.\nDrop Functionless DNS Providers (gardener/gardener#3036) This release drops the support for the so-called functionless DNS providers. Those are providers in a shoot’s specification (.spec.dns.providers) which don’t serve the shoot’s domain (.spec.dns.domain), but are created by Gardener in the seed cluster to serve DNS requests coming from the shoot cluster. If such providers don’t specify a type or secretName, the creation or update request for the corresponding shoot is denied.\nSeed Taints (gardener/gardener#2955) In an earlier release, we reserved a dedicated section in seed.spec.settings as a replacement for disable-capacity-reservation, disable-dns, invisible taints. These already deprecated taints were still considered and synced, which gave operators enough time to switch their integration to the new settings field. 
As of version v1.12, support for them has been discontinued and they are automatically removed from seed objects. You may use the actual taint names in a future release of Gardener again.\nLoad Balancer Events During Shoot Reconciliation (gardener/gardener#3028) As Gardener is capable of managing thousands of clusters, it is crucial to keep operation efforts at a minimum. This release demonstrates this endeavor by further improving error reporting to the end user. During a shoot’s reconciliation, Gardener creates Services of type LoadBalancer in the shoot cluster, e.g., for the VPN or the Nginx-Ingress addon, and waits for a successful creation. However, in the past, we found that issues caused by the party creating the load balancer (typically Cloud-Controller-Manager) were only exposed in the logs or as events. Gardener now fetches these event messages and propagates them to the shoot status in case of a failure. Users can then often fix the problem themselves, if for example the failure discloses an exhausted quota on the cloud provider.\nKonnectivityTunnel Feature per Shoot (gardener/gardener#3007) Since release v1.6, Gardener has been capable of reversing the tunnel direction from the seed to the shoot via the KonnectivityTunnel feature gate. With this release we make it possible to control the feature per shoot. We recommend enabling the KonnectivityTunnel selectively, as it is still in alpha state.\nReference Protection (gardener/gardener#2771, gardener/gardener 1708419) Shoot clusters may refer to external objects, such as Secrets for specified DNS providers or an audit policy ConfigMap. Deleting those objects while any shoot still references them causes server errors, often only recoverable by an immense amount of manual operations effort. To prevent such scenarios, Gardener now adds a new finalizer gardener.cloud/reference-protection to these objects and removes it as soon as the object itself becomes releasable. 
Due to compatibility reasons, we decided that the handling for the audit policy ConfigMaps is delivered as an opt-in feature first, so please familiarize yourself with the necessary settings in the Gardener Controller Manager component config if you already plan to enable it.\nSupport for Resource Quotas (gardener/gardener#2627) After the Kubernetes upstream change (kubernetes/kubernetes#93537) for externalizing the backing admission plugin has been accepted, we are happy to announce the support of ResourceQuotas for Gardener offered resource kinds. ResourceQuotas allow you to specify a maximum number of objects per namespace, especially for end-user objects like Shoots or SecretBindings in a project namespace. Even though the admission plugin is enabled by default in the Gardener API Server, make sure the Kube Controller Manager runs the resourcequota controller as well.\nWatch Out Developers, Terraformer v2 is Coming! (gardener/gardener#3034) Although not related only to Gardener core, the preparation towards Terraformer v2 in the extensions library is still an important milestone to mention. With Terraformer v2, Gardener extensions using Terraform scripts will benefit from great consistency improvements. Please check out PR #3034, which demonstrates necessary steps to transition to Terraformer v2 as soon as it’s released.\nNotable Changes in v1.11 The Gardener community worked eagerly to deliver plenty of improvements with version v1.11. Those help us to further progress with topics like control plane migration, which is actively being worked on, or to harden our load balancer consolidation (APIServerSNI) feature. 
Besides improvements and fixes (full list available in release notes), this release contains major features as well, and we don’t want to miss a chance to walk you through them.\nGardener Admission Controller (gardener/gardener#2832), (gardener/gardener#2781) In this release, all admission related HTTP handlers moved from the Gardener Controller Manager (GCM) to the new component Gardener Admission Controller. The admission controller is rather a small component as opposed to GCM with regards to memory footprint and CPU consumption, and thus allows you to run multiple replicas of it much cheaper than it was before. We certainly recommend specifying the admission controller deployment with more than one replica, since it reduces the odds of a system-wide outage and increases the performance of your Gardener service.\nBesides the already known Namespace and Kubeconfig Secret validation, a new admission handler Resource-Size-Validator was added to the admission controller. It allows operators to restrict the size for all kinds of Kubernetes objects, especially sent by end-users to the Kubernetes or Gardener API Server. We address a security concern with this feature to prevent denial of service attacks in which an attacker artificially increases the size of objects to exhaust your object store, API server caches, or to let Gardener and Kubernetes controllers run out-of-memory. The documentation reveals an approach of finding the right resource size for your setup and why you should create exceptions for technical users and operators.\nDeferring Shoot Progress Reporting (gardener/gardener#2909) Shoot progress reporting is the continuous update process of a shoot’s .status.lastOperation field while the shoot is being reconciled by Gardener. Many steps are involved during reconciliation and depending on the size of your setup, the updates might become an issue for the Gardener API Server, which will refrain from processing further requests for a certain period. 
With .controllers.shoot.progressReportPeriod in Gardenlet’s component configuration, you can now delay these updates for the specified period.\nNew Policy for Controller Registrations (gardener/gardener#2896) A while ago, we added support for different policies in ControllerRegistrations which determine under which circumstances the deployments of registration controllers happen in affected seed clusters. If you specify the new policy AlwaysExceptNoShoots, the respective extension controller will be deployed to all seed clusters hosting at least one shoot cluster. After all shoot clusters from a seed are gone, the extension deployment will be deleted again. A full list of supported policies can be found at Registering Extension Controllers.\n","categories":"","description":"","excerpt":"Two months after our last Gardener release update, we are happy again …","ref":"/blog/2020/11.04-gardener-v1.11-and-v1.12-released/","tags":"","title":"Gardener v1.11 and v1.12 Released"},{"body":"The Gardener team is happy to announce that Gardener now offers support for an additional, often requested, infrastructure/virtualization technology, namely KubeVirt! Gardener can now provide Kubernetes-conformant clusters using KubeVirt managed Virtual Machines in the environment of your choice. This integration has been tested and works with any qualified Kubernetes (provider) cluster that is compatibly configured to host the required KubeVirt components, for example Red Hat OpenShift Virtualization.\nGardener enables Kubernetes consumers to centralize and efficiently operate homogenous Kubernetes clusters across different IaaS providers and even private environments. This way the same cloud-based application version can be hosted and operated by its vendor or consumer on a variety of infrastructures. When a new customer or your development team demands a new infrastructure provider, Gardener helps you to quickly and easily on-board your workload. 
Furthermore, on this new infrastructure, Gardener keeps the seamless Kubernetes management experience for your Kubernetes operators, while upholding the consistency of the CI/CD pipeline of your software development team.\nArchitecture and Workflow Gardener is based on the idea of three types of clusters – Garden cluster, Seed cluster and Shoot cluster (see Figure 1). The Garden cluster is used to control the entire Kubernetes environment centrally in a highly scalable design. The highly available seed clusters are used to host the end users (shoot) clusters’ control planes. Finally, the shoot clusters consist only of worker nodes to host the cloud native applications.\nFigure 1: Gardener Architecture An integration of the Gardener open source project with a new cloud provider follows a standard Gardener extensibility approach. The integration requires two new components: a provider extension and a Machine Controller Manager (MCM) extension. Both components together enable Gardener to instruct the new cloud provider. They run in the Gardener seed clusters that host the control planes of the shoots based on that cloud provider. The role of the provider extension is to manage the provider-specific aspects of the shoot clusters’ lifecycle, including infrastructure, control plane, worker nodes, and others. It works in cooperation with the MCM extension, which in particular is responsible to handle machines that are provisioned as worker nodes for the shoot clusters. To get this job done, the MCM extension leverages the VM management/API capabilities available with the respective cloud provider.\nSetting up a Kubernetes cluster always involves a flow of interdependent steps (see Figure 2), beginning with the generation of certificates and preparation of the infrastructure, continuing with the provisioning of the control plane and the worker nodes, and ending with the deployment of system components. 
Gardener can be configured to utilize the KubeVirt extensions in its generic workflow at the right extension points, and deliver the desired outcome of a KubeVirt backed cluster.\nFigure 2: Generic cluster reconciliation flow with extension points Gardener Integration with KubeVirt in Detail Integration with KubeVirt follows the Gardener extensibility concept and introduces the two new components mentioned above: the KubeVirt Provider Extension and the KubeVirt Machine Controller Manager (MCM) Extension.\nFigure 3: Gardener integration with KubeVirt The KubeVirt Provider Extension consists of three separate controllers that handle respectively the infrastructure, the control plane, and the worker nodes of the shoot cluster.\nThe Infrastructure Controller configures the network communication between the shoot worker nodes. By default, shoot worker nodes only use the provider cluster’s pod network. To achieve higher level of network isolation and better performance, it is possible to add more networks and replace the default pod network with a different network using container network interface (CNI) plugins available in the provider cluster. This is currently based on Multus CNI and NetworkAttachmentDefinitions.\nExample infrastructure configuration in a shoot definition:\nprovider: type: kubevirt infrastructureConfig: apiVersion: kubevirt.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: tenantNetworks: - name: network-1 config: |{ \"cniVersion\": \"0.4.0\", \"name\": \"bridge-firewall\", \"plugins\": [ { \"type\": \"bridge\", \"isGateway\": true, \"isDefaultGateway\": true, \"ipMasq\": true, \"ipam\": { \"type\": \"host-local\", \"subnet\": \"10.100.0.0/16\" } }, { \"type\": \"firewall\" } ] } default: true The Control Plane Controller deploys a Cloud Controller Manager (CCM). This is a Kubernetes control plane component that embeds cloud-specific control logic. 
As any other CCM, it runs the Node controller that is responsible for initializing Node objects, annotating and labeling them with cloud-specific information, obtaining the node’s hostname and IP addresses, and verifying the node’s health. It also runs the Service controller that is responsible for setting up load balancers and other infrastructure components for Service resources that require them.\nFinally, the Worker Controller is responsible for managing the worker nodes of the Gardener shoot clusters.\nExample worker configuration in a shoot definition:\nprovider: type: kubevirt workers: - name: cpu-worker minimum: 1 maximum: 2 machine: type: standard-1 image: name: ubuntu version: \"18.04\" volume: type: default size: 20Gi zones: - europe-west1-c For more information about configuring the KubeVirt Provider Extension as an end-user, see Using the KubeVirt provider extension with Gardener as end-user.\nEnabling Your Gardener Setup to Leverage a KubeVirt Compatible Environment The very first step required is to define the machine types (VM types) for VMs that will be available. This is achieved via the CloudProfile custom resource. 
The machine types configuration includes details such as CPU, GPU, memory, OS image, and more.\nExample CloudProfile custom resource:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: kubevirt spec: type: kubevirt providerConfig: apiVersion: kubevirt.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: ubuntu versions: - version: \"18.04\" sourceURL: \"https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img\" kubernetes: versions: - version: \"1.18.5\" machineImages: - name: ubuntu versions: - version: \"18.04\" machineTypes: - name: standard-1 cpu: \"1\" gpu: \"0\" memory: 4Gi volumeTypes: - name: default class: default regions: - name: europe-west1 zones: - name: europe-west1-b - name: europe-west1-c - name: europe-west1-d Once a machine type is defined, it can be referenced in shoot definitions. This information is used by the KubeVirt Provider Extension to generate MachineDeployment and MachineClass custom resources required by the KubeVirt MCM extension for managing the worker nodes of the shoot clusters during the reconciliation process.\nFor more information about configuring the KubeVirt Provider Extension as an operator, see Using the KubeVirt provider extension with Gardener as operator.\nKubeVirt Machine Controller Manager (MCM) Extension The KubeVirt MCM Extension is responsible for managing the VMs that are used as worker nodes of the Gardener shoot clusters using the virtualization capabilities of KubeVirt. This extension handles all necessary lifecycle management activities, such as machines creation, fetching, updating, listing, and deletion.\nThe KubeVirt MCM Extension implements the Gardener’s common driver interface for managing VMs in different cloud providers. 
As already mentioned, the KubeVirt MCM Extension uses MachineDeployments and MachineClasses (an abstraction layer that follows the Kubernetes native declarative approach) to get instructions from the KubeVirt Provider Extension about the required machines for the shoot worker nodes. Also, the cluster autoscaler integrates with the scale subresource of the MachineDeployment resource. This way, Gardener offers a homogeneous autoscaling experience across all supported providers.\nWhen a new shoot cluster is created or when a new worker node is needed for an existing shoot cluster, a new Machine will be created, and at that time, the KubeVirt MCM extension will create a new KubeVirt VirtualMachine in the provider cluster. This VirtualMachine will be created based on a set of configurations in the MachineClass that follows the specification of the KubeVirt provider.\nThe KubeVirt MCM Extension has two main components. The MachinePlugin is responsible for handling the machine objects, and the PluginSPI is in charge of making calls to the cloud provider interface, to manage its resources.\nFigure 4: KubeVirt MCM extension workflow and architecture As shown in Figure 4, the MachinePlugin receives a machine request from the MCM and starts its processing by decoding the request, doing partial validation, extracting the relevant information, and sending it to the PluginSPI.\nThe PluginSPI then creates, gets, or deletes VirtualMachines depending on the method called by the MachinePlugin. 
It extracts the kubeconfig of the provider cluster and handles all other required KubeVirt resources such as the secret that holds the cloud-init configurations, and DataVolumes that are mounted as disks to the VMs.\nSupported Environments The Gardener KubeVirt support is currently qualified on:\n KubeVirt v0.32.0 (and later) Red Hat OpenShift Container Platform 4.4 (and later) There are also plans for further improvements and new features, for example integration with CSI drivers for storage management. Details about the implementation progress can be found in the Gardener project on GitHub.\nYou can find further resources about the open source project Gardener at https://gardener.cloud.\n","categories":"","description":"","excerpt":"The Gardener team is happy to announce that Gardener now offers …","ref":"/blog/2020/10.19-gardener-integrates-with-kubevirt/","tags":"","title":"Gardener Integrates with KubeVirt"},{"body":"Do you want to understand how Gardener creates and updates Kubernetes clusters (Shoots)? Well, it’s complicated, but if you are not afraid of large diagrams and are a visual learner like me, this might be useful to you.\nIntroduction In this blog post I will share a technical diagram which attempts to tie together the various components involved when Gardener creates a Kubernetes cluster. I have created and curated the diagram, which visualizes the Shoot reconciliation flow since I started developing on Gardener. Aside from serving as a memory aid for myself, I created it in hopes that it may potentially help contributors to understand a core piece of the complex Gardener machinery. Please be advised that the diagram and components involved are large. Although it can be easily divided into multiple diagrams, I want to show all the components and connections in a single diagram to create an overview of the reconciliation flow.\nThe goal is to visualize the interactions of the components involved in the Shoot creation. 
It is not intended to serve as documentation of every component involved.\nBackground Taking a step back, the Gardener README states:\n In essence, Gardener is an extension API server that comes along with a bundle of custom controllers. It introduces new API objects in an existing Kubernetes cluster (which is called a garden cluster) in order to use them for the management of end-user Kubernetes clusters (which are called shoot clusters). These shoot clusters are described via declarative cluster specifications which are observed by the controllers. They will bring up the clusters, reconcile their state, perform automated updates and make sure they are always up and running.\n This means that Gardener, just like any Kubernetes controller, creates Kubernetes clusters (Shoots) using a reconciliation loop.\nThe Gardenlet contains the controller and reconciliation loop responsible for the creation, update, deletion, and migration of Shoot clusters (there are more, but we omit them in this article). In addition, the Gardener Controller Manager also reconciles Shoot resources, but only for seed-independent functionality such as Shoot hibernation, Shoot maintenance or quota control.\nThis blog post is about the reconciliation loop in the Gardenlet responsible for creating and updating Shoot clusters. The code can be found in the gardener/gardener repository. The reconciliation loops of the extension controllers can be found in their individual repositories.\nShoot Reconciliation Flow Diagram When Gardener creates a Shoot cluster, there are three conceptual layers involved: the Garden cluster, the Seed cluster and the Shoot cluster. Each layer represents a top-level section in the diagram (similar to a lane in a BPMN diagram).\nIt might seem confusing that the Shoot cluster itself is a layer, because the whole flow in the first place is about creating the Shoot cluster. 
I decided to introduce this separate layer to make a clear distinction between which resources exist in the Seed API server (managed by Gardener) and which in the Shoot API server (accessible by the Shoot owner).\nEach section contains several components. Components are mostly Kubernetes resources in a Gardener installation (e.g. the gardenlet deployment in the Seed cluster).\nThis is the list of components:\n(Virtual) Garden Cluster\n Gardener Extension API server Validating Provider Webhooks Project Namespace Seed Cluster\n Gardenlet Seed API server every Shoot Control Plane has a dedicated namespace in the Seed. Cloud Provider (owned by Stakeholder) Arguably part of the Shoot cluster but used by components in the Seed cluster to create the infrastructure for the Shoot. Gardener DNS extension Provider Extension (such as gardener-extension-provider-aws) Gardener Extension ETCD Druid Gardener Resource Manager Operating System Extension (such as gardener-extension-os-gardenlinux) Networking Extension (such as gardener-extension-networking-cilium) Machine Controller Manager ContainerRuntime extension (such as gardener-extension-runtime-gvisor) Shoot API server (in the Shoot Namespace in the Seed cluster) Shoot Cluster\n Cloud Provider Compute API (owned by Stakeholder) - for VM/Node creation. VM / Bare metal node hosted by Cloud Provider (in Stakeholder owned account). How to Use the Diagram The diagram:\n should be read from top to bottom - starting in the top left corner with the creation of the Shoot resource via the Gardener Extension API server. should not require an encompassing documentation / description. More detailed documentation on the components itself can usually be found in the respective repository. does not show which activities execute in parallel (many) and also does not describe the exact dependencies between the steps. This can be found out by looking at the source code. 
It however tries to put the activities in a logical order of execution during the reconciliation flow. Occasionally, there is an info box with additional information next to parts in the diagram that in my point of view require further explanation. Large example resource for the Gardener CRDs (e.g Worker CRD, Infrastructure CRD) are placed on the left side and are referenced by a dotted line (—–).\nBe aware that Gardener is an evolving project, so the diagram will most likely be already outdated by the time you are reading this. Nevertheless, it should give a solid starting point for further explorations into the details of Gardener.\nFlow Diagram The diagram can be found below and on GitHub. There are multiple formats available (svg, vsdx, draw.io, html).\nPlease open an issue or open a PR in the repository if information is missing or is incorrect. Thanks!\n\n","categories":"","description":"","excerpt":"Do you want to understand how Gardener creates and updates Kubernetes …","ref":"/blog/2020/10.19-shoot-reconciliation-details/","tags":"","title":"Shoot Reconciliation Details"},{"body":"Summer holidays aren’t over yet, still, the Gardener community was able to release two new minor versions in the past weeks. Despite being limited in capacity these days, we were able to reach some major milestones, like adding Kubernetes v1.19 support and the long-delayed automated gardenlet certificate rotation. Whilst we continue to work on topics related to scalability, robustness, and better observability, we agreed to adjust our focus a little more into the areas of development productivity, code quality and unit/integration testing for the upcoming releases.\nNotable Changes in v1.10 Gardener v1.10 was a comparatively small release (measured by the number of changes) but it comes with some major features!\nKubernetes 1.19 Support (gardener/gardener#2799) The newest minor release of Kubernetes is now supported by Gardener (and all the maintained provider extensions)! 
Predominantly, we have enabled CSI migration for OpenStack now that it got promoted to beta, i.e. 1.19 shoots will no longer use the in-tree Cinder volume provisioner. The CSI migration enablement for Azure got postponed (to at least 1.20) due to some issues that the Kubernetes community is trying to fix in the 1.20 release cycle. As usual, the 1.19 release notes should be considered before upgrading your shoot clusters.\nAutomated Certificate Rotation for gardenlet (gardener/gardener#2542) Similar to the kubelet, the gardenlet supports TLS bootstrapping when deployed into a new seed cluster. It will request a client certificate for the garden cluster using the CertificateSigningRequest API of Kubernetes and store the generated results in a Secret object in the garden namespace of its seed. These certificates are usually valid for one year. We have now added support for automatic renewals if the expiration dates are approaching.\nImproved Monitoring Alerts (gardener/gardener#2776) We have worked on a larger refactoring to improve reliability and accuracy of our monitoring alerts for both shoot control planes in the seed, as well as shoot system components running on worker nodes. The improvements are primarily for operators and should result in fewer false positive alerts. Also, the alerts should fire less frequently and are better grouped in order to reduce the overall number of alerts.\nSeed Deletion Protection (gardener/gardener#2732) Our validation has been improved to increase robustness and to guard against accidental mistakes. Earlier, it was possible to remove the use-as-seed annotation for shooted seeds or directly set the deletionTimestamp on Seed objects, despite the fact that they might still run shoot control planes. Seed deletion would not start in these cases, but it would disrupt the system unnecessarily and result in some unexpected behaviour. 
The Gardener API server now forbids such requests if the seeds are not completely empty yet.\nLogging Improvements for Loki (multiple PRs) After we released our large logging stack refactoring (from EFK to Loki) with Gardener v1.8, we have continued to work on reliability, quality and user feedback in general. We aren’t done yet, but Gardener v1.10 includes a bunch of improvements which will help to graduate the Logging feature gate to beta and GA, eventually.\nNotable Changes in v1.9 The v1.9 release contained tons of small improvements and adjustments in various areas of the code base and somewhat fewer major new features. However, we don’t want to miss the opportunity to highlight a few of them.\nCRI Validation in CloudProfiles (gardener/gardener#2137) A couple of releases back we introduced support for containerd and the ContainerRuntime extension API. The supported container runtimes are operating system specific, and until now it wasn’t possible for end-users to easily figure out whether they can enable containerd or other ContainerRuntime extensions for their shoots. With this change, Gardener administrators/operators can now provide that information in the .spec.machineImages section in the CloudProfile resource. This also allows for enhanced validation and prevents misconfigurations.\nNew Shoot Event Controller (gardener/gardener#2649) The shoot controllers in both the gardener-controller-manager and gardenlet fire several Events for some important operations (e.g., automated hibernation/wake-up due to hibernation schedule, automated Kubernetes/machine image version update during maintenance, etc.). Earlier, the only way to prolong the lifetime of these events was to modify the --event-ttl command line parameter of the garden cluster’s kube-apiserver. 
This came with the disadvantage that all events were kept for a longer time (not only those related to Shoots that an operator is usually interested in and ideally wants to store for a couple of days). The new shoot event controller allows operators to achieve this by deleting non-shoot events. This helps operators and end-users to better understand which changes were applied to their shoots by Gardener.\nEarly Deployment of the Logging Stack for New Shoots (gardener/gardener#2750) Since the first introduction of the Logging feature gate two years back, the logging stack was only deployed at the very end of the shoot creation. This had the disadvantage that control plane pod logs were not kept in case the shoot creation flow was interrupted before the logging stack could be deployed. In some situations, this prevented fetching relevant information about why a certain control plane component crashed. We now deploy the logging stack very early in the shoot creation flow to always have access to such information.\n","categories":"","description":"","excerpt":"Summer holidays aren’t over yet, still, the Gardener community was …","ref":"/blog/2020/09.11-gardener-v1.9-and-v1.10-released/","tags":"","title":"Gardener v1.9 and v1.10 Released"},{"body":"Even if we are in the midst of the summer holidays, a new Gardener release came out yesterday: v1.8.0! Its main themes are the large change of our logging stack to Loki (which was already explained in detail in a blog post on grafana.com), more configuration options to optimize the utilization of a shoot, node-local DNS, new project roles, and significant improvements for the Kubernetes client that Gardener uses to interact with the many different clusters.\nNotable Changes Logging 2.0: EFK Stack Replaced by Loki (gardener/gardener#2515) For about two years, Gardener could optionally provision a dedicated logging stack per seed and per shoot which was based on fluent-bit, fluentd, ElasticSearch and Kibana. 
This feature was still hidden behind an alpha-level feature gate and never got promoted to beta so far. Due to various limitations of this solution, we decided to replace the EFK stack with Loki. As we already have Prometheus and Grafana deployments for both users and operators by default for all clusters, the choice was just natural. Please find out more on this topic at this dedicated blog post.\nCluster Identities and DNSOwner Objects (gardener/gardener#2471, gardener/gardener#2576) The shoot control plane migration topic is ongoing since a few months already, and we are very much progressing with it. A first alpha version will probably make it out soon. As part of these endeavors, we introduced cluster identities and the usage of DNSOwner objects in this release. Both are needed to gracefully migrate the DNSEntry extension objects from the old seed to the new seed as part of the control plane migration process. Please find out more on this topic at this blog post.\nNew uam Role for Project Members to Limit User Access Management Privileges (gardener/gardener#2611) In order to allow external user access management system to integrate with Gardener and to fulfil certain compliance aspects, we have introduced a new role called uam for Project members (next to admin and viewer). Only if a user has this role, then he/she is allowed to add/remove other human users to the respective Project. By default, all newly created Projects assign this role only to the owner while, for backwards-compatibility reasons, it will be assigned for all members for existing projects. Project owners can steadily revoke this access as desired. 
Interestingly, the uam role is backed by a custom RBAC verb called manage-members, i.e., the Gardener API server is only admitting changes to the human Project members if the respective user is bound to this RBAC verb.\nNew Node-Local DNS Feature for Shoots (gardener/gardener#2528) By default, we are using CoreDNS as DNS plugin in shoot clusters which we auto-scale horizontally using HPA. However, in some situations we are discovering certain bottlenecks with it, e.g., unreliable UDP connections, unnecessary node hopping, inefficient load balancing, etc. To further optimize the DNS performance for shoot clusters, it is now possible to enable a new alpha-level feature gate in the gardenlet’s componentconfig: NodeLocalDNS. If enabled, all shoots will get a new DaemonSet to run a DNS server on each node.\nMore kubelet and API Server Configurability (gardener/gardener#2574, gardener/gardener#2668) One large benefit of Gardener is that it allows you to optimize the usage of your control plane as well as worker nodes by exposing relevant configuration parameters in the Shoot API. In this version, we are adding support to configure kubelet’s values for systemReserved and kubeReserved resources as well as the kube-apiserver’s watch cache sizes. This allows end-users to get to better node utilization and/or performance for their shoot clusters.\nConfigurable Timeout Settings for machine-controller-manager (gardener/gardener#2563) One very central component in Project Gardener is the machine-controller-manager for managing the worker nodes of shoot clusters. It has extensive qualities with respect to node lifecycle management and rolling updates. As such, it uses certain timeout values, e.g. when creating or draining nodes, or when checking their health. Earlier, those were not customizable by end-users, but we are adding this possibility now. 
You can fine-tune these settings per worker pool in the Shoot API such that you can optimize the lifecycle management of your worker nodes even more!\nImproved Usage of Cached Client to Reduce Network I/O (gardener/gardener#2635, gardener/gardener#2637) In the last Gardener release v1.7 we introduced a huge refactoring of the clients that we use to interact with the many different Kubernetes clusters. This is to further optimize the network I/O performed by leveraging watches and caches as well as possible. It’s still an alpha-level feature that must be explicitly enabled in the Gardenlet’s component configuration, but with this release we have improved certain things in order to pave the way for beta promotion. For example, we were initially also using a cached client when interacting with shoots. However, as the gardenlet runs in the seed as well (and thus can communicate cluster-internally with the kube-apiservers of the respective shoots) this cache is not necessary and just memory overhead. We have removed it again and saw the memory usage getting lower again. More to come!\nAWS EBS Volume Encryption by Default (gardener/gardener-extension-provider-aws#147) The Shoot API has exposed the possibility to encrypt the root disks of worker nodes for quite a while, but it was disabled by default (for backwards-compatibility reasons). With this release we have changed this default, so new shoot worker nodes will be provisioned with encrypted root disks out-of-the-box. However, the g4dn instance types of AWS don’t support this encryption, so when you use them you have to explicitly disable the encryption in the worker pool configuration.\nLiveness Probe for Gardener API Server Deployment (gardener/gardener#2647) A small, but very valuable improvement is the introduction of a liveness probe for our Gardener API server. 
As it’s built with the same library as the Kubernetes API server, it exposes two endpoints at /livez and /readyz which were created exactly for the purpose of liveness and readiness probes. With Gardener v1.8, the Helm chart contains a liveness probe configuration by default, and we are awaiting an upstream fix (kubernetes/kubernetes#93599) to also enable the readiness probe. This will help in a smoother rolling update of the Gardener API server pods, i.e., preventing clients from talking to a not yet initialized or already terminating API server instance.\nWebhook Ports Changed to Enable OpenShift (gardener/gardener#2660) In order to make it possible to run Gardener on OpenShift clusters as well, we had to make a change in the port configuration for the webhooks we are using in both Gardener and the extension controllers. Earlier, all the webhook servers directly exposed port 443, i.e., a system port which is a security concern and disallowed in OpenShift. We have changed this port now across all places and also adapted our network policies accordingly. This is most likely not the last necessary change to enable this scenario, but it’s a great improvement to push the project forward.\nIf you’re interested in more details and even more improvements, you can read all the release notes for Gardener v1.8.0.\n","categories":"","description":"","excerpt":"Even if we are in the midst of the summer holidays, a new Gardener …","ref":"/blog/2020/08.06-gardener-v1.8.0-released/","tags":"","title":"Gardener v1.8.0 Released"},{"body":"Gardener is showing successful collaboration with its growing community of contributors and adopters. With this come some success stories, including PingCAP using Gardener to implement its managed service.\nAbout PingCAP and Its TiDB Cloud PingCAP started in 2015, when three seasoned infrastructure engineers working at leading Internet companies got sick and tired of the way databases were managed, scaled and maintained. 
Seeing no good solution on the market, they decided to build their own - the open-source way. With the help of a first-class team and hundreds of contributors from around the globe, PingCAP is building a distributed NewSQL, hybrid transactional and analytical processing (HTAP) database.\nIts flagship project, TiDB, is a cloud-native distributed SQL database with MySQL compatibility, and one of the most popular open-source database projects - with 23.5K+ stars and 400+ contributors. Its sister project TiKV is a Cloud Native Interactive Landscape project.\nPingCAP envisioned their managed TiDB service, known as TiDB Cloud, to be multi-tenant, secure, cost-efficient, and to be compatible with different cloud providers. As a result, the company turned to Gardener to build their managed TiDB cloud service offering.\nTiDB Cloud Beta Preview Limitations with Other Public Managed Kubernetes Services Previously, PingCAP encountered issues while using other public managed K8s cluster services, to develop the first version of its TiDB Cloud. Their worst pain point was that they felt helpless when encountering certain malfunctions. PingCAP wasn’t able to do much to resolve these issues, except waiting for the providers’ help. More specifically, they experienced problems due to cloud-provider specific Kubernetes system upgrades, delays in the support response (which could be avoided in exchange of a costly support fee), and no control over when things got fixed.\nThere was also a lot of cloud-specific integration work needed to follow a multi-cloud strategy, which proved to be expensive both to produce and maintain. With one of these managed K8s services, you would have to integrate the instance API, as opposed to a solution like Gardener, which provides a unified API for all clouds. Such a unified API eliminates the need to worry about cloud specific-integration work altogether.\nWhy PingCAP Chose Gardener to Build TiDB Cloud “Gardener has similar concepts to Kubernetes. 
Each Kubernetes cluster is just like a Kubernetes pod, so the similar concepts apply, and the controller pattern makes Gardener easy to manage. It was also easy to extend, as the team was already very familiar with Kubernetes, so it wasn’t hard for us to extend Gardener. We also saw that Gardener has a very active community, which is always a plus!”\n- Aylei Wu, (Cloud Engineer) at PingCAP\n At first glance, PingCAP had initial reservations about using Gardener - mainly due to its adoption level (still at the beginning) and an apparent complexity of use. However, these were soon eliminated as they learned more about the solution. As Aylei Wu mentioned during the last Gardener community meeting, “a good product speaks for itself”, and once the company got familiar with Gardener, they quickly noticed that the concepts were very similar to Kubernetes, which they were already familiar with.\nThey recognized that Gardener would be their best option, as it is highly extensible and provides a unified abstraction API layer. In essence, the machines can be managed via a machine controller manager for different cloud providers - without having to worry about the individual cloud APIs.\nThey agreed that Gardener’s solution, although complex, was definitely worth it. Even though it is a relatively new solution, meaning they didn’t have access to other user testimonials, they decided to go with the service since it checked all the boxes (and as SAP was running it productively with a huge fleet). PingCAP also came to the conclusion that building a managed Kubernetes service themselves would not be easy. Even if they were to build a managed K8s service, they would have to heavily invest in development and would still end up with an even more complex platform than Gardener’s. 
For all these reasons combined, PingCAP decided to go with Gardener to build its TiDB Cloud.\nHere are certain features of Gardener that PingCAP found appealing:\n Cloud agnostic: Gardener’s abstractions for cloud-specific integrations dramatically reduce the investment in supporting more than one cloud infrastructure. Once the integration with Amazon Web Services was done, moving on to Google Cloud Platform proved to be relatively easy. (At the moment, TiDB Cloud has subscription plans available for both GCP and AWS, and they are planning to support Alibaba Cloud in the future.) Familiar concepts: Gardener is K8s native; its concepts are easily related to core Kubernetes concepts. As such, it was easy to onboard for a K8s experienced team like PingCAP’s SRE team. Easy to manage and extend: Gardener’s API and extensibility are easy to implement, which has a positive impact on the implementation, maintenance costs and time-to-market. Active community: Prompt and quality responses on Slack from the Gardener team tremendously helped to quickly onboard and produce an efficient solution. How PingCAP Built TiDB Cloud with Gardener On a technical level, PingCAP’s set-up overview includes the following:\n A Base Cluster globally, which is the top-level control plane of TiDB Cloud A Seed Cluster per cloud provider per region, which makes up the fundamental data plane of TiDB Cloud A Shoot Cluster is dynamically provisioned per tenant per cloud provider per region when requested A tenant may create one or more TiDB clusters in a Shoot Cluster As a real world example, PingCAP sets up the Base Cluster and Seed Clusters in advance. When a tenant creates its first TiDB cluster under the us-west-2 region of AWS, a Shoot Cluster will be dynamically provisioned in this region, and will host all the TiDB clusters of this tenant under us-west-2. Nevertheless, if another tenant requests a TiDB cluster in the same region, a new Shoot Cluster will be provisioned. 
Since different Shoot Clusters are located in different VPCs and can even be hosted under different AWS accounts, TiDB Cloud is able to achieve hard isolation between tenants and meet the critical security requirements for our customers.\nTo automate these processes, PingCAP creates a service in the Base Cluster, known as the TiDB Cloud “Central” service. The Central is responsible for managing shoots and the TiDB clusters in the Shoot Clusters. As shown in the following diagram, user operations go to the Central, being authenticated, authorized, validated, stored and then applied asynchronously in a controller manner. The Central will talk to the Gardener API Server to create and scale Shoot clusters. The Central will also access the Shoot API Service to deploy and reconcile components in the Shoot cluster, including control components (TiDB Operator, API Proxy, Usage Reporter for billing, etc.) and the TiDB clusters.\nTiDB Cloud on Gardener Architecture Overview What’s Next for PingCAP and Gardener With the initial success of using the project to build TiDB Cloud, PingCAP is now working heavily on the stability and day-to-day operations of TiDB Cloud on Gardener. This includes writing Infrastructure-as-Code scripts/controllers with it to achieve GitOps, building tools to help diagnose problems across regions and clusters, as well as running chaos tests to identify and eliminate potential risks. After benefiting greatly from the community, PingCAP will continue to contribute back to Gardener.\nIn the future, PingCAP also plans to support more cloud providers like AliCloud and Azure. Moreover, PingCAP may explore the opportunity of running TiDB Cloud in on-premise data centers with the constantly expanding support this project provides. Engineers at PingCAP enjoy the ease of learning from Gardener’s Kubernetes-like concepts and being able to apply them everywhere. Gone are the days of heavy integrations with different clouds and worrying about vendor stability. 
With this project, PingCAP now sees broader opportunities to land TiDB Cloud on various infrastructures to meet the needs of their global user group.\nStay tuned, more blog posts to come on how Gardener is collaborating with its contributors and adopters to bring fully-managed clusters at scale everywhere! If you want to join in on the fun, connect with our community.\n","categories":"","description":"","excerpt":"Gardener is showing successful collaboration with its growing …","ref":"/blog/2020/05.27-pingcaps-experience/","tags":"","title":"PingCAP’s Experience in Implementing Their Managed TiDB Service with Gardener"},{"body":"The Gardener project website just received a serious facelift. Here are some of the highlights:\n A completely new landing page, emphasizing both Gardener’s value proposition and the open community behind it. The Community page was reconstructed for quick access to the various community channels and will soon merge the Adopters page. It will provide a better insight into success stories from the community. Improved blogs layout. One-click sharing options are available starting with simple URL copy link and Twitter button and others will closely follow up. While we are at it, give it a try. Spread the word. Website builds also got to a new level with:\n Containerization. The whole build environment is containerized now, eliminating differences between local and CI/CD setup and narrowing content developers’ focus to the /documentation repository. Running a local server for live preview of changes as you make them when developing content for the website is now as easy as running make serve in your local /documentation clone. Numerous improvements to the build scripts. More configuration options, authenticated requests, fault tolerance and performance. Good news for Windows WSL users who will now enjoy significantly better support. See the updated README for details on that. 
A number of improvements in layouts styles, site assets and hugo site-building techniques. But hey, THAT’S NOT ALL!\nStay tuned for more improvements around the corner. The biggest ones are aligning the documentation with the new theme and restructuring it along, more emphasis on community success stories all around, more sharing options and more than a handful of shortcodes for content development and … let’s cut the spoilers here.\nI hope you will like it. Let us know what you think about it. Feel free to leave comments and discuss on Twitter and Slack, or in case of issues - on GitHub.\nGo ahead and help us spread the word: https://gardener.cloud\n","categories":"","description":"","excerpt":"The Gardener project website just received a serious facelift. Here …","ref":"/blog/2020/05.11-new-website-same-green-flower/","tags":"","title":"New Website, Same Green Flower"},{"body":"TL;DR Note Details of the description might change in the near future since Heptio was taken over by VMWare which might result in different GitHub repositories or other changes. Please don’t hesitate to inform us in case you encounter any issues. In general, Backup and Restore (BR) covers activities enabling an organization to bring a system back in a consistent state, e.g., after a disaster or to setup a new system. These activities vary in a very broad way depending on the applications and its persistency.\nKubernetes objects like Pods, Deployments, NetworkPolicies, etc. configure Kubernetes internal components and might as well include external components like load balancer and persistent volumes of the cloud provider. The BR of external components and their configurations might be difficult to handle in case manual configurations were needed to prepare these components.\nTo set the expectations right from the beginning, this tutorial covers the BR of Kubernetes deployments which might use persistent volumes. 
The BR of any manual configuration of external components, e.g., via the cloud provider’s console, is not covered here, and neither is the BR of a whole Kubernetes system.\nThis tutorial puts the focus on the open source tool Velero (formerly Heptio Ark) and its functionality to explain the BR process.\n Basically, Velero allows you to:\n back up and restore your Kubernetes cluster resources and persistent volumes (on-demand or scheduled) back up or restore all objects in your cluster, or filter resources by type, namespace, and/or label by default, all persistent volumes are backed up (configurable) replicate your production environment for development and testing environments define an expiration date per backup execute pre- and post-activities in a container of a pod when a backup is created (see Hooks) extend Velero by Plugins, e.g., for Object and Block store (see Plugins) Velero consists of a server side component and a client tool. The server component consists of Custom Resource Definitions (CRD) and controllers to perform the activities. The client tool communicates with the K8s API server to, e.g., create objects like a Backup object.\nThe diagram below explains the backup process. When creating a backup, the Velero client makes a call to the Kubernetes API server to create a Backup object (1). The BackupController notices the new Backup object, validates the object (2) and begins the backup process (3). 
Based on the filter settings provided by the Velero client it collects the resources in question (3). The BackupController creates a tarball with the Kubernetes objects and stores it in the backup location, e.g., AWS S3 (4) as well as snapshots of persistent volumes (5).\nThe size of the backup tarball corresponds to the number of objects in etcd. The gzipped archive contains the JSON representations of the objects.\nNote As of the writing of this tutorial, Velero or any other BR tool for Shoot clusters is not provided by Gardener. Getting Started First, clone the Velero GitHub repository and get the Velero client from the releases or build it from source via make all in the main directory of the cloned GitHub repository.\nTo use an AWS S3 bucket as storage for the backup files and the persistent volumes, you need to:\n create an S3 bucket as the backup target create an AWS IAM user for Velero configure the Velero server create a secret for your AWS credentials For details about this setup, check the Set Permissions for Velero documentation. Moreover, it is possible to use other supported storage providers.\nNote By default, Velero is installed in the namespace velero. To change the namespace, check the documentation. Velero offers a wide range of filter possibilities for Kubernetes resources, e.g., filtering by namespaces, labels or resource types. The filter settings can be combined and used as include or exclude, which gives great flexibility for selecting resources.\nNote Carefully set labels and/or use namespaces for your deployments to make the selection of the resources to be backed up easier. The best practice would be to check in advance which resources are selected with the defined filter. Exemplary Use Cases Below are some use cases which could give you an idea on how to use Velero. 
You can also check Velero’s documentation for other introductory examples.\nHelm Based Deployments To be able to use Helm charts in your Kubernetes cluster, you need to install the Helm client helm and the server component tiller. By default the server component is installed in the namespace kube-system. Even if it is possible to select single deployments via the filter settings of Velero, you should consider installing tiller in a separate namespace via helm init --tiller-namespace \u003cyour namespace\u003e. This approach applies as well for all Helm charts to be deployed - consider separate namespaces for your deployments as well by using the parameter --namespace.\nTo back up a Helm based deployment, you need to back up both Tiller and the deployment. Only then can the deployments be managed via Helm. As mentioned above, the selection of resources would be easier in case they are separated in namespaces.\nSeparate Backup Locations In case you run all your Kubernetes clusters on a single cloud provider, there is probably no need to store the backups in a bucket of a different cloud provider. However, if you run Kubernetes clusters on different cloud providers, you might consider using a bucket on just one cloud provider as the target for the backups, e.g., to benefit from a lower price tag for the storage.\nBy default, Velero assumes that both the persistent volumes and the backup location are on the same cloud provider. During the setup of Velero, a secret is created using the credentials for a cloud provider user who has access to both objects (see the policies, e.g., for the AWS configuration).\nNow, since the backup location is different from the volume location, you need to follow these steps (described here for AWS):\n configure as documented the volume storage location in examples/aws/06-volumesnapshotlocation.yaml and provide the user credentials. 
In this case, the S3 related settings like the policies can be omitted\n create the bucket for the backup in the cloud provider in question and a user with the appropriate credentials and store them in a separate file similar to credentials-ark\n create a secret which contains two credentials, one for the volumes and one for the backup target, e.g., by using the command kubectl create secret generic cloud-credentials --namespace heptio-ark --from-file cloud=credentials-ark --from-file backup-target=backup-ark\n configure in the deployment manifest examples/aws/10-deployment.yaml the entries in volumeMounts, env and volumes accordingly, e.g., for a cluster running on AWS and the backup target bucket on GCP a configuration could look similar to:\n Note Some links might get broken in the near future since Heptio was taken over by VMWare which might result in different GitHub repositories or other changes. Please don’t hesitate to inform us in case you encounter any issues. Example Velero deployment # Copyright 2017 the Heptio Ark contributors. # # Licensed under the Apache License, Version 2.0 (the \"License\"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an \"AS IS\" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
--- apiVersion: apps/v1beta1 kind: Deployment metadata: namespace: velero name: velero spec: replicas: 1 template: metadata: labels: component: velero annotations: prometheus.io/scrape: \"true\" prometheus.io/port: \"8085\" prometheus.io/path: \"/metrics\" spec: restartPolicy: Always serviceAccountName: velero containers: - name: velero image: gcr.io/heptio-images/velero:latest command: - /velero args: - server volumeMounts: - name: cloud-credentials mountPath: /credentials - name: plugins mountPath: /plugins - name: scratch mountPath: /scratch env: - name: AWS_SHARED_CREDENTIALS_FILE value: /credentials/cloud - name: GOOGLE_APPLICATION_CREDENTIALS value: /credentials/backup-target - name: VELERO_SCRATCH_DIR value: /scratch volumes: - name: cloud-credentials secret: secretName: cloud-credentials - name: plugins emptyDir: {} - name: scratch emptyDir: {} finally, configure the backup storage location in examples/aws/05-backupstoragelocation.yaml to use, in this case, a GCP bucket\n Limitations Below is a potentially incomplete list of limitations. You can also consult Velero’s documentation to get up to date information.\n Only full backups of selected resources are supported. Incremental backups are not (yet) supported. However, by using filters it is possible to restrict the backup to specific resources Inconsistencies might occur in case of changes during the creation of the backup Application specific actions are not considered by default. 
However, they might be handled by using Velero’s Hooks or Plugins ","categories":"","description":"Details about backup and recovery of Kubernetes objects based on the open source tool [Velero](https://velero.io/).","excerpt":"Details about backup and recovery of Kubernetes objects based on the …","ref":"/docs/guides/administer-shoots/backup-restore/","tags":"","title":"Backup and Restore of Kubernetes Objects"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2019/","tags":"","title":"2019"},{"body":"Feature flags are used to change the behavior of a program at runtime without forcing a restart.\nAlthough they are essential in a native cloud environment, they cannot be implemented without significant effort on some platforms. Kubernetes has made this trivial. Here we will implement them through labels and annotations, but you can also implement them by connecting directly to the Kubernetes API Server.\nPossible Use Cases Turn on/off a specific instance Turn on/off the profiling of a specific instance Change the logging level, to capture detailed logs during a specific event Change caching strategy at runtime Change timeouts in production Toggle on/off some special verification ","categories":"","description":"","excerpt":"Feature flags are used to change the behavior of a program at runtime …","ref":"/blog/2019/06.11-feature-flags-in-kubernetes-applications/","tags":"","title":"Feature Flags in Kubernetes Applications"},{"body":"The kubectl command-line tool uses kubeconfig files to find the information it needs in order to choose a cluster and communicate with its API server.\nWhat happens if the kubeconfig file of your production cluster is leaked or published by accident? 
Since there is no possibility to rotate or revoke the initial kubeconfig, there is only one way to protect your infrastructure or application if the kubeconfig has leaked - delete the cluster.\nLearn more on Organizing Access Using kubeconfig Files.\n","categories":"","description":"","excerpt":"The kubectl command-line tool uses kubeconfig files to find the …","ref":"/blog/2019/06.11-organizing-access-using-kubeconfig-files/","tags":"","title":"Organizing Access Using kubeconfig Files"},{"body":"Speed up Your Terminal Workflow Use the Kubernetes command-line tool, kubectl, to deploy and manage applications on Kubernetes. Using kubectl, you can inspect cluster resources, as well as create, delete, and update components.\nYou will probably run more than a hundred kubectl commands on some days and you should speed up your terminal workflow with some shortcuts. Of course, there are good shortcuts and bad shortcuts (lazy coding, lack of security review, etc.), but let’s stick with the positives and talk about a good shortcut: bash aliases in your .profile.\nWhat are those mysterious .profile and .bash_profile files you’ve heard about?\nNote The contents of a .profile file are executed on every log-in of the owner of the file. What’s the .bash_profile then? It’s exactly the same, but under a different name. The Unix shell you are logging into, in this case on OS X, looks for /etc/profile and loads it if it exists. Then it looks for ~/.bash_profile, ~/.bash_login and finally ~/.profile, and loads the first one of these it finds.\nPopulating the .profile File Here is the fantastic time saver that needs to be in your shell profile:\n# time save number one. 
shortcut for kubectl # alias k=\"kubectl\" # Start a shell in a pod AND kill it after leaving # alias ksh=\"kubectl run busybox -i --tty --image=busybox --restart=Never --rm -- sh\" # opens an ash shell (the busybox image does not include bash) # alias kbash=\"kubectl run busybox -i --tty --image=busybox --restart=Never --rm -- ash\" # activate/exports the kubeconfig.yaml in the current working directory (single quotes defer the expansion of $PWD until the alias is used) # alias kexport='export KUBECONFIG=$PWD/kubeconfig.yaml' # usage: kurl http://your-svc.namespace.cluster.local # # we need our very own image for this...never trust an unknown image.. alias kurl=\"docker run --rm byrnedo/alpine-curl\" All the kubectl tab completions still work fine with these aliases, so you’re not losing that speed.\nNote If the approach above does not work for you, add the following lines in your ~/.bashrc instead:\n# time save number one. shortcut for kubectl # alias k=\"kubectl\" # Enable kubectl completion source \u003c(k completion bash | sed s/kubectl/k/g) ","categories":"","description":"Some bash tips that save you some time","excerpt":"Some bash tips that save you some time","ref":"/docs/guides/client-tools/bash-tips/","tags":"","title":"Fun with kubectl Aliases"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2018/","tags":"","title":"2018"},{"body":"Green Tea Matcha Cookies For a team event during the Christmas season we decided to completely reinterpret the topic cookies. :-)\nMatcha cookies have the delicate flavor and color of green tea. These soft, pillowy and chewy green tea cookies are perfect with tea. And of course they fit perfectly to our logo.\nIngredients 1 stick butter, softened ⅞ cup of granulated sugar 1 cup + 2 tablespoons all-purpose flour 2 eggs 1¼ tablespoons culinary grade matcha powder 1 teaspoon baking powder pinch of salt Instructions Cream together the butter and sugar in a large mixing bowl - it should be creamy colored and airy. A hand blender or stand mixer works well for this. This helps the cookie become fluffy and chewy. 
Gently incorporate the eggs to the butter mixture one at a time. In a separate bowl, sift together all the dry ingredients. Add the dry ingredients to the wet by adding a little at a time and folding or gently mixing the batter together. Keep going until you’ve incorporated all the remaining flour mixture. The dough should be a beautiful green color. Chill the dough for at least an hour - up to overnight. The longer the better! Preheat your oven to 325 F. Roll the dough into balls the size of ping pong balls and place them on a non-stick cookie sheet. Bake them for 12-15 minutes until the bottoms just start to become golden brown and the cookie no longer looks wet in the middle. Note: you can always bake them at 350 F for a less moist, fluffy cookie. It will bake faster by about 2-4 minutes 350 F so watch them closely. Remove and let cool on a rack and enjoy! Note Make sure you get culinary grade matcha powder. You should be able to find this in Asian or natural grocers.\n","categories":"","description":"","excerpt":"Green Tea Matcha Cookies For a team event during the Christmas season …","ref":"/blog/2018/12.25-gardener_cookies/","tags":"","title":"Gardener Cookies"},{"body":"…they mess up the figure.\nFor a team event during the Christmas season we decided to completely reinterpret the topic cookies… since the vegetables have gone on a well-deserved vacation. 
:-)\nGet the recipe at Gardener Cookies.\n","categories":"","description":"","excerpt":"…they mess up the figure.\nFor a team event during the Christmas season …","ref":"/blog/2018/12.22-cookies-are-dangerous/","tags":"","title":"Cookies Are Dangerous..."},{"body":"You want to experiment with Kubernetes or set up a customer scenario, but don’t want to run the cluster 24 / 7 due to cost reasons?\nGardener gives you the possibility to scale your cluster down to zero nodes.\nLearn more on Hibernate a Cluster.\n","categories":"","description":"","excerpt":"You want to experiment with Kubernetes or set up a customer scenario, …","ref":"/blog/2018/07.11-hibernate-a-cluster-to-save-money/","tags":"","title":"Hibernate a Cluster to Save Money"},{"body":"Running as Root User Whenever possible, do not run containers as root users. One could be tempted to say that in Kubernetes, the node and pods are well separated, however, the host and the container share the same kernel. If the container is compromised, a root user can damage the underlying node.\nInstead of running a root user, use RUN groupadd -r anygroup \u0026\u0026 useradd -r -g anygroup myuser to create a group and a user in it. Use the USER command to switch to this user.\nStoring Data or Logs in Containers Containers are ideal for stateless applications and should be transient. This means that no data or logs should be stored in the container, as they are lost when the container is closed. 
If absolutely necessary, you can use persistent volumes instead to persist them outside the containers.\nHowever, an ELK stack is preferred for storing and processing log files.\nLearn more on Common Kubernetes Antipattern.\n","categories":"","description":"","excerpt":"Running as Root User Whenever possible, do not run containers as root …","ref":"/blog/2018/06.11-anti-patterns/","tags":"","title":"Anti Patterns"},{"body":"In summer 2018, the Gardener project team asked Kinvolk to execute several penetration tests in its role as a third-party contractor. The goal of this ongoing work is to increase the security of all Gardener stakeholders in the open source community. Following the Gardener architecture, the control plane of a Gardener managed shoot cluster resides in the corresponding seed cluster. This is a Control-Plane-as-a-Service with a network air gap.\nAlong the way we found various kinds of security issues, for example, due to misconfiguration or missing isolation, as well as two special problems with upstream Kubernetes and its Control-Plane-as-a-Service architecture.\nLearn more on Auditing Kubernetes for Secure Setup.\n","categories":"","description":"","excerpt":"In summer 2018, the Gardener project team asked Kinvolk to execute …","ref":"/blog/2018/06.11-auditing-kubernetes-for-secure-setup/","tags":"","title":"Auditing Kubernetes for Secure Setup"},{"body":"Microservices tend to use smaller runtimes but you can use what you have today - and this can be a problem in Kubernetes.\nSwitching your architecture from a monolith to microservices has many advantages, both in the way you write software and the way it is used throughout its lifecycle. In this post, I attempt to cover one problem which does not get as much attention and discussion - the size of the technology stack.\nGeneral Purpose Technology Stack There is a tendency to be more generalized in development and to apply this pattern to all services. 
One feels that a homogeneous image of the technology stack is good if it is the same for all services.\nOne forgets, however, that a large percentage of the integrated infrastructure is not used by all services in the same way, and is therefore only a burden. Thus, resources are wasted and the entire application becomes expensive in operation and scales very badly.\nLight Technology Stack Due to the lightweight nature of your service, you can run more containers on a physical server or virtual machine. The result is higher resource utilization.\nAdditionally, microservices are developed and deployed as containers independently of each other. This means that a development team can develop, optimize, and deploy a microservice without impacting other subsystems.\n","categories":"","description":"","excerpt":"Microservices tend to use smaller runtimes but you can use what you …","ref":"/blog/2018/06.11-big-things-come-in-small-packages/","tags":"","title":"Big Things Come in Small Packages"},{"body":"The Gardener project team has analyzed the impact of the Gardener CVE-2018-2475 and the Kubernetes CVE-2018-1002105 on the Gardener Community Setup. By following some recommendations, it is possible to mitigate both vulnerabilities.\n","categories":"","description":"","excerpt":"The Gardener project team has analyzed the impact of the Gardener …","ref":"/blog/2018/06.11-hardening-the-gardener-community-setup/","tags":"","title":"Hardening the Gardener Community Setup"},{"body":" Kubernetes is only available in Docker for Mac 17.12 CE and higher on the Edge channel. Kubernetes support is not included in Docker for Mac Stable releases. To find out more about Stable and Edge channels and how to switch between them, see general configuration. Docker for Mac 17.12 CE (and higher) Edge includes a standalone Kubernetes server that runs on Mac, so that you can test deploying your Docker workloads on Kubernetes. 
The Kubernetes client command, kubectl, is included and configured to connect to the local Kubernetes server. If you have kubectl already installed and pointing to some other environment, such as minikube or a GKE cluster, be sure to change the context so that kubectl is pointing to docker-for-desktop. Read more on Docker.com.\nI recommend setting up your shell to see which KUBECONFIG is active.\n","categories":"","description":"","excerpt":" Kubernetes is only available in Docker for Mac 17.12 CE and higher …","ref":"/blog/2018/06.11-kubernetes-is-available-in-docker-for-mac-17-12-ce/","tags":"","title":"Kubernetes is Available in Docker for Mac 17.12 CE"},{"body":"…or DENY all traffic from other namespaces\nYou can configure a NetworkPolicy to deny all traffic from other namespaces while allowing all traffic coming from the same namespace the pod is deployed to. There are many reasons why you may choose to configure Kubernetes network policies:\n Isolate multi-tenant deployments Regulatory compliance Ensure containers assigned to different environments (e.g. dev/staging/prod) cannot interfere with each other Learn more on Namespace Isolation.\n","categories":"","description":"","excerpt":"…or DENY all traffic from other namespaces\nYou can configure a …","ref":"/blog/2018/06.11-namespace-isolation/","tags":"","title":"Namespace Isolation"},{"body":"Should I use:\n❌ one namespace per user/developer? ❌ one namespace per team? ❌ one per service type? ❌ one namespace per application type? 😄 one namespace per running instance of your application? Apply the Principle of Least Privilege\nAll user accounts should run with as few privileges as possible at all times, and also launch applications with as few privileges as possible. If you share a cluster among different users separated by namespaces, each user has access to all namespaces and services by default. 
It can happen that a user accidentally uses and destroys the namespace of a productive application or the namespace of another developer.\nKeep in mind - By default namespaces don’t provide:\n Network Isolation Access Control Audit Logging on user level ","categories":"","description":"","excerpt":"Should I use:\n❌ one namespace per user/developer? ❌ one namespace per …","ref":"/blog/2018/06.11-namespace-scope/","tags":"","title":"Namespace Scope"},{"body":"The efs-provisioner allows you to mount EFS storage as PersistentVolumes in Kubernetes. It consists of a container that has access to an AWS EFS resource. The container reads a configmap containing the EFS filesystem ID, the AWS region and the name identifying the efs-provisioner. This name will be used later when you create a storage class.\nWhy EFS When you have an application running on multiple nodes which require shared access to a file system. When you have an application that requires multiple virtual machines to access the same file system at the same time, AWS EFS is a tool that you can use. EFS supports encryption. EFS is SSD based storage and its storage capacity and pricing will scale in or out as needed, so there is no need for the system administrator to do additional operations. It can grow to a petabyte scale. EFS now supports NFSv4 lock upgrading and downgrading, so yes, you can use sqlite with EFS… even if it was possible before. EFS is easy to setup. Why Not EFS Sometimes when you think about using a service like EFS, you may also think about vendor lock-in and its negative sides. Making an EFS backup may decrease your production FS performance; the throughput used by backups counts towards your total file system throughput. EFS is expensive when compared to EBS (roughly twice the price of EBS storage). EFS is not the magical solution for all your distributed FS problems, it can be slow in many cases. Test, benchmark, and measure to ensure that EFS is a good solution for your use case. 
EFS distributed architecture results in a latency overhead for each file read/write operation. If you have the possibility to use a CDN, don’t use EFS, use it for the files which can’t be stored in a CDN. Don’t use EFS as a caching system, sometimes you could be doing this unintentionally. Last but not least, even if EFS is a fully managed NFS, you will face performance problems in many cases, resolving them takes time and needs effort. ","categories":"","description":"","excerpt":"The efs-provisioner allows you to mount EFS storage as …","ref":"/blog/2018/06.11-readwritemany-dynamically-provisioned-persistent-volumes-using-amazon-efs/","tags":"","title":"ReadWriteMany - Dynamically Provisioned Persistent Volumes Using Amazon EFS"},{"body":"The storage is definitely the most complex and important part of an application setup. Once this part is completed, one of the most problematic parts could be solved.\nMounting an S3 bucket into a pod using FUSE allows you to access data stored in S3 via the filesystem. The mount is a pointer to an S3 location, so the data is never synced locally. Once mounted, any pod can read or even write from that directory without the need for explicit keys.\nHowever, it can be used to import and parse large amounts of data into a database.\nLearn more on Shared S3 Storage.\n","categories":"","description":"","excerpt":"The storage is definitely the most complex and important part of an …","ref":"/blog/2018/06.11-shared-storage-with-s3-backend/","tags":"","title":"Shared Storage with S3 Backend"},{"body":"One thing that always bothered me was that I couldn’t get the logs of several pods at once with kubectl. A simple tail -f \u003cpath-to-logfile\u003e isn’t possible. 
Certainly, you can use kubectl logs -f \u003cpod-id\u003e, but it doesn’t help if you want to monitor more than one pod at a time.\nThis is something you really need a lot, at least if you run several instances of a pod behind a deployment and you don’t have a log viewer service like Kibana set up.\nIn that case, kubetail comes to the rescue. It is a small bash script that allows you to aggregate the log files of several pods at the same time in a simple way. The script is called kubetail and is available on GitHub.\n","categories":"","description":"","excerpt":"One thing that always bothered me was that I couldn’t get the logs of …","ref":"/blog/2018/06.11-watching-logs-of-several-pods/","tags":"","title":"Watching Logs of Several Pods"},{"body":"Multi-node etcd cluster instances via etcd-druid This document proposes an approach (along with some alternatives) to support provisioning and management of multi-node etcd cluster instances via etcd-druid and etcd-backup-restore.\nContent Multi-node etcd cluster instances via etcd-druid Content Goal Background and Motivation Single-node etcd cluster Multi-node etcd-cluster Dynamic multi-node etcd cluster Prior Art ETCD Operator from CoreOS etcdadm from kubernetes-sigs Etcd Cluster Operator from Improbable-Engineering General Approach to ETCD Cluster Management Bootstrapping Assumptions Adding a new member to an etcd cluster Note Alternative Managing Failures Removing an existing member from an etcd cluster Restarting an existing member of an etcd cluster Recovering an etcd cluster from failure of majority of members Kubernetes Context Alternative ETCD Configuration Alternative Data Persistence Persistent Ephemeral Disk In-memory How to detect if valid metadata exists in an etcd member Recommendation How to detect if valid data exists in an etcd member Recommendation Separating peer and client traffic Cutting off client requests Manipulating Client Service podSelector Health Check Backup Failure Alternative Status Members 
Note Member name as the key Member Leases Conditions ClusterSize Alternative Decision table for etcd-druid based on the status 1. Pink of health Observed state Recommended Action 2. Member status is out of sync with their leases Observed state Recommended Action 3. All members are Ready but AllMembersReady condition is stale Observed state Recommended Action 4. Not all members are Ready but AllMembersReady condition is stale Observed state Recommended Action 5. Majority members are Ready but Ready condition is stale Observed state Recommended Action 6. Majority members are NotReady but Ready condition is stale Observed state Recommended Action 7. Some members have been in Unknown status for a while Observed state Recommended Action 8. Some member pods are not Ready but have not had the chance to update their status Observed state Recommended Action 9. Quorate cluster with a minority of members NotReady Observed state Recommended Action 10. Quorum lost with a majority of members NotReady Observed state Recommended Action 11. Scale up of a healthy cluster Observed state Recommended Action 12. Scale down of a healthy cluster Observed state Recommended Action 13. Superfluous member entries in Etcd status Observed state Recommended Action Decision table for etcd-backup-restore during initialization 1. First member during bootstrap of a fresh etcd cluster Observed state Recommended Action 2. Addition of a new following member during bootstrap of a fresh etcd cluster Observed state Recommended Action 3. Restart of an existing member of a quorate cluster with valid metadata and data Observed state Recommended Action 4. Restart of an existing member of a quorate cluster with valid metadata but without valid data Observed state Recommended Action 5. Restart of an existing member of a quorate cluster without valid metadata Observed state Recommended Action 6. 
Restart of an existing member of a non-quorate cluster with valid metadata and data Observed state Recommended Action 7. Restart of the first member of a non-quorate cluster without valid data Observed state Recommended Action 8. Restart of a following member of a non-quorate cluster without valid data Observed state Recommended Action Backup Leading ETCD main container’s sidecar is the backup leader Independent leader election between backup-restore sidecars History Compaction Defragmentation Work-flows in etcd-backup-restore Work-flows independent of leader election in all members Work-flows only on the leading member High Availability Zonal Cluster - Single Availability Zone Alternative Regional Cluster - Multiple Availability Zones Alternative PodDisruptionBudget Rolling updates to etcd members Follow Up Ephemeral Volumes Shoot Control-Plane Migration Performance impact of multi-node etcd clusters Metrics, Dashboards and Alerts Costs Future Work Gardener Ring Autonomous Shoot Clusters Optimization of recovery from non-quorate cluster with some member containing valid data Optimization of rolling updates to unhealthy etcd clusters Goal Enhance etcd-druid and etcd-backup-restore to support provisioning and management of multi-node etcd cluster instances within a single Kubernetes cluster. The etcd CRD interface should be simple to use. It should preferably work with just setting the spec.replicas field to the desired value and should not require any more configuration in the CRD than currently required for the single-node etcd instances. The spec.replicas field is part of the scale sub-resource implementation in Etcd CRD. The single-node and multi-node scenarios must be automatically identified and managed by etcd-druid and etcd-backup-restore. The etcd clusters (single-node or multi-node) managed by etcd-druid and etcd-backup-restore must automatically recover from failures (even quorum loss) and disaster (e.g. 
etcd member persistence/data loss) as much as possible. It must be possible to dynamically scale an etcd cluster horizontally (even between single-node and multi-node scenarios) by simply scaling the Etcd scale sub-resource. It must be possible to (optionally) schedule the individual members of an etcd cluster on different nodes or even infrastructure availability zones (within the hosting Kubernetes cluster). Though this proposal tries to cover most aspects related to single-node and multi-node etcd clusters, there are some more points that are not goals for this document but are still in the scope of either etcd-druid/etcd-backup-restore and/or gardener. In such cases, a high-level description of how they can be addressed in the future is given at the end of the document.\nBackground and Motivation Single-node etcd cluster At present, etcd-druid supports only single-node etcd cluster instances. The advantages of this approach are given below.\n The problem domain is smaller. There are no leader election and quorum related issues to be handled. It is simpler to set up and manage a single-node etcd cluster. Single-node etcd cluster instances have lower request latency than multi-node etcd clusters because there is no requirement to replicate the changes to the other members before committing the changes. etcd-druid provisions etcd cluster instances as pods (actually as statefulsets) in a Kubernetes cluster and Kubernetes is quick (\u003c20s) to restart containers/pods if they go down. Also, etcd-druid is currently only used by gardener to provision etcd clusters to act as back-ends for Kubernetes control-planes and Kubernetes control-plane components (kube-apiserver, kubelet, kube-controller-manager, kube-scheduler etc.) can tolerate etcd going down and recover when it comes back up. 
Single-node etcd clusters incur less cost (CPU, memory and storage). It is easy to cut off client requests if backups fail by using readinessProbe on the etcd-backup-restore healthz endpoint to minimize the gap between the latest revision and the backup revision. The disadvantages of using single-node etcd clusters are given below.\n The database verification step by etcd-backup-restore can introduce additional delays whenever the etcd container/pod restarts (in total ~20-25s). This can be much longer if a database restoration is required, especially if there are incremental snapshots that need to be replayed (this can be mitigated by compacting the incremental snapshots in the background). Kubernetes control-plane components can go into CrashLoopBackOff if etcd is down for some time. This is mitigated by the dependency-watchdog. But Kubernetes control-plane components require a lot of resources and create a lot of load on the etcd cluster and the apiserver when they come out of CrashLoopBackOff, especially in medium or large sized clusters (\u003e 20 nodes). Maintenance operations such as updates to etcd (and updates to etcd-druid or etcd-backup-restore), rolling updates to the nodes of the underlying Kubernetes cluster and vertical scaling of etcd pods are disruptive because they cause etcd pods to be restarted. The vertical scaling of etcd pods is somewhat mitigated during scale down by doing it only during the target clusters’ maintenance window. But scale up is still disruptive. We currently use some form of elastic storage (via persistentvolumeclaims) for storing etcd data, which has some upper bounds on the I/O latency and throughput. This can potentially be a problem for large clusters (\u003e 220 nodes). Also, some cloud providers (e.g. Azure) take a long time to attach/detach volumes to and from machines, which increases the downtime of the Kubernetes components that depend on etcd. 
It is difficult to use ephemeral/local storage (to achieve better latency/throughput as well as to circumvent volume attachment/detachment) for single-node etcd cluster instances. Multi-node etcd-cluster The advantages of introducing support for multi-node etcd clusters via etcd-druid are below.\n A multi-node etcd cluster is highly available. It can tolerate disruption to individual etcd pods as long as the quorum is not lost (i.e. more than half the etcd member pods are healthy and ready). Maintenance operations such as updates to etcd (and updates to etcd-druid or etcd-backup-restore), rolling updates to the nodes of the underlying Kubernetes cluster and vertical scaling of etcd pods can be done non-disruptively by respecting poddisruptionbudgets for the various multi-node etcd cluster instances hosted on that cluster. Kubernetes control-plane components do not see any etcd cluster downtime unless quorum is lost (which is expected to be a lot less frequent than the current frequency of etcd container/pod restarts). We can consider using ephemeral/local storage for multi-node etcd cluster instances because individual member restarts can afford to take time to restore from backup before (re)joining the etcd cluster, as the remaining members serve the requests in the meantime. High-availability across availability zones is also possible by specifying (anti)affinity for the etcd pods (possibly via kupid). Some disadvantages of using multi-node etcd clusters, due to which it might still be desirable, in some cases, to continue to use single-node etcd cluster instances in the gardener context, are given below.\n Multi-node etcd cluster instances are more complex to manage. The problem domain is larger including the following. Leader election Quorum loss Managing rolling changes Backups to be taken from only the leading member. It is more complex to cut off client requests if backups fail in order to keep the gap between the latest revision and the backup revision under control. 
Multi-node etcd cluster instances incur more cost (CPU, memory and storage). Dynamic multi-node etcd cluster Though it is not part of this proposal, it is conceivable to convert a single-node etcd cluster into a multi-node etcd cluster temporarily to perform some disruptive operation (etcd, etcd-backup-restore or etcd-druid updates, etcd cluster vertical scaling and perhaps even node rollout) and convert it back to a single-node etcd cluster once the disruptive operation has been completed. Although this will necessarily still involve downtime, because scaling from a single-node etcd cluster to a three-node etcd cluster involves etcd pod restarts, it is probable that it can be managed with a shorter downtime than we see at present for single-node etcd clusters (on the other hand, converting a three-node etcd cluster to a five-node etcd cluster can be non-disruptive).\nThis is definitely not to argue in favour of such a dynamic approach in all cases (eventually, if/when dynamic multi-node etcd clusters are supported). On the contrary, it makes sense to make use of static (fixed in size) multi-node etcd clusters for production scenarios because of the high-availability.\nPrior Art ETCD Operator from CoreOS etcd operator\nProject status: archived\nThis project is no longer actively developed or maintained. The project exists here for historical reference. If you are interested in the future of the project and taking over stewardship, please contact etcd-dev@googlegroups.com.\n etcdadm from kubernetes-sigs etcdadm is a command-line tool for operating an etcd cluster. It makes it easy to create a new cluster, add a member to, or remove a member from an existing cluster. Its user experience is inspired by kubeadm.\n It is a tool more tailored for manual command-line based management of etcd clusters with no APIs. 
It also makes no assumptions about the underlying platform on which the etcd clusters are provisioned and hence, doesn’t leverage any capabilities of Kubernetes.\nEtcd Cluster Operator from Improbable-Engineering Etcd Cluster Operator\nEtcd Cluster Operator is an Operator for automating the creation and management of etcd inside of Kubernetes. It provides a custom resource definition (CRD) based API to define etcd clusters with Kubernetes resources, and enable management with native Kubernetes tooling.\n Out of all the alternatives listed here, this one seems to be the only viable alternative. Parts of its design/implementation are similar to some of the approaches mentioned in this proposal. However, we still don’t propose to use it because -\n The project is still in an early phase and is not mature enough to be consumed as is in our productive scenarios. The restoration part is completely different, which makes it difficult to adopt as-is and requires a lot of re-work to fit the current restoration semantics of etcd-backup-restore, making the usage counter-productive. General Approach to ETCD Cluster Management Bootstrapping There are three ways to bootstrap an etcd cluster which are static, etcd discovery and DNS discovery. Out of these, the static way is the simplest (and probably the fastest to bootstrap the cluster) and has the least external dependencies. Hence, it is preferred in this proposal. But it requires that the initial (during bootstrapping) etcd cluster size (number of members) is already known before bootstrapping and that all of the members are already addressable (DNS,IP,TLS etc.). Such information needs to be passed to the individual members during startup using the following static configuration.\n ETCD_INITIAL_CLUSTER The list of peer URLs including all the members. This must be the same as the advertised peer URLs configuration. This can also be passed as initial-cluster flag to etcd. 
ETCD_INITIAL_CLUSTER_STATE This should be set to new while bootstrapping an etcd cluster. ETCD_INITIAL_CLUSTER_TOKEN This is a token to distinguish the etcd cluster from any other etcd cluster in the same network. Assumptions ETCD_INITIAL_CLUSTER can use DNS instead of IP addresses. We need to verify this by deleting a pod (as against scaling down the statefulset) to ensure that the pod IP changes and see if the recreated pod (by the statefulset controller) re-joins the cluster automatically. DNS for the individual members is known or computable. This is true in the case of etcd-druid setting up an etcd cluster using a single statefulset. But it may not necessarily be true in other cases (multiple statefulsets per etcd cluster, deployments instead of statefulsets, or an etcd cluster with members distributed across more than one Kubernetes cluster). Adding a new member to an etcd cluster A new member can be added to an existing etcd cluster instance using the following steps.\n If the latest backup snapshot exists, restore the member’s etcd data to the latest backup snapshot. This can reduce the load on the leader to bring the new member up to date when it joins the cluster. If the latest backup snapshot doesn’t exist or if the latest backup snapshot is not accessible (please see backup failure) and if the cluster itself is quorate, then the new member can be started with empty data. But this will be suboptimal because the new member will fetch all the data from the leading member to get up-to-date. The cluster is informed that a new member is being added using the MemberAdd API including information like the member name and its advertised peer URLs. The new etcd member is then started with ETCD_INITIAL_CLUSTER_STATE=existing apart from other required configuration. 
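The steps above can be summarised as a small decision routine. The following is a minimal sketch only, not the etcd-backup-restore implementation: the function name, its two boolean inputs and the step descriptions are hypothetical, and how snapshot accessibility or quorum is actually determined is left out.

```python
def plan_member_addition(snapshot_accessible, cluster_quorate):
    '''Return the ordered steps for adding a member, mirroring the text above.

    Both arguments are hypothetical inputs for illustration; this sketch does
    not talk to etcd or to any backup store.
    '''
    steps = []
    if snapshot_accessible:
        # Preferred: restoring first reduces the load on the leader later.
        steps.append('restore data directory from latest backup snapshot')
    elif cluster_quorate:
        # Suboptimal fallback: the member fetches everything from the leader.
        steps.append('start with empty data directory')
    else:
        # Assumption: this case is not an addition but a cluster recovery.
        raise ValueError('no accessible snapshot and no quorum')
    steps.append('call MemberAdd with member name and advertised peer URLs')
    steps.append('start etcd with ETCD_INITIAL_CLUSTER_STATE=existing')
    return steps


if __name__ == '__main__':
    for step in plan_member_addition(snapshot_accessible=False, cluster_quorate=True):
        print(step)
```

Raising an error for the case with neither a snapshot nor quorum is a choice made for this sketch; the text treats that situation under recovery from failure of a majority of members.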
This proposal recommends this approach.\nNote If there are incremental snapshots (taken by etcd-backup-restore), they cannot be applied because that requires the member to be started in isolation without joining the cluster which is not possible. This is acceptable if the number of incremental snapshots is kept relatively small. This adds one more reason to increase the priority of the issue of incremental snapshot compaction. There is a time window, between the MemberAdd call and the new member joining the cluster and getting up to date, where the cluster is vulnerable to leader elections which could be disruptive. Alternative With v3.4, the new raft learner approach can be used to mitigate some of the possible disruptions mentioned above. Then the steps will be as follows.\n If the latest backup snapshot exists, restore the member’s etcd data to the latest backup snapshot. This can reduce the load on the leader to bring the new member up to date when it joins the cluster. The cluster is informed that a new member is being added using the MemberAddAsLearner API including information like the member name and its advertised peer URLs. The new etcd member is then started with ETCD_INITIAL_CLUSTER_STATE=existing apart from other required configuration. Once the new member (learner) is up to date, it can be promoted to a full voting member by using the MemberPromote API. This approach is new and involves more steps and is not recommended in this proposal. It can be considered in future enhancements.\nManaging Failures A multi-node etcd cluster may face failures of different kinds during its life-cycle. The actions that need to be taken to manage these failures depend on the failure mode.\nRemoving an existing member from an etcd cluster If a member of an etcd cluster becomes unhealthy, it must be explicitly removed from the etcd cluster, as soon as possible. This can be done by using the MemberRemove API. 
This ensures that only healthy members participate as voting members.\nA member of an etcd cluster may be removed not just for managing failures but also for other reasons such as -\n The etcd cluster is being scaled down, i.e. the cluster size is being reduced. An existing member is being replaced by a new one for some reason (e.g. upgrades). If the majority of the members of the etcd cluster are healthy and the member that is unhealthy/being removed happens to be the leader at that moment then the etcd cluster will automatically elect a new leader. But if only a minority of the etcd cluster members are healthy after removing the member then the cluster will no longer be quorate and will stop accepting write requests. Such an etcd cluster needs to be recovered via some kind of disaster-recovery.\nRestarting an existing member of an etcd cluster If the existing member of an etcd cluster restarts and retains an uncorrupted data directory after the restart, then it can simply re-join the cluster as an existing member without any API calls or configuration changes. This is because the relevant metadata (including member ID and cluster ID) is maintained in the write-ahead logs. However, if it doesn’t retain an uncorrupted data directory after the restart, then it must first be removed and added as a new member.\nRecovering an etcd cluster from failure of majority of members If a majority of members of an etcd cluster fail but they retain their uncorrupted data directories then they can be simply restarted and they will re-form the existing etcd cluster when they come up. However, if they do not retain their uncorrupted data directories, then the etcd cluster must be recovered from the latest snapshot in the backup. This is very similar to bootstrapping with the additional initial step of restoring the latest snapshot in each of the members. However, the same limitation about incremental snapshots, as in the case of adding a new member, applies here. 
But unlike in the case of adding a new member, not applying incremental snapshots is not acceptable in the case of etcd cluster recovery. Hence, if incremental snapshots are required to be applied, the etcd cluster must be recovered in the following steps.\n Restore a new single-member cluster using the latest snapshot. Apply incremental snapshots on the single-member cluster. Take a full snapshot which can now be used while adding the remaining members. Add new members using the latest snapshot created in the step above. Kubernetes Context Users will provision an etcd cluster in a Kubernetes cluster by creating an etcd CRD resource instance. A multi-node etcd cluster is indicated if the spec.replicas field is set to any value greater than 1. The etcd-druid will add validation to ensure that the spec.replicas value is an odd number according to the requirements of etcd. The etcd-druid controller will provision a statefulset with the etcd main container and the etcd-backup-restore sidecar container. It will pass on the spec.replicas field from the etcd resource to the statefulset. It will also supply the right pre-computed configuration to both the containers. The statefulset controller will create the pods based on the pod template in the statefulset spec and these individual pods will be the members that form the etcd cluster. This approach makes it possible to satisfy the assumption that the DNS for the individual members of the etcd cluster must be known/computable. This can be achieved by using a headless service (along with the statefulset) for each etcd cluster instance. Then we can address individual pods/etcd members via the predictable DNS name of \u003cstatefulset_name\u003e-{0|1|2|3|…|n}.\u003cheadless_service_name\u003e from within the Kubernetes namespace (or from outside the Kubernetes namespace by appending .\u003cnamespace\u003e.svc.\u003ccluster_domain\u003e suffix). 
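As a rough illustration of why the predictable DNS names make the static bootstrap configuration computable, the sketch below derives an ETCD_INITIAL_CLUSTER value from a statefulset name, a headless service name and a replica count. It is only a sketch under assumptions: the function name, the example service name etcd-main-peer and the https scheme are hypothetical, and 2380 is simply etcd's conventional default peer port rather than something this proposal prescribes.

```python
def initial_cluster(statefulset, headless_service, replicas,
                    scheme='https', peer_port=2380):
    '''Build a name=peerURL list, one entry per statefulset pod ordinal.

    Uses the short in-namespace DNS form <pod>.<headless_service>; the
    member name is assumed to be the pod name.
    '''
    members = []
    for ordinal in range(replicas):
        pod = f'{statefulset}-{ordinal}'
        members.append(f'{pod}={scheme}://{pod}.{headless_service}:{peer_port}')
    return ','.join(members)


if __name__ == '__main__':
    # Hypothetical names, for illustration only.
    print(initial_cluster('etcd-main', 'etcd-main-peer', 3))
```

Because the value depends only on names and the replica count, the same string can be handed to every member through a shared template.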
The etcd-druid controller can compute the above configurations automatically based on the spec.replicas in the etcd resource.\nThis proposal recommends this approach.\nAlternative One statefulset is used for each member (instead of one statefulset for all members). While this approach gives the flexibility to have different pod specifications for the individual members, it makes managing the individual members (e.g. rolling updates) more complicated. Hence, this approach is not recommended.\nETCD Configuration As mentioned in the general approach section, there are differences in the configuration that needs to be passed to individual members of an etcd cluster in different scenarios such as bootstrapping, adding a new member, removing a member, restarting an existing member etc. Managing such differences in configuration for individual pods of a statefulset is tricky in the recommended approach of using a single statefulset to manage all the member pods of an etcd cluster. This is because a statefulset uses the same pod template for all its pods.\nThe recommendation is for etcd-druid to provision the base configuration template in a ConfigMap which is passed to all the pods via the pod template in the StatefulSet. The initialization flow of etcd-backup-restore (which is invoked every time the etcd container is (re)started) is then enhanced to generate the customized etcd configuration for the corresponding member pod (in a shared volume between etcd and the backup-restore containers) based on the supplied template configuration. This requires etcd-backup-restore to have a mechanism to detect which of the scenarios listed above applies during any given member container/pod restart.\nAlternative As mentioned above, one statefulset is used for each member of the etcd cluster. Then different configuration (generated directly by etcd-druid) can be passed in the pod templates of the different statefulsets. 
Though this approach is advantageous in the context of managing the different configuration, it is not recommended in this proposal because it makes the rest of the management (e.g. rolling updates) more complicated.\nData Persistence The type of persistence used to store etcd data (including the member ID and cluster ID) has an impact on the steps that need to be taken when the member pods or containers (a minority or a majority of them) need to be recovered.\nPersistent Like the single-node case, persistentvolumes can be used to persist ETCD data for all the member pods. The individual member pods then get their own persistentvolumes. The advantage is that individual members retain their member ID across pod restarts and even pod deletion/recreation across Kubernetes nodes. This means that member pods that crash (or are unhealthy) can be restarted automatically (by configuring livenessProbe) and they will re-join the etcd cluster using their existing member ID without any need for explicit etcd cluster management.\nThe disadvantages of this approach are as follows.\n The number of persistentvolumes increases linearly with the cluster size which is a cost-related concern. Network-mounted persistentvolumes might eventually become a performance bottleneck under heavy load for a latency-sensitive component like ETCD. Volume attach/detach issues when associated with etcd cluster instances cause downtimes to the target shoot clusters that are backed by those etcd cluster instances. Ephemeral The ephemeral volumes use-case is considered as an optimization and may be planned as a follow-up action.\nDisk Ephemeral persistence can be achieved in Kubernetes by using either emptyDir volumes or local persistentvolumes to persist ETCD data. The advantages of this approach are as follows.\n Potentially faster disk I/O. The number of persistent volumes does not increase linearly with the cluster size (at least not technically). 
Issues related to volume attachment/detachment can be avoided. The main disadvantage of using ephemeral persistence is that the individual members may retain their identity and data across container restarts but not across pod deletion/recreation across Kubernetes nodes. If the data is lost then on restart of the member pod, the older member (represented by the container) has to be removed and a new member has to be added.\nUsing emptyDir ephemeral persistence has the disadvantage that the volume doesn’t have its own identity. So, even if the member pod is recreated and scheduled on the same node as before, it will not retain its identity as the persistence is lost. But it has the advantage that scheduling of pods is unencumbered especially during pod recreation as they are free to be scheduled anywhere.\nUsing local persistentvolumes has the advantage that the volume has its own identity and hence, a recreated member pod will retain its identity if scheduled on the same node. But it has the disadvantage of tying down the member pod to a node which is a problem if the node becomes unhealthy requiring etcd druid to take additional actions (such as deleting the local persistent volume).\nBased on these constraints, if ephemeral persistence is opted for, it is recommended to use emptyDir ephemeral persistence.\nIn-memory In-memory ephemeral persistence can be achieved in Kubernetes by using emptyDir with medium: Memory. In this case, a tmpfs (RAM-backed file-system) volume will be used. In addition to the advantages of ephemeral persistence, this approach can achieve the fastest possible disk I/O. Similarly, in addition to the disadvantages of ephemeral persistence, in-memory persistence has the following additional disadvantages.\n More memory required for the individual member pods. Individual members may not at all retain their data and identity across container restarts let alone across pod restarts/deletion/recreation across Kubernetes nodes. I.e. 
every time an etcd container restarts, the old member (represented by the container) will have to be removed and a new member has to be added. How to detect if valid metadata exists in an etcd member Since the likelihood of a member not having valid metadata in the WAL files is much higher in the ephemeral persistence scenario, one option is to pass the information that ephemeral persistence is being used to the etcd-backup-restore sidecar (say, via command-line flags or environment variables).\nBut in principle, it might be better to determine this from the WAL files directly so that the possibility of corrupted WAL files also gets handled correctly. To do this, the wal package has some functions that might be useful.\nRecommendation Using the wal package to verify whether valid metadata exists might be performance intensive. So, the performance impact needs to be measured. If the performance impact is acceptable (both in terms of resource usage and time), it is recommended to use this way to verify if the member contains valid metadata. Otherwise, alternatives such as a simple check that the WAL folder exists coupled with the static information about use of persistent or ephemeral storage might be considered.\nHow to detect if valid data exists in an etcd member The initialization sequence in etcd-backup-restore already includes database verification. This would suffice to determine if the member has valid data.\nRecommendation Though ephemeral persistence has performance and logistics advantages, it is recommended to start with persistent data for the member pods. In addition to the reasons and concerns listed above, there is also the additional concern that in case of backup failure, the risk of additional data loss is a bit higher if ephemeral persistence is used (simultaneous quorum loss is sufficient) when compared to persistent storage (simultaneous quorum loss with majority persistence loss is needed). 
The risk might still be acceptable but the idea is to gain experience about how frequently member containers/pods get restarted/recreated, how frequently leader election happens among members of an etcd cluster and how frequently etcd clusters lose quorum. Based on this experience, we can move towards using ephemeral (perhaps even in-memory) persistence for the member pods.\nSeparating peer and client traffic The current single-node ETCD cluster implementation in etcd-druid and etcd-backup-restore uses a single service object to act as the entry point for the client traffic. There is no separation or distinction between the client and peer traffic because there is not much benefit to be had by making that distinction.\nIn the multi-node ETCD cluster scenario, it makes sense to distinguish between and separate the peer and client traffic. This can be done by using two services.\n peer To be used for peer communication. This could be a headless service. client To be used for client communication. This could be a normal ClusterIP service like it is in the single-node case. The main advantage of this approach is that it makes it possible (if needed) to allow only peer to peer communication while blocking client communication. Such a thing might be required during some phases of some maintenance tasks (manual or automated).\nCutting off client requests At present, in the single-node ETCD instances, etcd-druid configures the readinessProbe of the etcd main container to probe the healthz endpoint of the etcd-backup-restore sidecar which considers the status of the latest backup upload in addition to the regular checks about etcd and the side car being up and healthy. This has the effect of setting the etcd main container (and hence the etcd pod) as not ready if the latest backup upload failed. 
This results in the endpoints controller removing the pod IP address from the endpoints list for the service which eventually cuts off ingress traffic coming into the etcd pod via the etcd client service. The rationale for this is to fail early when the backup upload fails rather than continuing to serve requests while the gap between the last backup and the current data increases which might lead to an unacceptably large amount of data loss if disaster strikes.\nThis approach will not work in the multi-node scenario because we need the individual member pods to be able to talk to each other to maintain the cluster quorum when backup upload fails but need to cut off only client ingress traffic.\nIt is recommended to separate the backup health condition tracking from the taking of appropriate remedial actions. With that, the backup health condition tracking is now separated to the BackupReady condition in the Etcd resource status and the cutting off of client traffic (which could now be done for more reasons than failed backups) can be achieved in a different way described below.\nManipulating Client Service podSelector The client traffic can be cut off by updating (manually or automatically by some component) the podSelector of the client service to add an additional label (say, unhealthy or disabled) such that the podSelector no longer matches the member pods created by the statefulset. This will result in the client ingress traffic being cut off. 
The peer service is left unmodified so that peer communication is always possible.\nHealth Check The etcd main container and the etcd-backup-restore sidecar containers will be configured with livenessProbe and readinessProbe which will indicate the health of the containers and effectively the corresponding ETCD cluster member pod.\nBackup Failure As described above, using readinessProbe failures based on latest backup failure is not viable in the multi-node ETCD scenario.\nThough cutting off traffic by manipulating client service podSelector is workable, it may not be desirable.\nIt is recommended that on backup failure, the leading etcd-backup-restore sidecar (the one that is responsible for taking backups at that point in time, as explained in the backup section below) updates the BackupReady condition in the Etcd status and raises a high priority alert to the landscape operators but does not cut off the client traffic.\nThe reasoning behind this decision to not cut off the client traffic on backup failure is to allow the Kubernetes cluster’s control plane (which relies on the ETCD cluster) to keep functioning as long as possible and to avoid bringing down the control-plane due to a missed backup.\nThe risk of this approach is that with a cascaded sequence of failures (on top of the backup failure), there is a chance of more data loss than the frequency of backup would otherwise indicate.\nTo be precise, the risk of such an additional data loss manifests only when backup failure as well as a special case of quorum loss (majority of the members are not ready) happen in such a way that the ETCD cluster needs to be re-bootstrapped from the backup. 
As described here, re-bootstrapping the ETCD cluster requires restoration from the latest backup only when a majority of members no longer have uncorrupted data persistence.\nIf persistent storage is used, this will happen only when backup failure as well as a majority of the disks/volumes backing the ETCD cluster members fail simultaneously. This would indeed be rare and might be an acceptable risk.\nIf ephemeral storage is used (especially, in-memory), the data loss will happen if a majority of the ETCD cluster members become NotReady (requiring a pod restart) at the same time as the backup failure. This may not be as rare as majority members’ disk/volume failure. The risk can be somewhat mitigated at least for planned maintenance operations by postponing potentially disruptive maintenance operations when the BackupReady condition is false (vertical scaling, rolling updates, evictions due to node roll-outs).\nBut in practice (when ephemeral storage is used), the current proposal suggests restoring from the latest full backup even when a minority of ETCD members (even a single pod) restart, both to speed up the process of the new member catching up to the latest revision and to avoid load on the leading member which needs to supply the data to bring the new member up-to-date. But as described here, in case of a minority member failure while using ephemeral storage, it is possible to restart the new member with empty data and let it fetch all the data from the leading member (only if backup is not accessible). Though this is suboptimal, it is workable given the constraints and conditions. With this, the risk of additional data loss in the case of ephemeral storage arises only if backup failure as well as quorum loss happen. While this is still less rare than the risk of additional data loss in case of persistent storage, the risk might be tolerable, provided the risk of quorum loss is not too high. 
This needs to be monitored/evaluated before opting for ephemeral storage.\nGiven these constraints, it is better to dynamically avoid/postpone some potentially disruptive operations when the BackupReady condition is false. This has the effect of allowing n/2 members to be evicted when the backups are healthy and completely disabling evictions when backups are not healthy.\n Skip/postpone potentially disruptive maintenance operations (listed below) when the BackupReady condition is false. Vertical scaling. Rolling updates. Basically, any updates to the StatefulSet spec, which includes vertical scaling. Dynamically toggle the minAvailable field of the PodDisruptionBudget between n/2 + 1 and n (where n is the ETCD desired cluster size) whenever the BackupReady condition toggles between true and false. This will mean that etcd-backup-restore becomes Kubernetes-aware. But there might be reasons for making etcd-backup-restore Kubernetes-aware anyway (e.g. to update the etcd resource status with latest full snapshot details). This enhancement should keep etcd-backup-restore backward compatible. I.e. it should be possible to use etcd-backup-restore Kubernetes-unaware as before this proposal. This is possible either by auto-detecting the existence of kubeconfig or by an explicit command-line flag (such as --enable-client-service-updates which can be defaulted to false for backward compatibility).\nAlternative The alternative is for etcd-druid to implement the above functionality.\nBut etcd-druid is centrally deployed in the host Kubernetes cluster and cannot scale well horizontally. So, it can potentially be a bottleneck if it is involved in a regular health check mechanism for all the etcd clusters it manages. 
Also, the recommended approach above is more robust because it can work even if etcd-druid is down when the backup upload of a particular etcd cluster fails.\nStatus It is desirable (for the etcd-druid and landscape administrators/operators) to maintain/expose status of the etcd cluster instances in the status sub-resource of the Etcd CRD. The proposed structure for maintaining the status is as shown in the example below.\napiVersion: druid.gardener.cloud/v1alpha1 kind: Etcd metadata: name: etcd-main spec: replicas: 3 ... ... status: ... conditions: - type: Ready # Condition type for the readiness of the ETCD cluster status: \"True\" # Indicates if the ETCD Cluster is ready or not lastHeartbeatTime: \"2020-11-10T12:48:01Z\" lastTransitionTime: \"2020-11-10T12:48:01Z\" reason: Quorate # Quorate|QuorumLost - type: AllMembersReady # Condition type for the readiness of all the members of the ETCD cluster status: \"True\" # Indicates if all the members of the ETCD Cluster are ready lastHeartbeatTime: \"2020-11-10T12:48:01Z\" lastTransitionTime: \"2020-11-10T12:48:01Z\" reason: AllMembersReady # AllMembersReady|NotAllMembersReady - type: BackupReady # Condition type for the readiness of the backup of the ETCD cluster status: \"True\" # Indicates if the backup of the ETCD cluster is ready lastHeartbeatTime: \"2020-11-10T12:48:01Z\" lastTransitionTime: \"2020-11-10T12:48:01Z\" reason: FullBackupSucceeded # FullBackupSucceeded|IncrementalBackupSucceeded|FullBackupFailed|IncrementalBackupFailed ... clusterSize: 3 ... replicas: 3 ... 
members: - name: etcd-main-0 # member pod name id: 272e204152 # member Id role: Leader # Member|Leader status: Ready # Ready|NotReady|Unknown lastTransitionTime: \"2020-11-10T12:48:01Z\" reason: LeaseSucceeded # LeaseSucceeded|LeaseExpired|UnknownGracePeriodExceeded|PodNotReady - name: etcd-main-1 # member pod name id: 272e204153 # member Id role: Member # Member|Leader status: Ready # Ready|NotReady|Unknown lastTransitionTime: \"2020-11-10T12:48:01Z\" reason: LeaseSucceeded # LeaseSucceeded|LeaseExpired|UnknownGracePeriodExceeded|PodNotReady This proposal recommends that etcd-druid (preferably, the custodian controller in etcd-druid) maintains most of the information in the status of the Etcd resources described above.\nOne exception to this is the BackupReady condition which is recommended to be maintained by the leading etcd-backup-restore sidecar container. This will mean that etcd-backup-restore becomes Kubernetes-aware. But there are other reasons for making etcd-backup-restore Kubernetes-aware anyway (e.g. to maintain health conditions). This enhancement should keep etcd-backup-restore backward compatible. I.e. it should be possible to use etcd-backup-restore Kubernetes-unaware as before this proposal. This is possible either by auto-detecting the existence of kubeconfig or by an explicit command-line flag (such as --enable-etcd-status-updates which can be defaulted to false for backward compatibility).\nMembers The members section of the status is intended to be maintained by etcd-druid (preferably, the custodian controller of etcd-druid) based on the leases of the individual members.\nNote An earlier design in this proposal was for the individual etcd-backup-restore sidecars to update the corresponding status.members entries themselves. 
But this was redesigned to use member leases to avoid conflicts arising from frequent updates and the limitations in the support for Server-Side Apply in some versions of Kubernetes.\nThe spec.holderIdentity field in the leases is used to communicate the ETCD member id and role between the etcd-backup-restore sidecars and etcd-druid.\nMember name as the key In an ETCD cluster, the member id is the unique identifier for a member. However, this proposal recommends using a single StatefulSet whose pods form the members of the ETCD cluster and Pods of a StatefulSet have uniquely indexed names as well as uniquely addressable DNS.\nThis proposal recommends that the name of the member (which is the same as the name of the member Pod) be used as the unique key to identify a member in the members array. This can minimise, to some extent, the need to clean up superfluous entries in the members array after the member pods are gone, because the replacement pods for any member will share the same name and will overwrite the entry with a possibly new member id.\nThere is still the possibility of not only superfluous entries in the members array but also superfluous members in the ETCD cluster for which there is no corresponding pod in the StatefulSet anymore.\nFor example, if an ETCD cluster is scaled up from 3 to 5 and the new members fail constantly due to insufficient resources, and the ETCD cluster is then scaled back down to 3, the failing member pods may not have the chance to clean up their member entries (from the members array as well as from the ETCD cluster). This leads to superfluous members in the cluster that may have an adverse effect on the quorum of the cluster.\nHence, the superfluous entries in both members array as well as the ETCD cluster need to be cleaned up as appropriate.\nMember Leases One Kubernetes lease object per desired ETCD member is maintained by etcd-druid (preferably, the custodian controller in etcd-druid). 
The lease objects will be created in the same namespace as their owning Etcd object and will have the same name as the member to which they correspond (which, in turn, would be the same as the pod name in which the member ETCD process runs).\nThe lease objects are created and deleted only by etcd-druid but are continually renewed within the leaseDurationSeconds by the individual etcd-backup-restore sidecars (corresponding to their members) if the corresponding ETCD member is ready and is part of the ETCD cluster.\nThis will mean that etcd-backup-restore becomes Kubernetes-aware. But there are other reasons for making etcd-backup-restore Kubernetes-aware anyway (e.g. to maintain health conditions). This enhancement should keep etcd-backup-restore backward compatible. I.e. it should be possible to use etcd-backup-restore Kubernetes-unaware as before this proposal. This is possible either by auto-detecting the existence of kubeconfig or by an explicit command-line flag (such as --enable-etcd-lease-renewal which can be defaulted to false for backward compatibility).\nA member entry in the Etcd resource status would be marked as Ready (with reason: LeaseSucceeded) if the corresponding pod is ready and the corresponding lease has not yet expired. The member entry would be marked as NotReady if the corresponding pod is not ready (with reason PodNotReady) or as Unknown if the corresponding lease has expired (with reason: LeaseExpired).\nWhile renewing the lease, the etcd-backup-restore sidecars also maintain the ETCD member id and their role (Leader or Member) separated by : in the spec.holderIdentity field of the corresponding lease object since this information is only available to the ETCD member processes and the etcd-backup-restore sidecars (e.g. 272e204152:Leader or 272e204153:Member). 
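As a minimal sketch of how the spec.holderIdentity convention above could be consumed, the following Python snippet splits a value such as 272e204152:Leader into a member id and a role. The names MemberIdentity and parse_holder_identity are hypothetical and used only for illustration; they are not part of etcd-druid or etcd-backup-restore. The sketch assumes that an empty value simply means the lease has not been renewed by a sidecar yet.

```python
from typing import NamedTuple, Optional


class MemberIdentity(NamedTuple):
    """Hypothetical container for the id and role carried in a member lease."""
    member_id: str
    role: str


def parse_holder_identity(holder_identity: Optional[str]) -> Optional[MemberIdentity]:
    """Parse '<member id>:<role>' as described for spec.holderIdentity.

    Returns None for an empty value (assumed to mean: lease created by
    etcd-druid but not yet renewed by the sidecar).
    """
    if not holder_identity:
        return None
    member_id, sep, role = holder_identity.partition(":")
    # The proposal names exactly two roles: Leader and Member.
    if not sep or not member_id or role not in ("Leader", "Member"):
        raise ValueError(f"unexpected holderIdentity: {holder_identity!r}")
    return MemberIdentity(member_id, role)
```

A caller would copy the two fields onto the id and role of the matching status.members entry, and leave the entry unchanged when the result is None.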
When the lease objects are created by etcd-druid, the spec.holderIdentity field would be empty.\nThe value in spec.holderIdentity in the leases is parsed and copied onto the id and role fields of the corresponding status.members by etcd-druid.\nConditions The conditions section in the status describes the overall condition of the ETCD cluster. The condition type Ready indicates if the ETCD cluster as a whole is ready to serve requests (i.e. the cluster is quorate) even though some minority of the members are not ready. The condition type AllMembersReady indicates if all the members of the ETCD cluster are ready. The distinction between these conditions could be significant for both external consumers of the status as well as etcd-druid itself. Some maintenance operations might be safe to do (e.g. rolling updates) only when all members of the cluster are ready. The condition type BackupReady indicates if the most recent backup upload (full or incremental) succeeded. This information also might be significant because some maintenance operations might be safe to do (e.g. anything that involves re-bootstrapping the ETCD cluster) only when backup is ready.\nThe Ready and AllMembersReady conditions can be maintained by etcd-druid based on the status in the members section. The BackupReady condition will be maintained by the leading etcd-backup-restore sidecar that is in charge of taking backups.\nMore condition types could be introduced in the future if specific purposes arise.\nClusterSize The clusterSize field contains the current size of the ETCD cluster. It will be actively kept up-to-date by etcd-druid in all scenarios.\n Before bootstrapping the ETCD cluster (during cluster creation or later bootstrapping because of quorum failure), etcd-druid will clear the status.members array and set status.clusterSize to be equal to spec.replicas. 
While the ETCD cluster is quorate, etcd-druid will actively set status.clusterSize to be equal to length of the status.members whenever the length of the array changes (say, due to scaling of the ETCD cluster). Given that clusterSize reliably represents the size of the ETCD cluster, it can be used to calculate the Ready condition.\nAlternative The alternative is for etcd-druid to maintain the status in the Etcd status sub-resource. But etcd-druid is centrally deployed in the host Kubernetes cluster and cannot scale well horizontally. So, it can potentially be a bottleneck if it is involved in regular health check mechanism for all the etcd clusters it manages. Also, the recommended approach above is more robust because it can work even if etcd-druid is down when the backup upload of a particular etcd cluster fails.\nDecision table for etcd-druid based on the status The following decision table describes the various criteria etcd-druid takes into consideration to determine the different etcd cluster management scenarios and the corresponding reconciliation actions it must take. The general principle is to detect the scenario and take the minimum action to move the cluster along the path to good health. The path from any one scenario to a state of good health will typically involve going through multiple reconciliation actions which probably take the cluster through many other cluster management scenarios. Especially, it is proposed that individual members auto-heal where possible, even in the case of the failure of a majority of members of the etcd cluster and that etcd-druid takes action only if the auto-healing doesn’t happen for a configured period of time.\n1. Pink of health Observed state Cluster Size Desired: n Current: n StatefulSet replicas Desired: n Ready: n Etcd status members Total: n Ready: n Members NotReady for long enough to be evicted, i.e. 
lastTransitionTime \u003e notReadyGracePeriod: 0 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: 0 conditions: Ready: true AllMembersReady: true BackupReady: true Recommended Action Nothing to do\n2. Member status is out of sync with their leases Observed state Cluster Size Desired: n Current: n StatefulSet replicas Desired: n Ready: n Etcd status members Total: n Ready: r Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: 0 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: l conditions: Ready: true AllMembersReady: true BackupReady: true Recommended Action Mark the l members corresponding to the expired leases as Unknown with reason LeaseExpired and with id populated from spec.holderIdentity of the lease if they are not already updated so.\nMark the n - l members corresponding to the active leases as Ready with reason LeaseSucceeded and with id populated from spec.holderIdentity of the lease if they are not already updated so.\nPlease refer here for more details.\n3. All members are Ready but AllMembersReady condition is stale Observed state Cluster Size Desired: N/A Current: N/A StatefulSet replicas Desired: n Ready: N/A Etcd status members Total: n Ready: n Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: 0 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: 0 conditions: Ready: N/A AllMembersReady: false BackupReady: N/A Recommended Action Mark the status condition type AllMembersReady to true.\n4. 
Not all members are Ready but AllMembersReady condition is stale Observed state Cluster Size\n Desired: N/A Current: N/A StatefulSet replicas\n Desired: n Ready: N/A Etcd status\n members Total: N/A Ready: r where 0 \u003c= r \u003c n Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: nr where 0 \u003c nr \u003c n Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: u where 0 \u003c u \u003c n Members with expired lease: h where 0 \u003c h \u003c n conditions: Ready: N/A AllMembersReady: true BackupReady: N/A where (nr + u + h) \u003e 0 or r \u003c n\n Recommended Action Mark the status condition type AllMembersReady to false.\n5. Majority members are Ready but Ready condition is stale Observed state Cluster Size\n Desired: N/A Current: N/A StatefulSet replicas\n Desired: n Ready: N/A Etcd status\n members Total: n Ready: r where r \u003e n/2 Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: nr where 0 \u003c nr \u003c n/2 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: u where 0 \u003c u \u003c n/2 Members with expired lease: N/A conditions: Ready: false AllMembersReady: N/A BackupReady: N/A where 0 \u003c (nr + u + h) \u003c n/2\n Recommended Action Mark the status condition type Ready to true.\n6. Majority members are NotReady but Ready condition is stale Observed state Cluster Size\n Desired: N/A Current: N/A StatefulSet replicas\n Desired: n Ready: N/A Etcd status\n members Total: n Ready: r where 0 \u003c r \u003c n Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: nr where 0 \u003c nr \u003c n Members with readiness status Unknown long enough to be considered NotReady, i.e. 
lastTransitionTime \u003e unknownGracePeriod: u where 0 \u003c u \u003c n Members with expired lease: N/A conditions: Ready: true AllMembersReady: N/A BackupReady: N/A where (nr + u + h) \u003e n/2 or r \u003c n/2\n Recommended Action Mark the status condition type Ready to false.\n7. Some members have been in Unknown status for a while Observed state Cluster Size Desired: N/A Current: n StatefulSet replicas Desired: N/A Ready: N/A Etcd status members Total: N/A Ready: N/A Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: N/A Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: u where u \u003c= n Members with expired lease: N/A conditions: Ready: N/A AllMembersReady: N/A BackupReady: N/A Recommended Action Mark the u members as NotReady in Etcd status with reason: UnknownGracePeriodExceeded.\n8. Some member pods are not Ready but have not had the chance to update their status Observed state Cluster Size Desired: N/A Current: n StatefulSet replicas Desired: n Ready: s where s \u003c n Etcd status members Total: N/A Ready: N/A Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: N/A Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: N/A Members with expired lease: N/A conditions: Ready: N/A AllMembersReady: N/A BackupReady: N/A Recommended Action Mark the n - s members (corresponding to the pods that are not Ready) as NotReady in Etcd status with reason: PodNotReady\n9. Quorate cluster with a minority of members NotReady Observed state Cluster Size Desired: N/A Current: n StatefulSet replicas Desired: N/A Ready: N/A Etcd status members Total: n Ready: n - f Members NotReady for long enough to be evicted, i.e. 
lastTransitionTime \u003e notReadyGracePeriod: f where f \u003c n/2 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: N/A conditions: Ready: true AllMembersReady: false BackupReady: true Recommended Action Delete the f NotReady member pods to force restart of the pods if they do not automatically restart via failed livenessProbe. The expectation is that they will either re-join the cluster as an existing member or remove themselves and join as new members on restart of the container or pod and renew their leases.\n10. Quorum lost with a majority of members NotReady Observed state Cluster Size Desired: N/A Current: n StatefulSet replicas Desired: N/A Ready: N/A Etcd status members Total: n Ready: n - f Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: f where f \u003e= n/2 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: N/A Members with expired lease: N/A conditions: Ready: false AllMembersReady: false BackupReady: true Recommended Action Scale down the StatefulSet to replicas: 0. Ensure that all member pods are deleted. Ensure that all the members are removed from Etcd status. Delete and recreate all the member leases. Recover the cluster from loss of quorum as discussed here.\n11. Scale up of a healthy cluster Observed state Cluster Size Desired: d Current: n where d \u003e n StatefulSet replicas Desired: N/A Ready: n Etcd status members Total: n Ready: n Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: 0 Members with readiness status Unknown long enough to be considered NotReady, i.e. 
lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: 0 conditions: Ready: true AllMembersReady: true BackupReady: true Recommended Action Add d - n new members by scaling the StatefulSet to replicas: d. The rest of the StatefulSet spec need not be updated until the next cluster bootstrapping (alternatively, the rest of the StatefulSet spec can be updated pro-actively once the new members join the cluster. This will trigger a rolling update).\nAlso, create the additional member leases for the d - n new members.\n12. Scale down of a healthy cluster Observed state Cluster Size Desired: d Current: n where d \u003c n StatefulSet replicas Desired: n Ready: n Etcd status members Total: n Ready: n Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: 0 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: 0 conditions: Ready: true AllMembersReady: true BackupReady: true Recommended Action Remove n - d existing members (numbered d, d + 1 … n) by scaling the StatefulSet to replicas: d. The StatefulSet spec need not be updated until the next cluster bootstrapping (alternatively, the StatefulSet spec can be updated pro-actively once the superfluous members exit the cluster. This will trigger a rolling update).\nAlso, delete the member leases for the n - d members being removed.\nThe superfluous entries in the members array will be cleaned up as explained here. The superfluous members in the ETCD cluster will be cleaned up by the leading etcd-backup-restore sidecar.\n13. Superfluous member entries in Etcd status Observed state Cluster Size Desired: N/A Current: n StatefulSet replicas Desired: n Ready: n Etcd status members Total: m where m \u003e n Ready: N/A Members NotReady for long enough to be evicted, i.e. 
lastTransitionTime \u003e notReadyGracePeriod: N/A Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: N/A Members with expired lease: N/A conditions: Ready: N/A AllMembersReady: N/A BackupReady: N/A Recommended Action Remove the superfluous m - n member entries from Etcd status (numbered n, n+1 … m). Remove the superfluous m - n member leases if they exist. The superfluous members in the ETCD cluster will be cleaned up by the leading etcd-backup-restore sidecar.\nDecision table for etcd-backup-restore during initialization As discussed above, the initialization sequence of etcd-backup-restore in a member pod needs to generate suitable etcd configuration for its etcd container. It also might have to handle the etcd database verification and restoration functionality differently in different scenarios.\nThe initialization sequence itself is proposed to be as follows. It is an enhancement of the existing initialization sequence. The details of the decisions to be taken during the initialization are given below.\n1. First member during bootstrap of a fresh etcd cluster Observed state Cluster Size: n Etcd status members: Total: 0 Ready: 0 Status contains own member: false Data persistence WAL directory has cluster/ member metadata: false Data directory is valid and up-to-date: false Backup Backup exists: false Backup has incremental snapshots: false Recommended Action Generate etcd configuration with n initial cluster peer URLs and initial cluster state new and return success.\n2. 
Addition of a new following member during bootstrap of a fresh etcd cluster Observed state Cluster Size: n Etcd status members: Total: m where 0 \u003c m \u003c n Ready: m Status contains own member: false Data persistence WAL directory has cluster/ member metadata: false Data directory is valid and up-to-date: false Backup Backup exists: false Backup has incremental snapshots: false Recommended Action Generate etcd configuration with n initial cluster peer URLs and initial cluster state new and return success.\n3. Restart of an existing member of a quorate cluster with valid metadata and data Observed state Cluster Size: n Etcd status members: Total: m where m \u003e n/2 Ready: r where r \u003e n/2 Status contains own member: true Data persistence WAL directory has cluster/ member metadata: true Data directory is valid and up-to-date: true Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action Re-use previously generated etcd configuration and return success.\n4. Restart of an existing member of a quorate cluster with valid metadata but without valid data Observed state Cluster Size: n Etcd status members: Total: m where m \u003e n/2 Ready: r where r \u003e n/2 Status contains own member: true Data persistence WAL directory has cluster/ member metadata: true Data directory is valid and up-to-date: false Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action Remove self as a member (old member ID) from the etcd cluster as well as Etcd status. Add self as a new member of the etcd cluster as well as in the Etcd status. If backups do not exist, create an empty data and WAL directory. If backups exist, restore only the latest full snapshot (please see here for the reason for not restoring incremental snapshots). Generate etcd configuration with n initial cluster peer URLs and initial cluster state existing and return success.\n5. 
Restart of an existing member of a quorate cluster without valid metadata Observed state Cluster Size: n Etcd status members: Total: m where m \u003e n/2 Ready: r where r \u003e n/2 Status contains own member: true Data persistence WAL directory has cluster/ member metadata: false Data directory is valid and up-to-date: N/A Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action Remove self as a member (old member ID) from the etcd cluster as well as Etcd status. Add self as a new member of the etcd cluster as well as in the Etcd status. If backups do not exist, create an empty data and WAL directory. If backups exist, restore only the latest full snapshot (please see here for the reason for not restoring incremental snapshots). Generate etcd configuration with n initial cluster peer URLs and initial cluster state existing and return success.\n6. Restart of an existing member of a non-quorate cluster with valid metadata and data Observed state Cluster Size: n Etcd status members: Total: m where m \u003c n/2 Ready: r where r \u003c n/2 Status contains own member: true Data persistence WAL directory has cluster/ member metadata: true Data directory is valid and up-to-date: true Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action Re-use previously generated etcd configuration and return success.\n7. Restart of the first member of a non-quorate cluster without valid data Observed state Cluster Size: n Etcd status members: Total: 0 Ready: 0 Status contains own member: false Data persistence WAL directory has cluster/ member metadata: N/A Data directory is valid and up-to-date: false Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action If backups do not exist, create an empty data and WAL directory. If backups exist, restore the latest full snapshot. Start a single-node embedded etcd with initial cluster peer URLs containing only own peer URL and initial cluster state new. 
If incremental snapshots exist, apply them serially (honouring source transactions). Take and upload a full snapshot after incremental snapshots are applied successfully (please see here for more reasons why). Generate etcd configuration with n initial cluster peer URLs and initial cluster state new and return success.\n8. Restart of a following member of a non-quorate cluster without valid data Observed state Cluster Size: n Etcd status members: Total: m where 1 \u003c m \u003c n Ready: r where 1 \u003c r \u003c n Status contains own member: false Data persistence WAL directory has cluster/ member metadata: N/A Data directory is valid and up-to-date: false Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action If backups do not exist, create an empty data and WAL directory. If backups exist, restore only the latest full snapshot (please see here for the reason for not restoring incremental snapshots). Generate etcd configuration with n initial cluster peer URLs and initial cluster state existing and return success.\nBackup Only one of the etcd-backup-restore sidecars among the members is required to take the backup for a given ETCD cluster. This can be called a backup leader. There are two possibilities to ensure this.\nLeading ETCD main container’s sidecar is the backup leader The backup-restore sidecar could poll the etcd cluster and/or its own etcd main container to see if it is the leading member in the etcd cluster. This information can be used by the backup-restore sidecars to decide that the sidecar of the leading etcd main container is the backup leader (i.e. responsible for taking/uploading backups regularly).\nThe advantages of this approach are as follows.\n The approach is operationally and conceptually simple. The leading etcd container and backup-restore sidecar are always located in the same pod. Network traffic between the backup container and the etcd cluster will always be local. 
The disadvantage is that this approach may not age well in the future if we think about moving the backup-restore container as a separate pod rather than a sidecar container.\nIndependent leader election between backup-restore sidecars We could use the etcd lease mechanism to perform leader election among the backup-restore sidecars. For example, using something like go.etcd.io/etcd/clientv3/concurrency.\nThe advantages and disadvantages are pretty much the opposite of the approach above. The advantage being that this approach may age well in the future if we think about moving the backup-restore container as a separate pod rather than a sidecar container.\nThe disadvantages are as follows.\n The approach is operationally and conceptually a bit complex. The leading etcd container and backup-restore sidecar might potentially belong to different pods. Network traffic between the backup container and the etcd cluster might potentially be across nodes. History Compaction This proposal recommends configuring automatic history compaction on the individual members.\nDefragmentation Defragmentation is already triggered periodically by etcd-backup-restore. This proposal recommends enhancing this functionality so that it is performed only by the leading backup-restore container. The defragmentation must be performed only when the etcd cluster is in full health and must be done in a rolling manner for each member to avoid disruption. The leading member should be defragmented last after all the rest of the members have been defragmented to minimise potential leadership changes caused by defragmentation. If the etcd cluster is unhealthy when it is time to trigger scheduled defragmentation, the defragmentation must be postponed until the cluster becomes healthy. This check must be done before triggering defragmentation for each member.\nWork-flows in etcd-backup-restore There are different work-flows in etcd-backup-restore. 
Some existing flows like initialization, scheduled backups and defragmentation have been enhanced or modified. Some new work-flows like status updates have been introduced. Some of these work-flows are sensitive to which etcd-backup-restore container is leading and some are not.\nThe life-cycle of these work-flows is shown below. Work-flows independent of leader election in all members Serve the HTTP API that all members are currently expected to support. However, the HTTP API calls that are used to take an out-of-sync delta or full snapshot should delegate the incoming HTTP requests to the leading sidecar. One possible approach to achieve this is via an HTTP reverse proxy. Check the health of the respective etcd member and renew the corresponding member lease. Work-flows only on the leading member Take backups (full and incremental) at configured regular intervals Defragment all the members sequentially at configured regular intervals Clean up superfluous members from the ETCD cluster for which there is no corresponding pod (the ordinal in the pod name is greater than the cluster size) at regular intervals (or whenever the Etcd resource status changes by watching it) The cleanup of superfluous entries in status.members array is already covered here High Availability Considering that high-availability is the primary reason for using a multi-node etcd cluster, it makes sense to distribute the individual member pods of the etcd cluster across different physical nodes. 
If the underlying Kubernetes cluster has nodes from multiple availability zones, it makes sense to also distribute the member pods across nodes from different availability zones.\nOne possibility to do this is via SelectorSpreadPriority of kube-scheduler but this is only best-effort and may not always be enforced strictly.\nIt is better to use pod anti-affinity to enforce such distribution of member pods.\nZonal Cluster - Single Availability Zone A zonal cluster is configured to consist of nodes belonging to only a single availability zone in a region of the cloud provider. In such a case, we can at best distribute the member pods of a multi-node etcd cluster instance only across different nodes in the configured availability zone.\nThis can be done by specifying pod anti-affinity in the specification of the member pods using kubernetes.io/hostname as the topology key.\napiVersion: apps/v1 kind: StatefulSet ... spec: ... template: ... spec: ... affinity: podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: {} # podSelector that matches the member pods of the given etcd cluster instance topologyKey: \"kubernetes.io/hostname\" ... ... ... The recommendation is to keep etcd-druid agnostic of such topics related to scheduling and cluster-topology and to use kupid to orthogonally inject the desired pod anti-affinity.\nAlternative Another option is to build the functionality into etcd-druid to include the required pod anti-affinity when it provisions the StatefulSet that manages the member pods. While this has the advantage of avoiding a dependency on an external component like kupid, the disadvantage is that we might need to address development or testing use-cases where it might be desirable to avoid distributing member pods and schedule them on as few nodes as possible. 
Also, as mentioned below, kupid can be used to distribute member pods of an etcd cluster instance across nodes in a single availability zone as well as across nodes in multiple availability zones with very minor variation. This keeps the solution uniform regardless of the topology of the underlying Kubernetes cluster.\nRegional Cluster - Multiple Availability Zones A regional cluster is configured to consist of nodes belonging to multiple availability zones (typically, three) in a region of the cloud provider. In such a case, we can distribute the member pods of a multi-node etcd cluster instance across nodes belonging to different availability zones.\nThis can be done by specifying pod anti-affinity in the specification of the member pods using topology.kubernetes.io/zone as the topology key. In Kubernetes clusters running a Kubernetes release older than 1.17, the older (and now deprecated) failure-domain.beta.kubernetes.io/zone might have to be used as the topology key.\napiVersion: apps/v1 kind: StatefulSet ... spec: ... template: ... spec: ... affinity: podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: {} # podSelector that matches the member pods of the given etcd cluster instance topologyKey: \"topology.kubernetes.io/zone\" ... ... ... The recommendation is to keep etcd-druid agnostic of such topics related to scheduling and cluster-topology and to use kupid to orthogonally inject the desired pod anti-affinity.\nAlternative Another option is to build the functionality into etcd-druid to include the required pod anti-affinity when it provisions the StatefulSet that manages the member pods. While this has the advantage of avoiding a dependency on an external component like kupid, the disadvantage is that such built-in support necessarily limits what kind of topologies of the underlying cluster will be supported. 
Hence, it is better to keep etcd-druid altogether agnostic of issues related to scheduling and cluster-topology.\nPodDisruptionBudget This proposal recommends that etcd-druid should deploy PodDisruptionBudget (minAvailable set to floor(\u003ccluster size\u003e/2) + 1) for multi-node etcd clusters (if AllMembersReady condition is true) so that any planned disruptive operation can honour the disruption budget and preserve high availability of the etcd cluster during potentially disruptive maintenance operations.\nAlso, it is recommended to toggle the minAvailable field between floor(\u003ccluster size\u003e/2) and \u003cnumber of members with status Ready true\u003e whenever the AllMembersReady condition toggles between true and false. This is to disable eviction of any member pods when not all members are Ready.\nIn case of a conflict, the recommendation is to use the highest of the applicable values for minAvailable.\nRolling updates to etcd members Any changes to the Etcd resource spec that might result in a change to StatefulSet spec or otherwise result in a rolling update of member pods should be applied/propagated by etcd-druid only when the etcd cluster is fully healthy to reduce the risk of quorum loss during the updates. This would include vertical autoscaling changes (via HVPA). If the cluster status is unhealthy (i.e. if either AllMembersReady or BackupReady conditions are false), etcd-druid must restore it to full health before proceeding with such operations that lead to rolling updates. This can be further optimized in the future to handle the cases where rolling updates can still be performed on an etcd cluster that is not fully healthy.\nFollow Up Ephemeral Volumes See section Ephemeral Volumes.\nShoot Control-Plane Migration This proposal adds support for multi-node etcd clusters but it should not have significant impact on shoot control-plane migration any more than what is already present in the single-node etcd cluster scenario. 
But to be sure, this needs to be discussed further.\nPerformance impact of multi-node etcd clusters Multi-node etcd clusters incur a cost on write performance as compared to single-node etcd clusters. This performance impact needs to be measured and documented. Here, we should compare different persistence options for the multi-node etcd clusters so that we have all the information necessary to make a decision balancing high availability, performance and costs.\nMetrics, Dashboards and Alerts There are already metrics exported by etcd and etcd-backup-restore which are visualized in monitoring dashboards and also used in triggering alerts. These might have hidden assumptions about single-node etcd clusters. These might need to be enhanced and potentially new metrics, dashboards and alerts configured to cover the multi-node etcd cluster scenario.\nEspecially, a high priority alert must be raised if BackupReady condition becomes false.\nCosts Multi-node etcd clusters will clearly involve higher cost (when compared with single-node etcd clusters) just going by the CPU and memory usage for the additional members. Also, the different options for persistence for etcd data for the members will have different cost implications. Such cost impact needs to be assessed and documented to help navigate the trade-offs between high availability, performance and costs.\nFuture Work Gardener Ring Gardener Ring requires provisioning and management of an etcd cluster with the members distributed across more than one Kubernetes cluster. This cannot be achieved by etcd-druid alone which has only the view of a single Kubernetes cluster. An additional component that has the view of all the Kubernetes clusters involved in setting up the gardener ring will be required to achieve this. 
However, etcd-druid can be used by such a higher-level component/controller (for example, by supplying the initial cluster configuration) such that individual etcd-druid instances in the individual Kubernetes clusters can manage the corresponding etcd cluster members.\nAutonomous Shoot Clusters Autonomous Shoot Clusters will also require a highly available etcd cluster to back their control-plane and the multi-node support proposed here can be leveraged in that context. However, the current proposal will not meet all the needs of an autonomous shoot cluster. Some additional components will be required that have the overall view of the autonomous shoot cluster and they can use etcd-druid to manage the multi-node etcd cluster. But this scenario may be different from that of Gardener Ring in that the individual etcd members of the cluster may not be hosted on different Kubernetes clusters.\nOptimization of recovery from non-quorate cluster with some member containing valid data It might be possible to optimize the actions during the recovery of a non-quorate cluster where some of the members contain valid data and others don’t. The optimization involves verifying the data of the valid members to determine the data of which member is the most recent (even considering the latest backup) so that the full snapshot can be taken from it before recovering the etcd cluster. 
Such an optimization can be attempted in the future.\nOptimization of rolling updates to unhealthy etcd clusters As mentioned above, optimizations to proceed with rolling updates to unhealthy etcd clusters (without first restoring the cluster to full health) can be pursued in future work.\n","categories":"","description":"","excerpt":"Multi-node etcd cluster instances via etcd-druid This document …","ref":"/docs/other-components/etcd-druid/proposals/01-multi-node-etcd-clusters/","tags":"","title":"01 Multi Node Etcd Clusters"},{"body":"Snapshot Compaction for Etcd Current Problem To ensure recoverability of Etcd, backups of the database are taken at regular intervals. Backups are of two types: Full Snapshots and Incremental Snapshots.\nFull Snapshots A full snapshot is a snapshot of the complete database at a given point in time. The size of the database keeps changing with time and typically the size is relatively large (measured in 100s of megabytes or even in gigabytes). For this reason, full snapshots are taken at relatively large intervals.\nIncremental Snapshots Incremental Snapshots are collections of events on the Etcd database, obtained by running the WATCH API call on Etcd. At relatively short time intervals, all the events accumulated through the WATCH API call are saved in a file called an Incremental Snapshot.\nRecovery from the Snapshots Recovery from Full Snapshots As the full snapshots are snapshots of the complete database, the whole database can be recovered from a full snapshot in one go. Etcd provides an API call to restore the database from a full snapshot file.\nRecovery from Incremental Snapshots Delta snapshots are collections of retrospective Etcd events. So, to restore from an Incremental snapshot file, the events from the file need to be applied sequentially on the Etcd database through Etcd Put/Delete API calls. 
As this depends heavily on sequential Etcd calls, restoring from Incremental Snapshot files can take a long time if there are numerous commands captured in the Incremental Snapshot files.\nDelta snapshots are applied on top of a running Etcd database. So, if there is an inconsistency between the state of the database at the point of applying and the state of the database when the delta snapshot commands were captured, restoration will fail.\nCurrently, in the Gardener setup, Etcd is restored from the last full snapshot and then the delta snapshots, which were captured after the last full snapshot.\nThe main problem with this is that the complete restoration time can be unacceptably large if the rate of change coming into the etcd database is quite high because there are a large number of events in the delta snapshots to be applied sequentially. A secondary problem is that, though auto-compaction is enabled for etcd, it is not quick enough to compact all the changes from the incremental snapshots being re-applied during the relatively short period of time of restoration (as compared to the actual period of time when the incremental snapshots were accumulated). This may cause the etcd pod (the backup-restore sidecar container, to be precise) to run out of memory and/or storage space even if it is sufficient for normal operations.\nSolution Compaction command To help with the problem mentioned earlier, our proposal is to introduce a compact subcommand for etcdbrctl. On execution of the compact command, a separate embedded Etcd process will be started where the Etcd data will be restored from the snapstore (exactly as in the restoration scenario today). Then the new Etcd database will be compacted and defragmented using Etcd API calls. The compaction will strip off the Etcd database of old revisions as per the Etcd auto-compaction configuration. The defragmentation will free up the unused fragment memory space released after compaction. 
Then a full snapshot of the compacted database will be saved in snapstore which then can be used as the base snapshot during any subsequent restoration (or backup compaction).\nHow the solution works The newly introduced compact command does not disturb the running Etcd while compacting the backup snapshots. The command is designed to run potentially separately (from the main Etcd process/container/pod). Etcd Druid can be configured to run the newly introduced compact command as a separate job (scheduled periodically) based on total number of Etcd events accumulated after the most recent full snapshot.\nEtcd-druid flags: Etcd-druid introduces the following flags to configure the compaction job:\n --enable-backup-compaction (default false): Set this flag to true to enable the automatic compaction of etcd backups when the threshold value denoted by CLI flag --etcd-events-threshold is exceeded. --compaction-workers (default 3): Number of worker threads of the CompactionJob controller. The controller creates a backup compaction job if a certain etcd event threshold is reached. If compaction is enabled, the value for this flag must be greater than zero. --etcd-events-threshold (default 1000000): Total number of etcd events that can be allowed before a backup compaction job is triggered. --active-deadline-duration (default 3h): Duration after which a running backup compaction job will be terminated. --metrics-scrape-wait-duration (default 0s): Duration to wait for after compaction job is completed, to allow Prometheus metrics to be scraped. Points to take care while saving the compacted snapshot: As compacted snapshot and the existing periodic full snapshots are taken by different processes running in different pods but accessing same store to save the snapshots, some problems may arise:\n When uploading the compacted snapshot to the snapstore, there is the problem of how does the restorer know when to start using the newly compacted snapshot. 
This communication needs to be atomic. With a regular schedule for compaction that happens potentially separately from the main etcd pod, is there a need for regular scheduled full snapshots anymore? We are planning to introduce a new directory structure, under the v2 prefix, for saving the snapshots (compacted and full), as described in detail below. But for backward compatibility, we also need to consider the older directory, which is currently under the v1 prefix, when accessing snapshots. How to swap full snapshot with compacted snapshot atomically Currently, full snapshots and the subsequent delta snapshots are grouped under the same prefix path in the snapstore. When a full snapshot is created, it is placed under a prefix/directory whose name includes the timestamp. Then subsequent delta snapshots are also pushed into the same directory. Thus each prefix/directory contains a single full snapshot and the subsequent delta snapshots. So far, it is the job of ETCDBR to start the main Etcd process and the snapshotter process which takes full snapshots and delta snapshots periodically. But as per our proposal, compaction will be running as a process parallel to the main Etcd process and the snapshotter process. So we can’t reliably co-ordinate between the processes to achieve switching to the compacted snapshot as the base snapshot atomically.\nCurrent Directory Structure - Backup-192345 - Full-Snapshot-0-1-192345 - Incremental-Snapshot-1-100-192355 - Incremental-Snapshot-100-200-192365 - Incremental-Snapshot-200-300-192375 - Backup-192789 - Full-Snapshot-0-300-192789 - Incremental-Snapshot-300-400-192799 - Incremental-Snapshot-400-500-192809 - Incremental-Snapshot-500-600-192819 To solve the problem, the proposal is:\n ETCDBR will take the first full snapshot after it starts the main Etcd process and the snapshotter process. After taking the first full snapshot, snapshotter will continue taking full snapshots. 
On the other hand, ETCDBR compactor command will be run as periodic job in a separate pod and use the existing full or compacted snapshots to produce further compacted snapshots. Full snapshots and compacted snapshots will be named after same fashion. So, there is no need of any mechanism to choose which snapshots(among full and compacted snapshot) to consider as base snapshots. Flatten the directory structure of backup folder. Save all the full snapshots, delta snapshots and compacted snapshots under same directory/prefix. Restorer will restore from full/compacted snapshots and delta snapshots sorted based on the revision numbers in name (or timestamp if the revision numbers are equal). Proposed Directory Structure Backup : - Full-Snapshot-0-1-192355 (Taken by snapshotter) - Incremental-Snapshot-revision-1-100-192365 - Incremental-Snapshot-revision-100-200-192375 - Full-Snapshot-revision-0-200-192379 (Taken by snapshotter) - Incremental-Snapshot-revision-200-300-192385 - Full-Snapshot-revision-0-300-192386 (Taken by compaction job) - Incremental-Snapshot-revision-300-400-192396 - Incremental-Snapshot-revision-400-500-192406 - Incremental-Snapshot-revision-500-600-192416 - Full-Snapshot-revision-0-600-192419 (Taken by snapshotter) - Full-Snapshot-revision-0-600-192420 (Taken by compaction job) What happens to the delta snapshots that were compacted? The proposed compaction sub-command in etcdbrctl (and hence, the CronJob provisioned by etcd-druid that will schedule it at a regular interval) would only upload the compacted full snapshot. It will not delete the snapshots (delta or full snapshots) that were compacted. These snapshots which were superseded by a freshly uploaded compacted snapshot would follow the same life-cycle as other older snapshots. I.e. they will be garbage collected according to the configured backup snapshot retention policy. 
For example, if an exponential retention policy is configured and if compaction is done every 30m then there might be at most 48 additional (compacted) full snapshots (24h * 2) in the backup for the latest day. As time rolls forward to the next day, these additional compacted snapshots (along with the delta snapshots that were compacted into them) will get garbage collected retaining only one full snapshot for the day before according to the retention policy.\nFuture work In the future, we plan to stop the snapshotter just after taking the first full snapshot. Then, the compaction job will be solely responsible for taking subsequent full snapshots. The directory structure would look like the following:\nBackup : - Full-Snapshot-0-1-192355 (Taken by snapshotter) - Incremental-Snapshot-revision-1-100-192365 - Incremental-Snapshot-revision-100-200-192375 - Incremental-Snapshot-revision-200-300-192385 - Full-Snapshot-revision-0-300-192386 (Taken by compaction job) - Incremental-Snapshot-revision-300-400-192396 - Incremental-Snapshot-revision-400-500-192406 - Incremental-Snapshot-revision-500-600-192416 - Full-Snapshot-revision-0-600-192420 (Taken by compaction job) Backward Compatibility Restoration : The changes to handle the newly proposed backup directory structure must be backward compatible with older structures at least for restoration because we need to restore from backups in the older structure. This includes the support for restoring from a backup without a metadata file if that is used in the actual implementation. Backup : For new snapshots (even on a backup containing the older structure), the new structure may be used. The new structure must be set up automatically including creating the base full snapshot. Garbage collection : The existing functionality of garbage collection of snapshots (full and incremental) according to the backup retention policy must be compatible with both old and new backup folder structure. I.e. 
the snapshots in the older backup structure must be retained in their own structure and the snapshots in the proposed backup structure should be retained in the proposed structure. Once all the snapshots in the older backup structure go out of the retention policy and are garbage collected, we can think of removing the support for older backup folder structure. Note: Compactor will run parallel to the current snapshotter process and work only if there is a full snapshot already present in the store. By current design, a full snapshot will be taken if no full snapshot exists yet or the existing full snapshot is older than 24 hours. It is not a limitation but a design choice. As per the proposed design, the backup storage will contain both periodic full snapshots and periodic compacted snapshots. Restorer will pick up whichever base snapshot is the latest one.\n","categories":"","description":"","excerpt":"Snapshot Compaction for Etcd Current Problem To ensure recoverability …","ref":"/docs/other-components/etcd-druid/proposals/02-snapshot-compaction/","tags":"","title":"02 Snapshot Compaction"},{"body":"Scaling-up a single-node to multi-node etcd cluster deployed by etcd-druid To mark a cluster for scale-up from single node to multi-node etcd, just patch the etcd custom resource’s .spec.replicas from 1 to 3 (for example).\nChallenges for scale-up An etcd cluster with a single replica doesn’t have any peers, so no peer communication is required; hence the peer URL may or may not be TLS enabled. However, while scaling up from single node etcd to multi-node etcd, there will be a requirement to have peer communication between members of the etcd cluster. Peer communication is required for various reasons, for instance for members to sync up cluster state, data, and to perform leader election or any cluster wide operation like removal or addition of a member etc. Hence in a multi-node etcd cluster we need to have TLS-enabled peer URLs for peer communication. 
Providing the correct configuration to start new etcd members, which differs from bootstrapping a cluster because these new etcd members will join an existing cluster. Approach We first went through the etcd doc of update-advertise-peer-urls to find out information regarding updating peer URLs. Interestingly, etcd doc has mentioned the following:\nTo update the advertise peer URLs of a member, first update it explicitly via member command and then restart the member. But we can’t assume the peer URL is not TLS enabled for a single-node cluster as it depends on the end-user. A user may or may not enable TLS for the peer URL for a single node etcd cluster. So, how do we detect whether the peer URL was TLS enabled or not when the cluster is marked for scale-up?\nDetecting if peerURL TLS is enabled or not For this we use an annotation in the member lease object, member.etcd.gardener.cloud/tls-enabled, set by the backup-restore sidecar of etcd. As the etcd configuration is provided by backup-restore, it can find out whether TLS is enabled or not and accordingly set this annotation member.etcd.gardener.cloud/tls-enabled to either true or false in the member lease object. With the help of this annotation and the config-map values, etcd-druid is able to detect whether there is a change in a peer URL or not.\nEtcd-Druid helps in scaling up etcd cluster Now, it is detected whether the peer URL was TLS enabled or not for the single node etcd cluster. Etcd-druid can now use this information to take action:\n If the peer URL was already TLS enabled then no action is required from etcd-druid side. Etcd-druid can proceed with scaling up the cluster. If the peer URL was not TLS enabled then etcd-druid has to intervene and make sure the peer URL is TLS enabled first for the single node before marking the cluster for scale-up. Action taken by etcd-druid to enable the peerURL TLS Etcd-druid will update the etcd-bootstrap config-map with new config like initial-cluster,initial-advertise-peer-urls etc. 
Backup-restore will detect this change and update the member lease annotation to member.etcd.gardener.cloud/tls-enabled: \"true\". In case the peer URL TLS has been changed to enabled: Etcd-druid will add tasks to the deployment flow: Check if peer TLS has been enabled for existing StatefulSet pods, by checking the member leases for the annotation member.etcd.gardener.cloud/tls-enabled. If peer TLS enablement is pending for any of the members, then check and patch the StatefulSet with the peer TLS volume mounts, if not already patched. This will cause a rolling update of the existing StatefulSet pods, which allows etcd-backup-restore to update the member peer URL in the etcd cluster. Requeue this reconciliation flow until peer TLS has been enabled for all the existing etcd members. After PeerURL is TLS enabled After peer URL TLS enablement for the single node etcd cluster, etcd-druid adds a scale-up annotation, gardener.cloud/scaled-to-multi-node, to the etcd statefulset and patches the statefulset’s .spec.replicas to 3 (for example). The statefulset controller will then bring up new pods (etcd with backup-restore as a sidecar). Now etcd’s sidecar, i.e. backup-restore, will check whether this member is already a part of a cluster or not, and in case it is unable to check (maybe due to some network issues), backup-restore checks for the presence of the annotation gardener.cloud/scaled-to-multi-node in the etcd statefulset to detect scale-up. If it finds out that it is the scale-up case, then backup-restore adds the new etcd member as a learner first and then starts the etcd learner by providing the correct configuration. 
Once the learner gets in sync with the etcd cluster leader, it will be promoted to a voting member.\nProviding the correct etcd config As backup-restore detects that it’s a scale-up scenario, backup-restore sets initial-cluster-state to existing as this member will join an existing cluster and it calculates the rest of the config from the updated config-map provided by etcd-druid.\nFuture improvements: The need of restarting etcd pods twice will change in the future. Please refer to https://github.com/gardener/etcd-backup-restore/issues/538\n","categories":"","description":"","excerpt":"Scaling-up a single-node to multi-node etcd cluster deployed by …","ref":"/docs/other-components/etcd-druid/proposals/03-scaling-up-an-etcd-cluster/","tags":"","title":"03 Scaling Up An Etcd Cluster"},{"body":"Question You have deployed an application with a web UI or an internal endpoint in your Kubernetes (K8s) cluster. How to access this endpoint without an external load balancer (e.g., Ingress)?\nThis tutorial presents two options:\n Using Kubernetes port forward Using Kubernetes apiserver proxy Please note that the options described here are mostly for quick testing or troubleshooting your application. For enabling access to your application for productive environment, please refer to the official Kubernetes documentation.\nSolution 1: Using Kubernetes Port Forward You could use the port forwarding functionality of kubectl to access the pods from your local host without involving a service.\nTo access any pod follow these steps:\n Run kubectl get pods Note down the name of the pod in question as \u003cyour-pod-name\u003e Run kubectl port-forward \u003cyour-pod-name\u003e \u003clocal-port\u003e:\u003cyour-app-port\u003e Run a web browser or curl locally and enter the URL: http(s)://localhost:\u003clocal-port\u003e In addition, kubectl port-forward allows using a resource name, such as a deployment name or service name, to select a matching pod to port forward. 
More details can be found in the Kubernetes documentation.\nThe main drawback of this approach is that the pod’s name changes as soon as it is restarted. Moreover, you need to have a web browser on your client and you need to make sure that the local port is not already used by an application running on your system. Finally, sometimes the port forwarding is canceled due to non-obvious reasons. This leads to a kind of shaky approach. A more stable possibility is based on accessing the app via the apiserver proxy, which accesses the corresponding service.\nSolution 2: Using the apiserver Proxy of Your Kubernetes Cluster There are several different proxies in Kubernetes. In this tutorial we will be using apiserver proxy to enable the access to the services in your cluster without Ingress. Unlike the first solution, here a service is required.\nUse the following format to compose a URL for accessing your service through an existing proxy on the Kubernetes cluster:\nhttps://\u003cyour-cluster-master\u003e/api/v1/namespaces/\u003cyour-namespace\u003e/services/\u003cyour-service\u003e:\u003cyour-service-port\u003e/proxy/\u003cservice-endpoint\u003e\nExample:\n your-main-cluster your-namespace your-service your-service-port your-service-endpoint url to access service api.testclstr.cpet.k8s.sapcloud.io default nginx-svc 80 / http://api.testclstr.cpet.k8s.sapcloud.io/api/v1/namespaces/default/services/nginx-svc:80/proxy/ api.testclstr.cpet.k8s.sapcloud.io default docker-nodejs-svc 4500 /cpu?baseNumber=4 https://api.testclstr.cpet.k8s.sapcloud.io/api/v1/namespaces/default/services/docker-nodejs-svc:4500/proxy/cpu?baseNumber=4 For more details on the format, please refer to the official Kubernetes documentation.\nNote There are applications which do not support relative URLs yet, e.g. Prometheus (as of November, 2022). This typically leads to missing JavaScript objects, which could be investigated with your browser’s development tools. 
If such an issue occurs, please use the port-forward approach described above. ","categories":"","description":"","excerpt":"Question You have deployed an application with a web UI or an internal …","ref":"/docs/guides/applications/access-pod-from-local/","tags":"","title":"Access a Port of a Pod Locally"},{"body":"Access Restrictions The dashboard can be configured with access restrictions.\nAccess restrictions are shown for regions that have a matching label in the CloudProfile\n regions: - name: pangaea-north-1 zones: - name: pangaea-north-1a - name: pangaea-north-1b - name: pangaea-north-1c labels: seed.gardener.cloud/eu-access: \"true\" If the user selects the access restriction, spec.seedSelector.matchLabels[key] will be set. When selecting an option, metadata.annotations[optionKey] will be set. The value that is set depends on the configuration. See 2. under Configuration section below.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: annotations: support.gardener.cloud/eu-access-for-cluster-addons: \"true\" support.gardener.cloud/eu-access-for-cluster-nodes: \"true\" ... spec: seedSelector: matchLabels: seed.gardener.cloud/eu-access: \"true\" In order for the shoot (with enabled access restriction) to be scheduled on a seed, the seed needs to have the label set. E.g.\napiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: labels: seed.gardener.cloud/eu-access: \"true\" ... Configuration As gardener administrator:\n you can control the visibility of the chips with the accessRestriction.items[].display.visibleIf and accessRestriction.items[].options[].display.visibleIf property. E.g. in this example the access restriction chip is shown if the value is true and the option is shown if the value is false. you can control the value of the input field (switch / checkbox) with the accessRestriction.items[].input.inverted and accessRestriction.items[].options[].input.inverted property. Setting the inverted property to true will invert the value. 
That means that when selecting the input field the value will be 'false' instead of 'true'. you can configure the text that is displayed when no access restriction options are available by setting accessRestriction.noItemsText example values.yaml: accessRestriction: noItemsText: No access restriction options available for region {region} and cloud profile {cloudProfile} items: - key: seed.gardener.cloud/eu-access display: visibleIf: true # title: foo # optional title, if not defined key will be used # description: bar # optional description displayed in a tooltip input: title: EU Access description: | This service is offered to you with our regular SLAs and 24x7 support for the control plane of the cluster. 24x7 support for cluster add-ons and nodes is only available if you meet the following conditions: options: - key: support.gardener.cloud/eu-access-for-cluster-addons display: visibleIf: false # title: bar # optional title, if not defined key will be used # description: baz # optional description displayed in a tooltip input: title: No personal data is used as name or in the content of Gardener or Kubernetes resources (e.g. 
Gardener project name or Kubernetes namespace, configMap or secret in Gardener or Kubernetes) description: | If you can't comply, only third-level/dev support at usual 8x5 working hours in EEA will be available to you for all cluster add-ons such as DNS and certificates, Calico overlay network and network policies, kube-proxy and services, and everything else that would require direct inspection of your cluster through its API server inverted: true - key: support.gardener.cloud/eu-access-for-cluster-nodes display: visibleIf: false input: title: No personal data is stored in any Kubernetes volume except for container file system, emptyDirs, and persistentVolumes (in particular, not on hostPath volumes) description: | If you can't comply, only third-level/dev support at usual 8x5 working hours in EEA will be available to you for all node-related components such as Docker and Kubelet, the operating system, and everything else that would require direct inspection of your nodes through a privileged pod or SSH inverted: true ","categories":"","description":"","excerpt":"Access Restrictions The dashboard can be configured with access …","ref":"/docs/dashboard/access-restrictions/","tags":"","title":"Access Restrictions"},{"body":"Access to the Garden Cluster for Extensions Gardener offers different means to provide or equip registered extensions with a kubeconfig which may be used to connect to the garden cluster.\nAdmission Controllers For extensions with an admission controller deployment, gardener-operator injects a token-based kubeconfig as a volume and volume mount. The token is valid for 12h, automatically renewed, and associated with a dedicated ServiceAccount in the garden cluster. 
The path to this kubeconfig is revealed under the GARDEN_KUBECONFIG environment variable, also added to the pod spec(s).\nExtensions on Seed Clusters Extensions that are installed on seed clusters via a ControllerInstallation can simply read the kubeconfig file specified by the GARDEN_KUBECONFIG environment variable to create a garden cluster client. With this, they use a short-lived token (valid for 12h) associated with a dedicated ServiceAccount in the seed-\u003cseed-name\u003e namespace to securely access the garden cluster. The used ServiceAccounts are granted permissions in the garden cluster similar to gardenlet clients.\nBackground Historically, gardenlet has been the only component running in the seed cluster that has access to both the seed cluster and the garden cluster. Accordingly, extensions running on the seed cluster didn’t have access to the garden cluster.\nStarting from Gardener v1.74.0, there is a new mechanism for components running on seed clusters to get access to the garden cluster. For this, gardenlet runs an instance of the TokenRequestor for requesting tokens that can be used to communicate with the garden cluster.\nUsing Gardenlet-Managed Garden Access By default, extensions are equipped with secure access to the garden cluster using a dedicated ServiceAccount without requiring any additional action. They can simply read the file specified by the GARDEN_KUBECONFIG and construct a garden client with it.\nWhen installing a ControllerInstallation, gardenlet creates two secrets in the installation’s namespace: a generic garden kubeconfig (generic-garden-kubeconfig-\u003chash\u003e) and a garden access secret (garden-access-extension). 
Note that the ServiceAccount created based on this access secret will be created in the respective seed-* namespace in the garden cluster and labelled with controllerregistration.core.gardener.cloud/name=\u003cname\u003e.\nAdditionally, gardenlet injects volume, volumeMounts, and two environment variables into all (init) containers in all objects in the apps and batch API groups:\n GARDEN_KUBECONFIG: points to the path where the generic garden kubeconfig is mounted. SEED_NAME: set to the name of the Seed where the extension is installed. This is useful for restricting watches in the garden cluster to relevant objects. If an object already contains the GARDEN_KUBECONFIG environment variable, it is not overwritten and injection of volume and volumeMounts is skipped.\nFor example, a Deployment deployed via a ControllerInstallation will be mutated as follows:\napiVersion: apps/v1 kind: Deployment metadata: name: gardener-extension-provider-local annotations: reference.resources.gardener.cloud/secret-795f7ca6: garden-access-extension reference.resources.gardener.cloud/secret-d5f5a834: generic-garden-kubeconfig-81fb3a88 spec: template: metadata: annotations: reference.resources.gardener.cloud/secret-795f7ca6: garden-access-extension reference.resources.gardener.cloud/secret-d5f5a834: generic-garden-kubeconfig-81fb3a88 spec: containers: - name: gardener-extension-provider-local env: - name: GARDEN_KUBECONFIG value: /var/run/secrets/gardener.cloud/garden/generic-kubeconfig/kubeconfig - name: SEED_NAME value: local volumeMounts: - mountPath: /var/run/secrets/gardener.cloud/garden/generic-kubeconfig name: garden-kubeconfig readOnly: true volumes: - name: garden-kubeconfig projected: defaultMode: 420 sources: - secret: items: - key: kubeconfig path: kubeconfig name: generic-garden-kubeconfig-81fb3a88 optional: false - secret: items: - key: token path: token name: garden-access-extension optional: false The generic garden kubeconfig will look like this:\napiVersion: v1 kind: 
Config clusters: - cluster: certificate-authority-data: LS0t... server: https://garden.local.gardener.cloud:6443 name: garden contexts: - context: cluster: garden user: extension name: garden current-context: garden users: - name: extension user: tokenFile: /var/run/secrets/gardener.cloud/garden/generic-kubeconfig/token Manually Requesting a Token for the Garden Cluster Seed components that need to communicate with the garden cluster can request a token in the garden cluster by creating a garden access secret. This secret has to be labelled with resources.gardener.cloud/purpose=token-requestor and resources.gardener.cloud/class=garden, e.g.:\napiVersion: v1 kind: Secret metadata: name: garden-access-example namespace: example labels: resources.gardener.cloud/purpose: token-requestor resources.gardener.cloud/class: garden annotations: serviceaccount.resources.gardener.cloud/name: example type: Opaque This will instruct gardenlet to create a new ServiceAccount named example in its own seed-\u003cseed-name\u003e namespace in the garden cluster, request a token for it, and populate the token in the secret’s data under the token key.\nPermissions in the Garden Cluster Both the SeedAuthorizer and the SeedRestriction plugin handle extension clients and generally grant the same permissions in the garden cluster to them as to gardenlet clients. With this, extensions are restricted to work with objects in the garden cluster that are related to the seed they are running on, just like gardenlet. Note that if the plugins are not enabled, extension clients are only granted read access to global resources like CloudProfiles (this is granted to all authenticated users). There are a few exceptions to the granted permissions as documented here.\nAdditional Permissions If an extension needs access to additional resources in the garden cluster (e.g., extension-specific custom resources), permissions need to be granted via the usual RBAC means. 
Let’s consider the following example: An extension requires the privileges to create authorization.k8s.io/v1.SubjectAccessReviews (which is not covered by the “default” permissions mentioned above). This requires a human Gardener operator to create a ClusterRole in the garden cluster with the needed rules:\napiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: extension-create-subjectaccessreviews annotations: authorization.gardener.cloud/extensions-serviceaccount-selector: '{\"matchLabels\":{\"controllerregistration.core.gardener.cloud/name\":\"\u003cextension-name\u003e\"}}' labels: authorization.gardener.cloud/custom-extensions-permissions: \"true\" rules: - apiGroups: - authorization.k8s.io resources: - subjectaccessreviews verbs: - create Note the annotation authorization.gardener.cloud/extensions-serviceaccount-selector, which contains a label selector for ServiceAccounts.\nThere is a controller, part of gardener-controller-manager, which takes care of maintaining the respective ClusterRoleBinding resources. It binds all ServiceAccounts in the seed namespaces in the garden cluster (i.e., all extension clients) whose labels match. You can read more about this controller here.\nCustom Permissions If an extension wants to create a dedicated ServiceAccount for accessing the garden cluster without automatically inheriting all permissions of the gardenlet, it first needs to create a garden access secret in its extension namespace in the seed cluster:\napiVersion: v1 kind: Secret metadata: name: my-custom-component namespace: \u003cextension-namespace\u003e labels: resources.gardener.cloud/purpose: token-requestor resources.gardener.cloud/class: garden annotations: serviceaccount.resources.gardener.cloud/name: my-custom-component-extension-foo serviceaccount.resources.gardener.cloud/labels: '{\"foo\":\"bar\"}' type: Opaque ❗️️Do not prefix the service account name with extension- to prevent inheriting the gardenlet permissions! 
It is still recommended to add the extension name (e.g., as a suffix) for easier identification where this ServiceAccount comes from.\nNext, you can follow the same approach described above. However, the authorization.gardener.cloud/extensions-serviceaccount-selector annotation should not contain controllerregistration.core.gardener.cloud/name=\u003cextension-name\u003e but rather custom labels, e.g. foo=bar.\nThis way, the created ServiceAccount will only get the permissions of above ClusterRole and nothing else.\nRenewing All Garden Access Secrets Operators can trigger an automatic renewal of all garden access secrets in a given Seed and their requested ServiceAccount tokens, e.g., when rotating the garden cluster’s ServiceAccount signing key. For this, the Seed has to be annotated with gardener.cloud/operation=renew-garden-access-secrets.\n","categories":"","description":"","excerpt":"Access to the Garden Cluster for Extensions Gardener offers different …","ref":"/docs/gardener/extensions/garden-api-access/","tags":"","title":"Access to the Garden Cluster for Extensions"},{"body":"Accessing Shoot Clusters After creation of a shoot cluster, end-users require a kubeconfig to access it. There are several options available to get to such kubeconfig.\nshoots/adminkubeconfig Subresource The shoots/adminkubeconfig subresource allows users to dynamically generate temporary kubeconfigs that can be used to access shoot cluster with cluster-admin privileges. The credentials associated with this kubeconfig are client certificates which have a very short validity and must be renewed before they expire (by calling the subresource endpoint again).\nThe username associated with such kubeconfig will be the same which is used for authenticating to the Gardener API. 
Apart from this advantage, the created kubeconfig will not be persisted anywhere.\nIn order to request such a kubeconfig, you can run the following commands (targeting the garden cluster):\nexport NAMESPACE=garden-my-namespace export SHOOT_NAME=my-shoot export KUBECONFIG=\u003ckubeconfig for garden cluster\u003e # can be set using \"gardenctl target --garden \u003clandscape\u003e\" kubectl create \\ -f \u003c(printf '{\"spec\":{\"expirationSeconds\":600}}') \\ --raw /apis/core.gardener.cloud/v1beta1/namespaces/${NAMESPACE}/shoots/${SHOOT_NAME}/adminkubeconfig | \\ jq -r \".status.kubeconfig\" | \\ base64 -d You can also use the controller-runtime client (\u003e= v0.14.3) to create such a kubeconfig from your Go code like so:\nexpiration := 10 * time.Minute expirationSeconds := int64(expiration.Seconds()) adminKubeconfigRequest := \u0026authenticationv1alpha1.AdminKubeconfigRequest{ Spec: authenticationv1alpha1.AdminKubeconfigRequestSpec{ ExpirationSeconds: \u0026expirationSeconds, }, } err := client.SubResource(\"adminkubeconfig\").Create(ctx, shoot, adminKubeconfigRequest) if err != nil { return err } config = adminKubeconfigRequest.Status.Kubeconfig In Python, you can use the native kubernetes client to create such a kubeconfig like this:\n# This script first loads an existing kubeconfig from your system, and then sends a request to the Gardener API to create a new kubeconfig for a shoot cluster. # The received kubeconfig is then decoded and a new API client is created for interacting with the shoot cluster. 
import base64 import json from kubernetes import client, config import yaml # Set configuration options shoot_name=\"my-shoot\" # Name of the shoot project_namespace=\"garden-my-namespace\" # Namespace of the project # Load kubeconfig from default ~/.kube/config config.load_kube_config() api = client.ApiClient() # Create kubeconfig request kubeconfig_request = { 'apiVersion': 'authentication.gardener.cloud/v1alpha1', 'kind': 'AdminKubeconfigRequest', 'spec': { 'expirationSeconds': 600 } } response = api.call_api(resource_path=f'/apis/core.gardener.cloud/v1beta1/namespaces/{project_namespace}/shoots/{shoot_name}/adminkubeconfig', method='POST', body=kubeconfig_request, auth_settings=['BearerToken'], _preload_content=False, _return_http_data_only=True, ) decoded_kubeconfig = base64.b64decode(json.loads(response.data)[\"status\"][\"kubeconfig\"]).decode('utf-8') print(decoded_kubeconfig) # Create an API client to interact with the shoot cluster shoot_api_client = config.new_client_from_config_dict(yaml.safe_load(decoded_kubeconfig)) v1 = client.CoreV1Api(shoot_api_client) Note: The gardenctl-v2 tool simplifies targeting shoot clusters. It automatically downloads a kubeconfig that uses the gardenlogin kubectl auth plugin. This transparently manages authentication and certificate renewal without containing any credentials.\n shoots/viewerkubeconfig Subresource The shoots/viewerkubeconfig subresource works similarly to the shoots/adminkubeconfig. The difference is that it returns a kubeconfig with read-only access for all APIs except the core/v1.Secret API and the resources which are specified in the spec.kubernetes.kubeAPIServer.encryptionConfig field in the Shoot (see this document).\nIn order to request such a kubeconfig, you can follow almost the same code as above - the only difference is that you need to use the viewerkubeconfig subresource. 
For example, in bash this looks like this:\nexport NAMESPACE=garden-my-namespace export SHOOT_NAME=my-shoot kubectl create \\ -f \u003c(printf '{\"spec\":{\"expirationSeconds\":600}}') \\ --raw /apis/core.gardener.cloud/v1beta1/namespaces/${NAMESPACE}/shoots/${SHOOT_NAME}/viewerkubeconfig | \\ jq -r \".status.kubeconfig\" | \\ base64 -d The examples for other programming languages are similar to the above and can be adapted accordingly.\nOpenID Connect The kube-apiserver of shoot clusters can be provided with OpenID Connect configuration via the Shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: kubernetes: oidcConfig: ... It is the end-user’s responsibility to incorporate the OpenID Connect configurations in the kubeconfig for accessing the cluster (i.e., Gardener will not automatically generate the kubeconfig based on these OIDC settings). The recommended way is using the kubectl plugin called kubectl oidc-login for OIDC authentication.\nIf you want to use the same OIDC configuration for all your shoots by default, then you can use the ClusterOpenIDConnectPreset and OpenIDConnectPreset API resources. They allow defaulting the .spec.kubernetes.kubeAPIServer.oidcConfig fields for newly created Shoots such that you don’t have to repeat yourself every time (similar to PodPreset resources in Kubernetes). ClusterOpenIDConnectPreset specified OIDC configuration applies to Projects and Shoots cluster-wide (hence, only available to Gardener operators), while OpenIDConnectPreset is Project-scoped. Shoots have to “opt-in” for such defaulting by using the oidc=enable label.\nFor further information on (Cluster)OpenIDConnectPreset, refer to ClusterOpenIDConnectPreset and OpenIDConnectPreset.\nFor shoots with Kubernetes version \u003e= 1.30, which have StructuredAuthenticationConfiguration feature gate enabled (enabled by default), it is advised to use Structured Authentication instead of configuring .spec.kubernetes.kubeAPIServer.oidcConfig. 
If oidcConfig is configured, it is translated into an AuthenticationConfiguration file to use for Structured Authentication configuration.\nStructured Authentication For shoots with Kubernetes version \u003e= 1.30, which have StructuredAuthenticationConfiguration feature gate enabled (enabled by default), kube-apiserver of shoot clusters can be provided with Structured Authentication configuration via the Shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: kubernetes: kubeAPIServer: structuredAuthentication: configMapName: name-of-configmap-containing-authentication-config The configMapName references a user-created ConfigMap in the project namespace containing the AuthenticationConfiguration in its config.yaml data field. Here is an example of such a ConfigMap:\napiVersion: v1 kind: ConfigMap metadata: name: name-of-configmap-containing-authentication-config namespace: garden-my-project data: config.yaml: | apiVersion: apiserver.config.k8s.io/v1alpha1 kind: AuthenticationConfiguration jwt: - issuer: url: https://issuer1.example.com audiences: - audience1 - audience2 claimMappings: username: expression: 'claims.username' groups: expression: 'claims.groups' uid: expression: 'claims.uid' claimValidationRules: expression: 'claims.hd == \"example.com\"' message: \"the hosted domain name must be example.com\" Currently, only apiVersion: apiserver.config.k8s.io/v1alpha1 is supported. The user is responsible for the validity of the configured JWTAuthenticators.\nStatic Token kubeconfig Note: Static token kubeconfig is not available for Shoot clusters using Kubernetes version \u003e= 1.27. The shoots/adminkubeconfig subresource should be used instead.\n This kubeconfig contains a static token and provides cluster-admin privileges. It is created by default and persisted in the \u003cshoot-name\u003e.kubeconfig secret in the project namespace in the garden cluster.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... 
spec: kubernetes: enableStaticTokenKubeconfig: true ... It is not the recommended method to access the shoot cluster, as the static token kubeconfig has some security flaws associated with it:\n The static token in the kubeconfig doesn’t have any expiration date. Read Credentials Rotation for Shoot Clusters to learn how to rotate the static token. The static token doesn’t have any user identity associated with it. The user in that token will always be system:cluster-admin, irrespective of the person accessing the cluster. Hence, it is impossible to audit the events in the cluster. When the enableStaticTokenKubeconfig field is not explicitly set in the Shoot spec:\n for Shoot clusters using Kubernetes version \u003c 1.26, the field is defaulted to true. for Shoot clusters using Kubernetes version \u003e= 1.26, the field is defaulted to false. Note: Starting with Kubernetes 1.27, the enableStaticTokenKubeconfig field will be locked to false.\n ","categories":"","description":"","excerpt":"Accessing Shoot Clusters After creation of a shoot cluster, end-users …","ref":"/docs/gardener/shoot_access/","tags":"","title":"Accessing Shoot Clusters"},{"body":"Overview In order to add GitHub documentation to the website that is hosted outside of the main repository, you need to make changes to the central manifest. 
You can usually find it in the \u003corganization-name\u003e/\u003crepo-name\u003e/.docforge/ folder, for example gardener/documentation/.docforge.\nSample codeblock:\n- dir: machine-controller-manager structure: - file: _index.md frontmatter: title: Machine Controller Manager weight: 1 description: Declarative way of managing machines for Kubernetes cluster source: https://github.com/gardener/machine-controller-manager/blob/master/README.md - fileTree: https://github.com/gardener/machine-controller-manager/tree/master/docs This short code snippet adds a whole repository worth of content and contains examples of some of the most important elements:\n - dir: \u003cdir-name\u003e - the name of the directory in the navigation path structure: - required after using dir; shows that the following lines contain a file structure - file: _index.md - the content will be a single file; also creates an index file frontmatter: - allows for manual setting/overwriting of the various properties a file can have source: \u003clink\u003e - where the content for the file element is located - fileTree: \u003clink\u003e - the content will be a whole folder; also gives the location of the content Check the Notes and Tips section for useful advice when making changes to the manifest files.\nAdding Existing Documentation You can use the following templates in order to add documentation to the website that exists in other GitHub repositories.\nNote Proper indentation is incredibly important, as yaml relies on it for nesting! 
Adding a Single File You can add a single topic to the website by providing a link to it in the manifest.\n- dir: \u003cdir-name\u003e structure: - file: \u003cfile-name\u003e frontmatter: title: \u003ctopic-name\u003e description: \u003ctopic-description\u003e weight: \u003cweight\u003e source: https://github.com/\u003cpath\u003e/\u003cfile\u003e Example - dir: dashboard structure: - file: _index.md frontmatter: title: Dashboard description: The web UI for managing your projects and clusters weight: 3 source: https://github.com/gardener/dashboard/blob/master/README.md Adding Multiple Files You can also add multiple topics to the website at once, either through linking a whole folder or a manifest that contains the documentation structure.\nNote If the content you want to add does not have an _index.md file in it, it won’t show up as a single section on the website. You can fix this by adding the following after the structure: element:\n- file: _index.md frontmatter: title: \u003ctopic-name\u003e description: \u003ctopic-description\u003e weight: \u003cweight\u003e Linking a Folder - dir: \u003cdir-name\u003e structure: - fileTree: https://github.com/\u003cpath\u003e/\u003cfolder\u003e Example - dir: development structure: - fileTree: https://github.com/gardener/gardener/tree/master/docs/development Linking a Manifest File - dir: \u003cdir-name\u003e structure: - manifest: https://github.com/\u003cpath\u003e/manifest.yaml Example - dir: extensions structure: - manifest: https://github.com/gardener/documentation/blob/master/.docforge/documentation/gardener-extensions/gardener-extensions.yaml Notes and Tips If you want to place a file inside of an already existing directory in the main repo, you need to create a dir element that matches its name. If one already exists, simply add your link to its structure element. You can chain multiple files, folders, and manifests inside of a single structure element. For examples of frontmatter elements, see the Style Guide. 
","categories":"","description":"","excerpt":"Overview In order to add GitHub documentation to the website that is …","ref":"/docs/contribute/documentation/adding-existing-documentation/","tags":"","title":"Adding Already Existing Documentation"},{"body":"Adding support for a new provider Steps to be followed while implementing a new (hyperscale) provider are mentioned below. This is the easiest way to add new provider support using a blueprint code.\nHowever, you may also develop your machine controller from scratch, which would provide you with more flexibility. First, however, make sure that your custom machine controller adheres to the Machine.Status struct defined in the MachineAPIs. This will make sure the MCM can act with higher-level controllers like MachineSet and MachineDeployment controller. The key is the Machine.Status.CurrentStatus.Phase key that indicates the status of the machine object.\nOur strong recommendation would be to follow the steps below. This provides the most flexibility required to support machine management for adding new providers. And if you feel to extend the functionality, feel free to update our machine controller libraries.\nSetting up your repository Create a new empty repository named machine-controller-manager-provider-{provider-name} on GitHub username/project. Do not initialize this repository with a README. Copy the remote repository URL (HTTPS/SSH) to this repository displayed once you create this repository. Now, on your local system, create directories as required. {your-github-username} given below could also be {github-project} depending on where you have created the new repository. mkdir -p $GOPATH/src/github.com/{your-github-username} Navigate to this created directory. cd $GOPATH/src/github.com/{your-github-username} Clone this repository on your local machine. 
git clone git@github.com:gardener/machine-controller-manager-provider-sampleprovider.git Rename the directory from machine-controller-manager-provider-sampleprovider to machine-controller-manager-provider-{provider-name}. mv machine-controller-manager-provider-sampleprovider machine-controller-manager-provider-{provider-name} Navigate into the newly-created directory. cd machine-controller-manager-provider-{provider-name} Update the remote origin URL to the newly created repository’s URL you had copied above. git remote set-url origin git@github.com:{your-github-username}/machine-controller-manager-provider-{provider-name}.git Rename GitHub project from gardener to {github-org/your-github-username} wherever you have cloned the repository above. Also, edit all occurrences of the word sampleprovider to {provider-name} in the code. Then, use the hack script given below to do the same. make rename-project PROJECT_NAME={github-org/your-github-username} PROVIDER_NAME={provider-name} eg: make rename-project PROJECT_NAME=gardener PROVIDER_NAME=AmazonWebServices (or) make rename-project PROJECT_NAME=githubusername PROVIDER_NAME=AWS Now, commit your changes and push them upstream. git add -A git commit -m \"Renamed SampleProvide to {provider-name}\" git push origin master Code changes required The contract between the Machine Controller Manager (MCM) and the Machine Controller (MC) AKA driver has been documented here and the machine error codes can be found here. You may refer to them for any queries.\n⚠️\n Keep in mind that there should be a unique way to map between machine objects and VMs. This can be done by mapping machine object names with VM-Name/ tags/ other metadata. Optionally, there should also be a unique way to map a VM to its machine class object. This can be done by tagging VM objects with tags/resource groups associated with the machine class. 
Steps to integrate Update the pkg/provider/apis/provider_spec.go specification file to reflect the structure of the ProviderSpec blob. It typically contains the machine template details in the MachineClass object. Follow the sample spec provided already in the file. A sample provider specification can be found here. Fill in the methods described at pkg/provider/core.go to manage VMs on your cloud provider. Comments are provided above each method to help you fill them up with desired REQUEST and RESPONSE parameters. A sample provider implementation for these methods can be found here. Fill in the required CreateMachine() and DeleteMachine() methods. Optionally fill in methods like GetMachineStatus(), InitializeMachine(), ListMachines(), and GetVolumeIDs(). You may choose to fill these in once the required methods are working. GetVolumeIDs() expects VolumeIDs to be decoded from the volumeSpec based on the cloud provider. There is also an OPTIONAL method GenerateMachineClassForMigration() that helps in migration of {ProviderSpecific}MachineClass to MachineClass CR (custom resource). This only makes sense if you have an existing implementation (in-tree) acting on different CRD types that you would like to migrate. If not, you MUST return an error (machine error UNIMPLEMENTED) to avoid processing this step. Perform validation of APIs that you have described and make it a part of your methods as required at each request. Write unit tests to make it work with your implementation by running make test. make test Tidy the go dependencies. make tidy Update the sample YAML files in the kubernetes/ directory to provide sample files through which the working of the machine controller can be tested. Update README.md to reflect any additional changes. Testing your code changes Make sure $TARGET_KUBECONFIG points to the cluster where you wish to manage machines. 
Likewise, $CONTROL_NAMESPACE represents the namespaces where MCM is looking for machine CR objects, and $CONTROL_KUBECONFIG points to the cluster that holds these machine CRs.\n On the first terminal running at $GOPATH/src/github.com/{github-org/your-github-username}/machine-controller-manager-provider-{provider-name}, Run the machine controller (driver) using the command below. make start On the second terminal pointing to $GOPATH/src/github.com/gardener, Clone the latest MCM code git clone git@github.com:gardener/machine-controller-manager.git Navigate to the newly-created directory. cd machine-controller-manager Deploy the required CRDs from the machine-controller-manager repo, kubectl apply -f kubernetes/crds Run the machine-controller-manager in the master branch make start On the third terminal pointing to $GOPATH/src/github.com/{github-org/your-github-username}/machine-controller-manager-provider-{provider-name} Fill in the object files given below and deploy them as described below. Deploy the machine-class kubectl apply -f kubernetes/machine-class.yaml Deploy the kubernetes secret if required. kubectl apply -f kubernetes/secret.yaml Deploy the machine object and make sure it joins the cluster successfully. kubectl apply -f kubernetes/machine.yaml Once the machine joins, you can test by deploying a machine-deployment. Deploy the machine-deployment object and make sure it joins the cluster successfully. kubectl apply -f kubernetes/machine-deployment.yaml Make sure to delete both the machine and machine-deployment objects after use. kubectl delete -f kubernetes/machine.yaml kubectl delete -f kubernetes/machine-deployment.yaml Releasing your docker image Make sure you have logged into gcloud/docker using the CLI. To release your docker image, run the following. make release IMAGE_REPOSITORY=\u003clink-to-image-repo\u003e A sample kubernetes deploy file can be found at kubernetes/deployment.yaml. 
Update the same (with your desired MCM and MC images) to deploy your MCM pod. ","categories":"","description":"","excerpt":"Adding support for a new provider Steps to be followed while …","ref":"/docs/other-components/machine-controller-manager/cp_support_new/","tags":"","title":"Adding Support for a Cloud Provider"},{"body":"Extension Admission The extensions are expected to validate their respective resources for their extension specific configurations, when the resources are newly created or updated. For example, provider extensions would validate spec.provider.infrastructureConfig and spec.provider.controlPlaneConfig in the Shoot resource and spec.providerConfig in the CloudProfile resource, networking extensions would validate spec.networking.providerConfig in the Shoot resource. As best practice, the validation should be performed only if there is a change in the spec of the resource. Please find an exemplary implementation in the gardener/gardener-extension-provider-aws repository.\nWhen a resource is newly created or updated, Gardener adds an extension label for all the extension types referenced in the spec of the resource. This label is of the form \u003cextension-type\u003e.extensions.gardener.cloud/\u003cextension-name\u003e : \"true\". For example, an extension label for a provider extension type aws looks like provider.extensions.gardener.cloud/aws : \"true\". The extensions should add object selectors in their admission webhooks for these labels, to filter out the objects they are responsible for. At present, these labels are added to BackupEntrys, BackupBuckets, CloudProfiles, Seeds, SecretBindings and Shoots. 
Please see the types_constants.go file for the full list of extension labels.\n","categories":"","description":"","excerpt":"Extension Admission The extensions are expected to validate their …","ref":"/docs/gardener/extensions/admission/","tags":"","title":"Admission"},{"body":"Admission Configuration for the PodSecurity Admission Plugin If you wish to add your custom configuration for the PodSecurity plugin, you can do so in the Shoot spec under .spec.kubernetes.kubeAPIServer.admissionPlugins by adding:\nadmissionPlugins: - name: PodSecurity config: apiVersion: pod-security.admission.config.k8s.io/v1 kind: PodSecurityConfiguration # Defaults applied when a mode label is not set. # # Level label values must be one of: # - \"privileged\" (default) # - \"baseline\" # - \"restricted\" # # Version label values must be one of: # - \"latest\" (default) # - specific version like \"v1.25\" defaults: enforce: \"privileged\" enforce-version: \"latest\" audit: \"privileged\" audit-version: \"latest\" warn: \"privileged\" warn-version: \"latest\" exemptions: # Array of authenticated usernames to exempt. usernames: [] # Array of runtime class names to exempt. runtimeClasses: [] # Array of namespaces to exempt. namespaces: [] For proper functioning of Gardener, kube-system namespace will also be automatically added to the exemptions.namespaces list.\n","categories":"","description":"Adding custom configuration for the `PodSecurity` plugin in `.spec.kubernetes.kubeAPIServer.admissionPlugins`","excerpt":"Adding custom configuration for the `PodSecurity` plugin in …","ref":"/docs/gardener/pod-security/","tags":"","title":"Admission Configuration for the `PodSecurity` Admission Plugin"},{"body":"See who is using Gardener Gardener adopters in production environments that have publicly shared details of their usage. 
SAP uses Gardener to deploy and manage Kubernetes clusters at scale in a uniform way across infrastructures (AWS, Azure, GCP, Alicloud, as well as generic interfaces to OpenStack and vSphere). Workloads include Databases (SAP HANA Cloud), Big Data (SAP Data Intelligence), Kyma, many other cloud native applications, and diverse business workloads. Gardener can now be run by customers on the Public Cloud Platform of the leading European Cloud Provider OVHcloud. ScaleUp Technologies runs Gardener within their public OpenStack Clouds (Hamburg, Berlin, Düsseldorf). Their clients run all kinds of workloads on top of Gardener-maintained Kubernetes clusters ranging from databases to Software-as-a-Service applications. Finanz Informatik Technologie Services GmbH uses Gardener to offer k8s as a service for customers in the financial industry in Germany. It is built on top of a “metal as a service” infrastructure implemented from scratch with k8s workloads in mind. The result is k8s on top of bare metal in minutes. PingCAP TiDB is a cloud-native distributed SQL database with MySQL compatibility, and one of the most popular open-source database projects - with 23.5K+ stars and 400+ contributors. Its sister project TiKV is a Cloud Native Interactive Landscape project. PingCAP envisioned their managed TiDB service, known as TiDB Cloud, to be multi-tenant, secure, cost-efficient, and compatible with different cloud providers, and they chose Gardener. Beezlabs uses Gardener to deliver its Intelligent Process Automation platform on multiple cloud providers and to reduce costs and lock-in risks. b’nerd uses Gardener as the core technology for its own managed Kubernetes as a Service solution and operates multiple Gardener installations for several cloud hosting service providers. STACKIT is a digital brand of Europe’s biggest retailer, the Schwarz Group, which includes Lidl, Kaufland, but also production and recycling companies. 
It uses Gardener to offer public and private Kubernetes as a service in its own data centers in Europe and aims to become the cloud provider for German and European small and mid-sized companies. Supporting and managing multiple application landscapes on-premises and across different hyperscaler infrastructures can be painful. At T-Systems we use Gardener both for internal usage and to manage clusters for our customers. We love the openness of the project, the flexibility and the architecture that allows us to manage clusters around the world with only one team from one single pane of glass and to meet industry specific certification standards. Sovereignty by design is another great value that the technology implicitly brings along. The German-based company 23 Technologies uses Gardener to offer an enterprise-class Kubernetes engine for industrial use cases as well as cloud service providers and offers managed and professional services for it. 23T is also the team behind okeanos.dev, a public service that can be used by anyone to try out Gardener. B1 Systems GmbH is an international provider of Linux \u0026 Open Source consulting, training, managed service \u0026 support. We were founded in 2004 and are based in Germany. Our team of 140 Linux experts offers tailor-made solutions based on cloud \u0026 container technologies, virtualization \u0026 high availability as well as monitoring, system \u0026 configuration management. B1 is using Gardener internally and also sets up solutions/environments for customers. finleap connect GmbH is the leading independent Open Banking platform provider in Europe. It enables companies across a multitude of industries to provide the next generation of financial services by understanding how customers transact and interact. 
With its “full-stack” platform of solutions, finleap connect makes it possible for its clients to compliantly access the financial transactions data of customers, enrich said data with analytics tools, provide digital banking services and deliver high-quality, digital financial services products and services to customers. Gardener uniquely enables us to deploy our platform in Europe and across the globe in a uniform way on the providers preferred by our customers. Codesphere is a Cloud IDE with integrated and automated deployment of web apps. It uses Gardener internally to manage clusters that host customer deployments and internal systems all over the world. plusserver combines its own cloud offerings with hyperscaler platforms to provide individually tailored multi-cloud solutions. The plusserver Kubernetes Engine (PSKE) based on Gardener reduces the complexity in managing multi-cloud environments and enables companies to orchestrate their containers and cloud-native applications across a variety of platforms such as plusserver’s pluscloud open or hyperscalers such as AWS, either by mouse click or via an API. With PSKE, companies remain vendor-independent and profit from guaranteed data sovereignty and data security due to GDPR-compliant cloud platforms in the certified plusserver data centers in Germany. Fuga Cloud uses Gardener as the basis for its Enterprise Managed Kubernetes (EMK), a platform that simplifies the management of your k8s and provides insight into usage and performance. The other Fuga Cloud services can be added with a mouse click, and the choice of another cloud provider is a negotiable option. Fuga Cloud stands for Digital Sovereignty, Data Portability and GDPR compatibility. metalstack.cloud uses Gardener and is based on the open-source software metal-stack.io, which is developed for regulated financial institutions. The focus here is on the highest possible security and compliance conformity. 
This makes metalstack.cloud perfect for running enterprise-grade container applications and provides your workloads with the highest possible performance. Cleura uses Gardener to power its Container Orchestration Engine for Cleura Public Cloud and Cleura Compliant Cloud. Cleura Container Orchestration Engine simplifies the creation and management of Kubernetes clusters through their user-friendly Cleura Cloud Management Panel or API, allowing users to focus on deploying applications instead of maintaining the underlying infrastructure. PITS Globale Datenrettungsdienste is a data recovery company located in Germany specializing in recovering lost or damaged files from hard drives, solid-state drives, flash drives, and other storage media. Gardener is used to handle highly-loaded internal infrastructure and provide reliable, fully-managed K8 cluster solutions. If you’re using Gardener and you aren’t on this list, submit a pull request! ","categories":"","description":"","excerpt":"See who is using Gardener Gardener adopters in production environments …","ref":"/adopter/","tags":"","title":"Adopters"},{"body":"Alerting Gardener uses Prometheus to gather metrics from each component. A Prometheus is deployed in each shoot control plane (on the seed) which is responsible for gathering control plane and cluster metrics. Prometheus can be configured to fire alerts based on these metrics and send them to an Alertmanager. The Alertmanager is responsible for sending the alerts to users and operators. 
This document describes how to set up alerting for:\n end-users/stakeholders/customers operators/administrators Alerting for Users To receive email alerts as a user, set the following values in the shoot spec:\nspec: monitoring: alerting: emailReceivers: - john.doe@example.com emailReceivers is a list of emails that will receive alerts if something is wrong with the shoot cluster.\nAlerting for Operators Currently, Gardener supports two options for alerting:\n Email Alerting Sending Alerts to an External Alertmanager Email Alerting Gardener provides the option to deploy an Alertmanager into each seed. This Alertmanager is responsible for sending out alerts to operators for each shoot cluster in the seed. Only email alerts are supported by the Alertmanager managed by Gardener. This is configurable by setting the Gardener controller manager configuration values alerting. See Gardener Configuration and Usage on how to configure Gardener’s SMTP secret. If the values are set, a secret with the label gardener.cloud/role: alerting will be created in the garden namespace of the garden cluster. This secret will be used by each Alertmanager in each seed.\nExternal Alertmanager The Alertmanager supports different kinds of alerting configurations. The Alertmanager provided by Gardener only supports email alerts. If email is not sufficient, then alerts can be sent to an external Alertmanager. Prometheus will send alerts to a URL and then alerts will be handled by the external Alertmanager. This external Alertmanager is operated and configured by the operator (i.e. Gardener does not configure or deploy this Alertmanager). To configure sending alerts to an external Alertmanager, create a secret in the virtual garden cluster in the garden namespace with the label: gardener.cloud/role: alerting. This secret needs to contain a URL to the external Alertmanager and information regarding authentication. 
Supported authentication types are:\n No Authentication (none) Basic Authentication (basic) Mutual TLS (certificate) Remote Alertmanager Examples Note: The url value cannot be prepended with http or https.\n # No Authentication apiVersion: v1 kind: Secret metadata: labels: gardener.cloud/role: alerting name: alerting-auth namespace: garden data: # No Authentication auth_type: base64(none) url: base64(external.alertmanager.foo) # Basic Auth auth_type: base64(basic) url: base64(external.alertmanager.foo) username: base64(admin) password: base64(password) # Mutual TLS auth_type: base64(certificate) url: base64(external.alertmanager.foo) ca.crt: base64(ca) tls.crt: base64(certificate) tls.key: base64(key) insecure_skip_verify: base64(false) # Email Alerts (internal alertmanager) auth_type: base64(smtp) auth_identity: base64(internal.alertmanager.auth_identity) auth_password: base64(internal.alertmanager.auth_password) auth_username: base64(internal.alertmanager.auth_username) from: base64(internal.alertmanager.from) smarthost: base64(internal.alertmanager.smarthost) to: base64(internal.alertmanager.to) type: Opaque Configuring Your External Alertmanager Please refer to the Alertmanager documentation on how to configure an Alertmanager.\nWe recommend you use at least the following inhibition rules in your Alertmanager configuration to prevent excessive alerts:\ninhibit_rules: # Apply inhibition if the alert name is the same. - source_match: severity: critical target_match: severity: warning equal: ['alertname', 'service', 'cluster'] # Stop all alerts for type=shoot if there are VPN problems. - source_match: service: vpn target_match_re: type: shoot equal: ['type', 'cluster'] # Stop warning and critical alerts if there is a blocker - source_match: severity: blocker target_match_re: severity: ^(critical|warning)$ equal: ['cluster'] # If the API server is down inhibit no worker nodes alert. No worker nodes depends on kube-state-metrics which depends on the API server. 
- source_match: service: kube-apiserver target_match_re: service: nodes equal: ['cluster'] # If API server is down inhibit kube-state-metrics alerts. - source_match: service: kube-apiserver target_match_re: severity: info equal: ['cluster'] # No Worker nodes depends on kube-state-metrics. Inhibit no worker nodes if kube-state-metrics is down. - source_match: service: kube-state-metrics-shoot target_match_re: service: nodes equal: ['cluster'] Below is a graph visualizing the inhibition rules:\n","categories":"","description":"","excerpt":"Alerting Gardener uses Prometheus to gather metrics from each …","ref":"/docs/gardener/monitoring/alerting/","tags":"","title":"Alerting"},{"body":"Overview Sometimes operators want to find out why a certain node got removed. This guide helps to identify possible causes. There are a few potential reasons why nodes can be removed:\n broken node: a node becomes unhealthy and machine-controller-manager terminates it in an attempt to replace the unhealthy node with a new one scale-down: cluster-autoscaler sees that a node is under-utilized and therefore scales down a worker pool node rolling: configuration changes to a worker pool (or cluster) require all nodes of one or all worker pools to be rolled and thus all nodes to be replaced. Some possible changes are: the K8s/OS version changing machine types Helpful information can be obtained by using the logging stack. See Logging Stack for how to utilize the logging information in Gardener.\nFind Out Whether the Node Was unhealthy Check the Node Events A good first indication on what happened to a node can be obtained from the node’s events. 
Events are scraped and ingested into the logging system, so they can be found in the explore tab of Grafana (make sure to select loki as datasource) with a query like {job=\"event-logging\"} | unpack | object=\"Node/\u003cnode-name\u003e\" or find any event mentioning the node in question via a broader query like {job=\"event-logging\"}|=\"\u003cnode-name\u003e\".\nA potential result might reveal:\n{\"_entry\":\"Node ip-10-55-138-185.eu-central-1.compute.internal status is now: NodeNotReady\",\"count\":1,\"firstTimestamp\":\"2023-04-05T12:02:08Z\",\"lastTimestamp\":\"2023-04-05T12:02:08Z\",\"namespace\":\"default\",\"object\":\"Node/ip-10-55-138-185.eu-central-1.compute.internal\",\"origin\":\"shoot\",\"reason\":\"NodeNotReady\",\"source\":\"node-controller\",\"type\":\"Normal\"} Check machine-controller-manager Logs If a node was getting unhealthy, the last conditions can be found in the logs of the machine-controller-manager by using a query like {pod_name=~\"machine-controller-manager.*\"}|=\"\u003cnode-name\u003e\".\nCaveat: every node resource is backed by a corresponding machine resource managed by machine-controller-manager. Usually two corresponding node and machine resources have the same name with the exception of AWS. Here you first need to find with the above query the corresponding machine name, typically via a log like this\n2023-04-05 12:02:08 {\"log\":\"Conditions of Machine \\\"shoot--demo--cluster-pool-z1-6dffc-jh4z4\\\" with providerID \\\"aws:///eu-central-1/i-0a6ad1ca4c2e615dc\\\" and backing node \\\"ip-10-55-138-185.eu-central-1.compute.internal\\\" are changing\",\"pid\":\"1\",\"severity\":\"INFO\",\"source\":\"machine_util.go:629\"} This reveals that node ip-10-55-138-185.eu-central-1.compute.internal is backed by machine shoot--demo--cluster-pool-z1-6dffc-jh4z4. 
On infrastructures other than AWS you can omit this step.\nWith the machine name at hand, now search for log entries with {pod_name=~\"machine-controller-manager.*\"}|=\"\u003cmachine-name\u003e\". In case the node had failing conditions, you’d find logs like this:\n2023-04-05 12:02:08 {\"log\":\"Machine shoot--demo--cluster-pool-z1-6dffc-jh4z4 is unhealthy - changing MachineState to Unknown. Node conditions: [{Type:ClusterNetworkProblem Status:False LastHeartbeatTime:2023-04-05 11:58:39 +0000 UTC LastTransitionTime:2023-03-23 11:59:29 +0000 UTC Reason:NoNetworkProblems Message:no cluster network problems} ... {Type:Ready Status:Unknown LastHeartbeatTime:2023-04-05 11:55:27 +0000 UTC LastTransitionTime:2023-04-05 12:02:07 +0000 UTC Reason:NodeStatusUnknown Message:Kubelet stopped posting node status.}]\",\"pid\":\"1\",\"severity\":\"WARN\",\"source\":\"machine_util.go:637\"} In the example above, the reason for an unhealthy node was that kubelet failed to renew its heartbeat. Typical reasons would be either a broken VM (that couldn’t execute kubelet anymore) or a broken network. Note that some VM terminations performed by the infrastructure provider are actually expected (e.g., scheduled events on AWS).\nIn both cases, the infrastructure provider might be able to provide more information on particular VM or network failures.\nWhatever the failure condition might have been, if a node gets unhealthy, it will be terminated by machine-controller-manager after the machineHealthTimeout has elapsed (this parameter can be configured in your shoot spec).\nCheck the Node Logs For each node the kernel and kubelet logs, as well as a few others, are scraped and can be queried with this query {nodename=\"\u003cnode-name\u003e\"} This might reveal OS specific issues or, in the absence of any logs (e.g., after the node went unhealthy), might indicate a network disruption or sudden VM termination. 
Note that some VM terminations performed by the infrastructure provider are actually expected (e.g., scheduled events on AWS).\nInfrastructure providers might be able to provide more information on particular VM failures in such cases.\nCheck the Network Problem Detector Dashboard If your Gardener installation utilizes gardener-extension-shoot-networking-problemdetector, you can check the dashboard named “Network Problem Detector” in Grafana for hints on network issues on the node of interest.\nScale-Down In general, scale-downs are managed by the cluster-autoscaler, its logs can be found with the query {container_name=\"cluster-autoscaler\"}. Attempts to remove a node can be found with the query {container_name=\"cluster-autoscaler\"}|=\"Scale-down: removing empty node\"\nIf a scale-down has caused disruptions in your workload, consider protecting your workload by adding PodDisruptionBudgets (see the autoscaler FAQ for more options).\nNode Rolling Node rolling can be caused by, e.g.:\n change of the K8s minor version of the cluster or a worker pool change of the OS version of the cluster or a worker pool change of the disk size/type or machine size/type of a worker pool change of node labels Changes like the above are done by altering the shoot specification and thus are recorded in the external auditlog system that is configured for the garden cluster.\n","categories":"","description":"Utilize Gardener's Monitoring and Logging to analyze removal and failures of nodes","excerpt":"Utilize Gardener's Monitoring and Logging to analyze removal and …","ref":"/docs/guides/monitoring-and-troubleshooting/analysing-node-failures/","tags":"","title":"Analyzing Node Removal and Failures"},{"body":"Specification ProviderSpec Schema Machine Machine is the representation of a physical or virtual machine.\n Field Type Description apiVersion string machine.sapcloud.io/v1alpha1 kind string Machine metadata Kubernetes meta/v1.ObjectMeta ObjectMeta for machine object\nRefer to the 
Kubernetes API documentation for the fields of the metadata field. spec MachineSpec Spec contains the specification of the machine\n class ClassSpec (Optional) Class contains the machineclass attributes of a machine\n providerID string (Optional) ProviderID represents the provider’s unique ID given to a machine\n nodeTemplate NodeTemplateSpec (Optional) NodeTemplateSpec describes the data a node should have when created from a template\n MachineConfiguration MachineConfiguration (Members of MachineConfiguration are embedded into this type.) (Optional) Configuration for the machine-controller.\n status MachineStatus Status contains fields depicting the status\n MachineClass MachineClass can be used to templatize and re-use provider configuration across multiple Machines / MachineSets / MachineDeployments.\n Field Type Description apiVersion string machine.sapcloud.io/v1alpha1 kind string MachineClass metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. nodeTemplate NodeTemplate (Optional) NodeTemplate contains subfields to track all node resources and other node info required to scale nodegroup from zero\n credentialsSecretRef Kubernetes core/v1.SecretReference CredentialsSecretRef can optionally store the credentials (in this case the SecretRef does not need to store them). 
This might be useful if multiple machine classes with the same credentials but different user-datas are used.\n providerSpec k8s.io/apimachinery/pkg/runtime.RawExtension Provider-specific configuration to use during node creation.\n provider string Provider is the combination of name and location of cloud-specific drivers.\n secretRef Kubernetes core/v1.SecretReference SecretRef stores the necessary secrets such as credentials or userdata.\n MachineDeployment MachineDeployment enables declarative updates for machines and MachineSets.\n Field Type Description apiVersion string machine.sapcloud.io/v1alpha1 kind string MachineDeployment metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec MachineDeploymentSpec (Optional) Specification of the desired behavior of the MachineDeployment.\n replicas int32 (Optional) Number of desired machines. This is a pointer to distinguish between explicit zero and not specified. Defaults to 0.\n selector Kubernetes meta/v1.LabelSelector (Optional) Label selector for machines. Existing MachineSets whose machines are selected by this will be the ones affected by this MachineDeployment.\n template MachineTemplateSpec Template describes the machines that will be created.\n strategy MachineDeploymentStrategy (Optional) The MachineDeployment strategy to use to replace existing machines with new ones.\n minReadySeconds int32 (Optional) Minimum number of seconds for which a newly created machine should be ready without any of its container crashing, for it to be considered available. Defaults to 0 (machine will be considered available as soon as it is ready)\n revisionHistoryLimit *int32 (Optional) The number of old MachineSets to retain to allow rollback. 
This is a pointer to distinguish between explicit zero and not specified.\n paused bool (Optional) Indicates that the MachineDeployment is paused and will not be processed by the MachineDeployment controller.\n rollbackTo RollbackConfig (Optional) DEPRECATED. The config this MachineDeployment is rolling back to. Will be cleared after rollback is done.\n progressDeadlineSeconds *int32 (Optional) The maximum time in seconds for a MachineDeployment to make progress before it is considered to be failed. The MachineDeployment controller will continue to process failed MachineDeployments and a condition with a ProgressDeadlineExceeded reason will be surfaced in the MachineDeployment status. Note that progress will not be estimated during the time a MachineDeployment is paused. This is not set by default, which is treated as infinite deadline.\n status MachineDeploymentStatus (Optional) Most recently observed status of the MachineDeployment.\n MachineSet MachineSet TODO\n Field Type Description apiVersion string machine.sapcloud.io/v1alpha1 kind string MachineSet metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. 
spec MachineSetSpec (Optional) replicas int32 (Optional) selector Kubernetes meta/v1.LabelSelector (Optional) machineClass ClassSpec (Optional) template MachineTemplateSpec (Optional) minReadySeconds int32 (Optional) status MachineSetStatus (Optional) ClassSpec (Appears on: MachineSetSpec, MachineSpec) ClassSpec is the class specification of machine\n Field Type Description apiGroup string API group to which it belongs\n kind string Kind for machine class\n name string Name of machine class\n ConditionStatus (string alias)\n (Appears on: MachineDeploymentCondition, MachineSetCondition) ConditionStatus are valid condition statuses\nCurrentStatus (Appears on: MachineStatus) CurrentStatus contains information about the current status of Machine.\n Field Type Description phase MachinePhase timeoutActive bool lastUpdateTime Kubernetes meta/v1.Time Last update time of current status\n LastOperation (Appears on: MachineSetStatus, MachineStatus, MachineSummary) LastOperation suggests the last operation performed on the object\n Field Type Description description string Description of the current operation\n errorCode string (Optional) ErrorCode of the current operation if any\n lastUpdateTime Kubernetes meta/v1.Time Last update time of current operation\n state MachineState State of operation\n type MachineOperationType Type of operation\n MachineConfiguration (Appears on: MachineSpec) MachineConfiguration describes the configurations useful for the machine-controller.\n Field Type Description drainTimeout Kubernetes meta/v1.Duration (Optional) MachineDraintimeout is the timeout after which machine is forcefully deleted.\n healthTimeout Kubernetes meta/v1.Duration (Optional) MachineHealthTimeout is the timeout after which machine is declared unhealthy/failed.\n creationTimeout Kubernetes meta/v1.Duration (Optional) MachineCreationTimeout is the timeout after which machine creation is declared failed.\n maxEvictRetries *int32 (Optional) MaxEvictRetries is the number of 
retries that will be attempted while draining the node.\n nodeConditions *string (Optional) NodeConditions are the set of conditions if set to true for MachineHealthTimeOut, machine will be declared failed.\n MachineDeploymentCondition (Appears on: MachineDeploymentStatus) MachineDeploymentCondition describes the state of a MachineDeployment at a certain point.\n Field Type Description type MachineDeploymentConditionType Type of MachineDeployment condition.\n status ConditionStatus Status of the condition, one of True, False, Unknown.\n lastUpdateTime Kubernetes meta/v1.Time The last time this condition was updated.\n lastTransitionTime Kubernetes meta/v1.Time Last time the condition transitioned from one status to another.\n reason string The reason for the condition’s last transition.\n message string A human readable message indicating details about the transition.\n MachineDeploymentConditionType (string alias)\n (Appears on: MachineDeploymentCondition) MachineDeploymentConditionType are valid conditions of MachineDeployments\nMachineDeploymentSpec (Appears on: MachineDeployment) MachineDeploymentSpec is the specification of the desired behavior of the MachineDeployment.\n Field Type Description replicas int32 (Optional) Number of desired machines. This is a pointer to distinguish between explicit zero and not specified. Defaults to 0.\n selector Kubernetes meta/v1.LabelSelector (Optional) Label selector for machines. Existing MachineSets whose machines are selected by this will be the ones affected by this MachineDeployment.\n template MachineTemplateSpec Template describes the machines that will be created.\n strategy MachineDeploymentStrategy (Optional) The MachineDeployment strategy to use to replace existing machines with new ones.\n minReadySeconds int32 (Optional) Minimum number of seconds for which a newly created machine should be ready without any of its container crashing, for it to be considered available. 
Defaults to 0 (machine will be considered available as soon as it is ready)\n revisionHistoryLimit *int32 (Optional) The number of old MachineSets to retain to allow rollback. This is a pointer to distinguish between explicit zero and not specified.\n paused bool (Optional) Indicates that the MachineDeployment is paused and will not be processed by the MachineDeployment controller.\n rollbackTo RollbackConfig (Optional) DEPRECATED. The config this MachineDeployment is rolling back to. Will be cleared after rollback is done.\n progressDeadlineSeconds *int32 (Optional) The maximum time in seconds for a MachineDeployment to make progress before it is considered to be failed. The MachineDeployment controller will continue to process failed MachineDeployments and a condition with a ProgressDeadlineExceeded reason will be surfaced in the MachineDeployment status. Note that progress will not be estimated during the time a MachineDeployment is paused. This is not set by default, which is treated as infinite deadline.\n MachineDeploymentStatus (Appears on: MachineDeployment) MachineDeploymentStatus is the most recently observed status of the MachineDeployment.\n Field Type Description observedGeneration int64 (Optional) The generation observed by the MachineDeployment controller.\n replicas int32 (Optional) Total number of non-terminated machines targeted by this MachineDeployment (their labels match the selector).\n updatedReplicas int32 (Optional) Total number of non-terminated machines targeted by this MachineDeployment that have the desired template spec.\n readyReplicas int32 (Optional) Total number of ready machines targeted by this MachineDeployment.\n availableReplicas int32 (Optional) Total number of available machines (ready for at least minReadySeconds) targeted by this MachineDeployment.\n unavailableReplicas int32 (Optional) Total number of unavailable machines targeted by this MachineDeployment. 
This is the total number of machines that are still required for the MachineDeployment to have 100% available capacity. They may either be machines that are running but not yet available or machines that still have not been created.\n conditions []MachineDeploymentCondition Represents the latest available observations of a MachineDeployment’s current state.\n collisionCount *int32 (Optional) Count of hash collisions for the MachineDeployment. The MachineDeployment controller uses this field as a collision avoidance mechanism when it needs to create the name for the newest MachineSet.\n failedMachines []*github.com/gardener/machine-controller-manager/pkg/apis/machine/v1alpha1.MachineSummary (Optional) FailedMachines has summary of machines on which lastOperation Failed\n MachineDeploymentStrategy (Appears on: MachineDeploymentSpec) MachineDeploymentStrategy describes how to replace existing machines with new ones.\n Field Type Description type MachineDeploymentStrategyType (Optional) Type of MachineDeployment. Can be “Recreate” or “RollingUpdate”. Default is RollingUpdate.\n rollingUpdate RollingUpdateMachineDeployment (Optional) Rolling update config params. Present only if MachineDeploymentStrategyType =\nRollingUpdate. 
TODO: Update this to follow our convention for oneOf, whatever we decide it to be.\n MachineDeploymentStrategyType (string alias)\n (Appears on: MachineDeploymentStrategy) MachineDeploymentStrategyType are valid strategy types for rolling MachineDeployments\nMachineOperationType (string alias)\n (Appears on: LastOperation) MachineOperationType is a label for the operation performed on a machine object.\nMachinePhase (string alias)\n (Appears on: CurrentStatus) MachinePhase is a label for the condition of a machine at the current time.\nMachineSetCondition (Appears on: MachineSetStatus) MachineSetCondition describes the state of a machine set at a certain point.\n Field Type Description type MachineSetConditionType Type of machine set condition.\n status ConditionStatus Status of the condition, one of True, False, Unknown.\n lastTransitionTime Kubernetes meta/v1.Time (Optional) The last time the condition transitioned from one status to another.\n reason string (Optional) The reason for the condition’s last transition.\n message string (Optional) A human readable message indicating details about the transition.\n MachineSetConditionType (string alias)\n (Appears on: MachineSetCondition) MachineSetConditionType is the condition on machineset object\nMachineSetSpec (Appears on: MachineSet) MachineSetSpec is the specification of a MachineSet.\n Field Type Description replicas int32 (Optional) selector Kubernetes meta/v1.LabelSelector (Optional) machineClass ClassSpec (Optional) template MachineTemplateSpec (Optional) minReadySeconds int32 (Optional) MachineSetStatus (Appears on: MachineSet) MachineSetStatus holds the most recently observed status of MachineSet.\n Field Type Description replicas int32 Replicas is the number of actual replicas.\n fullyLabeledReplicas int32 (Optional) The number of pods that have labels matching the labels of the pod template of the replicaset.\n readyReplicas int32 (Optional) The number of ready replicas for this replica set.\n 
availableReplicas int32 (Optional) The number of available replicas (ready for at least minReadySeconds) for this replica set.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed by the controller.\n machineSetCondition []MachineSetCondition (Optional) Represents the latest available observations of a replica set’s current state.\n lastOperation LastOperation LastOperation performed\n failedMachines []github.com/gardener/machine-controller-manager/pkg/apis/machine/v1alpha1.MachineSummary (Optional) FailedMachines has summary of machines on which lastOperation Failed\n MachineSpec (Appears on: Machine, MachineTemplateSpec) MachineSpec is the specification of a Machine.\n Field Type Description class ClassSpec (Optional) Class contains the machineclass attributes of a machine\n providerID string (Optional) ProviderID represents the provider’s unique ID given to a machine\n nodeTemplate NodeTemplateSpec (Optional) NodeTemplateSpec describes the data a node should have when created from a template\n MachineConfiguration MachineConfiguration (Members of MachineConfiguration are embedded into this type.) (Optional) Configuration for the machine-controller.\n MachineState (string alias)\n (Appears on: LastOperation) MachineState is a current state of the operation.\nMachineStatus (Appears on: Machine) MachineStatus holds the most recently observed status of Machine.\n Field Type Description conditions []Kubernetes core/v1.NodeCondition Conditions of this machine, same as node\n lastOperation LastOperation Last operation refers to the status of the last operation performed\n currentStatus CurrentStatus Current status of the machine object\n lastKnownState string (Optional) LastKnownState can store details of the last known state of the VM by the plugins. 
It can be used by future operation calls to determine current infrastructure state\n MachineSummary MachineSummary stores the summary of a machine.\n Field Type Description name string Name of the machine object\n providerID string ProviderID represents the provider’s unique ID given to a machine\n lastOperation LastOperation Last operation refers to the status of the last operation performed\n ownerRef string OwnerRef\n MachineTemplateSpec (Appears on: MachineDeploymentSpec, MachineSetSpec) MachineTemplateSpec describes the data a machine should have when created from a template\n Field Type Description metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object’s metadata. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec MachineSpec (Optional) Specification of the desired behavior of the machine. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status\n class ClassSpec (Optional) Class contains the machineclass attributes of a machine\n providerID string (Optional) ProviderID represents the provider’s unique ID given to a machine\n nodeTemplate NodeTemplateSpec (Optional) NodeTemplateSpec describes the data a node should have when created from a template\n MachineConfiguration MachineConfiguration (Members of MachineConfiguration are embedded into this type.) 
(Optional) Configuration for the machine-controller.\n NodeTemplate (Appears on: MachineClass) NodeTemplate contains subfields to track all node resources and other node info required to scale nodegroup from zero\n Field Type Description capacity Kubernetes core/v1.ResourceList Capacity contains subfields to track all node resources required to scale nodegroup from zero\n instanceType string Instance type of the node belonging to nodeGroup\n region string Region of the expected node belonging to nodeGroup\n zone string Zone of the expected node belonging to nodeGroup\n architecture *string (Optional) CPU Architecture of the node belonging to nodeGroup\n NodeTemplateSpec (Appears on: MachineSpec) NodeTemplateSpec describes the data a node should have when created from a template\n Field Type Description metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec Kubernetes core/v1.NodeSpec (Optional) NodeSpec describes the attributes that a node is created with.\n podCIDR string (Optional) PodCIDR represents the pod IP range assigned to the node.\n podCIDRs []string (Optional) podCIDRs represents the IP ranges assigned to the node for usage by Pods on that node. If this field is specified, the 0th entry must match the podCIDR field. It may contain at most 1 value for each of IPv4 and IPv6.\n providerID string (Optional) ID of the node assigned by the cloud provider in the format: ://\n unschedulable bool (Optional) Unschedulable controls node schedulability of new pods. By default, node is schedulable. More info: https://kubernetes.io/docs/concepts/nodes/node/#manual-node-administration\n taints []Kubernetes core/v1.Taint (Optional) If specified, the node’s taints.\n configSource Kubernetes core/v1.NodeConfigSource (Optional) Deprecated: Previously used to specify the source of the node’s configuration for the DynamicKubeletConfig feature. 
This feature is removed.\n externalID string (Optional) Deprecated. Not all kubelets will set this field. Remove field after 1.13. see: https://issues.k8s.io/61966\n RollbackConfig (Appears on: MachineDeploymentSpec) RollbackConfig is the config to rollback a MachineDeployment\n Field Type Description revision int64 (Optional) The revision to rollback to. If set to 0, rollback to the last revision.\n RollingUpdateMachineDeployment (Appears on: MachineDeploymentStrategy) RollingUpdateMachineDeployment is the spec to control the desired behavior of rolling update.\n Field Type Description maxUnavailable k8s.io/apimachinery/pkg/util/intstr.IntOrString (Optional) The maximum number of machines that can be unavailable during the update. Value can be an absolute number (ex: 5) or a percentage of desired machines (ex: 10%). Absolute number is calculated from percentage by rounding down. This can not be 0 if MaxSurge is 0. By default, a fixed value of 1 is used. Example: when this is set to 30%, the old MC can be scaled down to 70% of desired machines immediately when the rolling update starts. Once new machines are ready, old MC can be scaled down further, followed by scaling up the new MC, ensuring that the total number of machines available at all times during the update is at least 70% of desired machines.\n maxSurge k8s.io/apimachinery/pkg/util/intstr.IntOrString (Optional) The maximum number of machines that can be scheduled above the desired number of machines. Value can be an absolute number (ex: 5) or a percentage of desired machines (ex: 10%). This can not be 0 if MaxUnavailable is 0. Absolute number is calculated from percentage by rounding up. By default, a value of 1 is used. Example: when this is set to 30%, the new MC can be scaled up immediately when the rolling update starts, such that the total number of old and new machines do not exceed 130% of desired machines. 
Once old machines have been killed, new MC can be scaled up further, ensuring that the total number of machines running at any time during the update is at most 130% of desired machines.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Specification ProviderSpec Schema Machine Machine is the …","ref":"/docs/other-components/machine-controller-manager/documents/apis/","tags":"","title":"Apis"},{"body":"Overview Similar to the kube-apiserver, the gardener-apiserver comes with a few in-tree managed admission plugins. If you want to get an overview of the what and why of admission plugins then this document might be a good start.\nThis document lists all existing admission plugins with a short explanation of what each one is responsible for.\nClusterOpenIDConnectPreset, OpenIDConnectPreset (both enabled by default)\nThese admission controllers react on CREATE operations for Shoots. If the Shoot does not specify any OIDC configuration (.spec.kubernetes.kubeAPIServer.oidcConfig=nil), then it tries to find a matching ClusterOpenIDConnectPreset or OpenIDConnectPreset, respectively. If there are multiple matches, then the one with the highest weight “wins”. In this case, the admission controller will default the OIDC configuration in the Shoot.\nControllerRegistrationResources (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for ControllerRegistrations. It validates that there exists only one ControllerRegistration in the system that is primarily responsible for a given kind/type resource combination. 
This prevents misconfiguration by the Gardener administrator/operator.\nCustomVerbAuthorizer (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for Projects and NamespacedCloudProfiles.\nFor Projects it validates whether the user is bound to an RBAC role with the modify-spec-tolerations-whitelist verb in case the user tries to change the .spec.tolerations.whitelist field of the respective Project resource. Usually, regular project members are not bound to this custom verb, allowing the Gardener administrator to manage certain toleration whitelists on Project basis.\nFor NamespacedCloudProfiles it validates whether the user is assigned an RBAC role with the modify-spec-kubernetes verb when attempting to change the .spec.kubernetes field, or the modify-spec-machineimages verb when attempting to change the .spec.machineImages field of the respective NamespacedCloudProfile resource.\nDeletionConfirmation (enabled by default)\nThis admission controller reacts on DELETE operations for Projects, Shoots, and ShootStates. It validates that the respective resource is annotated with a deletion confirmation annotation, namely confirmation.gardener.cloud/deletion=true. Only if this annotation is present it allows the DELETE operation to pass. This prevents users from accidental/undesired deletions. In addition, it applies the “four-eyes principle for deletion” concept if the Project is configured accordingly. Find all information about it in this document.\nFurthermore, this admission controller reacts on CREATE or UPDATE operations for Shoots. It makes sure that the deletion.gardener.cloud/confirmed-by annotation is properly maintained in case the Shoot deletion is confirmed with above mentioned annotation.\nExposureClass (enabled by default)\nThis admission controller reacts on Create operations for Shoots. 
It mutates Shoot resources which have an ExposureClass referenced by merging both their shootSelectors and/or tolerations into the Shoot resource.\nExtensionValidator (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for BackupEntrys, BackupBuckets, Seeds, and Shoots. For all the various extension types in the specifications of these objects, it validates whether there exists a ControllerRegistration in the system that is primarily responsible for the stated extension type(s). This prevents misconfigurations that would otherwise allow users to create such resources with extension types that don’t exist in the cluster, effectively leading to failing reconciliation loops.\nExtensionLabels (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for BackupBuckets, BackupEntrys, CloudProfiles, Seeds, SecretBindings and Shoots. For all the various extension types in the specifications of these objects, it adds a corresponding label in the resource. This would allow extension admission webhooks to filter out the resources they are responsible for and ignore all others. This label is of the form \u003cextension-type\u003e.extensions.gardener.cloud/\u003cextension-name\u003e : \"true\". For example, an extension label for provider extension type aws, looks like provider.extensions.gardener.cloud/aws : \"true\".\nProjectValidator (enabled by default)\nThis admission controller reacts on CREATE operations for Projects. It prevents creating Projects with a non-empty .spec.namespace if the value in .spec.namespace does not start with garden-.\n⚠️ This admission plugin will be removed in a future release and its business logic will be incorporated into the static validation of the gardener-apiserver.\nResourceQuota (enabled by default)\nThis admission controller enables object count ResourceQuotas for Gardener resources, e.g. 
Shoots, SecretBindings, Projects, etc.\n ⚠️ In addition to this admission plugin, the ResourceQuota controller must be enabled for the Kube-Controller-Manager of your Garden cluster.\n ResourceReferenceManager (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for CloudProfiles, Projects, SecretBindings, Seeds, and Shoots. Generally, it checks whether referred resources stated in the specifications of these objects exist in the system (e.g., if a referenced Secret exists). However, it also has some special behaviours for certain resources:\n CloudProfiles: It rejects removing Kubernetes or machine image versions if there is at least one Shoot that refers to them. Projects: It sets the .spec.createdBy field for newly created Project resources, and defaults the .spec.owner field in case it is empty (to the same value of .spec.createdBy). Shoots: It sets the gardener.cloud/created-by=\u003cusername\u003e annotation for newly created Shoot resources. SeedValidator (enabled by default)\nThis admission controller reacts on DELETE operations for Seeds. Rejects the deletion if Shoot(s) reference the seed cluster.\nShootDNS (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for Shoots. It tries to assign a default domain to the Shoot. It also validates the DNS configuration (.spec.dns) for shoots.\nShootNodeLocalDNSEnabledByDefault (disabled by default)\nThis admission controller reacts on CREATE operations for Shoots. If enabled, it will enable node local dns within the shoot cluster (for more information, see NodeLocalDNS Configuration) by setting spec.systemComponents.nodeLocalDNS.enabled=true for newly created Shoots. 
Already existing Shoots and new Shoots that explicitly disable node local dns (spec.systemComponents.nodeLocalDNS.enabled=false) will not be affected by this admission plugin.\nShootQuotaValidator (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for Shoots. It validates the resource consumption declared in the specification against applicable Quota resources. Only if the applicable Quota resources admit the configured resources in the Shoot then it allows the request. Applicable Quotas are referred in the SecretBinding that is used by the Shoot.\nShootResourceReservation (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for Shoots. It injects the Kubernetes.Kubelet.KubeReserved setting for kubelet either as global setting for a shoot or on a per worker pool basis. If the admission configuration (see this example) for the ShootResourceReservation plugin contains useGKEFormula: false (the default), then it sets a static default resource reservation for the shoot.\nIf useGKEFormula: true is set, then the plugin injects resource reservations based on the machine type similar to GKE’s formula for resource reservation into each worker pool. Already existing resource reservations are not modified; this also means that resource reservations are not automatically updated if the machine type for a worker pool is changed. If a shoot contains global resource reservations, then no per worker pool resource reservations are injected.\nShootVPAEnabledByDefault (disabled by default)\nThis admission controller reacts on CREATE operations for Shoots. If enabled, it will enable the managed VerticalPodAutoscaler components (for more information, see Vertical Pod Auto-Scaling) by setting spec.kubernetes.verticalPodAutoscaler.enabled=true for newly created Shoots. 
Already existing Shoots and new Shoots that explicitly disable VPA (spec.kubernetes.verticalPodAutoscaler.enabled=false) will not be affected by this admission plugin.\nShootTolerationRestriction (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for Shoots. It validates the .spec.tolerations used in Shoots against the whitelist of its Project, or against the whitelist configured in the admission controller’s configuration, respectively. Additionally, it defaults the .spec.tolerations in Shoots with those configured in its Project, and those configured in the admission controller’s configuration, respectively.\nShootValidator (enabled by default)\nThis admission controller reacts on CREATE, UPDATE and DELETE operations for Shoots. It validates certain configurations in the specification against the referred CloudProfile (e.g., machine images, machine types, used Kubernetes version, …). Generally, it performs validations that cannot be handled by the static API validation due to their dynamic nature (e.g., when something needs to be checked against referred resources). Additionally, it takes over certain defaulting tasks (e.g., default machine image for worker pools, default Kubernetes version).\nShootManagedSeed (enabled by default)\nThis admission controller reacts on UPDATE and DELETE operations for Shoots. It validates certain configuration values in the specification that are specific to ManagedSeeds (e.g. the nginx-addon of the Shoot has to be disabled, the Shoot VPA has to be enabled). It rejects the deletion if the Shoot is referred to by a ManagedSeed.\nManagedSeedValidator (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for ManagedSeeds. It validates certain configuration values in the specification against the referred Shoot, for example Seed provider, network ranges, DNS domain, etc. 
Similar to ShootValidator, it performs validations that cannot be handled by the static API validation due to their dynamic nature. Additionally, it performs certain defaulting tasks, making sure that configuration values that are not specified are defaulted to the values of the referred Shoot, for example Seed provider, network ranges, DNS domain, etc.\nManagedSeedShoot (enabled by default)\nThis admission controller reacts on DELETE operations for ManagedSeeds. It rejects the deletion if there are Shoots that are scheduled onto the Seed that is registered by the ManagedSeed.\nShootDNSRewriting (disabled by default)\nThis admission controller reacts on CREATE operations for Shoots. If enabled, it adds a set of common suffixes configured in its admission plugin configuration to the Shoot (spec.systemComponents.coreDNS.rewriting.commonSuffixes) (for more information, see DNS Search Path Optimization). Already existing Shoots will not be affected by this admission plugin.\nNamespacedCloudProfileValidator (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for NamespacedCloudProfiles. It primarily validates if the referenced parent CloudProfile exists in the system. In addition, the admission controller ensures that the NamespacedCloudProfile only configures new machine types, and does not overwrite those from the parent CloudProfile.\n","categories":"","description":"A list of all gardener managed admission plugins together with their responsibilities","excerpt":"A list of all gardener managed admission plugins together with their …","ref":"/docs/gardener/concepts/apiserver-admission-plugins/","tags":"","title":"APIServer Admission Plugins"},{"body":"Official Definition - What is Kubernetes? 
“Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.”\n Introduction - Basic Principle The foundation of the Gardener (providing Kubernetes Clusters as a Service) is Kubernetes itself, because Kubernetes is the go-to solution to manage software in the Cloud, even when it’s Kubernetes itself (see also OpenStack which is provisioned more and more on top of Kubernetes as well).\nWhile self-hosting, meaning to run Kubernetes components inside Kubernetes, is a popular topic in the community, we apply a special pattern catering to the needs of our cloud platform to provision hundreds or even thousands of clusters. We take a so-called “seed” cluster and seed the control plane (such as the API server, scheduler, controllers, etcd persistence and others) of an end-user cluster, which we call “shoot” cluster, as pods into the “seed” cluster. That means that one “seed” cluster, of which we will have one per IaaS and region, hosts the control planes of multiple “shoot” clusters. That allows us to avoid dedicated hardware/virtual machines for the “shoot” cluster control planes. We simply put the control plane into pods/containers and since the “seed” cluster watches them, they can be deployed with a replica count of 1 and only need to be scaled out when the control plane gets under pressure, but no longer for HA reasons. At the same time, the deployments get simpler (standard Kubernetes deployment) and easier to update (standard Kubernetes rolling update). The actual “shoot” cluster consists only of the worker nodes (no control plane) and therefore the users may get full administrative access to their clusters.\nSetting The Scene - Components and Procedure We provide a central operator UI, which we call the “Gardener Dashboard”. 
It talks to a dedicated cluster, which we call the “Garden” cluster, and uses custom resources managed by an aggregated API server (one of the general extension concepts of Kubernetes) to represent “shoot” clusters. In this “Garden” cluster runs the “Gardener”, which is basically a Kubernetes controller that watches the custom resources and acts upon them, i.e. creates, updates/modifies, or deletes “shoot” clusters. The creation follows basically these steps:\n Create a namespace in the “seed” cluster for the “shoot” cluster, which will host the “shoot” cluster control plane. Generate secrets and credentials, which the worker nodes will need to talk to the control plane. Create the infrastructure (using Terraform), which basically consists of the network setup. Deploy the “shoot” cluster control plane into the “shoot” namespace in the “seed” cluster, containing the “machine-controller-manager” pod. Create machine CRDs in the “seed” cluster, describing the configuration and the number of worker machines for the “shoot” (the machine-controller-manager watches the CRDs and creates virtual machines from them). Wait for the “shoot” cluster API server to become responsive (pods will be scheduled, persistent volumes and load balancers are created by Kubernetes via the respective cloud provider). Finally, we deploy kube-system daemons like kube-proxy and further add-ons like the dashboard into the “shoot” cluster and the cluster becomes active. Overview Architecture Diagram Detailed Architecture Diagram Note: The kubelet, as well as the pods inside the “shoot” cluster, talks through the front-door (load balancer IP; public Internet) to its “shoot” cluster API server running in the “seed” cluster. 
The reverse communication from the API server to the pod, service, and node networks happens through a VPN connection that we deploy into the “seed” and “shoot” clusters.\n","categories":["Users"],"description":"The concepts behind the Gardener architecture","excerpt":"The concepts behind the Gardener architecture","ref":"/docs/gardener/concepts/architecture/","tags":"","title":"Architecture"},{"body":"Audit a Kubernetes Cluster The shoot cluster is a Kubernetes cluster and its kube-apiserver handles the audit events. In order to define which audit events must be logged, a proper audit policy file must be passed to the Kubernetes API server. You can find more information about auditing a Kubernetes cluster in the Auditing topic.\nDefault Audit Policy By default, Gardener deploys the shoot cluster with the audit policy defined in the kube-apiserver package.\nCustom Audit Policy If you need a specific audit policy for your shoot cluster, you can deploy the required audit policy in the garden cluster as a ConfigMap resource and set up your shoot to refer to this ConfigMap. Note that the policy must be stored under the key policy in the data section of the ConfigMap.\nFor example, deploy the auditpolicy ConfigMap in the same namespace as your Shoot resource:\nkubectl apply -f example/95-configmap-custom-audit-policy.yaml then set your shoot to refer to that ConfigMap (only related fields are shown):\nspec: kubernetes: kubeAPIServer: auditConfig: auditPolicy: configMapRef: name: auditpolicy Gardener validates that the Shoot resource refers only to an existing ConfigMap containing a valid audit policy, and rejects the Shoot on failure. 
If you want to switch back to the default audit policy, you have to remove the section\nauditPolicy: configMapRef: name: \u003cconfigmap-name\u003e from the shoot spec.\nRolling Out Changes to the Audit Policy Gardener is not automatically rolling out changes to the Audit Policy to minimize the amount of Shoot reconciliations in order to prevent cloud provider rate limits, etc. Gardener will pick up the changes on the next reconciliation of Shoots referencing the Audit Policy ConfigMap. If users want to immediately rollout Audit Policy changes, they can manually trigger a Shoot reconciliation as described in triggering an immediate reconciliation. This is similar to changes to the cloud provider secret referenced by Shoots.\n","categories":"","description":"How to define a custom audit policy through a `ConfigMap` and reference it in the shoot spec","excerpt":"How to define a custom audit policy through a `ConfigMap` and …","ref":"/docs/gardener/shoot_auditpolicy/","tags":"","title":"Audit a Kubernetes Cluster"},{"body":"Increasing the Security of All Gardener Stakeholders In summer 2018, the Gardener project team asked Kinvolk to execute several penetration tests in its role as third-party contractor. The goal of this ongoing work was to increase the security of all Gardener stakeholders in the open source community. Following the Gardener architecture, the control plane of a Gardener managed shoot cluster resides in the corresponding seed cluster. 
This is a Control-Plane-as-a-Service with a network air gap.\nAlong the way we found various kinds of security issues, for example, due to misconfiguration or missing isolation, as well as two special problems with upstream Kubernetes and its Control-Plane-as-a-Service architecture.\nMajor Findings From this experience, we’d like to share a few examples of security issues that could happen on a Kubernetes installation and how to fix them.\nAlban Crequy (Kinvolk) and Dirk Marwinski (SAP SE) gave a presentation entitled Hardening Multi-Cloud Kubernetes Clusters as a Service at KubeCon 2018 in Shanghai presenting some of the findings.\nHere is a summary of the findings:\n Privilege escalation due to insecure configuration of the Kubernetes API server\n Root cause: Same certificate authority (CA) is used for both the API server and the proxy that allows accessing the API server. Risk: Users can get access to the API server. Recommendation: Always use different CAs. Exploration of the control plane network with malicious HTTP-redirects\n Root cause: See detailed description below. Risk: Provoked error message contains full HTTP payload from an existing endpoint which can be exploited. The contents of the payload depend on your setup, but can potentially be user data, configuration data, and credentials. Recommendation: Use the latest version of Gardener Ensure the seed cluster’s container network supports network policies. Clusters that have been created with Kubify are not protected as Flannel is used there which doesn’t support network policies. 
Reading private AWS metadata via Grafana\n Root cause: By configuring a new custom data source in Grafana, we could send HTTP requests to target the control plane network. Risk: Users can get the “user-data” for the seed cluster from the metadata service and retrieve a kubeconfig for that Kubernetes cluster Recommendation: Lockdown Grafana features to only what’s necessary in this setup, block all unnecessary outgoing traffic, move Grafana to a different network, lockdown unauthenticated endpoints Scenario 1: Privilege Escalation with Insecure API Server In most configurations, different components connect directly to the Kubernetes API server, often using a kubeconfig with a client certificate. The API server is started with the flag:\n/hyperkube apiserver --client-ca-file=/srv/kubernetes/ca/ca.crt ... The API server will check whether the client certificate presented by kubectl, kubelet, scheduler or another component is really signed by the configured certificate authority for clients.\nThe API server can have many clients of various kinds\nHowever, it is possible to configure the API server differently for use with an intermediate authenticating proxy. The proxy will authenticate the client with its own custom method and then issue HTTP requests to the API server with additional HTTP headers specifying the user name and group name. The API server should only accept HTTP requests with HTTP headers from a legitimate proxy. To allow the API server to check incoming requests, you need to pass a list of certificate authorities (CAs) to it. Requests coming from a proxy are only accepted if they use a client certificate that is signed by one of the CAs of that list.\n--requestheader-client-ca-file=/srv/kubernetes/ca/ca-proxy.crt --requestheader-username-headers=X-Remote-User --requestheader-group-headers=X-Remote-Group API server clients can reach the API server through an authenticating proxy\nSo far, so good. 
But what happens if the malicious user “Mallory” tries to connect directly to the API server and reuses the HTTP headers to pretend to be someone else?\nWhat happens when a client bypasses the proxy, connecting directly to the API server?\nWith a correct configuration, Mallory’s kubeconfig will have a certificate signed by the API server certificate authority but not signed by the proxy certificate authority. So the API server will not accept the extra HTTP header “X-Remote-Group: system:masters”.\nYou only run into an issue when the same certificate authority is used for both the API server and the proxy. Then, any Kubernetes client certificate can be used to take the role of different user or group as the API server will accept the user header and group header.\nThe kubectl tool does not normally add those HTTP headers but it’s pretty easy to generate the corresponding HTTP requests manually.\nWe worked on improving the Kubernetes documentation to make clearer that this configuration should be avoided.\nScenario 2: Exploration of the Control Plane Network with Malicious HTTP-Redirects The API server is a central component of Kubernetes and many components initiate connections to it, including the kubelet running on worker nodes. Most of the requests from those clients will end up updating Kubernetes objects (pods, services, deployments, and so on) in the etcd database but the API server usually does not need to initiate TCP connections itself.\nThe API server is mostly a component that receives requests\nHowever, there are exceptions. Some kubectl commands will trigger the API server to open a new connection to the kubelet. kubectl exec is one of those commands. In order to get the standard I/Os from the pod, the API server will start an HTTP connection to the kubelet on the worker node where the pod is running. 
Depending on the container runtime used, it can be done in different ways, but one way to do it is for the kubelet to reply with a HTTP-302 redirection to the Container Runtime Interface (CRI). Basically, the kubelet is telling the API server to get the streams from CRI itself directly instead of forwarding. The redirection from the kubelet will only change the port and path from the URL; the IP address will not be changed because the kubelet and the CRI component run on the same worker node.\nBut the API server also initiates some connections, for example, to worker nodes\nIt’s often quite easy for users of a Kubernetes cluster to get access to worker nodes and tamper with the kubelet. They could be given explicit SSH access or they could be given a kubeconfig with enough privileges to create privileged pods or even just pods with “host” volumes.\nIn contrast, users (even those with “system:masters” permissions or “root” rights) are often not given access to the control plane. On setups like, for example, GKE or Gardener, the control plane is running on separate nodes, with a different administrative access. It could be hosted on a different cloud provider account. So users are not free to explore the internal networking of the control plane.\nWhat would happen if a user was tampering with the kubelet to make it maliciously redirect kubectl exec requests to a different random endpoint? Most likely the given endpoint would not speak the streaming server protocol, so there would be an error. However, the full HTTP payload from the endpoint is included in the error message printed by kubectl exec.\nThe API server is tricked to connect to other components\nThe impact of this issue depends on the specific setup. But in many configurations, we could find a metadata service (such as the AWS metadata service) containing user data, configurations and credentials. 
The setup we explored had a different AWS account and a different EC2 instance profile for the worker nodes and the control plane. This issue allowed users to get access to the AWS metadata service in the context of the control plane, which they should not have access to.\nWe have reported this issue to the Kubernetes Security mailing list and the public pull request that addresses the issue has been merged (PR#66516). It provides a way to enforce HTTP redirect validation (disabled by default).\nBut there are several other ways that users could trigger the API server to generate HTTP requests and get the reply payload back, so it is advised to isolate the API server and other components from the network as an additional precautionary measure. Depending on where the API server runs, it could be with Kubernetes Network Policies, EC2 Security Groups or just iptables directly. Following the defense-in-depth principle, it is a good idea to apply the API server HTTP redirect validation when it is available as well as firewall rules.\nIn Gardener, this has been fixed with Kubernetes network policies along with changes to ensure the API server does not need to contact the metadata service. You can see more details in the announcements on the Gardener mailing list. This is tracked in CVE-2018-2475.\nTo be protected from this issue, stakeholders should:\n Use the latest version of Gardener Ensure the seed cluster’s container network supports network policies. Clusters that have been created with Kubify are not protected as Flannel is used there which doesn’t support network policies. Scenario 3: Reading Private AWS Metadata via Grafana For our tests, we had access to a Kubernetes setup where users are not only given access to the API server in the control plane, but also to a Grafana instance that is used to gather data from their Kubernetes clusters via Prometheus. The control plane is managed and users don’t have access to the nodes that it runs on. 
They can only access the API server and Grafana via a load balancer. The internal network of the control plane is therefore hidden from users.\nPrometheus and Grafana can be used to monitor worker nodes\nUnfortunately, that setup was not protecting the control plane network from nosy users. By configuring a new custom data source in Grafana, we could send HTTP requests to target the control plane network, for example the AWS metadata service. The reply payload is not displayed on the Grafana Web UI but it is possible to access it from the debugging console of the Chrome browser.\nCredentials can be retrieved from the debugging console of Chrome\nAdding a Grafana data source is a way to issue HTTP requests to arbitrary targets\nIn that installation, users could get the “user-data” for the seed cluster from the metadata service and retrieve a kubeconfig for that Kubernetes cluster.\nThere are many possible measures to avoid this situation: lock down Grafana features to only what’s necessary in this setup, block all unnecessary outgoing traffic, move Grafana to a different network, or lock down unauthenticated endpoints, among others.\nConclusion The three scenarios above show pitfalls with a Kubernetes setup. A lot of them were specific to the Kubernetes installation: different cloud providers or different configurations will show different weaknesses. Users should no longer be given access to Grafana.\n","categories":"","description":"A few insecure configurations in Kubernetes","excerpt":"A few insecure configurations in Kubernetes","ref":"/docs/guides/applications/insecure-configuration/","tags":"","title":"Auditing Kubernetes for Secure Setup"},{"body":"Packages:\n authentication.gardener.cloud/v1alpha1 authentication.gardener.cloud/v1alpha1 Package v1alpha1 is a version of the API. 
“authentication.gardener.cloud/v1alpha1” API is already used for CRD registration and must not be served by the API server.\nResource Types: AdminKubeconfigRequest AdminKubeconfigRequest can be used to request a kubeconfig with admin credentials for a Shoot cluster.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec AdminKubeconfigRequestSpec Spec is the specification of the AdminKubeconfigRequest.\n expirationSeconds int64 (Optional) ExpirationSeconds is the requested validity duration of the credential. The credential issuer may return a credential with a different validity duration so a client needs to check the ‘expirationTimestamp’ field in a response. Defaults to 1 hour.\n status AdminKubeconfigRequestStatus Status is the status of the AdminKubeconfigRequest.\n AdminKubeconfigRequestSpec (Appears on: AdminKubeconfigRequest) AdminKubeconfigRequestSpec contains the expiration time of the kubeconfig.\n Field Description expirationSeconds int64 (Optional) ExpirationSeconds is the requested validity duration of the credential. The credential issuer may return a credential with a different validity duration so a client needs to check the ‘expirationTimestamp’ field in a response. 
Defaults to 1 hour.\n AdminKubeconfigRequestStatus (Appears on: AdminKubeconfigRequest) AdminKubeconfigRequestStatus is the status of the AdminKubeconfigRequest containing the kubeconfig and expiration of the credential.\n Field Description kubeconfig []byte Kubeconfig contains the kubeconfig with cluster-admin privileges for the shoot cluster.\n expirationTimestamp Kubernetes meta/v1.Time ExpirationTimestamp is the expiration timestamp of the returned credential.\n ViewerKubeconfigRequest ViewerKubeconfigRequest can be used to request a kubeconfig with viewer credentials (excluding Secrets) for a Shoot cluster.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ViewerKubeconfigRequestSpec Spec is the specification of the ViewerKubeconfigRequest.\n expirationSeconds int64 (Optional) ExpirationSeconds is the requested validity duration of the credential. The credential issuer may return a credential with a different validity duration so a client needs to check the ‘expirationTimestamp’ field in a response. Defaults to 1 hour.\n status ViewerKubeconfigRequestStatus Status is the status of the ViewerKubeconfigRequest.\n ViewerKubeconfigRequestSpec (Appears on: ViewerKubeconfigRequest) ViewerKubeconfigRequestSpec contains the expiration time of the kubeconfig.\n Field Description expirationSeconds int64 (Optional) ExpirationSeconds is the requested validity duration of the credential. The credential issuer may return a credential with a different validity duration so a client needs to check the ‘expirationTimestamp’ field in a response. 
Defaults to 1 hour.\n ViewerKubeconfigRequestStatus (Appears on: ViewerKubeconfigRequest) ViewerKubeconfigRequestStatus is the status of the ViewerKubeconfigRequest containing the kubeconfig and expiration of the credential.\n Field Description kubeconfig []byte Kubeconfig contains the kubeconfig with viewer privileges (excluding Secrets) for the shoot cluster.\n expirationTimestamp Kubernetes meta/v1.Time ExpirationTimestamp is the expiration timestamp of the returned credential.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n authentication.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/authentication/","tags":"","title":"Authentication"},{"body":"Authentication of Gardener Control Plane Components Against the Garden Cluster Note: This document refers to Gardener’s API server, admission controller, controller manager and scheduler components. Any reference to the term Gardener control plane component can be replaced with any of the mentioned above.\n There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy a Gardener control plane component is to not provide a kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution is to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see the example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.\u003cGardenerControlPlaneComponent\u003e.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.\u003cGardenerControlPlaneComponent\u003e.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication is to create a service account in the target cluster and bind the respective roles to it. Then use the generated service account token and craft a kubeconfig, which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.deployment.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution is to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. 
This approach does not provide a solution for the client certificate rotation. However, this setup can be achieved by setting both .Values.global.deployment.virtualGarden.enabled: true and .Values.global.deployment.virtualGarden.\u003cGardenerControlPlaneComponent\u003e.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then, projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy the oidc-webhook-authenticator (OWA) and establish the needed trust. Set .Values.global.deployment.virtualGarden.enabled: true and .Values.global.deployment.virtualGarden.\u003cGardenerControlPlaneComponent\u003e.user.name. Note: username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e\n Set .Values.global.\u003cGardenerControlPlaneComponent\u003e.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.\u003cGardenerControlPlaneComponent\u003e.serviceAccountTokenVolumeProjection.audience. Note: audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e.\n Craft a kubeconfig (see the example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Authentication of Gardener Control Plane Components Against the Garden …","ref":"/docs/gardener/deployment/authentication_gardener_control_plane/","tags":"","title":"Authentication Gardener Control Plane"},{"body":"Overview The project resource operations that are performed manually in the dashboard or via kubectl can be automated using the Gardener API and a Service Account authorized to perform them.\nCreate a Service Account Prerequisites You are logged on to the Gardener Dashboard You have created a project Steps Select your project and choose MEMBERS from the menu on the left.\n Locate the section Service Accounts and choose +.\n Enter the service account details.\nThe following Roles are available:\n Role Granted Permissions Owner Combines the Admin, UAM and Service Account Manager roles. There can only be one owner per project. You can change the owner on the project administration page. Admin Allows to manage resources inside the project (e.g. secrets, shoots, configmaps and similar) and to manage permissions for service accounts. Note that the Admin role has read-only access to service accounts. Viewer Provides read access to project details and shoots. Has access to shoots but is not able to create new ones. Cannot read cloud provider secrets. UAM Allows to add/modify/remove human users, service accounts or groups to/from the project member list. In case an external UAM system is connected via a service account, only this account should get the UAM role. 
Service Account Manager Allows to manage service accounts inside the project namespace and request tokens for them. The permissions of the created service accounts are instead managed by the Admin role. For security reasons this role should not be assigned to service accounts. In particular it should be ensured that the service account is not able to refresh service account tokens forever. Choose CREATE. Use the Service Account To use the service account, download or copy its kubeconfig. With it you can connect to the API endpoint of your Gardener project.\n Note: The downloaded kubeconfig contains the service account credentials. Treat with care.\n Delete the Service Account Choose Delete Service Account to delete it.\nRelated Links Service Account Manager ","categories":"","description":"","excerpt":"Overview The project resource operations that are performed manually …","ref":"/docs/dashboard/automated-resource-management/","tags":"","title":"Automating Project Resource Management"},{"body":"Overview This document describes the used autoscaling mechanism for several components.\nGarden or Shoot Cluster etcd By default, if none of the autoscaling modes is requested the etcd is deployed with static resources, without autoscaling.\nHowever, there are two supported autoscaling modes for the Garden or Shoot cluster etcd.\n HVPA\nIn HVPA mode, the etcd is scaled by the hvpa-controller. The gardenlet/gardener-operator is creating an HVPA resource for the etcd (main or events). 
The HVPA enables a vertical scaling for etcd.\nThe HVPA mode is the used autoscaling mode when the HVPA feature gate is enabled and the VPAForETCD feature gate is disabled.\n VPA\nIn VPA mode, the etcd is scaled by a native VPA resource.\nThe VPA mode is the used autoscaling mode when the VPAForETCD feature gate is enabled (takes precedence over the HVPA feature gate).\n [!NOTE] Starting with release v1.97, the VPAForETCD feature gate is enabled by default.\n For both of the autoscaling modes downscaling is handled more pessimistically to prevent many subsequent etcd restarts. Thus, for production and infrastructure Shoot clusters (or all Garden clusters), downscaling is deactivated for the main etcd. For all other Shoot clusters, lower advertised requests/limits are only applied during the Shoot’s maintenance time window.\nShoot Kubernetes API Server There are three supported autoscaling modes for the Shoot Kubernetes API server.\n Baseline\nIn Baseline mode, the Shoot Kubernetes API server is scaled by active HPA and VPA in passive, recommend-only mode.\nThe API server resource requests are computed based on the Shoot’s minimum Nodes count:\n Range Resource Requests [0, 2] 800m, 800Mi (2, 10] 1000m, 1100Mi (10, 50] 1200m, 1600Mi (50, 100] 2500m, 5200Mi (100, inf.) 3000m, 5200Mi The API server’s min replicas count is 2, the max replicas count - 3.\nThe Baseline mode is the used autoscaling mode when the HVPA and VPAAndHPAForAPIServer feature gates are not enabled.\n HVPA\nIn HVPA mode, the Shoot Kubernetes API server is scaled by the hvpa-controller. The gardenlet is creating an HVPA resource for the API server. The HVPA resource is backed by HPA and VPA both in recommend-only mode. The hvpa-controller is responsible for enabling simultaneous horizontal and vertical scaling by incorporating the recommendations from the HPA and VPA.\nThe initial API server resource requests are 500m and 1Gi. HVPA’s HPA is scaling only on CPU (average utilization 80%). 
HVPA’s VPA max allowed values are 8 CPU and 25G.\nThe API server’s min replicas count is 2, the max replicas count is 3.\nThe HVPA mode is the used autoscaling mode when the HVPA feature gate is enabled (and the VPAAndHPAForAPIServer feature gate is disabled).\n VPAAndHPA\nIn VPAAndHPA mode, the Shoot Kubernetes API server is scaled simultaneously by VPA and HPA on the same metric (CPU and memory usage). The pod-thrashing cycle between VPA and HPA scaling on the same metric is avoided by configuring the HPA to scale on average usage (not on average utilization) and by picking the target average utilization values in sync with VPA’s allowed maximums. This allows VPA to scale vertically on CPU/memory usage first. Once all Pods’ average CPU/memory usage is close to exceeding the VPA’s allowed maximum CPU/memory (the HPA’s target average utilization, 1/7 less than VPA’s allowed maximums), HPA scales horizontally (by adding a new replica).\nThe VPAAndHPA mode is introduced to address disadvantages with HVPA: additional component; modifies the deployment triggering unnecessary rollouts; vertical scaling only at max replicas; stuck vertical resource requests when scaling in again; etc.\nThe initial API server resource requests are 250m and 500Mi. VPA’s max allowed values are 7 CPU and 28G. HPA’s average target usage values are 6 CPU and 24G.\nThe API server’s min replicas count is 2, the max replicas count is 6.\nThe VPAAndHPA mode is the used autoscaling mode when the VPAAndHPAForAPIServer feature gate is enabled (takes precedence over the HVPA feature gate).\n [!NOTE] Starting with release v1.101, the VPAAndHPAForAPIServer feature gate is enabled by default.\n In all scaling modes the min replicas count of 2 is imposed by the High Availability of Shoot Control Plane Components.\nThe gardenlet sets the initial API server resource requests only when the Deployment is not found. 
When the Deployment exists, it does not overwrite the kube-apiserver container resources.\nDisabling Scale Down for Components in the Shoot Control Plane Some Shoot clusters’ control plane components can be overloaded and can have very high resource usage. The existing autoscaling solution may not cover these cases well. Scale down actions for such overloaded components could be disruptive.\nTo prevent such disruptive scale-down actions it is possible to disable scale down of the etcd, Kubernetes API server and Kubernetes controller manager in the Shoot control plane by annotating the Shoot with alpha.control-plane.scaling.shoot.gardener.cloud/scale-down-disabled=true.\nThe following specifics apply when scale-down is disabled for the Kubernetes API server component:\n In Baseline and HVPA modes the HPA’s min and max replicas count are set to 4. In VPAAndHPA mode if the HPA resource exists and HPA’s spec.minReplicas is not nil then the min replicas count is max(spec.minReplicas, status.desiredReplicas). When scale-down is disabled, this allows operators to specify a custom value for HPA spec.minReplicas and prevents gardenlet from reverting this value. I.e., HPA does scale down to min replicas but not below min replicas. HPA’s max replicas count is 6. Note: The alpha.control-plane.scaling.shoot.gardener.cloud/scale-down-disabled annotation is alpha and can be removed anytime without further notice. Only use it if you know what you are doing.\n Virtual Kubernetes API Server and Gardener API Server The virtual Kubernetes API server’s autoscaling is the same as the Shoot Kubernetes API server’s with the following differences:\n The initial API server resource requests are 600m and 512Mi in all autoscaling modes. The min replicas count is 2 for a non-HA virtual cluster and 3 for an HA virtual cluster. The max replicas count is 6. In HVPA mode, HVPA’s HPA is scaling on both CPU and memory (average utilization 80% for both). 
The Gardener API server’s autoscaling is the same as the Shoot Kubernetes API server’s with the following differences:\n The initial API server resource requests are 600m and 512Mi in all autoscaling modes. The min replicas count is 2 for a non-HA virtual cluster and 3 for an HA virtual cluster. The max replicas count is 6. In HVPA mode, HVPA’s HPA is scaling on both CPU and memory (average utilization 80% for both). In HVPA mode, HVPA’s VPA max allowed values are 4 CPU and 25G. ","categories":"","description":"","excerpt":"Overview This document describes the used autoscaling mechanism for …","ref":"/docs/gardener/autoscaling-specifics-for-components/","tags":"","title":"Autoscaling Specifics for Components"},{"body":"Azure Permissions The following document describes the required Azure actions to manage a Shoot cluster on Azure, split by the different Azure providers/services.\nBe aware that some actions are only required if particular deployment scenarios or features (e.g., bring your own vNet, use Azure-file, let the Shoot act as Seed) are used.\nMicrosoft.Compute # Required if a non-zonal cluster based on Availability Set should be used. Microsoft.Compute/availabilitySets/delete Microsoft.Compute/availabilitySets/read Microsoft.Compute/availabilitySets/write # Required to let Kubernetes manage Azure disks. Microsoft.Compute/disks/delete Microsoft.Compute/disks/read Microsoft.Compute/disks/write # Required to fetch meta information about disk and virtual machine sizes. Microsoft.Compute/locations/diskOperations/read Microsoft.Compute/locations/operations/read Microsoft.Compute/locations/vmSizes/read # Required if csi snapshot capabilities should be used and/or the Shoot should act as a Seed. Microsoft.Compute/snapshots/delete Microsoft.Compute/snapshots/read Microsoft.Compute/snapshots/write # Required to let Gardener/Machine-Controller-Manager manage the cluster nodes/machines. 
Microsoft.Compute/virtualMachines/delete Microsoft.Compute/virtualMachines/read Microsoft.Compute/virtualMachines/start/action Microsoft.Compute/virtualMachines/write # Required if a non-zonal cluster based on VMSS Flex (VMO) should be used. Microsoft.Compute/virtualMachineScaleSets/delete Microsoft.Compute/virtualMachineScaleSets/read Microsoft.Compute/virtualMachineScaleSets/write Microsoft.ManagedIdentity # Required if a user-provided Azure managed identity should be attached to the cluster nodes. Microsoft.ManagedIdentity/userAssignedIdentities/assign/action Microsoft.ManagedIdentity/userAssignedIdentities/read Microsoft.MarketplaceOrdering # Required if nodes/machines should be created with images hosted on the Azure Marketplace. Microsoft.MarketplaceOrdering/offertypes/publishers/offers/plans/agreements/read Microsoft.MarketplaceOrdering/offertypes/publishers/offers/plans/agreements/write Microsoft.Network # Required to let Kubernetes manage services of type 'LoadBalancer'. Microsoft.Network/loadBalancers/backendAddressPools/join/action Microsoft.Network/loadBalancers/delete Microsoft.Network/loadBalancers/read Microsoft.Network/loadBalancers/write # Required in case the Shoot should use NatGateway(s). Microsoft.Network/natGateways/delete Microsoft.Network/natGateways/join/action Microsoft.Network/natGateways/read Microsoft.Network/natGateways/write # Required to let Gardener/Machine-Controller-Manager manage the cluster nodes/machines. Microsoft.Network/networkInterfaces/delete Microsoft.Network/networkInterfaces/ipconfigurations/join/action Microsoft.Network/networkInterfaces/ipconfigurations/read Microsoft.Network/networkInterfaces/join/action Microsoft.Network/networkInterfaces/read Microsoft.Network/networkInterfaces/write # Required to let Gardener maintain the basic infrastructure of the Shoot cluster and maintain LoadBalancer services. 
Microsoft.Network/networkSecurityGroups/delete Microsoft.Network/networkSecurityGroups/join/action Microsoft.Network/networkSecurityGroups/read Microsoft.Network/networkSecurityGroups/write # Required for managing LoadBalancers and NatGateways. Microsoft.Network/publicIPAddresses/delete Microsoft.Network/publicIPAddresses/join/action Microsoft.Network/publicIPAddresses/read Microsoft.Network/publicIPAddresses/write # Required for managing the basic infrastructure of a cluster and maintaining LoadBalancer services. Microsoft.Network/routeTables/delete Microsoft.Network/routeTables/join/action Microsoft.Network/routeTables/read Microsoft.Network/routeTables/routes/delete Microsoft.Network/routeTables/routes/read Microsoft.Network/routeTables/routes/write Microsoft.Network/routeTables/write # Required to let Gardener maintain the basic infrastructure of the Shoot cluster. # Only a subset is required for the bring your own vNet scenario. Microsoft.Network/virtualNetworks/delete # not required for bring your own vnet Microsoft.Network/virtualNetworks/read Microsoft.Network/virtualNetworks/subnets/delete Microsoft.Network/virtualNetworks/subnets/join/action Microsoft.Network/virtualNetworks/subnets/read Microsoft.Network/virtualNetworks/subnets/write Microsoft.Network/virtualNetworks/write # not required for bring your own vnet Microsoft.Resources # Required to let Gardener maintain the basic infrastructure of the Shoot cluster. Microsoft.Resources/subscriptions/resourceGroups/delete Microsoft.Resources/subscriptions/resourceGroups/read Microsoft.Resources/subscriptions/resourceGroups/write Microsoft.Storage # Required if Azure File should be used and/or if the Shoot should act as Seed. 
Microsoft.Storage/operations/read Microsoft.Storage/storageAccounts/blobServices/containers/delete Microsoft.Storage/storageAccounts/blobServices/containers/read Microsoft.Storage/storageAccounts/blobServices/containers/write Microsoft.Storage/storageAccounts/blobServices/read Microsoft.Storage/storageAccounts/delete Microsoft.Storage/storageAccounts/listkeys/action Microsoft.Storage/storageAccounts/read Microsoft.Storage/storageAccounts/write ","categories":"","description":"","excerpt":"Azure Permissions The following document describes the required Azure …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/azure-permissions/","tags":"","title":"Azure Permissions"},{"body":"Overview Kubernetes uses etcd as the key-value store for its resource definitions. Gardener supports the backup and restore of etcd. It is the responsibility of the shoot owners to backup the workload data.\nGardener uses an etcd-backup-restore component to backup the etcd backing the Shoot cluster regularly and restore it in case of disaster. It is deployed as sidecar via etcd-druid. This doc mainly focuses on the backup and restore configuration used by Gardener when deploying these components. For more details on the design and internal implementation details, please refer to GEP-06 and the documentation on individual repositories.\nBucket Provisioning Refer to the backup bucket extension document to find out details about configuring the backup bucket.\nBackup Policy etcd-backup-restore supports full snapshot and delta snapshots over full snapshot. In Gardener, this configuration is currently hard-coded to the following parameters:\n Full Snapshot schedule: Daily, 24hr interval. For each Shoot, the schedule time in a day is randomized based on the configured Shoot maintenance window. Delta Snapshot schedule: At 5min interval. If aggregated events size since last snapshot goes beyond 100Mib. 
Backup History / Garbage backup deletion policy: Gardener configures backup restore to use the Exponential garbage collection policy. As per this policy, the following backups are retained: All full backups and delta backups for the previous hour. Latest full snapshot of each previous hour for the day. Latest full snapshot of each previous day for 7 days. Latest full snapshot of the previous 4 weeks. Garbage Collection is configured at a 12hr interval. Listing: Gardener doesn’t have any API to list out the backups. To find the backups list, an admin can check out the BackupEntry resource associated with the Shoot which holds the bucket and prefix details on the object store. Restoration The restoration process of etcd is automated through the etcd-backup-restore component from the latest snapshot. Gardener doesn’t support Point-In-Time-Recovery (PITR) of etcd. In case of an etcd disaster, the etcd is recovered from the latest backup automatically. For further details, please refer to the Restoration topic. Post restoration of etcd, the Shoot reconciliation loop brings the cluster back to its previous state.\nAgain, the Shoot owner is responsible for maintaining the backup/restore of their workload. Gardener only takes care of the cluster’s etcd.\n","categories":["Users"],"description":"Understand the etcd backup and restore capabilities of Gardener","excerpt":"Understand the etcd backup and restore capabilities of Gardener","ref":"/docs/gardener/concepts/backup-restore/","tags":"","title":"Backup and Restore"},{"body":"Contract: BackupBucket Resource The Gardener project features a sub-project called etcd-backup-restore to take periodic backups of etcd backing Shoot clusters. It demands the bucket (or its equivalent in different object store providers) to be created and configured externally with appropriate credentials. 
The BackupBucket resource takes this responsibility in Gardener.\nBefore introducing the BackupBucket extension resource, Gardener was using Terraform in order to create and manage these provider-specific resources (e.g., see AWS Backup). Now, Gardener commissions an external, provider-specific controller to take over this task. You can also refer to backupInfra proposal documentation to get an idea about how the transition was done and understand the resource in a broader scope.\nWhat Is the Scope of a Bucket? A bucket will be provisioned per Seed. So, a backup of every Shoot created on that Seed will be stored under a different shoot specific prefix under the bucket. For the backup of the Shoot rescheduled on different Seed, it will continue to use the same bucket.\nWhat Is the Lifespan of a BackupBucket? The bucket associated with BackupBucket will be created at the creation of the Seed. And as per current implementation, it will also be deleted on deletion of the Seed, if there isn’t any BackupEntry resource associated with it.\nIn the future, we plan to introduce a schedule for BackupBucket - the deletion logic for the BackupBucket resource, which will reschedule it on different available Seeds on deletion or failure of a health check for the currently associated seed. In that case, the BackupBucket will be deleted only if there isn’t any schedulable Seed available and there isn’t any associated BackupEntry resource.\nWhat Needs to Be Implemented to Support a New Infrastructure Provider? 
As part of the seed flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: BackupBucket metadata: name: foo spec: type: azure providerConfig: \u003csome-optional-provider-specific-backupbucket-configuration\u003e region: eu-west-1 secretRef: name: backupprovider namespace: shoot--foo--bar The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used to create the needed resources. This provider secret will be configured by the Gardener operator in the Seed resource and propagated over there by the seed controller.\nAfter your controller has created the required bucket, if required, it generates the secret to access the objects in the bucket and puts a reference to it in status. This secret is supposed to be used by Gardener, or eventually a BackupEntry resource and etcd-backup-restore component, to back up the etcd.\nIn order to support a new infrastructure provider, you need to write a controller that watches all BackupBuckets with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the Azure provider.\nReferences and Additional Resources BackupBucket API Reference Exemplary Implementation for the Azure Provider BackupEntry Resource Documentation Shared Bucket Proposal ","categories":"","description":"","excerpt":"Contract: BackupBucket Resource The Gardener project features a …","ref":"/docs/gardener/extensions/backupbucket/","tags":"","title":"BackupBucket"},{"body":"Contract: BackupEntry Resource The Gardener project features a sub-project called etcd-backup-restore to take periodic backups of etcd backing Shoot clusters. It requires the access credentials for the bucket (or its equivalent in different object store providers) to be created and configured externally. 
The BackupEntry resource takes this responsibility in Gardener to provide this information by creating a secret specific to the component.\nThat being said, the core motivation for introducing this resource was to support retention of backups post deletion of a Shoot. The etcd-backup-restore components take responsibility for garbage collecting old backups outside of the defined period. Once a shoot is deleted, we need to persist the backups for a few days. Hence, Gardener uses the BackupEntry resource for this housekeeping work post deletion of a Shoot. The BackupEntry resource is responsible for the shoot-specific prefix under the referred bucket.\nBefore introducing the BackupEntry extension resource, Gardener was using Terraform in order to create and manage these provider-specific resources (e.g., see AWS Backup). Now, Gardener commissions an external, provider-specific controller to take over this task. You can also refer to backupInfra proposal documentation to get an idea about how the transition was done and understand the resource in a broader scope.\nWhat Is the Lifespan of a BackupEntry? The bucket associated with BackupEntry will be created by using a BackupBucket resource. The BackupEntry resource will be created as a part of the Shoot creation. But resources might continue to exist post deletion of a Shoot (see gardenlet for more details).\nWhat Needs to be Implemented to Support a New Infrastructure Provider? 
As part of the shoot flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: BackupEntry metadata: name: shoot--foo--bar spec: type: azure providerConfig: \u003csome-optional-provider-specific-backup-bucket-configuration\u003e backupBucketProviderStatus: \u003csome-optional-provider-specific-backup-bucket-status\u003e region: eu-west-1 bucketName: foo secretRef: name: backupprovider namespace: shoot--foo--bar The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used to create the needed resources. This provider secret will be propagated from the BackupBucket resource by the shoot controller.\nYour controller is supposed to create the etcd-backup secret in the control plane namespace of a shoot. This secret is supposed to be used by Gardener or eventually by the etcd-backup-restore component to back up the etcd. The controller implementation should clean up the objects created under the shoot specific prefix in the bucket equivalent to the name of the BackupEntry resource.\nIn order to support a new infrastructure provider, you need to write a controller that watches all the BackupEntry resources with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the Azure provider.\nReferences and Additional Resources BackupEntry API Reference Exemplary Implementation for the Azure Provider BackupBucket Resource Documentation Shared Bucket Proposal Gardener-controller-manager-component-config API Specification ","categories":"","description":"","excerpt":"Contract: BackupEntry Resource The Gardener project features a …","ref":"/docs/gardener/extensions/backupentry/","tags":"","title":"BackupEntry"},{"body":"Contract: Bastion Resource The Gardener project allows users to connect to Shoot worker nodes via SSH. 
As nodes are usually firewalled and not directly accessible from the public internet, GEP-15 introduced the concept of “Bastions”. A bastion is a dedicated server that only serves to allow SSH ingress to the worker nodes.\nBastion resources contain the user’s public SSH key and IP address, in order to provision the server accordingly: The public key is put onto the Bastion and SSH ingress is only authorized for the given IP address (in fact, it’s not a single IP address, but a set of IP ranges; however, for most purposes a single IP is used).\nWhat Is the Lifespan of a Bastion? Once a Bastion has been created in the garden, it will be replicated to the appropriate seed cluster, where a controller then reconciles a server and firewall rules etc., on the cloud provider used by the target Shoot. When the Bastion is ready (i.e. has a public IP), that IP is stored in the Bastion’s status and from there it is picked up by the garden cluster and gardenctl eventually.\nTo make multiple SSH sessions possible, the existence of the Bastion is not directly tied to the execution of gardenctl: users can exit out of gardenctl and use ssh manually to connect to the bastion and worker nodes.\nHowever, Bastions have an expiry date, after which they will be garbage collected.\nWhen SSH access is set to false for the Shoot in the workers settings (see Shoot Worker Nodes Settings), Bastion resources are deleted during Shoot reconciliation and new Bastions are prevented from being created.\nWhat Needs to Be Implemented to Support a New Infrastructure Provider? 
As part of the shoot flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Bastion metadata: name: mybastion namespace: shoot--foo--bar spec: type: aws # userData is base64-encoded cloud provider user data; this contains the # user's SSH key userData: IyEvYmluL2Jhc2ggL....Nlcgo= ingress: - ipBlock: cidr: 192.88.99.0/32 # this is most likely the user's IP address Your controller is supposed to create a new instance at the given cloud provider, firewall it to only allow SSH (TCP port 22) from the given IP blocks, and then configure the firewall for the worker nodes to allow SSH from the bastion instance. When a Bastion is deleted, all these changes need to be reverted.\nImplementation Details ConfigValidator Interface For bastion controllers, the generic Reconciler also delegates to a ConfigValidator interface that contains a single Validate method. This method is called by the generic Reconciler at the beginning of every reconciliation, and can be implemented by the extension to validate the .spec.providerConfig part of the Bastion resource with the respective cloud provider, typically the existence and validity of cloud provider resources such as VPCs, images, etc.\nThe Validate method returns a list of errors. If this list is non-empty, the generic Reconciler will fail with an error. 
This error will have the error code ERR_CONFIGURATION_PROBLEM, unless there is at least one error in the list that has its ErrorType field set to field.ErrorTypeInternal.\nReferences and Additional Resources Bastion API Reference Exemplary Implementation for the AWS Provider GEP-15 ","categories":"","description":"","excerpt":"Contract: Bastion Resource The Gardener project allows users to …","ref":"/docs/gardener/extensions/bastion/","tags":"","title":"Bastion"},{"body":"CA Rotation in Extensions GEP-18 proposes adding support for automated rotation of Shoot cluster certificate authorities (CAs). This document outlines all the requirements that Gardener extensions need to fulfill in order to support the CA rotation feature.\nRequirements for Shoot Cluster CA Rotation Extensions must not rely on static CA Secret names managed by the gardenlet, because their names are changing during CA rotation. Extensions cannot issue or use client certificates for authenticating against shoot API servers. Instead, they should use short-lived auto-rotated ServiceAccount tokens via gardener-resource-manager’s TokenRequestor. Also see Conventions and TokenRequestor documents. Extensions need to generate dedicated CAs for signing server certificates (e.g. cloud-controller-manager). There should be one CA per controller and purpose in order to bind the lifecycle to the reconciliation cycle of the respective object for which it is created. CAs managed by extensions should be rotated in lock-step with the shoot cluster CA. When the user triggers a rotation, the gardenlet writes phase and initiation time to Shoot.status.credentials.rotation.certificateAuthorities.{phase,lastInitiationTime}. See GEP-18 for a detailed description on what needs to happen in each phase. Extensions can retrieve this information from Cluster.shoot.status. 
Utilities for Secrets Management In order to fulfill the requirements listed above, extension controllers can reuse the SecretsManager that the gardenlet uses to manage all shoot cluster CAs, certificates, and other secrets as well. It implements the core logic for managing secrets that need to be rotated, auto-renewed, etc.\nAdditionally, there are utilities for reusing SecretsManager in extension controllers. They already implement the above requirements based on the Cluster resource and allow focusing on the extension controllers’ business logic.\nFor example, a simple SecretsManager usage in an extension controller could look like this:\nconst ( // identity for SecretsManager instance in ControlPlane controller identity = \"provider-foo-controlplane\" // secret config name of the dedicated CA caControlPlaneName = \"ca-provider-foo-controlplane\" ) func Reconcile() { var ( cluster *extensionscontroller.Cluster client client.Client // define wanted secrets with options secretConfigs = []extensionssecretsmanager.SecretConfigWithOptions{ { // dedicated CA for ControlPlane controller Config: \u0026secretutils.CertificateSecretConfig{ Name: caControlPlaneName, CommonName: \"ca-provider-foo-controlplane\", CertType: secretutils.CACert, }, // persist CA so that it gets restored on control plane migration Options: []secretsmanager.GenerateOption{secretsmanager.Persist()}, }, { // server cert for control plane component Config: \u0026secretutils.CertificateSecretConfig{ Name: \"cloud-controller-manager\", CommonName: \"cloud-controller-manager\", DNSNames: kutil.DNSNamesForService(\"cloud-controller-manager\", namespace), CertType: secretutils.ServerCert, }, // sign with our dedicated CA Options: []secretsmanager.GenerateOption{secretsmanager.SignedByCA(caControlPlaneName)}, }, } ) // initialize SecretsManager based on Cluster object sm, err := extensionssecretsmanager.SecretsManagerForCluster(ctx, logger.WithName(\"secretsmanager\"), clock.RealClock{}, client, cluster, 
identity, secretConfigs) // generate all wanted secrets (first CAs, then the rest) secrets, err := extensionssecretsmanager.GenerateAllSecrets(ctx, sm, secretConfigs) // cleanup any secrets that are not needed any more (e.g. after rotation) err = sm.Cleanup(ctx) } Please pay attention to the following points:\n There should be one SecretsManager identity per controller (and purpose if applicable) in order to prevent conflicts between different instances. E.g., there should be different identities for Infrastructure, Worker controller, etc., and the ControlPlane controller should use dedicated SecretsManager identities per purpose (e.g. provider-foo-controlplane and provider-foo-controlplane-exposure). All other points in Reusing the SecretsManager in Other Components. ","categories":"","description":"","excerpt":"CA Rotation in Extensions GEP-18 proposes adding support for automated …","ref":"/docs/gardener/extensions/ca-rotation/","tags":"","title":"CA Rotation"},{"body":"Gardener Extension for Calico Networking \nThis controller operates on the Network resource in the extensions.gardener.cloud/v1alpha1 API group. It manages those objects that are requesting Calico Networking configuration (.spec.type=calico):\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Network metadata: name: calico-network namespace: shoot--core--test-01 spec: type: calico clusterCIDR: 192.168.0.0/24 serviceCIDR: 10.96.0.0/24 providerConfig: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig overlay: enabled: false Please find a concrete example in the example folder. All the Calico specific configuration should be configured in the providerConfig section. 
If additional configuration is required, it should be added to the networking-calico chart in controllers/networking-calico/charts/internal/calico/values.yaml and corresponding code parts should be adapted (for example in controllers/networking-calico/pkg/charts/utils.go).\nOnce the network resource is applied, the networking-calico controller would then create all the necessary managed-resources which should be picked up by the gardener-resource-manager which will then apply all the network extensions resources to the shoot cluster.\nFinally after successful reconciliation an output similar to the one below should be expected.\n status: lastOperation: description: Successfully reconciled network lastUpdateTime: \"...\" progress: 100 state: Succeeded type: Reconcile observedGeneration: 1 providerStatus: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkStatus Compatibility The following table lists known compatibility issues of this extension controller with other Gardener components.\n Calico Extension Gardener Action Notes \u003e= v1.30.0 \u003c v1.63.0 Please first update Gardener components to \u003e= v1.63.0. Without the mentioned minimum Gardener version, Calico Pods are not scheduled exclusively to dedicated system component nodes in the shoot cluster. How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig pointed to the cluster you want to connect to. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for the Calico CNI network plugin","excerpt":"Gardener extension controller for the Calico CNI network plugin","ref":"/docs/extensions/network-extensions/gardener-extension-networking-calico/","tags":"","title":"Calico CNI"},{"body":"While it is possible, we highly recommend not to use privileged containers in your productive environment.\n","categories":"","description":"","excerpt":"While it is possible, we highly recommend not to use privileged …","ref":"/docs/faq/privileged-containers/","tags":"","title":"Can I run privileged containers?"},{"body":"There is no automatic migration of major/minor versions of Kubernetes. You need to update your clusters manually or press the Upgrade button in the Dashboard.\nBefore updating a cluster you should be aware of the potential errors this might cause. The following video will dive into a Kubernetes outage in production that Monzo experienced, its causes and effects, and the architectural and operational lessons learned.\n It is therefore recommended to first update your test cluster and validate it before performing changes on a productive environment.\n","categories":"","description":"","excerpt":"There is no automatic migration of major/minor versions of Kubernetes. …","ref":"/docs/faq/automatic-upgrade/","tags":"","title":"Can Kubernetes upgrade automatically?"},{"body":"Backing up your Kubernetes cluster is possible through the use of specialized software like Velero. 
Velero consists of a server side component and a client tool that allow you to backup or restore all objects in your cluster, as well as the cluster resources and persistent volumes.\n","categories":"","description":"","excerpt":"Backing up your Kubernetes cluster is possible through the use of …","ref":"/docs/faq/backup/","tags":"","title":"Can you backup your Kubernetes cluster resources?"},{"body":"The migration of clusters or content from one cluster to another is out of scope for the Gardener project. For such scenarios you may consider using tools like Velero.\n","categories":"","description":"","excerpt":"The migration of clusters or content from one cluster to another is …","ref":"/docs/faq/automatic-migrate/","tags":"","title":"Can you migrate the content of one cluster to another cluster?"},{"body":"Changing the API This document describes the steps that need to be performed when changing the API. It provides guidance for API changes to both (Gardener system in general or component configurations).\nGenerally, as Gardener is a Kubernetes-native extension, it follows the same API conventions and guidelines like Kubernetes itself. The Kubernetes API Conventions as well as Changing the API topics already provide a good overview and general explanation of the basic concepts behind it. We are following the same approaches.\nGardener API The Gardener API is defined in the pkg/apis/{core,extensions,settings} directories and is the main point of interaction with the system. It must be ensured that the API is always backwards-compatible.\nChanging the API Checklist when changing the API:\n Modify the field(s) in the respective Golang files of all external versions and the internal version. Make sure new fields are being added as “optional” fields, i.e., they are of pointer types, they have the // +optional comment, and they have the omitempty JSON tag. Make sure that the existing field numbers in the protobuf tags are not changed. 
Do not copy protobuf tags from other fields but create them with make generate WHAT=\"protobuf\". If necessary, implement/adapt the conversion logic defined in the versioned APIs (e.g., pkg/apis/core/v1beta1/conversions*.go). If necessary, implement/adapt defaulting logic defined in the versioned APIs (e.g., pkg/apis/core/v1beta1/defaults*.go). Run the code generation: make generate If necessary, implement/adapt validation logic defined in the internal API (e.g., pkg/apis/core/validation/validation*.go). If necessary, adapt the exemplary YAML manifests of the Gardener resources defined in example/*.yaml. In most cases, it makes sense to add/adapt the documentation for administrators/operators and/or end-users in the docs folder to provide information on purpose and usage of the added/changed fields. When opening the pull request, always add a release note so that end-users become aware of the changes. Removing a Field If fields shall be removed permanently from the API, then a proper deprecation period must be adhered to so that end-users have enough time to adapt their clients.\nOnce the deprecation period is over, the field should be dropped from the API in a two-step process, i.e., in two release cycles. In the first step, all the usages in the code base should be dropped. In the second step, the field should be dropped from API. We need to follow this two-step process because there can be a case where gardener-apiserver is upgraded to a new version in which the field has been removed but other controllers are still on the old version of Gardener. This can lead to nil pointer exceptions or other unexpected behaviour.\nThe steps for removing a field from the code base are:\n The field in the external version(s) has to be commented out with appropriate doc string that the protobuf number of the corresponding field is reserved. 
Example:\n-\tSeedTemplate *gardencorev1beta1.SeedTemplate `json:\"seedTemplate,omitempty\" protobuf:\"bytes,2,opt,name=seedTemplate\"` +\t// SeedTemplate is tombstoned to show why 2 is reserved protobuf tag. +\t// SeedTemplate *gardencorev1beta1.SeedTemplate `json:\"seedTemplate,omitempty\" protobuf:\"bytes,2,opt,name=seedTemplate\"` The reasoning behind this is to prevent the same protobuf number being used by a new field. Introducing a new field with the same protobuf number would be a breaking change for clients still using the old protobuf definitions that have the old field for the given protobuf number. The field in the internal version can be removed.\n A unit test has to be added to make sure that a new field does not reuse the already reserved protobuf tag.\n Example of field removal can be found in the Remove seedTemplate field from ManagedSeed API PR.\nComponent Configuration APIs Most Gardener components have a component configuration that follows similar principles to the Gardener API. Those component configurations are defined in pkg/{controllermanager,gardenlet,scheduler},pkg/apis/config. Hence, the above checklist also applies for changes to those APIs. However, since these APIs are only used internally and only during the deployment of Gardener, the guidelines with respect to changes and backwards-compatibility are slightly relaxed. If necessary, it is allowed to remove fields without a proper deprecation period if the release note uses the breaking operator keywords.\nIn addition to the above checklist:\n If necessary, then adapt the Helm chart of Gardener defined in charts/gardener. Adapt the values.yaml file as well as the manifest templates. ","categories":"","description":"","excerpt":"Changing the API This document describes the steps that need to be …","ref":"/docs/gardener/changing-the-api/","tags":"","title":"Changing the API"},{"body":"CI/CD As an execution environment for CI/CD workloads, we use Concourse. 
We however abstract from the underlying “build executor” and instead offer a Pipeline Definition Contract, through which components declare their build pipelines as required.\nIn order to run continuous delivery workloads for all components contributing to the Gardener project, we operate a central service.\nTypical workloads encompass the execution of tests and builds of a variety of technologies, as well as building and publishing container images, typically containing build results.\nWe are building our CI/CD offering around some principles:\n container-native - each workload is executed within a container environment. Components may customise used container images automation - pipelines are generated without manual interaction self-service - components customise their pipelines by changing their sources standardisation Learn more on our: Build Pipeline Reference Manual\n","categories":"","description":"","excerpt":"CI/CD As an execution environment for CI/CD workloads, we use …","ref":"/docs/contribute/code/cicd/","tags":"","title":"CI/CD"},{"body":"Gardener Extension for cilium Networking \nThis controller operates on the Network resource in the extensions.gardener.cloud/v1alpha1 API group. It manages those objects that are requesting cilium Networking configuration (.spec.type=cilium):\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Network metadata: name: cilium-network namespace: shoot--foo--bar spec: type: cilium podCIDR: 10.244.0.0/16 serviceCIDR: 10.96.0.0/24 providerConfig: apiVersion: cilium.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig # hubble: # enabled: true # store: kubernetes Please find a concrete example in the example folder. All the cilium specific configuration should be configured in the providerConfig section. 
If additional configuration is required, it should be added to the networking-cilium chart in controllers/networking-cilium/charts/internal/cilium/values.yaml and corresponding code parts should be adapted (for example in controllers/networking-cilium/pkg/charts/utils.go).\nOnce the network resource is applied, the networking-cilium controller would then create all the necessary managed-resources which should be picked up by the gardener-resource-manager which will then apply all the network extensions resources to the shoot cluster.\nFinally after successful reconciliation an output similar to the one below should be expected.\n status: lastOperation: description: Successfully reconciled network lastUpdateTime: \"...\" progress: 100 state: Succeeded type: Reconcile observedGeneration: 1 How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig pointed to the cluster you want to connect to. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation Docs for cilium user ","categories":"","description":"Gardener extension controller for the Cilium CNI network plugin","excerpt":"Gardener extension controller for the Cilium CNI network plugin","ref":"/docs/extensions/network-extensions/gardener-extension-networking-cilium/","tags":"","title":"Cilium CNI"},{"body":"Cleanup of Shoot Clusters in Deletion When a shoot cluster is deleted then Gardener tries to gracefully remove most of the Kubernetes resources inside the cluster. This prevents any infrastructure or other artifacts from remaining after the shoot deletion.\nThe cleanup is performed in four steps. Some resources are deleted with a grace period, and all resources are forcefully deleted (by removing blocking finalizers) after some time to not block the cluster deletion entirely.\nCleanup steps:\n All ValidatingWebhookConfigurations and MutatingWebhookConfigurations are deleted with a 5m grace period. Forceful finalization happens after 5m. All APIServices and CustomResourceDefinitions are deleted with a 5m grace period. Forceful finalization happens after 1h. All CronJobs, DaemonSets, Deployments, Ingresss, Jobs, Pods, ReplicaSets, ReplicationControllers, Services, StatefulSets, PersistentVolumeClaims are deleted with a 5m grace period. Forceful finalization happens after 5m. If the Shoot is annotated with shoot.gardener.cloud/skip-cleanup=true, then only Services and PersistentVolumeClaims are considered.\n All VolumeSnapshots and VolumeSnapshotContents are deleted with a 5m grace period. Forceful finalization happens after 1h. 
It is possible to override the finalization grace periods via annotations on the Shoot:\n shoot.gardener.cloud/cleanup-webhooks-finalize-grace-period-seconds (for the resources handled in step 1) shoot.gardener.cloud/cleanup-extended-apis-finalize-grace-period-seconds (for the resources handled in step 2) shoot.gardener.cloud/cleanup-kubernetes-resources-finalize-grace-period-seconds (for the resources handled in step 3) ⚠️ If \"0\" is provided, then all resources are finalized immediately without waiting for any graceful deletion. Please be aware that this might lead to orphaned infrastructure artifacts.\n","categories":"","description":"","excerpt":"Cleanup of Shoot Clusters in Deletion When a shoot cluster is deleted …","ref":"/docs/gardener/shoot_cleanup/","tags":"","title":"Cleanup of Shoot Clusters in Deletion"},{"body":"CLI Flags Etcd-druid exposes the following CLI flags that allow for configuring its behavior.\n CLI Flag Component Description Default feature-gates etcd-druid A set of key=value pairs that describe feature gates for alpha/experimental features. Please check feature-gates for more information. \"\" metrics-bind-address controller-manager The IP address that the metrics endpoint binds to. \"\" metrics-port controller-manager The port used for the metrics endpoint. 8080 metrics-addr controller-manager The fully qualified address:port that the metrics endpoint binds to.\nDeprecated: this field will be eventually removed. Please use --metrics-bind-address and --metrics-port instead. \":8080\" webhook-server-bind-address controller-manager The IP address on which to listen for the HTTPS webhook server. \"\" webhook-server-port controller-manager The port on which to listen for the HTTPS webhook server. 9443 webhook-server-tls-server-cert-dir controller-manager The path to a directory containing the server’s TLS certificate and key (the files must be named tls.crt and tls.key respectively). 
\"/etc/webhook-server-tls\" enable-leader-election controller-manager Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager. false leader-election-id controller-manager Name of the resource that leader election will use for holding the leader lock. \"druid-leader-election\" leader-election-resource-lock controller-manager Specifies which resource type to use for leader election. Supported options are ’endpoints’, ‘configmaps’, ’leases’, ’endpointsleases’ and ‘configmapsleases’.\nDeprecated. Will be removed in the future in favour of using only leases as the leader election resource lock for the controller manager. \"leases\" disable-lease-cache controller-manager Disable cache for lease.coordination.k8s.io resources. false etcd-workers etcd-controller Number of workers spawned for concurrent reconciles of etcd spec and status changes. If not specified then default of 3 is assumed. 3 ignore-operation-annotation etcd-controller Specifies whether to ignore or honour the annotation gardener.cloud/operation: reconcile on resources to be reconciled.\nDeprecated: please use --enable-etcd-spec-auto-reconcile instead. false enable-etcd-spec-auto-reconcile etcd-controller If true then automatically reconciles Etcd Spec. If false, waits for explicit annotation gardener.cloud/operation: reconcile to be placed on the Etcd resource to trigger reconcile. false disable-etcd-serviceaccount-automount etcd-controller If true then .automountServiceAccountToken will be set to false for the ServiceAccount created for etcd StatefulSets. false etcd-status-sync-period etcd-controller Period after which an etcd status sync will be attempted. 15s etcd-member-notready-threshold etcd-controller Threshold after which an etcd member is considered not ready if the status was unknown before. 5m etcd-member-unknown-threshold etcd-controller Threshold after which an etcd member is considered unknown. 
1m enable-backup-compaction compaction-controller Enable automatic compaction of etcd backups. false compaction-workers compaction-controller Number of worker threads of the CompactionJob controller. The controller creates a backup compaction job if a certain etcd event threshold is reached. If compaction is enabled, the value for this flag must be greater than zero. 3 etcd-events-threshold compaction-controller Total number of etcd events that can be allowed before a backup compaction job is triggered. 1000000 active-deadline-duration compaction-controller Duration after which a running backup compaction job will be terminated. 3h metrics-scrape-wait-duration compaction-controller Duration to wait for after compaction job is completed, to allow Prometheus metrics to be scraped. 0s etcd-copy-backups-task-workers etcdcopybackupstask-controller Number of worker threads for the etcdcopybackupstask controller. 3 secret-workers secret-controller Number of worker threads for the secrets controller. 10 enable-etcd-components-webhook etcdcomponents-webhook Enable EtcdComponents Webhook to prevent unintended changes to resources managed by etcd-druid. false reconciler-service-account etcdcomponents-webhook The fully qualified name of the service account used by etcd-druid for reconciling etcd resources. If unspecified, the default service account mounted for etcd-druid will be used. \u003cetcd-druid-service-account\u003e etcd-components-exempt-service-accounts etcdcomponents-webhook The comma-separated list of fully qualified names of service accounts that are exempt from EtcdComponents Webhook checks. 
\"\" ","categories":"","description":"","excerpt":"CLI Flags Etcd-druid exposes the following CLI flags that allow for …","ref":"/docs/other-components/etcd-druid/deployment/cli-flags/","tags":"","title":"Cli Flags"},{"body":"Cluster Resource As part of the extensibility epic, a lot of responsibility that was previously taken over by Gardener directly has now been shifted to extension controllers running in the seed clusters. These extensions often serve a well-defined purpose, e.g. the management of DNS records, infrastructure, etc. We have introduced a couple of extension CRDs in the seeds whose specification is written by Gardener, and which are acted up by the extensions.\nHowever, the extensions sometimes require more information that is not directly part of the specification. One example of that is the GCP infrastructure controller which needs to know the shoot’s pod and service network. Another example is the Azure infrastructure controller which requires some information out of the CloudProfile resource. The problem is that Gardener does not know which extension requires which information so that it can write it into their specific CRDs.\nIn order to deal with this problem we have introduced the Cluster extension resource. This CRD is written into the seeds, however, it does not contain a status, so it is not expected that something acts upon it. Instead, you can treat it like a ConfigMap which contains data that might be interesting for you. In the context of Gardener, seeds and shoots, and extensibility the Cluster resource contains the CloudProfile, Seed, and Shoot manifest. Extension controllers can take whatever information they want out of it that might help completing their individual tasks.\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Cluster metadata: name: shoot--foo--bar spec: cloudProfile: apiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile ... seed: apiVersion: core.gardener.cloud/v1beta1 kind: Seed ... 
shoot: apiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... The resource is written by Gardener before it starts the reconciliation flow of the shoot.\n⚠️ All Gardener components use the core.gardener.cloud/v1beta1 version, i.e., the Cluster resource will contain the objects in this version.\nImportant Information that Should Be Taken into Account There are some fields in the Shoot specification that might be interesting to take into account.\n .spec.hibernation.enabled={true,false}: Extension controllers might want to behave differently if the shoot is hibernated or not (probably they might want to scale down their control plane components, for example). .status.lastOperation.state=Failed: If Gardener sets the shoot’s last operation state to Failed, it means that Gardener won’t automatically retry to finish the reconciliation/deletion flow because an error occurred that could not be resolved within the last 24h (default). In this case, end-users are expected to manually re-trigger the reconciliation flow in case they want Gardener to try again. Extension controllers are expected to follow the same principle. This means they have to read the shoot state out of the Cluster resource. Extension Resources Not Associated with a Shoot In some cases, Gardener may create extension resources that are not associated with a shoot, but are needed to support some functionality internal to Gardener. Such resources will be created in the garden namespace of a seed cluster.\nFor example, if the managed ingress controller is active on the seed, Gardener will create a DNSRecord resource(s) in the garden namespace of the seed cluster for the ingress DNS record.\nExtension controllers that may be expected to reconcile extension resources in the garden namespace should make sure that they can tolerate the absence of a cluster resource. 
This means that they should not attempt to read the cluster resource in such cases, or if they do they should ignore the “not found” error.\nReferences and Additional Resources Cluster API (Golang Specification) ","categories":"","description":"","excerpt":"Cluster Resource As part of the extensibility epic, a lot of …","ref":"/docs/gardener/extensions/cluster/","tags":"","title":"Cluster"},{"body":"Relation Between Gardener API and Cluster API (SIG Cluster Lifecycle) In essence, the Cluster API harmonizes how to get to clusters, while Gardener goes one step further and also harmonizes the clusters themselves. The Cluster API delegates the specifics to so-called providers for infrastructures or control planes via specific CR(D)s, while Gardener only has one cluster CR(D). Different Cluster API providers, e.g. for AWS, Azure, GCP, etc., give you vastly different Kubernetes clusters. In contrast, Gardener gives you the exact same clusters with the exact same K8s version, operating system, control plane configuration like for API server or kubelet, add-ons like overlay network, HPA/VPA, DNS and certificate controllers, ingress and network policy controllers, control plane monitoring and logging stacks, down to the behavior of update procedures, auto-scaling, self-healing, etc., on all supported infrastructures. These homogeneous clusters are an essential goal for Gardener, as its main purpose is to simplify operations for teams that need to develop and ship software on Kubernetes clusters on a plethora of infrastructures (a.k.a. multi-cloud).\nIncidentally, Gardener influenced the Machine API in the Cluster API with its Machine Controller Manager and was the first to adopt it. You can find more information on that in the joint SIG Cluster Lifecycle KubeCon talk where @hardikdr from our Gardener team in India spoke.\nThat means that we follow the Cluster API with great interest and are active members. It was completely overhauled from v1alpha1 to v1alpha2. 
But because v1alpha2 made too many assumptions about the bring-up of masters and was enforcing master machine operations (for more information, see The Cluster API Book: “As of v1alpha2, Machine-Based is the only control plane type that Cluster API supports”), services that managed their control planes differently like GKE or Gardener couldn’t adopt it. In 2020 v1alpha3 was introduced and made it possible (again) to integrate managed services like GKE or Gardener. The mapping from the Gardener API to the Cluster API is mostly syntactic.\nTo wrap it up, while the Cluster API knows about clusters, it doesn’t know about their make-up. With Gardener, we wanted to go beyond that and harmonize the make-up of the clusters themselves and make them homogeneous across all supported infrastructures. Gardener can therefore deliver homogeneous clusters with exactly the same configuration and behavior on all infrastructures (see also Gardener’s coverage in the official conformance test grid).\nWith Cluster API v1alpha3 and the support for declarative control plane management, it has become possible (again) to enable Kubernetes managed services like GKE or Gardener. We would be more than happy if the community were interested in contributing a Gardener control plane provider.\n","categories":["Users"],"description":"Understand the evolution of the Gardener API and its relation to the Cluster API","excerpt":"Understand the evolution of the Gardener API and its relation to the …","ref":"/docs/gardener/concepts/cluster-api/","tags":"","title":"Cluster API"},{"body":"Community Calls Join our community calls to connect with other Gardener enthusiasts and watch cool presentations.\nWhat content can you expect?\n Gardener core developers roll out new information, share knowledge with the members and demonstrate new service capabilities. Adopters and contributors share their use-cases, experience and exchange on future requirements. 
If you want to receive updates, sign up here:\n Gardener Google Group The recordings are published on the Gardener Project YouTube channel. Topic Speaker Date and Time Link Get more computing power in Gardener by overcoming Kubelet limitations with CRI-resource-manager Pawel Palucki, Alexander D. Kanevskiy October 20, 2022 Recording Summary Cilium / Isovalent Presentation Raymond de Jong October 6, 2022 Recording Summary Gardener Extension Development - From scratch to the gardener-extension-shoot-flux Jens Schneider, Lothar Gesslein June 9, 2022 Recording Summary Deploying and Developing Gardener Locally (Without Any External Infrastructure!) Tim Ebert, Rafael Franzke March 17, 2022 Recording Summary Gardenctl-v2 Holger Kosser, Lukas Gross, Peter Sutter February 17, 2022 Recording Summary Google Calendar\n Presenting a Topic If there is a topic you would like to present, message us in our #gardener slack channel or get in touch with Jessica Katz.\n","categories":"","description":"","excerpt":"Community Calls Join our community calls to connect with other …","ref":"/community/","tags":"","title":"Community"},{"body":"Checklist For Adding New Components Adding new components that run in the garden, seed, or shoot cluster is theoretically quite simple - we just need a Deployment (or other similar workload resource), the respective container image, and maybe a bit of configuration. In practice, however, there are a couple of things to keep in mind in order to make the deployment production-ready. This document provides a checklist for them that you can walk through.\nGeneral Avoid usage of Helm charts (example)\nNowadays, we use Golang components instead of Helm charts for deploying components to a cluster. Please find a typical structure of such components in the provided metrics_server.go file (configuration values are typically managed in a Values structure). 
There are a few exceptions (e.g., Istio) still using charts, however the default should be using a Golang-based implementation. For the exceptional cases, use Golang’s embed package to embed the Helm chart directory (example 1, example 2).\n Choose the proper deployment way (example 1 (direct application w/ client), example 2 (using ManagedResource), example 3 (mixed scenario))\nFor historic reasons, resources related to shoot control plane components are applied directly with the client. All other resources (seed or shoot system components) are deployed via gardener-resource-manager’s Resource controller (ManagedResources) since it performs health checks out-of-the-box and has a lot of other features (see its documentation for more information). Components that can run as both seed system component or shoot control plane component (e.g., VPA or kube-state-metrics) can make use of these utility functions.\n Use unique ConfigMaps/Secrets (example 1, example 2)\nUnique ConfigMaps/Secrets are immutable for modification and have a unique name. This has a couple of benefits, e.g. the kubelet doesn’t watch these resources, and it is always clear which resource contains which data since it cannot be changed. As a consequence, unique/immutable ConfigMaps/Secret are superior to checksum annotations on the pod templates. Stale/unused ConfigMaps/Secrets are garbage-collected by gardener-resource-manager’s GarbageCollector. There are utility functions (see examples above) for using unique ConfigMaps/Secrets in Golang components. It is essential to inject the annotations into the workload resource to make the garbage-collection work.\nNote that some ConfigMaps/Secrets should not be unique (e.g., those containing monitoring or logging configuration). The reason is that the old revision stays in the cluster even if unused until the garbage-collector acts. 
During this time, they would be wrongly aggregated to the full configuration.\n Manage certificates/secrets via secrets manager (example)\nYou should use the secrets manager for the management of any kind of credentials. This makes sure that credentials rotation works out-of-the-box without you having to think about it. Generally, do not use client certificates (see the Security section).\n Consider hibernation when calculating replica count (example)\nShoot clusters can be hibernated meaning that all control plane components in the shoot namespace in the seed cluster are scaled down to zero and all worker nodes are terminated. If your component runs in the seed cluster then you have to consider this case and provide the proper replica count. There is a utility function available (see example).\n Ensure task dependencies are as precise as possible in shoot flows (example 1, example 2)\nOnly define the minimum of needed dependency tasks in the shoot reconciliation/deletion flows.\n Handle shoot system components\nShoot system components deployed by gardener-resource-manager are labelled with resource.gardener.cloud/managed-by: gardener. This makes Gardener add the required label selectors and tolerations so that non-DaemonSet managed Pods will exclusively run on selected nodes (for more information, see System Components Webhook). DaemonSets, on the other hand, should generally tolerate any NoSchedule or NoExecute taints so that they can run on any Node, regardless of user added taints.\n Images Do not hard-code container image references (example 1, example 2, example 3)\nWe define all image references centrally in the imagevector/containers.yaml file. 
Hence, the image references must not be hard-coded in the pod template spec but read from this so-called image vector instead.\n Do not use container images from registries that don’t support IPv6 (example: image vector, prow configuration)\nRegistries such as ECR, GHCR (ghcr.io), MCR (mcr.microsoft.com) don’t support pulling images over IPv6.\nCheck if the upstream image is also maintained in a registry that supports IPv6 natively such as Artifact Registry, Quay (quay.io). If there is such an image, use the image from the registry with IPv6 support.\nIf the image is not available in a registry with IPv6 support, then copy the image to the gardener GCR. There is a prow job copying images that are needed in gardener components from a source registry to the gardener GCR under the prefix europe-docker.pkg.dev/gardener-project/releases/3rd/ (see the documentation or gardener/ci-infra#619).\nIf you want to use a new image from a registry without IPv6 support or upgrade an already used image to a newer tag, please open a PR to the ci-infra repository that modifies the job’s list of images to copy: images.yaml.\n Do not use container images from Docker Hub (example: image vector, prow configuration)\nThere is a strict rate-limit that applies to the Docker Hub registry. As described in 2., use another registry (if possible) or copy the image to the gardener GCR.\n Do not use Shoot container images that are not multi-arch\nGardener supports Shoot clusters with both amd64 and arm64 based worker Nodes. amd64 container images cannot run on arm64 worker Nodes and vice-versa.\n Security Use a dedicated ServiceAccount and disable auto-mount (example)\nComponents that need to talk to the API server of their runtime cluster must always use a dedicated ServiceAccount (do not use default), with automountServiceAccountToken set to false. 
This makes gardener-resource-manager’s TokenInvalidator invalidate the static token secret and its ProjectedTokenMount webhook inject a projected token automatically.\n Use shoot access tokens instead of client certificates (example)\nFor components that need to talk to a target cluster different from their runtime cluster (e.g., running in seed cluster but talking to shoot) the gardener-resource-manager’s TokenRequestor should be used to manage a so-called “shoot access token”.\n Define RBAC roles with minimal privileges (example)\nThe component’s ServiceAccount (if it exists) should have as few privileges as possible. Consequently, please define proper RBAC roles for it. This might include a combination of ClusterRoles and Roles. Please do not provide elevated privileges due to laziness (e.g., because there is already a ClusterRole that can be extended vs. creating a Role only when access to a single namespace is needed).\n Use NetworkPolicys to restrict network traffic\nYou should restrict both ingress and egress traffic to/from your component as much as possible to ensure that it only gets access to/from other components if really needed. Gardener provides a few default policies for typical usage scenarios. For more information, see NetworkPolicys In Garden, Seed, Shoot Clusters.\n Do not run containers in privileged mode (example, example 2)\nAvoid running containers with privileged=true. Instead, define the needed Linux capabilities.\n Do not run containers as root (example)\nAvoid running containers as root. Usually, components such as Kubernetes controllers and admission webhook servers don’t need root user capabilities to do their jobs.\nThe problem with running as root starts with how the container is first built. Unless a non-privileged user is configured in the Dockerfile, container build systems by default set up the container with the root user. 
Add a non-privileged user to your Dockerfile or use a base image with a non-root user (for example the nonroot images from distroless such as gcr.io/distroless/static-debian12:nonroot).\nIf the image is an upstream one, then consider configuring a securityContext for the container/Pod with a non-privileged user. For more information, see Configure a Security Context for a Pod or Container.\n Choose the proper Seccomp profile (example 1, example 2)\nFor components deployed in the Seed cluster, the Seccomp profile will be defaulted to RuntimeDefault by gardener-resource-manager’s SeccompProfile webhook which works well for the majority of components. However, in some special cases you might need to overwrite it.\nThe gardener-resource-manager’s SeccompProfile webhook is not enabled for a Shoot cluster. For components deployed in the Shoot cluster, it is required [*] to explicitly specify the Seccomp profile.\n[*] It is required because if a component deployed in the Shoot cluster does not specify a Seccomp profile and cannot run with the RuntimeDefault Seccomp profile, then enabling the .spec.kubernetes.kubelet.seccompDefault field in the Shoot spec would break the corresponding component.\n High Availability / Stability Specify the component type label for high availability (example)\nTo support high-availability deployments, gardener-resource-manager’s HighAvailabilityConfig webhook injects the proper specification such as replicas or topology spread constraints. You only need to specify the type label. For more information, see High Availability Of Deployed Components.\n Define a PodDisruptionBudget (example)\nClosely related to high availability but also to stability in general: The definition of a PodDisruptionBudget with maxUnavailable=1 should be provided by default.\n Choose the right PriorityClass (example)\nEach cluster runs many components with different priorities. Gardener provides a set of default PriorityClasses. 
For more information, see Priority Classes.\n Consider defining liveness and readiness probes (example)\nTo ensure smooth rolling update behaviour, consider the definition of liveness and/or readiness probes.\n Mark node-critical components (example)\nTo ensure user workload pods are only scheduled to Nodes where all node-critical components are ready, these components need to tolerate the node.gardener.cloud/critical-components-not-ready taint (NoSchedule effect). Also, such DaemonSets and the included PodTemplates need to be labelled with node.gardener.cloud/critical-component=true. For more information, see Readiness of Shoot Worker Nodes.\n Consider making a Service topology-aware (example)\nTo reduce costs and to improve the network traffic latency in multi-zone Seed clusters, consider making a Service topology-aware, if applicable. In short, when a Service is topology-aware, Kubernetes routes network traffic to the Endpoints (Pods) which are located in the same zone where the traffic originated from. In this way, the cross availability zone traffic is avoided. See Topology-Aware Traffic Routing.\n Enable leader election unconditionally for controllers (example 1, example 2, example 3)\nEnable leader election unconditionally for controllers independently from the number of replicas or from the high availability configurations. Having leader election enabled even for a single replica Deployment prevents having two Pods active at the same time. 
Otherwise, there are some corner cases that can result in two active Pods - Deployment rolling update or kubelet stops running on a Node and is not able to terminate the old replica while kube-controller-manager creates a new replica to match the Deployment’s desired replicas count.\n Scalability Provide resource requirements (example)\nAll components should define reasonable (initial) CPU and memory requests and avoid limits (especially CPU limits) unless you know the healthy range for your component (almost impossible with most components today), but no more than the node allocatable remainder (after daemonset pods) of the largest eligible machine type. Scheduling only takes requests into account!\n Define a VerticalPodAutoscaler (example)\nWe typically (need to) perform vertical auto-scaling for containers that have a significant usage (\u003e50m/100M) and a significant usage spread over time (\u003e2x) by defining a VerticalPodAutoscaler with updatePolicy.updateMode Auto, containerPolicies[].controlledValues RequestsOnly, reasonable minAllowed configuration and no maxAllowed configuration (will be taken care of in Gardener environments for you/capped at the largest eligible machine type).\n Define a HorizontalPodAutoscaler if needed (example)\nIf your component is capable of scaling horizontally, you should consider defining a HorizontalPodAutoscaler.\n [!NOTE] For more information and concrete configuration hints, please see our best practices guide for pod auto scaling and especially the summary and recommendations sections.\n Observability / Operations Productivity Provide monitoring scrape config and alerting rules (example 1, example 2)\nComponents should provide scrape configuration and alerting rules for Prometheus/Alertmanager if appropriate. This should be done inside a dedicated monitoring.go file. 
Extensions should follow the guidelines described in Extensions Monitoring Integration.\n Provide logging parsers and filters (example 1, example 2)\nComponents should provide parsers and filters for fluent-bit, if appropriate. This should be done inside a dedicated logging.go file. Extensions should follow the guidelines described in Fluent-bit log parsers and filters.\n Set the revisionHistoryLimit to 2 for Deployments (example)\nIn order to allow easy inspection of two ReplicaSets to quickly find the changes that led to a rolling update, the revision history limit should be set to 2.\n Define health checks (example 1)\ngardener-operator’s and gardenlet’s care controllers regularly check the health status of components relevant to the respective cluster (garden/seed/shoot). For shoot control plane components, you need to enhance the lists of components to make sure your component is checked, see example above. For components deployed via ManagedResource, please consult the respective care controller documentation for more information (garden, seed, shoot).\n Configure automatic restarts in shoot maintenance time window (example 1, example 2)\nGardener offers to restart components during the maintenance time window. For more information, see Restart Control Plane Controllers and Restart Some Core Addons. 
You can consider adding the needed label to your control plane component to get this automatic restart (probably not needed for most components).\n ","categories":"","description":"","excerpt":"Checklist For Adding New Components Adding new components that run in …","ref":"/docs/gardener/component-checklist/","tags":"","title":"Component Checklist"},{"body":"Concept Title (the topic title can also be placed in the frontmatter)\nOverview This section provides an overview of the topic and the information provided in it.\nRelevant heading 1 This section gives the user all the information needed in order to understand the topic.\nRelevant subheading This adds additional information that belongs to the topic discussed in the parent heading.\nRelevant heading 2 This section gives the user all the information needed in order to understand the topic.\nRelated Links Link 1 Link 2 ","categories":"","description":"Describes the contents of a concept topic","excerpt":"Describes the contents of a concept topic","ref":"/docs/contribute/documentation/style-guide/concept_template/","tags":"","title":"Concept Topic Structure"},{"body":"Deployment of the shoot DNS service extension Disclaimer: This document is NOT a step by step deployment guide for the shoot DNS service extension and only contains some configuration specifics regarding the deployment of different components via the helm charts residing in the shoot DNS service extension repository.\ngardener-extension-admission-shoot-dns-service Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-shoot-dns-service component will be to not provide kubeconfig at all. 
This way in-cluster configuration and an automounted service account token will be used. The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution will be to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to setup the authentication will be to create a service account and the respective roles will be bound to this service account in the target cluster. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution will be to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. 
This approach does not provide a solution for the client certificate rotation. However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: audience value will depend on the trust configuration, e.g., \u003ccliend-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the shoot DNS service extension Disclaimer: This …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/configuration/","tags":"","title":"Configuration"},{"body":"Configuring the Rsyslog Relp Extension Introduction As a cluster owner, you might need audit logs on a Shoot node level. With these audit logs you can track actions on your nodes like privilege escalation, file integrity, process executions, and who is the user that performed these actions. Such information is essential for the security of your Shoot cluster. Linux operating systems collect such logs via the auditd and journald daemons. However, these logs can be lost if they are only kept locally on the operating system. You need a reliable way to send them to a remote server where they can be stored for longer time periods and retrieved when necessary.\nRsyslog offers a solution for that. It gathers and processes logs from auditd and journald and then forwards them to a remote server. Moreover, rsyslog can make use of the RELP protocol so that logs are sent reliably and no messages are lost.\nThe shoot-rsyslog-relp extension is used to configure rsyslog on each Shoot node so that the following can take place:\n Rsyslog reads logs from the auditd and journald sockets. The logs are filtered based on the program name and syslog severity of the message. The logs are enriched with metadata containing the name of the Project in which the Shoot is created, the name of the Shoot, the UID of the Shoot, and the hostname of the node on which the log event occurred. 
The enriched logs are sent to the target remote server via the RELP protocol. The following graph shows a rough outline of how that looks in a Shoot cluster: Shoot Configuration The extension is not globally enabled and must be configured per Shoot cluster. The Shoot specification has to be adapted to include the shoot-rsyslog-relp extension configuration, which specifies the target server to which logs are forwarded, its port, and some optional rsyslog settings described in the examples below.\nBelow is an example shoot-rsyslog-relp extension configuration as part of the Shoot spec:\nkind: Shoot metadata: name: bar namespace: garden-foo ... spec: extensions: - type: shoot-rsyslog-relp providerConfig: apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: RsyslogRelpConfig # Set the target server to which logs are sent. The server must support the RELP protocol. target: some.rsyslog-relp.server # Set the port of the target server. port: 10250 # Define rules to select logs from which programs and with what syslog severity # are forwarded to the target server. loggingRules: - severity: 4 programNames: [\"kubelet\", \"audisp-syslog\"] - severity: 1 programNames: [\"audisp-syslog\"] # Define an interval of 90 seconds at which the current connection is broken and re-established. # By default this value is 0 which means that the connection is never broken and re-established. rebindInterval: 90 # Set the timeout for relp sessions to 90 seconds. If set too low, valid sessions may be considered # dead and tried to recover. timeout: 90 # Set how often an action is retried before it is considered to have failed. # Failed actions discard log messages. Setting `-1` here means that messages are never discarded. resumeRetryCount: -1 # Configures rsyslog to report continuation of action suspension, e.g. when the connection to the target # server is broken. 
reportSuspensionContinuation: true # Add tls settings if tls should be used to encrypt the connection to the target server. tls: enabled: true # Use `name` authentication mode for the tls connection. authMode: name # Only allow connections if the server's name is `some.rsyslog-relp.server` permittedPeer: - \"some.rsyslog-relp.server\" # Reference to the resource which contains certificates used for the tls connection. # It must be added to the `.spec.resources` field of the Shoot. secretReferenceName: rsyslog-relp-tls # Instruct librelp on the Shoot nodes to use the gnutls tls library. tlsLib: gnutls # Add auditConfig settings if you want to customize node level auditing. auditConfig: enabled: true # Reference to the resource which contains the audit configuration. # It must be added to the `.spec.resources` field of the Shoot. configMapReferenceName: audit-config resources: # Add the rsyslog-relp-tls secret in the resources field of the Shoot spec. - name: rsyslog-relp-tls resourceRef: apiVersion: v1 kind: Secret name: rsyslog-relp-tls-v1 - name: audit-config resourceRef: apiVersion: v1 kind: ConfigMap name: audit-config-v1 ... Choosing Which Log Messages to Send to the Target Server The .loggingRules field defines rules about which logs should be sent to the target server. When a log is processed by rsyslog, it is compared against the list of rules in order. If the program name and the syslog severity of the log messages matches the rule, the message is forwarded to the target server. 
The following table describes the syslog severity and their corresponding codes:\nNumerical Severity Code 0 Emergency: system is unusable 1 Alert: action must be taken immediately 2 Critical: critical conditions 3 Error: error conditions 4 Warning: warning conditions 5 Notice: normal but significant condition 6 Informational: informational messages 7 Debug: debug-level messages Below is an example with a .loggingRules section that will only forward logs from the kubelet program with syslog severity of 6 or lower and any other program with syslog severity of 2 or lower:\napiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: RsyslogRelpConfig target: localhost port: 1520 loggingRules: - severity: 6 programNames: [\"kubelet\"] - severity: 2 You can use a minimal shoot-rsyslog-relp extension configuration to forward all logs to the target server:\napiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: RsyslogRelpConfig target: some.rsyslog-relp.server port: 10250 loggingRules: - severity: 7 Securing the Communication to the Target Server with TLS The communication to the target server is not encrypted by default. To enable encryption, set the .tls.enabled field in the shoot-rsyslog-relp extension configuration to true. In this case, an immutable secret which contains the TLS certificates used to establish the TLS connection to the server must be created in the same project namespace as your Shoot.\nAn example Secret is given below:\n Note: The secret must be immutable\n kind: Secret apiVersion: v1 metadata: name: rsyslog-relp-tls-v1 namespace: garden-foo immutable: true data: ca: |-----BEGIN BEGIN RSA PRIVATE KEY----- ... -----END RSA PRIVATE KEY----- crt: |-----BEGIN BEGIN RSA PRIVATE KEY----- ... -----END RSA PRIVATE KEY----- key: |-----BEGIN BEGIN RSA PRIVATE KEY----- ... 
-----END RSA PRIVATE KEY----- The Secret must be referenced in the Shoot’s .spec.resources field and the corresponding resource entry must be referenced in the .tls.secretReferenceName of the shoot-rsyslog-relp extension configuration:\nkind: Shoot metadata: name: bar namespace: garden-foo ... spec: extensions: - type: shoot-rsyslog-relp providerConfig: apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: RsyslogRelpConfig target: some.rsyslog-relp.server port: 10250 loggingRules: - severity: 7 tls: enabled: true secretReferenceName: rsyslog-relp-tls resources: - name: rsyslog-relp-tls resourceRef: apiVersion: v1 kind: Secret name: rsyslog-relp-tls-v1 ... You can set a few additional parameters for the TLS connection: .tls.authMode, tls.permittedPeer, and tls.tlsLib. Refer to the rsyslog documentation for more information on these parameters:\n .tls.authMode .tls.permittedPeer .tls.tlsLib Configuring the Audit Daemon on the Shoot Nodes The shoot-rsyslog-relp extension also allows you to configure the Audit Daemon (auditd) on the Shoot nodes.\nBy default, the audit rules located under the /etc/audit/rules.d directory on your Shoot’s nodes will be moved to /etc/audit/rules.d.original and the following rules will be placed under the /etc/audit/rules.d directory: 00-base-config.rules, 10-privilege-escalation.rules, 11-privilege-special.rules, 12-system-integrity.rules. 
Next, augenrules --load is called and the audit daemon (auditd) is restarted so that the new rules can take effect.\nAlternatively, you can define your own auditd rules to be placed on your Shoot’s nodes by using the following configuration:\napiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: Auditd auditRules: |## First rule - delete all existing rules -D ## Now define some custom rules -a exit,always -F arch=b64 -S setuid -S setreuid -S setgid -S setregid -F auid\u003e0 -F auid!=-1 -F key=privilege_escalation -a exit,always -F arch=b64 -S execve -S execveat -F euid=0 -F auid\u003e0 -F auid!=-1 -F key=privilege_escalation In this case the original rules are also backed up in the /etc/audit/rules.d.original directory.\nTo deploy this configuration, it must be embedded in an immutable ConfigMap.\n [!NOTE] The data key storing this configuration must be named auditd.\n An example ConfigMap is given below:\napiVersion: v1 kind: ConfigMap metadata: name: audit-config-v1 namespace: garden-foo immutable: true data: auditd: |apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: Auditd auditRules: | ## First rule - delete all existing rules -D ## Now define some custom rules -a exit,always -F arch=b64 -S setuid -S setreuid -S setgid -S setregid -F auid\u003e0 -F auid!=-1 -F key=privilege_escalation -a exit,always -F arch=b64 -S execve -S execveat -F euid=0 -F auid\u003e0 -F auid!=-1 -F key=privilege_escalation After creating such a ConfigMap, it must be included in the Shoot’s spec.resources array and then referenced from the providerConfig.auditConfig.configMapReferenceName field in the shoot-rsyslog-relp extension configuration.\nAn example configuration is given below:\nkind: Shoot metadata: name: bar namespace: garden-foo ... 
spec: extensions: - type: shoot-rsyslog-relp providerConfig: apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: RsyslogRelpConfig target: some.rsyslog-relp.server port: 10250 loggingRules: - severity: 7 auditConfig: enabled: true configMapReferenceName: audit-config resources: - name: audit-config resourceRef: apiVersion: v1 kind: ConfigMap name: audit-config-v1 Finally, by setting providerConfig.auditConfig.enabled to false in the shoot-rsyslog-relp extension configuration, the original audit rules on your Shoot’s nodes will not be modified and auditd will not be restarted.\nExamples on how the providerConfig.auditConfig.enabled field functions are given below:\n The following deploys the extension default audit rules as of today: providerConfig: auditConfig: enabled: true The following deploys only the rules specified in the referenced ConfigMap: providerConfig: auditConfig: enabled: true configMapReferenceName: audit-config Both of the following do not deploy any audit rules: providerConfig: auditConfig: enabled: false configMapReferenceName: audit-config providerConfig: auditConfig: enabled: false ","categories":"","description":"","excerpt":"Configuring the Rsyslog Relp Extension Introduction As a cluster …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/configuration/","tags":"","title":"Configuration"},{"body":"Gardener Configuration and Usage Gardener automates the full lifecycle of Kubernetes clusters as a service. Additionally, it has several extension points allowing external controllers to plug-in to the lifecycle. As a consequence, there are several configuration options for the various custom resources that are partially required.\nThis document describes the:\n Configuration and usage of Gardener as operator/administrator. Configuration and usage of Gardener as end-user/stakeholder/customer. 
Configuration and Usage of Gardener as Operator/Administrator When we use the terms “operator/administrator”, we refer to the people who deploy and operate Gardener. Gardener consists of the following components:\n gardener-apiserver, a Kubernetes-native API extension that serves custom resources in the Kubernetes-style (like Seeds and Shoots), and a component that contains multiple admission plugins. gardener-admission-controller, an HTTP(S) server with several handlers to be used in a ValidatingWebhookConfiguration. gardener-controller-manager, a component consisting of multiple controllers that implement reconciliation and deletion flows for some of the custom resources (e.g., it contains the logic for maintaining Shoots, reconciling Projects). gardener-scheduler, a component that assigns newly created Shoot clusters to appropriate Seed clusters. gardenlet, a component running in seed clusters and consisting of multiple controllers that implement reconciliation and deletion flows for some of the custom resources (e.g., it contains the logic for reconciliation and deletion of Shoots). Each of these components has various configuration options. The gardener-apiserver uses the standard API server library maintained by the Kubernetes community, and as such it mainly supports command line flags. Other components use so-called componentconfig files that describe their configuration in a Kubernetes-style versioned object.\nConfiguration File for Gardener Admission Controller The Gardener admission controller only supports one command line flag, which should be a path to a valid admission-controller configuration file. Please take a look at this example configuration.\nConfiguration File for Gardener Controller Manager The Gardener controller manager only supports one command line flag, which should be a path to a valid controller-manager configuration file. 
Please take a look at this example configuration.\nConfiguration File for Gardener Scheduler The Gardener scheduler also only supports one command line flag, which should be a path to a valid scheduler configuration file. Please take a look at this example configuration. Information about the concepts of the Gardener scheduler can be found at Gardener Scheduler.\nConfiguration File for gardenlet The gardenlet also only supports one command line flag, which should be a path to a valid gardenlet configuration file. Please take a look at this example configuration. Information about the concepts of the gardenlet can be found at gardenlet.\nSystem Configuration After successful deployment of the components, you need to set up the system. Let’s first focus on some “static” configuration. When the gardenlet starts, it scans the garden namespace of the garden cluster for Secrets that influence its reconciliation loops, mainly the Shoot reconciliation:\n Internal domain secret - contains the DNS provider credentials (having appropriate privileges) which will be used to create/delete the so-called “internal” DNS records for the Shoot clusters. Please see this yaml file for an example.\n This secret is used in order to establish a stable endpoint for shoot clusters, which is used internally by all control plane components. The DNS records are normal DNS records but called “internal” in our scenario because only the kubeconfigs for the control plane components use this endpoint when talking to the shoot clusters. It is forbidden to change the internal domain secret if there are existing shoot clusters. 
Default domain secrets (optional) - contain the DNS provider credentials (having appropriate privileges) which will be used to create/delete DNS records for a default domain for shoots (e.g., example.com), please see this yaml file for an example.\n Not every end-user/stakeholder/customer has its own domain, however, Gardener needs to create a DNS record for every shoot cluster. As landscape operator you might want to define a default domain owned and controlled by you that is used for all shoot clusters that don’t specify their own domain. If you have multiple default domain secrets defined you can add a priority as an annotation (dns.gardener.cloud/domain-default-priority) to select which domain should be used for new shoots during creation. The domain with the highest priority is selected during shoot creation. If there is no annotation defined, the default priority is 0, also all non integer values are considered as priority 0. Alerting secrets (optional) - contain the alerting configuration and credentials for the AlertManager to send email alerts. It is also possible to configure the monitoring stack to send alerts to an AlertManager not deployed by Gardener to handle alerting. Please see this yaml file for an example.\n If email alerting is configured: An AlertManager is deployed into each seed cluster that handles the alerting for all shoots on the seed cluster. Gardener will inject the SMTP credentials into the configuration of the AlertManager. The AlertManager will send emails to the configured email address in case any alerts are firing. If an external AlertManager is configured: Each shoot has a Prometheus responsible for monitoring components and sending out alerts. The alerts will be sent to a URL configured in the alerting secret. This external AlertManager is not managed by Gardener and can be configured however the operator sees fit. Supported authentication types are no authentication, basic, or mutual TLS. 
Global monitoring secrets (optional) - contain basic authentication credentials for the Prometheus aggregating metrics for all clusters.\n These secrets are synced to each seed cluster and used to gain access to the aggregate monitoring components. Shoot Service Account Issuer secret (optional) - contains the configuration needed to centrally configure gardenlets in order to implement GEP-24. Please see the example configuration for more details. In addition to that, the ShootManagedIssuer gardenlet feature gate should be enabled in order for configurations to take effect.\n This secret contains the hostname which will be used to configure the shoot’s managed issuer, therefore the value of the hostname should not be changed once configured. [!CAUTION] Gardener Operator manages this field automatically if Gardener Discovery Server is enabled and does not provide a way to change the default value of it as of now. It calculates it based on the first ingress domain for the runtime Garden cluster. The domain is prefixed with “discovery.” using the formula discovery.{garden.spec.runtimeCluster.ingress.domains[0]}. If you are not yet using Gardener Operator but plan to enable the ShootManagedIssuer feature gate, it is EXTREMELY important to follow the same convention as Gardener Operator, so that during migration to Gardener Operator the hostname can stay the same and avoid disruptions for shoots that already have a managed service account issuer.\n Apart from this “static” configuration there are several custom resources extending the Kubernetes API and used by Gardener. As an operator/administrator, you have to configure some of them to make the system work.\nConfiguration and Usage of Gardener as End-User/Stakeholder/Customer As an end-user/stakeholder/customer, you are using a Gardener landscape that has been set up for you by another team. You don’t need to care about how Gardener itself has to be configured or how it has to be deployed. 
Take a look at Gardener API Server - the topic describes which resources are offered by Gardener. You may want to have a more detailed look for Projects, SecretBindings, Shoots, and (Cluster)OpenIDConnectPresets.\n","categories":"","description":"","excerpt":"Gardener Configuration and Usage Gardener automates the full lifecycle …","ref":"/docs/gardener/configuration/","tags":"","title":"Configuration"},{"body":"Configure Dependency Watchdog Components Prober Dependency watchdog prober command takes command-line flags which are meant to fine-tune the prober. In addition, a ConfigMap is mounted to the container which provides tuning knobs for all the probes that the prober starts.\nCommand line arguments Prober can be configured via the following flags:\n Flag Name Type Required Default Value Description kube-api-burst int No 10 Burst to use while talking with kubernetes API server. The number must be \u003e= 0. If it is 0 then a default value of 10 will be used kube-api-qps float No 5.0 Maximum QPS (queries per second) allowed when talking with kubernetes API server. The number must be \u003e= 0. If it is 0 then a default value of 5.0 will be used concurrent-reconciles int No 1 Maximum number of concurrent reconciles config-file string Yes NA Path of the config file containing the configuration to be used for all probes metrics-bind-addr string No “:9643” The TCP address that the controller should bind to for serving prometheus metrics health-bind-addr string No “:9644” The TCP address that the controller should bind to for serving health probes enable-leader-election bool No false In case the prober deployment has more than 1 replica for high availability, it will be set up in an active-passive mode. Out of many replicas one will become the leader and the rest will be passive followers waiting to acquire leadership in case the leader dies. leader-election-namespace string No “garden” Namespace in which leader election resource will be created. 
It should be the same namespace where DWD pods are deployed leader-elect-lease-duration time.Duration No 15s The duration that non-leader candidates will wait after observing a leadership renewal until attempting to acquire leadership of a led but unrenewed leader slot. This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate. This is only applicable if leader election is enabled. leader-elect-renew-deadline time.Duration No 10s The interval between attempts by the acting master to renew a leadership slot before it stops leading. This must be less than or equal to the lease duration. This is only applicable if leader election is enabled. leader-elect-retry-period time.Duration No 2s The duration the clients should wait between attempting acquisition and renewal of a leadership. This is only applicable if leader election is enabled. You can view an example kubernetes prober deployment YAML to see how these command line args are configured.\nProber Configuration A probe configuration is mounted as ConfigMap to the container. The path to the config file is configured via config-file command line argument as mentioned above. Prober will start one probe per Shoot control plane hosted within the Seed cluster. Each such probe will run asynchronously and will periodically connect to the Kube ApiServer of the Shoot. Configuration below will influence each such probe.\nYou can view an example YAML configuration provided as data in a ConfigMap here.\n Name Type Required Default Value Description kubeConfigSecretName string Yes NA Name of the kubernetes Secret which has the encoded KubeConfig required to connect to the Shoot control plane Kube ApiServer via an internal domain. This typically uses the local cluster DNS. probeInterval metav1.Duration No 10s Interval with which each probe will run. initialDelay metav1.Duration No 30s Initial delay for the probe to become active. 
Only applicable when the probe is created for the first time. probeTimeout metav1.Duration No 30s In each run of the probe it will attempt to connect to the Shoot Kube ApiServer. probeTimeout defines the timeout after which a single run of the probe will fail. backoffJitterFactor float64 No 0.2 Jitter with which a probe is run. dependentResourceInfos []prober.DependentResourceInfo Yes NA Detailed below. kcmNodeMonitorGraceDuration metav1.Duration Yes NA It is the node-monitor-grace-period set in the kcm flags. Used to determine whether a node lease can be considered expired. nodeLeaseFailureFraction float64 No 0.6 is used to determine the maximum number of leases that can be expired for a lease probe to succeed. DependentResourceInfo If a lease probe fails, then it scales down the dependent resources defined by this property. Similarly, if the lease probe is now successful, then it scales up the dependent resources defined by this property.\nEach dependent resource info has the following properties:\n Name Type Required Default Value Description ref autoscalingv1.CrossVersionObjectReference Yes NA It is a collection of ApiVersion, Kind and Name for a kubernetes resource thus serving as an identifier. optional bool Yes NA It is possible that a dependent resource is optional for a Shoot control plane. This property enables a probe to determine the correct behavior in case it is unable to find the resource identified via ref. scaleUp prober.ScaleInfo No Captures the configuration to scale up this resource. Detailed below. scaleDown prober.ScaleInfo No Captures the configuration to scale down this resource. Detailed below. NOTE: Since each dependent resource is a target for scale up/down, therefore it is mandatory that the resource reference points a kubernetes resource which has a scale subresource.\n ScaleInfo How to scale a DependentResourceInfo is captured in ScaleInfo. 
It has the following properties:\n Name Type Required Default Value Description level int Yes NA Detailed below. initialDelay metav1.Duration No 0s (No initial delay) Once a decision is taken to scale a resource then via this property a delay can be induced before triggering the scale of the dependent resource. timeout metav1.Duration No 30s Defines the timeout for the scale operation to finish for a dependent resource. Determining target replicas\nProber cannot assume any target replicas during a scale-up operation for the following reasons:\n Kubernetes resources could be set to provide high availability and the number of replicas could vary from one shoot control plane to the other. In Gardener, the number of replicas of pods in the shoot namespace is controlled by the shoot control plane configuration. If Horizontal Pod Autoscaler has been configured for a kubernetes dependent resource then it could potentially change the spec.replicas for a deployment/statefulset. Given the above constraints, let’s look at how prober determines the target replicas during scale-down or scale-up operations.\n Scale-Up: Primary responsibility of a probe while performing a scale-up is to restore the replicas of a kubernetes dependent resource prior to scale-down. In order to do that it updates the following for each dependent resource that requires a scale-up:\n spec.replicas: Checks if dependency-watchdog.gardener.cloud/replicas is set. If it is, then it will take the value stored against this key as the target replicas. To be a valid value it should always be greater than 0. If dependency-watchdog.gardener.cloud/replicas annotation is not present then it falls back to the hard coded default value for scale-up which is set to 1. Removes the annotation dependency-watchdog.gardener.cloud/replicas if it exists. 
Scale-Down: To scale down a dependent kubernetes resource it does the following:\n Adds an annotation dependency-watchdog.gardener.cloud/replicas and sets its value to the current value of spec.replicas. Updates spec.replicas to 0. Level\nEach dependent resource that should be scaled up or down is associated to a level. Levels are ordered and processed in ascending order (starting with 0 assigning it the highest priority). Consider the following configuration:\ndependentResourceInfos: - ref: kind: \"Deployment\" name: \"kube-controller-manager\" apiVersion: \"apps/v1\" scaleUp: level: 1 scaleDown: level: 0 - ref: kind: \"Deployment\" name: \"machine-controller-manager\" apiVersion: \"apps/v1\" scaleUp: level: 1 scaleDown: level: 1 - ref: kind: \"Deployment\" name: \"cluster-autoscaler\" apiVersion: \"apps/v1\" scaleUp: level: 0 scaleDown: level: 2 Let us order the dependent resources by their respective levels for both scale-up and scale-down. We get the following order:\nScale Up Operation\nOrder of scale up will be:\n cluster-autoscaler kube-controller-manager and machine-controller-manager will be scaled up concurrently after cluster-autoscaler has been scaled up. Scale Down Operation\nOrder of scale down will be:\n kube-controller-manager machine-controller-manager after (1) has been scaled down. cluster-autoscaler after (2) has been scaled down. Disable/Ignore Scaling A probe can be configured to ignore scaling of configured dependent kubernetes resources. To do that one must set dependency-watchdog.gardener.cloud/ignore-scaling annotation to true on the scalable resource for which scaling should be ignored.\nWeeder Dependency watchdog weeder command also (just like the prober command) takes command-line-flags which are meant to fine-tune the weeder. 
In addition, a ConfigMap is mounted to the container which helps in defining the dependency of pods on endpoints.\nCommand Line Arguments Weeder can be configured with the same flags as the prober, described under the command-line-arguments section. You can find an example weeder deployment YAML to see how these command line args are configured.\nWeeder Configuration Weeder configuration is mounted as ConfigMap to the container. The path to the config file is configured via config-file command line argument as mentioned above. Weeder will start one go routine per podSelector per endpoint on an endpoint event as described in weeder internal concepts.\nYou can view the example YAML configuration provided as data in a ConfigMap here.\n Name Type Required Default Value Description watchDuration *metav1.Duration No 5m0s The time duration for which watch is kept on dependent pods to see if any of them enters CrashLoopBackOff servicesAndDependantSelectors map[string]DependantSelectors Yes NA Endpoint name and its corresponding dependent pods. More info below. DependantSelectors If the service recovers from downtime, then weeder starts to watch for CrashLoopBackOff pods. These pods are identified by info stored in this property.\n Name Type Required Default Value Description podSelectors []*metav1.LabelSelector Yes NA This is a list of label selectors ","categories":"","description":"","excerpt":"Configure Dependency Watchdog Components Prober Dependency watchdog …","ref":"/docs/other-components/dependency-watchdog/deployment/configure/","tags":"","title":"Configure"},{"body":"Configuring the Logging Stack via gardenlet Configurations Enable the Logging In order to install the Gardener logging stack, the logging.enabled configuration option has to be enabled in the gardenlet configuration:\nlogging: enabled: true From now on, each Seed is going to have a logging stack which will collect logs from all pods and some systemd services. 
Logs related to Shoots with the testing purpose are dropped in the fluent-bit output plugin. Shoots with a purpose different than testing have the same type of log aggregator (but different instance) as the Seed. The logs can be viewed in Plutono in the garden namespace for the Seed components and in the respective shoot control plane namespaces.\nEnable Logs from the Shoot’s Node systemd Services The logs from the systemd services on each node can be retrieved by enabling the logging.shootNodeLogging option in the gardenlet configuration:\nlogging: enabled: true shootNodeLogging: shootPurposes: - \"evaluation\" - \"deployment\" Under the shootPurposes section, list all the shoot purposes for which the Shoot node logging feature will be enabled. Specifying the testing purpose has no effect because this purpose prevents the logging stack installation. Logs can be viewed in the operator Plutono! The dedicated labels are unit, syslog_identifier, and nodename in the Explore menu.\nConfiguring Central Vali Storage Capacity By default, the central Vali has 100Gi of storage capacity. To overwrite the current central Vali storage capacity, the logging.vali.garden.storage setting in the gardenlet’s component configuration should be altered. If you need to increase it, you can do so without losing the current data by specifying a higher capacity. By doing so, the Vali’s PersistentVolume capacity will be increased instead of deleting the current PV. 
However, if you specify less capacity, then the PersistentVolume will be deleted and with it the logs, too.\nlogging: enabled: true vali: garden: storage: \"200Gi\" ","categories":"","description":"","excerpt":"Configuring the Logging Stack via gardenlet Configurations Enable the …","ref":"/docs/gardener/deployment/configuring_logging/","tags":"","title":"Configuring Logging"},{"body":"Configuring the Registry Cache Extension Introduction Use Case For a Shoot cluster, the containerd daemon of every Node goes to the internet and fetches an image that it doesn’t have locally in the Node’s image cache. New Nodes are often created due to events such as auto-scaling (scale up), rolling update, or replacement of an unhealthy Node. Such a new Node would need to pull all of the images of the Pods running on it from the internet because the Node’s cache is initially empty. Pulling an image from a registry produces network traffic and registry costs. To avoid these network traffic and registry costs, you can use the registry-cache extension to run a registry as pull-through cache.\nThe following diagram shows a rough outline of what an image pull looks like for a Shoot cluster without registry cache: Solution The registry-cache extension deploys and manages a registry in the Shoot cluster that runs as pull-through cache. The used registry implementation is distribution/distribution.\nHow does it work? When the extension is enabled, a registry cache for each configured upstream is deployed to the Shoot cluster. Along with this, the containerd daemon on the Shoot cluster Nodes gets configured to use the Service IP address of the deployed registry cache as a mirror. For example, if a registry cache for upstream docker.io is requested via the Shoot spec, then containerd gets configured to first pull the image from the deployed cache in the Shoot cluster. 
If this image pull operation fails, containerd falls back to the upstream itself (docker.io in that case).\nThe first time an image is requested from the pull-through cache, it pulls the image from the configured upstream registry and stores it locally, before handing it back to the client. On subsequent requests, the pull-through cache is able to serve the image from its own storage.\n [!NOTE] The used registry implementation (distribution/distribution) supports mirroring of only one upstream registry.\n The following diagram shows a rough outline of how an image pull looks like for a Shoot cluster with registry cache: Shoot Configuration The extension is not globally enabled and must be configured per Shoot cluster. The Shoot specification has to be adapted to include the registry-cache extension configuration.\nBelow is an example of registry-cache extension configuration as part of the Shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: crazy-botany namespace: garden-dev spec: extensions: - type: registry-cache providerConfig: apiVersion: registry.extensions.gardener.cloud/v1alpha3 kind: RegistryConfig caches: - upstream: docker.io volume: size: 100Gi # storageClassName: premium - upstream: ghcr.io - upstream: quay.io garbageCollection: ttl: 0s secretReferenceName: quay-credentials - upstream: my-registry.io:5000 remoteURL: http://my-registry.io:5000 # ... resources: - name: quay-credentials resourceRef: apiVersion: v1 kind: Secret name: quay-credentials-v1 The providerConfig field is required.\nThe providerConfig.caches field contains information about the registry caches to deploy. It is a required field. At least one cache has to be specified.\nThe providerConfig.caches[].upstream field is the remote registry host to cache. It is a required field. The value must be a valid DNS subdomain (RFC 1123) and optionally a port (i.e. \u003chost\u003e[:\u003cport\u003e]). 
It must not include a scheme.\nThe providerConfig.caches[].remoteURL optional field is the remote registry URL. If configured, it must include an https:// or http:// scheme. If the field is not configured, the remote registry URL defaults to https://\u003cupstream\u003e. In case the upstream is docker.io, it defaults to https://registry-1.docker.io.\nThe providerConfig.caches[].volume field contains settings for the registry cache volume. The registry-cache extension deploys a StatefulSet with a volume claim template. A PersistentVolumeClaim is created with the configured size and StorageClass name.\nThe providerConfig.caches[].volume.size field is the size of the registry cache volume. Defaults to 10Gi. The size must be a positive quantity (greater than 0). This field is immutable. See Increase the cache disk size on how to resize the disk. The extension defines alerts for the volume. See Alerting for Users on how to enable notifications for Shoot cluster alerts.\nThe providerConfig.caches[].volume.storageClassName field is the name of the StorageClass used by the registry cache volume. This field is immutable. If the field is not specified, then the default StorageClass will be used.\nThe providerConfig.caches[].garbageCollection.ttl field is the time to live of a blob in the cache. If the field is set to 0s, the garbage collection is disabled. Defaults to 168h (7 days). See the Garbage Collection section for more details.\nThe providerConfig.caches[].secretReferenceName is the name of the reference for the Secret containing the upstream registry credentials. To cache images from a private registry, credentials to the upstream registry should be supplied. 
For more details, see How to provide credentials for upstream registry.\n [!NOTE] It is only possible to provide one set of credentials for one private upstream registry.\n Garbage Collection When the registry cache receives a request for an image that is not present in its local store, it fetches the image from the upstream, returns it to the client and stores the image in the local store. The registry cache runs a scheduler that deletes images when their time to live (ttl) expires. When adding an image to the local store, the registry cache also adds a time to live for the image. The ttl defaults to 168h (7 days) and is configurable. The garbage collection can be disabled by setting the ttl to 0s. Requesting an image from the registry cache does not extend the time to live of the image. Hence, an image is always garbage collected from the registry cache store when its ttl expires. At the time of writing this document, there is no functionality for garbage collection based on disk size - e.g., garbage collecting images when a certain disk usage threshold is passed. The garbage collection cannot be enabled once it is disabled. This constraint is added to mitigate distribution/distribution#4249.\nIncrease the Cache Disk Size When there is no available disk space, the registry cache continues to respond to requests. However, it cannot store the remotely fetched images locally because it has no free disk space. In such case, it is simply acting as a proxy without being able to cache the images in its local store. The disk has to be resized to ensure that the registry cache continues to cache images.\nThere are two alternatives to enlarge the cache’s disk size:\n[Alternative 1] Resize the PVC To enlarge the PVC’s size, perform the following steps:\n Make sure that the KUBECONFIG environment variable is targeting the correct Shoot cluster.\n Find the PVC name to resize for the desired upstream. 
The example below fetches the PVC for the docker.io upstream:\nkubectl -n kube-system get pvc -l upstream-host=docker.io Patch the PVC’s size to the desired size. The example below patches the size of a PVC to 10Gi:\nkubectl -n kube-system patch pvc $PVC_NAME --type merge -p '{\"spec\":{\"resources\":{\"requests\": {\"storage\": \"10Gi\"}}}}' Make sure that the PVC gets resized. Describe the PVC to check the resize operation result:\nkubectl -n kube-system describe pvc -l upstream-host=docker.io Drawback of this approach: The cache’s size in the Shoot spec (providerConfig.caches[].volume.size) diverges from the PVC’s size.\n [Alternative 2] Remove and Readd the Cache There is always the option to remove the cache from the Shoot spec and to readd it with the updated size.\n Drawback of this approach: The already cached images get lost and the cache starts with an empty disk.\n High Availability The registry cache runs with a single replica. This fact may lead to concerns about high availability such as “What happens when the registry cache is down? Does containerd fail to pull the image?”. As outlined in the How does it work? section, containerd is configured to fall back to the upstream registry if it fails to pull the image from the registry cache. Hence, when the registry cache is unavailable, containerd’s image pull operations are not affected because containerd falls back to pulling the image from the upstream registry.\nPossible Pitfalls The registry implementation used (the Distribution project) supports mirroring of only one upstream registry. The extension deploys a pull-through cache for each configured upstream. us-docker.pkg.dev, europe-docker.pkg.dev, and asia-docker.pkg.dev are different upstreams. Hence, configuring pkg.dev as upstream won’t cache images from us-docker.pkg.dev, europe-docker.pkg.dev, or asia-docker.pkg.dev. 
Limitations Images that are pulled before a registry cache Pod is running or before a registry cache Service is reachable from the corresponding Node won’t be cached - containerd will pull these images directly from the upstream.\nThe reasoning behind this limitation is that a registry cache Pod is running in the Shoot cluster. To have a registry cache’s Service cluster IP reachable from containerd running on the Node, the registry cache Pod has to be running and kube-proxy has to configure iptables/IPVS rules for the registry cache Service. If kube-proxy hasn’t configured iptables/IPVS rules for the registry cache Service, then the image pull times (and new Node bootstrap times) will be increased significantly. For more detailed explanations, see point 2. and gardener/gardener-extension-registry-cache#68.\nThat’s why the registry configuration on a Node is applied only after the registry cache Service is reachable from the Node. The gardener-node-agent.service systemd unit sends requests to the registry cache’s Service. Once the registry cache responds with HTTP 200, the unit creates the needed registry configuration file (hosts.toml).\nAs a result, for images from Shoot system components:\n On Shoot creation with the registry cache extension enabled, a registry cache is unable to cache all of the images from the Shoot system components. Usually, until the registry cache Pod is running, containerd pulls from upstream the images from Shoot system components (before the registry configuration gets applied). On new Node creation for existing Shoot with the registry cache extension enabled, a registry cache is unable to cache most of the images from Shoot system components. The reachability of the registry cache Service requires the Service network to be set up, i.e., the kube-proxy for that new Node to be running and to have set up iptables/IPVS configuration for the registry cache Service. 
containerd requests will time out in 30s in case kube-proxy hasn’t configured iptables/IPVS rules for the registry cache Service - the image pull times will increase significantly.\ncontainerd is configured to fall back to the upstream itself if a request against the cache fails. However, if the cluster IP of the registry cache Service does not exist or if kube-proxy hasn’t configured iptables/IPVS rules for the registry cache Service, then containerd requests against the registry cache time out in 30 seconds. This significantly increases the image pull times because containerd does multiple requests as part of the image pull (HEAD request to resolve the manifest by tag, GET request for the manifest by SHA, GET requests for blobs)\nExample: If the Service of a registry cache is deleted, then a new Service will be created. containerd’s registry config will still contain the old Service’s cluster IP. containerd requests against the old Service’s cluster IP will time out and containerd will fall back to upstream.\n Image pull of docker.io/library/alpine:3.13.2 from the upstream takes ~2s while image pull of the same image with invalid registry cache cluster IP takes ~2m.2s. Image pull of eu.gcr.io/gardener-project/gardener/ops-toolbelt:0.18.0 from the upstream takes ~10s while image pull of the same image with invalid registry cache cluster IP takes ~3m.10s. Amazon Elastic Container Registry is currently not supported. For details see distribution/distribution#4383.\n ","categories":"","description":"Learn what is the use-case for a pull-through cache, how to enable it and configure it","excerpt":"Learn what is the use-case for a pull-through cache, how to enable it …","ref":"/docs/extensions/others/gardener-extension-registry-cache/registry-cache/configuration/","tags":"","title":"Configuring the Registry Cache Extension"},{"body":"Configuring the Registry Mirror Extension Introduction Use Case containerd allows registry mirrors to be configured. 
Use cases are:\n Usage of public mirror(s) - for example, circumvent issues with the upstream registry such as rate limiting, outages, and others. Usage of private mirror(s) - for example, reduce network costs by using a private mirror running in the same network. Solution The registry-mirror extension allows registry mirrors to be configured via the Shoot spec directly.\nHow does it work? When the extension is enabled, the containerd daemon on the Shoot cluster Nodes gets configured to use the requested mirrors. For example, if for the upstream docker.io the mirror https://mirror.gcr.io is configured in the Shoot spec, then containerd gets configured to first pull the image from the mirror (https://mirror.gcr.io in that case). If this image pull operation fails, containerd falls back to the upstream itself (docker.io in that case).\nThe extension is based on the contract described in containerd Registry Configuration. The corresponding upstream documentation in containerd is Registry Configuration - Introduction.\nShoot Configuration The Shoot specification has to be adapted to include the registry-mirror extension configuration.\nBelow is an example of registry-mirror extension configuration as part of the Shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: crazy-botany namespace: garden-dev spec: extensions: - type: registry-mirror providerConfig: apiVersion: mirror.extensions.gardener.cloud/v1alpha1 kind: MirrorConfig mirrors: - upstream: docker.io hosts: - host: \"https://mirror.gcr.io\" capabilities: [\"pull\"] The providerConfig field is required.\nThe providerConfig.mirrors field contains information about the registry mirrors to configure. It is a required field. At least one mirror has to be specified.\nThe providerConfig.mirrors[].upstream field is the remote registry host to mirror. It is a required field. The value must be a valid DNS subdomain (RFC 1123) and optionally a port (i.e. 
\u003chost\u003e[:\u003cport\u003e]). It must not include a scheme.\nThe providerConfig.mirrors[].hosts field represents the mirror hosts to be used for the upstream. At least one mirror host has to be specified.\nThe providerConfig.mirrors[].hosts[].host field is the mirror host. It is a required field. The value must include a scheme - http:// or https://.\nThe providerConfig.mirrors[].hosts[].capabilities field represents the operations a host is capable of performing. It also represents the set of operations that the mirror host may be trusted to perform. Defaults to [\"pull\"]. The supported values are pull and resolve. See the capabilities field documentation for more information on which operations are considered trusted ones against public/private mirrors.\n","categories":"","description":"Learn what is the use-case for a registry mirror, how to enable and configure it","excerpt":"Learn what is the use-case for a registry mirror, how to enable and …","ref":"/docs/extensions/others/gardener-extension-registry-cache/registry-mirror/configuration/","tags":"","title":"Configuring the Registry Mirror Extension"},{"body":"Connect Kubectl In Kubernetes, the configuration for accessing your cluster is in a format known as kubeconfig, which is stored as a file. It contains details such as cluster API server addresses and access credentials or a command to obtain access credentials from a kubectl credential plugin. In general, treat a kubeconfig as sensitive data. Tools like kubectl use the kubeconfig to connect and authenticate to a cluster and perform operations on it. Learn more about kubeconfig and kubectl on kubernetes.io.\nTools In this guide, we reference the following tools:\n kubectl: Command-line tool for running commands against Kubernetes clusters. It allows you to control various aspects of your cluster, such as creating or modifying resources, viewing resource status, and debugging your applications. 
kubelogin: kubectl credential plugin used for OIDC authentication, which is required for the (OIDC) Garden cluster kubeconfig gardenlogin: kubectl credential plugin used for Shoot authentication as system:masters, which is required for the (gardenlogin) Shoot cluster kubeconfig gardenctl: Optional. Command-line tool to administrate one or many Garden, Seed and Shoot clusters. Use this tool to setup gardenlogin and gardenctl itself, configure access to clusters and configure cloud provider CLI tools. Connect Kubectl to a Shoot Cluster In order to connect to a Shoot cluster, you first have to install and setup gardenlogin.\nYou can obtain the kubeconfig for the Shoot cluster either by downloading it from the Gardener dashboard or by copying the gardenctl target command from the dashboard and executing it.\nSetup Gardenlogin Prerequisites You are logged on to the Gardener dashboard. The dashboard admin has configured OIDC for the dashboard. You have installed kubelogin You have installed gardenlogin To setup gardenlogin, you need to:\n Download the kubeconfig for the Garden cluster Configure gardenlogin Download Kubeconfig for the Garden Cluster Navigate to the MY ACCOUNT page on the dashboard by clicking on the user avatar -\u003e MY ACCOUNT. Under the Access section, download the kubeconfig. Configure Gardenlogin Configure gardenlogin by following the installation instruction on the dashboard:\n Select your project from the dropdown on the left Choose CLUSTERS and select your cluster in the list. Choose the Show information about gardenlogin info icon and follow the configuration hints. [!IMPORTANT] Use the previously downloaded kubeconfig for the Garden cluster as the kubeconfig path. 
Do not use the gardenlogin Shoot cluster kubeconfig here.\n Download and Setup Kubeconfig for a Shoot Cluster The gardenlogin kubeconfig for the Shoot cluster can be obtained in various ways:\n Copy and run the gardenctl target command from the dashboard Download from the Gardener dashboard Copy and Run gardenctl target Command Using the gardenctl target command you can quickly set or switch between clusters. The command sets the scope for the next operation, e.g., it ensures that the KUBECONFIG env variable always points to the currently targeted cluster.\nTo target a Shoot cluster:\n Copy the gardenctl target command from the dashboard\n Paste and run the command in the terminal application, for example:\n $ gardenctl target --garden landscape-dev --project core --shoot mycluster Successfully targeted shoot \"mycluster\" Your KUBECONFIG env variable is now pointing to the current target (also visible with gardenctl target view -o yaml). You can now run kubectl commands against your Shoot cluster.\n$ kubectl get namespaces The command connects to the cluster and lists its namespaces.\nKUBECONFIG Env Var not Setup Correctly If your KUBECONFIG env variable does not point to the current target, you will see the following message after running the gardenctl target command:\nWARN The KUBECONFIG environment variable does not point to the current target of gardenctl. Run `gardenctl kubectl-env --help` on how to configure the KUBECONFIG environment variable accordingly In this case you would need to run the following command (assuming bash as your current shell). For other shells, consult the gardenctl kubectl-env --help documentation.\n$ eval \"$(gardenctl kubectl-env bash)\" Download from Dashboard Select your project from the dropdown on the left, then choose CLUSTERS and locate your cluster in the list. 
Choose the key icon to bring up a dialog with the access options.\nIn the Kubeconfig - Gardenlogin section the options are to show gardenlogin info, download, copy or view the kubeconfig for the cluster.\nThe same options are also available in the Access section in the cluster details screen. To find it, choose a cluster from the list.\n Choose the download icon to download the kubeconfig as a file on your local system.\n Connecting to the Cluster In the following command, replace \u003cpath-to-gardenlogin-kubeconfig\u003e with the actual path to the file where you stored the kubeconfig downloaded in the previous step.\n$ kubectl --kubeconfig=\u003cpath-to-gardenlogin-kubeconfig\u003e get namespaces The command connects to the cluster and lists its namespaces.\nExporting KUBECONFIG environment variable Since many kubectl commands will be used, it’s a good idea to take advantage of every opportunity to shorten the expressions. The kubectl tool has a fallback strategy for looking up a kubeconfig to work with. For example, it looks for the KUBECONFIG environment variable, whose value is the path to the kubeconfig file meant to be used. Export the variable:\n$ export KUBECONFIG=\u003cpath-to-gardenlogin-kubeconfig\u003e Again, replace \u003cpath-to-gardenlogin-kubeconfig\u003e with the actual path to the kubeconfig for the cluster you want to connect to.\nWhat’s next? Using Dashboard Terminal ","categories":"","description":"","excerpt":"Connect Kubectl In Kubernetes, the configuration for accessing your …","ref":"/docs/dashboard/connect-kubectl/","tags":"","title":"Connect Kubectl"},{"body":"Connectivity Shoot Connectivity We measure the connectivity from the shoot to the API Server. This is done via the blackbox exporter which is deployed in the shoot’s kube-system namespace. Prometheus will scrape the blackbox exporter and then the exporter will try to access the API Server. The exposed metrics show whether the connection was successful or not. 
This can be seen in the Kubernetes Control Plane Status dashboard under the API Server Connectivity panel. The shoot line represents the connectivity from the shoot.\nSeed Connectivity In addition to the shoot connectivity, we also measure the seed connectivity. This means trying to reach the API Server from the seed via the external fully qualified domain name of the API server. The connectivity is also displayed in the above panel as the seed line. Both seed and shoot connectivity are shown below.\n","categories":["Users"],"description":"","excerpt":"Connectivity Shoot Connectivity We measure the connectivity from the …","ref":"/docs/gardener/monitoring/connectivity/","tags":"","title":"Connectivity"},{"body":"Problem Two of the most common causes of this problem are specifying the wrong container image or trying to use private images without providing registry credentials.\nNote There is no observable difference in pod status between a missing image and incorrect registry permissions. In either case, Kubernetes will report an ErrImagePull status for the pods. For this reason, this article deals with both scenarios. Example Let’s see an example. 
We’ll create a pod named fail, referencing a non-existent Docker image:\nkubectl run -i --tty fail --image=tutum/curl:1.123456 The command doesn’t return and you can terminate the process with Ctrl+C.\nError Analysis We can then inspect our pods and see that we have one pod with a status of ErrImagePull or ImagePullBackOff.\n$ (minikube) kubectl get pods NAME READY STATUS RESTARTS AGE client-5b65b6c866-cs4ch 1/1 Running 1 1m fail-6667d7685d-7v6w8 0/1 ErrImagePull 0 \u003cinvalid\u003e vuejs-578574b75f-5x98z 1/1 Running 0 1d $ (minikube) For some additional information, we can describe the failing pod.\nkubectl describe pod fail-6667d7685d-7v6w8 As you can see in the events section, your image can’t be pulled:\nName: fail-6667d7685d-7v6w8 Namespace: default Node: minikube/192.168.64.10 Start Time: Wed, 22 Nov 2017 10:01:59 +0100 Labels: pod-template-hash=2223832418 run=fail Annotations: kubernetes.io/created-by={\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicaSet\",\"namespace\":\"default\",\"name\":\"fail-6667d7685d\",\"uid\":\"cc4ccb3f-cf63-11e7-afca-4a7a1fa05b3f\",\"a... . . . . 
Events: FirstSeen\tLastSeen\tCount\tFrom\tSubObjectPath\tType\tReason\tMessage ---------\t--------\t-----\t----\t-------------\t--------\t------\t------- 1m\t1m\t1\tdefault-scheduler\tNormal\tScheduled\tSuccessfully assigned fail-6667d7685d-7v6w8 to minikube 1m\t1m\t1\tkubelet, minikube\tNormal\tSuccessfulMountVolume\tMountVolume.SetUp succeeded for volume \"default-token-9fr6r\" 1m\t6s\t4\tkubelet, minikube\tspec.containers{fail}\tNormal\tPulling\tpulling image \"tutum/curl:1.123456\" 1m\t5s\t4\tkubelet, minikube\tspec.containers{fail}\tWarning\tFailed\tFailed to pull image \"tutum/curl:1.123456\": rpc error: code = Unknown desc = Error response from daemon: manifest for tutum/curl:1.123456 not found 1m\t\u003cinvalid\u003e\t10\tkubelet, minikube\tWarning\tFailedSync\tError syncing pod 1m\t\u003cinvalid\u003e\t6\tkubelet, minikube\tspec.containers{fail}\tNormal\tBackOff\tBack-off pulling image \"tutum/curl:1.123456\" Why couldn’t Kubernetes pull the image? There are three primary candidates besides network connectivity issues:\n The image tag is incorrect The image doesn’t exist Kubernetes doesn’t have permissions to pull that image If you don’t notice a typo in your image tag, then it’s time to test using your local machine. I usually start by running docker pull on my local development machine with the exact same image tag. In this case, I would run docker pull tutum/curl:1.123456.\nIf this succeeds, then it probably means that Kubernetes doesn’t have the correct permissions to pull that image.\nAdd the docker registry user/pwd to your cluster:\nkubectl create secret docker-registry dockersecret --docker-server=https://index.docker.io/v1/ --docker-username=\u003cusername\u003e --docker-password=\u003cpassword\u003e --docker-email=\u003cemail\u003e If the exact image tag fails, then I will test without an explicit image tag:\ndocker pull tutum/curl This command will attempt to pull the latest tag. 
If this succeeds, then that means the originally specified tag doesn’t exist. Go to the Docker registry and check which tags are available for this image.\nIf docker pull tutum/curl (without an exact tag) fails, then we have a bigger problem - that image does not exist at all in our image registry.\n","categories":"","description":"Wrong Container Image or Invalid Registry Permissions","excerpt":"Wrong Container Image or Invalid Registry Permissions","ref":"/docs/guides/applications/missing-registry-permission/","tags":"","title":"Container Image Not Pulled"},{"body":"Introduction A container image should use a fixed tag or the SHA of the image. It should not use the tags latest, head, canary, or other tags that are designed to be floating.\nProblem If you have encountered this issue, you have probably done something along the lines of:\n Deploy anything using an image tag (e.g., cp-enablement/awesomeapp:1.0) Fix a bug in awesomeapp Build a new image and push it with the same tag (cp-enablement/awesomeapp:1.0) Update the deployment Realize that the bug is still present Repeat steps 3-5 without any improvement The problem relates to how Kubernetes decides whether to do a docker pull when starting a container. Since we tagged our image as :1.0, the default pull policy is IfNotPresent. The Kubelet already has a local copy of cp-enablement/awesomeapp:1.0, so it doesn’t attempt to do a docker pull. When the new Pods come up, they’re still using the old broken Docker image.\nThere are a couple of ways to resolve this, with the recommended one being to use unique tags.\nSolution In order to fix the problem, you can use the following bash script that runs anytime the deployment is updated to create a new tag and push it to the registry.\n#!/usr/bin/env bash # Set the docker image name and the corresponding repository # Ensure that you change them in the deployment.yml as well. # You must be logged in with docker login. 
# # CHANGE THIS TO YOUR Docker.io SETTINGS # PROJECT=awesomeapp REPOSITORY=cp-enablement # causes the shell to exit if any subcommand or pipeline returns a non-zero status. # set -e # set debug mode # set -x # build my nodeJS app # npm run build # get the latest version ID from the Docker.io registry and increment it # VERSION=$(curl https://registry.hub.docker.com/v1/repositories/$REPOSITORY/$PROJECT/tags | sed -e 's/[][]//g' -e 's/\"//g' -e 's/ //g' | tr '}' '\\n' | awk -F: '{print $3}' | grep v| tail -n 1) VERSION=${VERSION:1} ((VERSION++)) VERSION=\"v$VERSION\" # build the new docker image (assumes a Dockerfile in the current directory) # echo '\u003e\u003e\u003e Building new image' docker build -t $REPOSITORY/$PROJECT:$VERSION . echo '\u003e\u003e\u003e Push new image' docker push $REPOSITORY/$PROJECT:$VERSION ","categories":"","description":"Updating images in your cluster during development","excerpt":"Updating images in your cluster during development","ref":"/docs/guides/applications/image-pull-policy/","tags":"","title":"Container Image Not Updating"},{"body":"containerd Registry Configuration containerd supports configuring registries and mirrors. Using this native containerd feature, Shoot owners can configure containerd to use public or private mirrors for a given upstream registry. More details about the registry configuration can be found in the corresponding upstream documentation.\ncontainerd Registry Configuration Patterns At the time of writing this document, containerd supports two patterns for configuring registries/mirrors.\n Note: Trying to use both of the patterns at the same time is not supported by containerd. Exactly one of the configuration patterns has to be followed.\n Old and Deprecated Pattern The old and deprecated pattern is specifying registry.mirrors and registry.configs in containerd’s config.toml file. See the upstream documentation. 
Example of the old and deprecated pattern:\nversion = 2 [plugins.\"io.containerd.grpc.v1.cri\".registry] [plugins.\"io.containerd.grpc.v1.cri\".registry.mirrors] [plugins.\"io.containerd.grpc.v1.cri\".registry.mirrors.\"docker.io\"] endpoint = [\"https://public-mirror.example.com\"] In the above example, containerd is configured to first try to pull docker.io images from a configured endpoint (https://public-mirror.example.com). If the image is not available in https://public-mirror.example.com, then containerd will fall back to the upstream registry (docker.io) and will pull the image from there.\nHosts Directory Pattern The hosts directory pattern is the new and recommended pattern for configuring registries. It is available starting with containerd@v1.5.0. See the upstream documentation. The above example in the hosts directory pattern looks as follows. The /etc/containerd/config.toml file has the following section:\nversion = 2 [plugins.\"io.containerd.grpc.v1.cri\".registry] config_path = \"/etc/containerd/certs.d\" The following hosts directory structure has to be created:\n$ tree /etc/containerd/certs.d /etc/containerd/certs.d └── docker.io └── hosts.toml Finally, for the docker.io upstream registry, we configure a hosts.toml file as follows:\nserver = \"https://registry-1.docker.io\" [host.\"https://public-mirror.example.com\"] capabilities = [\"pull\", \"resolve\"] Configuring containerd Registries for a Shoot Gardener supports configuring containerd registries on a Shoot using the new hosts directory pattern. For each Shoot Node, Gardener creates the /etc/containerd/certs.d directory and adds the following section to containerd’s /etc/containerd/config.toml file:\n[plugins.\"io.containerd.grpc.v1.cri\".registry] # gardener-managed config_path = \"/etc/containerd/certs.d\" This allows Shoot owners to use the hosts directory pattern to configure registries for containerd. 
To do this, the Shoot owners need to create a directory under /etc/containerd/certs.d that is named with the upstream registry host name. In the newly created directory, a hosts.toml file needs to be created. For more details, see the hosts directory pattern section and the upstream documentation.\nThe registry-cache Extension There is a Gardener-native extension named registry-cache that supports:\n Configuring containerd registry mirrors based on the above-described contract. The feature is added in registry-cache@v0.6.0. Running pull through cache(s) in the Shoot. For more details, see the registry-cache documentation.\n","categories":"","description":"","excerpt":"containerd Registry Configuration containerd supports configuring …","ref":"/docs/gardener/containerd-registry-configuration/","tags":"","title":"containerd Registry Configuration"},{"body":"Gardener Container Runtime Extension At the lowest layers of a Kubernetes node is the software that, among other things, starts and stops containers. It is called “Container Runtime”. The most widely known container runtime is Docker, but it is not alone in this space. In fact, the container runtime space has been rapidly evolving.\nKubernetes supports different container runtimes using Container Runtime Interface (CRI) – a plugin interface which enables kubelet to use a wide variety of container runtimes.\nGardener supports creation of Worker machines using CRI. For more information, see CRI Support.\nMotivation Prior to the Container Runtime Extensibility concept, Gardener used Docker as the only container runtime to use in shoot worker machines. 
Because of the wide variety of different container runtimes offering multiple important features (for example, enhanced security concepts), it is important to enable end users to use other container runtimes as well.\nThe ContainerRuntime Extension Resource Here is what a typical ContainerRuntime resource would look like:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: ContainerRuntime metadata: name: my-container-runtime spec: binaryPath: /var/bin/containerruntimes type: gvisor workerPool: name: worker-ubuntu selector: matchLabels: worker.gardener.cloud/pool: worker-ubuntu Gardener deploys one ContainerRuntime resource per worker pool per CRI. To exemplify this, consider a Shoot having two worker pools (worker-one, worker-two) using containerd as the CRI as well as gvisor and kata as enabled container runtimes. Gardener would deploy four ContainerRuntime resources. For worker-one: one ContainerRuntime for type gvisor and one for type kata. The same resources are deployed for worker-two.\nSupporting a New Container Runtime Provider To add support for another container runtime (e.g., gvisor, kata-containers), a container runtime extension controller needs to be implemented. It should support Gardener’s supported CRI plugins.\nThe container runtime extension should install the necessary resources into the shoot cluster (e.g., RuntimeClasses), and it should copy the runtime binaries to the relevant worker machines in path: spec.binaryPath. Gardener labels the shoot nodes according to the CRI configured: worker.gardener.cloud/cri-name=\u003cvalue\u003e (e.g., worker.gardener.cloud/cri-name=containerd) and multiple labels for each of the container runtimes configured for the shoot Worker machine: containerruntime.worker.gardener.cloud/\u003ccontainer-runtime-type-value\u003e=true (e.g., containerruntime.worker.gardener.cloud/gvisor=true). 
The way to install the binaries is by creating a daemon set which copies the binaries from an image in a docker registry to the relevant labeled Worker nodes (avoid downloading binaries from the internet to also cater for isolated environments).\nFor additional reference, please have a look at the runtime-gvisor provider extension, which provides more information on how to configure the necessary charts, as well as the actuators required to reconcile container runtime inside the Shoot cluster to the desired state.\n","categories":"","description":"","excerpt":"Gardener Container Runtime Extension At the lowest layers of a …","ref":"/docs/gardener/extensions/containerruntime/","tags":"","title":"ContainerRuntime"},{"body":"You are welcome to contribute code to Gardener in order to fix a bug or to implement a new feature.\nThe following rules govern code contributions:\n Contributions must be licensed under the Apache 2.0 License You need to sign the Contributor License Agreement. We are using CLA assistant providing a click-through workflow for accepting the CLA. For company contributors additionally the company needs to sign a corporate license agreement. See the following sections for details. ","categories":"","description":"","excerpt":"You are welcome to contribute code to Gardener in order to fix a bug …","ref":"/docs/contribute/code/","tags":"","title":"Contributing Code"},{"body":"You are welcome to contribute documentation to Gardener.\nThe following rules govern documentation contributions:\n Contributions must be licensed under the Creative Commons Attribution 4.0 International License You need to sign the Contributor License Agreement. We are using CLA assistant providing a click-through workflow for accepting the CLA. For company contributors additionally the company needs to sign a corporate license agreement. See the following sections for details. 
","categories":"","description":"","excerpt":"You are welcome to contribute documentation to Gardener.\nThe following …","ref":"/contribute/docs/","tags":"","title":"Contributing Documentation"},{"body":"How to contribute? Contributions are always welcome!\nIn order to contribute ensure that you have the development environment set up and you familiarize yourself with required steps to build, verify-quality and test.\nSetting up development environment Installing Go Minimum Golang version required: 1.18. On MacOS run:\nbrew install go For other OS, follow the installation instructions.\nInstalling Git Git is used as version control for dependency-watchdog. On MacOS run:\nbrew install git If you do not have git installed already then please follow the installation instructions.\nInstalling Docker In order to test dependency-watchdog containers you will need a local kubernetes setup. Easiest way is to first install Docker. This becomes a pre-requisite to setting up either a vanilla KIND/minikube cluster or a local Gardener cluster.\nOn MacOS run:\nbrew install --cask docker For other OS, follow the installation instructions.\nInstalling Kubectl To interact with the local Kubernetes cluster you will need kubectl. On MacOS run:\nbrew install kubernetes-cli For other OS, follow the installation instructions.\nGet the sources Clone the repository from GitHub:\ngit clone https://github.com/gardener/dependency-watchdog.git Using Makefile For every change, it is recommended to run the following make targets.\n# build the code changes \u003e make build # ensure that all required checks pass \u003e make verify # this will check formatting, linting and will run unit tests # if you do not wish to run tests then you can use the following make target. \u003e make check All tests should be run and the test coverage should ideally not reduce. 
Please ensure that you have read testing guidelines.\nBefore raising a pull request, ensure that you add a license header to every new file you introduce. To add license headers, you can run this make target:\n\u003e make add-license-headers # This will add license headers to any file which does not already have it. NOTE: Also have a look at the Makefile as it has other targets that are not mentioned here.\n Raising a Pull Request To raise a pull request do the following:\n Create a fork of dependency-watchdog Add dependency-watchdog as upstream remote via git remote add upstream https://github.com/gardener/dependency-watchdog It is recommended that you create a git branch and push all your changes for the pull-request. Ensure that while you work on your pull-request, you continue to rebase the changes from upstream to your branch. To do that execute the following command: git pull --rebase upstream master We prefer clean commits. If you have multiple commits in the pull-request, then squash the commits to a single commit. You can do this via interactive git rebase command. For example if your PR branch is ahead of remote origin HEAD by 5 commits then you can execute the following command and pick the first commit and squash the remaining commits. git rebase -i HEAD~5 #actual number from the head will depend upon how many commits your branch is ahead of remote origin master ","categories":"","description":"","excerpt":"How to contribute? Contributions are always welcome!\nIn order to …","ref":"/docs/other-components/dependency-watchdog/contribution/","tags":"","title":"Contribution"},{"body":"Endpoints and Ports of a Shoot Control-Plane With the reversed VPN tunnel, there are no endpoints with open ports in the shoot cluster required by Gardener. In order to allow communication to the shoot’s control-plane in the seed cluster, there are endpoints shared by multiple shoots of a seed cluster. 
Depending on the configured zones or exposure classes, there are different endpoints in a seed cluster. The IP address(es) can be determined by a DNS query for the API Server URL. The main entry-point into the seed cluster is the load balancer of the Istio ingress-gateway service. Depending on the infrastructure provider, there can be one IP address per zone.\nThe load balancer of the Istio ingress-gateway service exposes the following TCP ports:\n 443 for requests to the shoot API Server. The request is dispatched according to the set TLS SNI extension. 8443 for requests to the shoot API Server via api-server-proxy, dispatched based on the proxy protocol target, which is the IP address of kubernetes.default.svc.cluster.local in the shoot. 8132 to establish the reversed VPN connection. It’s dispatched according to an HTTP header value. kube-apiserver via SNI DNS entries for api.\u003cexternal-domain\u003e and api.\u003cshoot\u003e.\u003cproject\u003e.\u003cinternal-domain\u003e point to the load balancer of an Istio ingress-gateway service. The Kubernetes client sets the server name to api.\u003cexternal-domain\u003e or api.\u003cshoot\u003e.\u003cproject\u003e.\u003cinternal-domain\u003e. Based on SNI, the connection is forwarded to the respective API Server at TCP layer. There is no TLS termination at the Istio ingress-gateway. TLS termination happens at the shoot’s API Server. Traffic is end-to-end encrypted between the client and the API Server. The certificate authority and authentication are defined in the corresponding kubeconfig. Details can be found in GEP-08.\nkube-apiserver via apiserver-proxy Inside the shoot cluster, the API Server can also be reached by the cluster internal name kubernetes.default.svc.cluster.local. The apiserver-proxy pods are deployed in the host network as a daemonset and intercept connections to the Kubernetes service IP address. 
The destination address is changed to the cluster IP address of the service kube-apiserver.\u003cshoot-namespace\u003e.svc.cluster.local in the seed cluster. The connections are forwarded via the HAProxy Proxy Protocol to the Istio ingress-gateway in the seed cluster. The Istio ingress-gateway forwards the connection to the respective shoot API Server by its cluster IP address. As TLS termination happens at the API Server, the traffic is end-to-end encrypted the same way as with SNI.\nDetails can be found in GEP-11.\nReversed VPN Tunnel As the API Server has to be able to connect to endpoints in the shoot cluster, a VPN connection is established. This VPN connection is initiated from a VPN client in the shoot cluster. The VPN client connects to the Istio ingress-gateway and is forwarded to the VPN server in the control-plane namespace of the shoot. Once the VPN tunnel between the VPN client in the shoot and the VPN server in the seed cluster is established, the API Server can connect to nodes, services and pods in the shoot cluster.\nMore details can be found in the usage document and GEP-14.\n","categories":"","description":"","excerpt":"Endpoints and Ports of a Shoot Control-Plane With the reversed VPN …","ref":"/docs/gardener/control-plane-endpoints-and-ports/","tags":"","title":"Control Plane Endpoints And Ports"},{"body":"Control Plane Migration Prerequisites The Seeds involved in the control plane migration must have backups enabled - their .spec.backup fields cannot be nil.\nShootState ShootState is an API resource which stores non-reconstructible state and data required to completely recreate a Shoot’s control plane on a new Seed. 
The ShootState resource is created on Shoot creation in its Project namespace and the required state/data is persisted during Shoot creation or reconciliation.\nShoot Control Plane Migration Triggering the migration is done by changing the Shoot’s .spec.seedName to a Seed that differs from the .status.seedName. We call this Seed the \"Destination Seed\". This action can only be performed by an operator (see Triggering the Migration). If the Destination Seed does not have a backup and restore configuration, the change to spec.seedName is rejected. Additionally, this Seed must not be set for deletion and must be healthy.\nIf the Shoot has different .spec.seedName and .status.seedName, a process is started to prepare the Control Plane for migration:\n .status.lastOperation is changed to Migrate. Kubernetes API Server is stopped and the extension resources are annotated with gardener.cloud/operation=migrate. Full snapshot of the ETCD is created and termination of the Control Plane in the Source Seed is initiated. If the process is successful, we update the status of the Shoot by setting the .status.seedName to the null value. That way, a restoration is triggered in the Destination Seed and .status.lastOperation is changed to Restore. 
The control plane migration is completed when the Restore operation has completed successfully.\nThe etcd backups will be copied over to the BackupBucket of the Destination Seed during control plane migration and any future backups will be uploaded there.\nTriggering the Migration For control plane migration, operators with the necessary RBAC can use the shoots/binding subresource to change the .spec.seedName, with the following commands:\nNAMESPACE=my-namespace SHOOT_NAME=my-shoot DEST_SEED_NAME=destination-seed kubectl get --raw /apis/core.gardener.cloud/v1beta1/namespaces/${NAMESPACE}/shoots/${SHOOT_NAME} | jq -c '.spec.seedName = \"'${DEST_SEED_NAME}'\"' | kubectl replace --raw /apis/core.gardener.cloud/v1beta1/namespaces/${NAMESPACE}/shoots/${SHOOT_NAME}/binding -f - | jq -r '.spec.seedName' [!IMPORTANT] When migrating Shoots to a Destination Seed with different provider type from the Source Seed, make sure of the following:\nPods running in the Destination Seed must have network connectivity to the backup storage provider of the Source Seed so that etcd backups can be copied successfully. Otherwise, the Restore operation will get stuck at the Waiting until etcd backups are copied step. However, if you do end up in this case, you can still finish the control plane migration by following the guide to manually copy etcd backups.\nThe nodes of your Shoot cluster must have network connectivity to the Shoot’s kube-apiserver and the vpn-seed-server once they are migrated to the Destination Seed. Otherwise, the Restore operation will get stuck at the Waiting until the Kubernetes API server can connect to the Shoot workers step. 
However, if you do end up in this case and cannot allow network traffic from the nodes to the Shoot’s control plane, you can annotate the Shoot with the shoot.gardener.cloud/skip-readiness annotation so that the Restore operation finishes, and then use the shoots/binding subresource to migrate the control plane back to the Source Seed.\n Copying ETCD Backups Manually During the Restore Operation Following is a workaround that can be used to copy etcd backups manually in situations where a Shoot’s control plane has been moved to a Destination Seed and the pods running in it lack network connectivity to the Source Seed’s storage provider:\n Follow the instructions in the etcd-backup-restore getting started documentation on how to run the etcdbrctl command locally or in a container. Follow the instructions in the passing-credentials guide on how to set up the required credentials for the copy operation depending on the storage providers for which you want to perform it. Use the etcdbrctl copy command to copy the backups by following the instructions in the etcdbrctl copy guide After you have successfully copied the etcd backups, wait for the EtcdCopyBackupsTask custom resource to be created in the Shoot’s control plane on the Destination Seed, if it does not already exist. 
Afterwards, mark it as successful by patching it using the following command: SHOOT_NAME=my-shoot PROJECT_NAME=my-project kubectl patch -n shoot--${PROJECT_NAME}--${SHOOT_NAME} etcdcopybackupstask ${SHOOT_NAME} --subresource status --type merge -p \"{\\\"status\\\":{\\\"conditions\\\":[{\\\"type\\\":\\\"Succeeded\\\",\\\"status\\\":\\\"True\\\",\\\"reason\\\":\\\"manual copy successful\\\",\\\"message\\\":\\\"manual copy successful\\\",\\\"lastTransitionTime\\\":\\\"$(date -Iseconds)\\\",\\\"lastUpdateTime\\\":\\\"$(date -Iseconds)\\\"}]}}\" After the main-etcd becomes Ready, and the source-etcd-backup secret is deleted from the Shoot’s control plane, remove the finalizer on the source extensions.gardener.cloud/v1alpha1.BackupEntry in the Destination Seed so that it can be deleted successfully (the resource name uses the following format: source-shoot--\u003cproject-name\u003e--\u003cshoot-name\u003e--\u003cuid\u003e). This is necessary as the Destination Seed will not have network connectivity to the Source Seed’s storage provider and the deletion will fail. Once the control plane migration has finished successfully, make sure to manually clean up the source backup directory in the Source Seed’s storage provider. ","categories":"","description":"","excerpt":"Control Plane Migration Prerequisites The Seeds involved in the …","ref":"/docs/gardener/control_plane_migration/","tags":"","title":"Control Plane Migration"},{"body":"Registering Extension Controllers Extensions are registered in the garden cluster via ControllerRegistration resources. Deployment for respective extensions are specified via ControllerDeployment resources. 
Gardener evaluates the registrations and deployments and creates ControllerInstallation resources which describe the request “please install this controller X to this seed Y”.\nSimilar to how CloudProfile or Seed resources get into the system, the Gardener administrator must deploy the ControllerRegistration and ControllerDeployment resources (this does not happen automatically in any way - the administrator decides which extensions shall be enabled).\nThe specification mainly describes which of Gardener’s extension CRDs are managed, for example:\napiVersion: core.gardener.cloud/v1 kind: ControllerDeployment metadata: name: os-gardenlinux helm: ociRepository: ref: registry.example.com/os-gardenlinux/charts/os-gardenlinux:1.0.0 # or a base64-encoded, gzip'ed, tar'ed extension controller chart # rawChart: H4sIFAAAAAAA/yk... values: foo: bar --- apiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration metadata: name: os-gardenlinux spec: deployment: deploymentRefs: - name: os-gardenlinux resources: - kind: OperatingSystemConfig type: gardenlinux primary: true This information tells Gardener that there is an extension controller that can handle OperatingSystemConfig resources of type gardenlinux. A reference to the shown ControllerDeployment specifies how the deployment of the extension controller is accomplished.\nAlso, it specifies that this controller is the primary one responsible for the lifecycle of the OperatingSystemConfig resource. Setting primary to false would allow registering additional, secondary controllers that may also watch/react on the OperatingSystemConfig/gardenlinux resources. However, only the primary controller may change/update the main status of the extension object (which is used to “communicate” with the gardenlet). Particularly, only the primary controller may set .status.lastOperation, .status.lastError, .status.observedGeneration, and .status.state. 
Secondary controllers may contribute to the .status.conditions[] if they like, of course.\nSecondary controllers might be helpful in scenarios where additional tasks need to be completed which are not part of the reconciliation logic of the primary controller but separated out into a dedicated extension.\n⚠️ There must be exactly one primary controller for every registered kind/type combination. Also, please note that the primary field cannot be changed after creation of the ControllerRegistration.\nDeploying Extension Controllers Submitting the above ControllerDeployment and ControllerRegistration will create a ControllerInstallation resource:\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerInstallation metadata: name: os-gardenlinux spec: deploymentRef: name: os-gardenlinux registrationRef: name: os-gardenlinux seedRef: name: aws-eu1 This resource expresses that Gardener requires the os-gardenlinux extension controller to run on the aws-eu1 seed cluster.\ngardener-controller-manager automatically determines which extension is required on which seed cluster and will only create ControllerInstallation objects for those. Also, it will automatically delete ControllerInstallations referencing extension controllers that are no longer required on a seed (e.g., because all shoots on it have been deleted). There are additional configuration options, please see the Deployment Configuration Options section. After gardener-controller-manager has written the ControllerInstallation resource, gardenlet picks it up and installs the controller on the respective Seed using the referenced ControllerDeployment.\nIt is sufficient to create a Helm chart and deploy it together with some static configuration values. For this, operators have to provide the deployment information in the ControllerDeployment.helm section:\n... helm: rawChart: H4sIFAAAAAAA/yk... 
values: foo: bar You can check out hack/generate-controller-registration.yaml for generating a ControllerDeployment including a controller helm chart.\nIf ControllerDeployment.helm is specified, gardenlet either decodes the provided Helm chart (.helm.rawChart) or pulls the chart from the referenced OCI Repository (.helm.ociRepository). When referencing an OCI Repository, you have several options in how to specify where to pull the chart:\nhelm: ociRepository: # full ref with either tag or digest, or both ref: registry.example.com/foo:1.0.0@sha256:abc --- helm: ociRepository: # repository and tag repository: registry.example.com tag: 1.0.0 --- helm: ociRepository: # repository and digest repository: registry.example.com digest: sha256:abc --- helm: ociRepository: # when specifying both tag and digest, the tag is ignored. repository: registry.example.com tag: 1.0.0 digest: sha256:abc Gardenlet caches the downloaded chart in memory. It is recommended to always specify a digest, because if it is not specified, gardenlet needs to fetch the manifest in every reconciliation to compare the digest with the local cache.\nNo matter where the chart originates from, gardenlet deploys it with the provided static configuration (.helm.values). The chart and the values can be updated at any time - Gardener will recognize it and re-trigger the deployment process. 
In order to allow extensions to get information about the garden and the seed cluster, gardenlet mixes in certain properties into the values (root level) of every deployed Helm chart:\ngardener: version: \u003cgardener-version\u003e garden: clusterIdentity: \u003cuuid-of-gardener-installation\u003e genericKubeconfigSecretName: \u003cgeneric-garden-kubeconfig-secret-name\u003e seed: name: \u003cseed-name\u003e clusterIdentity: \u003cseed-cluster-identity\u003e annotations: \u003cseed-annotations\u003e labels: \u003cseed-labels\u003e provider: \u003cseed-provider-type\u003e region: \u003cseed-region\u003e volumeProvider: \u003cseed-first-volume-provider\u003e volumeProviders: \u003cseed-volume-providers\u003e ingressDomain: \u003cseed-ingress-domain\u003e protected: \u003cseed-protected-taint\u003e visible: \u003cseed-visible-setting\u003e taints: \u003cseed-taints\u003e networks: \u003cseed-networks\u003e blockCIDRs: \u003cseed-networks-blockCIDRs\u003e spec: \u003cseed-spec\u003e gardenlet: featureGates: \u003cgardenlet-feature-gates\u003e Extensions can use this information in their Helm chart in case they require knowledge about the garden and the seed environment. The list might be extended in the future.\ngardenlet reports whether the extension controller has been installed successfully and running in the ControllerInstallation status:\nstatus: conditions: - lastTransitionTime: \"2024-05-16T13:04:16Z\" lastUpdateTime: \"2024-05-16T13:04:16Z\" message: The controller running in the seed cluster is healthy. reason: ControllerHealthy status: \"True\" type: Healthy - lastTransitionTime: \"2024-05-16T13:04:06Z\" lastUpdateTime: \"2024-05-16T13:04:06Z\" message: The controller was successfully installed in the seed cluster. reason: InstallationSuccessful status: \"True\" type: Installed - lastTransitionTime: \"2024-05-16T13:04:16Z\" lastUpdateTime: \"2024-05-16T13:04:16Z\" message: The controller has been rolled out successfully. 
reason: ControllerRolledOut status: \"False\" type: Progressing - lastTransitionTime: \"2024-05-16T13:03:39Z\" lastUpdateTime: \"2024-05-16T13:03:39Z\" message: chart could be rendered successfully. reason: RegistrationValid status: \"True\" type: Valid Deployment Configuration Options The .spec.deployment resource allows to configure a deployment policy. There are the following policies:\n OnDemand (default): Gardener will demand the deployment and deletion of the extension controller to/from seed clusters dynamically. It will automatically determine (based on other resources like Shoots) whether it is required and decide accordingly. Always: Gardener will demand the deployment of the extension controller to seed clusters independent of whether it is actually required or not. This might be helpful if you want to add a new component/controller to all seed clusters by default. Another use-case is to minimize the durations until extension controllers get deployed and ready in case you have highly fluctuating seed clusters. AlwaysExceptNoShoots: Similar to Always, but if the seed does not have any shoots, then the extension is not being deployed. It will be deleted from a seed after the last shoot has been removed from it. Also, the .spec.deployment.seedSelector allows to specify a label selector for seed clusters. Only if it matches the labels of a seed, then it will be deployed to it. Please note that a seed selector can only be specified for secondary controllers (primary=false for all .spec.resources[]).\nExtensions in the Garden Cluster Itself The Shoot resource itself will contain some provider-specific data blobs. As a result, some extensions might also want to run in the garden cluster, e.g., to provide ValidatingWebhookConfigurations for validating the correctness of their provider-specific blobs:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-aws namespace: garden-dev spec: ... 
cloud: type: aws region: eu-west-1 providerConfig: apiVersion: aws.cloud.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: # specify either 'id' or 'cidr' # id: vpc-123456 cidr: 10.250.0.0/16 internal: - 10.250.112.0/22 public: - 10.250.96.0/22 workers: - 10.250.0.0/19 zones: - eu-west-1a ... In the above example, Gardener itself does not understand the AWS-specific provider configuration for the infrastructure. However, if this part of the Shoot resource should be validated, then you should run an AWS-specific component in the garden cluster that registers a webhook. You can do it similarly if you want to default some fields of a resource (by using a MutatingWebhookConfiguration).\nAgain, similar to how Gardener is deployed to the garden cluster, these components must be deployed and managed by the Gardener administrator.\nExtension Resource Configurations The Extension resource allows injecting arbitrary steps into the shoot reconciliation flow that are unknown to Gardener. Hence, it is slightly special and allows further configuration when registering it:\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration metadata: name: extension-foo spec: resources: - kind: Extension type: foo primary: true globallyEnabled: true reconcileTimeout: 30s lifecycle: reconcile: AfterKubeAPIServer delete: BeforeKubeAPIServer migrate: BeforeKubeAPIServer The globallyEnabled=true option specifies that the Extension/foo object shall be created by default for all shoots (unless they opted out by setting .spec.extensions[].enabled=false in the Shoot spec).\nThe reconcileTimeout tells Gardener how long it should wait during its shoot reconciliation flow for the Extension/foo’s reconciliation to finish.\nExtension Lifecycle The lifecycle field tells Gardener when to perform a certain action on the Extension resource during the reconciliation flows. If omitted, then the default behaviour will be applied. 
Please find more information on the defaults in the explanation below. Possible values for each control flow are AfterKubeAPIServer, BeforeKubeAPIServer, and AfterWorker. Let’s take the following configuration and explain it.\n ... lifecycle: reconcile: AfterKubeAPIServer delete: BeforeKubeAPIServer migrate: BeforeKubeAPIServer reconcile: AfterKubeAPIServer means that the extension resource will be reconciled after the successful reconciliation of the kube-apiserver during shoot reconciliation. This is also the default behaviour if this value is not specified. During shoot hibernation, the opposite rule is applied, meaning that in this case the reconciliation of the extension will happen before the kube-apiserver is scaled to 0 replicas. On the other hand, if the extension needs to be reconciled before the kube-apiserver and scaled down after it, then the value BeforeKubeAPIServer should be used. delete: BeforeKubeAPIServer means that the extension resource will be deleted before the kube-apiserver is destroyed during shoot deletion. This is the default behaviour if this value is not specified. migrate: BeforeKubeAPIServer means that the extension resource will be migrated before the kube-apiserver is destroyed in the source cluster during control plane migration. This is the default behaviour if this value is not specified. The restoration of the control plane follows the reconciliation control flow. The lifecycle value AfterWorker is only available during reconcile. When specified, the extension resource will be reconciled after the workers are deployed. This is useful for extensions that want to deploy a workload in the shoot control plane and want to wait for the workload to run and get ready on a node. 
During shoot creation the extension will start its reconciliation before the first workers have joined the cluster; they will become available at some later point.\n","categories":"","description":"","excerpt":"Registering Extension Controllers Extensions are registered in the …","ref":"/docs/gardener/extensions/controllerregistration/","tags":"","title":"ControllerRegistration"},{"body":"Controllers etcd-druid is an operator to manage etcd clusters, and follows the Operator pattern for Kubernetes. It makes use of the Kubebuilder framework which makes it quite easy to define Custom Resources (CRs) such as Etcds and EtcdCopyBackupsTasks through Custom Resource Definitions (CRDs), and define controllers for these CRDs. etcd-druid uses Kubebuilder to define the Etcd CR and its corresponding controllers.\nAll controllers that are a part of etcd-druid reside in package internal/controller, as sub-packages.\nEtcd-druid currently consists of the following controllers, each having its own responsibility:\n etcd : responsible for the reconciliation of the Etcd CR spec, which allows users to run etcd clusters within the specified Kubernetes cluster, and also responsible for periodically updating the Etcd CR status with the up-to-date state of the managed etcd cluster. compaction : responsible for snapshot compaction. etcdcopybackupstask : responsible for the reconciliation of the EtcdCopyBackupsTask CR, which helps perform the job of copying snapshot backups from one object store to another. secret : responsible for making sure Secrets being referenced by Etcd resources are not deleted while in use. Package Structure The typical package structure for the controllers that are part of etcd-druid is shown with the compaction controller:\ninternal/controller/compaction ├── config.go ├── reconciler.go └── register.go config.go: contains all the logic for the configuration of the controller, including feature gate activations, CLI flag parsing and validations. 
register.go: contains the logic for registering the controller with the etcd-druid controller manager. reconciler.go: contains the controller reconciliation logic. Each controller package also contains auxiliary files which are relevant to that specific controller.\nController Manager A manager is first created for all controllers that are a part of etcd-druid. The controller manager is responsible for all the controllers that are associated with CRDs. Once the manager is Start()ed, all the controllers that are registered with it are started.\nEach controller is built using a controller builder, configured with details such as the type of object being reconciled, owned objects whose owner object is reconciled, event filters (predicates), etc. Predicates are filters which allow controllers to filter which type of events the controller should respond to and which ones to ignore.\nThe logic relevant to the controller manager like the creation of the controller manager and registering each of the controllers with the manager, is contained in internal/manager/manager.go.\nEtcd Controller The etcd controller is responsible for the reconciliation of the Etcd resource spec and status. It handles the provisioning and management of the etcd cluster. 
Different components that are required for the functioning of the cluster like Leases, ConfigMaps, and the Statefulset for the etcd cluster are all deployed and managed by the etcd controller.\nAdditionally, the etcd controller periodically updates the Etcd resource status with the latest available information from the etcd cluster, as well as results and errors from the most recent reconciliation of the Etcd resource spec.\nThe etcd controller is essential to the functioning of the etcd cluster and etcd-druid, thus the minimum number of worker threads is 1 (default being 3), controlled by the CLI flag --etcd-workers.\nEtcd Spec Reconciliation While building the controller, an event filter is set such that the behavior of the controller, specifically for Etcd update operations, depends on the gardener.cloud/operation: reconcile annotation. This is controlled by the --enable-etcd-spec-auto-reconcile CLI flag, which, if set to false, tells the controller to perform reconciliation only when this annotation is present. If the flag is set to true, the controller will reconcile the etcd cluster anytime the Etcd spec, and thus generation, changes, and the next queued event for it is triggered.\n Note: Creation and deletion of Etcd resources are not affected by the above flag or annotation.\n The reason this filter is present is that any disruption in the Etcd resource due to reconciliation (due to changes in the Etcd spec, for example) while workloads are being run would cause unwanted downtimes to the etcd cluster. Hence, any user who wishes to avoid such disruptions can choose to set the --enable-etcd-spec-auto-reconcile CLI flag to false. An example of this is Gardener’s gardenlet, which reconciles the Etcd resource only during a shoot cluster’s maintenance window.\nThe controller adds a finalizer to the Etcd resource in order to ensure that it does not get deleted until all dependent resources managed by etcd-druid, aka managed components, are properly cleaned up. 
Only the etcd controller can delete a resource once it adds finalizers to it. This ensures that the proper deletion flow steps are followed while deleting the resource. During deletion flow, managed components are deleted in parallel.\nEtcd Status Updates The Etcd resource status is updated periodically by etcd controller, the interval for which is determined by the CLI flag --etcd-status-sync-period.\nStatus fields of the Etcd resource such as LastOperation, LastErrors and ObservedGeneration, are updated to reflect the result of the recent reconciliation of the Etcd resource spec.\n LastOperation holds information about the last operation performed on the etcd cluster, indicated by fields Type, State, Description and LastUpdateTime. Additionally, a field RunID indicates the unique ID assigned to the specific reconciliation run, to allow for better debugging of issues. LastErrors is a slice of errors encountered by the last reconciliation run. Each error consists of fields Code to indicate the custom etcd-druid error code for the error, a human-readable Description, and the ObservedAt time when the error was seen. ObservedGeneration indicates the latest generation of the Etcd resource that etcd-druid has “observed” and consequently reconciled. It helps identify whether a change in the Etcd resource spec was acted upon by druid or not. Status fields of the Etcd resource which correspond to the StatefulSet like CurrentReplicas, ReadyReplicas and Replicas are updated to reflect those of the StatefulSet by the controller.\nStatus fields related to the etcd cluster itself, such as Members, PeerUrlTLSEnabled and Ready are updated as follows:\n Cluster Membership: The controller updates the information about etcd cluster membership like Role, Status, Reason, LastTransitionTime and identifying information like the Name and ID. For the Status field, the member is checked for the Ready condition, where the member can be in Ready, NotReady and Unknown statuses. 
Etcd resource conditions are indicated by status field Conditions. The condition checks that are currently performed are:\n AllMembersReady: indicates readiness of all members of the etcd cluster. Ready: indicates overall readiness of the etcd cluster in serving traffic. BackupReady: indicates health of the etcd backups, i.e., whether etcd backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. DataVolumesReady: indicates health of the persistent volumes containing the etcd data. Compaction Controller The compaction controller deploys the snapshot compaction job whenever required. To understand the rationale behind this controller, please read snapshot-compaction.md. The controller watches the number of events accumulated as part of delta snapshots in the etcd cluster’s backups, and triggers a snapshot compaction when the number of delta events crosses the set threshold, which is configurable through the --etcd-events-threshold CLI flag (1M events by default).\nThe controller watches for changes in snapshot Leases associated with Etcd resources. It checks the full and delta snapshot Leases and calculates the difference in events between the latest delta snapshot and the previous full snapshot, and initiates the compaction job if the event threshold is crossed.\nThe number of worker threads for the compaction controller needs to be greater than or equal to 0 (default 3), controlled by the CLI flag --compaction-workers. This is unlike other controllers which need at least one worker thread for the proper functioning of etcd-druid as snapshot compaction is not a core functionality for the etcd clusters to be deployed. The compaction controller should be explicitly enabled by the user, through the --enable-backup-compaction CLI flag.\nEtcdCopyBackupsTask Controller The etcdcopybackupstask controller is responsible for deploying the etcdbrctl copy command as a job. 
This controller reacts to create/update events arising from EtcdCopyBackupsTask resources, and deploys the EtcdCopyBackupsTask job with source and target backup storage providers as arguments, which are derived from source and target bucket secrets referenced by the EtcdCopyBackupsTask resource.\nThe number of worker threads for the etcdcopybackupstask controller needs to be greater than or equal to 0 (default being 3), controlled by the CLI flag --etcd-copy-backups-task-workers. This is unlike other controllers, which need at least one worker thread for the proper functioning of etcd-druid, because EtcdCopyBackupsTask is not a core functionality for the etcd clusters to be deployed.\nSecret Controller The secret controller’s primary responsibility is to add a finalizer on Secrets referenced by the Etcd resource. The secret controller is registered for Secrets, and the controller keeps a watch on the Etcd CR. This finalizer is added to ensure that Secrets which are referenced by the Etcd CR aren’t deleted while still being used by the Etcd resource.\nEvents arising from the Etcd resource are mapped to a list of Secrets such as backup and TLS secrets that are referenced by the Etcd resource, and are enqueued into the request queue, which the reconciler then acts on.\nThe number of worker threads for the secret controller must be at least 1 (default being 10) for this core controller, controlled by the CLI flag --secret-workers, since the referenced TLS and infrastructure access secrets are essential to the proper functioning of the etcd cluster.\n","categories":"","description":"","excerpt":"Controllers etcd-druid is an operator to manage etcd clusters, and …","ref":"/docs/other-components/etcd-druid/concepts/controllers/","tags":"","title":"Controllers"},{"body":"Controlling the Kubernetes Versions for Specific Worker Pools Since Gardener v1.36, worker pools can have different Kubernetes versions specified than the control plane.\nIn earlier Gardener versions, all worker 
pools inherited the Kubernetes version of the control plane. Once the Kubernetes version of the control plane was modified, all worker pools were updated as well (either by rolling the nodes in case of a minor version change, or in-place for patch version changes).\nIn order to gracefully perform Kubernetes upgrades (triggering a rolling update of the nodes) with workloads sensitive to restarts (e.g., those dealing with lots of data), it might be required to be able to gradually perform the upgrade process. In such cases, the Kubernetes version for the worker pools can be pinned (.spec.provider.workers[].kubernetes.version) while the control plane Kubernetes version (.spec.kubernetes.version) is updated. This results in the nodes being untouched while the control plane is upgraded. Now a new worker pool (with the version equal to the control plane version) can be added. Administrators can then reschedule their workloads to the new worker pool according to their upgrade requirements and processes.\nExample Usage in a Shoot spec: kubernetes: version: 1.27.4 provider: workers: - name: data1 kubernetes: version: 1.26.8 - name: data2 If .kubernetes.version is not specified in a worker pool, then the Kubernetes version of the kubelet is inherited from the control plane (.spec.kubernetes.version), i.e., in the above example, the data2 pool will use 1.27.4. If .kubernetes.version is specified in a worker pool, then it must meet the following constraints: It must be at most two minor versions lower than the control plane version. If it was not specified before, then no downgrade is possible (you cannot set it to 1.26.8 while .spec.kubernetes.version is already 1.27.4). The “two minor version skew” is only possible if the worker pool version is set to the control plane version and then the control plane was updated gradually by two minor versions. 
If the version is removed from the worker pool, only one minor version difference is allowed to the control plane (you cannot upgrade a pool from version 1.25.0 to 1.27.0 in one go). Automatic updates of Kubernetes versions (see Shoot Maintenance) also apply to worker pool Kubernetes versions.\n","categories":"","description":"","excerpt":"Controlling the Kubernetes Versions for Specific Worker Pools Since …","ref":"/docs/gardener/worker_pool_k8s_versions/","tags":"","title":"Controlling the Kubernetes Versions for Specific Worker Pools"},{"body":"Contract: ControlPlane Resource Most Kubernetes clusters require a cloud-controller-manager or CSI drivers in order to work properly. Before introducing the ControlPlane extension resource, Gardener had several different Helm charts for the cloud-controller-manager deployments for the various providers. Now, Gardener commissions an external, provider-specific controller to take over this task.\nWhich control plane resources are required? As mentioned in the controlplane customization webhooks document, Gardener shall not deploy any cloud-controller-manager or any other provider-specific component. Instead, it creates a ControlPlane CRD that should be picked up by provider extensions. Its purpose is to trigger the deployment of such provider-specific components in the shoot namespace in the seed cluster.\nWhat needs to be implemented to support a new infrastructure provider? 
As part of the shoot flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: ControlPlane metadata: name: control-plane namespace: shoot--foo--bar spec: type: openstack region: europe-west1 secretRef: name: cloudprovider namespace: shoot--foo--bar providerConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerProvider: provider zone: eu-1a cloudControllerManager: featureGates: CustomResourceValidation: true infrastructureProviderStatus: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus networks: floatingPool: id: vpc-1234 subnets: - purpose: nodes id: subnetid The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used for the shoot cluster. However, the most important sections are the .spec.providerConfig and the .spec.infrastructureProviderStatus. The first one contains an embedded declaration of the provider specific configuration for the control plane (that cannot be known by Gardener itself). You are responsible for designing what this configuration looks like. Gardener does not evaluate it but just copies this part from what has been provided by the end-user in the Shoot resource. The second one contains the output of the Infrastructure resource (that might be relevant for the CCM config).\nIn order to support a new control plane provider, you need to write a controller that watches all ControlPlanes with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the Alicloud provider.\nThe control plane controller often deploys resources (e.g. pods/deployments) into the Shoot namespace in the Seed as part of its ControlPlane reconciliation loop. 
Because the namespace contains network policies that per default deny all ingress and egress traffic, the pods may need to have proper labels matching to the selectors of the network policies in order to allow the required network traffic. Otherwise, they won’t be allowed to talk to certain other components (e.g., the kube-apiserver of the shoot). For more information, see NetworkPolicys In Garden, Seed, Shoot Clusters.\nNon-Provider Specific Information Required for Infrastructure Creation Most providers might require further information that is not provider specific but already part of the shoot resource. One example for this is the GCP control plane controller, which needs the Kubernetes version of the shoot cluster (because it already uses the in-tree Kubernetes cloud-controller-manager). As Gardener cannot know which information is required by providers, it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information that is not part of the Infrastructure resource itself.\nReferences and Additional Resources ControlPlane API (Golang Specification) Exemplary Implementation for the Alicloud Provider ","categories":"","description":"","excerpt":"Contract: ControlPlane Resource Most Kubernetes clusters require a …","ref":"/docs/gardener/extensions/controlplane/","tags":"","title":"ControlPlane"},{"body":"Contract: ControlPlane Resource with Purpose exposure Some Kubernetes clusters require additional deployments required by the seed cloud provider in order to work properly, e.g. AWS Load Balancer Readvertiser. Before using ControlPlane resources with purpose exposure, Gardener had different Helm charts for the deployments for the various providers. Now, Gardener commissions an external, provider-specific controller to take over this task.\nWhich control plane resources are required? 
As mentioned in the controlplane document, Gardener shall not deploy any other provider-specific component. Instead, it creates a ControlPlane CRD with purpose exposure that should be picked up by provider extensions. Its purpose is to trigger the deployment of such provider-specific components in the shoot namespace in the seed cluster that are needed to expose the kube-apiserver.\nThe shoot cluster’s kube-apiserver is exposed via a Service of type LoadBalancer from the shoot provider (you may run the control plane of an Azure shoot in a GCP seed). It’s the seed provider extension controller that should act on the ControlPlane resources with purpose exposure.\nIf SNI is enabled, then the Service from above is of type ClusterIP and Gardener will not create ControlPlane resources with purpose exposure.\nWhat needs to be implemented to support a new infrastructure provider? As part of the shoot flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: ControlPlane metadata: name: control-plane-exposure namespace: shoot--foo--bar spec: type: aws purpose: exposure region: europe-west1 secretRef: name: cloudprovider namespace: shoot--foo--bar The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used for the shoot cluster. It is most likely not needed, however, still added for some potential corner cases. If you don’t need it, then just ignore it. The .spec.region contains the region of the seed cluster.\nIn order to support a control plane provider with purpose exposure, you need to write a controller or expand the existing controlplane controller that watches all ControlPlanes with .spec.type=\u003cmy-provider-name\u003e and purpose exposure. 
You can take a look at the below referenced example implementation for the AWS provider.\nNon-Provider Specific Information Required for Infrastructure Creation Most providers might require further information that is not provider specific but already part of the shoot resource. As Gardener cannot know which information is required by providers, it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information.\nReferences and Additional Resources ControlPlane API (Golang Specification) Exemplary Implementation for the AWS Provider AWS Load Balancer Readvertiser ","categories":"","description":"","excerpt":"Contract: ControlPlane Resource with Purpose exposure Some Kubernetes …","ref":"/docs/gardener/extensions/controlplane-exposure/","tags":"","title":"ControlPlane Exposure"},{"body":"ControlPlane Customization Webhooks Gardener creates the Shoot controlplane in several steps of the Shoot flow. At different points of this flow, it:\n Deploys standard controlplane components such as kube-apiserver, kube-controller-manager, and kube-scheduler by creating the corresponding deployments, services, and other resources in the Shoot namespace. Initiates the deployment of custom controlplane components by ControlPlane controllers by creating a ControlPlane resource in the Shoot namespace. In order to apply any provider-specific changes to the configuration provided by Gardener for the standard controlplane components, cloud extension providers can install mutating admission webhooks for the resources created by Gardener in the Shoot namespace.\nWhat needs to be implemented to support a new cloud provider? 
In order to support a new cloud provider, you should install “controlplane” mutating webhooks for any of the following resources:\n Deployment with name kube-apiserver, kube-controller-manager, or kube-scheduler Service with name kube-apiserver OperatingSystemConfig with any name, and purpose reconcile See Contract Specification for more details on the contract that Gardener and webhooks should adhere to regarding the content of the above resources.\nYou can install 2 different kinds of controlplane webhooks:\n Shoot, or controlplane webhooks apply changes needed by the Shoot cloud provider, for example the --cloud-provider command line flag of kube-apiserver and kube-controller-manager. Such webhooks should only operate on Shoot namespaces labeled with shoot.gardener.cloud/provider=\u003cprovider\u003e. Seed, or controlplaneexposure webhooks apply changes needed by the Seed cloud provider, for example annotations on the kube-apiserver service to ensure cloud-specific load balancers are correctly provisioned for a service of type LoadBalancer. Such webhooks should only operate on Shoot namespaces labeled with seed.gardener.cloud/provider=\u003cprovider\u003e. The labels shoot.gardener.cloud/provider and seed.gardener.cloud/provider are added by Gardener when it creates the Shoot namespace.\nThe resources mutated by the “controlplane” mutating webhooks are labeled with provider.extensions.gardener.cloud/mutated-by-controlplane-webhook: true by gardenlet. The provider extensions can add an object selector to their “controlplane” mutating webhooks to not intercept requests for unrelated objects.\nContract Specification This section specifies the contract that Gardener and webhooks should adhere to in order to ensure smooth interoperability. Note that this contract can’t be specified formally and is therefore easy to violate, especially by Gardener. 
The Gardener team will nevertheless do its best to adhere to this contract in the future and to ensure via additional measures (tests, validations) that it’s not unintentionally broken. If it needs to be changed intentionally, this can only happen after proper communication has taken place to ensure that the affected provider webhooks could be adapted to work with the new version of the contract.\n Note: The contract described below may not necessarily be what Gardener does currently (as of May 2019). Rather, it reflects the target state after changes for Gardener extensibility have been introduced.\n kube-apiserver To deploy kube-apiserver, Gardener shall create a deployment and a service both named kube-apiserver in the Shoot namespace. They can be mutated by webhooks to apply any provider-specific changes to the standard configuration provided by Gardener.\nThe pod template of the kube-apiserver deployment shall contain a container named kube-apiserver.\nThe command field of the kube-apiserver container shall contain the kube-apiserver command line. It shall contain a number of provider-independent flags that should be ignored by webhooks, such as:\n admission plugins (--enable-admission-plugins, --disable-admission-plugins) secure communications (--etcd-cafile, --etcd-certfile, --etcd-keyfile, …) audit log (--audit-log-*) ports (--secure-port) The kube-apiserver command line shall not contain any provider-specific flags, such as:\n --cloud-provider --cloud-config These flags can be added by webhooks if needed.\nThe kube-apiserver command line may contain a number of additional provider-independent flags. In general, webhooks should ignore these unless they are known to interfere with the desired kube-apiserver behavior for the specific provider. Among the flags to be considered are:\n --endpoint-reconciler-type --advertise-address --feature-gates Gardener uses SNI to expose the apiserver. 
In this case, Gardener will label the kube-apiserver’s Deployment with core.gardener.cloud/apiserver-exposure: gardener-managed label (deprecated, the label will no longer be added as of v1.80) and expects that the --endpoint-reconciler-type and --advertise-address flags are not modified.\nThe --enable-admission-plugins flag may contain admission plugins that are not compatible with CSI plugins such as PersistentVolumeLabel. Webhooks should therefore ensure that such admission plugins are either explicitly enabled (if CSI plugins are not used) or disabled (otherwise).\nThe env field of the kube-apiserver container shall not contain any provider-specific environment variables (so it will be empty). If any provider-specific environment variables are needed, they should be added by webhooks.\nThe volumes field of the pod template of the kube-apiserver deployment, and respectively the volumeMounts field of the kube-apiserver container shall not contain any provider-specific Secret or ConfigMap resources. If such resources should be mounted as volumes, this should be done by webhooks.\nThe kube-apiserver Service may be of type LoadBalancer, but shall not contain any provider-specific annotations that may be needed to actually provision a load balancer resource in the Seed provider’s cloud. If any such annotations are needed, they should be added by webhooks (typically controlplaneexposure webhooks).\nThe kube-apiserver Service will be of type ClusterIP. In this case, Gardener will label this Service with core.gardener.cloud/apiserver-exposure: gardener-managed label (deprecated, the label will no longer be added as of v1.80) and expects that no mutations happen.\nkube-controller-manager To deploy kube-controller-manager, Gardener shall create a deployment named kube-controller-manager in the Shoot namespace. 
It can be mutated by webhooks to apply any provider-specific changes to the standard configuration provided by Gardener.\nThe pod template of the kube-controller-manager deployment shall contain a container named kube-controller-manager.\nThe command field of the kube-controller-manager container shall contain the kube-controller-manager command line. It shall contain a number of provider-independent flags that should be ignored by webhooks, such as:\n --kubeconfig, --authentication-kubeconfig, --authorization-kubeconfig --leader-elect secure communications (--tls-cert-file, --tls-private-key-file, …) cluster CIDR and identity (--cluster-cidr, --cluster-name) sync settings (--concurrent-deployment-syncs, --concurrent-replicaset-syncs) horizontal pod autoscaler (--horizontal-pod-autoscaler-*) ports (--port, --secure-port) The kube-controller-manager command line shall not contain any provider-specific flags, such as:\n --cloud-provider --cloud-config --configure-cloud-routes --external-cloud-volume-plugin These flags can be added by webhooks if needed.\nThe kube-controller-manager command line may contain a number of additional provider-independent flags. In general, webhooks should ignore these unless they are known to interfere with the desired kube-controller-manager behavior for the specific provider. Among the flags to be considered are:\n --feature-gates The env field of the kube-controller-manager container shall not contain any provider-specific environment variables (so it will be empty). If any provider-specific environment variables are needed, they should be added by webhooks.\nThe volumes field of the pod template of the kube-controller-manager deployment, and respectively the volumeMounts field of the kube-controller-manager container shall not contain any provider-specific Secret or ConfigMap resources. 
If such resources should be mounted as volumes, this should be done by webhooks.\nkube-scheduler To deploy kube-scheduler, Gardener shall create a deployment named kube-scheduler in the Shoot namespace. It can be mutated by webhooks to apply any provider-specific changes to the standard configuration provided by Gardener.\nThe pod template of the kube-scheduler deployment shall contain a container named kube-scheduler.\nThe command field of the kube-scheduler container shall contain the kube-scheduler command line. It shall contain a number of provider-independent flags that should be ignored by webhooks, such as:\n --config --authentication-kubeconfig, --authorization-kubeconfig secure communications (--tls-cert-file, --tls-private-key-file, …) ports (--port, --secure-port) The kube-scheduler command line may contain additional provider-independent flags. In general, webhooks should ignore these unless they are known to interfere with the desired kube-scheduler behavior for the specific provider. Among the flags to be considered are:\n --feature-gates The kube-scheduler command line can’t contain provider-specific flags, and it makes no sense to specify provider-specific environment variables or mount provider-specific Secret or ConfigMap resources as volumes.\netcd-main and etcd-events To deploy etcd, Gardener shall create 2 Etcd resources named etcd-main and etcd-events in the Shoot namespace. They can be mutated by webhooks to apply any provider-specific changes to the standard configuration provided by Gardener.\nGardener shall configure the Etcd resource completely to set up an etcd cluster which uses the default storage class of the seed cluster.\ncloud-controller-manager Gardener shall not deploy a cloud-controller-manager. If it is needed, it should be added by a ControlPlane controller.\nCSI Controllers Gardener shall not deploy a CSI controller. 
If it is needed, it should be added by a ControlPlane controller.\nkubelet To specify the kubelet configuration, Gardener shall create an OperatingSystemConfig resource with any name and purpose reconcile in the Shoot namespace. It can therefore also be mutated by webhooks to apply any provider-specific changes to the standard configuration provided by Gardener. Gardener may write multiple such resources with different type to the same Shoot namespace if multiple OSs are used.\nThe OSC resource shall contain a unit named kubelet.service, containing the corresponding systemd unit configuration file. The [Service] section of this file shall contain a single ExecStart option having the kubelet command line as its value.\nThe OSC resource shall contain a file with path /var/lib/kubelet/config/kubelet, which contains a KubeletConfiguration resource in YAML format. Most of the flags that can be specified in the kubelet command line can alternatively be specified as options in this configuration as well.\nThe kubelet command line shall contain a number of provider-independent flags that should be ignored by webhooks, such as:\n --config --bootstrap-kubeconfig, --kubeconfig --network-plugin (and, if it equals cni, also --cni-bin-dir and --cni-conf-dir) --node-labels The kubelet command line shall not contain any provider-specific flags, such as:\n --cloud-provider --cloud-config --provider-id These flags can be added by webhooks if needed.\nThe kubelet command line / configuration may contain a number of additional provider-independent flags / options. In general, webhooks should ignore these unless they are known to interfere with the desired kubelet behavior for the specific provider. Among the flags / options to be considered are:\n --enable-controller-attach-detach (enableControllerAttachDetach) - should be set to true if CSI plugins are used, but in general can also be ignored since its default value is also true, and this should work both with and without CSI plugins. 
--feature-gates (featureGates) - should contain a list of specific feature gates if CSI plugins are used. If CSI plugins are not used, the corresponding feature gates can be ignored since enabling them should not cause harm in any way. ","categories":"","description":"","excerpt":"ControlPlane Customization Webhooks Gardener creates the Shoot …","ref":"/docs/gardener/extensions/controlplane-webhooks/","tags":"","title":"ControlPlane Webhooks"},{"body":"General Conventions All the extensions that are registered to Gardener are deployed to the seed clusters on which they are required (also see ControllerRegistration).\nSome of these extensions might need to create global resources in the seed (e.g., ClusterRoles), i.e., it’s important to have a naming scheme to avoid conflicts as it cannot be checked or validated upfront that two extensions don’t use the same names.\nConsequently, this page should help answer some general questions that might come up when it comes to developing an extension.\nPriorityClasses Extensions are not supposed to create and use self-defined PriorityClasses. Instead, they can and should rely on well-known PriorityClasses managed by gardenlet.\nHigh Availability of Deployed Components Extensions might deploy components via Deployments, StatefulSets, etc., as part of the shoot control plane, or the seed or shoot system components. In case a seed or shoot cluster is highly available, there are various failure tolerance types. For more information, see Highly Available Shoot Control Plane. Accordingly, the replicas, topologySpreadConstraints or affinity settings of the deployed components might need to be adapted.\nInstead of doing this one-by-one for each and every component, extensions can rely on a mutating webhook provided by Gardener. Please refer to High Availability of Deployed Components for details.\nTo reduce costs and to improve the network traffic latency in multi-zone clusters, extensions can make a Service topology-aware. 
Please refer to this document for details.\nIs there a naming scheme for (global) resources? As there is no formal process to validate non-existence of conflicts between two extensions, please follow these naming schemes when creating resources (especially, when creating global resources, but it’s in general a good idea for most created resources):\nThe resource name should be prefixed with extensions.gardener.cloud:\u003cextension-type\u003e-\u003cextension-name\u003e:\u003cresource-name\u003e, for example:\n extensions.gardener.cloud:provider-aws:some-controller-manager extensions.gardener.cloud:extension-certificate-service:cert-broker How to create resources in the shoot cluster? Some extensions might not only create resources in the seed cluster itself but also in the shoot cluster. Usually, every extension comes with a ServiceAccount and the required RBAC permissions when it gets installed to the seed. However, there are no credentials for the shoot for every extension.\nExtensions are supposed to use ManagedResources to manage resources in shoot clusters. gardenlet deploys gardener-resource-manager instances into all shoot control planes, that will reconcile ManagedResources without a specified class (spec.class=null) in shoot clusters. Mind that Gardener acts on ManagedResources with the origin=gardener label. In order to prevent unwanted behavior, extensions should omit the origin label or provide their own unique value for it when creating such resources.\nIf you need to deploy a non-DaemonSet resource, Gardener automatically ensures that it only runs on nodes that are allowed to host system components and extensions. For more information, see System Components Webhook.\nHow to create kubeconfigs for the shoot cluster? Historically, Gardener extensions used to generate kubeconfigs with client certificates for components they deploy into the shoot control plane. For this, they reused the shoot cluster CA secret (ca) to issue new client certificates. 
With gardener/gardener#4661 we moved away from using client certificates in favor of short-lived, auto-rotated ServiceAccount tokens. These tokens are managed by gardener-resource-manager’s TokenRequestor. Extensions are supposed to reuse this mechanism for requesting tokens and a generic-token-kubeconfig for authenticating against shoot clusters.\nWith GEP-18 (Shoot cluster CA rotation), a dedicated CA will be used for signing client certificates (gardener/gardener#5779) which will be rotated when triggered by the shoot owner. With this, extensions cannot reuse the ca secret anymore to issue client certificates. Hence, extensions must switch to short-lived ServiceAccount tokens in order to support the CA rotation feature.\nThe generic-token-kubeconfig secret contains the CA bundle for establishing trust to shoot API servers. However, as the secret is immutable, its name changes with the rotation of the cluster CA. Extensions need to look up the generic-token-kubeconfig.secret.gardener.cloud/name annotation on the respective Cluster object in order to determine which secret contains the current CA bundle. The helper function extensionscontroller.GenericTokenKubeconfigSecretNameFromCluster can be used for this task.\nYou can take a look at CA Rotation in Extensions for more details on the CA rotation feature in regard to extensions.\nHow to create certificates for the shoot cluster? Gardener creates several certificate authorities (CA) that are used to create server certificates for various components. For example, the shoot’s etcd has its own CA, the kube-aggregator has its own CA as well, and both are different to the actual cluster’s CA.\nWith GEP-18 (Shoot cluster CA rotation), extensions are required to do the same and generate dedicated CAs for their components (e.g. for signing a server certificate for cloud-controller-manager). 
They must not depend on the CA secrets managed by gardenlet.\nPlease see CA Rotation in Extensions for the exact requirements that extensions need to fulfill in order to support the CA rotation feature.\nHow to enforce a Pod Security Standard for extension namespaces? The pod-security.kubernetes.io/enforce namespace label enforces the Pod Security Standards.\nYou can set the pod-security.kubernetes.io/enforce label for extension namespaces by adding the security.gardener.cloud/pod-security-enforce annotation to your ControllerRegistration. The value of the annotation would be the value set for the pod-security.kubernetes.io/enforce label. It is advised to set the annotation with the most restrictive pod security standard that your extension pods comply with.\nIf you are using the ./hack/generate-controller-registration.sh script to generate your ControllerRegistration, you can use the -e, --pod-security-enforce option to set the security.gardener.cloud/pod-security-enforce annotation. If the option is not set, it defaults to baseline.\n","categories":"","description":"","excerpt":"General Conventions All the extensions that are registered to Gardener …","ref":"/docs/gardener/extensions/conventions/","tags":"","title":"Conventions"},{"body":"Packages:\n core.gardener.cloud/v1beta1 core.gardener.cloud/v1beta1 Package v1beta1 is a version of the API.\nResource Types: BackupBucket BackupEntry CloudProfile ControllerDeployment ControllerInstallation ControllerRegistration ExposureClass InternalSecret NamespacedCloudProfile Project Quota SecretBinding Seed Shoot ShootState BackupBucket BackupBucket holds details about backup bucket\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string BackupBucket metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec BackupBucketSpec Specification of the Backup Bucket.\n provider BackupBucketProvider Provider holds the details of cloud provider of the object store. This field is immutable.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to BackupBucket resource.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n seedName string (Optional) SeedName holds the name of the seed allocated to BackupBucket for running controller. This field is immutable.\n status BackupBucketStatus Most recently observed status of the Backup Bucket.\n BackupEntry BackupEntry holds details about shoot backup.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string BackupEntry metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec BackupEntrySpec (Optional) Spec contains the specification of the Backup Entry.\n bucketName string BucketName is the name of backup bucket for this Backup Entry.\n seedName string (Optional) SeedName holds the name of the seed to which this BackupEntry is scheduled\n status BackupEntryStatus (Optional) Status contains the most recently observed status of the Backup Entry.\n CloudProfile CloudProfile represents certain properties about a provider environment.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string CloudProfile metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec CloudProfileSpec (Optional) Spec defines the provider environment properties.\n caBundle string (Optional) CABundle is a certificate bundle which will be installed onto every host machine of shoot cluster targeting this profile.\n kubernetes KubernetesSettings Kubernetes contains constraints regarding allowed values of the ‘kubernetes’ block in the Shoot specification.\n machineImages []MachineImage MachineImages contains constraints regarding allowed values for machine images in the Shoot specification.\n machineTypes []MachineType MachineTypes contains constraints regarding allowed values for machine types in the ‘workers’ block in the Shoot specification.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig contains provider-specific configuration for the profile.\n regions []Region Regions contains constraints regarding allowed values for regions and zones.\n seedSelector SeedSelector (Optional) SeedSelector contains an optional list of labels on Seed resources that marks those seeds whose shoots may use this provider profile. An empty list means that all seeds of the same provider type are supported. This is useful for environments that are of the same type (like openstack) but may have different “instances”/landscapes. Optionally a list of possible providers can be added to enable cross-provider scheduling. 
By default, the provider type of the seed must match the shoot’s provider.\n type string Type is the name of the provider.\n volumeTypes []VolumeType (Optional) VolumeTypes contains constraints regarding allowed values for volume types in the ‘workers’ block in the Shoot specification.\n bastion Bastion (Optional) Bastion contains the machine and image properties\n ControllerDeployment ControllerDeployment contains information about how this controller is deployed.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string ControllerDeployment metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. type string Type is the deployment type.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension ProviderConfig contains type-specific configuration. It contains assets that deploy the controller.\n ControllerInstallation ControllerInstallation represents an installation request for an external controller.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string ControllerInstallation metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ControllerInstallationSpec Spec contains the specification of this installation. If the object’s deletion timestamp is set, this field is immutable.\n registrationRef Kubernetes core/v1.ObjectReference RegistrationRef is used to reference a ControllerRegistration resource. The name field of the RegistrationRef is immutable.\n seedRef Kubernetes core/v1.ObjectReference SeedRef is used to reference a Seed resource. 
The name field of the SeedRef is immutable.\n deploymentRef Kubernetes core/v1.ObjectReference (Optional) DeploymentRef is used to reference a ControllerDeployment resource.\n status ControllerInstallationStatus Status contains the status of this installation.\n ControllerRegistration ControllerRegistration represents a registration of an external controller.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string ControllerRegistration metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ControllerRegistrationSpec Spec contains the specification of this registration. If the object’s deletion timestamp is set, this field is immutable.\n resources []ControllerResource (Optional) Resources is a list of combinations of kinds (DNSProvider, Infrastructure, Generic, …) and their actual types (aws-route53, gcp, auditlog, …).\n deployment ControllerRegistrationDeployment (Optional) Deployment contains information for how this controller is deployed.\n ExposureClass ExposureClass represents a control plane endpoint exposure strategy.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string ExposureClass metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. handler string Handler is the name of the handler which applies the control plane endpoint exposure strategy. This field is immutable.\n scheduling ExposureClassScheduling (Optional) Scheduling holds information how to select applicable Seed’s for ExposureClass usage. This field is immutable.\n InternalSecret InternalSecret holds secret data of a certain type. 
The total bytes of the values in the Data field must be less than MaxSecretSize bytes.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string InternalSecret metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object’s metadata. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata\nRefer to the Kubernetes API documentation for the fields of the metadata field. immutable bool (Optional) Immutable, if set to true, ensures that data stored in the Secret cannot be updated (only object metadata can be modified). If not set to true, the field can be modified at any time. Defaulted to nil.\n data map[string][]byte (Optional) Data contains the secret data. Each key must consist of alphanumeric characters, ‘-’, ‘_’ or ‘.’. The serialized form of the secret data is a base64 encoded string, representing the arbitrary (possibly non-string) data value here. Described in https://tools.ietf.org/html/rfc4648#section-4\n stringData map[string]string (Optional) stringData allows specifying non-binary secret data in string form. It is provided as a write-only input field for convenience. All keys and values are merged into the data field on write, overwriting any existing values. The stringData field is never output when reading from the API.\n type Kubernetes core/v1.SecretType (Optional) Used to facilitate programmatic handling of secret data. More info: https://kubernetes.io/docs/concepts/configuration/secret/#secret-types\n NamespacedCloudProfile NamespacedCloudProfile represents certain properties about a provider environment.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string NamespacedCloudProfile metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec NamespacedCloudProfileSpec Spec defines the provider environment properties.\n caBundle string (Optional) CABundle is a certificate bundle which will be installed onto every host machine of shoot cluster targeting this profile.\n kubernetes KubernetesSettings (Optional) Kubernetes contains constraints regarding allowed values of the ‘kubernetes’ block in the Shoot specification.\n machineImages []MachineImage (Optional) MachineImages contains constraints regarding allowed values for machine images in the Shoot specification.\n machineTypes []MachineType (Optional) MachineTypes contains constraints regarding allowed values for machine types in the ‘workers’ block in the Shoot specification.\n volumeTypes []VolumeType (Optional) VolumeTypes contains constraints regarding allowed values for volume types in the ‘workers’ block in the Shoot specification.\n parent CloudProfileReference Parent contains a reference to a CloudProfile it inherits from.\n status NamespacedCloudProfileStatus Most recently observed status of the NamespacedCloudProfile.\n Project Project holds certain properties about a Gardener project.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string Project metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ProjectSpec (Optional) Spec defines the project properties.\n createdBy Kubernetes rbac/v1.Subject (Optional) CreatedBy is a subject representing a user name, an email address, or any other identifier of a user who created the project. This field is immutable.\n description string (Optional) Description is a human-readable description of what the project is used for.\n owner Kubernetes rbac/v1.Subject (Optional) Owner is a subject representing a user name, an email address, or any other identifier of a user owning the project. 
IMPORTANT: Be aware that this field will be removed in the v1 version of this API in favor of the owner role. The only way to change the owner will be by moving the owner role. In this API version the only way to change the owner is to use this field. TODO: Remove this field in favor of the owner role in v1.\n purpose string (Optional) Purpose is a human-readable explanation of the project’s purpose.\n members []ProjectMember (Optional) Members is a list of subjects representing a user name, an email address, or any other identifier of a user, group, or service account that has a certain role.\n namespace string (Optional) Namespace is the name of the namespace that has been created for the Project object. A nil value means that Gardener will determine the name of the namespace. This field is immutable.\n tolerations ProjectTolerations (Optional) Tolerations contains the tolerations for taints on seed clusters.\n dualApprovalForDeletion []DualApprovalForDeletion (Optional) DualApprovalForDeletion contains configuration for the dual approval concept for resource deletion.\n status ProjectStatus (Optional) Most recently observed status of the Project.\n Quota Quota represents a quota on resources consumed by shoot clusters either per project or per provider secret.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string Quota metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec QuotaSpec (Optional) Spec defines the Quota constraints.\n clusterLifetimeDays int32 (Optional) ClusterLifetimeDays is the lifetime of a Shoot cluster in days before it will be terminated automatically.\n metrics Kubernetes core/v1.ResourceList Metrics is a list of resources which will be put under constraints.\n scope Kubernetes core/v1.ObjectReference Scope is the scope of the Quota object, either ‘project’, ‘secret’ or ‘workloadidentity’. 
This field is immutable.\n SecretBinding SecretBinding represents a binding to a secret in the same or another namespace.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string SecretBinding metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret object in the same or another namespace. This field is immutable.\n quotas []Kubernetes core/v1.ObjectReference (Optional) Quotas is a list of references to Quota objects in the same or another namespace. This field is immutable.\n provider SecretBindingProvider (Optional) Provider defines the provider type of the SecretBinding. This field is immutable.\n Seed Seed represents an installation request for an external controller.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string Seed metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec SeedSpec Spec contains the specification of this installation.\n backup SeedBackup (Optional) Backup holds the object store configuration for the backups of shoot (currently only etcd). If it is not specified, then there won’t be any backups taken for shoots associated with this seed. 
If backup field is present in seed, then backups of the etcd from shoot control plane will be stored under the configured object store.\n dns SeedDNS DNS contains DNS-relevant information about this seed cluster.\n networks SeedNetworks Networks defines the pod, service and worker network of the Seed cluster.\n provider SeedProvider Provider defines the provider type and region for this Seed cluster.\n taints []SeedTaint (Optional) Taints describes taints on the seed.\n volume SeedVolume (Optional) Volume contains settings for persistentvolumes created in the seed cluster.\n settings SeedSettings (Optional) Settings contains certain settings for this seed cluster.\n ingress Ingress (Optional) Ingress configures Ingress specific settings of the Seed cluster. This field is immutable.\n status SeedStatus Status contains the status of this installation.\n Shoot Shoot represents a Shoot cluster created and managed by Gardener.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string Shoot metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ShootSpec (Optional) Specification of the Shoot cluster. If the object’s deletion timestamp is set, this field is immutable.\n addons Addons (Optional) Addons contains information about enabled/disabled addons and their configuration.\n cloudProfileName string (Optional) CloudProfileName is a name of a CloudProfile object. 
This field will be deprecated soon; use CloudProfile instead.\n dns DNS (Optional) DNS contains information about the DNS settings of the Shoot.\n extensions []Extension (Optional) Extensions contain type and provider information for Shoot extensions.\n hibernation Hibernation (Optional) Hibernation contains information about whether the Shoot is suspended or not.\n kubernetes Kubernetes Kubernetes contains the version and configuration settings of the control plane components.\n networking Networking (Optional) Networking contains information about cluster networking such as CNI Plugin type, CIDRs, …etc.\n maintenance Maintenance (Optional) Maintenance contains information about the time window for maintenance operations and which operations should be performed.\n monitoring Monitoring (Optional) Monitoring contains information about custom monitoring configurations for the shoot.\n provider Provider Provider contains all provider-specific and provider-relevant information.\n purpose ShootPurpose (Optional) Purpose is the purpose class for this cluster.\n region string Region is a name of a region. This field is immutable.\n secretBindingName string (Optional) SecretBindingName is the name of a SecretBinding that has a reference to the provider secret. The credentials inside the provider secret will be used to create the shoot in the respective account. The field is mutually exclusive with CredentialsBindingName. 
This field is immutable.\n seedName string (Optional) SeedName is the name of the seed cluster that runs the control plane of the Shoot.\n seedSelector SeedSelector (Optional) SeedSelector is an optional selector which must match a seed’s labels for the shoot to be scheduled on that seed.\n resources []NamedResourceReference (Optional) Resources holds a list of named resource references that can be referred to in extension configs by their names.\n tolerations []Toleration (Optional) Tolerations contains the tolerations for taints on seed clusters.\n exposureClassName string (Optional) ExposureClassName is the optional name of an exposure class to apply a control plane endpoint exposure strategy. This field is immutable.\n systemComponents SystemComponents (Optional) SystemComponents contains the settings of system components in the control or data plane of the Shoot cluster.\n controlPlane ControlPlane (Optional) ControlPlane contains general settings for the control plane of the shoot.\n schedulerName string (Optional) SchedulerName is the name of the responsible scheduler which schedules the shoot. If not specified, the default scheduler takes over. This field is immutable.\n cloudProfile CloudProfileReference (Optional) CloudProfile contains a reference to a CloudProfile or a NamespacedCloudProfile.\n credentialsBindingName string (Optional) CredentialsBindingName is the name of a CredentialsBinding that has a reference to the provider credentials. The credentials will be used to create the shoot in the respective account. 
The field is mutually exclusive with SecretBindingName.\n status ShootStatus (Optional) Most recently observed status of the Shoot cluster.\n ShootState ShootState contains a snapshot of the Shoot’s state required to migrate the Shoot’s control plane to a new Seed.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string ShootState metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ShootStateSpec (Optional) Specification of the ShootState.\n gardener []GardenerResourceData (Optional) Gardener holds the data required to generate resources deployed by the gardenlet\n extensions []ExtensionResourceState (Optional) Extensions holds the state of custom resources reconciled by extension controllers in the seed\n resources []ResourceData (Optional) Resources holds the data of resources referred to by extension controller states\n APIServerLogging (Appears on: KubeAPIServerConfig) APIServerLogging contains configuration for the logs level and http access logs\n Field Description verbosity int32 (Optional) Verbosity is the kube-apiserver log verbosity level Defaults to 2.\n httpAccessVerbosity int32 (Optional) HTTPAccessVerbosity is the kube-apiserver access logs level\n APIServerRequests (Appears on: KubeAPIServerConfig) APIServerRequests contains configuration for request-specific settings for the kube-apiserver.\n Field Description maxNonMutatingInflight int32 (Optional) MaxNonMutatingInflight is the maximum number of non-mutating requests in flight at a given time. When the server exceeds this, it rejects requests.\n maxMutatingInflight int32 (Optional) MaxMutatingInflight is the maximum number of mutating requests in flight at a given time. 
When the server exceeds this, it rejects requests.\n Addon (Appears on: KubernetesDashboard, NginxIngress) Addon allows enabling or disabling a specific addon and is used to derive from.\n Field Description enabled bool Enabled indicates whether the addon is enabled or not.\n Addons (Appears on: ShootSpec) Addons is a collection of configuration for specific addons which are managed by the Gardener.\n Field Description kubernetesDashboard KubernetesDashboard (Optional) KubernetesDashboard holds configuration settings for the kubernetes dashboard addon.\n nginxIngress NginxIngress (Optional) NginxIngress holds configuration settings for the nginx-ingress addon.\n AdmissionPlugin (Appears on: KubeAPIServerConfig) AdmissionPlugin contains information about a specific admission plugin and its corresponding configuration.\n Field Description name string Name is the name of the plugin.\n config k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) Config is the configuration of the plugin.\n disabled bool (Optional) Disabled specifies whether this plugin should be disabled.\n kubeconfigSecretName string (Optional) KubeconfigSecretName specifies the name of a secret containing the kubeconfig for this admission plugin.\n Alerting (Appears on: Monitoring) Alerting contains information about how alerting will be done (i.e. 
who will receive alerts and how).\n Field Description emailReceivers []string (Optional) MonitoringEmailReceivers is a list of recipients for alerts\n AuditConfig (Appears on: KubeAPIServerConfig) AuditConfig contains settings for audit of the api server\n Field Description auditPolicy AuditPolicy (Optional) AuditPolicy contains configuration settings for audit policy of the kube-apiserver.\n AuditPolicy (Appears on: AuditConfig) AuditPolicy contains audit policy for kube-apiserver\n Field Description configMapRef Kubernetes core/v1.ObjectReference (Optional) ConfigMapRef is a reference to a ConfigMap object in the same namespace, which contains the audit policy for the kube-apiserver.\n AvailabilityZone (Appears on: Region) AvailabilityZone is an availability zone.\n Field Description name string Name is an availability zone name.\n unavailableMachineTypes []string (Optional) UnavailableMachineTypes is a list of machine type names that are not available in this zone.\n unavailableVolumeTypes []string (Optional) UnavailableVolumeTypes is a list of volume type names that are not available in this zone.\n BackupBucketProvider (Appears on: BackupBucketSpec) BackupBucketProvider holds the details of the cloud provider of the object store.\n Field Description type string Type is the type of provider.\n region string Region is the region of the bucket.\n BackupBucketSpec (Appears on: BackupBucket) BackupBucketSpec is the specification of a Backup Bucket.\n Field Description provider BackupBucketProvider Provider holds the details of the cloud provider of the object store. 
This field is immutable.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to BackupBucket resource.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n seedName string (Optional) SeedName holds the name of the seed allocated to BackupBucket for running controller. This field is immutable.\n BackupBucketStatus (Appears on: BackupBucket) BackupBucketStatus holds the most recently observed status of the Backup Bucket.\n Field Description providerStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderStatus is the configuration passed to BackupBucket resource.\n lastOperation LastOperation (Optional) LastOperation holds information about the last operation on the BackupBucket.\n lastError LastError (Optional) LastError holds information about the last occurred error during an operation.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this BackupBucket. 
It corresponds to the BackupBucket’s generation, which is updated on mutation by the API Server.\n generatedSecretRef Kubernetes core/v1.SecretReference (Optional) GeneratedSecretRef is a reference to the secret generated by the backup bucket, which will have object store specific credentials.\n BackupEntrySpec (Appears on: BackupEntry) BackupEntrySpec is the specification of a Backup Entry.\n Field Description bucketName string BucketName is the name of the backup bucket for this Backup Entry.\n seedName string (Optional) SeedName holds the name of the seed to which this BackupEntry is scheduled\n BackupEntryStatus (Appears on: BackupEntry) BackupEntryStatus holds the most recently observed status of the Backup Entry.\n Field Description lastOperation LastOperation (Optional) LastOperation holds information about the last operation on the BackupEntry.\n lastError LastError (Optional) LastError holds information about the last error that occurred during an operation.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this BackupEntry. It corresponds to the BackupEntry’s generation, which is updated on mutation by the API Server.\n seedName string (Optional) SeedName is the name of the seed to which this BackupEntry is currently scheduled. This field is populated at the beginning of a create/reconcile operation. 
It is used when moving the BackupEntry between seeds.\n migrationStartTime Kubernetes meta/v1.Time (Optional) MigrationStartTime is the time when a migration to a different seed was initiated.\n Bastion (Appears on: CloudProfileSpec) Bastion contains the bastions creation info\n Field Description machineImage BastionMachineImage (Optional) MachineImage contains the bastions machine image properties\n machineType BastionMachineType (Optional) MachineType contains the bastions machine type properties\n BastionMachineImage (Appears on: Bastion) BastionMachineImage contains the bastions machine image properties\n Field Description name string Name of the machine image\n version string (Optional) Version of the machine image\n BastionMachineType (Appears on: Bastion) BastionMachineType contains the bastions machine type properties\n Field Description name string Name of the machine type\n CARotation (Appears on: ShootCredentialsRotation) CARotation contains information about the certificate authority credential rotation.\n Field Description phase CredentialsRotationPhase Phase describes the phase of the certificate authority credential rotation.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the certificate authority credential rotation was successfully completed.\n lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the certificate authority credential rotation was initiated.\n lastInitiationFinishedTime Kubernetes meta/v1.Time (Optional) LastInitiationFinishedTime is the recent time when the certificate authority credential rotation initiation was completed.\n lastCompletionTriggeredTime Kubernetes meta/v1.Time (Optional) LastCompletionTriggeredTime is the recent time when the certificate authority credential rotation completion was triggered.\n CRI (Appears on: MachineImageVersion, Worker) CRI contains information about the Container Runtimes.\n Field 
Description name CRIName The name of the CRI library. Supported values are containerd.\n containerRuntimes []ContainerRuntime (Optional) ContainerRuntimes is the list of the required container runtimes supported for a worker pool.\n CRIName (string alias)\n (Appears on: CRI) CRIName is a type alias for the CRI name string.\nCloudProfileReference (Appears on: NamespacedCloudProfileSpec, ShootSpec) CloudProfileReference holds the information about a CloudProfile or a NamespacedCloudProfile.\n Field Description kind string Kind contains a CloudProfile kind.\n name string Name contains the name of the referenced CloudProfile.\n CloudProfileSpec (Appears on: CloudProfile, NamespacedCloudProfileStatus) CloudProfileSpec is the specification of a CloudProfile. It must contain exactly one of its defined keys.\n Field Description caBundle string (Optional) CABundle is a certificate bundle which will be installed onto every host machine of shoot cluster targeting this profile.\n kubernetes KubernetesSettings Kubernetes contains constraints regarding allowed values of the ‘kubernetes’ block in the Shoot specification.\n machineImages []MachineImage MachineImages contains constraints regarding allowed values for machine images in the Shoot specification.\n machineTypes []MachineType MachineTypes contains constraints regarding allowed values for machine types in the ‘workers’ block in the Shoot specification.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig contains provider-specific configuration for the profile.\n regions []Region Regions contains constraints regarding allowed values for regions and zones.\n seedSelector SeedSelector (Optional) SeedSelector contains an optional list of labels on Seed resources that marks those seeds whose shoots may use this provider profile. An empty list means that all seeds of the same provider type are supported. 
This is useful for environments that are of the same type (like openstack) but may have different “instances”/landscapes. Optionally a list of possible providers can be added to enable cross-provider scheduling. By default, the provider type of the seed must match the shoot’s provider.\n type string Type is the name of the provider.\n volumeTypes []VolumeType (Optional) VolumeTypes contains constraints regarding allowed values for volume types in the ‘workers’ block in the Shoot specification.\n bastion Bastion (Optional) Bastion contains the machine and image properties\n ClusterAutoscaler (Appears on: Kubernetes) ClusterAutoscaler contains the configuration flags for the Kubernetes cluster autoscaler.\n Field Description scaleDownDelayAfterAdd Kubernetes meta/v1.Duration (Optional) ScaleDownDelayAfterAdd defines how long after scale up that scale down evaluation resumes (default: 1 hour).\n scaleDownDelayAfterDelete Kubernetes meta/v1.Duration (Optional) ScaleDownDelayAfterDelete defines how long after node deletion that scale down evaluation resumes, defaults to scanInterval (default: 0 secs).\n scaleDownDelayAfterFailure Kubernetes meta/v1.Duration (Optional) ScaleDownDelayAfterFailure defines how long after scale down failure that scale down evaluation resumes (default: 3 mins).\n scaleDownUnneededTime Kubernetes meta/v1.Duration (Optional) ScaleDownUnneededTime defines how long a node should be unneeded before it is eligible for scale down (default: 30 mins).\n scaleDownUtilizationThreshold float64 (Optional) ScaleDownUtilizationThreshold defines the threshold in fraction (0.0 - 1.0) under which a node is being removed (default: 0.5).\n scanInterval Kubernetes meta/v1.Duration (Optional) ScanInterval defines how often the cluster is reevaluated for scale up or down (default: 10 secs).\n expander ExpanderMode (Optional) Expander defines the algorithm to use during scale up (default: least-waste). 
See: https://github.com/gardener/autoscaler/blob/machine-controller-manager-provider/cluster-autoscaler/FAQ.md#what-are-expanders.\n maxNodeProvisionTime Kubernetes meta/v1.Duration (Optional) MaxNodeProvisionTime defines how long CA waits for node to be provisioned (default: 20 mins).\n maxGracefulTerminationSeconds int32 (Optional) MaxGracefulTerminationSeconds is the number of seconds CA waits for pod termination when trying to scale down a node (default: 600).\n ignoreTaints []string (Optional) IgnoreTaints specifies a list of taint keys to ignore in node templates when considering to scale a node group.\n newPodScaleUpDelay Kubernetes meta/v1.Duration (Optional) NewPodScaleUpDelay specifies how long CA should ignore newly created pods before they have to be considered for scale-up (default: 0s).\n maxEmptyBulkDelete int32 (Optional) MaxEmptyBulkDelete specifies the maximum number of empty nodes that can be deleted at the same time (default: 10).\n ignoreDaemonsetsUtilization bool (Optional) IgnoreDaemonsetsUtilization allows CA to ignore DaemonSet pods when calculating resource utilization for scaling down (default: false).\n verbosity int32 (Optional) Verbosity allows CA to modify its log level (default: 2).\n ClusterAutoscalerOptions (Appears on: Worker) ClusterAutoscalerOptions contains the cluster autoscaler configurations for a worker pool.\n Field Description scaleDownUtilizationThreshold float64 (Optional) ScaleDownUtilizationThreshold defines the threshold in fraction (0.0 - 1.0) under which a node is being removed.\n scaleDownGpuUtilizationThreshold float64 (Optional) ScaleDownGpuUtilizationThreshold defines the threshold in fraction (0.0 - 1.0) of gpu resources under which a node is being removed.\n scaleDownUnneededTime Kubernetes meta/v1.Duration (Optional) ScaleDownUnneededTime defines how long a node should be unneeded before it is eligible for scale down.\n scaleDownUnreadyTime Kubernetes meta/v1.Duration (Optional) ScaleDownUnreadyTime defines 
how long an unready node should be unneeded before it is eligible for scale down.\n maxNodeProvisionTime Kubernetes meta/v1.Duration (Optional) MaxNodeProvisionTime defines how long CA waits for node to be provisioned.\n Condition (Appears on: ControllerInstallationStatus, SeedStatus, ShootStatus) Condition holds the information about the state of a resource.\n Field Description type ConditionType Type of the condition.\n status ConditionStatus Status of the condition, one of True, False, Unknown.\n lastTransitionTime Kubernetes meta/v1.Time Last time the condition transitioned from one status to another.\n lastUpdateTime Kubernetes meta/v1.Time Last time the condition was updated.\n reason string The reason for the condition’s last transition.\n message string A human readable message indicating details about the transition.\n codes []ErrorCode (Optional) Well-defined error codes in case the condition reports a problem.\n ConditionStatus (string alias)\n (Appears on: Condition) ConditionStatus is the status of a condition.\nConditionType (string alias)\n (Appears on: Condition) ConditionType is a string alias.\nContainerRuntime (Appears on: CRI) ContainerRuntime contains information about worker’s available container runtime\n Field Description type string Type is the type of the Container Runtime.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to container runtime resource.\n ControlPlane (Appears on: ShootSpec) ControlPlane holds information about the general settings for the control plane of a shoot.\n Field Description highAvailability HighAvailability (Optional) HighAvailability holds the configuration settings for high availability of the control plane of a shoot.\n ControllerDeploymentPolicy (string alias)\n (Appears on: ControllerRegistrationDeployment) ControllerDeploymentPolicy is a string alias.\nControllerInstallationSpec (Appears on: ControllerInstallation) ControllerInstallationSpec 
is the specification of a ControllerInstallation.\n Field Description registrationRef Kubernetes core/v1.ObjectReference RegistrationRef is used to reference a ControllerRegistration resource. The name field of the RegistrationRef is immutable.\n seedRef Kubernetes core/v1.ObjectReference SeedRef is used to reference a Seed resource. The name field of the SeedRef is immutable.\n deploymentRef Kubernetes core/v1.ObjectReference (Optional) DeploymentRef is used to reference a ControllerDeployment resource.\n ControllerInstallationStatus (Appears on: ControllerInstallation) ControllerInstallationStatus is the status of a ControllerInstallation.\n Field Description conditions []Condition (Optional) Conditions represents the latest available observations of a ControllerInstallation’s current state.\n providerStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderStatus contains type-specific status.\n ControllerRegistrationDeployment (Appears on: ControllerRegistrationSpec) ControllerRegistrationDeployment contains information for how this controller is deployed.\n Field Description policy ControllerDeploymentPolicy (Optional) Policy controls how the controller is deployed. It defaults to ‘OnDemand’.\n seedSelector Kubernetes meta/v1.LabelSelector (Optional) SeedSelector contains an optional label selector for seeds. This controller is considered for a deployment only if the labels match. An empty list means that all seeds are selected.\n deploymentRefs []DeploymentRef (Optional) DeploymentRefs holds references to ControllerDeployments. 
Only one element is supported currently.\n ControllerRegistrationSpec (Appears on: ControllerRegistration) ControllerRegistrationSpec is the specification of a ControllerRegistration.\n Field Description resources []ControllerResource (Optional) Resources is a list of combinations of kinds (DNSProvider, Infrastructure, Generic, …) and their actual types (aws-route53, gcp, auditlog, …).\n deployment ControllerRegistrationDeployment (Optional) Deployment contains information for how this controller is deployed.\n ControllerResource (Appears on: ControllerRegistrationSpec) ControllerResource is a combination of a kind (DNSProvider, Infrastructure, Generic, …) and the actual type for this kind (aws-route53, gcp, auditlog, …).\n Field Description kind string Kind is the resource kind, for example “OperatingSystemConfig”.\n type string Type is the resource type, for example “coreos” or “ubuntu”.\n globallyEnabled bool (Optional) GloballyEnabled determines if this ControllerResource is required by all Shoot clusters. This field is defaulted to false when kind is “Extension”.\n reconcileTimeout Kubernetes meta/v1.Duration (Optional) ReconcileTimeout defines how long Gardener should wait for the resource reconciliation. This field is defaulted to 3m0s when kind is “Extension”.\n primary bool (Optional) Primary determines if the controller backed by this ControllerRegistration is responsible for the extension resource’s lifecycle. This field defaults to true. There must be exactly one primary controller for this kind/type combination. This field is immutable.\n lifecycle ControllerResourceLifecycle (Optional) Lifecycle defines a strategy that determines when different operations on a ControllerResource should be performed. This field is defaulted in the following way when kind is “Extension”. 
Reconcile: “AfterKubeAPIServer” Delete: “BeforeKubeAPIServer” Migrate: “BeforeKubeAPIServer”\n workerlessSupported bool (Optional) WorkerlessSupported specifies whether this ControllerResource supports Workerless Shoot clusters. This field is only relevant when kind is “Extension”.\n ControllerResourceLifecycle (Appears on: ControllerResource) ControllerResourceLifecycle defines the lifecycle of a controller resource.\n Field Description reconcile ControllerResourceLifecycleStrategy (Optional) Reconcile defines the strategy during reconciliation.\n delete ControllerResourceLifecycleStrategy (Optional) Delete defines the strategy during deletion.\n migrate ControllerResourceLifecycleStrategy (Optional) Migrate defines the strategy during migration.\n ControllerResourceLifecycleStrategy (string alias)\n (Appears on: ControllerResourceLifecycle) ControllerResourceLifecycleStrategy is a string alias.\nCoreDNS (Appears on: SystemComponents) CoreDNS contains the settings of the Core DNS components running in the data plane of the Shoot cluster.\n Field Description autoscaling CoreDNSAutoscaling (Optional) Autoscaling contains the settings related to autoscaling of the Core DNS components running in the data plane of the Shoot cluster.\n rewriting CoreDNSRewriting (Optional) Rewriting contains the setting related to rewriting of requests, which are obviously incorrect due to the unnecessary application of the search path.\n CoreDNSAutoscaling (Appears on: CoreDNS) CoreDNSAutoscaling contains the settings related to autoscaling of the Core DNS components running in the data plane of the Shoot cluster.\n Field Description mode CoreDNSAutoscalingMode The mode of the autoscaling to be used for the Core DNS components running in the data plane of the Shoot cluster. 
Supported values are horizontal and cluster-proportional.\n CoreDNSAutoscalingMode (string alias)\n (Appears on: CoreDNSAutoscaling) CoreDNSAutoscalingMode is a type alias for the Core DNS autoscaling mode string.\nCoreDNSRewriting (Appears on: CoreDNS) CoreDNSRewriting contains the setting related to rewriting requests, which are obviously incorrect due to the unnecessary application of the search path.\n Field Description commonSuffixes []string (Optional) CommonSuffixes are expected to be the suffix of a fully qualified domain name. Each suffix should contain at least one or two dots (‘.’) to prevent accidental clashes.\n CredentialsRotationPhase (string alias)\n (Appears on: CARotation, ETCDEncryptionKeyRotation, ServiceAccountKeyRotation) CredentialsRotationPhase is a string alias.\nDNS (Appears on: ShootSpec) DNS holds information about the provider, the hosted zone id and the domain.\n Field Description domain string (Optional) Domain is the external available domain of the Shoot cluster. This domain will be written into the kubeconfig that is handed out to end-users. This field is immutable.\n providers []DNSProvider (Optional) Providers is a list of DNS providers that shall be enabled for this shoot cluster. Only relevant if not a default domain is used.\nDeprecated: Configuring multiple DNS providers is deprecated and will be forbidden in a future release. Please use the DNS extension provider config (e.g. 
shoot-dns-service) for additional providers.\n DNSIncludeExclude (Appears on: DNSProvider) DNSIncludeExclude contains information about which domains shall be included/excluded.\n Field Description include []string (Optional) Include is a list of domains that shall be included.\n exclude []string (Optional) Exclude is a list of domains that shall be excluded.\n DNSProvider (Appears on: DNS) DNSProvider contains information about a DNS provider.\n Field Description domains DNSIncludeExclude (Optional) Domains contains information about which domains shall be included/excluded for this provider.\nDeprecated: This field is deprecated and will be removed in a future release. Please use the DNS extension provider config (e.g. shoot-dns-service) for additional configuration.\n primary bool (Optional) Primary indicates that this DNSProvider is used for shoot related domains.\nDeprecated: This field is deprecated and will be removed in a future release. Please use the DNS extension provider config (e.g. shoot-dns-service) for additional and non-primary providers.\n secretName string (Optional) SecretName is a name of a secret containing credentials for the stated domain and the provider. When not specified, the Gardener will use the cloud provider credentials referenced by the Shoot and try to find respective credentials there (primary provider only). Specifying this field may override this behavior, i.e. forcing the Gardener to only look into the given secret.\n type string (Optional) Type is the DNS provider type.\n zones DNSIncludeExclude (Optional) Zones contains information about which hosted zones shall be included/excluded for this provider.\nDeprecated: This field is deprecated and will be removed in a future release. Please use the DNS extension provider config (e.g. 
shoot-dns-service) for additional configuration.\n DataVolume (Appears on: Worker) DataVolume contains information about a data volume.\n Field Description name string Name of the volume to make it referenceable.\n type string (Optional) Type is the type of the volume.\n size string VolumeSize is the size of the volume.\n encrypted bool (Optional) Encrypted determines if the volume should be encrypted.\n DeploymentRef (Appears on: ControllerRegistrationDeployment) DeploymentRef contains information about ControllerDeployment references.\n Field Description name string Name is the name of the ControllerDeployment that is being referred to.\n DualApprovalForDeletion (Appears on: ProjectSpec) DualApprovalForDeletion contains configuration for the dual approval concept for resource deletion.\n Field Description resource string Resource is the name of the resource this applies to.\n selector Kubernetes meta/v1.LabelSelector Selector is the label selector for the resources.\n includeServiceAccounts bool (Optional) IncludeServiceAccounts specifies whether the concept also applies when deletion is triggered by ServiceAccounts. 
Defaults to true.\n ETCDEncryptionKeyRotation (Appears on: ShootCredentialsRotation) ETCDEncryptionKeyRotation contains information about the ETCD encryption key credential rotation.\n Field Description phase CredentialsRotationPhase Phase describes the phase of the ETCD encryption key credential rotation.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the ETCD encryption key credential rotation was successfully completed.\n lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the ETCD encryption key credential rotation was initiated.\n lastInitiationFinishedTime Kubernetes meta/v1.Time (Optional) LastInitiationFinishedTime is the most recent time when the ETCD encryption key credential rotation initiation was completed.\n lastCompletionTriggeredTime Kubernetes meta/v1.Time (Optional) LastCompletionTriggeredTime is the most recent time when the ETCD encryption key credential rotation completion was triggered.\n EncryptionConfig (Appears on: KubeAPIServerConfig) EncryptionConfig contains customizable encryption configuration of the API server.\n Field Description resources []string Resources contains the list of resources that shall be encrypted in addition to secrets. Each item is a Kubernetes resource name in plural (resource or resource.group) that should be encrypted. Note that configuring a custom resource is only supported for versions \u003e= 1.26. Wildcards are not supported for now. 
See https://github.com/gardener/gardener/blob/master/docs/usage/etcd_encryption_config.md for more details.\n ErrorCode (string alias)\n (Appears on: Condition, LastError) ErrorCode is a string alias.\nExpanderMode (string alias)\n (Appears on: ClusterAutoscaler) ExpanderMode is the type used for Expander values\nExpirableVersion (Appears on: KubernetesSettings, MachineImageVersion) ExpirableVersion contains a version and an expiration date.\n Field Description version string Version is the version identifier.\n expirationDate Kubernetes meta/v1.Time (Optional) ExpirationDate defines the time at which this version expires.\n classification VersionClassification (Optional) Classification defines the state of a version (preview, supported, deprecated)\n ExposureClassScheduling (Appears on: ExposureClass) ExposureClassScheduling holds information to select applicable Seeds for ExposureClass usage.\n Field Description seedSelector SeedSelector (Optional) SeedSelector is an optional label selector for Seeds which are suitable to use the ExposureClass.\n tolerations []Toleration (Optional) Tolerations contains the tolerations for taints on Seed clusters.\n Extension (Appears on: ShootSpec) Extension contains type and provider information for Shoot extensions.\n Field Description type string Type is the type of the extension resource.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to extension resource.\n disabled bool (Optional) Disabled allows disabling extensions that were marked as ‘globally enabled’ by Gardener administrators.\n ExtensionResourceState (Appears on: ShootStateSpec) ExtensionResourceState contains the kind of the extension custom resource and its last observed state in the Shoot’s namespace on the Seed cluster.\n Field Description kind string Kind (type) of the extension custom resource\n name string (Optional) Name of the extension custom resource\n purpose string (Optional) Purpose of 
the extension custom resource\n state k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) State of the extension resource\n resources []NamedResourceReference (Optional) Resources holds a list of named resource references that can be referred to in the state by their names.\n FailureTolerance (Appears on: HighAvailability) FailureTolerance describes information about failure tolerance level of a highly available resource.\n Field Description type FailureToleranceType Type specifies the type of failure that the highly available resource can tolerate\n FailureToleranceType (string alias)\n (Appears on: FailureTolerance) FailureToleranceType specifies the type of failure that a highly available shoot control plane can tolerate.\nGardener (Appears on: SeedStatus, ShootStatus) Gardener holds the information about the Gardener version that operated a resource.\n Field Description id string ID is the container id of the Gardener which last acted on a resource.\n name string Name is the hostname (pod name) of the Gardener which last acted on a resource.\n version string Version is the version of the Gardener which last acted on a resource.\n GardenerResourceData (Appears on: ShootStateSpec) GardenerResourceData holds the data which is used to generate resources, deployed in the Shoot’s control plane.\n Field Description name string Name of the object required to generate resources\n type string Type of the object\n data k8s.io/apimachinery/pkg/runtime.RawExtension Data contains the payload required to generate resources\n labels map[string]string (Optional) Labels are labels of the object\n HelmControllerDeployment HelmControllerDeployment configures how an extension controller is deployed using helm. This is the legacy structure that used to be defined in gardenlet’s ControllerInstallation controller for ControllerDeployment’s with type=helm. 
While this is not a proper API type, we need to define the structure in the API package so that we can convert it to the internal API version in the new representation.\n Field Description chart []byte Chart is a Helm chart tarball.\n values Kubernetes apiextensions/v1.JSON Values is a map of values for the given chart.\n ociRepository OCIRepository (Optional) OCIRepository defines where to pull the chart.\n Hibernation (Appears on: ShootSpec) Hibernation contains information about whether the Shoot is suspended or not.\n Field Description enabled bool (Optional) Enabled specifies whether the Shoot needs to be hibernated or not. If it is true, the Shoot’s desired state is to be hibernated. If it is false or nil, the Shoot’s desired state is to be awakened.\n schedules []HibernationSchedule (Optional) Schedules determine the hibernation schedules.\n HibernationSchedule (Appears on: Hibernation) HibernationSchedule determines the hibernation schedule of a Shoot. A Shoot will be regularly hibernated at each start time and will be woken up at each end time. Start or End can be omitted, though at least one of them has to be specified.\n Field Description start string (Optional) Start is a Cron spec at which time a Shoot will be hibernated.\n end string (Optional) End is a Cron spec at which time a Shoot will be woken up.\n location string (Optional) Location is the time location in which both start and end shall be evaluated.\n HighAvailability (Appears on: ControlPlane) HighAvailability specifies the configuration settings for high availability for a resource. 
Typical usages could be to configure HA for shoot control plane or for seed system components.\n Field Description failureTolerance FailureTolerance FailureTolerance holds information about failure tolerance level of a highly available resource.\n HorizontalPodAutoscalerConfig (Appears on: KubeControllerManagerConfig) HorizontalPodAutoscalerConfig contains horizontal pod autoscaler configuration settings for the kube-controller-manager. Note: Descriptions were taken from the Kubernetes documentation.\n Field Description cpuInitializationPeriod Kubernetes meta/v1.Duration (Optional) The period after which a ready pod transition is considered to be the first.\n downscaleStabilization Kubernetes meta/v1.Duration (Optional) The configurable window at which the controller will choose the highest recommendation for autoscaling.\n initialReadinessDelay Kubernetes meta/v1.Duration (Optional) The configurable period at which the horizontal pod autoscaler considers a Pod “not yet ready” given that it’s unready and it has transitioned to unready during that time.\n syncPeriod Kubernetes meta/v1.Duration (Optional) The period for syncing the number of pods in horizontal pod autoscaler.\n tolerance float64 (Optional) The minimum change (from 1.0) in the desired-to-actual metrics ratio for the horizontal pod autoscaler to consider scaling.\n IPFamily (string alias)\n (Appears on: Networking, SeedNetworks) IPFamily is a type for specifying an IP protocol version to use in Gardener clusters.\nIngress (Appears on: SeedSpec) Ingress configures the Ingress specific settings of the cluster\n Field Description domain string Domain specifies the IngressDomain of the cluster pointing to the ingress controller endpoint. It will be used to construct ingress URLs for system applications running in Shoot/Garden clusters. 
Once set this field is immutable.\n controller IngressController Controller configures a Gardener managed Ingress Controller listening on the ingressDomain\n IngressController (Appears on: Ingress) IngressController enables a Gardener managed Ingress Controller listening on the ingressDomain\n Field Description kind string Kind defines which kind of IngressController to use. At the moment only nginx is supported\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig specifies infrastructure specific configuration for the ingressController\n KubeAPIServerConfig (Appears on: Kubernetes) KubeAPIServerConfig contains configuration settings for the kube-apiserver.\n Field Description KubernetesConfig KubernetesConfig (Members of KubernetesConfig are embedded into this type.) admissionPlugins []AdmissionPlugin (Optional) AdmissionPlugins contains the list of user-defined admission plugins (additional to those managed by Gardener), and, if desired, the corresponding configuration.\n apiAudiences []string (Optional) APIAudiences are the identifiers of the API. The service account token authenticator will validate that tokens used against the API are bound to at least one of these audiences. Defaults to [“kubernetes”].\n auditConfig AuditConfig (Optional) AuditConfig contains configuration settings for the audit of the kube-apiserver.\n oidcConfig OIDCConfig (Optional) OIDCConfig contains configuration settings for the OIDC provider.\n runtimeConfig map[string]bool (Optional) RuntimeConfig contains information about enabled or disabled APIs.\n serviceAccountConfig ServiceAccountConfig (Optional) ServiceAccountConfig contains configuration settings for the service account handling of the kube-apiserver.\n watchCacheSizes WatchCacheSizes (Optional) WatchCacheSizes contains configuration of the API server’s watch cache sizes. 
Configuring these flags might be useful for large-scale Shoot clusters with a lot of parallel update requests and a lot of watching controllers (e.g. large ManagedSeed clusters). When the API server’s watch cache’s capacity is too small to cope with the amount of update requests and watchers for a particular resource, it might happen that controller watches are permanently stopped with too old resource version errors. Starting from kubernetes v1.19, the API server’s watch cache size is adapted dynamically and setting the watch cache size flags will have no effect, except when setting it to 0 (which disables the watch cache).\n requests APIServerRequests (Optional) Requests contains configuration for request-specific settings for the kube-apiserver.\n enableAnonymousAuthentication bool (Optional) EnableAnonymousAuthentication defines whether anonymous requests to the secure port of the API server should be allowed (flag --anonymous-auth). See: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/\n eventTTL Kubernetes meta/v1.Duration (Optional) EventTTL controls the amount of time to retain events. Defaults to 1h.\n logging APIServerLogging (Optional) Logging contains configuration for the log level and HTTP access logs.\n defaultNotReadyTolerationSeconds int64 (Optional) DefaultNotReadyTolerationSeconds indicates the tolerationSeconds of the toleration for notReady:NoExecute that is added by default to every pod that does not already have such a toleration (flag --default-not-ready-toleration-seconds). The field has effect only when the DefaultTolerationSeconds admission plugin is enabled. Defaults to 300.\n defaultUnreachableTolerationSeconds int64 (Optional) DefaultUnreachableTolerationSeconds indicates the tolerationSeconds of the toleration for unreachable:NoExecute that is added by default to every pod that does not already have such a toleration (flag --default-unreachable-toleration-seconds). 
The field has effect only when the DefaultTolerationSeconds admission plugin is enabled. Defaults to 300.\n encryptionConfig EncryptionConfig (Optional) EncryptionConfig contains customizable encryption configuration of the Kube API server.\n structuredAuthentication StructuredAuthentication (Optional) StructuredAuthentication contains configuration settings for structured authentication to the kube-apiserver. This field is only available for Kubernetes v1.30 or later.\n KubeControllerManagerConfig (Appears on: Kubernetes) KubeControllerManagerConfig contains configuration settings for the kube-controller-manager.\n Field Description KubernetesConfig KubernetesConfig (Members of KubernetesConfig are embedded into this type.) horizontalPodAutoscaler HorizontalPodAutoscalerConfig (Optional) HorizontalPodAutoscalerConfig contains horizontal pod autoscaler configuration settings for the kube-controller-manager.\n nodeCIDRMaskSize int32 (Optional) NodeCIDRMaskSize defines the mask size for node cidr in cluster (default is 24). This field is immutable.\n podEvictionTimeout Kubernetes meta/v1.Duration (Optional) PodEvictionTimeout defines the grace period for deleting pods on failed nodes. Defaults to 2m.\nDeprecated: The corresponding kube-controller-manager flag --pod-eviction-timeout is deprecated in favor of the kube-apiserver flags --default-not-ready-toleration-seconds and --default-unreachable-toleration-seconds. The --pod-eviction-timeout flag has no effect when taint-based eviction is enabled. Taint-based eviction has been beta (enabled by default) since Kubernetes 1.13 and GA since Kubernetes 1.18. 
Hence, instead of setting this field, set the spec.kubernetes.kubeAPIServer.defaultNotReadyTolerationSeconds and spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds.\n nodeMonitorGracePeriod Kubernetes meta/v1.Duration (Optional) NodeMonitorGracePeriod defines the grace period before an unresponsive node is marked unhealthy.\n KubeProxyConfig (Appears on: Kubernetes) KubeProxyConfig contains configuration settings for the kube-proxy.\n Field Description KubernetesConfig KubernetesConfig (Members of KubernetesConfig are embedded into this type.) mode ProxyMode (Optional) Mode specifies which proxy mode to use. Defaults to IPTables.\n enabled bool (Optional) Enabled indicates whether kube-proxy should be deployed or not. Depending on the networking extension, switching kube-proxy off might be rejected. Consulting the respective documentation of the used networking extension is recommended before using this field. Defaults to true if not specified.\n KubeSchedulerConfig (Appears on: Kubernetes) KubeSchedulerConfig contains configuration settings for the kube-scheduler.\n Field Description KubernetesConfig KubernetesConfig (Members of KubernetesConfig are embedded into this type.) kubeMaxPDVols string (Optional) KubeMaxPDVols allows configuring the KUBE_MAX_PD_VOLS environment variable for the kube-scheduler. Please find more information here: https://kubernetes.io/docs/concepts/storage/storage-limits/#custom-limits Note that using this field is considered alpha-/experimental-level and is at your own risk. You should be aware of all the side-effects and consequences when changing it.\n profile SchedulingProfile (Optional) Profile configures the scheduling profile for the cluster. 
If not specified, the used profile is “balanced” (provides the default kube-scheduler behavior).\n KubeletConfig (Appears on: Kubernetes, WorkerKubernetes) KubeletConfig contains configuration settings for the kubelet.\n Field Description KubernetesConfig KubernetesConfig (Members of KubernetesConfig are embedded into this type.) cpuCFSQuota bool (Optional) CPUCFSQuota allows you to disable/enable CPU throttling for Pods.\n cpuManagerPolicy string (Optional) CPUManagerPolicy allows to set alternative CPU management policies (default: none).\n evictionHard KubeletConfigEviction (Optional) EvictionHard describes a set of eviction thresholds (e.g. memory.available\u003c1Gi) that if met would trigger a Pod eviction. Default: memory.available: “100Mi/1Gi/5%” nodefs.available: “5%” nodefs.inodesFree: “5%” imagefs.available: “5%” imagefs.inodesFree: “5%”\n evictionMaxPodGracePeriod int32 (Optional) EvictionMaxPodGracePeriod describes the maximum allowed grace period (in seconds) to use when terminating pods in response to a soft eviction threshold being met. Default: 90\n evictionMinimumReclaim KubeletConfigEvictionMinimumReclaim (Optional) EvictionMinimumReclaim configures the amount of resources below the configured eviction threshold that the kubelet attempts to reclaim whenever the kubelet observes resource pressure. Default: 0 for each resource\n evictionPressureTransitionPeriod Kubernetes meta/v1.Duration (Optional) EvictionPressureTransitionPeriod is the duration for which the kubelet has to wait before transitioning out of an eviction pressure condition. Default: 4m0s\n evictionSoft KubeletConfigEviction (Optional) EvictionSoft describes a set of eviction thresholds (e.g. memory.available\u003c1.5Gi) that if met over a corresponding grace period would trigger a Pod eviction. 
Default: memory.available: “200Mi/1.5Gi/10%” nodefs.available: “10%” nodefs.inodesFree: “10%” imagefs.available: “10%” imagefs.inodesFree: “10%”\n evictionSoftGracePeriod KubeletConfigEvictionSoftGracePeriod (Optional) EvictionSoftGracePeriod describes a set of eviction grace periods (e.g. memory.available=1m30s) that correspond to how long a soft eviction threshold must hold before triggering a Pod eviction. Default: memory.available: 1m30s nodefs.available: 1m30s nodefs.inodesFree: 1m30s imagefs.available: 1m30s imagefs.inodesFree: 1m30s\n maxPods int32 (Optional) MaxPods is the maximum number of Pods that are allowed by the Kubelet. Default: 110\n podPidsLimit int64 (Optional) PodPIDsLimit is the maximum number of process IDs per pod allowed by the kubelet.\n failSwapOn bool (Optional) FailSwapOn makes the Kubelet fail to start if swap is enabled on the node. (default true).\n kubeReserved KubeletConfigReserved (Optional) KubeReserved is the configuration for resources reserved for kubernetes node components (mainly kubelet and container runtime). When updating these values, be aware that cgroup resizes may not succeed on active worker nodes. Look for the NodeAllocatableEnforced event to determine if the configuration was applied. Default: cpu=80m,memory=1Gi,pid=20k\n systemReserved KubeletConfigReserved (Optional) SystemReserved is the configuration for resources reserved for system processes not managed by kubernetes (e.g. journald). When updating these values, be aware that cgroup resizes may not succeed on active worker nodes. Look for the NodeAllocatableEnforced event to determine if the configuration was applied.\nDeprecated: Separately configuring resource reservations for system processes is deprecated in Gardener and will be forbidden starting from Kubernetes 1.31. Please merge existing resource reservations into the kubeReserved field. 
TODO(MichaelEischer): Drop this field after support for Kubernetes 1.30 is dropped.\n imageGCHighThresholdPercent int32 (Optional) ImageGCHighThresholdPercent describes the percent of the disk usage which triggers image garbage collection. Default: 50\n imageGCLowThresholdPercent int32 (Optional) ImageGCLowThresholdPercent describes the percent of the disk to which garbage collection attempts to free. Default: 40\n serializeImagePulls bool (Optional) SerializeImagePulls describes whether the images are pulled one at a time. Default: true\n registryPullQPS int32 (Optional) RegistryPullQPS is the limit of registry pulls per second. The value must not be a negative number. Setting it to 0 means no limit. Default: 5\n registryBurst int32 (Optional) RegistryBurst is the maximum size of bursty pulls, temporarily allows pulls to burst to this number, while still not exceeding registryPullQPS. The value must not be a negative number. Only used if registryPullQPS is greater than 0. Default: 10\n seccompDefault bool (Optional) SeccompDefault enables the use of RuntimeDefault as the default seccomp profile for all workloads. This requires the corresponding SeccompDefault feature gate to be enabled as well. This field is only available for Kubernetes v1.25 or later.\n containerLogMaxSize k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) A quantity defines the maximum size of the container log file before it is rotated. For example: “5Mi” or “256Ki”. Default: 100Mi\n containerLogMaxFiles int32 (Optional) Maximum number of container log files that can be present for a container.\n protectKernelDefaults bool (Optional) ProtectKernelDefaults ensures that the kernel tunables are equal to the kubelet defaults. Defaults to true for Kubernetes v1.26 or later.\n streamingConnectionIdleTimeout Kubernetes meta/v1.Duration (Optional) StreamingConnectionIdleTimeout is the maximum time a streaming connection can be idle before the connection is automatically closed. 
This field cannot be set lower than “30s” or greater than “4h”. Default: “4h” for Kubernetes \u003c v1.26. “5m” for Kubernetes \u003e= v1.26.\n memorySwap MemorySwapConfiguration (Optional) MemorySwap configures swap memory available to container workloads.\n KubeletConfigEviction (Appears on: KubeletConfig) KubeletConfigEviction contains kubelet eviction thresholds supporting either a resource.Quantity or a percentage based value.\n Field Description memoryAvailable string (Optional) MemoryAvailable is the threshold for the free memory on the host server.\n imageFSAvailable string (Optional) ImageFSAvailable is the threshold for the free disk space in the imagefs filesystem (docker images and container writable layers).\n imageFSInodesFree string (Optional) ImageFSInodesFree is the threshold for the available inodes in the imagefs filesystem.\n nodeFSAvailable string (Optional) NodeFSAvailable is the threshold for the free disk space in the nodefs filesystem (docker volumes, logs, etc).\n nodeFSInodesFree string (Optional) NodeFSInodesFree is the threshold for the available inodes in the nodefs filesystem.\n KubeletConfigEvictionMinimumReclaim (Appears on: KubeletConfig) KubeletConfigEvictionMinimumReclaim contains configuration for the kubelet eviction minimum reclaim.\n Field Description memoryAvailable k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) MemoryAvailable is the threshold for the memory reclaim on the host server.\n imageFSAvailable k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) ImageFSAvailable is the threshold for the disk space reclaim in the imagefs filesystem (docker images and container writable layers).\n imageFSInodesFree k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) ImageFSInodesFree is the threshold for the inodes reclaim in the imagefs filesystem.\n nodeFSAvailable k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) NodeFSAvailable is the threshold for the disk space reclaim in the nodefs filesystem 
(docker volumes, logs, etc).\n nodeFSInodesFree k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) NodeFSInodesFree is the threshold for the inodes reclaim in the nodefs filesystem.\n KubeletConfigEvictionSoftGracePeriod (Appears on: KubeletConfig) KubeletConfigEvictionSoftGracePeriod contains grace periods for kubelet eviction thresholds.\n Field Description memoryAvailable Kubernetes meta/v1.Duration (Optional) MemoryAvailable is the grace period for the MemoryAvailable eviction threshold.\n imageFSAvailable Kubernetes meta/v1.Duration (Optional) ImageFSAvailable is the grace period for the ImageFSAvailable eviction threshold.\n imageFSInodesFree Kubernetes meta/v1.Duration (Optional) ImageFSInodesFree is the grace period for the ImageFSInodesFree eviction threshold.\n nodeFSAvailable Kubernetes meta/v1.Duration (Optional) NodeFSAvailable is the grace period for the NodeFSAvailable eviction threshold.\n nodeFSInodesFree Kubernetes meta/v1.Duration (Optional) NodeFSInodesFree is the grace period for the NodeFSInodesFree eviction threshold.\n KubeletConfigReserved (Appears on: KubeletConfig) KubeletConfigReserved contains reserved resources for daemons\n Field Description cpu k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) CPU is the reserved cpu.\n memory k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) Memory is the reserved memory.\n ephemeralStorage k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) EphemeralStorage is the reserved ephemeral-storage.\n pid k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) PID is the reserved process-ids.\n Kubernetes (Appears on: ShootSpec) Kubernetes contains the version and configuration variables for the Shoot control plane.\n Field Description clusterAutoscaler ClusterAutoscaler (Optional) ClusterAutoscaler contains the configuration flags for the Kubernetes cluster autoscaler.\n kubeAPIServer KubeAPIServerConfig (Optional) KubeAPIServer contains configuration settings for the 
kube-apiserver.\n kubeControllerManager KubeControllerManagerConfig (Optional) KubeControllerManager contains configuration settings for the kube-controller-manager.\n kubeScheduler KubeSchedulerConfig (Optional) KubeScheduler contains configuration settings for the kube-scheduler.\n kubeProxy KubeProxyConfig (Optional) KubeProxy contains configuration settings for the kube-proxy.\n kubelet KubeletConfig (Optional) Kubelet contains configuration settings for the kubelet.\n version string (Optional) Version is the semantic Kubernetes version to use for the Shoot cluster. Defaults to the highest supported minor and patch version given in the referenced cloud profile. The version can be omitted completely or partially specified, e.g. \u003cmajor\u003e.\u003cminor\u003e.\n verticalPodAutoscaler VerticalPodAutoscaler (Optional) VerticalPodAutoscaler contains the configuration flags for the Kubernetes vertical pod autoscaler.\n enableStaticTokenKubeconfig bool (Optional) EnableStaticTokenKubeconfig indicates whether static token kubeconfig secret will be created for the Shoot cluster. Defaults to true for Shoots with Kubernetes versions \u003c 1.26. Defaults to false for Shoots with Kubernetes versions \u003e= 1.26. Starting Kubernetes 1.27 the field will be locked to false.\n KubernetesConfig (Appears on: KubeAPIServerConfig, KubeControllerManagerConfig, KubeProxyConfig, KubeSchedulerConfig, KubeletConfig) KubernetesConfig contains common configuration fields for the control plane components.\n Field Description featureGates map[string]bool (Optional) FeatureGates contains information about enabled feature gates.\n KubernetesDashboard (Appears on: Addons) KubernetesDashboard describes configuration values for the kubernetes-dashboard addon.\n Field Description Addon Addon (Members of Addon are embedded into this type.) 
authenticationMode string (Optional) AuthenticationMode defines the authentication mode for the kubernetes-dashboard.\n KubernetesSettings (Appears on: CloudProfileSpec, NamespacedCloudProfileSpec) KubernetesSettings contains constraints regarding allowed values of the ‘kubernetes’ block in the Shoot specification.\n Field Description versions []ExpirableVersion (Optional) Versions is the list of allowed Kubernetes versions with optional expiration dates for Shoot clusters.\n LastError (Appears on: BackupBucketStatus, BackupEntryStatus, ShootStatus) LastError indicates the last occurred error for an operation on a resource.\n Field Description description string A human readable message indicating details about the last error.\n taskID string (Optional) ID of the task which caused this last error\n codes []ErrorCode (Optional) Well-defined error codes of the last error(s).\n lastUpdateTime Kubernetes meta/v1.Time (Optional) Last time the error was reported\n LastMaintenance (Appears on: ShootStatus) LastMaintenance holds information about a maintenance operation on the Shoot.\n Field Description description string A human-readable message containing details about the operations performed in the last maintenance.\n triggeredTime Kubernetes meta/v1.Time TriggeredTime is the time when maintenance was triggered.\n state LastOperationState Status of the last maintenance operation, one of Processing, Succeeded, Error.\n failureReason string (Optional) FailureReason holds the information about the last maintenance operation failure reason.\n LastOperation (Appears on: BackupBucketStatus, BackupEntryStatus, SeedStatus, ShootStatus) LastOperation indicates the type and the state of the last operation, along with a description message and a progress indicator.\n Field Description description string A human readable message indicating details about the last operation.\n lastUpdateTime Kubernetes meta/v1.Time Last time the operation state transitioned from one to another.\n 
progress int32 The progress in percentage (0-100) of the last operation.\n state LastOperationState Status of the last operation, one of Aborted, Processing, Succeeded, Error, Failed.\n type LastOperationType Type of the last operation, one of Create, Reconcile, Delete, Migrate, Restore.\n LastOperationState (string alias)\n (Appears on: LastMaintenance, LastOperation) LastOperationState is a string alias.\nLastOperationType (string alias)\n (Appears on: LastOperation) LastOperationType is a string alias.\nLoadBalancerServicesProxyProtocol (Appears on: SeedSettingLoadBalancerServices, SeedSettingLoadBalancerServicesZones) LoadBalancerServicesProxyProtocol controls whether ProxyProtocol is (optionally) allowed for the load balancer services.\n Field Description allowed bool Allowed controls whether the ProxyProtocol is optionally allowed for the load balancer services. This should only be enabled if the load balancer services are already using ProxyProtocol or will be reconfigured to use it soon. Until the load balancers are configured with ProxyProtocol, enabling this setting may allow clients to spoof their source IP addresses. The option allows a migration from non-ProxyProtocol to ProxyProtocol without downtime (depending on the infrastructure). Defaults to false.\n Machine (Appears on: Worker) Machine contains information about the machine type and image.\n Field Description type string Type is the machine type of the worker group.\n image ShootMachineImage (Optional) Image holds information about the machine image to use for all nodes of this pool. It will default to the latest version of the first image stated in the referenced CloudProfile if no value has been provided.\n architecture string (Optional) Architecture is CPU architecture of machines in this worker pool.\n MachineControllerManagerSettings (Appears on: Worker) MachineControllerManagerSettings contains configurations for different worker-pools. Eg. 
MachineDrainTimeout, MachineHealthTimeout.\n Field Description machineDrainTimeout Kubernetes meta/v1.Duration (Optional) MachineDrainTimeout is the period after which machine is forcefully deleted.\n machineHealthTimeout Kubernetes meta/v1.Duration (Optional) MachineHealthTimeout is the period after which machine is declared failed.\n machineCreationTimeout Kubernetes meta/v1.Duration (Optional) MachineCreationTimeout is the period after which creation of the machine is declared failed.\n maxEvictRetries int32 (Optional) MaxEvictRetries are the number of eviction retries on a pod after which drain is declared failed, and forceful deletion is triggered.\n nodeConditions []string (Optional) NodeConditions are the set of conditions if set to true for the period of MachineHealthTimeout, machine will be declared failed.\n MachineImage (Appears on: CloudProfileSpec, NamespacedCloudProfileSpec) MachineImage defines the name and multiple versions of the machine image in any environment.\n Field Description name string Name is the name of the image.\n versions []MachineImageVersion Versions contains versions, expiration dates and container runtimes of the machine image\n updateStrategy MachineImageUpdateStrategy (Optional) UpdateStrategy is the update strategy to use for the machine image. Possible values are: - patch: update to the latest patch version of the current minor version. - minor: update to the latest minor and patch version. - major: always update to the overall latest version (default).\n MachineImageUpdateStrategy (string alias)\n (Appears on: MachineImage) MachineImageUpdateStrategy is the update strategy to use for a machine image\nMachineImageVersion (Appears on: MachineImage) MachineImageVersion is an expirable version with list of supported container runtimes and interfaces\n Field Description ExpirableVersion ExpirableVersion (Members of ExpirableVersion are embedded into this type.) 
cri []CRI (Optional) CRI list of supported container runtime and interfaces supported by this version\n architectures []string (Optional) Architectures is the list of CPU architectures of the machine image in this version.\n kubeletVersionConstraint string (Optional) KubeletVersionConstraint is a constraint describing the supported kubelet versions by the machine image in this version. If the field is not specified, it is assumed that the machine image in this version supports all kubelet versions. Examples: - ‘\u003e= 1.26’ - supports only kubelet versions greater than or equal to 1.26 - ‘\u003c 1.26’ - supports only kubelet versions less than 1.26\n MachineType (Appears on: CloudProfileSpec, NamespacedCloudProfileSpec) MachineType contains certain properties of a machine type.\n Field Description cpu k8s.io/apimachinery/pkg/api/resource.Quantity CPU is the number of CPUs for this machine type.\n gpu k8s.io/apimachinery/pkg/api/resource.Quantity GPU is the number of GPUs for this machine type.\n memory k8s.io/apimachinery/pkg/api/resource.Quantity Memory is the amount of memory for this machine type.\n name string Name is the name of the machine type.\n storage MachineTypeStorage (Optional) Storage is the amount of storage associated with the root volume of this machine type.\n usable bool (Optional) Usable defines if the machine type can be used for shoot clusters.\n architecture string (Optional) Architecture is the CPU architecture of this machine type.\n MachineTypeStorage (Appears on: MachineType) MachineTypeStorage is the amount of storage associated with the root volume of this machine type.\n Field Description class string Class is the class of the storage type.\n size k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) StorageSize is the storage size.\n type string Type is the type of the storage.\n minSize k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) MinSize is the minimal supported storage size. 
This overrides any other common minimum size configuration from spec.volumeTypes[*].minSize.\n Maintenance (Appears on: ShootSpec) Maintenance contains information about the time window for maintenance operations and which operations should be performed.\n Field Description autoUpdate MaintenanceAutoUpdate (Optional) AutoUpdate contains information about which constraints should be automatically updated.\n timeWindow MaintenanceTimeWindow (Optional) TimeWindow contains information about the time window for maintenance operations.\n confineSpecUpdateRollout bool (Optional) ConfineSpecUpdateRollout prevents that changes/updates to the shoot specification will be rolled out immediately. Instead, they are rolled out during the shoot’s maintenance time window. There is one exception that will trigger an immediate roll out which is changes to the Spec.Hibernation.Enabled field.\n MaintenanceAutoUpdate (Appears on: Maintenance) MaintenanceAutoUpdate contains information about which constraints should be automatically updated.\n Field Description kubernetesVersion bool KubernetesVersion indicates whether the patch Kubernetes version may be automatically updated (default: true).\n machineImageVersion bool (Optional) MachineImageVersion indicates whether the machine image version may be automatically updated (default: true).\n MaintenanceTimeWindow (Appears on: Maintenance) MaintenanceTimeWindow contains information about the time window for maintenance operations.\n Field Description begin string Begin is the beginning of the time window in the format HHMMSS+ZONE, e.g. “220000+0100”. If not present, a random value will be computed.\n end string End is the end of the time window in the format HHMMSS+ZONE, e.g. “220000+0100”. 
If not present, the value will be computed based on the “Begin” value.\n MemorySwapConfiguration (Appears on: KubeletConfig) MemorySwapConfiguration contains kubelet swap configuration. For more information, please see KEP: 2400-node-swap.\n Field Description swapBehavior SwapBehavior (Optional) SwapBehavior configures swap memory available to container workloads. May be one of {“LimitedSwap”, “UnlimitedSwap”}. Defaults to LimitedSwap.\n Monitoring (Appears on: ShootSpec) Monitoring contains information about the monitoring configuration for the shoot.\n Field Description alerting Alerting (Optional) Alerting contains information about the alerting configuration for the shoot cluster.\n NamedResourceReference (Appears on: ExtensionResourceState, ShootSpec) NamedResourceReference is a named reference to a resource.\n Field Description name string Name of the resource reference.\n resourceRef Kubernetes autoscaling/v1.CrossVersionObjectReference ResourceRef is a reference to a resource.\n NamespacedCloudProfileSpec (Appears on: NamespacedCloudProfile) NamespacedCloudProfileSpec is the specification of a NamespacedCloudProfile.\n Field Description caBundle string (Optional) CABundle is a certificate bundle which will be installed onto every host machine of shoot clusters targeting this profile.\n kubernetes KubernetesSettings (Optional) Kubernetes contains constraints regarding allowed values of the ‘kubernetes’ block in the Shoot specification.\n machineImages []MachineImage (Optional) MachineImages contains constraints regarding allowed values for machine images in the Shoot specification.\n machineTypes []MachineType (Optional) MachineTypes contains constraints regarding allowed values for machine types in the ‘workers’ block in the Shoot specification.\n volumeTypes []VolumeType (Optional) VolumeTypes contains constraints regarding allowed values for volume types in the ‘workers’ block in the Shoot specification.\n parent CloudProfileReference Parent contains a 
reference to a CloudProfile it inherits from.\n NamespacedCloudProfileStatus (Appears on: NamespacedCloudProfile) NamespacedCloudProfileStatus holds the most recently observed status of the NamespacedCloudProfile.\n Field Description cloudProfileSpec CloudProfileSpec CloudProfile is the most recently generated CloudProfile of the NamespacedCloudProfile.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this NamespacedCloudProfile.\n Networking (Appears on: ShootSpec) Networking defines networking parameters for the shoot cluster.\n Field Description type string (Optional) Type identifies the type of the networking plugin. This field is immutable.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to network resource.\n pods string (Optional) Pods is the CIDR of the pod network. This field is immutable.\n nodes string (Optional) Nodes is the CIDR of the entire node network. This field is mutable.\n services string (Optional) Services is the CIDR of the service network. This field is immutable.\n ipFamilies []IPFamily (Optional) IPFamilies specifies the IP protocol versions to use for shoot networking. This field is immutable. See https://github.com/gardener/gardener/blob/master/docs/usage/ipv6.md. Defaults to [“IPv4”].\n NetworkingStatus (Appears on: ShootStatus) NetworkingStatus contains information about cluster networking such as CIDRs.\n Field Description pods []string (Optional) Pods are the CIDRs of the pod network.\n nodes []string (Optional) Nodes are the CIDRs of the node network.\n services []string (Optional) Services are the CIDRs of the service network.\n NginxIngress (Appears on: Addons) NginxIngress describes configuration values for the nginx-ingress addon.\n Field Description Addon Addon (Members of Addon are embedded into this type.) 
loadBalancerSourceRanges []string (Optional) LoadBalancerSourceRanges is list of allowed IP sources for NginxIngress\n config map[string]string (Optional) Config contains custom configuration for the nginx-ingress-controller configuration. See https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/configmap.md#configuration-options\n externalTrafficPolicy Kubernetes core/v1.ServiceExternalTrafficPolicy (Optional) ExternalTrafficPolicy controls the .spec.externalTrafficPolicy value of the load balancer Service exposing the nginx-ingress. Defaults to Cluster.\n NodeLocalDNS (Appears on: SystemComponents) NodeLocalDNS contains the settings of the node local DNS components running in the data plane of the Shoot cluster.\n Field Description enabled bool Enabled indicates whether node local DNS is enabled or not.\n forceTCPToClusterDNS bool (Optional) ForceTCPToClusterDNS indicates whether the connection from the node local DNS to the cluster DNS (Core DNS) will be forced to TCP or not. Default, if unspecified, is to enforce TCP.\n forceTCPToUpstreamDNS bool (Optional) ForceTCPToUpstreamDNS indicates whether the connection from the node local DNS to the upstream DNS (infrastructure DNS) will be forced to TCP or not. Default, if unspecified, is to enforce TCP.\n disableForwardToUpstreamDNS bool (Optional) DisableForwardToUpstreamDNS indicates whether requests from node local DNS to upstream DNS should be disabled. 
Default, if unspecified, is to forward requests for external domains to upstream DNS.\n OCIRepository (Appears on: HelmControllerDeployment) OCIRepository configures where to pull an OCI Artifact that could contain, for example, a Helm Chart.\n Field Description ref string (Optional) Ref is the full artifact Ref and takes precedence over all other fields.\n repository string (Optional) Repository is a reference to an OCI artifact repository.\n tag string (Optional) Tag is the image tag to pull.\n digest string (Optional) Digest of the image to pull, takes precedence over tag.\n OIDCConfig (Appears on: KubeAPIServerConfig) OIDCConfig contains configuration settings for the OIDC provider. Note: Descriptions were taken from the Kubernetes documentation.\n Field Description caBundle string (Optional) If set, the OpenID server’s certificate will be verified by one of the authorities in the oidc-ca-file, otherwise the host’s root CA set will be used.\n clientAuthentication OpenIDConnectClientAuthentication (Optional) ClientAuthentication can optionally contain client configuration used for kubeconfig generation.\nDeprecated: This field has no implemented use and will be forbidden starting from Kubernetes 1.31. Its use was planned for generating OIDC kubeconfig https://github.com/gardener/gardener/issues/1433 TODO(AleksandarSavchev): Drop this field after support for Kubernetes 1.30 is dropped.\n clientID string (Optional) The client ID for the OpenID Connect client, must be set.\n groupsClaim string (Optional) If provided, the name of a custom OpenID Connect claim for specifying user groups. The claim value is expected to be a string or array of strings. 
This flag is experimental; please see the authentication documentation for further details.\n groupsPrefix string (Optional) If provided, all groups will be prefixed with this value to prevent conflicts with other authentication strategies.\n issuerURL string (Optional) The URL of the OpenID issuer, only HTTPS scheme will be accepted. Used to verify the OIDC JSON Web Token (JWT).\n requiredClaims map[string]string (Optional) key=value pairs that describe a required claim in the ID Token. If set, the claim is verified to be present in the ID Token with a matching value.\n signingAlgs []string (Optional) List of allowed JOSE asymmetric signing algorithms. JWTs with an ‘alg’ header value not in this list will be rejected. Values are defined by RFC 7518 https://tools.ietf.org/html/rfc7518#section-3.1\n usernameClaim string (Optional) The OpenID claim to use as the user name. Note that claims other than the default (‘sub’) are not guaranteed to be unique and immutable. This flag is experimental; please see the authentication documentation for further details. (default “sub”)\n usernamePrefix string (Optional) If provided, all usernames will be prefixed with this value. If not provided, username claims other than ‘email’ are prefixed by the issuer URL to avoid clashes. 
To skip any prefixing, provide the value ‘-’.\n ObservabilityRotation (Appears on: ShootCredentialsRotation) ObservabilityRotation contains information about the observability credential rotation.\n Field Description lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the observability credential rotation was initiated.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the observability credential rotation was successfully completed.\n OpenIDConnectClientAuthentication (Appears on: OIDCConfig) OpenIDConnectClientAuthentication contains configuration for OIDC clients.\n Field Description extraConfig map[string]string (Optional) Extra configuration added to kubeconfig’s auth-provider. Must not be any of idp-issuer-url, client-id, client-secret, idp-certificate-authority, idp-certificate-authority-data, id-token or refresh-token\n secret string (Optional) The client Secret for the OpenID Connect client.\n ProjectMember (Appears on: ProjectSpec) ProjectMember is a member of a project.\n Field Description Subject Kubernetes rbac/v1.Subject (Members of Subject are embedded into this type.) Subject is representing a user name, an email address, or any other identifier of a user, group, or service account that has a certain role.\n role string Role represents the role of this member. IMPORTANT: Be aware that this field will be removed in the v1 version of this API in favor of the roles list. 
TODO: Remove this field in favor of the roles list in v1.\n roles []string (Optional) Roles represents the list of roles of this member.\n ProjectPhase (string alias)\n (Appears on: ProjectStatus) ProjectPhase is a label for the condition of a project at the current time.\nProjectSpec (Appears on: Project) ProjectSpec is the specification of a Project.\n Field Description createdBy Kubernetes rbac/v1.Subject (Optional) CreatedBy is a subject representing a user name, an email address, or any other identifier of a user who created the project. This field is immutable.\n description string (Optional) Description is a human-readable description of what the project is used for.\n owner Kubernetes rbac/v1.Subject (Optional) Owner is a subject representing a user name, an email address, or any other identifier of a user owning the project. IMPORTANT: Be aware that this field will be removed in the v1 version of this API in favor of the owner role. The only way to change the owner will be by moving the owner role. In this API version the only way to change the owner is to use this field. TODO: Remove this field in favor of the owner role in v1.\n purpose string (Optional) Purpose is a human-readable explanation of the project’s purpose.\n members []ProjectMember (Optional) Members is a list of subjects representing a user name, an email address, or any other identifier of a user, group, or service account that has a certain role.\n namespace string (Optional) Namespace is the name of the namespace that has been created for the Project object. A nil value means that Gardener will determine the name of the namespace. 
This field is immutable.\n tolerations ProjectTolerations (Optional) Tolerations contains the tolerations for taints on seed clusters.\n dualApprovalForDeletion []DualApprovalForDeletion (Optional) DualApprovalForDeletion contains configuration for the dual approval concept for resource deletion.\n ProjectStatus (Appears on: Project) ProjectStatus holds the most recently observed status of the project.\n Field Description observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this project.\n phase ProjectPhase Phase is the current phase of the project.\n staleSinceTimestamp Kubernetes meta/v1.Time (Optional) StaleSinceTimestamp contains the timestamp when the project was first discovered to be stale/unused.\n staleAutoDeleteTimestamp Kubernetes meta/v1.Time (Optional) StaleAutoDeleteTimestamp contains the timestamp when the project will be garbage-collected/automatically deleted because it’s stale/unused.\n lastActivityTimestamp Kubernetes meta/v1.Time (Optional) LastActivityTimestamp contains the timestamp from the last activity performed in this project.\n ProjectTolerations (Appears on: ProjectSpec) ProjectTolerations contains the tolerations for taints on seed clusters.\n Field Description defaults []Toleration (Optional) Defaults contains a list of tolerations that are added to the shoots in this project by default.\n whitelist []Toleration (Optional) Whitelist contains a list of tolerations that are allowed to be added to the shoots in this project. Please note that this list may only be added by users having the spec-tolerations-whitelist verb for project resources.\n Provider (Appears on: ShootSpec) Provider contains provider-specific information that is handed over to the provider-specific extension controller.\n Field Description type string Type is the type of the provider. 
This field is immutable.\n controlPlaneConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ControlPlaneConfig contains the provider-specific control plane config blob. Please look up the concrete definition in the documentation of your provider extension.\n infrastructureConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) InfrastructureConfig contains the provider-specific infrastructure config blob. Please look up the concrete definition in the documentation of your provider extension.\n workers []Worker (Optional) Workers is a list of worker groups.\n workersSettings WorkersSettings (Optional) WorkersSettings contains settings for all workers.\n ProxyMode (string alias)\n (Appears on: KubeProxyConfig) ProxyMode values available on the Linux platform: ‘userspace’ (older, going to be EOL), ‘iptables’ (newer, faster), ‘ipvs’ (newest, better in performance and scalability). As of now, only ‘iptables’ and ‘ipvs’ are supported by Gardener. On the Linux platform, if the iptables proxy is selected (regardless of how) but the system’s kernel or iptables versions are insufficient, this always falls back to the userspace proxy. IPVS mode will be enabled when proxy mode is set to ‘ipvs’, and the fallback path is first iptables and then userspace.\nQuotaSpec (Appears on: Quota) QuotaSpec is the specification of a Quota.\n Field Description clusterLifetimeDays int32 (Optional) ClusterLifetimeDays is the lifetime of a Shoot cluster in days before it will be terminated automatically.\n metrics Kubernetes core/v1.ResourceList Metrics is a list of resources which will be put under constraints.\n scope Kubernetes core/v1.ObjectReference Scope is the scope of the Quota object, either ‘project’, ‘secret’ or ‘workloadidentity’. 
This field is immutable.\n Region (Appears on: CloudProfileSpec) Region contains certain properties of a region.\n Field Description name string Name is a region name.\n zones []AvailabilityZone (Optional) Zones is a list of availability zones in this region.\n labels map[string]string (Optional) Labels is an optional set of key-value pairs that contain certain administrator-controlled labels for this region. It can be used by Gardener administrators/operators to provide additional information about a region, e.g. wrt quality, reliability, access restrictions, etc.\n ResourceData (Appears on: ShootStateSpec) ResourceData holds the data of a resource referred to by an extension controller state.\n Field Description CrossVersionObjectReference Kubernetes autoscaling/v1.CrossVersionObjectReference (Members of CrossVersionObjectReference are embedded into this type.) data k8s.io/apimachinery/pkg/runtime.RawExtension Data of the resource\n ResourceWatchCacheSize (Appears on: WatchCacheSizes) ResourceWatchCacheSize contains configuration of the API server’s watch cache size for one specific resource.\n Field Description apiGroup string (Optional) APIGroup is the API group of the resource for which the watch cache size should be configured. An unset value is used to specify the legacy core API (e.g. for secrets).\n resource string Resource is the name of the resource for which the watch cache size should be configured (in lowercase plural form, e.g. secrets).\n size int32 CacheSize specifies the watch cache size that should be configured for the specified resource.\n SSHAccess (Appears on: WorkersSettings) SSHAccess contains settings regarding ssh access to the worker nodes.\n Field Description enabled bool Enabled indicates whether the SSH access to the worker nodes is ensured to be enabled or disabled in systemd. 
Defaults to true.\n SchedulingProfile (string alias)\n (Appears on: KubeSchedulerConfig) SchedulingProfile is a string alias used for scheduling profile values.\nSecretBindingProvider (Appears on: SecretBinding) SecretBindingProvider defines the provider type of the SecretBinding.\n Field Description type string Type is the type of the provider.\nFor backwards compatibility, the field can contain multiple providers separated by a comma. However the usage of single SecretBinding (hence Secret) for different cloud providers is strongly discouraged.\n SeedBackup (Appears on: SeedSpec) SeedBackup contains the object store configuration for backups for shoot (currently only etcd).\n Field Description provider string Provider is a provider name. This field is immutable.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to BackupBucket resource.\n region string (Optional) Region is a region name. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a Secret object containing the cloud provider credentials for the object store where backups should be stored. 
It should have enough privileges to manipulate the objects as well as buckets.\n SeedDNS (Appears on: SeedSpec) SeedDNS contains DNS-relevant information about this seed cluster.\n Field Description provider SeedDNSProvider (Optional) Provider configures a DNSProvider\n SeedDNSProvider (Appears on: SeedDNS) SeedDNSProvider configures a DNSProvider for Seeds\n Field Description type string Type describes the type of the dns-provider, for example aws-route53\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a Secret object containing cloud provider credentials used for registering external domains.\n SeedNetworks (Appears on: SeedSpec) SeedNetworks contains CIDRs for the pod, service and node networks of a Kubernetes cluster.\n Field Description nodes string (Optional) Nodes is the CIDR of the node network. This field is immutable.\n pods string Pods is the CIDR of the pod network. This field is immutable.\n services string Services is the CIDR of the service network. This field is immutable.\n shootDefaults ShootNetworks (Optional) ShootDefaults contains the default networks CIDRs for shoots.\n blockCIDRs []string (Optional) BlockCIDRs is a list of network addresses that should be blocked for shoot control plane components running in the seed cluster.\n ipFamilies []IPFamily (Optional) IPFamilies specifies the IP protocol versions to use for seed networking. This field is immutable. See https://github.com/gardener/gardener/blob/master/docs/usage/ipv6.md. 
Defaults to [“IPv4”].\n SeedProvider (Appears on: SeedSpec) SeedProvider defines the provider-specific information of this Seed cluster.\n Field Description type string Type is the name of the provider.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to Seed resource.\n region string Region is a name of a region.\n zones []string (Optional) Zones is the list of availability zones the seed cluster is deployed to.\n SeedSelector (Appears on: CloudProfileSpec, ExposureClassScheduling, ShootSpec) SeedSelector contains constraints for selecting seeds that are usable for shoots using a profile\n Field Description LabelSelector Kubernetes meta/v1.LabelSelector (Members of LabelSelector are embedded into this type.) (Optional) LabelSelector is optional and can be used to select seeds by their label settings\n providerTypes []string (Optional) Providers is optional and can be used to restrict seeds by their provider type. ‘*’ can be used to enable seeds regardless of their provider type.\n SeedSettingDependencyWatchdog (Appears on: SeedSettings) SeedSettingDependencyWatchdog controls the dependency-watchdog settings for the seed.\n Field Description weeder SeedSettingDependencyWatchdogWeeder (Optional) Weeder controls the weeder settings for the dependency-watchdog for the seed.\n prober SeedSettingDependencyWatchdogProber (Optional) Prober controls the prober settings for the dependency-watchdog for the seed.\n SeedSettingDependencyWatchdogProber (Appears on: SeedSettingDependencyWatchdog) SeedSettingDependencyWatchdogProber controls the prober settings for the dependency-watchdog for the seed.\n Field Description enabled bool Enabled controls whether the probe controller (prober) of the dependency-watchdog should be enabled. 
This controller scales down the kube-controller-manager, machine-controller-manager and cluster-autoscaler of shoot clusters in case their respective kube-apiserver is not reachable via its external ingress in order to avoid melt-down situations.\n SeedSettingDependencyWatchdogWeeder (Appears on: SeedSettingDependencyWatchdog) SeedSettingDependencyWatchdogWeeder controls the weeder settings for the dependency-watchdog for the seed.\n Field Description enabled bool Enabled controls whether the endpoint controller (weeder) of the dependency-watchdog should be enabled. This controller helps to alleviate the delay where control plane components remain unavailable by finding the respective pods in CrashLoopBackoff status and restarting them once their dependants become ready and available again.\n SeedSettingExcessCapacityReservation (Appears on: SeedSettings) SeedSettingExcessCapacityReservation controls the excess capacity reservation for shoot control planes in the seed.\n Field Description enabled bool (Optional) Enabled controls whether the default excess capacity reservation should be enabled. 
When not specified, the functionality is enabled.\n configs []SeedSettingExcessCapacityReservationConfig (Optional) Configs configures excess capacity reservation deployments for shoot control planes in the seed.\n SeedSettingExcessCapacityReservationConfig (Appears on: SeedSettingExcessCapacityReservation) SeedSettingExcessCapacityReservationConfig configures excess capacity reservation deployments for shoot control planes in the seed.\n Field Description resources Kubernetes core/v1.ResourceList Resources specify the resource requests and limits of the excess-capacity-reservation pod.\n nodeSelector map[string]string (Optional) NodeSelector specifies the node where the excess-capacity-reservation pod should run.\n tolerations []Kubernetes core/v1.Toleration (Optional) Tolerations specify the tolerations for the excess-capacity-reservation pod.\n SeedSettingLoadBalancerServices (Appears on: SeedSettings) SeedSettingLoadBalancerServices controls certain settings for services of type load balancer that are created in the seed.\n Field Description annotations map[string]string (Optional) Annotations is a map of annotations that will be injected/merged into every load balancer service object.\n externalTrafficPolicy Kubernetes core/v1.ServiceExternalTrafficPolicy (Optional) ExternalTrafficPolicy describes how nodes distribute service traffic they receive on one of the service’s “externally-facing” addresses. Defaults to “Cluster”.\n zones []SeedSettingLoadBalancerServicesZones (Optional) Zones controls settings, which are specific to the single-zone load balancers in a multi-zonal setup. Can be empty for single-zone seeds. Each specified zone has to relate to one of the zones in seed.spec.provider.zones.\n proxyProtocol LoadBalancerServicesProxyProtocol (Optional) ProxyProtocol controls whether ProxyProtocol is (optionally) allowed for the load balancer services. 
Defaults to nil, which is equivalent to not allowing ProxyProtocol.\n SeedSettingLoadBalancerServicesZones (Appears on: SeedSettingLoadBalancerServices) SeedSettingLoadBalancerServicesZones controls settings, which are specific to the single-zone load balancers in a multi-zonal setup.\n Field Description name string Name is the name of the zone as specified in seed.spec.provider.zones.\n annotations map[string]string (Optional) Annotations is a map of annotations that will be injected/merged into the zone-specific load balancer service object.\n externalTrafficPolicy Kubernetes core/v1.ServiceExternalTrafficPolicy (Optional) ExternalTrafficPolicy describes how nodes distribute service traffic they receive on one of the service’s “externally-facing” addresses. Defaults to “Cluster”.\n proxyProtocol LoadBalancerServicesProxyProtocol (Optional) ProxyProtocol controls whether ProxyProtocol is (optionally) allowed for the load balancer services. Defaults to nil, which is equivalent to not allowing ProxyProtocol.\n SeedSettingScheduling (Appears on: SeedSettings) SeedSettingScheduling controls settings for scheduling decisions for the seed.\n Field Description visible bool Visible controls whether the gardener-scheduler shall consider this seed when scheduling shoots. Invisible seeds are not considered by the scheduler.\n SeedSettingTopologyAwareRouting (Appears on: SeedSettings) SeedSettingTopologyAwareRouting controls certain settings for topology-aware traffic routing in the seed. See https://github.com/gardener/gardener/blob/master/docs/operations/topology_aware_routing.md.\n Field Description enabled bool Enabled controls whether certain Services deployed in the seed cluster should be topology-aware. 
These Services are etcd-main-client, etcd-events-client, kube-apiserver, gardener-resource-manager and vpa-webhook.\n SeedSettingVerticalPodAutoscaler (Appears on: SeedSettings) SeedSettingVerticalPodAutoscaler controls certain settings for the vertical pod autoscaler components deployed in the seed.\n Field Description enabled bool Enabled controls whether the VPA components shall be deployed into the garden namespace in the seed cluster. It is enabled by default because Gardener heavily relies on a VPA being deployed. You should only disable this if your seed cluster already has another, manually/custom managed VPA deployment.\n SeedSettings (Appears on: SeedSpec) SeedSettings contains certain settings for this seed cluster.\n Field Description excessCapacityReservation SeedSettingExcessCapacityReservation (Optional) ExcessCapacityReservation controls the excess capacity reservation for shoot control planes in the seed.\n scheduling SeedSettingScheduling (Optional) Scheduling controls settings for scheduling decisions for the seed.\n loadBalancerServices SeedSettingLoadBalancerServices (Optional) LoadBalancerServices controls certain settings for services of type load balancer that are created in the seed.\n verticalPodAutoscaler SeedSettingVerticalPodAutoscaler (Optional) VerticalPodAutoscaler controls certain settings for the vertical pod autoscaler components deployed in the seed.\n dependencyWatchdog SeedSettingDependencyWatchdog (Optional) DependencyWatchdog controls certain settings for the dependency-watchdog components deployed in the seed.\n topologyAwareRouting SeedSettingTopologyAwareRouting (Optional) TopologyAwareRouting controls certain settings for topology-aware traffic routing in the seed. 
See https://github.com/gardener/gardener/blob/master/docs/operations/topology_aware_routing.md.\n SeedSpec (Appears on: Seed, SeedTemplate) SeedSpec is the specification of a Seed.\n Field Description backup SeedBackup (Optional) Backup holds the object store configuration for the backups of shoot (currently only etcd). If it is not specified, then there won’t be any backups taken for shoots associated with this seed. If backup field is present in seed, then backups of the etcd from shoot control plane will be stored under the configured object store.\n dns SeedDNS DNS contains DNS-relevant information about this seed cluster.\n networks SeedNetworks Networks defines the pod, service and worker network of the Seed cluster.\n provider SeedProvider Provider defines the provider type and region for this Seed cluster.\n taints []SeedTaint (Optional) Taints describes taints on the seed.\n volume SeedVolume (Optional) Volume contains settings for persistentvolumes created in the seed cluster.\n settings SeedSettings (Optional) Settings contains certain settings for this seed cluster.\n ingress Ingress (Optional) Ingress configures Ingress specific settings of the Seed cluster. This field is immutable.\n SeedStatus (Appears on: Seed) SeedStatus is the status of a Seed.\n Field Description gardener Gardener (Optional) Gardener holds information about the Gardener which last acted on the Shoot.\n kubernetesVersion string (Optional) KubernetesVersion is the Kubernetes version of the seed cluster.\n conditions []Condition (Optional) Conditions represents the latest available observations of a Seed’s current state.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this Seed. It corresponds to the Seed’s generation, which is updated on mutation by the API Server.\n clusterIdentity string (Optional) ClusterIdentity is the identity of the Seed cluster. 
This field is immutable.\n capacity Kubernetes core/v1.ResourceList (Optional) Capacity represents the total resources of a seed.\n allocatable Kubernetes core/v1.ResourceList (Optional) Allocatable represents the resources of a seed that are available for scheduling. Defaults to Capacity.\n clientCertificateExpirationTimestamp Kubernetes meta/v1.Time (Optional) ClientCertificateExpirationTimestamp is the timestamp at which gardenlet’s client certificate expires.\n lastOperation LastOperation (Optional) LastOperation holds information about the last operation on the Seed.\n SeedTaint (Appears on: SeedSpec) SeedTaint describes a taint on a seed.\n Field Description key string Key is the taint key to be applied to a seed.\n value string (Optional) Value is the taint value corresponding to the taint key.\n SeedTemplate SeedTemplate is a template for creating a Seed object.\n Field Description metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec SeedSpec (Optional) Specification of the desired behavior of the Seed.\n backup SeedBackup (Optional) Backup holds the object store configuration for the backups of shoot (currently only etcd). If it is not specified, then there won’t be any backups taken for shoots associated with this seed. 
If backup field is present in seed, then backups of the etcd from shoot control plane will be stored under the configured object store.\n dns SeedDNS DNS contains DNS-relevant information about this seed cluster.\n networks SeedNetworks Networks defines the pod, service and worker network of the Seed cluster.\n provider SeedProvider Provider defines the provider type and region for this Seed cluster.\n taints []SeedTaint (Optional) Taints describes taints on the seed.\n volume SeedVolume (Optional) Volume contains settings for persistentvolumes created in the seed cluster.\n settings SeedSettings (Optional) Settings contains certain settings for this seed cluster.\n ingress Ingress (Optional) Ingress configures Ingress specific settings of the Seed cluster. This field is immutable.\n SeedVolume (Appears on: SeedSpec) SeedVolume contains settings for persistentvolumes created in the seed cluster.\n Field Description minimumSize k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) MinimumSize defines the minimum size that should be used for PVCs in the seed.\n providers []SeedVolumeProvider (Optional) Providers is a list of storage class provisioner types for the seed.\n SeedVolumeProvider (Appears on: SeedVolume) SeedVolumeProvider is a storage class provisioner type.\n Field Description purpose string Purpose is the purpose of this provider.\n name string Name is the name of the storage class provisioner type.\n ServiceAccountConfig (Appears on: KubeAPIServerConfig) ServiceAccountConfig is the kube-apiserver configuration for service accounts.\n Field Description issuer string (Optional) Issuer is the identifier of the service account token issuer. The issuer will assert this identifier in “iss” claim of issued tokens. This value is used to generate new service account tokens. This value is a string or URI. 
Defaults to URI of the API server.\n extendTokenExpiration bool (Optional) ExtendTokenExpiration turns on projected service account expiration extension during token generation, which helps safe transition from legacy token to bound service account token feature. If this flag is enabled, admission injected tokens would be extended up to 1 year to prevent unexpected failure during transition, ignoring value of service-account-max-token-expiration.\n maxTokenExpiration Kubernetes meta/v1.Duration (Optional) MaxTokenExpiration is the maximum validity duration of a token created by the service account token issuer. If an otherwise valid TokenRequest with a validity duration larger than this value is requested, a token will be issued with a validity duration of this value. This field must be within [30d,90d].\n acceptedIssuers []string (Optional) AcceptedIssuers is an additional set of issuers that are used to determine which service account tokens are accepted. These values are not used to generate new service account tokens. 
Only useful when service account tokens are also issued by another external system or a change of the current issuer that is used for generating tokens is being performed.\n ServiceAccountKeyRotation (Appears on: ShootCredentialsRotation) ServiceAccountKeyRotation contains information about the service account key credential rotation.\n Field Description phase CredentialsRotationPhase Phase describes the phase of the service account key credential rotation.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the service account key credential rotation was successfully completed.\n lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the service account key credential rotation was initiated.\n lastInitiationFinishedTime Kubernetes meta/v1.Time (Optional) LastInitiationFinishedTime is the most recent time when the service account key credential rotation initiation was completed.\n lastCompletionTriggeredTime Kubernetes meta/v1.Time (Optional) LastCompletionTriggeredTime is the most recent time when the service account key credential rotation completion was triggered.\n ShootAdvertisedAddress (Appears on: ShootStatus) ShootAdvertisedAddress contains information for the shoot’s Kube API server.\n Field Description name string Name of the advertised address. e.g. external\n url string The URL of the API Server. e.g. 
https://api.foo.bar or https://1.2.3.4\n ShootCredentials (Appears on: ShootStatus) ShootCredentials contains information about the shoot credentials.\n Field Description rotation ShootCredentialsRotation (Optional) Rotation contains information about the credential rotations.\n ShootCredentialsRotation (Appears on: ShootCredentials) ShootCredentialsRotation contains information about the rotation of credentials.\n Field Description certificateAuthorities CARotation (Optional) CertificateAuthorities contains information about the certificate authority credential rotation.\n kubeconfig ShootKubeconfigRotation (Optional) Kubeconfig contains information about the kubeconfig credential rotation.\n sshKeypair ShootSSHKeypairRotation (Optional) SSHKeypair contains information about the ssh-keypair credential rotation.\n observability ObservabilityRotation (Optional) Observability contains information about the observability credential rotation.\n serviceAccountKey ServiceAccountKeyRotation (Optional) ServiceAccountKey contains information about the service account key credential rotation.\n etcdEncryptionKey ETCDEncryptionKeyRotation (Optional) ETCDEncryptionKey contains information about the ETCD encryption key credential rotation.\n ShootKubeconfigRotation (Appears on: ShootCredentialsRotation) ShootKubeconfigRotation contains information about the kubeconfig credential rotation.\n Field Description lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the kubeconfig credential rotation was initiated.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the kubeconfig credential rotation was successfully completed.\n ShootMachineImage (Appears on: Machine) ShootMachineImage defines the name and the version of the shoot’s machine image in any environment. 
Has to be defined in the respective CloudProfile.\n Field Description name string Name is the name of the image.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the shoot’s individual configuration passed to an extension resource.\n version string (Optional) Version is the version of the shoot’s image. If version is not provided, it will be defaulted to the latest version from the CloudProfile.\n ShootNetworks (Appears on: SeedNetworks) ShootNetworks contains the default networks CIDRs for shoots.\n Field Description pods string (Optional) Pods is the CIDR of the pod network.\n services string (Optional) Services is the CIDR of the service network.\n ShootPurpose (string alias)\n (Appears on: ShootSpec) ShootPurpose is a type alias for string.\nShootSSHKeypairRotation (Appears on: ShootCredentialsRotation) ShootSSHKeypairRotation contains information about the ssh-keypair credential rotation.\n Field Description lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the ssh-keypair credential rotation was initiated.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the ssh-keypair credential rotation was successfully completed.\n ShootSpec (Appears on: Shoot, ShootTemplate) ShootSpec is the specification of a Shoot.\n Field Description addons Addons (Optional) Addons contains information about enabled/disabled addons and their configuration.\n cloudProfileName string (Optional) CloudProfileName is a name of a CloudProfile object. 
This field will be deprecated soon, use CloudProfile instead.\n dns DNS (Optional) DNS contains information about the DNS settings of the Shoot.\n extensions []Extension (Optional) Extensions contain type and provider information for Shoot extensions.\n hibernation Hibernation (Optional) Hibernation contains information whether the Shoot is suspended or not.\n kubernetes Kubernetes Kubernetes contains the version and configuration settings of the control plane components.\n networking Networking (Optional) Networking contains information about cluster networking such as CNI Plugin type, CIDRs, …etc.\n maintenance Maintenance (Optional) Maintenance contains information about the time window for maintenance operations and which operations should be performed.\n monitoring Monitoring (Optional) Monitoring contains information about custom monitoring configurations for the shoot.\n provider Provider Provider contains all provider-specific and provider-relevant information.\n purpose ShootPurpose (Optional) Purpose is the purpose class for this cluster.\n region string Region is a name of a region. This field is immutable.\n secretBindingName string (Optional) SecretBindingName is the name of a SecretBinding that has a reference to the provider secret. The credentials inside the provider secret will be used to create the shoot in the respective account. The field is mutually exclusive with CredentialsBindingName. 
This field is immutable.\n seedName string (Optional) SeedName is the name of the seed cluster that runs the control plane of the Shoot.\n seedSelector SeedSelector (Optional) SeedSelector is an optional selector which must match a seed’s labels for the shoot to be scheduled on that seed.\n resources []NamedResourceReference (Optional) Resources holds a list of named resource references that can be referred to in extension configs by their names.\n tolerations []Toleration (Optional) Tolerations contains the tolerations for taints on seed clusters.\n exposureClassName string (Optional) ExposureClassName is the optional name of an exposure class to apply a control plane endpoint exposure strategy. This field is immutable.\n systemComponents SystemComponents (Optional) SystemComponents contains the settings of system components in the control or data plane of the Shoot cluster.\n controlPlane ControlPlane (Optional) ControlPlane contains general settings for the control plane of the shoot.\n schedulerName string (Optional) SchedulerName is the name of the responsible scheduler which schedules the shoot. If not specified, the default scheduler takes over. This field is immutable.\n cloudProfile CloudProfileReference (Optional) CloudProfile contains a reference to a CloudProfile or a NamespacedCloudProfile.\n credentialsBindingName string (Optional) CredentialsBindingName is the name of a CredentialsBinding that has a reference to the provider credentials. The credentials will be used to create the shoot in the respective account. 
The field is mutually exclusive with SecretBindingName.\n ShootStateSpec (Appears on: ShootState) ShootStateSpec is the specification of the ShootState.\n Field Description gardener []GardenerResourceData (Optional) Gardener holds the data required to generate resources deployed by the gardenlet\n extensions []ExtensionResourceState (Optional) Extensions holds the state of custom resources reconciled by extension controllers in the seed\n resources []ResourceData (Optional) Resources holds the data of resources referred to by extension controller states\n ShootStatus (Appears on: Shoot) ShootStatus holds the most recently observed status of the Shoot cluster.\n Field Description conditions []Condition (Optional) Conditions represents the latest available observations of a Shoot’s current state.\n constraints []Condition (Optional) Constraints represents conditions of a Shoot’s current state that constrain some operations on it.\n gardener Gardener Gardener holds information about the Gardener which last acted on the Shoot.\n hibernated bool IsHibernated indicates whether the Shoot is currently hibernated.\n lastOperation LastOperation (Optional) LastOperation holds information about the last operation on the Shoot.\n lastErrors []LastError (Optional) LastErrors holds information about the last occurred error(s) during an operation.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this Shoot. It corresponds to the Shoot’s generation, which is updated on mutation by the API Server.\n retryCycleStartTime Kubernetes meta/v1.Time (Optional) RetryCycleStartTime is the start time of the last retry cycle (used to determine how often an operation must be retried until we give up).\n seedName string (Optional) SeedName is the name of the seed cluster that runs the control plane of the Shoot. This value is only written after a successful create/reconcile operation. 
It will be used when control planes are moved between Seeds.\n technicalID string TechnicalID is the name that is used for creating the Seed namespace, the infrastructure resources, and basically everything that is related to this particular Shoot. This field is immutable.\n uid k8s.io/apimachinery/pkg/types.UID UID is a unique identifier for the Shoot cluster to avoid portability between Kubernetes clusters. It is used to compute unique hashes. This field is immutable.\n clusterIdentity string (Optional) ClusterIdentity is the identity of the Shoot cluster. This field is immutable.\n advertisedAddresses []ShootAdvertisedAddress (Optional) List of addresses that are relevant to the shoot. These include the Kube API server address and also the service account issuer.\n migrationStartTime Kubernetes meta/v1.Time (Optional) MigrationStartTime is the time when a migration to a different seed was initiated.\n credentials ShootCredentials (Optional) Credentials contains information about the shoot credentials.\n lastHibernationTriggerTime Kubernetes meta/v1.Time (Optional) LastHibernationTriggerTime indicates the last time when the hibernation controller managed to change the hibernation settings of the cluster\n lastMaintenance LastMaintenance (Optional) LastMaintenance holds information about the last maintenance operations on the Shoot.\n encryptedResources []string (Optional) EncryptedResources is the list of resources in the Shoot which are currently encrypted. Secrets are encrypted by default and are not part of the list. 
See https://github.com/gardener/gardener/blob/master/docs/usage/etcd_encryption_config.md for more details.\n networking NetworkingStatus (Optional) Networking contains information about cluster networking such as CIDRs.\n ShootTemplate ShootTemplate is a template for creating a Shoot object.\n Field Description metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ShootSpec (Optional) Specification of the desired behavior of the Shoot.\n addons Addons (Optional) Addons contains information about enabled/disabled addons and their configuration.\n cloudProfileName string (Optional) CloudProfileName is a name of a CloudProfile object. This field will be deprecated soon, use CloudProfile instead.\n dns DNS (Optional) DNS contains information about the DNS settings of the Shoot.\n extensions []Extension (Optional) Extensions contain type and provider information for Shoot extensions.\n hibernation Hibernation (Optional) Hibernation contains information whether the Shoot is suspended or not.\n kubernetes Kubernetes Kubernetes contains the version and configuration settings of the control plane components.\n networking Networking (Optional) Networking contains information about cluster networking such as CNI Plugin type, CIDRs, …etc.\n maintenance Maintenance (Optional) Maintenance contains information about the time window for maintenance operations and which operations should be performed.\n monitoring Monitoring (Optional) Monitoring contains information about custom monitoring configurations for the shoot.\n provider Provider Provider contains all provider-specific and provider-relevant information.\n purpose ShootPurpose (Optional) Purpose is the purpose class for this cluster.\n region string Region is a name of a region. 
This field is immutable.\n secretBindingName string (Optional) SecretBindingName is the name of a SecretBinding that has a reference to the provider secret. The credentials inside the provider secret will be used to create the shoot in the respective account. The field is mutually exclusive with CredentialsBindingName. This field is immutable.\n seedName string (Optional) SeedName is the name of the seed cluster that runs the control plane of the Shoot.\n seedSelector SeedSelector (Optional) SeedSelector is an optional selector which must match a seed’s labels for the shoot to be scheduled on that seed.\n resources []NamedResourceReference (Optional) Resources holds a list of named resource references that can be referred to in extension configs by their names.\n tolerations []Toleration (Optional) Tolerations contains the tolerations for taints on seed clusters.\n exposureClassName string (Optional) ExposureClassName is the optional name of an exposure class to apply a control plane endpoint exposure strategy. This field is immutable.\n systemComponents SystemComponents (Optional) SystemComponents contains the settings of system components in the control or data plane of the Shoot cluster.\n controlPlane ControlPlane (Optional) ControlPlane contains general settings for the control plane of the shoot.\n schedulerName string (Optional) SchedulerName is the name of the responsible scheduler which schedules the shoot. If not specified, the default scheduler takes over. This field is immutable.\n cloudProfile CloudProfileReference (Optional) CloudProfile contains a reference to a CloudProfile or a NamespacedCloudProfile.\n credentialsBindingName string (Optional) CredentialsBindingName is the name of a CredentialsBinding that has a reference to the provider credentials. The credentials will be used to create the shoot in the respective account. 
The field is mutually exclusive with SecretBindingName.\n StructuredAuthentication (Appears on: KubeAPIServerConfig) StructuredAuthentication contains authentication config for kube-apiserver.\n Field Description configMapName string ConfigMapName is the name of the ConfigMap in the project namespace which contains AuthenticationConfiguration for the kube-apiserver.\n SwapBehavior (string alias)\n (Appears on: MemorySwapConfiguration) SwapBehavior configures swap memory available to container workloads\nSystemComponents (Appears on: ShootSpec) SystemComponents contains the settings of system components in the control or data plane of the Shoot cluster.\n Field Description coreDNS CoreDNS (Optional) CoreDNS contains the settings of the Core DNS components running in the data plane of the Shoot cluster.\n nodeLocalDNS NodeLocalDNS (Optional) NodeLocalDNS contains the settings of the node local DNS components running in the data plane of the Shoot cluster.\n Toleration (Appears on: ExposureClassScheduling, ProjectTolerations, ShootSpec) Toleration is a toleration for a seed taint.\n Field Description key string Key is the toleration key to be applied to a project or shoot.\n value string (Optional) Value is the toleration value corresponding to the toleration key.\n VersionClassification (string alias)\n (Appears on: ExpirableVersion) VersionClassification is the logical state of a version.\nVerticalPodAutoscaler (Appears on: Kubernetes) VerticalPodAutoscaler contains the configuration flags for the Kubernetes vertical pod autoscaler.\n Field Description enabled bool Enabled specifies whether the Kubernetes VPA shall be enabled for the shoot cluster.\n evictAfterOOMThreshold Kubernetes meta/v1.Duration (Optional) EvictAfterOOMThreshold defines the threshold that will lead to pod eviction in case it OOMed in less than the given threshold since its start and if it has only one container (default: 10m0s).\n evictionRateBurst int32 (Optional) EvictionRateBurst defines the 
burst of pods that can be evicted (default: 1)\n evictionRateLimit float64 (Optional) EvictionRateLimit defines the number of pods that can be evicted per second. A rate limit set to 0 or -1 will disable the rate limiter (default: -1).\n evictionTolerance float64 (Optional) EvictionTolerance defines the fraction of replica count that can be evicted for update in case more than one pod can be evicted (default: 0.5).\n recommendationMarginFraction float64 (Optional) RecommendationMarginFraction is the fraction of usage added as the safety margin to the recommended request (default: 0.15).\n updaterInterval Kubernetes meta/v1.Duration (Optional) UpdaterInterval is the interval how often the updater should run (default: 1m0s).\n recommenderInterval Kubernetes meta/v1.Duration (Optional) RecommenderInterval is the interval how often metrics should be fetched (default: 1m0s).\n targetCPUPercentile float64 (Optional) TargetCPUPercentile is the usage percentile that will be used as a base for CPU target recommendation. Doesn’t affect CPU lower bound, CPU upper bound nor memory recommendations. (default: 0.9)\n recommendationLowerBoundCPUPercentile float64 (Optional) RecommendationLowerBoundCPUPercentile is the usage percentile that will be used for the lower bound on CPU recommendation. (default: 0.5)\n recommendationUpperBoundCPUPercentile float64 (Optional) RecommendationUpperBoundCPUPercentile is the usage percentile that will be used for the upper bound on CPU recommendation. (default: 0.95)\n targetMemoryPercentile float64 (Optional) TargetMemoryPercentile is the usage percentile that will be used as a base for memory target recommendation. Doesn’t affect memory lower bound nor memory upper bound. (default: 0.9)\n recommendationLowerBoundMemoryPercentile float64 (Optional) RecommendationLowerBoundMemoryPercentile is the usage percentile that will be used for the lower bound on memory recommendation. 
(default: 0.5)\n recommendationUpperBoundMemoryPercentile float64 (Optional) RecommendationUpperBoundMemoryPercentile is the usage percentile that will be used for the upper bound on memory recommendation. (default: 0.95)\n Volume (Appears on: Worker) Volume contains information about the volume type, size, and encryption.\n Field Description name string (Optional) Name of the volume to make it referencable.\n type string (Optional) Type is the type of the volume.\n size string VolumeSize is the size of the volume.\n encrypted bool (Optional) Encrypted determines if the volume should be encrypted.\n VolumeType (Appears on: CloudProfileSpec, NamespacedCloudProfileSpec) VolumeType contains certain properties of a volume type.\n Field Description class string Class is the class of the volume type.\n name string Name is the name of the volume type.\n usable bool (Optional) Usable defines if the volume type can be used for shoot clusters.\n minSize k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) MinSize is the minimal supported storage size.\n WatchCacheSizes (Appears on: KubeAPIServerConfig) WatchCacheSizes contains configuration of the API server’s watch cache sizes.\n Field Description default int32 (Optional) Default configures the default watch cache size of the kube-apiserver (flag --default-watch-cache-size, defaults to 100). See: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/\n resources []ResourceWatchCacheSize (Optional) Resources configures the watch cache size of the kube-apiserver per resource (flag --watch-cache-sizes). 
See: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/\n Worker (Appears on: Provider) Worker is the base definition of a worker group.\n Field Description annotations map[string]string (Optional) Annotations is a map of key/value pairs for annotations for all the Node objects in this worker pool.\n caBundle string (Optional) CABundle is a certificate bundle which will be installed onto every machine of this worker pool.\n cri CRI (Optional) CRI contains configurations of CRI support of every machine in the worker pool. Defaults to a CRI with name containerd.\n kubernetes WorkerKubernetes (Optional) Kubernetes contains configuration for Kubernetes components related to this worker pool.\n labels map[string]string (Optional) Labels is a map of key/value pairs for labels for all the Node objects in this worker pool.\n name string Name is the name of the worker group.\n machine Machine Machine contains information about the machine type and image.\n maximum int32 Maximum is the maximum number of machines to create. This value is divided by the number of configured zones for a fair distribution.\n minimum int32 Minimum is the minimum number of machines to create. This value is divided by the number of configured zones for a fair distribution.\n maxSurge k8s.io/apimachinery/pkg/util/intstr.IntOrString (Optional) MaxSurge is maximum number of machines that are created during an update. This value is divided by the number of configured zones for a fair distribution.\n maxUnavailable k8s.io/apimachinery/pkg/util/intstr.IntOrString (Optional) MaxUnavailable is the maximum number of machines that can be unavailable during an update. 
This value is divided by the number of configured zones for a fair distribution.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the provider-specific configuration for this worker pool.\n taints []Kubernetes core/v1.Taint (Optional) Taints is a list of taints for all the Node objects in this worker pool.\n volume Volume (Optional) Volume contains information about the volume type and size.\n dataVolumes []DataVolume (Optional) DataVolumes contains a list of additional worker volumes.\n kubeletDataVolumeName string (Optional) KubeletDataVolumeName contains the name of a dataVolume that should be used for storing kubelet state.\n zones []string (Optional) Zones is a list of availability zones that are used to evenly distribute this worker pool. Optional as not every provider may support availability zones.\n systemComponents WorkerSystemComponents (Optional) SystemComponents contains configuration for system components related to this worker pool\n machineControllerManager MachineControllerManagerSettings (Optional) MachineControllerManagerSettings contains configurations for different worker-pools. Eg. MachineDrainTimeout, MachineHealthTimeout.\n sysctls map[string]string (Optional) Sysctls is a map of kernel settings to apply on all machines in this worker pool.\n clusterAutoscaler ClusterAutoscalerOptions (Optional) ClusterAutoscaler contains the cluster autoscaler configurations for the worker pool.\n WorkerKubernetes (Appears on: Worker) WorkerKubernetes contains configuration for Kubernetes components related to this worker pool.\n Field Description kubelet KubeletConfig (Optional) Kubelet contains configuration settings for all kubelets of this worker pool. If set, all spec.kubernetes.kubelet settings will be overwritten for this worker pool (no merge of settings).\n version string (Optional) Version is the semantic Kubernetes version to use for the Kubelet in this Worker Group. 
If not specified the kubelet version is derived from the global shoot cluster kubernetes version. version must be equal or lower than the version of the shoot kubernetes version. Only one minor version difference to other worker groups and global kubernetes version is allowed.\n WorkerSystemComponents (Appears on: Worker) WorkerSystemComponents contains configuration for system components related to this worker pool\n Field Description allow bool Allow determines whether the pool should be allowed to host system components or not (defaults to true)\n WorkersSettings (Appears on: Provider) WorkersSettings contains settings for all workers.\n Field Description sshAccess SSHAccess (Optional) SSHAccess contains settings regarding ssh access to the worker nodes.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n core.gardener.cloud/v1beta1 core.gardener.cloud/v1beta1 …","ref":"/docs/gardener/api-reference/core/","tags":"","title":"Core"},{"body":"Packages:\n core.gardener.cloud/v1 core.gardener.cloud/v1 Package v1 is a version of the API.\nResource Types: ControllerDeployment ControllerDeployment ControllerDeployment contains information about how this controller is deployed.\n Field Description apiVersion string core.gardener.cloud/v1 kind string ControllerDeployment metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
helm HelmControllerDeployment (Optional) Helm configures that an extension controller is deployed using helm.\n HelmControllerDeployment (Appears on: ControllerDeployment) HelmControllerDeployment configures how an extension controller is deployed using helm.\n Field Description rawChart []byte (Optional) RawChart is the base64-encoded, gzip’ed, tar’ed extension controller chart.\n values Kubernetes apiextensions/v1.JSON (Optional) Values are the chart values.\n ociRepository OCIRepository (Optional) OCIRepository defines where to pull the chart.\n OCIRepository (Appears on: HelmControllerDeployment) OCIRepository configures where to pull an OCI Artifact, that could contain for example a Helm Chart.\n Field Description ref string (Optional) Ref is the full artifact Ref and takes precedence over all other fields.\n repository string (Optional) Repository is a reference to an OCI artifact repository.\n tag string (Optional) Tag is the image tag to pull.\n digest string (Optional) Digest of the image to pull, takes precedence over tag. The value should be in the format ‘sha256:’.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n core.gardener.cloud/v1 core.gardener.cloud/v1 Package …","ref":"/docs/gardener/api-reference/core-v1/","tags":"","title":"Core V1"},{"body":"Gardener Extension for CoreOS Container Linux \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller operates on the OperatingSystemConfig resource in the extensions.gardener.cloud/v1alpha1 API group. It supports CoreOS Container Linux and Flatcar Container Linux (“a friendly fork of CoreOS Container Linux”).\nThe controller manages those objects that are requesting CoreOS Container Linux configuration (.spec.type=coreos) or Flatcar Container Linux configuration (.spec.type=flatcar):\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: coreos units: ... files: ... Please find a concrete example in the example folder.\nAfter reconciliation the resulting data will be stored in a secret within the same namespace (as the config itself might contain confidential data). The name of the secret will be written into the resource’s .status field:\n... status: ... cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default command: /usr/bin/coreos-cloudinit -from-file=\u003cpath\u003e units: - docker-monitor.service - kubelet-monitor.service - kubelet.service The secret has one data key cloud_config that stores the generation.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. 
Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for the CoreOS/FlatCar Container Linux operating system","excerpt":"Gardener extension controller for the CoreOS/FlatCar Container Linux …","ref":"/docs/extensions/os-extensions/gardener-extension-os-coreos/","tags":"","title":"CoreOS/FlatCar OS"},{"body":"Create a Shoot Cluster As you have already prepared an example Shoot manifest in the steps described in the development documentation, please open another Terminal pane/window with the KUBECONFIG environment variable pointing to the Garden development cluster and send the manifest to the Kubernetes API server:\nkubectl apply -f your-shoot-aws.yaml You should see that Gardener has immediately picked up your manifest and has started to deploy the Shoot cluster.\nIn order to investigate what is happening in the Seed cluster, please download its proper Kubeconfig yourself (see next paragraph). The namespace of the Shoot cluster in the Seed cluster will look like this: shoot-johndoe-johndoe-1, where the first johndoe is your namespace in the Garden cluster (also called “project”) and the johndoe-1 suffix is the actual name of the Shoot cluster.\nTo connect to the newly created Shoot cluster, you must download its Kubeconfig as well. 
Please connect to the proper Seed cluster, navigate to the Shoot namespace, and download the Kubeconfig from the kubecfg secret in that namespace.\nDelete a Shoot Cluster In order to delete your cluster, you have to set an annotation confirming the deletion first, and trigger the deletion after that. You can use the prepared delete shoot script which takes the Shoot name as first parameter. The namespace can be specified by the second parameter, but it is optional. If you don’t state it, it defaults to your namespace (the username you are logged in with to your machine).\n./hack/usage/delete shoot johndoe-1 johndoe (the hack bash script can be found at GitHub)\nConfigure a Shoot Cluster Alert Receiver The receiver of the Shoot alerts can be configured from the .spec.monitoring.alerting.emailReceivers section in the Shoot specification. The value of the field has to be a list of valid mail addresses.\nThe alerting for the Shoot clusters is handled by the Prometheus Alertmanager. The Alertmanager will be deployed next to the control plane when the Shoot resource specifies .spec.monitoring.alerting.emailReceivers and if an SMTP secret exists.\nIf the field gets removed, then the Alertmanager will also be removed during the next reconciliation of the cluster. Conversely, the Alertmanager will be deployed if the field is added to an existing cluster.\n","categories":"","description":"","excerpt":"Create a Shoot Cluster As you have already prepared an example Shoot …","ref":"/docs/guides/administer-shoots/create-delete-shoot/","tags":"","title":"Create / Delete a Shoot Cluster"},{"body":"Overview Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on Alibaba Cloud.\nPrerequisites You have created an Alibaba Cloud account. You have access to the Gardener dashboard and have permissions to create projects. 
Steps Go to the Gardener dashboard and create a project.\n To be able to add shoot clusters to this project, you must first create a technical user on Alibaba Cloud with sufficient permissions.\n Choose Secrets, then the plus icon and select AliCloud.\n To copy the policy for Alibaba Cloud from the Gardener dashboard, click on the help icon for Alibaba Cloud secrets, and choose copy .\n Create a custom policy in Alibaba Cloud:\n Log on to your Alibaba account and choose RAM \u003e Permissions \u003e Policies.\n Enter the name of your policy.\n Select Script.\n Paste the policy that you copied from the Gardener dashboard to this custom policy.\n Choose OK.\n In the Alibaba Cloud console, create a new technical user:\n Choose RAM \u003e Users.\n Choose Create User.\n Enter a logon and display name for your user.\n Select Open API Access.\n Choose OK.\n After the user is created, AccessKeyId and AccessKeySecret are generated and displayed. Remember to save them. The AccessKey is used later to create secrets for Gardener.\n Assign the policy you created to the technical user:\n Choose RAM \u003e Permissions \u003e Grants.\n Choose Grant Permission.\n Select Alibaba Cloud Account.\n Assign the policy you’ve created before to the technical user.\n Create your secret.\n Type the name of your secret. Copy and paste the Access Key ID and Secret Access Key you saved when you created the technical user on Alibaba Cloud. Choose Add secret. 
After completing these steps, you should see your newly created secret in the Infrastructure Secrets section.\n To create a new cluster, choose Clusters and then the plus sign in the upper right corner.\n In the Create Cluster section:\n Select AliCloud in the Infrastructure tab.\n Type the name of your cluster in the Cluster Details tab.\n Choose the secret you created before in the Infrastructure Details tab.\n Choose Create.\n Wait for your cluster to get created.\n Result After completing the steps in this tutorial, you will be able to see and download the kubeconfig of your cluster. With it you can create shoot clusters on Alibaba Cloud. The size of persistent volumes in your shoot cluster must be at least 20 GiB. If you choose smaller sizes in your Kubernetes PV definition, the allocation of cloud disk space on Alibaba Cloud fails.\n ","categories":"","description":"","excerpt":"Overview Gardener allows you to create a Kubernetes cluster on …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/tutorials/kubernetes-cluster-on-alicloud-with-gardener/kubernetes-cluster-on-alicloud-with-gardener/","tags":"","title":"Create a Kubernetes Cluster on Alibaba Cloud with Gardener"},{"body":"Overview Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on Azure.\nPrerequisites You have created an Azure account. You have access to the Gardener dashboard and have permissions to create projects. You have an Azure Service Principal assigned to your subscription. Steps Go to the Gardener dashboard and create a Project.\n Get the properties of your Azure AD tenant, Subscription and Service Principal.\nBefore you can provision and access a Kubernetes cluster on Azure, you need to add the Azure service principal, AD tenant and subscription credentials in Gardener. 
Gardener needs the credentials to provision and operate the Azure infrastructure for your Kubernetes cluster.\nEnsure that the Azure service principal has the actions defined within the Azure Permissions within your Subscription assigned. If no fine-grained permission/actions are required, then simply the built-in Contributor role can be assigned.\n Tenant ID\nTo find your TenantID, follow this guide.\n SubscriptionID\nTo find your SubscriptionID, search for and select Subscriptions. After that, copy the SubscriptionID from your subscription of choice. Service Principal (SPN)\nA service principal consist of a ClientID (also called ApplicationID) and a Client Secret. For more information, see Application and service principal objects in Azure Active Directory. You need to obtain the:\n Client ID\nAccess the Azure Portal and navigate to the Active Directory service. Within the service navigate to App registrations and select your service principal. Copy the ClientID you see there.\n Client Secret\nSecrets for the Azure Account/Service Principal can be generated/rotated via the Azure Portal. After copying your ClientID, in the Detail view of your Service Principal navigate to Certificates \u0026 secrets. In the section, you can generate a new secret.\n Choose Secrets, then the plus icon and select Azure.\n Create your secret.\n Type the name of your secret. Copy and paste the TenantID, SubscriptionID and the Service Principal credentials (ClientID and ClientSecret). Choose Add secret. 
After completing these steps, you should see your newly created secret in the Infrastructure Secrets section.\n Register resource providers for your subscription.\n Go to your Azure dashboard Navigate to Subscriptions -\u003e \u003cyour_subscription\u003e Pick resource providers from the sidebar Register microsoft.Network Register microsoft.Compute To create a new cluster, choose Clusters and then the plus sign in the upper right corner.\n In the Create Cluster section:\n Select Azure in the Infrastructure tab. Type the name of your cluster in the Cluster Details tab. Choose the secret you created before in the Infrastructure Details tab. Choose Create. Wait for your cluster to get created.\n Result After completing the steps in this tutorial, you will be able to see and download the kubeconfig of your cluster.\n","categories":"","description":"","excerpt":"Overview Gardener allows you to create a Kubernetes cluster on …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/tutorials/kubernetes-cluster-on-azure-with-gardener/kubernetes-cluster-on-azure-with-gardener/","tags":"","title":"Create a Kubernetes Cluster on Azure with Gardener"},{"body":"Overview Gardener can create a new VPC, or use an existing one for your shoot cluster. Depending on your needs, you may want to create shoot(s) into an already created VPC. The tutorial describes how to create a shoot cluster into an existing AWS VPC. The steps are identical for Alicloud, Azure, and GCP. 
Please note that the existing VPC must be in the same region as the shoot cluster that you want to deploy into the VPC.\nTL;DR If .spec.provider.infrastructureConfig.networks.vpc.cidr is specified, Gardener will create a new VPC with the given CIDR block and will delete it on shoot deletion.\nIf .spec.provider.infrastructureConfig.networks.vpc.id is specified, Gardener will use the existing VPC and won’t delete it on shoot deletion.\nNote It’s not recommended to create a shoot cluster into a VPC that is managed by Gardener (that is created for another shoot cluster). In this case the deletion of the initial shoot cluster will fail to delete the VPC because there will be resources attached to it.\nGardener won’t delete any manually created (unmanaged) resources in your cloud provider account.\n 1. Configure the AWS CLI The aws configure command is a convenient way to set up your AWS CLI. It will prompt you for your credentials and settings which will be used in the following AWS CLI invocations:\naws configure AWS Access Key ID [None]: \u003cACCESS_KEY_ID\u003e AWS Secret Access Key [None]: \u003cSECRET_ACCESS_KEY\u003e Default region name [None]: \u003cDEFAULT_REGION\u003e Default output format [None]: \u003cDEFAULT_OUTPUT_FORMAT\u003e 2. Create a VPC Create the VPC by running the following command:\naws ec2 create-vpc --cidr-block \u003ccidr-block\u003e { \"Vpc\": { \"VpcId\": \"vpc-ff7bbf86\", \"InstanceTenancy\": \"default\", \"Tags\": [], \"CidrBlockAssociations\": [ { \"AssociationId\": \"vpc-cidr-assoc-6e42b505\", \"CidrBlock\": \"10.0.0.0/16\", \"CidrBlockState\": { \"State\": \"associated\" } } ], \"Ipv6CidrBlockAssociationSet\": [], \"State\": \"pending\", \"DhcpOptionsId\": \"dopt-38f7a057\", \"CidrBlock\": \"10.0.0.0/16\", \"IsDefault\": false } } Gardener requires the VPC to have DNS support enabled, i.e., the attributes enableDnsSupport and enableDnsHostnames must be set to true. 
The enableDnsSupport attribute is enabled by default, but enableDnsHostnames is not. Set the enableDnsHostnames attribute to true:\naws ec2 modify-vpc-attribute --vpc-id vpc-ff7bbf86 --enable-dns-hostnames 3. Create an Internet Gateway Gardener also requires that an internet gateway is attached to the VPC. You can create one by using:\naws ec2 create-internet-gateway { \"InternetGateway\": { \"Tags\": [], \"InternetGatewayId\": \"igw-c0a643a9\", \"Attachments\": [] } } and attach it to the VPC using:\naws ec2 attach-internet-gateway --internet-gateway-id igw-c0a643a9 --vpc-id vpc-ff7bbf86 4. Create the Shoot Prepare your shoot manifest (you could check the example manifests). Please make sure that you choose the region in which you created the VPC earlier (step 2). Also, put your VPC ID in the .spec.provider.infrastructureConfig.networks.vpc.id field:\nspec: region: \u003caws-region-of-vpc\u003e provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: id: vpc-ff7bbf86 # ... Apply your shoot manifest:\nkubectl apply -f your-shoot-aws.yaml Ensure that the shoot cluster is properly created:\nkubectl get shoot $SHOOT_NAME -n $SHOOT_NAMESPACE NAME CLOUDPROFILE VERSION SEED DOMAIN OPERATION PROGRESS APISERVER CONTROL NODES SYSTEM AGE \u003cSHOOT_NAME\u003e aws 1.15.0 aws \u003cSHOOT_DOMAIN\u003e Succeeded 100 True True True True 20m ","categories":"","description":"","excerpt":"Overview Gardener can create a new VPC, or use an existing one for …","ref":"/docs/guides/administer-shoots/create-shoot-into-existing-aws-vpc/","tags":"","title":"Create a Shoot Cluster Into an Existing AWS VPC"},{"body":"Overview Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on GCP.\nPrerequisites You have created a GCP account. 
You have access to the Gardener dashboard and have permissions to create projects. Steps Go to the Gardener dashboard and create a Project.\n Check which roles are required by Gardener.\n Choose Secrets, then the plus icon and select GCP.\n Click on the help button.\n Create a service account with the correct roles in GCP:\n Create a new service account in GCP.\n Enter the name and description of your service account.\n Assign the roles required by Gardener.\n Choose Done.\n Create a key for your service account:\n Locate your service account, then choose Actions and Manage keys.\n Choose Add Key, then Create new key.\n Save the private key of the service account in JSON format.\n Note Save the key; it is used later to create secrets for Gardener. Enable the Google Compute API by following these steps.\n When you are finished, you should see the following page:\n Enable the Google IAM API by following these steps.\n When you are finished, you should see the following page:\n On the Gardener dashboard, choose Secrets and then the plus sign. Select GCP from the drop-down menu to add a new GCP secret.\n Create your secret.\n Type the name of your secret. Select your Cloud Profile. Copy and paste the contents of the JSON file you saved when you created the service account key in GCP. Choose Add secret. After completing these steps, you should see your newly created secret in the Infrastructure Secrets section.\n To create a new cluster, choose Clusters and then the plus sign in the upper right corner.\n In the Create Cluster section:\n Select GCP in the Infrastructure tab. Type the name of your cluster in the Cluster Details tab. Choose the secret you created before in the Infrastructure Details tab. Choose Create. 
Wait for your cluster to be created.\n Result After completing the steps in this tutorial, you will be able to see and download the kubeconfig of your cluster.\n","categories":"","description":"","excerpt":"Overview Gardener allows you to create a Kubernetes cluster on …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/tutorials/kubernetes-cluster-on-gcp-with-gardener/kubernetes-cluster-on-gcp-with-gardener/","tags":"","title":"Create a Kubernetes Cluster on GCP with Gardener"},{"body":"Custom containerd Configuration In case a Shoot cluster uses containerd, it is possible to make the containerd process load custom configuration files. Gardener initializes containerd with the following statement:\nimports = [\"/etc/containerd/conf.d/*.toml\"] This means that all *.toml files in the /etc/containerd/conf.d directory will be imported and merged with the default configuration. To prevent unintended configuration overwrites, please be aware that containerd merges config sections, not individual keys (see here and here). Please consult the upstream containerd documentation for more information.\n ⚠️ Note that this only applies to nodes which were newly created after gardener/gardener@v1.51 was deployed. Existing nodes are not affected.\n ","categories":"","description":"","excerpt":"Custom containerd Configuration In case a Shoot cluster uses …","ref":"/docs/gardener/custom-containerd-config/","tags":"","title":"Custom containerd Configuration"},{"body":"Custom DNS Configuration Gardener provides Kubernetes-Clusters-As-A-Service where all the system components (e.g., kube-proxy, networking, dns) are managed. 
As a result, Gardener needs to ensure and auto-correct additional configuration to those system components to avoid unnecessary downtime.\nIn some cases, auto-correcting system components can prevent users from deploying applications on top of the cluster that require some customization; DNS configuration is a good example.\nTo allow for customizations for DNS configuration (that could potentially lead to downtime) while having the option to “undo”, we utilize the import plugin from CoreDNS [1], which enables in-line configuration changes.\nHow to use To customize your CoreDNS cluster config, you can simply edit a ConfigMap named coredns-custom in the kube-system namespace. By editing this ConfigMap, you are modifying CoreDNS configuration, therefore care is advised.\nFor example, to apply new config to CoreDNS that would point all .global DNS requests to another DNS pod, simply edit the configuration as follows:\napiVersion: v1 kind: ConfigMap metadata: name: coredns-custom namespace: kube-system data: istio.server: |global:8053 { errors cache 30 forward . 1.2.3.4 } corefile.override: |# \u003csome-plugin\u003e \u003csome-plugin-config\u003e debug whoami The port number 8053 in global:8053 is the specific port that CoreDNS is bound to and cannot be changed to any other port if it should act on ordinary name resolution requests from pods. Otherwise, CoreDNS will open a second port, but you are responsible for directing the traffic to this port. The kube-dns service in the kube-system namespace will direct name resolution requests within the cluster to port 8053 on the CoreDNS pods. Moreover, additional network policies are needed to allow corresponding ingress traffic to CoreDNS pods. In order for the destination DNS server to be reachable, it must listen on port 53, as required by the network policies. 
Other ports are only possible if additional network policies allow corresponding egress traffic from CoreDNS pods.\nIt is important that the ConfigMap keys end with *.server (if you would like to add a new server) or *.override (if you want to customize the current server configuration). Setting both is optional.\n[Optional] Reload CoreDNS As Gardener configures the reload plugin of CoreDNS, a restart of the CoreDNS components is typically not necessary to propagate ConfigMap changes. However, if you don’t want to wait for the default (30s) to kick in, you can roll out your CoreDNS deployment using:\nkubectl -n kube-system rollout restart deploy coredns This will reload the config into CoreDNS.\nThe approach we follow here was inspired by AKS’s approach [2].\nAnti-Pattern Applying a configuration that is incompatible with the running version of CoreDNS is an anti-pattern (plugin configuration sometimes changes, so simply applying a configuration can break DNS).\nIf incompatible changes are applied by mistake, simply delete the content of the ConfigMap and re-apply. This should bring the cluster DNS back to a functioning state.\nNode Local DNS Custom DNS configuration may not work as expected in conjunction with NodeLocalDNS. With NodeLocalDNS, ordinary DNS queries targeted at the upstream DNS servers, i.e. non-Kubernetes domains, will not end up at CoreDNS, but will instead be directly sent to the upstream DNS server. Therefore, configuration applying to non-Kubernetes entities, e.g. the istio.server block in the custom DNS configuration example, may not have any effect with NodeLocalDNS enabled. If this kind of custom configuration is required, forwarding to upstream DNS has to be disabled. This can be done by setting the option (spec.systemComponents.nodeLocalDNS.disableForwardToUpstreamDNS) in the Shoot resource to true:\n... spec: ... systemComponents: nodeLocalDNS: enabled: true disableForwardToUpstreamDNS: true ... 
References [1] Import plugin [2] AKS Custom DNS\n","categories":"","description":"","excerpt":"Custom DNS Configuration Gardener provides …","ref":"/docs/gardener/custom-dns-config/","tags":"","title":"Custom DNS Configuration"},{"body":"Custom Shoot Fields The Dashboard supports custom shoot fields, which can be configured to be displayed on the cluster list and cluster details page. Custom fields do not show up on the ALL_PROJECTS page.\nProject administration page: Each custom field configuration is shown with its own chip.\nClick on the chip to show more details for the custom field configuration.\nCustom fields can be shown on the cluster list, if showColumn is enabled. See configuration below for more details. In this example, a custom field for the Shoot status was configured.\nCustom fields can be shown in a dedicated card (Custom Fields) on the cluster details page, if showDetails is enabled. See configuration below for more details.\nConfiguration Property Type Default Required Description name String ✔️ Name of the custom field path String ✔️ Path in shoot resource, of which the value must be of primitive type (no object / array). Use lodash get path syntax, e.g. metadata.labels[\"shoot.gardener.cloud/status\"] or spec.networking.type icon String MDI icon for field on the cluster details page. See https://materialdesignicons.com/ for available icons. Must be in the format: mdi-\u003cicon-name\u003e. tooltip String Tooltip for the custom field that appears when hovering with the mouse over the value defaultValue String/Number Default value, in case there is no value for the given path showColumn Bool true Field shall appear as column in the cluster list columnSelectedByDefault Bool true Indicates if field shall be selected by default on the cluster list (not hidden by default) weight Number 0 Defines the order of the column. The built-in columns start with a weight of 100, increasing by 100 (200, 300, etc.) 
sortable Bool true Indicates if column is sortable on the cluster list searchable Bool true Indicates if column is searchable on the cluster list showDetails Bool true Indicates if field shall appear in a dedicated card (Custom Fields) on the cluster details page Editor for Custom Shoot Fields The Gardener Dashboard now includes an editor for custom shoot fields, allowing users to configure these fields directly from the dashboard without needing to use kubectl. This editor can be accessed from the project administration page.\nAccessing the Editor Navigate to the project administration page. Scroll down to the Custom Fields for Shoots section. Click on the gear icon to open the configuration panel for custom fields. Adding a New Custom Field In the Configure Custom Fields for Shoot Clusters panel, click on the + ADD NEW FIELD button. Fill in the details for the new custom field in the Add New Field form. Refer to the Configuration section for detailed descriptions of each field.\n Click the ADD button to save the new custom field.\n Example Custom shoot fields can be defined per project by specifying metadata.annotations[\"dashboard.gardener.cloud/shootCustomFields\"]. 
The following is an example project yaml:\napiVersion: core.gardener.cloud/v1beta1 kind: Project metadata: annotations: dashboard.gardener.cloud/shootCustomFields: |{ \"shootStatus\": { \"name\": \"Shoot Status\", \"path\": \"metadata.labels[\\\"shoot.gardener.cloud/status\\\"]\", \"icon\": \"mdi-heart-pulse\", \"tooltip\": \"Indicates the health status of the cluster\", \"defaultValue\": \"unknown\", \"showColumn\": true, \"columnSelectedByDefault\": true, \"weight\": 950, \"searchable\": true, \"sortable\": true, \"showDetails\": true }, \"networking\": { \"name\": \"Networking Type\", \"path\": \"spec.networking.type\", \"icon\": \"mdi-table-network\", \"showColumn\": false } } ","categories":"","description":"","excerpt":"Custom Shoot Fields The Dashboard supports custom shoot fields, which …","ref":"/docs/dashboard/custom-fields/","tags":"","title":"Custom Fields"},{"body":"Overview Seccomp (secure computing mode) is a security facility in the Linux kernel for restricting the set of system calls applications can make.\nStarting from Kubernetes v1.3.0, the Seccomp feature is in Alpha. To configure it on a Pod, the following annotations can be used:\n seccomp.security.alpha.kubernetes.io/pod: \u003cseccomp-profile\u003e where \u003cseccomp-profile\u003e is the seccomp profile to apply to all containers in a Pod. container.seccomp.security.alpha.kubernetes.io/\u003ccontainer-name\u003e: \u003cseccomp-profile\u003e where \u003cseccomp-profile\u003e is the seccomp profile to apply to \u003ccontainer-name\u003e in a Pod. More details can be found in the PodSecurityPolicy documentation.\nInstallation of a Custom Profile By default, kubelet loads custom Seccomp profiles from /var/lib/kubelet/seccomp/. 
There are two ways in which Seccomp profiles can be added to a Node:\n by baking them into the machine image by adding them at runtime This guide focuses on creating those profiles via a DaemonSet.\nCreate a file called seccomp-profile.yaml with the following content:\napiVersion: v1 kind: ConfigMap metadata: name: seccomp-profile namespace: kube-system data: my-profile.json: |{ \"defaultAction\": \"SCMP_ACT_ALLOW\", \"syscalls\": [ { \"name\": \"chmod\", \"action\": \"SCMP_ACT_ERRNO\" } ] } Note The policy above is a very simple one and not suitable for complex applications. The default docker profile can be used as a reference. Feel free to modify it to your needs. Apply the ConfigMap in your cluster:\n$ kubectl apply -f seccomp-profile.yaml configmap/seccomp-profile created The next step is to create the DaemonSet Seccomp installer. It’s going to copy the policy from above to /var/lib/kubelet/seccomp/my-profile.json.\nCreate a file called seccomp-installer.yaml with the following content:\napiVersion: apps/v1 kind: DaemonSet metadata: name: seccomp-installer namespace: kube-system labels: security: seccomp spec: selector: matchLabels: security: seccomp template: metadata: labels: security: seccomp spec: initContainers: - name: installer image: alpine:3.10.0 command: [\"/bin/sh\", \"-c\", \"cp -r -L /seccomp/*.json /host/seccomp/\"] volumeMounts: - name: profiles mountPath: /seccomp - name: hostseccomp mountPath: /host/seccomp readOnly: false containers: - name: pause image: k8s.gcr.io/pause:3.1 terminationGracePeriodSeconds: 5 volumes: - name: hostseccomp hostPath: path: /var/lib/kubelet/seccomp - name: profiles configMap: name: seccomp-profile Create the installer and wait until it’s ready on all Nodes:\n$ kubectl apply -f seccomp-installer.yaml daemonset.apps/seccomp-installer created $ kubectl -n kube-system get pods -l security=seccomp NAME READY STATUS RESTARTS AGE seccomp-installer-wjbxq 1/1 Running 0 21s Create a Pod Using a Custom Seccomp Profile Finally, we want to create 
a Pod which uses our new Seccomp profile my-profile.json.\nCreate a file called my-seccomp-pod.yaml with the following content:\napiVersion: v1 kind: Pod metadata: name: seccomp-app namespace: default annotations: seccomp.security.alpha.kubernetes.io/pod: \"localhost/my-profile.json\" # You can specify a seccomp profile per container. If you add another profile you can configure # it for a specific container - 'pause' in this case. # container.seccomp.security.alpha.kubernetes.io/pause: \"localhost/some-other-profile.json\" spec: containers: - name: pause image: k8s.gcr.io/pause:3.1 Create the Pod and see that it’s running:\n$ kubectl apply -f my-seccomp-pod.yaml pod/seccomp-app created $ kubectl get pod seccomp-app NAME READY STATUS RESTARTS AGE seccomp-app 1/1 Running 0 42s Troubleshooting If an invalid or a non-existing profile is used, then the Pod will be stuck in the ContainerCreating phase:\nbroken-seccomp-pod.yaml:\napiVersion: v1 kind: Pod metadata: name: broken-seccomp namespace: default annotations: seccomp.security.alpha.kubernetes.io/pod: \"localhost/not-existing-profile.json\" spec: containers: - name: pause image: k8s.gcr.io/pause:3.1 $ kubectl apply -f broken-seccomp-pod.yaml pod/broken-seccomp created $ kubectl get pod broken-seccomp NAME READY STATUS RESTARTS AGE broken-seccomp 0/1 ContainerCreating 0 2m $ kubectl describe pod broken-seccomp Name: broken-seccomp Namespace: default .... 
Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 18s default-scheduler Successfully assigned kube-system/broken-seccomp to docker-desktop Warning FailedCreatePodSandBox 4s (x2 over 18s) kubelet, docker-desktop Failed create pod sandbox: rpc error: code = Unknown desc = failed to make sandbox docker config for pod \"broken-seccomp\": failed to generate sandbox security options for sandbox \"broken-seccomp\": failed to generate seccomp security options for container: cannot load seccomp profile \"/var/lib/kubelet/seccomp/not-existing-profile.json\": open /var/lib/kubelet/seccomp/not-existing-profile.json: no such file or directory Related Links Seccomp A Seccomp Overview Seccomp Security Profiles for Docker Using Seccomp to Limit the Kernel Attack Surface ","categories":"","description":"","excerpt":"Overview Seccomp (secure computing mode) is a security facility in the …","ref":"/docs/guides/applications/secure-seccomp/","tags":"","title":"Custom Seccomp Profile"},{"body":"Theming and Branding Motivation Gardener landscape administrators should have the possibility to change the appearance and the branding of the Gardener Dashboard via configuration without the need to touch the code.\nBranding It is possible to change the branding of the Gardener Dashboard when using the helm chart in the frontendConfig.branding map. The following configuration properties are supported:\n name description default documentTitle Title of the browser window Gardener Dashboard productName Name of the Gardener product Gardener productTitle Title of the Gardener product displayed below the logo. It could also contain information about the specific Gardener instance (e.g. Development, Canary, Live) Gardener productTitleSuperscript Superscript next to the product title. 
To suppress the superscript, set it to false Production version (e.g. 1.73.1) productSlogan Slogan that is displayed under the product title and on the login page Universal Kubernetes at Scale productLogoUrl URL for the product logo. You can also use the data: scheme for development. For production it is recommended to provide static assets /static/assets/logo.svg teaserHeight Height of the teaser in the GMainNavigation component 200 teaserTemplate Custom HTML template to replace the teaser content refer to GTeaser loginTeaserHeight Height of the login teaser in the GLogin component 260 loginTeaserTemplate Custom HTML template to replace the login teaser content refer to GLoginTeaser loginFooterHeight Height of the login footer in the GLogin component 24 loginFooterTemplate Custom HTML template to replace the login footer content refer to GLoginFooter loginHints Links { title: string; href: string; } to product-related sites shown below the login button undefined oidcLoginTitle Title of tabstrip for loginType OIDC OIDC oidcLoginText Text shown above the login button on the OIDC tabstrip Press Login to be redirected to\nconfigured OpenID Connect Provider. Colors Gardener Dashboard has been built with Vuetify. We use Vuetify’s built-in theming support to centrally configure colors that are used throughout the web application. Colors can be configured for both light and dark themes. Configuration is done via the helm chart, see the respective theme section there. Colors can be specified as an HTML color code (e.g. #FF0000 for red) or by referencing a color (e.g. grey.darken3 or shades.white) from Vuetify’s Material Design Color Pack.\nThe following colors can be configured:\n name usage primary icons, chips, buttons, popovers, etc. anchor links main-background main navigation, login page main-navigation-title text color on main navigation toolbar-background background color for toolbars in cards, dialogs, etc. toolbar-title text color for toolbars in cards, dialogs, etc. 
action-button buttons in tables and cards, e.g. cluster details page info notification info popups, texts and status tags success notification success popups, texts and status tags warning notification warning popups, texts and status tags error notification error popups, texts and status tags unknown status tags with unknown severity … all other Vuetify theme colors If you use the helm chart, you can configure those with frontendConfig.themes.light for the light theme and frontendConfig.themes.dark for the dark theme. The customization example below shows a possible custom color theme configuration.\nLogos and Icons It is also possible to exchange the Dashboard logo and icons. You can replace the assets folder when using the helm chart in the frontendConfig.assets map.\nAttention: You need to set values for all files as mapping the volume will overwrite all files. It is not possible to exchange single files.\nThe files have to be encoded as base64 for the chart - to generate the encoded files for the values.yaml of the helm chart, you can use the following shorthand with bash or zsh on Linux systems. If you use macOS, install coreutils with brew (brew install coreutils) or remove the -w0 parameter.\ncat \u003c\u003c EOF ### ### COPY EVERYTHING BELOW THIS LINE ### assets: favicon-16x16.png: | $(cat frontend/public/static/assets/favicon-16x16.png | base64 -w0) favicon-32x32.png: | $(cat frontend/public/static/assets/favicon-32x32.png | base64 -w0) favicon-96x96.png: | $(cat frontend/public/static/assets/favicon-96x96.png | base64 -w0) favicon.ico: | $(cat frontend/public/static/assets/favicon.ico | base64 -w0) logo.svg: | $(cat frontend/public/static/assets/logo.svg | base64 -w0) EOF Then, swap in the base64 encoded version of your files where needed.\nCustomization Example The following example configuration in values.yaml shows most of the possibilities to achieve a custom theming and branding:\nglobal: dashboard: frontendConfig: # ... 
branding: productName: Nucleus productTitle: Nucleus productSlogan: Supercool Cluster Service teaserHeight: 160 teaserTemplate: |\u003cdiv class=\"text-center px-2\" \u003e \u003ca href=\"/\" class=\"text-decoration-none\" \u003e \u003cimg src=\"{{ productLogoUrl }}\" width=\"80\" height=\"80\" alt=\"{{ productName }} Logo\" class=\"pointer-events-none\" \u003e \u003cdiv class=\"font-weight-thin text-grey-lighten-4\" style=\"font-size: 32px; line-height: 32px; letter-spacing: 2px;\" \u003e {{ productTitle }} \u003c/div\u003e \u003cdiv class=\"text-body-1 font-weight-normal text-primary mt-1\"\u003e {{ productSlogan }} \u003c/div\u003e \u003c/a\u003e \u003c/div\u003e loginTeaserHeight: 296 loginTeaserTemplate: |\u003cdiv class=\"d-flex flex-column align-center justify-center bg-main-background-darken-1 pa-3\" style=\"min-height: {{ minHeight }}px\" \u003e \u003cimg src=\"{{ productLogoUrl }}\" alt=\"Login to {{ productName }}\" width=\"140\" height=\"140\" class=\"mt-2\" \u003e \u003cdiv class=\"text-h3 text-center font-weight-thin text-white mt-4\"\u003e {{ productTitle }} \u003c/div\u003e \u003cdiv class=\"text-h5 text-center font-weight-light text-primary mt-1\"\u003e {{ productSlogan }} \u003c/div\u003e \u003c/div\u003e loginFooterTemplate: |\u003cdiv class=\"text-anchor text-caption\"\u003e Copyright 2023 by Nucleus Corporation \u003c/div\u003e loginHints: - title: Support href: https://gardener.cloud - title: Documentation href: https://gardener.cloud/docs oidcLoginTitle: IDS oidcLoginText: Press LOGIN to be redirected to the Nucleus Identity Service. 
themes: light: primary: '#354a5f' anchor: '#5b738b' main-background: '#354a5f' main-navigation-title: '#f5f6f7' toolbar-background: '#354a5f' toolbar-title: '#f5f6f7' action-button: '#354a5f' dark: primary: '#5b738b' anchor: '#5b738b' background: '#273849' surface: '#1d2b37' main-background: '#1a2733' main-navigation-title: '#f5f6f7' toolbar-background: '#0e1e2a' toolbar-title: '#f5f6f7' action-button: '#5b738b' assets: favicon-16x16.png: | iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAMAAAAoLQ9TAAAABGdBTUEAALGPC/xhBQAAACBjSFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAABHVBMVEUAAAALgGILgWIKgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIMgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIKgGILgGILgGIKgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGL///8Vq54LAAAAXXRSTlMAAAAAFmu96Pv0TRQ5NB0FLLn67j1X8fDdnSoR3CJC9cZx9/7ZtAna9rvFe28Cl552ZLDUIS7X5UnfVOrrOKS+Q7/Kz61IAwQKC8gVEwYIyQxgZQYCVn9+IFjR7/wm8JKCAAAAAWJLR0ReBNZhuwAAAAd0SU1FB+cKCgkYLrOE10YAAADMSURBVBjTXY7XUsIAFETvxhJQQwlEEhUlCEizVxANCIoQwAIobf//NyTjMDqcp5192D0ivwCra+uqz78h2NzSAkEgFNZJRqICYztmWju7YXrsxQX7B/OQsJOHqRiZzggCR8zmVJX5QpYsQnB8Qp6e+fTzC/LyCqLg+oa3d6lS+Z4VA/AOHx6dau3JqDeeX+ApKGi+tpy22+lCkYVWz3l7X5E/0KstFc2Pz/6iwMDVhvbXtz3U3MF8dDS2JtOZpz2bTqzxSEyd/9BN4RI/8jsrfdR558kAAAAldEVYdGRhdGU6Y3JlYXRlADIwMjMtMTAtMTBUMDk6MjQ6MzMrMDA6MDC+UDWaAAAAJXRFWHRkYXRlOm1vZGlmeQAyMDIzLTEwLTEwVDA5OjI0OjMzKzAwOjAwzw2NJgAAABJ0RVh0ZXhpZjpFeGlmT2Zmc2V0ADI2UxuiZQAAABh0RVh0ZXhpZjpQaXhlbFhEaW1lbnNpb24AMTUwO0W0KAAAABh0RVh0ZXhpZjpQaXhlbFlEaW1lbnNpb24AMTUwpkpVXgAAACB0RVh0c29mdHdhcmUAaHR0cHM6Ly9pbWFnZW1hZ2ljay5vcme8zx2dAAAAGHRFWHRUaHVtYjo6RG9jdW1lbnQ6OlBhZ2VzADGn/7svAAAAGHRFWHRUaHVtYjo6SW1hZ2U6OkhlaWdodAAxOTJAXXFVAAAAF3RFWHRUaHVtYjo6SW1hZ2U6OldpZHRoADE5MtOsIQgAAAAZdEVYdFRodW1iOjpNaW1ldHlwZQBpbWFnZS9wbmc/slZOAAAAF3RFWHRUaHVtYjo6TVRpbWUAMTY5NjkyOTg3M4YMipU
AAAAPdEVYdFRodW1iOjpTaXplADBCQpSiPuwAAABWdEVYdFRodW1iOjpVUkkAZmlsZTovLy9tbnRsb2cvZmF2aWNvbnMvMjAyMy0xMC0xMC9kNzEyMWM2YzM2OTg3NmQ0MGUxY2EyMjVlYjg3MGZjYi5pY28ucG5nU19BKAAAAABJRU5ErkJggg== favicon-32x32.png: | iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAMAAABEpIrGAAAABGdBTUEAALGPC/xhBQAAACBjSFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAACTFBMVEUAAAALgWIKil4LgGIKgGEMf2MNgGIMgGILgGEKgWEIhVwGgl8JgVwMgGMKgGMNgGEKf2EMgWMMf2ILhmILhGIKjFwHgmYMf2EKgWMGgGIKgGIIg2UAgGMCkGINfGIMfGILfmELfWELgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIKgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgWELgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIMgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIKgWELgGILgGILgGIMgmILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIKg2ALgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIMgGELgGILgGILgGILgGILgGILgGILgGILgGILgGILgWILgGILgGILgGILgGILgGILgGILgGIKgGILgGILgGILgGILgGILgGILgmIMgWELgGIHgWILgGIMgWIHgGILgGILgGILgGILgGILgGILgWMLgGILgGILgGILgGMLgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGL///+Wa9azAAAAwnRSTlMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGVWDrNTs+v77InTF9YIGVs5sAjhdYk0wDwVv8PlPBan96sF/MwLoL5PhcAwgy8kTePiAAmj8nwJe7qbmlNNzNOTE3bfvfm3YdRS8pdcuQuLDEvFRKtmVAXx7A6KjCHE2LeVfBbPKZNxLlo2KqgOHuT9Hi6BZAna2mlwRAQGeAQ4BAemOJSYXAwsfIwNAY32d0L1BGjlq0fLzazoNcu0Jrf5EbikAAAABYktHRMOKaI5CAAAAB3RJTUUH5woKCRgus4TXRgAAAdNJREFUOMtjYMAEjExKyiqqauoamloqzJjSLKzaOrp6hyBAH1MBs4GhkeYhGDAGKmA2MTUzN7OwtGJjB8pzWNvYwqUP2dmDFDg4HjrkZOfs4urmzszJbIgkf8jDE6SAywvK9fbxZfbzR5I/FBAIVMDNHATjB4eEhoHtDw+GCEREMoKcFeUEUxGtHAOiYuPiIfwEZh6QgkRnEEcvSevQoeQUkLxylB1YPskS7Etm31QQLy09I/NQONAwrazsHIgBuQ5gBbzMPmBuXn5BIYguKlaGGFBSCg0m5rJysEBFZRWIqq6phRhQVw9TwNGgBRIo9/OMPXRIvbGpGSyv5gkPZ+biWpDvwlta2w4darfqAMt3dvHxIyJAqRsUgD3MvYcO9fVPAAXJxEkCgshRVD95ytSp04SAjoiYPiN55qzZc5iFUWNRRHTuPDFxoIL5C+oX1i/Ckg6AQAKsQFKKARegm4LF+BT4LHFeiscEafHiZctXyOBQICu3ctWK1WvWrF6xaq28LHpar1Fat37DxpxNKUlJKZs2z09fv2VrD
SysRBR8t22ftWMnIjMAgeau+I279+xVFGFgqG/cN1MdRRKuaP9EHdMDDI5LDuEBzgcZDhEAlCsAAOGIeNYQEfj6AAAAJXRFWHRkYXRlOmNyZWF0ZQAyMDIzLTEwLTEwVDA5OjI0OjMzKzAwOjAwvlA1mgAAACV0RVh0ZGF0ZTptb2RpZnkAMjAyMy0xMC0xMFQwOToyNDozMyswMDowMM8NjSYAAAASdEVYdGV4aWY6RXhpZk9mZnNldAAyNlMbomUAAAAYdEVYdGV4aWY6UGl4ZWxYRGltZW5zaW9uADE1MDtFtCgAAAAYdEVYdGV4aWY6UGl4ZWxZRGltZW5zaW9uADE1MKZKVV4AAAAgdEVYdHNvZnR3YXJlAGh0dHBzOi8vaW1hZ2VtYWdpY2sub3JnvM8dnQAAABh0RVh0VGh1bWI6OkRvY3VtZW50OjpQYWdlcwAxp/+7LwAAABh0RVh0VGh1bWI6OkltYWdlOjpIZWlnaHQAMTkyQF1xVQAAABd0RVh0VGh1bWI6OkltYWdlOjpXaWR0aAAxOTLTrCEIAAAAGXRFWHRUaHVtYjo6TWltZXR5cGUAaW1hZ2UvcG5nP7JWTgAAABd0RVh0VGh1bWI6Ok1UaW1lADE2OTY5Mjk4NzOGDIqVAAAAD3RFWHRUaHVtYjo6U2l6ZQAwQkKUoj7sAAAAVnRFWHRUaHVtYjo6VVJJAGZpbGU6Ly8vbW50bG9nL2Zhdmljb25zLzIwMjMtMTAtMTAvZDcxMjFjNmMzNjk4NzZkNDBlMWNhMjI1ZWI4NzBmY2IuaWNvLnBuZ1NfQSgAAAAASUVORK5CYII= favicon-96x96.png: | iVBORw0KGgoAAAANSUhEUgAAAGAAAABgCAYAAADimHc4AAAABGdBTUEAALGPC/xhBQAAACBjSFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAABmJLR0QA/wD/AP+gvaeTAAAAB3RJTUUH5woKCRgus4TXRgAADuRJREFUeNrtnXuMXNV9xz/fc2dm3+uAwcEGg02IAwQIOC1xkrZGQQ0NiUpVaNVWRXjXJkLlj7RVEeA0Zmy3Uas0D6UoCZRXMIoSUNsoioQSqlBISQhpbew4xjwMNgYbG3vxvmZ2Hvf8+seZNbvE3r3z2J21mY80snfunXvPPd97f/d3fuf3u1e0mERndjUgQZySyGA6HTgHOANYAJwJLK78/5TKpxdoB9KVj4AyUARywDDwFvAN4D6A0ex9AKSafcBzgY71/ZiQM5uH2UJhHzD0QYxlwFnAQuA0oBuICB38zg8T/oUgRAroBOYTRFz8zn2/awXoWt8HwskzD+NCMy4DLQc+DJwl6AIyTO7UapkoTAwUVwJPTFjhXSVAT/ZmPMOITMbMn4OxwuBThE5fDHTM4O5LQO5xwE348l0jQFe2H08uA6kLDX8N8EngIqCH+s7ypBSB0RTgJ3x5cguQvYIe3gfykRnvM3QN8GfAB4G2WW5NARiKmaz2SStAV3YNRYaIsdNlugb4S+ByZtbMTMWY4IhhaIIROikF6MyuxmRRxrovAW4G/ojgiTSTnIkBeYd3bxuhk06Armw/yLdhugq4lXDWz4XjzIEdBiBlR790tW5tLtKd7cehDkzXA18CPsbc6HyAI5iGELz/9re/nLJxHRvWIAxQSlgXZp2YtQMZUBpwhhTWwYOVQQWwAiKPaSS2jqJTkVz232b06LrDYKoLb6uA2wgDqLmCAfuEigBb3QNHF0wpgPPjtspWAKuA00G9hBFhB5AW5io7KIXOZwQ0jHEYOBApfwDY25XtfwnsVaFhDwWBjQ/H66VzfR9ePiOvPzd0OyFcMJeIgdcRJWzygqSX5weA6wkjw2opA0eAN0H7DbYJfibY0bW+f/d7X75gdP+ybeQ/v6mmI+ta3wfOImJdDbqFudf5432wO0r5YhxPHnIkFWAUGKM2AVKEOMppwAX
ASqDPYBfGUweWPveoyulfdmy4cQDzPn/HvYk33LOxj0LBkU77i0F/Byyb9a5NxhjwaqngfLqrMGnBNDfh8evFRghRvUYQAfOA5cBfAXfL+Kbz8bURnELf1+je0JdoQ96LVNqfhulm4COz1Jm1cAA4KGcM3vrQpAXTCHA0lnSEEFJtNBHhZnkt8HUz+1LXOds+HnuXCWHh49OZXY33zsn0KeAa5o63cyxeB9481oKpBXjbXA0AQzPYQBHi7TcAdzlslbB5nXf007Oh/zdWbt/YjzPDyS/lqHMwp9mDahDAxk2Q7DAwOAsNTRHiNBuBWyTObBN0ZCeLkIoFThFwNfDbs9CuehgDngfyx1o4pQDxUS9UI8DhWWz0AuCvgbW5mIVCdG284ehCw2PmFxNMT88stqsW3kL8anTfmNcxYq7RVL+Mn3iWzBWXQRDqw4Rh/WyNnjPA+QJJtgXv8pkrL6Ft5XJKpImwq4AbaV5wLSm7gW9kutMDisoUH986aeG0nSnAiTLwCiGmPZv0AJ/FdL2HdosjDJGm3A18guBNNYocIWTcSAzYCQwgY+QLD/7GCtMK4JynHLtY8CLHsWMzzHzgJgeXp2JPxTNYDKxI0v4EHAF+BNwFvNbgtpeAX3jnB012zBWmPYChdQ/gnMdgP/BGgxuYlGVAXzmtUytjkw8Bi+rcZhnYCnyBELI+TPDEGskBYIvzLi5G6WOukPgMUmjgSw1uYFIccLW8PpmOyhFwMXBqHdsbA/4duEmObxFM2Z8QJuIbyU6D5w0o//3dxz2w6ZFhwRXdCVii3zSeBcCnSz46Fzgvcdt/k2HgHuBWr/hp75UhTFNe2OD2loHNiIOaYsY52UGYMFMR2E6wmc1iBaaPUXuoOQfcbbKN5t0eZymEX05wZ9M1bvN4HEY8qS2UplqpGhME8CvCsLoeSoQb+i8JA5RBkl9Vi4A/BpbUsN8i8LDEV62cOuicIdENug44t85jOhZbMJ7lUvjcuuOvlCh+4p3DeY/BHsEuQjpHrQjYJfE1sAEzLgKtAK6sdOxUY5P2ynrtNez3KeBfYvOvR2mwPNDOpcCnk/ZDFeSBH8dEBxyeL+r4Ed5EV0B+3T0gIeeGgSepz19OAVeacRNopFjU/ZJuAfoJeZOHpvitI9woowT7mchrwJ0Q7XBEmAfa1IlxLbVdTdOxS/BkRBxrUhbQsQ8oEc458D4WPA3sq7OBaeAzZqxta+NM7/2QSv5Jw24H1hMGfY2iBDxi8JgRm0+pYvDsYuAPaPzZ7wlX206A1DQCJD6Tio9vpu2K5YDGCLH8er0GR7C9eYn/JXIlobzENuAgcCkh87hedoD+wRm7zQlnBqIN6AP+sJo+SMhexD/HJXZGKRjOPjBtJyRGQJrUIeC/gZEGNLYbWAX6veLeqNIaG3OyR4CvMLU5SkIR+E9gu0lEisGEsLMJZ38tM3xT4YFHMXsmShsjd0w/512VAAYUKZrgv4DnGtToJcCazNnlM/CAPN40BnwHeITgT9fKy8CjBmMZ8ph3pIXM9AnC9Gij2Q087NCgJUw3rUqAkWy4mxt+N/AoTO3jJkTASkxXoVjmHcoYeAYQ9xJc31rwwE9NbJeMt7LfAaBsdgYhI7q3AW2fSAw8auIZD/j2ZJ511aNJ78pAVCAIsKtBjZ8PXIdFi0CMrL0fnAG2DXiY2oKAR4DHMBvFoLMyqWMhjjQT88fbEZtc2Y+YE2O33Z/oR1ULkF+3KRxGuFn+B40L4a4ALs+QoTPbh6WEYpUEP6TiUVTJS8AWIYtIo9DqNuAKQoZGIxkA7gLbbJEjty55ZkdN8RQhMHLAd4FnG3QQpwJXF91Yl4B0VAopX+Eqe4xwiSfFA78w8YYJ/NGfaiHwOzTW9SwD30c8grmSy1R3y6pJgJHsvZjAO54DNtGY+WIHfATTeSDKxRQixnB5hcFfNaHwPLAlNepGiAyTH/f9fwt4fwPaOpEtwDejOD7knRhe+2BVP655QiO37my
cp0wwQz+gPm9lnCWYLk8N92IyUAR4LMTtX6xiO68DL8SdHlcWhvCR6yAk6zYyg2KP4MsRfos5FyIGVVL7jJKy4CAei/YjvkoYIddLD7C83DPYI+9w7cVKCaIdInhDSc3QPhRG01Yx/vK2iJBB0ahypAPAl5H9wMvFw9lkN913UteU3ui6+1C7USa1lZAO/kIDDuxS0BkAQ7c+FK4EFxfANpPcG9qLhbC5FIM5hJ1D48zPQcJA8UEz5aM6Atl1z6nms/eQspKX7EfAF6nOVByLc8HOlInOjTfg2kpYnDbQCyTLzisCu5BKAOYjpHIEXEJjfP/XgH+S9C2JQUgzuLb2LO+GpJiMZu/HjALY94DPA9vq2Fw36HylvHM+Yvi2hypGQwMkCwIWgT3OqayjU1HKEELotYSxx4kJ96LbBHeb+SHJMZq9q66+a5g7Npq9n87s6jHg+8KGCUUSH6X6eEsbcIGPleboGMMQDFUSA6ajBOz3sQ9D0RD87JZxNrXb/yHCwPPOlHjaG+WRGm3+O2loklUuey84Sj4V/RhxM/CvBD++Gh8+ApZYRbj27KrxhwHkCQOe6ShQmTYVLtyAjdOpbRI/B/wMWIv4mzGf+5+i+fJwgwpLYAay3HLr7sWVY/+37k9/bVgW7LPAnQQvZohkYvQKmwfgVHEjUUyy2NN4LQP2dix+PsmTuGJCFPYnwDqg38RdBvvjDd8lP014uVpmJKV7NHsfG7mPU/7x+pGBtZt+0ruh7xlvWgpcRkhxXEKI9XcSznhPONtGCO7d0+FvQ2bBlUxOnsqYxDhqczoIk0CeyfPPVvlulJA+vgf4NfAEYgtoH1DKVVE0Ui2zUaLPyuwPeYpNtNPlfOiIdsEpiC6MFBALcqAhk41iVgKLQQaGKQLoldlXgNXT7G6rwV8AO1BlC2KpjKsJiVedlfUKBNEPEdzK/YRw8oB3FGVYLkE8/4QQoF4616+GGgRoOyVGOFLvKTO2JxN571MGkQSG814Wx92upKKncNvMd/axmMtVJXXz1ue+PfHPmOqcgVnhpCrUPhFpCdBkWgI0mZYATaYlQJNpCdBkWgI0mZYATaYlQJNpCdBkWgI0mZYATaYlQJNpCdBkWgI0mZYATaYlQJM5GQU4IaZZxzmRpiSNkJp4mONX1ovwrpY5N/V4PE4kAXKERK/vJVjv1WY3tkWLFi1atJiOOeeydWTXEOOUodRmch1m1qaQst4GlqkUjkUEF3rcjfaVT1xJaSwCBYOCpILM54ukCxHe8tnq67hmklkXYF52DRGogKUMSzvRZnAapoXAArDxJ63PJ6SU9xJqx3oIz5bIELy38TfVQUjGLfH26wNHCC7rMCEje4Dgvh4ChVxQ2X7BIW8UhEptqByDDc6yQDMqQFd2FViMorbIzHowWwCcLjjLQr3WUsIjKMfTx3sJzwPqmIG2GSFzepQgyiBBlL3AK4IXLZQfvYl0UNKwxYUYRYw2OCV9Ig09yM71fRgmZ64dWQ/oLIzzCQ/GOJeQln428B7CmTz+4stmUnn7B0VCYcerhCzpl4HnEDvBXsM07OXHhCx3R2OqY6j34HuyNyKQl6UN5mO2FOw8wrN+PkR4uN742X0iDfogmLPxq+Q1Qn3Ys6CXkF4RHHamkoEN1/F+nKoF6M6uwuFdmVSv4AzBRRaeKX0RwawsItw0G/0gpGYTE2oK9hEqQbcLnjHYbvBGivKQx/mRKs3VtAL0ZvtxYe8pk80z0zLC2b2CUO2ykBPzDK+X8StkP/B/hKqerZK9INNgBGUPDE1TT3ZcAbqzawguX3yaifMwPk6oeryQcONsp/n2e65ghLq0vcAO4OeIp2S8BNEhsHjkON7VpA7sCpUoDujFuADso8DvEp4Rt4D66mzfTYwRyp42Az8F/RzxHMH78qMTas7Ukb2BjNooU+7EWEowLb9PuJEuYfbfOnqyUSB4Vc8SHrvzNOKVFKlc0QqkIrn3lq28nNDpKwnvDOukZV4aRRu
hT5cBnwGex3iiTPmxSG5zykzfJtj1RZx8nstcQoRB5nKCE3OdmXakgKua3bJ3IRHBkVl8Ms4Jn1C0BGgyLQGaTEuAJtMSoMm0BGgyLQGaTEuAJtMSoMn8P4f/JnJ3AKQjAAAAJXRFWHRkYXRlOmNyZWF0ZQAyMDIzLTEwLTEwVDA5OjI0OjMzKzAwOjAwvlA1mgAAACV0RVh0ZGF0ZTptb2RpZnkAMjAyMy0xMC0xMFQwOToyNDozMyswMDowMM8NjSYAAAASdEVYdGV4aWY6RXhpZk9mZnNldAAyNlMbomUAAAAYdEVYdGV4aWY6UGl4ZWxYRGltZW5zaW9uADE1MDtFtCgAAAAYdEVYdGV4aWY6UGl4ZWxZRGltZW5zaW9uADE1MKZKVV4AAAAgdEVYdHNvZnR3YXJlAGh0dHBzOi8vaW1hZ2VtYWdpY2sub3JnvM8dnQAAABh0RVh0VGh1bWI6OkRvY3VtZW50OjpQYWdlcwAxp/+7LwAAABh0RVh0VGh1bWI6OkltYWdlOjpIZWlnaHQAMTkyQF1xVQAAABd0RVh0VGh1bWI6OkltYWdlOjpXaWR0aAAxOTLTrCEIAAAAGXRFWHRUaHVtYjo6TWltZXR5cGUAaW1hZ2UvcG5nP7JWTgAAABd0RVh0VGh1bWI6Ok1UaW1lADE2OTY5Mjk4NzOGDIqVAAAAD3RFWHRUaHVtYjo6U2l6ZQAwQkKUoj7sAAAAVnRFWHRUaHVtYjo6VVJJAGZpbGU6Ly8vbW50bG9nL2Zhdmljb25zLzIwMjMtMTAtMTAvZDcxMjFjNmMzNjk4NzZkNDBlMWNhMjI1ZWI4NzBmY2IuaWNvLnBuZ1NfQSgAAAAASUVORK5CYII= favicon.ico: | AAABAAEAEBAAAAEAIABoBAAAFgAAACgAAAAQAAAAIAAAAAEAIAAAAAAAAAQAAAAAAAAAAAAAAAAAAAAAAABigAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL3WKAC/pigAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv6YoAL3WKACyBigAtYYoALnWKAC9FigAvvYoAL/GKAC/9igAv/YoAL/2KAC/9igAv8YoAL72KAC9FigAudYoALWGKACyAAAAAAYoALAGKACwJigAsVYoALNGKAC1ZigAtxYoALfmKAC35igAtxYoALVmKACzRigAsVYoALAmKACwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABigAsAYoALBGKAC2BigAtlYoAKBmKACgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYoALAGKACwhigAu/YoALyWKACgxigAoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGKACwBigAsIYoALv2KAC8ligAoMYoAKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABhggoAYoALAGKACwRigAsKYoALC2KAC75igAvIYoALFGKACxNigAsGYoALAGGCCgAAAAAAAAAAAAAAAABigAsAYoALAGKACzhigAukYoALv2KAC0JigAu/YoALymKAC1digAvQYoALrmKAC0higAsDYoALAAAAAAAAAAAAYoALAGKACy5igAvXYoAL/2KAC+ZigAtJYoAL2WKAC99igAtUYoAL6mKAC/9igAvrYoALV2KACwBigAsAYoALAGKADAJigAuXYoAL/2KAC/9igAueYoALdmKAC/xigAv6YoALZGKAC7BigAv/YoAL/2KAC9RigAshYoALAGKACwBigAsdYoAL2mKAC/9igAv2YoALu2KAC+higAvoYoAL/2KAC8VigAt6YoAL9mKAC/9igAv/YoALb2KACwB
igAsAYoALQWKAC/ZigAv/YoAL/2KAC/9igAvGYoALcWKAC/digAv+YoAL2WKAC/BigAv/YoAL/2KAC7RigAsJYoALAGKAC1digAvyYoAL8GKAC91igAueYoALKmKACxFigAu5YoAL/2KAC/9igAv/YoAL/2KAC/9igAvcYoALImKACwBigAsUYoALOGKACzRigAsdYoALBWKACwBigAsAYoALLGKAC7ligAv6YoAL/2KAC/9igAv/YoAL72KACz0AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYoEKAGKACwBigAsVYoALa2KAC71igAvpYoAL+2KAC/RigAtNAAAAAAAAAAAAAAAAwAMAAPw/AAD8PwAA/D8AAPAPAADgAwAAwAMAAIABAACAAQAAgAAAAIAAAACDAAAA/4AAAA== logo.svg: | PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAyNCAyNCI+PHBhdGggZmlsbD0iIzBCODA2MiIgZD0iTTIsMjJWMjBDMiwyMCA3LDE4IDEyLDE4QzE3LDE4IDIyLDIwIDIyLDIwVjIySDJNMTEuMyw5LjFDMTAuMSw1LjIgNCw2LjEgNCw2LjFDNCw2LjEgNC4yLDEzLjkgOS45LDEyLjdDOS41LDkuOCA4LDkgOCw5QzEwLjgsOSAxMSwxMi40IDExLDEyLjRWMTdDMTEuMywxNyAxMS43LDE3IDEyLDE3QzEyLjMsMTcgMTIuNywxNyAxMywxN1YxMi44QzEzLDEyLjggMTMsOC45IDE2LDcuOUMxNiw3LjkgMTQsMTAuOSAxNCwxMi45QzIxLDEzLjYgMjEsNCAyMSw0QzIxLDQgMTIuMSwzIDExLjMsOS4xWiIgLz48L3N2Zz4= Login Screen In this example, the login screen now displays the custom logo in a different size. The product title is also shown, and the OIDC tabstrip title and text have been changed to a custom-specific one. Product-related links are displayed below the login button. The footer contains a copyright notice for the custom company.\nTeaser in Main Navigation The template approach is also used in this case to change the font-size and line-height of the product title and slogan. 
The product version (superscript) is omitted.\nAbout Dialog Changing the productLogoUrl and the productName automatically affects the appearance of the About Dialog and the document title.\n","categories":"","description":"","excerpt":"Theming and Branding Motivation Gardener landscape administrators …","ref":"/docs/dashboard/customization/","tags":"","title":"Customization"},{"body":"Documentation Index Overview Gardener Landing Page gardener.cloud Usage Working with Projects Project Operations Automating Project Resource Management Use the Webterminal Terminal Shortcuts Connect kubectl Custom Shoot Fields Operations Configure Access Restrictions Theming and Branding Webterminals Development Dashboard Architecture Setting Up a Local Development Environment Testing Hotfixes ","categories":"","description":"","excerpt":"Documentation Index Overview Gardener Landing Page gardener.cloud …","ref":"/docs/dashboard/readme/","tags":"","title":"Dashboard"},{"body":"Data Disk Restore From Image Table of Contents Summary Motivation Goals Non-Goals Proposal Alternatives Summary Currently, we have no support either in the shoot spec or in the MCM GCP Provider for restoring GCP Data Disks from images.\nMotivation The primary motivation is to support Integration of vSMP MemoryOne in Azure. We implemented support for this in AWS via Support for data volume snapshot ID. In GCP we have the option to restore a data disk from a custom image, which is more convenient and flexible.\nGoals Extend the GCP provider specific WorkerConfig section in the shoot YAML and support provider configuration for data-disks so that a data disk can be created from an image by supplying the image name. Proposal Shoot Specification At this time, there is no support for provider specific configuration of data disks in a GCP shoot spec. 
The following shows an example configuration at the time of this proposal:\nproviderConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: interface: NVME encryption: # optional, skipped detail here serviceAccount: email: foo@bar.com scopes: - https://www.googleapis.com/auth/cloud-platform gpu: acceleratorType: nvidia-tesla-t4 count: 1 We propose that the worker config section be enhanced to support data disk configuration:\nproviderConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: interface: NVME encryption: # optional, skipped detail here dataVolumes: # \u003c-- NEW SUB_SECTION - name: vsmp1 image: imgName serviceAccount: email: foo@bar.com scopes: - https://www.googleapis.com/auth/cloud-platform gpu: acceleratorType: nvidia-tesla-t4 count: 1 In the above, imgName specified in providerConfig.dataVolumes.image represents the name of an image previously created by a tool or process. See Google Cloud Create Image.\nThe MCM GCP Provider will ensure that, when a VM instance is instantiated, the data disk(s) for the VM are created with the source image set to the provided imgName. The mechanics of this are left to the MCM GCP provider. See the image parameter of the --create-disk flag in Google Cloud Instance Creation.\n","categories":"","description":"","excerpt":"Data Disk Restore From Image Table of Contents Summary Motivation …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/proposals/datadisk-image-restore/","tags":"","title":"Data Disk Restore From Image"},{"body":"Default Seccomp Profile and Configuration This is a short guide describing how to enable the defaulting of seccomp profiles for Gardener managed workloads in the seed. Running pods in Unconfined (seccomp disabled) mode is undesirable since this is the least restrictive profile. Also, mind that any privileged container will always run as Unconfined. 
More information about seccomp can be found in this Kubernetes tutorial.\nSetting the Seccomp Profile to RuntimeDefault for Seed Clusters To address the above issue, Gardener provides a webhook that is capable of mutating pods in the seed clusters, explicitly providing them with a seccomp profile type of RuntimeDefault. This profile is defined by the container runtime and represents a set of default syscalls that are allowed or not.\nspec: securityContext: seccompProfile: type: RuntimeDefault A Pod is mutated when all of the following preconditions are fulfilled:\n The Pod is created in a Gardener managed namespace. The Pod is NOT labeled with seccompprofile.resources.gardener.cloud/skip. The Pod does NOT explicitly specify .spec.securityContext.seccompProfile.type. How to Configure To enable this feature, the gardenlet DefaultSeccompProfile feature gate must be set to true.\nfeatureGates: DefaultSeccompProfile: true Please refer to the examples in this yaml file for more information.\nOnce the feature gate is enabled, the webhook will be registered and configured for the seed cluster. Newly created pods will be mutated to have their seccomp profile set to RuntimeDefault.\n Note: Please note that this feature is still in Alpha, so you might see instabilities every now and then.\n Setting the Seccomp Profile to RuntimeDefault for Shoot Clusters You can enable the use of RuntimeDefault as the default seccomp profile for all workloads. If enabled, the kubelet will use the RuntimeDefault seccomp profile by default, which is defined by the container runtime, instead of using the Unconfined mode. 
More information for this feature can be found in the Kubernetes documentation.\nTo use seccomp profile defaulting, you must run the kubelet with the SeccompDefault feature gate enabled (this is the default).\nHow to Configure To enable this feature, the kubelet seccompDefault configuration parameter must be set to true in the shoot’s spec.\nspec: kubernetes: version: 1.25.0 kubelet: seccompDefault: true Please refer to the examples in this yaml file for more information.\n","categories":"","description":"Enable the use of `RuntimeDefault` as the default seccomp profile through `spec.kubernetes.kubelet.seccompDefault`","excerpt":"Enable the use of `RuntimeDefault` as the default seccomp profile …","ref":"/docs/gardener/default_seccomp_profile/","tags":"","title":"Default Seccomp Profile"},{"body":"Defaulting Strategy and Developer Guidelines This document walks you through:\n Conventions to be followed when writing defaulting functions How to write a test for a defaulting function The document is aimed towards developers who want to contribute code and need to write defaulting code and unit tests covering the defaulting functions, as well as maintainers and reviewers who review code. It serves as a common guide that we commit to follow in our project to ensure consistency in our defaulting code, good coverage for high confidence, and good maintainability.\nWriting defaulting code Every kubernetes type should have a dedicated defaults_*.go file. For instance, if you have a Shoot type, there should be a corresponding defaults_shoot.go file containing all defaulting logic for that type. If there is only one type under an api group then we can just have types.go and a corresponding defaults.go. For instance, resourcemanager api has only one types.go, hence in this case only defaults.go file would suffice. Aim to segregate each struct type into its own SetDefaults_* function. 
These functions encapsulate the defaulting logic specific to the corresponding struct type, enhancing modularity and maintainability. For example, the ServerConfiguration struct in the resourcemanager API has a corresponding SetDefaults_ServerConfiguration() function. ⚠️ Make sure to run the make generate WHAT=codegen command whenever a new SetDefaults_* function is added; it regenerates the zz_generated.defaults.go file containing the overall defaulting function.\nWriting unit tests for defaulting code Each test case should validate the overall defaulting function SetObjectDefaults_* generated by defaulter-gen and not a specific SetDefaults_*. This way, we also test whether zz_generated.defaults.go was generated correctly. For example, the spec.machineImages[].updateStrategy field in the CloudProfile is defaulted as follows: https://github.com/gardener/gardener/blob/ff5a5be6049777b0695659a50189e461e1b17796/pkg/apis/core/v1beta1/defaults_cloudprofile.go#L23-L29 The defaulting should be tested with the overall defaulting function SetObjectDefaults_CloudProfile (and not with SetDefaults_MachineImage): https://github.com/gardener/gardener/blob/ff5a5be6049777b0695659a50189e461e1b17796/pkg/apis/core/v1beta1/defaults_cloudprofile_test.go#L40-L47\n Test each defaulting function carefully to ensure:\n Proper defaulting behaviour when fields are empty or nil. Note that some fields may be optional and should not be defaulted.\n Preservation of existing values, ensuring that defaulting does not accidentally overwrite them.\nFor example, when the spec.secretRef.namespace field of a SecretBinding is nil, it should be defaulted to the namespace of the SecretBinding object. However, the spec.secretRef.namespace field should not be overwritten by the defaulting logic if it is already set. 
https://github.com/gardener/gardener/blob/ff5a5be6049777b0695659a50189e461e1b17796/pkg/apis/core/v1beta1/defaults_secretbinding_test.go#L26-L54\n ","categories":"","description":"","excerpt":"Defaulting Strategy and Developer Guidelines This document walks you …","ref":"/docs/gardener/defaulting/","tags":"","title":"Defaulting"},{"body":"DEP-NN: Your short, descriptive title Table of Contents Summary Motivation Goals Non-Goals Proposal Alternatives Summary Motivation Goals Non-Goals Proposal Alternatives ","categories":"","description":"","excerpt":"DEP-NN: Your short, descriptive title Table of Contents Summary …","ref":"/docs/other-components/etcd-druid/proposals/00-template/","tags":"","title":"DEP Title"},{"body":"Testing We follow the BDD-style testing principles and are leveraging the Ginkgo framework along with Gomega as matcher library. In order to execute the existing tests, you can use\nmake test # runs tests make verify # runs static code checks and tests There is an additional command for analyzing the code coverage of the tests. Ginkgo will generate standard Golang cover profiles which will be translated into an HTML file by the Go Cover Tool. Another command helps you to remove the temporary cover profile files and the HTML report from the filesystem:\nmake test-cov open gardener.coverage.html make test-cov-clean sigs.k8s.io/controller-runtime env test Some of the integration tests in Gardener are using the sigs.k8s.io/controller-runtime/pkg/envtest package. It sets up a temporary control plane (etcd + kube-apiserver) against which the integration tests can run. The test and test-cov rules in the Makefile prepare this env test automatically by downloading the respective binaries (if not yet present) and setting the necessary environment variables.\nYou can also run go test or ginkgo without the test/test-cov rules. 
In this case, you either have to set the KUBEBUILDER_ASSETS environment variable to the path that contains the etcd + kube-apiserver binaries, or you need to have the binaries pre-installed under /usr/local/kubebuilder/bin.\nDependency Management We are using go modules for dependency management. In order to add a new package dependency to the project, you can perform go get \u003cPACKAGE\u003e@\u003cVERSION\u003e or edit the go.mod file and append the package along with the version you want to use.\nUpdating Dependencies The Makefile contains a rule called revendor which performs go mod vendor and go mod tidy. go mod vendor resets the main module’s vendor directory to include all packages needed to build and test all the main module’s packages. It does not include test code for vendored packages. go mod tidy makes sure go.mod matches the source code in the module. It adds any missing modules necessary to build the current module’s packages and dependencies, and it removes unused modules that don’t provide any relevant packages.\nmake revendor The dependencies are installed into the vendor folder, which should be added to the VCS.\nWarning Make sure that you test the code after you have updated the dependencies! ","categories":"","description":"","excerpt":"Testing We follow the BDD-style testing principles and are leveraging …","ref":"/docs/contribute/code/dependencies/","tags":"","title":"Dependencies"},{"body":"Dependency Management We are using go modules for dependency management. In order to add a new package dependency to the project, you can perform go get \u003cPACKAGE\u003e@\u003cVERSION\u003e or edit the go.mod file and append the package along with the version you want to use.\nUpdating Dependencies The Makefile contains a rule called tidy which performs go mod tidy:\n go mod tidy makes sure go.mod matches the source code in the module. 
It adds any missing modules necessary to build the current module’s packages and dependencies, and it removes unused modules that don’t provide any relevant packages. make tidy ⚠️ Make sure that you test the code after you have updated the dependencies!\nExported Packages This repository contains several packages that could be considered “exported packages”, in a sense that they are supposed to be reused in other Go projects. For example:\n Gardener’s API packages: pkg/apis Library for building Gardener extensions: extensions Gardener’s Test Framework: test/framework There are a few more folders in this repository (non-Go sources) that are reused across projects in the Gardener organization:\n GitHub templates: .github Concourse / cc-utils related helpers: hack/.ci Development, build and testing helpers: hack These packages feature a dummy doc.go file to allow other Go projects to pull them in as go mod dependencies.\nThese packages are explicitly not supposed to be used in other projects (consider them as “non-exported”):\n API validation packages: pkg/apis/*/*/validation Operation package (main Gardener business logic regarding Seed and Shoot clusters): pkg/gardenlet/operation Third party code: third_party Currently, we don’t have a mechanism yet for selectively syncing out these exported packages into dedicated repositories like kube’s staging mechanism (publishing-bot).\nImport Restrictions We want to make sure that other projects can depend on this repository’s “exported” packages without pulling in the entire repository (including “non-exported” packages) or a high number of other unwanted dependencies. 
Hence, we have to be careful when adding new imports or references between our packages.\n ℹ️ General rule of thumb: the mentioned “exported” packages should be as self-contained as possible and depend on as few other packages in the repository and other projects as possible.\n In order to support that rule and automatically check compliance with that goal, we leverage import-boss. The tool checks all imports of the given packages (including transitive imports) against rules defined in .import-restrictions files in each directory. An import is allowed if it matches at least one allowed prefix and does not match any forbidden prefixes.\n Note: '' (the empty string) is a prefix of everything. For more details, see the import-boss topic.\n import-boss is executed on every pull request and blocks the PR if it doesn’t comply with the defined import restrictions. You can also run it locally using make check.\nImport restrictions should be changed in the following situations:\n We spot a new pattern of imports across our packages that was not restricted before but makes it more difficult for other projects to depend on our “exported” packages. In that case, the imports should be further restricted to disallow such problematic imports, and the code/package structure should be reworked to comply with the newly given restrictions. We want to share code between packages, but existing import restrictions prevent us from doing so. In that case, please consider what additional dependencies it will pull in, when loosening existing restrictions. Also consider possible alternatives, like code restructurings or extracting shared code into dedicated packages for minimal impact on dependent projects. 
","categories":"","description":"","excerpt":"Dependency Management We are using go modules for dependency …","ref":"/docs/gardener/dependencies/","tags":"","title":"Dependencies"},{"body":"Documentation Index Concepts Prober Weeder Development Contributions Testing Setup Dependency Watchdog using local Garden cluster Deployment Configure dependency watchdog ","categories":"","description":"","excerpt":"Documentation Index Concepts Prober Weeder Development …","ref":"/docs/other-components/dependency-watchdog/readme/","tags":"","title":"Dependency Watchdog"},{"body":"Deploying Gardenlets Gardenlets act as decentralized agents to manage the shoot clusters of a seed cluster.\nProcedure After you have deployed the Gardener control plane, you need one or more seed clusters in order to be able to create shoot clusters.\nYou can either register an existing cluster as “seed” (this could also be the cluster in which the control plane runs), or you can create new clusters (typically shoots, i.e., this approach registers at least one first initial seed) and then register them as “seeds”.\nThe following sections describe the scenarios.\nRegister A First Seed Cluster If you have not registered a seed cluster yet (thus, you need to deploy a first, so-called “unmanaged seed”), your approach depends on how you deployed the Gardener control plane.\nGardener Control Plane Deployed Via gardener/controlplane Helm chart You can follow Deploy a gardenlet Manually.\nGardener Control Plane Deployed Via gardener-operator If you want to register the same cluster in which gardener-operator runs, or if you want to register another cluster that is reachable (network-wise) for gardener-operator, you can follow Deploy gardenlet via gardener-operator. If you want to register a cluster that is not reachable (network-wise) (e.g., because it runs behind a firewall), you can follow Deploy a gardenlet Manually. 
Register Further Seed Clusters If you already have a seed cluster, and you want to deploy further seed clusters (so-called “managed seeds”), you can follow Deploy a gardenlet Automatically.\n","categories":"","description":"","excerpt":"Deploying Gardenlets Gardenlets act as decentralized agents to manage …","ref":"/docs/gardener/deployment/deploy_gardenlet/","tags":"","title":"Deploy Gardenlet"},{"body":"Deploy a gardenlet Automatically The gardenlet can automatically deploy itself into shoot clusters, and register them as seed clusters. These clusters are called “managed seeds” (aka “shooted seeds”). This procedure is the preferred way to add additional seed clusters, because shoot clusters already come with production-grade qualities that are also demanded for seed clusters.\nPrerequisites The only prerequisite is to register an initial cluster as a seed cluster that already has a manually deployed gardenlet (for a step-by-step manual installation guide, see Deploy a Gardenlet Manually).\n [!TIP] The initial seed cluster can be the garden cluster itself, but for better separation of concerns, it is recommended to only register other clusters as seeds.\n Auto-Deployment of Gardenlets into Shoot Clusters For better scalability of your Gardener landscape (e.g., when the total number of Shoots grows), you usually need more seed clusters, which you can create as follows:\n Use the initial seed cluster (“unmanaged seed”) to create shoot clusters that you later register as seed clusters. The gardenlet deployed in the initial cluster can deploy itself into the shoot clusters (which eventually gets them registered as seeds) if ManagedSeed resources are created. The advantage of this approach is that only one initial gardenlet installation is required. 
Every other managed seed cluster gets an automatically deployed gardenlet.\nRelated Links ManagedSeeds: Register Shoot as Seed ","categories":"","description":"","excerpt":"Deploy a gardenlet Automatically The gardenlet can automatically …","ref":"/docs/gardener/deployment/deploy_gardenlet_automatically/","tags":"","title":"Deploy Gardenlet Automatically"},{"body":"Deploy a gardenlet Manually Manually deploying a gardenlet is usually only required if the Kubernetes cluster to be registered as a seed cluster is managed via third-party tooling (i.e., the Kubernetes cluster is not a shoot cluster, so Deploy a gardenlet Automatically cannot be used). In this case, gardenlet needs to be deployed manually, meaning that its Helm chart must be installed.\n [!TIP] Once you’ve deployed a gardenlet manually, you can deploy new gardenlets automatically. The manually deployed gardenlet is then used as a template for the new gardenlets. For more information, see Deploy a gardenlet Automatically.\n Prerequisites Kubernetes Cluster that Should Be Registered as a Seed Cluster Verify that the cluster has a supported Kubernetes version.\n Determine the nodes, pods, and services CIDR of the cluster. You need to configure this information in the Seed configuration. Gardener uses this information to check that the shoot cluster isn’t created with overlapping CIDR ranges.\n Every seed cluster needs an Ingress controller which distributes external requests to internal components like Plutono and Prometheus. 
For this, configure the following lines in your Seed resource:\nspec: dns: provider: type: aws-route53 secretRef: name: ingress-secret namespace: garden ingress: domain: ingress.my-seed.example.com controller: kind: nginx providerConfig: \u003csome-optional-provider-specific-config-for-the-ingressController\u003e Procedure Overview Prepare the garden cluster: Create a bootstrap token secret in the kube-system namespace of the garden cluster Create RBAC roles for the gardenlet to allow bootstrapping in the garden cluster Prepare the gardenlet Helm chart. Automatically register shoot cluster as a seed cluster. Deploy the gardenlet Check that the gardenlet is successfully deployed Create a Bootstrap Token Secret in the kube-system Namespace of the Garden Cluster The gardenlet needs to talk to the Gardener API server residing in the garden cluster.\nUse gardenlet’s ability to request a signed certificate for the garden cluster by leveraging Kubernetes Certificate Signing Requests. The gardenlet performs a TLS bootstrapping process that is similar to the Kubelet TLS Bootstrapping. Make sure that the API server of the garden cluster has bootstrap token authentication enabled.\nThe client credentials required for the gardenlet’s TLS bootstrapping process need to be either token or certificate (OIDC isn’t supported) and have permissions to create a Certificate Signing Request (CSR). It’s recommended to use bootstrap tokens due to their desirable security properties (such as a limited token lifetime).\nTherefore, first create a bootstrap token secret for the garden cluster:\napiVersion: v1 kind: Secret metadata: # Name MUST be of form \"bootstrap-token-\u003ctoken id\u003e\" name: bootstrap-token-07401b namespace: kube-system # Type MUST be 'bootstrap.kubernetes.io/token' type: bootstrap.kubernetes.io/token stringData: # Human readable description. Optional. description: \"Token to be used by the gardenlet for Seed `sweet-seed`.\" # Token ID and secret. Required. 
token-id: 07401b # 6 characters token-secret: f395accd246ae52d # 16 characters # Expiration. Optional. # expiration: 2017-03-10T03:22:11Z # Allowed usages. usage-bootstrap-authentication: \"true\" usage-bootstrap-signing: \"true\" When you later prepare the gardenlet Helm chart, a kubeconfig based on this token is shared with the gardenlet upon deployment.\nPrepare the gardenlet Helm Chart This section only describes the minimal configuration, using the global configuration values of the gardenlet Helm chart. For an overview over all values, see the configuration values. We refer to the global configuration values as gardenlet configuration in the following procedure.\n Create a gardenlet configuration gardenlet-values.yaml based on this template.\n Create a bootstrap kubeconfig based on the bootstrap token created in the garden cluster.\nReplace the \u003cbootstrap-token\u003e with token-id.token-secret (from our previous example: 07401b.f395accd246ae52d) from the bootstrap token secret.\napiVersion: v1 kind: Config current-context: gardenlet-bootstrap@default clusters: - cluster: certificate-authority-data: \u003cca-of-garden-cluster\u003e server: https://\u003cendpoint-of-garden-cluster\u003e name: default contexts: - context: cluster: default user: gardenlet-bootstrap name: gardenlet-bootstrap@default users: - name: gardenlet-bootstrap user: token: \u003cbootstrap-token\u003e In the gardenClientConnection.bootstrapKubeconfig section of your gardenlet configuration, provide the bootstrap kubeconfig together with a name and namespace to the gardenlet Helm chart.\ngardenClientConnection: bootstrapKubeconfig: name: gardenlet-kubeconfig-bootstrap namespace: garden kubeconfig: | \u003cbootstrap-kubeconfig\u003e # will be base64 encoded by helm The bootstrap kubeconfig is stored in the specified secret.\n In the gardenClientConnection.kubeconfigSecret section of your gardenlet configuration, define a name and a namespace where the gardenlet stores the real kubeconfig 
that it creates during the bootstrap process. If the secret doesn’t exist, the gardenlet creates it for you.\ngardenClientConnection: kubeconfigSecret: name: gardenlet-kubeconfig namespace: garden Updating the Garden Cluster CA The kubeconfig created by the gardenlet in step 4 will not be recreated as long as it exists, even if a new bootstrap kubeconfig is provided. To enable rotation of the garden cluster CA certificate, a new bundle can be provided via the gardenClientConnection.gardenClusterCACert field. If the provided bundle differs from the one currently in the gardenlet’s kubeconfig secret then it will be updated. To remove the CA completely (e.g. when switching to a publicly trusted endpoint), this field can be set to either none or null.\nPrepare Seed Specification When gardenlet starts, it tries to register a Seed resource in the garden cluster based on the specification provided in seedConfig in its configuration.\n This procedure doesn’t describe all the possible configurations for the Seed resource. For more information, see:\n Example Seed resource Configurable Seed settings Supply the Seed resource in the seedConfig section of your gardenlet configuration gardenlet-values.yaml.\n Add the seedConfig to your gardenlet configuration gardenlet-values.yaml. The field seedConfig.spec.provider.type specifies the infrastructure provider type (for example, aws) of the seed cluster. For all supported infrastructure providers, see Known Extension Implementations.\n# ... 
seedConfig: metadata: name: sweet-seed labels: environment: evaluation annotations: custom.gardener.cloud/option: special spec: dns: provider: type: \u003cprovider\u003e secretRef: name: ingress-secret namespace: garden ingress: # see prerequisites domain: ingress.dev.my-seed.example.com controller: kind: nginx networks: # see prerequisites nodes: 10.240.0.0/16 pods: 100.244.0.0/16 services: 100.32.0.0/13 shootDefaults: # optional: non-overlapping default CIDRs for shoot clusters of that Seed pods: 100.96.0.0/11 services: 100.64.0.0/13 provider: region: eu-west-1 type: \u003cprovider\u003e Apart from the seed’s name, seedConfig.metadata can optionally contain labels and annotations. gardenlet will set the labels of the registered Seed object to the labels given in the seedConfig plus gardener.cloud/role=seed. Any custom labels on the Seed object will be removed on the next restart of gardenlet. If a label is removed from the seedConfig it is removed from the Seed object as well. In contrast to labels, annotations in the seedConfig are added to existing annotations on the Seed object. Thus, custom annotations that are added to the Seed object during runtime are not removed by gardenlet on restarts. Furthermore, if an annotation is removed from the seedConfig, gardenlet does not remove it from the Seed object.\nOptional: Enable HA Mode You may consider running gardenlet with multiple replicas, especially if the seed cluster is configured to host HA shoot control planes. Therefore, the following Helm chart values define the degree of high availability you want to achieve for the gardenlet deployment.\nreplicaCount: 2 # or more if a higher failure tolerance is required. failureToleranceType: zone # One of `zone` or `node` - defines how replicas are spread. 
Optional: Enable Backup and Restore The seed cluster can be set up with backup and restore for the main etcds of shoot clusters.\nGardener uses etcd-backup-restore, which integrates with different storage providers, to store the shoot cluster’s main etcd backups. Make sure to obtain client credentials that have sufficient permissions with the chosen storage provider.\nCreate a secret in the garden cluster with client credentials for the storage provider. The format of the secret is cloud provider specific and can be found in the repository of the respective Gardener extension. For example, the secret for AWS S3 can be found in the AWS provider extension (30-etcd-backup-secret.yaml).\napiVersion: v1 kind: Secret metadata: name: sweet-seed-backup namespace: garden type: Opaque data: # client credentials format is provider specific Configure the Seed resource in the seedConfig section of your gardenlet configuration to use backup and restore:\n# ... seedConfig: metadata: name: sweet-seed spec: backup: provider: \u003cprovider\u003e secretRef: name: sweet-seed-backup namespace: garden Optional: Enable Self-Upgrades To spare you the recurring task of redeploying gardenlet’s Helm chart whenever you want to upgrade its version, gardenlet supports self-upgrades. It does so by pulling information (its configuration and deployment values) from a seedmanagement.gardener.cloud/v1alpha1.Gardenlet resource in the garden cluster. This resource must be in the garden namespace and must have the same name as the Seed the gardenlet is responsible for. For more information, see this section.\nIn order to make gardenlet automatically create a corresponding seedmanagement.gardener.cloud/v1alpha1.Gardenlet resource, you must provide\nselfUpgrade: deployment: helm: ociRepository: ref: \u003curl-to-oci-repository-containing-gardenlet-helm-chart\u003e in your gardenlet-values.yaml file. 
Please replace the ref placeholder with the URL to the OCI repository containing the gardenlet Helm chart you are installing.\n [!NOTE]\nIf you don’t configure this selfUpgrade section in the initial deployment, you can also do it later, or you directly create the corresponding seedmanagement.gardener.cloud/v1alpha1.Gardenlet resource in the garden cluster.\n Deploy the gardenlet The gardenlet-values.yaml looks something like this (with backup for shoot clusters enabled):\n# \u003cdefault config\u003e # ... config: gardenClientConnection: # ... bootstrapKubeconfig: name: gardenlet-bootstrap-kubeconfig namespace: garden kubeconfig: |apiVersion: v1 clusters: - cluster: certificate-authority-data: \u003cdummy\u003e server: \u003cmy-garden-cluster-endpoint\u003e name: my-kubernetes-cluster # ... kubeconfigSecret: name: gardenlet-kubeconfig namespace: garden # ... # \u003cdefault config\u003e # ... seedConfig: metadata: name: sweet-seed spec: dns: provider: type: \u003cprovider\u003e secretRef: name: ingress-secret namespace: garden ingress: # see prerequisites domain: ingress.dev.my-seed.example.com controller: kind: nginx networks: nodes: 10.240.0.0/16 pods: 100.244.0.0/16 services: 100.32.0.0/13 shootDefaults: pods: 100.96.0.0/11 services: 100.64.0.0/13 provider: region: eu-west-1 type: \u003cprovider\u003e backup: provider: \u003cprovider\u003e secretRef: name: sweet-seed-backup namespace: garden Deploy the gardenlet Helm chart to the Kubernetes cluster:\nhelm install gardenlet charts/gardener/gardenlet \\ --namespace garden \\ -f gardenlet-values.yaml \\ --wait This Helm chart creates:\n A service account gardenlet that the gardenlet can use to talk to the Seed API server. RBAC roles for the service account (full admin rights at the moment). The secret (garden/gardenlet-bootstrap-kubeconfig) containing the bootstrap kubeconfig. The gardenlet deployment in the garden namespace. 
Check that the gardenlet Is Successfully Deployed Check that the gardenlet’s certificate bootstrap was successful.\nCheck if the secret gardenlet-kubeconfig in the namespace garden in the seed cluster has been created and contains a kubeconfig with a valid certificate.\n Get the kubeconfig from the created secret.\n$ kubectl -n garden get secret gardenlet-kubeconfig -o json | jq -r .data.kubeconfig | base64 -d Test against the garden cluster and verify it’s working.\n Extract the client-certificate-data from the user gardenlet.\n View the certificate:\n$ openssl x509 -in ./gardenlet-cert -noout -text Check that the bootstrap secret gardenlet-bootstrap-kubeconfig has been deleted from the seed cluster in namespace garden.\n Check that the seed cluster is registered and READY in the garden cluster.\nCheck that the seed cluster sweet-seed exists and all conditions indicate that it’s available. If so, the gardenlet is sending regular heartbeats and the seed bootstrapping was successful.\nCheck that the conditions on the Seed resource look similar to the following:\n$ kubectl get seed sweet-seed -o json | jq .status.conditions [ { \"lastTransitionTime\": \"2020-07-17T09:17:29Z\", \"lastUpdateTime\": \"2020-07-17T09:17:29Z\", \"message\": \"Gardenlet is posting ready status.\", \"reason\": \"GardenletReady\", \"status\": \"True\", \"type\": \"GardenletReady\" }, { \"lastTransitionTime\": \"2020-07-17T09:17:49Z\", \"lastUpdateTime\": \"2020-07-17T09:53:17Z\", \"message\": \"Backup Buckets are available.\", \"reason\": \"BackupBucketsAvailable\", \"status\": \"True\", \"type\": \"BackupBucketsReady\" } ] Self Upgrades In order to keep your gardenlets in such “unmanaged seeds” up-to-date (i.e., in seeds that are not shoot clusters), the gardenlet Helm chart must be regularly deployed. This requires network connectivity to such clusters, which can be challenging if they reside behind a firewall or in restricted environments. 
It would be much simpler if gardenlet could keep itself up-to-date based on configuration read from the garden cluster. This approach greatly reduces operational complexity.\ngardenlet runs a controller which watches for seedmanagement.gardener.cloud/v1alpha1.Gardenlet resources in the garden cluster in the garden namespace having the same name as the Seed the gardenlet is responsible for. Such resources contain its component configuration and deployment values. Most notably, a URL to an OCI repository containing gardenlet’s Helm chart is included.\nAn example Gardenlet resource looks like this:\napiVersion: seedmanagement.gardener.cloud/v1alpha1 kind: Gardenlet metadata: name: local namespace: garden spec: deployment: replicaCount: 1 revisionHistoryLimit: 2 helm: ociRepository: ref: \u003curl-to-gardenlet-chart-repository\u003e:v1.97.0 config: apiVersion: gardenlet.config.gardener.cloud/v1alpha1 kind: GardenletConfiguration gardenClientConnection: kubeconfigSecret: name: gardenlet-kubeconfig namespace: garden controllers: shoot: reconcileInMaintenanceOnly: true respectSyncPeriodOverwrite: true shootState: concurrentSyncs: 0 featureGates: DefaultSeccompProfile: true HVPA: true HVPAForShootedSeed: true IPv6SingleStack: true ShootManagedIssuer: true etcdConfig: featureGates: UseEtcdWrapper: true logging: enabled: true vali: enabled: true shootNodeLogging: shootPurposes: - infrastructure - production - development - evaluation seedConfig: apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: labels: base: kind spec: backup: provider: local region: local secretRef: name: backup-local namespace: garden dns: provider: secretRef: name: internal-domain-internal-local-gardener-cloud namespace: garden type: local ingress: controller: kind: nginx domain: ingress.local.seed.local.gardener.cloud networks: nodes: 172.18.0.0/16 pods: 10.1.0.0/16 services: 10.2.0.0/16 shootDefaults: pods: 10.3.0.0/16 services: 10.4.0.0/16 provider: region: local type: local zones: - \"0\" 
settings: excessCapacityReservation: enabled: false scheduling: visible: true verticalPodAutoscaler: enabled: true On reconciliation, gardenlet downloads the Helm chart, renders it with the provided values, and then applies it to its own cluster. Hence, in order to keep a gardenlet up-to-date, it is enough to update the tag/digest of the OCI repository ref for the Helm chart:\nspec: deployment: helm: ociRepository: ref: \u003curl-to-gardenlet-chart-repository\u003e:v1.97.0 This way, network connectivity to the cluster in which gardenlet runs is not required at all (at least for deployment purposes).\nWhen you delete this resource, nothing happens: gardenlet remains running with the configuration as before. However, self-upgrades are obviously not possible anymore. In order to upgrade it, you have to either recreate the Gardenlet object, or redeploy the Helm chart.\nRelated Links Issue #1724: Harden Gardenlet RBAC privileges. Backup and Restore. ","categories":"","description":"","excerpt":"Deploy a gardenlet Manually Manually deploying a gardenlet is usually …","ref":"/docs/gardener/deployment/deploy_gardenlet_manually/","tags":"","title":"Deploy Gardenlet Manually"},{"body":"Deploy a gardenlet Via gardener-operator The gardenlet can automatically be deployed by gardener-operator into existing Kubernetes clusters in order to register them as seeds.\nPrerequisites Using this method only works when gardener-operator is managing the garden cluster. If you have used the gardener/controlplane Helm chart for the deployment of the Gardener control plane, please refer to this document.\n [!TIP] The initial seed cluster can be the garden cluster itself, but for better separation of concerns, it is recommended to only register other clusters as seeds.\n Deployment of gardenlets Using this method, gardener-operator is only taking care of the very first deployment of gardenlet. Once running, the gardenlet leverages the self upgrade strategy in order to keep itself up-to-date. 
Concretely, gardener-operator only acts when there is no respective Seed resource yet.\nIn order to request a gardenlet deployment, create the following resource in the (virtual) garden cluster:\napiVersion: seedmanagement.gardener.cloud/v1alpha1 kind: Gardenlet metadata: name: local namespace: garden spec: deployment: replicaCount: 1 revisionHistoryLimit: 2 helm: ociRepository: ref: \u003curl-to-gardenlet-chart-repository\u003e:v1.97.0 config: apiVersion: gardenlet.config.gardener.cloud/v1alpha1 kind: GardenletConfiguration controllers: shoot: reconcileInMaintenanceOnly: true respectSyncPeriodOverwrite: true shootState: concurrentSyncs: 0 featureGates: ShootManagedIssuer: true etcdConfig: featureGates: UseEtcdWrapper: true logging: enabled: true vali: enabled: true shootNodeLogging: shootPurposes: - infrastructure - production - development - evaluation seedConfig: apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: labels: base: kind spec: backup: provider: local region: local secretRef: name: backup-local namespace: garden dns: provider: secretRef: name: internal-domain-internal-local-gardener-cloud namespace: garden type: local ingress: controller: kind: nginx domain: ingress.local.seed.local.gardener.cloud networks: nodes: 172.18.0.0/16 pods: 10.1.0.0/16 services: 10.2.0.0/16 shootDefaults: pods: 10.3.0.0/16 services: 10.4.0.0/16 provider: region: local type: local zones: - \"0\" settings: excessCapacityReservation: enabled: false scheduling: visible: true verticalPodAutoscaler: enabled: true This causes gardener-operator to deploy gardenlet to the same cluster where it is running. 
Once it comes up, gardenlet creates a Seed resource with the same name and uses the Gardenlet resource for self-upgrades (see this document).\nRemote Clusters If you want gardener-operator to deploy gardenlet into some other cluster, create a kubeconfig Secret and reference it in the Gardenlet resource:\napiVersion: v1 kind: Secret metadata: name: remote-cluster-kubeconfig namespace: garden type: Opaque data: kubeconfig: base64(kubeconfig-to-remote-cluster) --- apiVersion: seedmanagement.gardener.cloud/v1alpha1 kind: Gardenlet metadata: name: local namespace: garden spec: kubeconfigSecretRef: name: remote-cluster-kubeconfig # ... After successful deployment of gardenlet, gardener-operator will delete the remote-cluster-kubeconfig Secret and set .spec.kubeconfigSecretRef to nil. This is because the kubeconfig is never needed again (gardener-operator is only responsible for initial deployment, and gardenlet updates itself with an in-cluster kubeconfig).\n","categories":"","description":"","excerpt":"Deploy a gardenlet Via gardener-operator The gardenlet can …","ref":"/docs/gardener/deployment/deploy_gardenlet_via_operator/","tags":"","title":"Deploy Gardenlet Via Operator"},{"body":"Deploying Registry Cache Extension in Gardener’s Local Setup with Provider Extensions Prerequisites Make sure that you have a running local Gardener setup with enabled provider extensions. The steps to complete this can be found in the Deploying Gardener Locally and Enabling Provider-Extensions guide. Setting up the Registry Cache Extension Make sure that your KUBECONFIG environment variable is targeting the local Gardener cluster.\nThe location of the Gardener project from the Gardener setup step is expected to be under the same root (e.g. ~/go/src/github.com/gardener/). 
If this is not the case, the location of the Gardener project should be specified in the GARDENER_REPO_ROOT environment variable:\nexport GARDENER_REPO_ROOT=\"\u003cpath_to_gardener_project\u003e\" Then you can run:\nmake remote-extension-up In case you have added additional Seeds, you can specify the seed name:\nmake remote-extension-up SEED_NAME=\u003cseed-name\u003e The corresponding make target will build the extension image, push it into the Seed cluster image registry, and deploy the registry-cache ControllerDeployment and ControllerRegistration resources into the kind cluster. The container image in the ControllerDeployment will be the image that was built and pushed into the Seed cluster image registry.\nThe make target will then deploy the registry-cache admission component. It will build the admission image, push it into the kind cluster image registry, and finally install the admission component charts to the kind cluster.\nCreating a Shoot Cluster Once the above step is completed, you can create a Shoot cluster. In order to create a Shoot cluster, please create your own Shoot definition depending on the providers on your Seed cluster.\nTearing Down the Development Environment To tear down the development environment, delete the Shoot cluster or disable the registry-cache extension in the Shoot’s specification. 
When the extension is not used by the Shoot anymore, you can run:\nmake remote-extension-down The make target will delete the ControllerDeployment and ControllerRegistration of the extension, and the registry-cache admission helm deployment.\n","categories":"","description":"Learn how to set up a development environment using own Seed clusters on an existing Kubernetes cluster","excerpt":"Learn how to set up a development environment using own Seed clusters …","ref":"/docs/extensions/others/gardener-extension-registry-cache/getting-started-remotely/","tags":"","title":"Deploying Registry Cache Extension in Gardener's Local Setup with Provider Extensions"},{"body":"Deploying Registry Cache Extension Locally Prerequisites Make sure that you have a running local Gardener setup. The steps to complete this can be found in the Deploying Gardener Locally guide. Setting up the Registry Cache Extension Make sure that your KUBECONFIG environment variable is targeting the local Gardener cluster. When this is ensured, run:\nmake extension-up The corresponding make target will build the extension image, load it into the kind cluster Nodes, and deploy the registry-cache ControllerDeployment and ControllerRegistration resources. The container image in the ControllerDeployment will be the image that was built and loaded into the kind cluster Nodes.\nThe make target will then deploy the registry-cache admission component. 
It will build the admission image, load it into the kind cluster Nodes, and finally install the admission component charts to the kind cluster.\nCreating a Shoot Cluster Once the above step is completed, you can create a Shoot cluster.\nexample/shoot-registry-cache.yaml contains a Shoot specification with the registry-cache extension:\nkubectl create -f example/shoot-registry-cache.yaml example/shoot-registry-mirror.yaml contains a Shoot specification with the registry-mirror extension:\nkubectl create -f example/shoot-registry-mirror.yaml Tearing Down the Development Environment To tear down the development environment, delete the Shoot cluster or disable the registry-cache extension in the Shoot’s specification. When the extension is not used by the Shoot anymore, you can run:\nmake extension-down The make target will delete the ControllerDeployment and ControllerRegistration of the extension, and the registry-cache admission helm deployment.\n","categories":"","description":"Learn how to set up a local development environment","excerpt":"Learn how to set up a local development environment","ref":"/docs/extensions/others/gardener-extension-registry-cache/getting-started-locally/","tags":"","title":"Deploying Registry Cache Extension Locally"},{"body":"Deploying Rsyslog Relp Extension Remotely This document will walk you through running the Rsyslog Relp extension controller on a remote seed cluster and the rsyslog relp admission component in your local garden cluster for development purposes. This guide uses Gardener’s setup with provider extensions and builds on top of it.\nIf you encounter difficulties, please open an issue so that we can make this process easier.\nPrerequisites Make sure that you have a running Gardener setup with provider extensions. The steps to complete this can be found in the Deploying Gardener Locally and Enabling Provider-Extensions guide. 
Make sure you are running Gardener version \u003e= 1.95.0 or the latest version of the master branch. Setting up the Rsyslog Relp Extension Important: Make sure that your KUBECONFIG environment variable is targeting the local Gardener cluster!\nThe location of the Gardener project from the Gardener setup is expected to be under the same root as this repository (e.g. ~/go/src/github.com/gardener/). If this is not the case, the location of the Gardener project should be specified in the GARDENER_REPO_ROOT environment variable:\nexport GARDENER_REPO_ROOT=\"\u003cpath_to_gardener_project\u003e\" Then you can run:\nmake remote-extension-up In case you have added additional Seeds, you can specify the seed name:\nmake remote-extension-up SEED_NAME=\u003cseed-name\u003e Creating a Shoot Cluster Once the above step is completed, you can create a Shoot cluster. In order to create a Shoot cluster, please create your own Shoot definition depending on the providers on your Seed cluster.\nConfiguring the Shoot Cluster and deploying the Rsyslog Relp Echo Server To be able to properly test the rsyslog relp extension, you need a running rsyslog relp echo server to which logs from the Shoot nodes can be sent. To deploy the server and configure the rsyslog relp extension on your Shoot cluster, you can run:\nmake configure-shoot SHOOT_NAME=\u003cshoot-name\u003e SHOOT_NAMESPACE=\u003cshoot-namespace\u003e This command will deploy an rsyslog relp echo server in your Shoot cluster in the rsyslog-relp-echo-server namespace. It will also add configuration for the shoot-rsyslog-relp extension to your Shoot spec by patching it with ./example/extension/\u003cshoot-name\u003e--\u003cshoot-namespace\u003e--extension-config-patch.yaml. This file is automatically copied from extension-config-patch.yaml.tmpl in the same directory when you run make configure-shoot for the first time. The file also includes explanations of the properties you should set or change. 
The command will also deploy the rsyslog-relp-tls secret in case you wish to enable tls.\nTearing Down the Development Environment To tear down the development environment, delete the Shoot cluster or disable the shoot-rsyslog-relp extension in the Shoot’s specification. When the extension is not used by the Shoot anymore, you can run:\nmake remote-extension-down The make target will delete the ControllerDeployment and ControllerRegistration of the extension, and the shoot-rsyslog-relp admission helm deployment.\n","categories":"","description":"Learn how to set up a development environment using own Seed clusters on an existing Kubernetes cluster","excerpt":"Learn how to set up a development environment using own Seed clusters …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/getting-started-remotely/","tags":"","title":"Deploying Rsyslog Relp Extension Remotely"},{"body":"Deployment of the AliCloud provider extension Disclaimer: This document is NOT a step by step installation guide for the AliCloud provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the AliCloud provider extension repository.\ngardener-extension-admission-alicloud Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-alicloud component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution is to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication is to create a service account in the target cluster and bind the respective roles to it. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution is to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then, projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy the oidc-webhook-authenticator (OWA) and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value depends on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value depends on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the AliCloud provider extension Disclaimer: This …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/deployment/","tags":"","title":"Deployment"},{"body":"Deployment of the AWS provider extension Disclaimer: This document is NOT a step by step installation guide for the AWS provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the AWS provider extension repository.\ngardener-extension-admission-aws Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-aws component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution is to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication is to create a service account in the target cluster and bind the respective roles to it. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution is to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then, projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy the oidc-webhook-authenticator (OWA) and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value depends on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value depends on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the AWS provider extension Disclaimer: This document is …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/deployment/","tags":"","title":"Deployment"},{"body":"Deployment of the Azure provider extension Disclaimer: This document is NOT a step by step installation guide for the Azure provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the Azure provider extension repository.\ngardener-extension-admission-azure Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-azure component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution is to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication is to create a service account in the target cluster and bind the respective roles to it. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution is to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator (OWA) for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the Azure provider extension Disclaimer: This document …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/deployment/","tags":"","title":"Deployment"},{"body":"Deployment of the GCP provider extension Disclaimer: This document is NOT a step-by-step installation guide for the GCP provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the GCP provider extension repository.\ngardener-extension-admission-gcp Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-gcp component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution is to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://kubernetes.default.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This allows the kubelet to rotate the service account token automatically. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication is to create a service account in the target cluster and bind the respective roles to it. Then use the generated service account token to craft a kubeconfig, which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution is to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator (OWA) for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the GCP provider extension Disclaimer: This document is …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/deployment/","tags":"","title":"Deployment"},{"body":"Deployment of the OpenStack provider extension Disclaimer: This document is NOT a step by step installation guide for the OpenStack provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the OpenStack provider extension repository.\ngardener-extension-admission-openstack Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-openstack component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution is to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://kubernetes.default.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This allows the kubelet to rotate the service account token automatically. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication is to create a service account in the target cluster and bind the respective roles to it. Then use the generated service account token to craft a kubeconfig, which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution is to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator (OWA) for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the OpenStack provider extension Disclaimer: This …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-openstack/deployment/","tags":"","title":"Deployment"},{"body":"Deployment of the networking Calico extension Disclaimer: This document is NOT a step by step deployment guide for the networking Calico extension and only contains some configuration specifics regarding the deployment of different components via the helm charts residing in the networking Calico extension repository.\ngardener-extension-admission-calico Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-calico component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution is to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://kubernetes.default.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This allows the kubelet to rotate the service account token automatically. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication is to create a service account in the target cluster and bind the respective roles to it. Then use the generated service account token to craft a kubeconfig, which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution is to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator (OWA) for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the networking Calico extension Disclaimer: This …","ref":"/docs/extensions/network-extensions/gardener-extension-networking-calico/deployment/","tags":"","title":"Deployment"},{"body":"Gardener Certificate Management Introduction Gardener comes with an extension that enables shoot owners to request X.509-compliant certificates for shoot domains.\nExtension Installation The Shoot-Cert-Service extension can be deployed and configured via Gardener’s native resource ControllerRegistration.\nPrerequisites To let the Shoot-Cert-Service operate properly, you need to have:\n a DNS service in your seed contact details and optionally a private key for a pre-existing Let’s Encrypt account ControllerRegistration An example of a ControllerRegistration for the Shoot-Cert-Service can be found at controller-registration.yaml.\nThe ControllerRegistration contains a Helm chart which eventually deploys the Shoot-Cert-Service to seed clusters. It offers some configuration options, mainly to set up a default issuer for shoot clusters. With a default issuer, pre-existing Let’s Encrypt accounts can be used and shared with shoot clusters (see “One Account or Many?” in the Integration Guide).\n Please keep the Let’s Encrypt Rate Limits in mind when using this shared account model. Depending on the number of shoots and domains, it is recommended to use an account with increased rate limits.\n apiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration ... 
values: certificateConfig: defaultIssuer: acme: email: foo@example.com privateKey: |- -----BEGIN RSA PRIVATE KEY----- ... -----END RSA PRIVATE KEY----- server: https://acme-v02.api.letsencrypt.org/directory name: default-issuer # restricted: true # restrict default issuer to any sub-domain of shoot.spec.dns.domain # defaultRequestsPerDayQuota: 50 # precheckNameservers: 8.8.8.8,8.8.4.4 # caCertificates: | # optional custom CA certificates when using private ACME provider # -----BEGIN CERTIFICATE----- # ... # -----END CERTIFICATE----- # # -----BEGIN CERTIFICATE----- # ... # -----END CERTIFICATE----- shootIssuers: enabled: false # if true, allows specifying issuers in the shoot clusters Enablement If the Shoot-Cert-Service should be enabled for every shoot cluster in your Gardener-managed environment, you need to globally enable it in the ControllerRegistration:\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration ... resources: - globallyEnabled: true kind: Extension type: shoot-cert-service Alternatively, you can enable the service only for certain shoots:\nkind: Shoot apiVersion: core.gardener.cloud/v1beta1 ... spec: extensions: - type: shoot-cert-service ... ","categories":"","description":"","excerpt":"Gardener Certificate Management Introduction Gardener comes with an …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/deployment/","tags":"","title":"Deployment"},{"body":"Gardener DNS Management for Shoots Introduction Gardener allows Shoot clusters to request DNS names for Ingresses and Services out of the box. To support this, Gardener must be installed with the shoot-dns-service extension. This extension uses the seed’s DNS management infrastructure to maintain DNS names for shoot clusters. 
So far, only the external DNS domain of a shoot (already used for the Kubernetes API server and ingress DNS names) can be used for managed DNS names.\nConfiguration To generally enable the DNS management for shoot objects, the shoot-dns-service extension must be registered by providing an appropriate extension registration in the garden cluster.\nHere it is possible to decide whether the extension should be always available for all shoots or whether the extension must be separately enabled per shoot.\nIf the extension should be used for all shoots, the registration must set the globallyEnabled flag to true.\nspec: resources: - kind: Extension type: shoot-dns-service globallyEnabled: true Deployment of DNS controller manager If you are using Gardener version \u003e= 1.54, please make sure to deploy the DNS controller manager by adding the dnsControllerManager section to the providerConfig.values section.\nFor example:\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment metadata: name: extension-shoot-dns-service type: helm providerConfig: chart: ... values: image: ... dnsControllerManager: image: repository: europe-docker.pkg.dev/gardener-project/releases/dns-controller-manager tag: v0.16.0 configuration: cacheTtl: 300 controllers: dnscontrollers,dnssources dnsPoolResyncPeriod: 30m #poolSize: 20 #providersPoolResyncPeriod: 24h serverPortHttp: 8080 createCRDs: false deploy: true replicaCount: 1 #resources: # limits: # memory: 1Gi # requests: # cpu: 50m # memory: 500Mi dnsProviderManagement: enabled: true Providing Base Domains usable for a Shoot So far, only the external DNS domain of a shoot (already used for the Kubernetes API server and ingress DNS names) can be used for managed DNS names. 
This is either the shoot domain as a subdomain of the default domain configured for the Gardener installation, or a dedicated domain with dedicated access credentials configured for a dedicated shoot via the shoot manifest.\nAlternatively, you can specify DNSProviders and their credentials Secrets directly in the shoot, if this feature is enabled. By default, DNSProvider replication is disabled, but it can be enabled globally in the ControllerDeployment or for a shoot cluster in the shoot manifest (see further below for details).\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment metadata: name: extension-shoot-dns-service type: helm providerConfig: chart: ... values: image: ... dnsProviderReplication: enabled: true See example files (20-* and 30-*) for details for the various provider types.\nShoot Feature Gate If the shoot DNS feature is not globally enabled by default (depends on the extension registration on the garden cluster), it must be enabled per shoot.\nTo enable the feature for a shoot, the shoot manifest must explicitly add the shoot-dns-service extension.\n... spec: extensions: - type: shoot-dns-service ... Enable/disable DNS provider replication for a shoot The DNSProvider replication feature enablement can be overridden in the shoot manifest, e.g.\nkind: Shoot ... spec: extensions: - type: shoot-dns-service providerConfig: apiVersion: service.dns.extensions.gardener.cloud/v1alpha1 kind: DNSConfig dnsProviderReplication: enabled: true ... ","categories":"","description":"","excerpt":"Gardener DNS Management for Shoots Introduction Gardener allows Shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/deployment/","tags":"","title":"Deployment"},{"body":"Gardener Lakom Service for Shoots Introduction Gardener allows Shoot clusters to use the Lakom admission controller for cosign image signing verification. 
To support this, Gardener must be installed with the shoot-lakom-service extension.\nConfiguration To generally enable the Lakom service for shoot objects, the shoot-lakom-service extension must be registered by providing an appropriate extension registration in the garden cluster.\nHere it is possible to decide whether the extension should be always available for all shoots or whether the extension must be separately enabled per shoot.\nIf the extension should be used for all shoots, the globallyEnabled flag should be set to true.\nspec: resources: - kind: Extension type: shoot-lakom-service globallyEnabled: true Shoot Feature Gate If the shoot Lakom service is not globally enabled by default (depends on the extension registration on the garden cluster), it can be enabled per shoot. To enable the service for a shoot, the shoot manifest must explicitly add the shoot-lakom-service extension.\n... spec: extensions: - type: shoot-lakom-service ... If the shoot Lakom service is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\n... spec: extensions: - type: shoot-lakom-service disabled: true ... ","categories":"","description":"","excerpt":"Gardener Lakom Service for Shoots Introduction Gardener allows Shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-lakom-service/deployment/","tags":"","title":"Deployment"},{"body":"Gardener Networking Policy Filter for Shoots Introduction Gardener allows shoot clusters to filter egress traffic at the node level. 
To support this, Gardener must be installed with the shoot-networking-filter extension.\nConfiguration To generally enable the networking filter for shoot objects, the shoot-networking-filter extension must be registered by providing an appropriate extension registration in the garden cluster.\nHere it is possible to decide whether the extension should be always available for all shoots or whether the extension must be separately enabled per shoot.\nIf the extension should be used for all shoots, the globallyEnabled flag should be set to true.\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration ... spec: resources: - kind: Extension type: shoot-networking-filter globallyEnabled: true ControllerRegistration An example of a ControllerRegistration for the shoot-networking-filter can be found at controller-registration.yaml.\nThe ControllerRegistration contains a Helm chart which eventually deploys the shoot-networking-filter to seed clusters. It offers some configuration options, mainly to set up a static filter list or provide the configuration for downloading the filter list from a service endpoint.\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment ... values: egressFilter: blackholingEnabled: true filterListProviderType: static staticFilterList: - network: 1.2.3.4/31 policy: BLOCK_ACCESS - network: 5.6.7.8/32 policy: BLOCK_ACCESS - network: ::2/128 policy: BLOCK_ACCESS #filterListProviderType: download #downloaderConfig: # endpoint: https://my.filter.list.server/lists/policy # oauth2Endpoint: https://my.auth.server/oauth2/token # refreshPeriod: 1h ## if the downloader needs an OAuth2 access token, client credentials can be provided with oauth2Secret #oauth2Secret: # clientID: 1-2-3-4 # clientSecret: secret!! ## either clientSecret or client certificate is required # client.crt.pem: | # -----BEGIN CERTIFICATE----- # ... # -----END CERTIFICATE----- # client.key.pem: | # -----BEGIN PRIVATE KEY----- # ... 
# -----END PRIVATE KEY----- Enablement for a Shoot If the shoot networking filter is not globally enabled by default (depends on the extension registration on the garden cluster), it can be enabled per shoot. To enable the service for a shoot, the shoot manifest must explicitly add the shoot-networking-filter extension.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-filter ... If the shoot networking filter is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-filter disabled: true ... ","categories":"","description":"","excerpt":"Gardener Networking Policy Filter for Shoots Introduction Gardener …","ref":"/docs/extensions/others/gardener-extension-shoot-networking-filter/deployment/","tags":"","title":"Deployment"},{"body":"Gardener Networking Policy Filter for Shoots Introduction Gardener allows shoot clusters to add network problem observability using the network problem detector. To support this, Gardener must be installed with the shoot-networking-problemdetector extension.\nConfiguration To generally enable the networking problem detector for shoot objects, the shoot-networking-problemdetector extension must be registered by providing an appropriate extension registration in the garden cluster.\nHere it is possible to decide whether the extension should be always available for all shoots or whether the extension must be separately enabled per shoot.\nIf the extension should be used for all shoots, the globallyEnabled flag should be set to true.\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration ... 
spec: resources: - kind: Extension type: shoot-networking-problemdetector globallyEnabled: true ControllerRegistration An example of a ControllerRegistration for the shoot-networking-problemdetector can be found at controller-registration.yaml.\nThe ControllerRegistration contains a Helm chart which eventually deploys the shoot-networking-problemdetector to seed clusters. It offers some configuration options, mainly to set up a static filter list or provide the configuration for downloading the filter list from a service endpoint.\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment ... values: #networkProblemDetector: # defaultPeriod: 30s Enablement for a Shoot If the shoot network problem detector is not globally enabled by default (depends on the extension registration on the garden cluster), it can be enabled per shoot. To enable the service for a shoot, the shoot manifest must explicitly add the shoot-networking-problemdetector extension.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-problemdetector ... If the shoot network problem detector is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-problemdetector disabled: true ... ","categories":"","description":"","excerpt":"Gardener Networking Policy Filter for Shoots Introduction Gardener …","ref":"/docs/extensions/others/gardener-extension-shoot-networking-problemdetector/deployment/","tags":"","title":"Deployment"},{"body":"Gardener OIDC Service for Shoots Introduction Gardener allows Shoot clusters to dynamically register OpenID Connect providers. 
To support this, Gardener must be installed with the shoot-oidc-service extension.\nConfiguration To generally enable the OIDC service for shoot objects, the shoot-oidc-service extension must be registered by providing an appropriate extension registration in the garden cluster.\nHere it is possible to decide whether the extension should be always available for all shoots or whether the extension must be separately enabled per shoot.\nIf the extension should be used for all shoots, the globallyEnabled flag should be set to true.\nspec: resources: - kind: Extension type: shoot-oidc-service globallyEnabled: true Shoot Feature Gate If the shoot OIDC service is not globally enabled by default (depends on the extension registration on the garden cluster), it can be enabled per shoot. To enable the service for a shoot, the shoot manifest must explicitly add the shoot-oidc-service extension.\n... spec: extensions: - type: shoot-oidc-service ... If the shoot OIDC service is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\n... spec: extensions: - type: shoot-oidc-service disabled: true ... ","categories":"","description":"","excerpt":"Gardener OIDC Service for Shoots Introduction Gardener allows Shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-oidc-service/deployment/","tags":"","title":"Deployment"},{"body":"Deploying the Machine Controller Manager into a Kubernetes cluster Deploying the Machine Controller Manager into a Kubernetes cluster Prepare the cluster Build the Docker image Configuring optional parameters while deploying Usage As already mentioned, the Machine Controller Manager is designed to run as a controller in a Kubernetes cluster. The existing source code can be compiled and tested on a local machine as described in Setting up a local development environment. 
You can deploy the Machine Controller Manager using the steps described below.\nPrepare the cluster Connect to the remote kubernetes cluster where you plan to deploy the Machine Controller Manager using the kubectl. Set the environment variable KUBECONFIG to the path of the yaml file containing the cluster info. Now, create the required CRDs on the remote cluster using the following command, $ kubectl apply -f kubernetes/crds Build the Docker image ⚠️ Modify the Makefile to refer to your own registry.\n Run the build which generates the binary to bin/machine-controller-manager $ make build Build docker image from latest compiled binary $ make docker-image Push the last created docker image onto the online docker registry. $ make push Now you can deploy this docker image to your cluster. A sample development file is provided. By default, the deployment manages the cluster it is running in. Optionally, the kubeconfig could also be passed as a flag as described in /kubernetes/deployment/out-of-tree/deployment.yaml. This is done when you want your controller running outside the cluster to be managed from. $ kubectl apply -f kubernetes/deployment/out-of-tree/deployment.yaml Also deploy the required clusterRole and clusterRoleBindings $ kubectl apply -f kubernetes/deployment/out-of-tree/clusterrole.yaml $ kubectl apply -f kubernetes/deployment/out-of-tree/clusterrolebinding.yaml Configuring optional parameters while deploying Machine-controller-manager supports several configurable parameters while deploying. 
Refer to the following lines to learn how each parameter can be configured and what its purpose is.\nUsage To start using Machine Controller Manager, follow the links given at usage here.\n","categories":"","description":"","excerpt":"Deploying the Machine Controller Manager into a Kubernetes cluster …","ref":"/docs/other-components/machine-controller-manager/deployment/","tags":"","title":"Deployment"},{"body":"Developer Docs for Gardener Extension Registry Cache This document outlines how Shoot reconciliation and deletion work for a Shoot with the registry-cache extension enabled.\nShoot Reconciliation This section outlines how the reconciliation works for a Shoot with the registry-cache extension enabled.\nExtension Enablement / Reconciliation This section outlines how the extension enablement/reconciliation works, e.g., the extension has been added to the Shoot spec.\n As part of the Shoot reconciliation flow, the gardenlet deploys the Extension resource. The registry-cache extension reconciles the Extension resource. pkg/controller/cache/actuator.go contains the implementation of the extension.Actuator interface. The reconciliation of an Extension of type registry-cache consists of the following steps: The registry-cache extension deploys resources to the Shoot cluster via ManagedResource. For every configured upstream, it creates a StatefulSet (with PVC), Service, and other resources. It lists all Services from the kube-system namespace that have the upstream-host label. It will return an error (and retry with exponential backoff) until the Services count matches the configured registries count. When there is a Service created for each configured upstream registry, the registry-cache extension populates the Extension resource status. In the Extension status, for each upstream, it maintains an endpoint (in the format http://\u003ccluster-ip\u003e:5000) which can be used to access the registry cache from within the Shoot cluster. 
\u003ccluster-ip\u003e is the cluster IP of the registry cache Service. The cluster IP of a Service is assigned by the Kubernetes API server on Service creation. As part of the Shoot reconciliation flow, the gardenlet deploys the OperatingSystemConfig resource. The registry-cache extension serves a webhook that mutates the OperatingSystemConfig resource for Shoots having the registry-cache extension enabled (the corresponding namespace gets labeled by the gardenlet with extensions.gardener.cloud/registry-cache=true). pkg/webhook/cache/ensurer.go contains an implementation of the genericmutator.Ensurer interface. The webhook appends or updates RegistryConfig entries in the OperatingSystemConfig CRI configuration that corresponds to configured registry caches in the Shoot. The RegistryConfig readiness probe is enabled so that gardener-node-agent creates a hosts.toml containerd registry configuration file when all RegistryConfig hosts are reachable. Extension Disablement This section outlines how the extension disablement works, i.e., the extension has to be removed from the Shoot spec.\n As part of the Shoot reconciliation flow, the gardenlet destroys the Extension resource because it is no longer needed. The extension deletes the ManagedResource containing the registry cache resources. The OperatingSystemConfig resource will not be mutated and no RegistryConfig entries will be added or updated. The gardener-node-agent detects that RegistryConfig entries have been removed or changed and deletes or updates corresponding hosts.toml configuration files under /etc/containerd/certs.d folder. Shoot Deletion This section outlines how the deletion works for a Shoot with the registry-cache extension enabled.\n As part of the Shoot deletion flow, the gardenlet destroys the Extension resource. The extension deletes the ManagedResource containing the registry cache resources. 
","categories":"","description":"Learn about the inner workings","excerpt":"Learn about the inner workings","ref":"/docs/extensions/others/gardener-extension-registry-cache/extension-registry-cache/","tags":"","title":"Developer Docs for Gardener Extension Registry Cache"},{"body":"DNS Autoscaling This is a short guide describing different options how to automatically scale CoreDNS in the shoot cluster.\nBackground Currently, Gardener uses CoreDNS as DNS server. Per default, it is installed as a deployment into the shoot cluster that is auto-scaled horizontally to cover for QPS-intensive applications. However, doing so does not seem to be enough to completely circumvent DNS bottlenecks such as:\n Cloud provider limits for DNS lookups. Unreliable UDP connections that forces a period of timeout in case packets are dropped. Unnecessary node hopping since CoreDNS is not deployed on all nodes, and as a result DNS queries end-up traversing multiple nodes before reaching the destination server. Inefficient load-balancing of services (e.g., round-robin might not be enough when using IPTables mode). Overload of the CoreDNS replicas as the maximum amount of replicas is fixed. and more … As an alternative with extended configuration options, Gardener provides cluster-proportional autoscaling of CoreDNS. This guide focuses on the configuration of cluster-proportional autoscaling of CoreDNS and its advantages/disadvantages compared to the horizontal autoscaling. 
Please note that there is also the option to use a node-local DNS cache, which helps mitigate potential DNS bottlenecks (see Trade-offs in conjunction with NodeLocalDNS for considerations regarding using NodeLocalDNS together with one of the CoreDNS autoscaling approaches).\nConfiguring Cluster-Proportional DNS Autoscaling All that needs to be done to enable the usage of cluster-proportional autoscaling of CoreDNS is to set the corresponding option (spec.systemComponents.coreDNS.autoscaling.mode) in the Shoot resource to cluster-proportional:\n... spec: ... systemComponents: coreDNS: autoscaling: mode: cluster-proportional ... To switch back to horizontal DNS autoscaling, you can set the spec.systemComponents.coreDNS.autoscaling.mode to horizontal (or remove the coreDNS section).\nOnce the cluster-proportional autoscaling of CoreDNS has been enabled and the Shoot cluster has been reconciled afterwards, a ConfigMap called coredns-autoscaler will be created in the kube-system namespace with the default settings. The content will be similar to the following:\nlinear: '{\"coresPerReplica\":256,\"min\":2,\"nodesPerReplica\":16}' It is possible to adapt the ConfigMap according to your needs in case the defaults do not work as desired. The number of CoreDNS replicas is calculated according to the following formula:\nreplicas = max( ceil( cores × 1 / coresPerReplica ) , ceil( nodes × 1 / nodesPerReplica ) ) Depending on your needs, you can adjust coresPerReplica or nodesPerReplica, but it is also possible to override min if required.\nTrade-Offs of Horizontal and Cluster-Proportional DNS Autoscaling The horizontal autoscaling of CoreDNS as implemented by Gardener is fully managed, i.e., you do not need to perform any configuration changes. It scales according to the CPU usage of CoreDNS replicas, meaning that it will create new replicas if the existing ones are under heavy load. This approach scales between 2 and 5 instances, which is sufficient for most workloads. 
In case this is not enough, the cluster-proportional autoscaling approach can be used instead, with its more flexible configuration options.\nThe cluster-proportional autoscaling of CoreDNS as implemented by Gardener is fully managed, but allows more configuration options to adjust the default settings to your individual needs. It scales according to the cluster size, i.e., if your cluster grows in terms of cores/nodes, so will the number of CoreDNS replicas. However, it does not take the actual workload, e.g., CPU consumption, into account.\nExperience shows that the horizontal autoscaling of CoreDNS works for a variety of workloads. It does reach its limits if a cluster has a high number of DNS requests, though. The cluster-proportional autoscaling approach allows you to fine-tune the number of CoreDNS replicas. It helps to scale in clusters of changing size. However, please keep in mind that you need to cater for the maximum number of DNS requests as the replicas will not be adapted according to the workload, but only according to the cluster size (cores/nodes).\nTrade-Offs in Conjunction with NodeLocalDNS Using a node-local DNS cache can mitigate a lot of the potential DNS related problems. It works fine with a DNS workload that can be handled through the cache and reduces the inter-node DNS communication. As node-local DNS cache reduces the amount of traffic being sent to the cluster’s CoreDNS replicas, it usually works fine with horizontally scaled CoreDNS. Nevertheless, it also works with CoreDNS scaled in a cluster-proportional approach. In this mode, though, it might make sense to adapt the default settings as the CoreDNS workload is likely significantly reduced.\nOverall, you can view the DNS options on a scale. Horizontally scaled DNS provides a small number of DNS servers. Especially for bigger clusters, a cluster-proportional approach will yield more CoreDNS instances and hence may yield a more balanced DNS solution. 
By adapting the settings you can further increase the amount of CoreDNS replicas. On the other end of the spectrum, a node-local DNS cache provides DNS on every node and allows to reduce the amount of (backend) CoreDNS instances regardless if they are horizontally or cluster-proportionally scaled.\n","categories":"","description":"","excerpt":"DNS Autoscaling This is a short guide describing different options how …","ref":"/docs/gardener/dns-autoscaling/","tags":"","title":"DNS Autoscaling"},{"body":"Request DNS Names in Shoot Clusters Introduction Within a shoot cluster, it is possible to request DNS records via the following resource types:\n Ingress Service DNSEntry It is necessary that the Gardener installation your shoot cluster runs in is equipped with a shoot-dns-service extension. This extension uses the seed’s dns management infrastructure to maintain DNS names for shoot clusters. Please ask your Gardener operator if the extension is available in your environment.\nShoot Feature Gate In some Gardener setups the shoot-dns-service extension is not enabled globally and thus must be configured per shoot cluster. Please adapt the shoot specification by the configuration shown below to activate the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-dns-service ... Before you start You should :\n Have created a shoot cluster Have created and correctly configured a DNS Provider (Please consult this page for more information) Have a basic understanding of DNS (see link under References) There are 2 types of DNS that you can use within Kubernetes :\n internal (usually managed by coreDNS) external (managed by a public DNS provider). 
This page, and the extension, work exclusively for external DNS handling.\nGardener allows two ways of managing your external DNS:\n Manually, which means you are in charge of creating / maintaining your Kubernetes-related DNS entries Via the Gardener DNS extension Gardener DNS extension The managed external DNS records feature of the Gardener clusters makes all this easier. You do not need DNS service provider specific knowledge, and in fact you do not need to leave your cluster at all to achieve that. You simply annotate the Ingress / Service that needs its DNS records managed and they will be automatically created / managed by Gardener.\nManaged external DNS records are supported with the following DNS provider types:\n aws-route53 azure-dns azure-private-dns google-clouddns openstack-designate alicloud-dns cloudflare-dns Request DNS records for Ingress resources To request a DNS name for Ingress, Service or Gateway (Istio or Gateway API) objects in the shoot cluster, the object must be annotated with the DNS class garden and an annotation denoting the desired DNS names.\nExample for an annotated Ingress resource:\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: # Let Gardener manage external DNS records for this Ingress. dns.gardener.cloud/dnsnames: special.example.com # Use \"*\" to collect domain names from .spec.rules[].host dns.gardener.cloud/ttl: \"600\" dns.gardener.cloud/class: garden # If you are delegating the certificate management to Gardener, uncomment the following line #cert.gardener.cloud/purpose: managed spec: rules: - host: special.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 # Uncomment the following part if you are delegating the certificate management to Gardener #tls: # - hosts: # - special.example.com # secretName: my-cert-secret-name For an Ingress, the DNS names are already declared in the specification. 
Nevertheless the dnsnames annotation must be present. Here a subset of the DNS names of the ingress can be specified. If DNS names for all names are desired, the value all can be used.\nKeep in mind that ingress resources are ignored unless an ingress controller is set up. Gardener does not provide an ingress controller by default. For more details, see Ingress Controllers and Service in the Kubernetes documentation.\nRequest DNS records for service type LoadBalancer Example for an annotated Service (it must have the type LoadBalancer) resource:\napiVersion: v1 kind: Service metadata: name: amazing-svc annotations: # Let Gardener manage external DNS records for this Service. dns.gardener.cloud/dnsnames: special.example.com dns.gardener.cloud/ttl: \"600\" dns.gardener.cloud/class: garden spec: selector: app: amazing-app ports: - protocol: TCP port: 80 targetPort: 8080 type: LoadBalancer Request DNS records for Gateway resources Please see Istio Gateways or Gateway API for details.\nCreating a DNSEntry resource explicitly It is also possible to create a DNS entry via the Kubernetes resource called DNSEntry:\napiVersion: dns.gardener.cloud/v1alpha1 kind: DNSEntry metadata: annotations: # Let Gardener manage this DNS entry. dns.gardener.cloud/class: garden name: special-dnsentry namespace: default spec: dnsName: special.example.com ttl: 600 targets: - 1.2.3.4 If one of the accepted DNS names is a direct subname of the shoot’s ingress domain, this is already handled by the standard wildcard entry for the ingress domain. Therefore this name should be excluded from the dnsnames list in the annotation. 
If only this DNS name is configured in the ingress, no explicit DNS entry is required, and the DNS annotations should be omitted entirely.\nYou can check the status of the DNSEntry with\n$ kubectl get dnsentry NAME DNS TYPE PROVIDER STATUS AGE mydnsentry special.example.com aws-route53 default/aws Ready 24s As soon as the status of the entry is Ready, the provider has accepted the new DNS record. Depending on the provider and your DNS settings and cache, it may take up to 24 hours for the new entry to be propagated across the internet.\nMore examples can be found here\nRequest DNS records for Service/Ingress resources using a DNSAnnotation resource In rare cases it may not be possible to add annotations to a Service or Ingress resource object.\nE.g.: the helm chart used to deploy the resource may not be adaptable for some reason, or some automation is used which always restores the original content of the resource object by dropping any additional annotations.\nIn these cases, it is recommended to use an additional DNSAnnotation resource in order to have more flexibility than DNSEntry resources. The DNSAnnotation resource makes the DNS shoot service behave as if annotations have been added to the referenced resource.\nFor the Ingress example shown above, you can create a DNSAnnotation resource alternatively to provide the annotations.\napiVersion: dns.gardener.cloud/v1alpha1 kind: DNSAnnotation metadata: annotations: dns.gardener.cloud/class: garden name: test-ingress-annotation namespace: default spec: resourceRef: kind: Ingress apiVersion: networking.k8s.io/v1 name: test-ingress namespace: default annotations: dns.gardener.cloud/dnsnames: '*' dns.gardener.cloud/class: garden Note that the DNSAnnotation resource itself needs the dns.gardener.cloud/class=garden annotation. 
This also only works for annotations known to the DNS shoot service (see Accepted External DNS Records Annotations).\nFor more details, see also DNSAnnotation objects\nAccepted External DNS Records Annotations Here are all of the accepted annotations related to the DNS extension:\n Annotation Description dns.gardener.cloud/dnsnames Mandatory for service and ingress resources, accepts a comma-separated list of DNS names if multiple names are required. For ingress you can use the special value '*'. In this case, the DNS names are collected from .spec.rules[].host. dns.gardener.cloud/class Mandatory, in the context of the shoot-dns-service it must always be set to garden. dns.gardener.cloud/ttl Recommended, overrides the default Time-To-Live of the DNS record. dns.gardener.cloud/cname-lookup-interval Only relevant if multiple domain name targets are specified. It specifies the lookup interval for CNAMEs to map them to IP addresses (in seconds) dns.gardener.cloud/realms Internal, for restricting provider access for shoot DNS entries. Typically not set by users of the shoot-dns-service. dns.gardener.cloud/ip-stack Only relevant for provider type aws-route53 if target is an AWS load balancer domain name. Can be set for service, ingress and DNSEntry resources. It specifies which DNS records with alias targets are created instead of the usual CNAME records. If the annotation is not set (or has the value ipv4), only an A record is created. With value dual-stack, both A and AAAA records are created. With value ipv6 only an AAAA record is created. service.beta.kubernetes.io/aws-load-balancer-ip-address-type=dualstack For services, behaves similarly to dns.gardener.cloud/ip-stack=dual-stack. loadbalancer.openstack.org/load-balancer-address Internal, for services only: support for PROXY protocol on Openstack (which needs a hostname as ingress). Typically not set by users of the shoot-dns-service. 
If one of the accepted DNS names is a direct subdomain of the shoot’s ingress domain, this is already handled by the standard wildcard entry for the ingress domain. Therefore, this name should be excluded from the dnsnames list in the annotation. If only this DNS name is configured in the ingress, no explicit DNS entry is required, and the DNS annotations should be omitted at all.\nTroubleshooting General DNS tools To check the DNS resolution, use the nslookup or dig command.\n$ nslookup special.your-domain.com or with dig\n$ dig +short special.example.com Depending on your network settings, you may get a successful response faster using a public DNS server (e.g. 8.8.8.8, 8.8.4.4, or 1.1.1.1) dig @8.8.8.8 +short special.example.com DNS record events The DNS controller publishes Kubernetes events for the resource which requested the DNS record (Ingress, Service, DNSEntry). These events reveal more information about the DNS requests being processed and are especially useful to check any kind of misconfiguration, e.g. 
requests for a domain you don’t own.\nEvents for a successfully created DNS record:\n$ kubectl describe service my-service Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal dns-annotation 19s dns-controller-manager special.example.com: dns entry is pending Normal dns-annotation 19s (x3 over 19s) dns-controller-manager special.example.com: dns entry pending: waiting for dns reconciliation Normal dns-annotation 9s (x3 over 10s) dns-controller-manager special.example.com: dns entry active Please note, events vanish after their retention period (usually 1h).\nDNSEntry status DNSEntry resources offer a .status sub-resource which can be used to check the current state of the object.\nStatus of an erroneous DNSEntry.\n status: message: No responsible provider found observedGeneration: 3 provider: remote state: Error References Understanding DNS Kubernetes Internal DNS DNSEntry API (Golang) Managing Certificates with Gardener ","categories":"","description":"","excerpt":"Request DNS Names in Shoot Clusters Introduction Within a shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/dns_names/","tags":"","title":"DNS Names"},{"body":"DNS Providers Introduction Gardener can manage DNS records on your behalf, so that you can request them via different resource types (see here) within the shoot cluster. The domains for which you are permitted to request records are, however, restricted and depend on the DNS provider configuration.\nShoot provider By default, every shoot cluster is equipped with a default provider. It is the very same provider that manages the shoot cluster’s kube-apiserver public DNS record (DNS address in your Kubeconfig).\nkind: Shoot ... dns: domain: shoot.project.default-domain.gardener.cloud You are permitted to request any sub-domain of .dns.domain that is not already taken (e.g. 
api.shoot.project.default-domain.gardener.cloud, *.ingress.shoot.project.default-domain.gardener.cloud) with this provider.\nAdditional providers If you need to request DNS records for domains not managed by the default provider, additional providers can be configured in the shoot specification. Alternatively, if it is enabled, they can be added as DNSProvider resources to the shoot cluster.\nAdditional providers in the shoot specification To add providers in the shoot spec, you need to set them in the spec.dns.providers list.\nFor example:\nkind: Shoot ... spec: dns: domain: shoot.project.default-domain.gardener.cloud providers: - secretName: my-aws-account type: aws-route53 - secretName: my-gcp-account type: google-clouddns Please consult the API-Reference to get a complete list of supported fields and configuration options.\n Referenced secrets should exist in the project namespace in the Garden cluster and must comply with the provider specific credentials format. The External-DNS-Management project provides corresponding examples (20-secret-\u003cprovider-name\u003e-credentials.yaml) for known providers.\nAdditional providers as resources in the shoot cluster If it is not enabled globally, you have to enable the feature in the shoot manifest:\nkind: Shoot ... spec: extensions: - type: shoot-dns-service providerConfig: apiVersion: service.dns.extensions.gardener.cloud/v1alpha1 kind: DNSConfig dnsProviderReplication: enabled: true ... 
To add a provider directly in the shoot cluster, provide a DNSProvider in any namespace together with Secret containing the credentials.\nFor example if the domain is hosted with AWS Route 53 (provider type aws-route53):\napiVersion: dns.gardener.cloud/v1alpha1 kind: DNSProvider metadata: annotations: dns.gardener.cloud/class: garden name: my-own-domain namespace: my-namespace spec: type: aws-route53 secretRef: name: my-own-domain-credentials domains: include: - my.own.domain.com --- apiVersion: v1 kind: Secret metadata: name: my-own-domain-credentials namespace: my-namespace type: Opaque data: # replace '...' with values encoded as base64 AWS_ACCESS_KEY_ID: ... AWS_SECRET_ACCESS_KEY: ... The External-DNS-Management project provides examples with more details for DNSProviders (30-provider-\u003cprovider-name\u003e.yaml) and credential Secrets (20-secret-\u003cprovider-name\u003e.yaml) at https://github.com/gardener/external-dns-management//examples for all supported provider types.\n","categories":"","description":"","excerpt":"DNS Providers Introduction Gardener can manage DNS records on your …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/dns_providers/","tags":"","title":"DNS Providers"},{"body":"Contract: DNSRecord Resources Every shoot cluster requires external DNS records that are publicly resolvable. The management of these DNS records requires provider-specific knowledge which is to be developed outside the Gardener’s core repository.\nCurrently, Gardener uses DNSProvider and DNSEntry resources. However, this introduces undesired coupling of Gardener to a controller that does not adhere to the Gardener extension contracts. Because of this, we plan to stop using DNSProvider and DNSEntry resources for Gardener DNS records in the future and use the DNSRecord resources described here instead.\nWhat does Gardener create DNS records for? 
Internal Domain Name Every shoot cluster’s kube-apiserver running in the seed is exposed via a load balancer that has a public endpoint (IP or hostname). This endpoint is used by end-users and also by system components (that are running in another network, e.g., the kubelet or kube-proxy) to talk to the cluster. In order to be robust against changes of this endpoint (e.g., caused by re-creation of the load balancer or a move of the DNS record to another seed cluster), Gardener creates a so-called internal domain name for every shoot cluster. The internal domain name is a publicly resolvable DNS record that points to the load balancer of the kube-apiserver. Gardener uses this domain name in the kubeconfigs of all system components, instead of using the load balancer endpoint directly. This way Gardener does not need to recreate all kubeconfigs if the endpoint changes - it just needs to update the DNS record.\nExternal Domain Name The internal domain name is not configurable by end-users directly but configured by the Gardener administrator. However, end-users usually prefer to have another DNS name, maybe even using their own domain sometimes, to access their Kubernetes clusters. Gardener supports that by creating another DNS record, named external domain name, that actually points to the internal domain name. The kubeconfig handed out to end-users does contain this external domain name, i.e., users can access their clusters with the DNS name they prefer.\nAs not every end-user has their own domain, it is possible for Gardener administrators to configure so-called default domains. If configured, shoots that do not specify a domain explicitly get an external domain name based on a default domain (unless it is explicitly stated that this shoot should not get an external domain name via .spec.dns.provider=unmanaged).\nIngress Domain Name (Deprecated) Gardener allows deploying an nginx-ingress-controller into a shoot cluster (deprecated). 
This controller is exposed via a public load balancer (again, either IP or hostname). Gardener creates a wildcard DNS record pointing to this load balancer. Ingress resources can later use this wildcard DNS record to expose underlying applications.\nSeed Ingress If .spec.ingress is configured in the Seed, Gardener deploys the ingress controller mentioned in .spec.ingress.controller.kind to the seed cluster. Currently, the only supported kind is “nginx”. If the ingress field is set, then .spec.dns.provider must also be set. Gardener creates a wildcard DNS record pointing to the load balancer of the ingress controller. The Ingress resources of components like Plutono and Prometheus in the garden namespace and the shoot namespaces use this wildcard DNS record to expose their underlying applications.\nWhat needs to be implemented to support a new DNS provider? As part of the shoot flow, Gardener will create a number of DNSRecord resources in the seed cluster (one for each of the DNS records mentioned above) that need to be reconciled by an extension controller. These resources contain the following information:\n The DNS provider type (e.g., aws-route53, google-clouddns, …) A reference to a Secret object that contains the provider-specific credentials used to communicate with the provider’s API. The fully qualified domain name (FQDN) of the DNS record, e.g. “api.\u003cshoot domain\u003e”. The DNS record type, one of A, AAAA, CNAME, or TXT. The DNS record values, that is a list of IP addresses for A records, a single hostname for CNAME records, or a list of texts for TXT records. Optionally, the DNSRecord resource may contain also the following information:\n The region of the DNS record. If not specified, the region specified in the referenced Secret shall be used. If that is also not specified, the extension controller shall use a certain default region. The DNS hosted zone of the DNS record. 
If not specified, it shall be determined automatically by the extension controller by getting all hosted zones of the account and searching for the longest zone name that is a suffix of the fully qualified domain name (FQDN) mentioned above. The TTL of the DNS record in seconds. If not specified, it shall be set by the extension controller to 120. Example DNSRecord:\n--- apiVersion: v1 kind: Secret metadata: name: dnsrecord-bar-external namespace: shoot--foo--bar type: Opaque data: # aws-route53 specific credentials here --- apiVersion: extensions.gardener.cloud/v1alpha1 kind: DNSRecord metadata: name: dnsrecord-external namespace: default spec: type: aws-route53 secretRef: name: dnsrecord-bar-external namespace: shoot--foo--bar # region: eu-west-1 # zone: ZFOO name: api.bar.foo.my-fancy-domain.com recordType: A values: - 1.2.3.4 # ttl: 600 In order to support a new DNS record provider, you need to write a controller that watches all DNSRecords with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the AWS route53 provider.\nKey Names in Secrets Containing Provider-Specific Credentials For compatibility with existing setups, extension controllers shall support two different namings of keys in secrets containing provider-specific credentials:\n The naming used by the external-dns-management DNS controller. For example, on AWS the key names are AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION. The naming used by other provider-specific extension controllers, e.g. for infrastructure. For example, on AWS the key names are accessKeyId, secretAccessKey, and region. 
Avoiding Reading the DNS Hosted Zones If the DNS hosted zone is not specified in the DNSRecord resource, during the first reconciliation the extension controller shall determine the correct DNS hosted zone for the specified FQDN and write it to the status of the resource:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: DNSRecord metadata: name: dnsrecord-external namespace: shoot--foo--bar spec: ... status: lastOperation: ... zone: ZFOO On subsequent reconciliations, the extension controller shall use the zone from the status and avoid reading the DNS hosted zones from the provider. If the DNSRecord resource specifies a zone in .spec.zone and the extension controller has written a value to .status.zone, the first one shall be considered with higher priority by the extension controller.\nNon-Provider Specific Information Required for DNS Record Creation Some providers might require further information that is not provider specific but already part of the shoot resource. As Gardener cannot know which information is required by providers, it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information that is not part of the DNSRecord resource itself.\nUsing DNSRecord Resources gardenlet manages DNSRecord resources for all three DNS records mentioned above (internal, external, and ingress). In order to successfully reconcile a shoot with the feature gate enabled, extension controllers for DNSRecord resources for types used in the default, internal, and custom domain secrets should be registered via ControllerRegistration resources.\n Note: For compatibility reasons, the spec.dns.providers section is still used to specify additional providers. Only the one marked as primary: true will be used for DNSRecord. 
All others are considered by the shoot-dns-service extension only (if deployed).\n Support for DNSRecord Resources in the Provider Extensions The following table contains information about the provider extension version that adds support for DNSRecord resources:\n Extension Version provider-alicloud v1.26.0 provider-aws v1.27.0 provider-azure v1.21.0 provider-gcp v1.18.0 provider-openstack v1.21.0 provider-vsphere N/A provider-equinix-metal N/A provider-kubevirt N/A provider-openshift N/A Support for DNSRecord IPv6 recordType: AAAA in the Provider Extensions The following table contains information about the provider extension version that adds support for DNSRecord IPv6 recordType: AAAA:\n Extension Version provider-alicloud N/A provider-aws N/A provider-azure N/A provider-gcp N/A provider-openstack N/A provider-vsphere N/A provider-equinix-metal N/A provider-kubevirt N/A provider-openshift N/A provider-local v1.63.0 References and Additional Resources DNSRecord API (Golang specification) Sample Implementation for the AWS Route53 Provider ","categories":"","description":"","excerpt":"Contract: DNSRecord Resources Every shoot cluster requires external …","ref":"/docs/gardener/extensions/dnsrecord/","tags":"","title":"DNS Record"},{"body":"DNS Search Path Optimization DNS Search Path Using fully qualified names has some downsides, e.g., it may become harder to move deployments from one landscape to the next. It is far easier and simple to rely on short/local names, which may have different meaning depending on the context they are used in.\nThe DNS search path allows for the usage of short/local names. It is an ordered list of DNS suffixes to append to short/local names to create a fully qualified name.\nIf a short/local name should be resolved, each entry is appended to it one by one to check whether it can be resolved. The process stops when either the name could be resolved or the DNS search path ends. 
As the last step after trying the search path, the resolver attempts to resolve the short/local name on its own.\nDNS Option ndots As explained in the section above, the DNS search path is used for short/local names to create fully qualified names. The DNS option ndots specifies how many dots (.) a name needs to have to be considered fully qualified. For names with fewer than ndots dots (.), the DNS search path will be applied.\nDNS Search Path, ndots, and Kubernetes Kubernetes tries to make it easy/convenient for developers to use name resolution. It provides several means to address a service, most notably by its name directly, using the namespace as suffix, utilizing \u003cnamespace\u003e.svc as suffix or as a fully qualified name as \u003cservice\u003e.\u003cnamespace\u003e.svc.cluster.local (assuming cluster.local to be the cluster domain).\nThis is why the DNS search path is fairly long in Kubernetes, usually consisting of \u003cnamespace\u003e.svc.cluster.local, svc.cluster.local, cluster.local, and potentially some additional entries coming from the local network of the cluster. For various reasons, the default ndots value in the context of Kubernetes is 5, which is also fairly large. See this comment for a more detailed description.\nDNS Search Path/ndots Problem in Kubernetes As the DNS search path is long and ndots is large, a lot of DNS queries might traverse the DNS search path. This results in an explosion of DNS requests.\nFor example, consider the name resolution of the default kubernetes service kubernetes.default.svc.cluster.local. As this name has only four dots, it is not considered a fully qualified name according to the default ndots=5 setting. 
Therefore, the DNS search path is applied, resulting in the following queries being created:\n kubernetes.default.svc.cluster.local.some-namespace.svc.cluster.local kubernetes.default.svc.cluster.local.svc.cluster.local kubernetes.default.svc.cluster.local.cluster.local kubernetes.default.svc.cluster.local.network-domain … In IPv4/IPv6 dual stack systems, the number of DNS requests may even double as each name is resolved for IPv4 and IPv6.\nGeneral Workarounds/Mitigations Kubernetes provides the capability to set the DNS options for each pod (see Pod DNS config for details). However, this has to be applied for every pod (doing name resolution) to resolve the problem. A mutating webhook may be useful in this regard. Unfortunately, the DNS requirements may be different depending on the workload. Therefore, a general solution may be difficult or even impossible.\nAnother approach is to always use fully qualified names and append a dot (.) to the name to prevent the name resolution system from using the DNS search path. This might be somewhat counterintuitive as most developers are not used to the trailing dot (.). Furthermore, it makes moving to different landscapes more difficult/error-prone.\nGardener Specific Workarounds/Mitigations Gardener allows users to customize their DNS configuration. CoreDNS allows several approaches to deal with the requests generated by the DNS search path. Caching is possible as well as query rewriting. There are also several other plugins available, which may mitigate the situation.\nGardener DNS Query Rewriting As explained above, the application of the DNS search path may lead to the undesired creation of DNS requests. Especially with the default setting of ndots=5, seemingly fully qualified names pointing to services in the cluster may trigger the DNS search path application.\nGardener can automatically rewrite some obviously incorrect DNS names, which stem from an application of the DNS search path, to the most likely desired name. 
This will automatically rewrite requests like service.namespace.svc.cluster.local.svc.cluster.local to service.namespace.svc.cluster.local.\nIn case the applications also target services for name resolution, which are outside of the cluster and have less than ndots dots, it might be helpful to prevent search path application for them as well. One way to achieve it is by adding them to the commonSuffixes:\n... spec: ... systemComponents: coreDNS: rewriting: commonSuffixes: - gardener.cloud - example.com ... DNS requests containing a common suffix and ending in .svc.cluster.local are assumed to be incorrect application of the DNS search path. Therefore, they are rewritten to everything ending in the common suffix. For example, www.gardener.cloud.svc.cluster.local would be rewritten to www.gardener.cloud.\nPlease note that the common suffixes should be long enough and include enough dots (.) to prevent random overlap with other DNS queries. For example, it would be a bad idea to simply put com on the list of common suffixes, as there may be services/namespaces which have com as part of their name. The effect would be seemingly random DNS requests. Gardener requires that common suffixes contain at least one dot (.) and adds a second dot at the beginning. For instance, a common suffix of example.com in the configuration would match *.example.com.\nSince some clients verify the host in the response of a DNS query, the host must also be rewritten. 
For that reason, we can’t rewrite a query for service.dst-namespace.svc.cluster.local.src-namespace.svc.cluster.local or www.example.com.src-namespace.svc.cluster.local, as for an answer rewrite src-namespace would not be known.\n","categories":"","description":"","excerpt":"DNS Search Path Optimization DNS Search Path Using fully qualified …","ref":"/docs/gardener/dns-search-path-optimization/","tags":"","title":"DNS Search Path Optimization"},{"body":"Gardener Extension for DNS services \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\nExtension-Resources Example extension resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: \"extension-dns-service\" namespace: shoot--project--abc spec: type: shoot-dns-service How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for DNS services for shoot clusters","excerpt":"Gardener extension controller for DNS services for shoot clusters","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/","tags":"","title":"DNS services"},{"body":"Using the latest Tag for an Image Many Dockerfiles start with the FROM package:latest pattern to pull the latest image from a Docker registry.\nBad Dockerfile FROM alpine\nWhile simple, using the latest tag for an image means that your build can suddenly break if that image gets updated. This can lead to problems where everything builds fine locally (because your local cache thinks it is the latest), while a build server may fail, because some pipelines make a clean pull on every build. Additionally, troubleshooting can prove to be difficult, since the maintainer of the Dockerfile didn’t actually make any changes.\nGood Dockerfile A digest takes the place of the tag when pulling an image. This ensures that the base image of your build remains immutable.\nFROM alpine@sha256:7043076348bf5040220df6ad703798fd8593a0918d06d3ce30c6c93be117e430\nRunning apt/apk/yum update Running apt-get install is one of those things virtually every Debian-based Dockerfile will have to do in order to satisfy some external package requirements your code needs to run. 
However, using apt-get as an example, this comes with its own problems.\napt-get upgrade\nThis will update all your packages to their latest versions, which can be bad because it prevents your Dockerfile from creating consistent, immutable builds.\napt-get update (in a different line than the one running your apt-get install command)\nWhen apt-get update runs as a separate line, the result gets cached by the build and the command won’t actually run every time you need to run apt-get install. Instead, make sure you run apt-get update in the same line as apt-get install with all the packages to ensure that all are updated correctly.\nAvoid Big Container Images Building a small container image will reduce the time needed to start or restart pods. An image based on the popular Alpine Linux project (~5MB) is much smaller than most distribution-based images. For most popular languages and products, there is usually an official Alpine Linux image, e.g., golang, nodejs, and postgres.\n$ docker images REPOSITORY TAG IMAGE ID CREATED SIZE postgres 9.6.9-alpine 6583932564f8 13 days ago 39.26 MB postgres 9.6 d92dad241eff 13 days ago 235.4 MB postgres 10.4-alpine 93797b0f31f4 13 days ago 39.56 MB In addition, for compiled languages such as Go or C++ that do not require build time tooling during runtime, it is recommended to avoid build time tooling in the final images. With Docker’s support for multi-stage builds, this can be easily achieved with minimal effort. Such an example can be found at Multi-stage builds.\nGoogle’s distroless image is also a good base image.\n","categories":"","description":"Common Dockerfile pitfalls","excerpt":"Common Dockerfile pitfalls","ref":"/docs/guides/applications/dockerfile-pitfall/","tags":"","title":"Dockerfile Pitfalls"},{"body":"Using IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster Motivation IPv6 adoption is continuously growing, already overtaking IPv4 in certain regions, e.g. India, or scenarios, e.g. mobile. 
Even though most IPv6 installations deploy means to reach IPv4, it might still be beneficial to expose services natively via IPv4 and IPv6 instead of just relying on IPv4.\nDisadvantages of full IPv4/IPv6 (dual-stack) Deployments Enabling full IPv4/IPv6 (dual-stack) support in a kubernetes cluster is a major endeavor. It requires a lot of changes and restarts of all pods so that all pods get addresses for both IP families. A side-effect of dual-stack networking is that failures may be hidden as network traffic may take the other protocol to reach the target. For this reason and also due to reduced operational complexity, service teams might lean towards staying in a single-stack environment as much as possible. Luckily, this is possible with Gardener and IPv4/IPv6 (dual-stack) ingress on AWS.\nSimplifying IPv4/IPv6 (dual-stack) Ingress with Protocol Translation on AWS Fortunately, the network load balancer on AWS supports automatic protocol translation, i.e. it can expose both IPv4 and IPv6 endpoints while communicating with just one protocol to the backends. Under the hood, automatic protocol translation takes place. Client IP address preservation can be achieved by using proxy protocol.\nThis approach enables users to expose IPv4 workload to IPv6-only clients without having to change the workload/service. Without requiring invasive changes, it allows a fairly simple first step into the IPv6 world for services just requiring ingress (incoming) communication.\nNecessary Shoot Cluster Configuration Changes for IPv4/IPv6 (dual-stack) Ingress To be able to utilize IPv4/IPv6 (dual-stack) Ingress in an IPv4 shoot cluster, the cluster needs to meet two preconditions:\n dualStack.enabled needs to be set to true to configure VPC/subnet for IPv6 and add a routing rule for IPv6. (This does not add IPv6 addresses to kubernetes nodes.) loadBalancerController.enabled needs to be set to true as well to use the load balancer controller, which supports dual-stack ingress. 
apiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig dualStack: enabled: true controlPlaneConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerController: enabled: true ... When infrastructureConfig.networks.vpc.id is set to the ID of an existing VPC, please make sure that your VPC has an Amazon-provided IPv6 CIDR block added.\nAfter adapting the shoot specification and reconciling the cluster, dual-stack load balancers can be created using kubernetes services objects.\nCreating an IPv4/IPv6 (dual-stack) Ingress With the preconditions set, creating an IPv4/IPv6 load balancer is as easy as annotating a service with the correct annotations:\napiVersion: v1 kind: Service metadata: annotations: service.beta.kubernetes.io/aws-load-balancer-ip-address-type: dualstack service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance service.beta.kubernetes.io/aws-load-balancer-type: external name: ... namespace: ... spec: ... type: LoadBalancer In case the client IP address should be preserved, the following annotation can be used to enable proxy protocol. (The pod receiving the traffic needs to be configured for proxy protocol as well.)\n service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: \"*\" Please note that changing an existing Service to dual-stack may cause the creation of a new load balancer without deletion of the old AWS load balancer resource. While this helps in a seamless migration by not cutting existing connections it may lead to wasted/forgotten resources. 
Therefore, the (manual) cleanup needs to be taken into account when migrating an existing Service instance.\nFor more details see AWS Load Balancer Documentation - Network Load Balancer.\nDNS Considerations to Prevent Downtime During a Dual-Stack Migration In case the migration of an existing service is desired, please check if there are DNS entries directly linked to the corresponding load balancer. The migrated load balancer will have a new domain name immediately, which will not be ready in the beginning. Therefore, a direct migration of the domain name entries is not desired as it may cause a short downtime, i.e. domain name entries without backing IP addresses.\nIf there are DNS entries directly linked to the corresponding load balancer and they are managed by the shoot-dns-service, you can identify this via annotations with the prefix dns.gardener.cloud/. Those annotations can be linked to a Service, Ingress or Gateway resources. Alternatively, they may also use DNSEntry or DNSAnnotation resources.\nFor a seamless migration without downtime use the following three step approach:\n Temporarily prevent direct DNS updates Migrate the load balancer and wait until it is operational Allow DNS updates again To prevent direct updates of the DNS entries when the load balancer is migrated add the annotation dns.gardener.cloud/ignore: 'true' to all affected resources next to the other dns.gardener.cloud/... annotations before starting the migration. For example, in case of a Service ensure that the service looks like the following:\nkind: Service metadata: annotations: dns.gardener.cloud/ignore: 'true' dns.gardener.cloud/class: garden dns.gardener.cloud/dnsnames: '...' ... Next, migrate the load balancer to be dual-stack enabled by adding/changing the corresponding annotations.\nYou have multiple options how to check that the load balancer has been provisioned successfully. 
It might be useful to peek into status.loadBalancer.ingress of the corresponding Service to identify the load balancer:\n Check in the AWS console for the corresponding load balancer provisioning state Perform domain name lookups with nslookup/dig to check whether the name resolves to an IP address. Call your workload via the new load balancer, e.g. using curl --resolve \u003cmy-domain-name\u003e:\u003cport\u003e:\u003cIP-address\u003e https://\u003cmy-domain-name\u003e:\u003cport\u003e, which allows you to call your service with the “correct” domain name without using actual name resolution. Wait a fixed period of time as load balancer creation is usually finished within 15 minutes Once the load balancer has been provisioned, you can remove the annotation dns.gardener.cloud/ignore: 'true' again from the affected resources. It may take some additional time until the domain name change finally propagates (up to one hour).\n","categories":"","description":"","excerpt":"Using IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/dual-stack-ingress/","tags":"","title":"Dual Stack Ingress"},{"body":"Dependency Watchdog with Local Garden Cluster Setting up Local Garden cluster A convenient way to test local dependency-watchdog changes is to use a local garden cluster. 
To set up a local garden cluster you can follow the setup-guide.\nDependency Watchdog resources As part of the local garden installation, a local seed will be available.\nDependency Watchdog resources created in the seed Namespaced resources In the garden namespace of the seed cluster, the following resources will be created:\n Resource (GVK) Name {apiVersion: v1, Kind: ServiceAccount} dependency-watchdog-prober {apiVersion: v1, Kind: ServiceAccount} dependency-watchdog-weeder {apiVersion: apps/v1, Kind: Deployment} dependency-watchdog-prober {apiVersion: apps/v1, Kind: Deployment} dependency-watchdog-weeder {apiVersion: v1, Kind: ConfigMap} dependency-watchdog-prober-* {apiVersion: v1, Kind: ConfigMap} dependency-watchdog-weeder-* {apiVersion: rbac.authorization.k8s.io/v1, Kind: Role} gardener.cloud:dependency-watchdog-prober:role {apiVersion: rbac.authorization.k8s.io/v1, Kind: Role} gardener.cloud:dependency-watchdog-weeder:role {apiVersion: rbac.authorization.k8s.io/v1, Kind: RoleBinding} gardener.cloud:dependency-watchdog-prober:role-binding {apiVersion: rbac.authorization.k8s.io/v1, Kind: RoleBinding} gardener.cloud:dependency-watchdog-weeder:role-binding {apiVersion: resources.gardener.cloud/v1alpha1, Kind: ManagedResource} dependency-watchdog-prober {apiVersion: resources.gardener.cloud/v1alpha1, Kind: ManagedResource} dependency-watchdog-weeder {apiVersion: v1, Kind: Secret} managedresource-dependency-watchdog-weeder {apiVersion: v1, Kind: Secret} managedresource-dependency-watchdog-prober Cluster resources Resource (GVK) Name {apiVersion: rbac.authorization.k8s.io/v1, Kind: ClusterRole} gardener.cloud:dependency-watchdog-prober:cluster-role {apiVersion: rbac.authorization.k8s.io/v1, Kind: ClusterRole} gardener.cloud:dependency-watchdog-weeder:cluster-role {apiVersion: rbac.authorization.k8s.io/v1, Kind: ClusterRoleBinding} gardener.cloud:dependency-watchdog-prober:cluster-role-binding {apiVersion: rbac.authorization.k8s.io/v1, Kind: ClusterRoleBinding} 
gardener.cloud:dependency-watchdog-weeder:cluster-role-binding Dependency Watchdog resources created in Shoot control namespace Resource (GVK) Name {apiVersion: v1, Kind: Secret} dependency-watchdog-prober {apiVersion: resources.gardener.cloud/v1alpha1, Kind: ManagedResource} shoot-core-dependency-watchdog Dependency Watchdog resources created in the kube-node-lease namespace of the shoot Resource (GVK) Name {apiVersion: rbac.authorization.k8s.io/v1, Kind: Role} gardener.cloud:target:dependency-watchdog {apiVersion: rbac.authorization.k8s.io/v1, Kind: RoleBinding} gardener.cloud:target:dependency-watchdog These will be created by the GRM and will have a managed resource named shoot-core-dependency-watchdog in the shoot namespace in the seed.\nUpdate Gardener with custom Dependency Watchdog Docker images Build, Tag and Push docker images To build dependency watchdog docker images run the following make target:\n\u003e make docker-build Local gardener hosts a docker registry which can be accessed at localhost:5001. To enable local gardener to access the custom docker images you need to tag and push these images to the embedded docker registry. To do that execute the following commands:\n\u003e docker images # Get the IMAGE ID of the dependency watchdog images that were built using docker-build make target. \u003e docker tag \u003cIMAGE-ID\u003e localhost:5001/europe-docker.pkg.dev/gardener-project/public/gardener/dependency-watchdog-prober:\u003cTAGNAME\u003e \u003e docker push localhost:5001/europe-docker.pkg.dev/gardener-project/public/gardener/dependency-watchdog-prober:\u003cTAGNAME\u003e Update ManagedResource Garden resource manager will revert any changes that are made to the kubernetes deployment for dependency watchdog. This is quite useful in live landscapes where only tested and qualified images are used for all gardener managed components. 
Any change is therefore automatically reverted.\nHowever, during development and testing you will need to use custom docker images. To prevent garden resource manager from reverting the changes made to the kubernetes deployment for dependency watchdog components you must update the respective managed resources first.\n# List the managed resources \u003e kubectl get mr -n garden | grep dependency # Sample response dependency-watchdog-weeder seed True True False 26h dependency-watchdog-prober seed True True False 26h # Let's assume that you are currently testing prober and would like to use a custom docker image \u003e kubectl edit mr dependency-watchdog-prober -n garden # This will open the resource YAML for editing. Add the annotation resources.gardener.cloud/ignore=true # Reference: https://github.com/gardener/gardener/blob/master/docs/concepts/resource-manager.md # Save the YAML file. When you are done with your testing, you can edit the ManagedResource again and remove the annotation. Garden resource manager will then revert to the image with which gardener was initially built and started.\nUpdate Kubernetes Deployment Find and update the kubernetes deployment for dependency watchdog.\n\u003e kubectl get deploy -n garden | grep dependency # Sample response dependency-watchdog-weeder 1/1 1 1 26h dependency-watchdog-prober 1/1 1 1 26h # Let's assume that you are currently testing prober and would like to use a custom docker image \u003e kubectl edit deploy dependency-watchdog-prober -n garden # This will open the resource YAML for editing. Change the image or make any other changes, then save. 
","categories":"","description":"","excerpt":"Dependency Watchdog with Local Garden Cluster Setting up Local Garden …","ref":"/docs/other-components/dependency-watchdog/setup/dwd-using-local-garden/","tags":"","title":"Dwd Using Local Garden"},{"body":"Overview The example shows how to run a Postgres database on Kubernetes and how to dynamically provision and mount the storage volumes needed by the database\nRun Postgres Database Define the following Kubernetes resources in a yaml file:\n PersistentVolumeClaim (PVC) Deployment PersistentVolumeClaim apiVersion: v1 kind: PersistentVolumeClaim metadata: name: postgresdb-pvc spec: accessModes: - ReadWriteOnce resources: requests: storage: 9Gi storageClassName: 'default' This defines a PVC using the storage class default. Storage classes abstract from the underlying storage provider as well as other parameters, like disk-type (e.g., solid-state vs standard disks).\nThe default storage class has the annotation {“storageclass.kubernetes.io/is-default-class”:“true”}.\n $ kubectl describe sc default Name: default IsDefaultClass: Yes Annotations: kubectl.kubernetes.io/last-applied-configuration={\"apiVersion\":\"storage.k8s.io/v1beta1\",\"kind\":\"StorageClass\",\"metadata\":{\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"true\"},\"labels\":{\"addonmanager.kubernetes.io/mode\":\"Exists\"},\"name\":\"default\",\"namespace\":\"\"},\"parameters\":{\"type\":\"gp2\"},\"provisioner\":\"kubernetes.io/aws-ebs\"} ,storageclass.kubernetes.io/is-default-class=true Provisioner: kubernetes.io/aws-ebs Parameters: type=gp2 AllowVolumeExpansion: \u003cunset\u003e MountOptions: \u003cnone\u003e ReclaimPolicy: Delete VolumeBindingMode: Immediate Events: \u003cnone\u003e A Persistent Volume is automatically created when it is dynamically provisioned. 
In the following example, the PVC is defined as “postgresdb-pvc”, and a corresponding PV “pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb” is created and associated with the PVC automatically.\n$ kubectl create -f .\\postgres_deployment.yaml persistentvolumeclaim \"postgresdb-pvc\" created $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO Delete Bound default/postgresdb-pvc default 3s $ kubectl get pvc NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE postgresdb-pvc Bound pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO default 8s Notice that the RECLAIM POLICY is Delete (the default value), which is one of the two reclaim policies; the other one is Retain. (A third policy, Recycle, has been deprecated.) In the case of Delete, the PV is deleted automatically when the PVC is removed, and the data on the volume will also be lost.\nOn the other hand, a PV with the Retain policy will not be deleted when the PVC is removed. Instead, it moves to the Released status, so that the data can be recovered by administrators later.\nYou can use the kubectl patch command to change the reclaim policy as described in Change the Reclaim Policy of a PersistentVolume or use kubectl edit pv \u003cpv-name\u003e to edit it online as shown below:\n$ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO Delete Bound default/postgresdb-pvc default 44m # change the reclaim policy from \"Delete\" to \"Retain\" $ kubectl edit pv pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb persistentvolume \"pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb\" edited # check the reclaim policy afterwards $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO Retain Bound default/postgresdb-pvc default 45m Deployment Once a PVC is created, you can use it in your container via 
volumes.persistentVolumeClaim.claimName. In the below example, the PVC postgresdb-pvc is mounted as readable and writable, and in volumeMounts two paths in the container are mounted to subfolders in the volume.\napiVersion: apps/v1 kind: Deployment metadata: name: postgres namespace: default labels: app: postgres annotations: deployment.kubernetes.io/revision: \"1\" spec: replicas: 1 strategy: type: RollingUpdate rollingUpdate: maxUnavailable: 1 maxSurge: 1 selector: matchLabels: app: postgres template: metadata: name: postgres labels: app: postgres spec: containers: - name: postgres image: \"cpettech.docker.repositories.sap.ondemand.com/jtrack_postgres:howto\" env: - name: POSTGRES_USER value: postgres - name: POSTGRES_PASSWORD value: p5FVqfuJFrM42cVX9muQXxrC3r8S9yn0zqWnFR6xCoPqxqVQ - name: POSTGRES_INITDB_XLOGDIR value: \"/var/log/postgresql/logs\" ports: - containerPort: 5432 volumeMounts: - mountPath: /var/lib/postgresql/data name: postgre-db subPath: data # https://github.com/kubernetes/website/pull/2292. Solve the issue of crashing initdb due to non-empty directory (i.e. 
lost+found) - mountPath: /var/log/postgresql/logs name: postgre-db subPath: logs volumes: - name: postgre-db persistentVolumeClaim: claimName: postgresdb-pvc readOnly: false imagePullSecrets: - name: cpettechregistry To check the mount points in the container:\n$ kubectl get po NAME READY STATUS RESTARTS AGE postgres-7f485fd768-c5jf9 1/1 Running 0 32m $ kubectl exec -it postgres-7f485fd768-c5jf9 bash root@postgres-7f485fd768-c5jf9:/# ls /var/lib/postgresql/data/ base pg_clog pg_dynshmem pg_ident.conf pg_multixact pg_replslot pg_snapshots pg_stat_tmp pg_tblspc PG_VERSION postgresql.auto.conf postmaster.opts global pg_commit_ts pg_hba.conf pg_logical pg_notify pg_serial pg_stat pg_subtrans pg_twophase pg_xlog postgresql.conf postmaster.pid root@postgres-7f485fd768-c5jf9:/# ls /var/log/postgresql/logs/ 000000010000000000000001 archive_status Deleting a PersistentVolumeClaim In case of a Delete policy, deleting a PVC will also delete its associated PV. If Retain is the reclaim policy, the PV will change status from Bound to Released when the PVC is deleted.\n# Check pvc and pv before deletion $ kubectl get pvc NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE postgresdb-pvc Bound pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO default 50m $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO Retain Bound default/postgresdb-pvc default 50m # delete pvc $ kubectl delete pvc postgresdb-pvc persistentvolumeclaim \"postgresdb-pvc\" deleted # pv changed to status \"Released\" $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO Retain Released default/postgresdb-pvc default 51m ","categories":"","description":"Running a Postgres database on Kubernetes","excerpt":"Running a Postgres database on Kubernetes","ref":"/docs/guides/applications/dynamic-pvc/","tags":"","title":"Dynamic Volume 
Provisioning"},{"body":"Gardener Extension for Networking Filter \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the shoot-networking-filter extension.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nExtension Resources Currently there is nothing to specify in the extension spec.\nExample extension resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: extension-shoot-networking-filter namespace: shoot--project--abc spec: When an extension resource is reconciled, the extension controller will create a daemonset egress-filter-applier on the shoot containing a Dockerfile container.\nPlease note, this extension controller relies on the Gardener-Resource-Manager to deploy k8s resources to seed and shoot clusters.\nHow to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nWe are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for egress filtering for shoot clusters","excerpt":"Gardener extension controller for egress filtering for shoot clusters","ref":"/docs/extensions/others/gardener-extension-shoot-networking-filter/","tags":"","title":"Egress filtering"},{"body":"Using IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster Motivation IPv6 adoption is continuously growing, already overtaking IPv4 in certain regions, e.g. India, or scenarios, e.g. mobile. Even though most IPv6 installations deploy means to reach IPv4, it might still be beneficial to expose services natively via IPv4 and IPv6 instead of just relying on IPv4.\nDisadvantages of full IPv4/IPv6 (dual-stack) Deployments Enabling full IPv4/IPv6 (dual-stack) support in a Kubernetes cluster is a major endeavor. It requires a lot of changes and restarts of all pods so that all pods get addresses for both IP families. A side effect of dual-stack networking is that failures may be hidden, as network traffic may take the other protocol to reach the target. For this reason, and also due to reduced operational complexity, service teams might lean towards staying in a single-stack environment as much as possible. Luckily, this is possible with Gardener and IPv4/IPv6 (dual-stack) ingress on AWS.\nSimplifying IPv4/IPv6 (dual-stack) Ingress with Protocol Translation on AWS Fortunately, the network load balancer on AWS supports automatic protocol translation, i.e. it can expose both IPv4 and IPv6 endpoints while communicating with just one protocol to the backends. Under the hood, automatic protocol translation takes place. 
Client IP address preservation can be achieved by using proxy protocol.\nThis approach enables users to expose IPv4 workload to IPv6-only clients without having to change the workload/service. Without requiring invasive changes, it allows a fairly simple first step into the IPv6 world for services just requiring ingress (incoming) communication.\nNecessary Shoot Cluster Configuration Changes for IPv4/IPv6 (dual-stack) Ingress To be able to utilize IPv4/IPv6 (dual-stack) Ingress in an IPv4 shoot cluster, the cluster needs to meet two preconditions:\n dualStack.enabled needs to be set to true to configure VPC/subnet for IPv6 and add a routing rule for IPv6. (This does not add IPv6 addresses to kubernetes nodes.) loadBalancerController.enabled needs to be set to true as well to use the load balancer controller, which supports dual-stack ingress. apiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig dualStack: enabled: true controlPlaneConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerController: enabled: true ... 
When infrastructureConfig.networks.vpc.id is set to the ID of an existing VPC, please make sure that your VPC has an Amazon-provided IPv6 CIDR block added.\nAfter adapting the shoot specification and reconciling the cluster, dual-stack load balancers can be created using Kubernetes Service objects.\nCreating an IPv4/IPv6 (dual-stack) Ingress With the preconditions set, creating an IPv4/IPv6 load balancer is as easy as adding the correct annotations to a Service:\napiVersion: v1 kind: Service metadata: annotations: service.beta.kubernetes.io/aws-load-balancer-ip-address-type: dualstack service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance service.beta.kubernetes.io/aws-load-balancer-type: external name: ... namespace: ... spec: ... type: LoadBalancer In case the client IP address should be preserved, the following annotation can be used to enable proxy protocol. (The pod receiving the traffic needs to be configured for proxy protocol as well.)\n service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: \"*\" Please note that changing an existing Service to dual-stack may cause the creation of a new load balancer without deletion of the old AWS load balancer resource. While this helps in a seamless migration by not cutting existing connections, it may lead to wasted/forgotten resources. Therefore, the (manual) cleanup needs to be taken into account when migrating an existing Service instance.\nFor more details see AWS Load Balancer Documentation - Network Load Balancer.\nDNS Considerations to Prevent Downtime During a Dual-Stack Migration In case the migration of an existing service is desired, please check if there are DNS entries directly linked to the corresponding load balancer. The migrated load balancer will have a new domain name immediately, which will not be ready in the beginning. 
Therefore, a direct migration of the domain name entries is not desired as it may cause a short downtime, i.e. domain name entries without backing IP addresses.\nIf there are DNS entries directly linked to the corresponding load balancer and they are managed by the shoot-dns-service, you can identify this via annotations with the prefix dns.gardener.cloud/. Those annotations can be attached to Service, Ingress, or Gateway resources. Alternatively, they may also use DNSEntry or DNSAnnotation resources.\nFor a seamless migration without downtime, use the following three-step approach:\n Temporarily prevent direct DNS updates Migrate the load balancer and wait until it is operational Allow DNS updates again To prevent direct updates of the DNS entries when the load balancer is migrated, add the annotation dns.gardener.cloud/ignore: 'true' to all affected resources next to the other dns.gardener.cloud/... annotations before starting the migration. For example, in case of a Service, ensure that the Service looks like the following:\nkind: Service metadata: annotations: dns.gardener.cloud/ignore: 'true' dns.gardener.cloud/class: garden dns.gardener.cloud/dnsnames: '...' ... Next, migrate the load balancer to be dual-stack enabled by adding/changing the corresponding annotations.\nThere are multiple ways to check that the load balancer has been provisioned successfully. It might be useful to peek into status.loadBalancer.ingress of the corresponding Service to identify the load balancer:\n Check in the AWS console for the corresponding load balancer provisioning state Perform domain name lookups with nslookup/dig to check whether the name resolves to an IP address. Call your workload via the new load balancer, e.g. using curl --resolve \u003cmy-domain-name\u003e:\u003cport\u003e:\u003cIP-address\u003e https://\u003cmy-domain-name\u003e:\u003cport\u003e, which allows you to call your service with the “correct” domain name without using actual name resolution. 
Wait a fixed period of time as load balancer creation is usually finished within 15 minutes Once the load balancer has been provisioned, you can remove the annotation dns.gardener.cloud/ignore: 'true' again from the affected resources. It may take some additional time until the domain name change finally propagates (up to one hour).\n","categories":"","description":"Use IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster on AWS","excerpt":"Use IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster on …","ref":"/docs/guides/networking/dual-stack-ipv4-ipv6-ingress-aws/","tags":"","title":"Enable IPv4/IPv6 (dual-stack) Ingress on AWS"},{"body":"etcd - Key-Value Store for Kubernetes etcd is a strongly consistent key-value store and the most prevalent choice for the Kubernetes persistence layer. All API cluster objects like Pods, Deployments, Secrets, etc., are stored in etcd, which makes it an essential part of a Kubernetes control plane.\nGarden or Shoot Cluster Persistence Each garden or shoot cluster gets its very own persistence for the control plane. It runs in the shoot namespace on the respective seed cluster (or in the garden namespace in the garden cluster, respectively). Concretely, there are two etcd instances per shoot cluster, which the kube-apiserver is configured to use in the following way:\n etcd-main A store that contains all “cluster critical” or “long-term” objects. These object kinds are typically considered for a backup to prevent any data loss.\n etcd-events A store that contains all Event objects (events.k8s.io) of a cluster. Events usually have a short retention period and occur frequently, but are not essential for disaster recovery.\nThe setup above ensures both that the critical etcd-main is not flooded by Kubernetes Events and that backup space is not occupied by non-critical data. 
This separation saves time and resources.\netcd Operator Configuring, maintaining, and health-checking etcd is outsourced to a dedicated operator called etcd Druid. When a gardenlet reconciles a Shoot resource or a gardener-operator reconciles a Garden resource, they manage an Etcd resource in the seed or garden cluster, containing necessary information (backup information, defragmentation schedule, resources, etc.). etcd-druid needs to manage the lifecycle of the desired etcd instance (today main or events). Likewise, when the Shoot or Garden is deleted, gardenlet or gardener-operator deletes the Etcd resources and etcd Druid takes care of cleaning up all related objects, e.g. the backing StatefulSets.\nBackup If Seeds specify backups for etcd (example), then Gardener and the respective provider extensions are responsible for creating a bucket on the cloud provider’s side (modelled through a BackupBucket resource). The bucket stores backups of Shoots scheduled on that Seed. Furthermore, Gardener creates a BackupEntry, which subdivides the bucket and thus makes it possible to store backups of multiple shoot clusters.\nHow long backups are stored in the bucket after a shoot has been deleted depends on the configured retention period in the Seed resource. Please see this example configuration for more information.\nFor Gardens specifying backups for etcd (example), the bucket must be pre-created externally and provided via the Garden specification.\nBoth etcd instances are configured to run with a special backup-restore sidecar. It takes care of regularly backing up etcd data and restoring it in case of data loss (in the main etcd only). The sidecar also performs defragmentation and other housekeeping tasks. More information can be found in the component’s GitHub repository.\nHousekeeping etcd maintenance tasks must be performed from time to time in order to regain database storage and to ensure the system’s reliability. 
The backup-restore sidecar takes care of this job as well.\nFor both Shoots and Gardens, a random time within the shoot’s maintenance time is chosen for scheduling these tasks.\n","categories":"","description":"How Gardener uses the etcd key-value store","excerpt":"How Gardener uses the etcd key-value store","ref":"/docs/gardener/concepts/etcd/","tags":"","title":"etcd"},{"body":"etcd-druid \n \netcd-druid is an etcd operator which makes it easy to configure, provision, reconcile and monitor etcd clusters. It enables management of an etcd cluster through a declarative Kubernetes API model.\nIn every etcd cluster managed by etcd-druid, each etcd member is a two-container Pod which consists of:\n etcd-wrapper which manages the lifecycle (validation \u0026 initialization) of an etcd. etcd-backup-restore sidecar which currently provides the following capabilities (the list is not comprehensive): etcd DB validation. Scheduled etcd DB defragmentation. Backup - etcd DB snapshots are taken regularly and backed up in an object store if one is configured. Restoration - In case of a DB corruption for a single-member cluster it helps in restoring from the latest set of snapshots (full \u0026 delta). Member control operations. etcd-druid additionally provides the following capabilities:\n Facilitates declarative scale-out of etcd clusters.\n Provides protection against accidental deletion/mutation of resources provisioned as part of an etcd cluster.\n Offers an asynchronous and threshold-based capability to process backed up snapshots to:\n Potentially minimize the recovery time by leveraging restoration from backups followed by etcd’s compaction and defragmentation. Indirectly assert integrity of the backed up snapshots. 
Allows seamless copy of backups between any two object store buckets.\n Start using or developing etcd-druid locally If you are looking to try out druid then you can use a Kind cluster based setup.\nhttps://github.com/user-attachments/assets/cfe0d891-f709-4d7f-b975-4300c6de67e4\nFor detailed documentation, see our /docs folder. Please find the index here.\nContributions If you wish to contribute then please see our guidelines.\nFeedback and Support We always look forward to active community engagement. Please report bugs or suggestions on how we can enhance etcd-druid on GitHub Issues.\nLicense Released under the Apache-2.0 license.\n","categories":"","description":"A druid for etcd management in Gardener","excerpt":"A druid for etcd management in Gardener","ref":"/docs/other-components/etcd-druid/","tags":"","title":"Etcd Druid"},{"body":"Documentation Index Concepts Controllers Webhooks Development Testing(Unit, Integration and E2E Tests) etcd Network Latency Getting started locally using azurite emulator Getting started locally using localstack emulator Getting started locally Local End-To-End Tests Deployment etcd-druid CLI Flags Feature Gates Operations Metrics Recovery from Permanent Quorum Loss in etcd cluster Restoring single member in a Multi-Node etcd cluster Proposals DEP: Template DEP-1: Multi-Node etcd clusters DEP-2: Snapshot compaction DEP-3: Scaling up an Etcd cluster DEP-4: Etcd Member custom resource DEP-5: Etcd Operator Tasks Usage Supported K8S versions ","categories":"","description":"","excerpt":"Documentation Index Concepts Controllers Webhooks Development …","ref":"/docs/other-components/etcd-druid/readme/","tags":"","title":"Etcd Druid"},{"body":"ETCD Encryption Config The spec.kubernetes.kubeAPIServer.encryptionConfig field in the Shoot API allows users to customize encryption configurations for the API server. 
It provides options to specify additional resources for encryption beyond secrets.\nUsage Guidelines The resources field can be used to specify resources that should be encrypted in addition to secrets. Secrets are always encrypted. Each item is a Kubernetes resource name in plural (resource or resource.group). Wildcards are not supported. Adding an item to this list will cause patch requests for all the resources of that kind to encrypt them in etcd. See Encrypting Confidential Data at Rest for more details. Removing an item from this list will cause patch requests for all the resources of that type to decrypt and rewrite the resource as plain text. See Decrypt Confidential Data that is Already Encrypted at Rest for more details. ℹ️ Note that configuring encryption for a custom resource is only supported for Kubernetes versions \u003e= 1.26.\n Example Usage in a Shoot spec: kubernetes: kubeAPIServer: encryptionConfig: resources: - configmaps - statefulsets.apps - customresource.fancyoperator.io ","categories":"","description":"Specifying resource types for encryption with `spec.kubernetes.kubeAPIServer.encryptionConfig`","excerpt":"Specifying resource types for encryption with …","ref":"/docs/gardener/etcd_encryption_config/","tags":"","title":"ETCD Encryption Config"},{"body":"Network Latency analysis: sn-etcd-sz vs mn-etcd-sz vs mn-etcd-mz This page captures the etcd cluster latency analysis for the scenarios below using the benchmark tool (built from the etcd benchmark tool).\nsn-etcd-sz -\u003e single-node etcd single zone (Only a single replica of etcd will be running)\nmn-etcd-sz -\u003e multi-node etcd single zone (Multiple replicas of etcd pods will be running across nodes in a single zone)\nmn-etcd-mz -\u003e multi-node etcd multi zone (Multiple replicas of etcd pods will be running across nodes in multiple zones)\nPUT Analysis Summary sn-etcd-sz latency is ~20% lower than mn-etcd-sz when the benchmark tool runs with a single client. 
mn-etcd-sz latency is lower than mn-etcd-mz, but the difference is within ~+/-5%. Compared to mn-etcd-sz, sn-etcd-sz latency is higher and gradually grows with more clients and larger value size. Compared to mn-etcd-mz, mn-etcd-sz latency is higher and gradually grows with more clients and larger value size. Compared to the follower, leader latency is lower in all cases when the benchmark tool runs with a single client. Compared to the follower, leader latency is higher in all cases when the benchmark tool runs with multiple clients. Sample commands:\n# write to leader benchmark put --target-leader --conns=1 --clients=1 --precise \\ --sequential-keys --key-starts 0 --val-size=256 --total=10000 \\ --endpoints=$ETCD_HOST # write to follower benchmark put --conns=1 --clients=1 --precise \\ --sequential-keys --key-starts 0 --val-size=256 --total=10000 \\ --endpoints=$ETCD_FOLLOWER_HOST Latency analysis during PUT requests to etcd In this case, the benchmark tool puts keys with random 256-byte values. The benchmark tool loads key/value pairs to the leader with a single client.\n sn-etcd-sz latency (~0.815ms) is ~50% lower than mn-etcd-sz (~1.74ms). mn-etcd-sz latency (~1.74ms) is slightly lower than mn-etcd-mz (~1.8ms), but the difference is negligible (within the same ms). Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 leader 1220.0520 0.815ms eu-west-1c etcd-main-0 sn-etcd-sz 10000 256 1 1 leader 586.545 1.74ms eu-west-1a etcd-main-1 mn-etcd-sz 10000 256 1 1 leader 554.0155654442634 1.8ms eu-west-1a etcd-main-1 mn-etcd-mz The benchmark tool loads key/value pairs to a follower with a single client.\n mn-etcd-sz latency (~2.2ms) is 20% to 30% lower than mn-etcd-mz (~2.7ms). Compared to the follower, the leader has lower latency. 
Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 follower-1 445.743 2.23ms eu-west-1a etcd-main-0 mn-etcd-sz 10000 256 1 1 follower-1 378.9366747610789 2.63ms eu-west-1c etcd-main-0 mn-etcd-mz Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 follower-2 457.967 2.17ms eu-west-1a etcd-main-2 mn-etcd-sz 10000 256 1 1 follower-2 345.6586129825796 2.89ms eu-west-1b etcd-main-2 mn-etcd-mz The benchmark tool loads key/value pairs to the leader with multiple clients.\n sn-etcd-sz latency (~78.3ms) is ~10% higher than mn-etcd-sz (~71.81ms). mn-etcd-sz latency (~71.81ms) is lower than mn-etcd-mz (~72.5ms), but the difference is negligible. Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 leader 12638.905 78.32ms eu-west-1c etcd-main-0 sn-etcd-sz 100000 256 100 1000 leader 13789.248 71.81ms eu-west-1a etcd-main-1 mn-etcd-sz 100000 256 100 1000 leader 13728.446436395223 72.5ms eu-west-1a etcd-main-1 mn-etcd-mz The benchmark tool loads key/value pairs to a follower with multiple clients.\n mn-etcd-sz latency (~69.8ms) is ~5% lower than mn-etcd-mz (~72.6ms). Compared to the leader, the follower has lower latency. 
Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 follower-1 14271.983 69.80ms eu-west-1a etcd-main-0 mn-etcd-sz 100000 256 100 1000 follower-1 13695.98 72.62ms eu-west-1a etcd-main-1 mn-etcd-mz Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 follower-2 14325.436 69.47ms eu-west-1a etcd-main-2 mn-etcd-sz 100000 256 100 1000 follower-2 15750.409490407475 63.3ms eu-west-1b etcd-main-2 mn-etcd-mz In this case, the benchmark tool puts keys with random 1 MB values. The benchmark tool loads key/value pairs to the leader with a single client.\n sn-etcd-sz latency (~16.35ms) is ~20% lower than mn-etcd-sz (~20.64ms). mn-etcd-sz latency (~20.64ms) is lower than mn-etcd-mz (~21.08ms), but the difference is negligible. Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 leader 61.117 16.35ms eu-west-1c etcd-main-0 sn-etcd-sz 1000 1000000 1 1 leader 48.416 20.64ms eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 1 1 leader 45.7517341664802 21.08ms eu-west-1a etcd-main-1 mn-etcd-mz The benchmark tool loads key/value pairs to a follower with a single client.\n mn-etcd-sz latency (~23.10ms) is ~10% higher than mn-etcd-mz (~21.8ms). Compared to the follower, the leader has lower latency. 
Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 follower-1 43.261 23.10ms eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 1 1 follower-1 45.7517341664802 21.8ms eu-west-1c etcd-main-0 mn-etcd-mz 1000 1000000 1 1 follower-1 45.33 22.05ms eu-west-1c etcd-main-0 mn-etcd-mz Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 follower-2 40.0518 24.95ms eu-west-1a etcd-main-2 mn-etcd-sz 1000 1000000 1 1 follower-2 43.28573155709838 23.09ms eu-west-1b etcd-main-2 mn-etcd-mz 1000 1000000 1 1 follower-2 45.92 21.76ms eu-west-1a etcd-main-1 mn-etcd-mz 1000 1000000 1 1 follower-2 35.5705 28.1ms eu-west-1b etcd-main-2 mn-etcd-mz The benchmark tool loads key/value pairs to the leader with multiple clients.\n sn-etcd-sz latency (~6.0375secs) is ~30% higher than mn-etcd-sz (~4.000secs). mn-etcd-sz latency (~4.000secs) is lower than mn-etcd-mz (~4.09secs), but the difference is negligible. Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 300 leader 55.373 6.0375secs eu-west-1c etcd-main-0 sn-etcd-sz 1000 1000000 100 300 leader 67.319 4.000secs eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 100 300 leader 65.91914167957594 4.09secs eu-west-1a etcd-main-1 mn-etcd-mz The benchmark tool loads key/value pairs to a follower with multiple clients.\n mn-etcd-sz latency (~4.04secs) is ~5% higher than mn-etcd-mz (~3.90secs). Compared to the leader, the follower has lower latency. 
Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 300 follower-1 66.528 4.0417secs eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 100 300 follower-1 70.6493461856332 3.90secs eu-west-1c etcd-main-0 mn-etcd-mz 1000 1000000 100 300 follower-1 71.95 3.84secs eu-west-1c etcd-main-0 mn-etcd-mz Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 300 follower-2 66.447 4.0164secs eu-west-1a etcd-main-2 mn-etcd-sz 1000 1000000 100 300 follower-2 67.53038086369484 3.87secs eu-west-1b etcd-main-2 mn-etcd-mz 1000 1000000 100 300 follower-2 68.46 3.92secs eu-west-1a etcd-main-1 mn-etcd-mz Range Analysis Sample commands are:\n# Single connection read request with sequential keys benchmark range 0 --target-leader --conns=1 --clients=1 --precise \\ --sequential-keys --key-starts 0 --total=10000 \\ --consistency=l \\ --endpoints=$ETCD_HOST # --consistency=s [Serializable] benchmark range 0 --target-leader --conns=1 --clients=1 --precise \\ --sequential-keys --key-starts 0 --total=10000 \\ --consistency=s \\ --endpoints=$ETCD_HOST # Each read request with range query matches key 0 9999 and repeats for total number of requests. 
benchmark range 0 9999 --target-leader --conns=1 --clients=1 --precise \\ --total=10 \\ --consistency=s \\ --endpoints=https://etcd-main-client:2379 # Read requests with multiple connections benchmark range 0 --target-leader --conns=100 --clients=1000 --precise \\ --sequential-keys --key-starts 0 --total=100000 \\ --consistency=l \\ --endpoints=$ETCD_HOST benchmark range 0 --target-leader --conns=100 --clients=1000 --precise \\ --sequential-keys --key-starts 0 --total=100000 \\ --consistency=s \\ --endpoints=$ETCD_HOST Latency analysis during Range requests to etcd In this case, the benchmark tool gets a specific key with a random 256-byte value. The benchmark tool sends range requests to the leader with a single client.\n sn-etcd-sz latency (~1.24ms) is ~40% higher than mn-etcd-sz (~0.67ms).\n mn-etcd-sz latency (~0.67ms) is ~20% lower than mn-etcd-mz (~0.85ms).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 true l leader 800.272 1.24ms eu-west-1c etcd-main-0 sn-etcd-sz 10000 256 1 1 true l leader 1173.9081 0.67ms eu-west-1a etcd-main-1 mn-etcd-sz 10000 256 1 1 true l leader 999.3020189178693 0.85ms eu-west-1a etcd-main-1 mn-etcd-mz Compared to Linearizable consistency, Serializable latency is ~40% lower in all cases.\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 true s leader 1411.229 0.70ms eu-west-1c etcd-main-0 sn-etcd-sz 10000 256 1 1 true s leader 2033.131 0.35ms eu-west-1a etcd-main-1 mn-etcd-sz 10000 256 1 1 true s leader 2100.2426362012025 0.47ms eu-west-1a etcd-main-1 mn-etcd-mz The benchmark tool sends range requests to a follower with a single client.\n mn-etcd-sz latency (~1.3ms) is ~20% lower than mn-etcd-mz (~1.6ms). 
Compared to the follower, leader read request latency is ~50% lower for both mn-etcd-sz and mn-etcd-mz. Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 true l follower-1 765.325 1.3ms eu-west-1a etcd-main-0 mn-etcd-sz 10000 256 1 1 true l follower-1 596.1 1.6ms eu-west-1c etcd-main-0 mn-etcd-mz Compared to Linearizable consistency, Serializable latency is ~50% lower in all cases. Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 true s follower-1 1823.631 0.54ms eu-west-1a etcd-main-0 mn-etcd-sz 10000 256 1 1 true s follower-1 1442.6 0.69ms eu-west-1c etcd-main-0 mn-etcd-mz 10000 256 1 1 true s follower-1 1416.39 0.70ms eu-west-1c etcd-main-0 mn-etcd-mz 10000 256 1 1 true s follower-1 2077.449 0.47ms eu-west-1a etcd-main-1 mn-etcd-mz The benchmark tool sends range requests to the leader with multiple clients.\n sn-etcd-sz latency (~84.66ms) is ~20% higher than mn-etcd-sz (~73.95ms).\n mn-etcd-sz latency (~73.95ms) is more or less equal to mn-etcd-mz (~73.8ms).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 true l leader 11775.721 84.66ms eu-west-1c etcd-main-0 sn-etcd-sz 100000 256 100 1000 true l leader 13446.9598 73.95ms eu-west-1a etcd-main-1 mn-etcd-sz 100000 256 100 1000 true l leader 13527.19810605353 73.8ms eu-west-1a etcd-main-1 mn-etcd-mz Compared to Linearizable consistency, Serializable latency is ~20% lower in all cases.\n sn-etcd-sz latency (~69.37ms) is more or less equal to mn-etcd-sz (~69.89ms).\n mn-etcd-sz latency (~69.89ms) is slightly higher than mn-etcd-mz (~67.63ms).\n Number of requests Value size Number of connections 
Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 true s leader 14334.9027 69.37ms eu-west-1c etcd-main-0 sn-etcd-sz 100000 256 100 1000 true s leader 14270.008 69.89ms eu-west-1a etcd-main-1 mn-etcd-sz 100000 256 100 1000 true s leader 14715.287354023869 67.63ms eu-west-1a etcd-main-1 mn-etcd-mz The benchmark tool sends range requests to a follower with multiple clients.\n mn-etcd-sz latency (~60.69ms) is ~20% lower than mn-etcd-mz (~70.76ms).\n Compared to the leader, the follower has lower read request latency.\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 true l follower-1 11586.032 60.69ms eu-west-1a etcd-main-0 mn-etcd-sz 100000 256 100 1000 true l follower-1 14050.5 70.76ms eu-west-1c etcd-main-0 mn-etcd-mz mn-etcd-sz latency (~86.09ms) is ~20% higher than mn-etcd-mz (~64.6ms).\n Compared to Linearizable consistency, Serializable latency is ~20% higher for mn-etcd-sz. Compared to Linearizable consistency, Serializable latency is slightly lower for mn-etcd-mz.\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 true s follower-1 11582.438 86.09ms eu-west-1a etcd-main-0 mn-etcd-sz 100000 256 100 1000 true s follower-1 15422.2 64.6ms eu-west-1c etcd-main-0 mn-etcd-mz The benchmark tool sends range requests for all keys to the leader.\n sn-etcd-sz latency (~678.77ms) is slightly (~5%) lower than mn-etcd-sz (~697.29ms).\n mn-etcd-sz latency (~697.29ms) is lower than mn-etcd-mz (~701ms), but the difference is negligible.\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 
256 2 5 false l leader 6.8875 678.77ms eu-west-1c etcd-main-0 sn-etcd-sz 20 256 2 5 false l leader 6.720 697.29ms eu-west-1a etcd-main-1 mn-etcd-sz 20 256 2 5 false l leader 6.7 701ms eu-west-1a etcd-main-1 mn-etcd-mz Compare to consistency Linearizable, Serializable is ~5% slightly higher for all cases sn-etcd-sz latency(~687.36ms) is less than mn-etcd-sz(~692.68ms) but the difference is negligible.\n mn-etcd-sz latency(~692.68ms) is ~5% slightly lesser than mn-etcd-mz(~735.7ms).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 256 2 5 false s leader 6.76 687.36ms eu-west-1c etcd-main-0 sn-etcd-sz 20 256 2 5 false s leader 6.635 692.68ms eu-west-1a etcd-main-1 mn-etcd-sz 20 256 2 5 false s leader 6.3 735.7ms eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to follower all keys\n mn-etcd-sz(~737.68ms) latency is ~5% slightly higher than mn-etcd-mz(~713.7ms).\n Compare to leader consistency Linearizableread request, follower is ~5% slightly higher.\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 256 2 5 false l follower-1 6.163 737.68ms eu-west-1a etcd-main-0 mn-etcd-sz 20 256 2 5 false l follower-1 6.52 713.7ms eu-west-1c etcd-main-0 mn-etcd-mz mn-etcd-sz latency(~757.73ms) is ~10% higher than mn-etcd-mz(~690.4ms).\n Compare to follower consistency Linearizableread request, follower consistency Serializable is ~3% slightly higher for mn-etcd-sz.\n Compare to follower consistency Linearizableread request, follower consistency Serializable is ~5% less for mn-etcd-mz.\n *Compare to leader consistency Serializableread request, follower consistency Serializable is ~5% less for mn-etcd-mz. 
*\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 256 2 5 false s follower-1 6.0295 757.73ms eu-west-1a etcd-main-0 mn-etcd-sz 20 256 2 5 false s follower-1 6.87 690.4ms eu-west-1c etcd-main-0 mn-etcd-mz In this case benchmark tool tries to get specific key with random `1MB` value. Benchmark tool range requests to leader with single client.\n sn-etcd-sz latency(~5.96ms) is ~5% lesser than mn-etcd-sz(~6.28ms).\n mn-etcd-sz latency(~6.28ms) is ~10% higher than mn-etcd-mz(~5.3ms).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 true l leader 167.381 5.96ms eu-west-1c etcd-main-0 sn-etcd-sz 1000 1000000 1 1 true l leader 158.822 6.28ms eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 1 1 true l leader 187.94 5.3ms eu-west-1a etcd-main-1 mn-etcd-mz Compare to consistency Linearizable, Serializable is ~15% less for sn-etcd-sz, mn-etcd-sz, mn-etcd-mz\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 true s leader 184.95 5.398ms eu-west-1c etcd-main-0 sn-etcd-sz 1000 1000000 1 1 true s leader 176.901 5.64ms eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 1 1 true s leader 209.99 4.7ms eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to follower with single client.\n mn-etcd-sz latency(~6.66ms) is ~10% higher than mn-etcd-mz(~6.16ms).\n Compare to leader, follower read request latency is ~10% high for mn-etcd-sz\n Compare to leader, follower read request latency is ~20% high for mn-etcd-mz\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd 
server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 true l follower-1 150.680 6.66ms eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 1 1 true l follower-1 162.072 6.16ms eu-west-1c etcd-main-0 mn-etcd-mz Compare to consistency Linearizable, Serializable is ~15% less for mn-etcd-sz(~5.84ms), mn-etcd-mz(~5.01ms).\n Compare to leader, follower read request latency is ~5% slightly high for mn-etcd-sz, mn-etcd-mz\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 true s follower-1 170.918 5.84ms eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 1 1 true s follower-1 199.01 5.01ms eu-west-1c etcd-main-0 mn-etcd-mz Benchmark tool range requests to leader with multiple clients.\n sn-etcd-sz latency(~1.593secs) is ~20% lesser than mn-etcd-sz(~1.974secs).\n mn-etcd-sz latency(~1.974secs) is ~5% greater than mn-etcd-mz(~1.81secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true l leader 252.149 1.593secs eu-west-1c etcd-main-0 sn-etcd-sz 1000 1000000 100 500 true l leader 205.589 1.974secs eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 100 500 true l leader 230.42 1.81secs eu-west-1a etcd-main-1 mn-etcd-mz Compare to consistency Linearizable, Serializable is more or less same for sn-etcd-sz(~1.57961secs), mn-etcd-mz(~1.8secs) not a big difference\n Compare to consistency Linearizable, Serializable is ~10% high for mn-etcd-sz(~ 2.277secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true s leader 252.406 1.57961secs eu-west-1c etcd-main-0 sn-etcd-sz 
1000 1000000 100 500 true s leader 181.905 2.277secs eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 100 500 true s leader 227.64 1.8secs eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to follower with multiple clients.\n mn-etcd-sz latency is ~20% less than mn-etcd-mz.\n Compare to leader consistency Linearizable, follower read request latency is ~15% less for mn-etcd-sz(~1.694secs).\n Compare to leader consistency Linearizable, follower read request latency is ~10% higher for mn-etcd-mz(~1.977secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true l follower-1 248.489 1.694secs eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 100 500 true l follower-1 210.22 1.977secs eu-west-1c etcd-main-0 mn-etcd-mz Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true l follower-2 205.765 1.967secs eu-west-1a etcd-main-2 mn-etcd-sz 1000 1000000 100 500 true l follower-2 195.2 2.159secs eu-west-1b etcd-main-2 mn-etcd-mz Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true s follower-1 231.458 1.7413secs eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 100 500 true s follower-1 214.80 1.907secs eu-west-1c etcd-main-0 mn-etcd-mz Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true s follower-2 183.320 2.2810secs eu-west-1a etcd-main-2 mn-etcd-sz 1000 1000000 100 500 true s follower-2 195.40 2.164secs eu-west-1b etcd-main-2 
mn-etcd-mz Benchmark tool range requests to leader all keys.\n sn-etcd-sz latency(~8.993secs) is ~3% slightly lower than mn-etcd-sz(~9.236secs).\n mn-etcd-sz latency(~9.236secs) is ~2% slightly lower than mn-etcd-mz(~9.100secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 1000000 2 5 false l leader 0.5139 8.993secs eu-west-1c etcd-main-0 sn-etcd-sz 20 1000000 2 5 false l leader 0.506 9.236secs eu-west-1a etcd-main-1 mn-etcd-sz 20 1000000 2 5 false l leader 0.508 9.100secs eu-west-1a etcd-main-1 mn-etcd-mz Compare to consistency Linearizableread request, follower for sn-etcd-sz(~9.secs) is a slight difference 10ms.\n Compare to consistency Linearizableread request, follower for mn-etcd-sz(~9.113secs) is ~1% less, not a big difference.\n Compare to consistency Linearizableread request, follower for mn-etcd-mz(~8.799secs) is ~3% less, not a big difference.\n sn-etcd-sz latency(~9.secs) is ~1% slightly less than mn-etcd-sz(~9.113secs).\n mn-etcd-sz latency(~9.113secs) is ~3% slightly higher than mn-etcd-mz(~8.799secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 1000000 2 5 false s leader 0.51125 9.0003secs eu-west-1c etcd-main-0 sn-etcd-sz 20 1000000 2 5 false s leader 0.4993 9.113secs eu-west-1a etcd-main-1 mn-etcd-sz 20 1000000 2 5 false s leader 0.522 8.799secs eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to follower all keys\n mn-etcd-sz latency(~9.065secs) is ~1% slightly higher than mn-etcd-mz(~9.007secs).\n Compare to leader consistency Linearizableread request, follower is ~1% slightly higher for both cases mn-etcd-sz, mn-etcd-mz .\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd 
server Average write QPS Average latency per request zone server name Test name 20 1000000 2 5 false l follower-1 0.512 9.065secs eu-west-1a etcd-main-0 mn-etcd-sz 20 1000000 2 5 false l follower-1 0.533 9.007secs eu-west-1c etcd-main-0 mn-etcd-mz Compare to consistency Linearizableread request, follower for mn-etcd-sz(~9.553secs) is ~5% high.\n Compare to consistency Linearizableread request, follower for mn-etcd-mz(~7.7433secs) is ~15% less.\n mn-etcd-sz(~9.553secs) latency is ~20% higher than mn-etcd-mz(~7.7433secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 1000000 2 5 false s follower-1 0.4743 9.553secs eu-west-1a etcd-main-0 mn-etcd-sz 20 1000000 2 5 false s follower-1 0.5500 7.7433secs eu-west-1c etcd-main-0 mn-etcd-mz NOTE: This Network latency analysis is inspired by etcd performance.\n ","categories":"","description":"","excerpt":"Network Latency analysis: sn-etcd-sz vs mn-etcd-sz vs mn-etcd-mz This …","ref":"/docs/other-components/etcd-druid/etcd-network-latency/","tags":"","title":"etcd Network Latency"},{"body":"DEP-04: EtcdMember Custom Resource Table of Contents DEP-04: EtcdMember Custom Resource Table of Contents Summary Terminology Motivation Goals Non-Goals Proposal Etcd Member Metadata Etcd Member State Transitions States and Sub-States Top Level State Transitions Starting an Etcd-Member in a Single-Node Etcd Cluster Addition of a New Etcd-Member in a Multi-Node Etcd Cluster Restart of a Voting Etcd-Member in a Multi-Node Etcd Cluster Deterministic Etcd Member Creation/Restart During Scale-Up TLS Enablement for Peer Communication Monitoring Backup Health Enhanced Snapshot Compaction Enhanced Defragmentation Monitoring Defragmentations Monitoring Restorations Monitoring Volume Mismatches Custom Resource API Spec vs Status Representing State Transitions Reason Codes API EtcdMember Etcd Lifecycle of 
an EtcdMember Creation Updation Deletion Reconciliation Stale EtcdMember Status Handling Reference Summary Today, etcd-druid mainly acts as an etcd cluster provisioner, and seldom takes remediatory actions if the etcd cluster goes into an undesired state that needs to be resolved by a human operator. In other words, etcd-druid cannot perform day-2 operations on etcd clusters in its current form, and hence cannot carry out its full set of responsibilities as a true “operator” of etcd clusters. For etcd-druid to be fully capable of its responsibilities, it must know the latest state of the etcd clusters and their individual members at all times.\nThis proposal aims to bridge that gap by introducing an EtcdMember custom resource that allows individual etcd cluster members to publish information/state (previously unknown to etcd-druid). This gives etcd-druid a handle to potentially take cluster-scoped remediatory actions.\nTerminology druid: etcd-druid - an operator for etcd clusters.\n etcd-member: A single etcd pod in an etcd cluster that is realised as a StatefulSet.\n backup-sidecar: It is the etcd-backup-restore sidecar container in each etcd-member pod.\n NOTE: The term sidecar can now be confused with the latest definition in KEP-73. The etcd-backup-restore container is currently not set as an init-container as proposed in the KEP but as a regular container in a multi-container Pod.\n leading-backup-sidecar: A backup-sidecar that is associated with an etcd leader.\n restoration: It refers to an individual etcd-member restoring etcd data from an existing backup (comprising full and delta snapshots). The authors have deliberately chosen to distinguish between restoration and learning. Learning refers to a process where a learner “learns” from an etcd-cluster leader.\n Motivation Sharing state of an individual etcd-member with druid is essential for diagnostics, monitoring, cluster-wide-operations and potential remediation. 
At present, only a subset of etcd-member state is shared with druid using leases. It was always meant as a stopgap arrangement as mentioned in the corresponding issue and is not the best use of leases.\nThere is a need to have a clear distinction between an etcd-member state and etcd cluster state since most of an etcd cluster state is often derived by looking at individual etcd-member states. In addition, actors which update each of these states should be clearly identified so as to prevent multiple actors updating a single resource holding the state of either an etcd cluster or an etcd-member. As a consequence, etcd-members should not directly update the Etcd resource status and would therefore need a new custom resource allowing each member to publish detailed information about its latest state.\nGoals Introduce an EtcdMember custom resource via which each etcd-member can publish information about its state. This enables druid to deterministically orchestrate out-of-turn operations like compaction, defragmentation, volume management etc. Define and capture states, sub-states and deterministic transitions amongst states of an etcd-member. Today leases are misused to share member-specific information with druid. Their usage to share member state [leader, follower, learner], member-id, snapshot revisions etc. should be removed. Non-Goals Auto-recovery from quorum loss or cluster-split due to network partitioning. Auto-recovery of an etcd-member due to volume mismatch. Revisiting the segregation of responsibilities between etcd and backup-sidecar containers. Proposal This proposal introduces a new custom resource EtcdMember, and in the following sections describes different sets of information that should be captured as part of the new resource.\nEtcd Member Metadata Every etcd-member has a unique memberID and it is part of an etcd cluster which has a unique clusterID. In a well-formed etcd cluster every member must have the same clusterID. 
Publishing this information to druid helps in identifying issues when one or more etcd-members form their own individual clusters, thus resulting in multiple clusters where only one was expected. Issue#419, Canary#4027 and Canary#3973 are some such occurrences.\nToday, this information is published by using a member lease. Both these fields are populated in the leases’ Spec.HolderIdentity by the backup-sidecar container.\nThe authors propose to publish member metadata information in the EtcdMember resource.\nid: \u003cetcd-member id\u003e clusterID: \u003cetcd cluster id\u003e NOTE: Druid would not do any auto-recovery when it finds out that more than one cluster is being formed. Instead, this information will for now be used for diagnostics and alerting.\n Etcd Member State Transitions Each etcd-member goes through different States during its lifetime. State is a derived high-level summary of where an etcd-member is in its lifecycle. A SubState gives additional information about the state. This proposal extends the concept of states with the notion of a SubState, since State indicates a top-level state of an EtcdMember resource, which can have one or more SubStates.\nWhile State is sufficient for many human operators, the notion of a SubState provides operators with an insight about the discrete stage of an etcd-member in its lifecycle. For example, consider a top-level State: Starting, which indicates that an etcd-member is starting. Starting is meant to be a transient state for an etcd-member. 
If an etcd-member remains in this State longer than expected, then an operator would require additional insight, which the authors propose to provide via SubState (in this case, the possible SubStates could be PendingLearner and Learner, which are detailed in the following sections).\nAt present, these states are not captured and only the final state is known - i.e. the etcd-member either fails to come up (all re-attempts to bring up the pod via the StatefulSet controller have been exhausted) or it comes up. Getting an insight into all its state transitions would help in diagnostics.\nThe status of an etcd-member at any given point in time can be best categorized as a combination of a top-level State and a SubState. The authors propose to introduce the following states and sub-states:\nStates and Sub-States NOTE: Abbreviations have been used wherever possible, only to represent sub-states. These representations are chosen only for brevity and will have proper longer names.\n States Sub-States Description New - Every newly created etcd-member will start in this state, which is termed the initial state or the start state. Initializing DBV-S (DBValidationSanity) This state denotes that the backup-restore container in the etcd-member pod has started initialization. Sub-State DBV-S, which is an abbreviation for DBValidationSanity, denotes that a sanity validation of the etcd DB is currently in progress. Initializing DBV-F (DBValidationFull) This state denotes that the backup-restore container in the etcd-member pod has started initialization. Sub-State DBV-F, which is an abbreviation for DBValidationFull, denotes that a full validation of the etcd DB is currently in progress. Initializing R (Restoration) This state denotes that the backup-restore container in the etcd-member pod has started initialization. Sub-State R, which is an abbreviation for Restoration, denotes that DB validation failed and backup-restore has now commenced restoration of the etcd DB from the backup (comprising full snapshot and delta-snapshots). 
An etcd-member will transition to this sub-state only when it is part of a single-node etcd-cluster. Starting (SI) PL (PendingLearner) An etcd-member can transition from Initializing state to PendingLearner state. In this state backup-restore container will optionally delete any existing etcd data directory and then attempts to add its peer etcd-member process as a learner. Since there can be only one learner at a time in an etcd cluster, an etcd-member could be in this state for some time till its request to get added as a learner is accepted. Starting (SI) Learner When backup-restore is successfully able to add its peer etcd-member process as a Learner. In this state the etcd-member process will start its DB sync from an etcd leader. Started (Sd) Follower A follower is a voting raft member. A Learner etcd-member will get promoted to a Follower once its DB is in sync with the leader. It could also become a follower if during a re-election it loses leadership and transitions from being a Leader to Follower. Started (Sd) Leader A leader is an etcd-member which will handle all client write requests and linearizable read requests. A member could transition to being a Leader from an existing Follower role due to winning a leader election or for a single node etcd cluster it directly transitions from Initializing state to Leader state as there is no other member. In the following sub-sections, the state transitions are categorized into several flows making it easier to grasp the different transitions.\nTop Level State Transitions Following DFA represents top level state transitions (without any representation of sub-states). As described in the table above there are 4 top level states:\n New- this is a start state for all newly created etcd-members\n Initializing - In this state backup-restore will perform pre-requisite actions before it triggers the start of an etcd process. DB validation and optionally restoration is done in this state. 
Possible sub-states are: DBValidationSanity, DBValidationFull and Restoration\n Starting - Once the optional initialization is done backup-restore will trigger the start of an etcd process. It can either directly go to Learner sub-state or wait for getting added as a learner and therefore be in PendingLearner sub-state.\n Started - In this state the etcd-member is a full voting member. It can either be in Leader or Follower sub-states.\n Starting an Etcd-Member in a Single-Node Etcd Cluster Following DFA represents the states, sub-states and transitions of a single etcd-member for a cluster that is bootstrapped from cluster size of 0 -\u003e 1.\nAddition of a New Etcd-Member in a Multi-Node Etcd Cluster Following DFA represents the states, sub-states and transitions of an etcd cluster which starts with having a single member (Leader) and then one or more new members are added which represents a scale-up of an etcd cluster from 1 -\u003e n, where n is odd.\nRestart of a Voting Etcd-Member in a Multi-Node Etcd Cluster Following DFA represents the states, sub-states and transitions when a voting etcd-member in a multi-node etcd cluster restarts.\n NOTE: If the DB validation fails then data directory of the etcd-member is removed and etcd-member is removed from cluster membership, thus transitioning it to New state. The state transitions from New state are depicted by this section.\n Deterministic Etcd Member Creation/Restart During Scale-Up Bootstrap information:\nWhen an etcd-member starts, then it needs to find out:\n If it should join an existing cluster or start a new cluster.\n If it should add itself as a Learner or directly start as a voting member.\n Issue with the current approach:\nAt present, this is facilitated by three things:\n During scale-up, druid adds an annotation gardener.cloud/scaled-to-multi-node to the StatefulSet. 
Each etcd-member looks up this annotation.\n backup-sidecar attempts to fetch the etcd cluster member-list and checks if this etcd-member is already part of the cluster.\n Size of the cluster, determined by checking initial-cluster in the etcd config.\n Druid adds an annotation gardener.cloud/scaled-to-multi-node on the StatefulSet which is then shared by all etcd-members irrespective of the starting state of an etcd-member (as Learner or Voting-Member). This especially creates an issue for the current leader (often pod with index 0) during the scale-up of an etcd cluster as described in this issue.\nIt has been agreed that the current solution to this issue is a quick and dirty fix and needs to be revisited to be uniformly applied to all etcd-members. The authors propose to provide a more deterministic approach to scale-up using the EtcdMember resource.\nNew approach\nInstead of adding an annotation gardener.cloud/scaled-to-multi-node on the StatefulSet, a new annotation druid.gardener.cloud/create-as-learner should be added by druid on an EtcdMember resource. This annotation will only be added to newly created members during scale-up.\nEach etcd-member should look at the following to deterministically compute the bootstrap information specified above:\n druid.gardener.cloud/create-as-learner annotation on its respective EtcdMember resource. This new annotation will be honored in the following cases:\n When an etcd-member is created for the very first time.\n When an etcd-member is restarted while it is in Starting state (PendingLearner and Learner sub-states).\n Etcd-cluster member list, to check if it is already part of the cluster.\n Existing etcd data directory and its validity.\n NOTE: When the etcd-member gets promoted to a voting-member, then it should remove the annotation on its respective EtcdMember resource.\n TLS Enablement for Peer Communication Etcd-members in a cluster use peer URL(s) to communicate with each other. 
If the advertised peer URL(s) for an etcd-member are updated then etcd mandates a restart of the etcd-member.\nDruid only supports toggling the transport level security for the advertised peer URL(s). To indicate that the etcd process within the etcd-member has the updated advertised peer URL(s), an annotation member.etcd.gardener.cloud/tls-enabled is added by backup-sidecar container to the member lease object.\nDuring the reconciliation run for an Etcd resource in druid, if reconciler detects a change in advertised peer URL(s) TLS configuration then it will watch for the above mentioned annotation on the member lease. If the annotation has a value of false then it will trigger a restart of the etcd-member pod.\nThe authors propose to publish member metadata information in EtcdMember resource and not misuse member leases.\npeerTLSEnabled: \u003cbool\u003e Monitoring Backup Health Backup-sidecar takes delta and full snapshot both periodically and threshold based. These backed-up snapshots are essential for restoration operations for bootstrapping an etcd cluster from 0 -\u003e 1 replicas. It is essential that leading-backup-sidecar container which is responsible for taking delta/full snapshots and uploading these snapshots to the configured backup store, publishes this information for druid to consume.\nAt present, information about backed-up snapshot (only latest-revision-number) is published by leading-backup-sidecar container by updating Spec.HolderIdentity of the delta-snapshot and full-snapshot leases.\nDruid maintains conditions in the Etcd resource status, which include but are not limited to maintaining information on whether backups being taken for an etcd cluster are healthy (up-to-date) or stale (outdated in context to a configured schedule). 
Druid computes these conditions using information from full/delta snapshot leases.\nIn order to provide a holistic view of the health of backups to human operators, druid requires additional information about the snapshots that are being backed up. The authors propose to not misuse leases and instead publish the following snapshot information as part of the EtcdMember custom resource:\nsnapshots: lastFull: timestamp: \u003ctime of full snapshot\u003e name: \u003cname of the file that is uploaded\u003e size: \u003csize of the un-compressed snapshot file uploaded\u003e startRevision: \u003cstart revision of etcd db captured in the snapshot\u003e endRevision: \u003cend revision of etcd db captured in the snapshot\u003e lastDelta: timestamp: \u003ctime of delta snapshot\u003e name: \u003cname of the file that is uploaded\u003e size: \u003csize of the un-compressed snapshot file uploaded\u003e startRevision: \u003cstart revision of etcd db captured in the snapshot\u003e endRevision: \u003cend revision of etcd db captured in the snapshot\u003e While this information will primarily help druid compute accurate conditions regarding backup health from snapshot information and publish this to human operators, it could be further utilised by human operators to take remediatory actions (e.g. manually triggering a full or delta snapshot or further restarting the leader if the issue is still not resolved) if the backup is unhealthy.\nEnhanced Snapshot Compaction Druid can be configured to perform regular snapshot compactions for etcd clusters, to reduce the total number of delta snapshots to be restored if and when a DB restoration for an etcd cluster is required. Druid triggers a snapshot compaction job when the accumulated etcd events in the latest set of delta snapshots (taken after the last full snapshot) cross a specified threshold.\nAs described in Issue#591, scheduling compaction based only on the number of accumulated etcd events is not sufficient to ensure a successful compaction. 
This is particularly relevant for Kubernetes clusters where each etcd event is large, owing to large spec or status fields of the respective resources.\nDruid will now need information regarding snapshot sizes, and more importantly the total size of accumulated delta snapshots since the last full snapshot.\nThe authors propose to enhance the proposed snapshots field described in Use Case #3 with the following additional field:\nsnapshots: accumulatedDeltaSize: \u003ctotal size of delta snapshots since last full snapshot\u003e Druid can then use this information in addition to the existing revision information to decide to trigger an early snapshot compaction job. This effectively allows druid to be proactive in performing regular compactions for etcds receiving large events, reducing the probability of a failed snapshot compaction or restoration.\nEnhanced Defragmentation Readers are advised to read Etcd Compaction \u0026 Defragmentation in order to understand the following terminology:\ndbSize - total storage space used by the etcd database\ndbSizeInUse - logical storage space used by the etcd database, not accounting for free pages in the DB due to etcd history compaction\nThe leading-backup-sidecar performs periodic defragmentations of the DBs of all the etcd-members in the cluster, controlled via a defragmentation cron schedule provided to each backup-sidecar. Defragmentation is a costly maintenance operation and causes a brief downtime to the etcd-member being defragmented, due to which the leading-backup-sidecar defragments each etcd-member sequentially. 
This ensures that only one etcd-member would be unavailable at any given time, thus avoiding an accidental quorum loss in the etcd cluster.\nThe authors propose to move the responsibility of orchestrating these individual defragmentations to druid due to the following reasons:\n Since each backup-sidecar only has knowledge of the health of its own etcd, it can only determine whether its own etcd can be defragmented or not, based on etcd-member health. Trying to defragment a different healthy etcd-member while another etcd-member is unhealthy would lead to a transient quorum loss. Each backup-sidecar is only a sidecar to its own etcd-member, and by good design principles, it must not be performing any cluster-wide maintenance operations, and this responsibility should remain with the etcd cluster operator. Additionally, defragmentation of an etcd DB becomes inevitable if the DB size exceeds the specified DB space quota, since the etcd DB then becomes read-only, ie no write operations on the etcd would be possible unless the etcd DB is defragmented and storage space is freed up. In order to automate this, druid will now need information about the etcd DB size from each member, specifically the leading etcd-member, so that a cluster-wide defragmentation can be triggered if the DB size reaches a certain threshold, as already described by this issue.\nThe authors propose to enhance each etcd-member to regularly publish information about the dbSize and dbSizeInUse so that druid may trigger defragmentation for the etcd cluster.\ndbSize: \u003cdb-size\u003e # e.g 6Gi dbSizeInUse: \u003cdb-size-in-use\u003e # e.g 3.5Gi Difference between dbSize and dbSizeInUse gives a clear indication of how much storage space would be freed up if a defragmentation is performed. If the difference is not significant (based on a configurable threshold provided to druid), then no defragmentation should be performed. 
This will ensure that druid does not perform frequent defragmentations that do not yield much benefit. The aim is to maximise the benefit of each defragmentation, since this operation involves transient downtime for each etcd-member.\nMonitoring Defragmentations As discussed in the previous section, every etcd-member is defragmented periodically, and can also be defragmented based on the DB size reaching a certain threshold. It is beneficial for druid to have knowledge of this data from each etcd-member for the following reasons:\n [Diagnostics] It is expected that backup-sidecar will push relevant metrics and configure alerts on these metrics.\n [Operational] Derive status of defragmentation at etcd cluster level. In case of partial failures for a subset of etcd-members druid can potentially re-trigger defragmentation only for those etcd-members.\n The authors propose to capture this information as part of the lastDefragmentation section in the EtcdMember resource.\nlastDefragmentation: startTime: \u003cstart time of defragmentation\u003e endTime: \u003cend time of defragmentation\u003e status: \u003cSucceeded | Failed\u003e message: \u003csuccess or failure message\u003e initialDBSize: \u003csize of etcd DB prior to defragmentation\u003e finalDBSize: \u003csize of etcd DB post defragmentation\u003e NOTE: Defragmentation is a cluster-wide operation, and insights derived from aggregating defragmentation data from individual etcd-members would be captured in the Etcd resource status\n Monitoring Restorations Each etcd-member may perform restoration of data multiple times throughout its lifecycle, possibly owing to data corruptions. It would be useful to capture this information as part of an EtcdMember resource, for the following use cases:\n [Diagnostics] It is expected that backup-sidecar will push a metric indicating failure to restore.\n [Operational] Restoration from backup-bucket only happens for a single node etcd cluster. 
If restoration is failing then druid cannot take any remediatory actions since there is no etcd quorum.\n The authors propose to capture this information under the lastRestoration section in the EtcdMember resource.\nlastRestoration: status: \u003cFailed | Success | In-Progress\u003e reason: \u003creason-code for status\u003e message: \u003chuman readable message for status\u003e startTime: \u003cstart time of restoration\u003e endTime: \u003cend time of restoration\u003e Authors have considered the following cases to better understand how errors during restoration will be handled:\nCase #1 - Failure to connect to Provider Object Store\nAt present full and delta snapshots are downloaded during restoration. If there is a failure then initialization status transitions to Failed followed by New which forces etcd-wrapper to trigger the initialization again. This effectively forces a retry, and currently there is no limit on the number of attempts.\nAuthors propose to improve the retry logic but keep the overall behavior of not forcing a container restart the same.\nCase #2 - Read-Only Mounted volume\nIf a mounted volume which is used to create the etcd data directory turns read-only then authors propose to capture this state via EtcdMember.\nAuthors propose that druid should initiate recovery by deleting the PVC for this etcd-member and letting StatefulSet controller re-create the Pod and the PVC. Deleting the PVC and the pod is considered safe because:\n Data directory is present and the DB is corrupt, resulting in an unusable etcd. Data directory is not present but any attempt to create a directory structure fails due to read-only FS. In both these cases there is no side-effect of deleting the PVC and the Pod.\nCase #3 - Revision mismatch\nThere is currently an issue in backup-sidecar which results in a revision mismatch in the snapshots (full/delta) taken by the leading backup-sidecar container. This results in a restoration failure. 
One occurrence of this issue has been captured in Issue#583. This occurrence points to a bug which should be fixed; however, there is a rare possibility that these snapshots (full/delta) get corrupted. In this rare situation, backup-sidecar should only raise an alert.\nAuthors propose that druid should not take any remediatory actions as this involves:\n Inspecting snapshots If the full snapshot is corrupt then a decision needs to be taken to recover from the last full snapshot as the base snapshot. This can result in data loss and therefore needs manual intervention. If a delta snapshot is corrupt, then recovery can be done till the corrupt revision in the delta snapshot. Since this will also result in a loss of data, this decision needs to be taken by an operator. Monitoring Volume Mismatches Each etcd-member checks for possible etcd data volume mismatches, based on which it decides whether to start the etcd process or not, but this information is not captured anywhere today. It would be beneficial to capture this information as part of the EtcdMember resource so that a human operator may check this and manually fix the underlying problem with the wrong volume being attached or mounted to an etcd-member pod.\nThe authors propose to capture this information under the volumeMismatches section in the EtcdMember resource.\nvolumeMismatches: - identifiedAt: \u003ctime at which wrong volume mount was identified\u003e fixedAt: \u003ctime at which correct volume was mounted\u003e volumeID: \u003cvolume ID of wrong volume that got mounted\u003e numRestarts: \u003cnum of etcd-member restarts that were attempted\u003e Each entry under volumeMismatches will be for a unique volumeID. If there is a pod restart and it results in yet another unexpected volumeID (different from the already captured volumeIDs) then a new entry will get created. 
numRestarts denotes the number of restarts seen by the etcd-member for a specific volumeID.\nBased on information from the volumeMismatches section, druid may choose to perform rudimentary remediatory actions as simple as restarting the member pod to force a possible rescheduling of the pod to a different node which could potentially force the correct volume to be mounted to the member.\nCustom Resource API Spec vs Status Information that is captured in the etcd-member custom resource could be represented either as EtcdMember.Status or EtcdMemberState.Spec.\nGardener has a similar need to capture a shoot state and they have taken the decision to represent it via ShootState resource where the state or status of a shoot is captured as part of the Spec field in the ShootState custom resource.\nThe authors wish to instead align themselves with the K8S API conventions and choose to use EtcdMember custom resource and capture the status of each member in Status field of this resource. This has the following advantages:\n Spec represents a desired state of a resource and what is intended to be captured is the As-Is state of a resource which Status is meant to capture. 
Therefore, semantically using Status is the correct choice.\n Not mis-using Spec now to represent As-Is state provides us with a choice to extend the custom resource with any future need for a Spec a.k.a desired state.\n Representing State Transitions The authors propose to use a custom representation for states, sub-states and transitions.\nConsider the following representation:\ntransitions: - state: \u003cname of the state that the etcd-member has transitioned to\u003e subState: \u003cname of the sub-state if any\u003e reason: \u003creason code for the transition\u003e transitionTime: \u003ctime of transition to this state\u003e message: \u003cdetailed message if any\u003e As an example, consider the following transitions which represent addition of an etcd-member during scale-up of an etcd cluster, followed by a restart of the etcd-member which detects a corrupt DB:\nstatus: transitions: - state: New subState: New reason: ClusterScaledUp transitionTime: \"2023-07-17T05:00:00Z\" message: \"New member added due to etcd cluster scale-up\" - state: Starting subState: PendingLearner reason: WaitingToJoinAsLearner transitionTime: \"2023-07-17T05:00:30Z\" message: \"Waiting to join the cluster as a learner\" - state: Starting subState: Learner reason: JoinedAsLearner transitionTime: \"2023-07-17T05:01:20Z\" message: \"Joined the cluster as a learner\" - state: Started subState: Follower reason: PromotedAsVotingMember transitionTime: \"2023-07-17T05:02:00Z\" message: \"Now in sync with leader, promoted as voting member\" - state: Initializing subState: DBValidationFull reason: DetectedPreviousUncleanExit transitionTime: \"2023-07-17T08:00:00Z\" message: \"Detected previous unclean exit, requires full DB validation\" - state: New subState: New reason: DBCorruptionDetected transitionTime: \"2023-07-17T08:01:30Z\" message: \"Detected DB corruption during initialization, removing member from cluster\" - state: Starting subState: PendingLearner reason: 
WaitingToJoinAsLearner transitionTime: \"2023-07-17T08:02:10Z\" message: \"Waiting to join the cluster as a learner\" - state: Starting subState: Learner reason: JoinedAsLearner transitionTime: \"2023-07-17T08:02:20Z\" message: \"Joined the cluster as a learner\" - state: Started subState: Follower reason: PromotedAsVotingMember transitionTime: \"2023-07-17T08:04:00Z\" message: \"Now in sync with leader, promoted as voting member\" Reason Codes The authors propose the following list of possible reason codes for transitions. This list is not exhaustive, and can be further enhanced to capture any new transitions in the future.\n Reason Transition From State (SubState) Transition To State (SubState) ClusterScaledUp | NewSingleNodeClusterCreated nil New DetectedPreviousCleanExit New | Started (Leader) | Started (Follower) Initializing (DBValidationSanity) DetectedPreviousUncleanExit New | Started (Leader) | Started (Follower) Initializing (DBValidationFull) DBValidationFailed Initializing (DBValidationSanity) | Initializing (DBValidationFull) Initializing (Restoration) | New DBValidationSucceeded Initializing (DBValidationSanity) | Initializing (DBValidationFull) Started (Leader) | Started (Follower) Initializing (Restoration)Succeeded Initializing (Restoration) Started (Leader) WaitingToJoinAsLearner New Starting (PendingLearner) JoinedAsLearner Starting (PendingLearner) Starting (Learner) PromotedAsVotingMember Starting (Learner) Started (Follower) GainedClusterLeadership Started (Follower) Started (Leader) LostClusterLeadership Started (Leader) Started (Follower) API EtcdMember The authors propose to add the EtcdMember custom resource API to etcd-druid APIs and initially introduce it with v1alpha1 version.\napiVersion: druid.gardener.cloud/v1alpha1 kind: EtcdMember metadata: labels: gardener.cloud/owned-by: \u003cname of parent Etcd resource\u003e name: \u003cname of the etcd-member\u003e namespace: \u003cnamespace | will be the same as that of parent Etcd 
resource\u003e ownerReferences: - apiVersion: druid.gardener.cloud/v1alpha1 blockOwnerDeletion: true controller: true kind: Etcd name: \u003cname of the parent Etcd resource\u003e uid: \u003cUID of the parent Etcd resource\u003e status: id: \u003cetcd-member id\u003e clusterID: \u003cetcd cluster id\u003e peerTLSEnabled: \u003cbool\u003e dbSize: \u003cdb-size\u003e dbSizeInUse: \u003cdb-size-in-use\u003e snapshots: lastFull: timestamp: \u003ctime of full snapshot\u003e name: \u003cname of the file that is uploaded\u003e size: \u003csize of the un-compressed snapshot file uploaded\u003e startRevision: \u003cstart revision of etcd db captured in the snapshot\u003e endRevision: \u003cend revision of etcd db captured in the snapshot\u003e lastDelta: timestamp: \u003ctime of delta snapshot\u003e name: \u003cname of the file that is uploaded\u003e size: \u003csize of the un-compressed snapshot file uploaded\u003e startRevision: \u003cstart revision of etcd db captured in the snapshot\u003e endRevision: \u003cend revision of etcd db captured in the snapshot\u003e accumulatedDeltaSize: \u003ctotal size of delta snapshots since last full snapshot\u003e lastRestoration: type: \u003cFromSnapshot | FromLeader\u003e status: \u003cFailed | Success | In-Progress\u003e startTime: \u003cstart time of restoration\u003e endTime: \u003cend time of restoration\u003e lastDefragmentation: startTime: \u003cstart time of defragmentation\u003e endTime: \u003cend time of defragmentation\u003e reason: message: initialDBSize: \u003csize of etcd DB prior to defragmentation\u003e finalDBSize: \u003csize of etcd DB post defragmentation\u003e volumeMismatches: - identifiedAt: \u003ctime at which wrong volume mount was identified\u003e fixedAt: \u003ctime at which correct volume was mounted\u003e volumeID: \u003cvolume ID of wrong volume that got mounted\u003e numRestarts: \u003cnum of pod restarts that were attempted\u003e transitions: - state: \u003cname of the state that the etcd-member has 
transitioned to\u003e subState: \u003cname of the sub-state if any\u003e reason: \u003creason code for the transition\u003e transitionTime: \u003ctime of transition to this state\u003e message: \u003cdetailed message if any\u003e Etcd Authors propose the following changes to the Etcd API:\n In the Etcd.Status resource API, member status is computed and stored. This field will be marked as deprecated and in a later version of druid it will be removed. In its place, the authors propose to introduce the following: type EtcdStatus struct { // MemberRefs contains references to all existing EtcdMember resources MemberRefs []CrossVersionObjectReference } In the Etcd.Status resource API, PeerUrlTLSEnabled reflects the status of enabling TLS for peer communication across all etcd-members. Currently this field is not used anywhere. In this proposal, the authors have also proposed that each EtcdMember resource should capture the status of TLS enablement of the peer URL. The authors propose to re-evaluate the need to have this field under EtcdStatus. Lifecycle of an EtcdMember Creation Druid creates an EtcdMember resource for every replica in etcd.Spec.Replicas during reconciliation of an etcd resource. For a fresh etcd cluster this is done prior to creation of the StatefulSet resource and for an existing cluster which has now been scaled-up, it is done prior to updating the StatefulSet resource.\nUpdation All fields in EtcdMember.Status are only updated by the corresponding etcd-member. Druid only consumes the information published via EtcdMember resources.\nDeletion Druid is responsible for deletion of all existing EtcdMember resources for an etcd cluster. 
There are three scenarios where an EtcdMember resource will be deleted:\n Deletion of etcd resource.\n Scale down of an etcd cluster to 0 replicas due to hibernation of the k8s control plane.\n Transient scale down of an etcd cluster to 0 replicas to recover from a quorum loss.\n Authors found no reason to retain EtcdMember resources when the etcd cluster is scaled down to 0 replicas since the information contained in each EtcdMember resource would no longer represent the current state of each member and would thus be stale. Any controller in druid which acts upon the EtcdMember.Status could potentially take incorrect actions.\nReconciliation Authors propose to introduce a new controller (let’s call it etcd-member-controller) which watches for changes to the EtcdMember resource(s). If a reconciliation of an Etcd resource is required as a result of a change in EtcdMember status then this controller should enqueue an event and force a reconciliation via the existing etcd-controller, thus preserving the single-actor-principle constraint which ensures deterministic changes to etcd cluster resources.\n NOTE: Further decisions w.r.t responsibility segregation will be taken during implementation and will not be documented in this proposal.\n Stale EtcdMember Status Handling It is possible that an etcd-member is unable to update its respective EtcdMember resource. The following are some of the implications which should be kept in mind while reconciling an EtcdMember resource in druid:\n Druid sees stale state transitions (this assumes that the backup-sidecar attempts to update the state/sub-state in etcdMember.status.transitions with best attempt). There is currently no implication other than an operator seeing a stale state. dbSize and dbSizeInUse could not be updated. A consequence could be that druid continues to see a high value for dbSize - dbSizeInUse for an extended amount of time. Druid should ensure that it does not trigger repeated defragmentations. 
If VolumeMismatches is stale, then druid should no longer attempt to recover by repeatedly restarting the pod. Failed restoration was recorded last and further updates to this array failed. Druid should not repeatedly take full-snapshots. If snapshots.accumulatedDeltaSize could not be updated, then druid should not schedule repeated compaction Jobs. Reference Disaster recovery | etcd\n etcd API Reference | etcd\n Raft Consensus Algorithm\n ","categories":"","description":"","excerpt":"DEP-04: EtcdMember Custom Resource Table of Contents DEP-04: …","ref":"/docs/other-components/etcd-druid/proposals/04-etcd-member-custom-resource/","tags":"","title":"EtcdMember Custom Resource"},{"body":"Excess Reserve Capacity Excess Reserve Capacity Goal Note Possible Approaches Approach 1: Enhance Machine-controller-manager to also entertain the excess machines Approach 2: Enhance Cluster-autoscaler by simulating fake pods in it Approach 3: Enhance cluster-autoscaler to support pluggable scaling-events Approach 4: Make intelligent use of Low-priority pods Goal Currently, autoscaler optimizes the number of machines for a given application-workload. Along with effective resource utilization, this feature brings concern where, many times, when new application instances are created - they don’t find space in existing cluster. This leads the cluster-autoscaler to create new machines via MachineDeployment, which can take from 3-4 minutes to ~10 minutes, for the machine to really come-up and join the cluster. In turn, application-instances have to wait till new machines join the cluster.\nOne of the promising solutions to this issue is Excess Reserve Capacity. Idea is to keep a certain number of machines or percent of resources[cpu/memory] always available, so that new workload, in general, can be scheduled immediately unless huge spike in the workload. 
Also, the user should be given enough flexibility to choose how many resources or how many machines should be kept alive and non-utilized as this affects the Cost directly.\nNote We decided to go with Approach-4 which is based on low priority pods. Please find more details here: https://github.com/gardener/gardener/issues/254 Approach-3 looks more promising in long term, we may decide to adopt that in future based on developments/contributions in autoscaler-community. Possible Approaches Following are the possible approaches, we could think of so far.\nApproach 1: Enhance Machine-controller-manager to also entertain the excess machines Machine-controller-manager currently takes care of the machines in the shoot cluster starting from creation-deletion-health check to efficient rolling-update of the machines. From the architecture point of view, MachineSet makes sure that X number of machines are always running and healthy. MachineDeployment controller smartly uses this facility to perform rolling-updates.\n We can expand the scope of MachineDeployment controller to maintain excess number of machines by introducing new parallel independent controller named MachineTaint controller. This will result in MCM to include Machine, MachineSet, MachineDeployment, MachineSafety, MachineTaint controllers. MachineTaint controller does not need to introduce any new CRD - analogy fits where taint-controller also resides into kube-controller-manager.\n Only Job of MachineTaint controller will be:\n List all the Machines under each MachineDeployment. Maintain taints of noSchedule and noExecute on X latest MachineObjects. There should be an event-based informer mechanism where MachineTaintController gets to know about any Update/Delete/Create event of MachineObjects - in turn, maintains the noSchedule and noExecute taints on all the latest machines. - Why latest machines? 
- Whenever autoscaler decides to add new machines - essentially a ScaleUp event - taints from the older machines are removed and newer machines get the taints. This way X number of Machines immediately becomes free for new pods to be scheduled. - During a ScaleDown event, autoscaler specifically mentions which machines should be deleted, and that should not bring any concerns. Though we will have to put proper label/annotation defined by autoscaler on taintedMachines, so that autoscaler does not consider the taintedMachines for deletion during scale-down. * Annotation on tainted node: \"cluster-autoscaler.kubernetes.io/scale-down-disabled\": \"true\" Implementation Details:\n Expect new optional field ExcessReplicas in MachineDeployment.Spec. MachineDeployment controller now adds both Spec.Replicas and Spec.ExcessReplicas[if provided], and considers that as a standard desiredReplicas. - Current working of MCM will not be affected if ExcessReplicas field is kept nil. MachineController currently reads the NodeObject and sets the MachineConditions in MachineObject. Machine-controller will now also read the taints/labels from the MachineObject - and maintain them on the NodeObject. We expect cluster-autoscaler to intelligently make use of the provided feature from MCM.\n CA gets the input of min:max:excess from Gardener. CA continues to set the MachineDeployment.Spec.Replicas as usual based on the application-workload. In addition, CA also sets the MachineDeployment.Spec.ExcessReplicas. Corner-case: * CA should decrement the excessReplicas field accordingly when desiredReplicas+excessReplicas on MachineDeployment goes beyond max. Approach 2: Enhance Cluster-autoscaler by simulating fake pods in it There was already an attempt by the community to support this feature. 
For details, refer to: https://github.com/kubernetes/autoscaler/pull/77/files Approach 3: Enhance cluster-autoscaler to support pluggable scaling-events Forked version of cluster-autoscaler could be improved to plug-in the algorithm for excess-reserve capacity. Needs further discussion around upstream support. Create golang channel to separate the algorithms to trigger scaling (hard-coded in cluster-autoscaler, currently) from the algorithms about how to achieve the scaling (already pluggable in cluster-autoscaler). This kind of separation can help us introduce/plug-in new algorithms (such as ones based on node resource utilisation) without affecting existing code-base too much while almost completely re-using the code-base for the actual scaling. Also this approach is not specific to our fork of cluster-autoscaler. It can be made upstream eventually as well. Approach 4: Make intelligent use of Low-priority pods Refer to: pod-priority-preemption TL;DR: High priority pods can preempt the low-priority pods which are already scheduled. Pre-create a bunch [equivalent of X shoot-control-planes] of low-priority pods with priority of zero, then start creating the workload pods with better priority which will reschedule the low-priority pods or otherwise keep them in pending state if the limit for max-machines has been reached. This is still an alpha feature. ","categories":"","description":"","excerpt":"Excess Reserve Capacity Excess Reserve Capacity Goal Note Possible …","ref":"/docs/other-components/machine-controller-manager/proposals/excess_reserve_capacity/","tags":"","title":"Excess Reserve Capacity"},{"body":"ExposureClasses The Gardener API server provides a cluster-scoped ExposureClass resource. 
This resource is used to allow exposing the control plane of a Shoot cluster in various network environments like restricted corporate networks, DMZ, etc.\nBackground The ExposureClass resource is based on the concept for the RuntimeClass resource in Kubernetes.\nA RuntimeClass abstracts the installation of a certain container runtime (e.g., gVisor, Kata Containers) on all nodes or a subset of the nodes in a Kubernetes cluster. See Runtime Class for more information.\nIn contrast, an ExposureClass abstracts the ability to expose a Shoot cluster’s control plane in certain network environments (e.g., corporate networks, DMZ, internet) on all Seeds or a subset of the Seeds.\nExample: RuntimeClass and ExposureClass\napiVersion: node.k8s.io/v1 kind: RuntimeClass metadata: name: gvisor handler: gvisorconfig # scheduling: # nodeSelector: # env: prod --- kind: ExposureClass metadata: name: internet handler: internet-config # scheduling: # seedSelector: # matchLabels: # network/env: internet Similar to RuntimeClasses, ExposureClasses also define a .handler field reflecting the name reference for the corresponding CRI configuration of the RuntimeClass and the control plane exposure configuration for the ExposureClass.\nThe CRI handler for RuntimeClasses is usually installed by an administrator (e.g., via a DaemonSet which installs the corresponding container runtime on the nodes). The control plane exposure configuration for ExposureClasses will also be provided by an administrator. This exposure configuration is part of the gardenlet configuration, as this component is responsible for configuring the control plane accordingly. See the gardenlet Configuration ExposureClass Handlers section for more information.\nThe RuntimeClass also supports the selection of a node subset (which have the respective container runtime binaries installed) for pod scheduling via its .scheduling section. 
The ExposureClass also supports the selection of a subset of available Seed clusters whose gardenlet is capable of applying the exposure configuration for the Shoot control plane accordingly via its .scheduling section.\nUsage by a Shoot A Shoot can reference an ExposureClass via the .spec.exposureClassName field.\n ⚠️ When creating a Shoot resource, the Gardener scheduler will try to assign the Shoot to a Seed which will host its control plane.\n The scheduling behaviour can be influenced via the .spec.seedSelectors and/or .spec.tolerations fields in the Shoot. ExposureClasses can also contain scheduling instructions. If a Shoot is referencing an ExposureClass, then the scheduling instructions of both will be merged into the Shoot. Those unions of scheduling instructions might lead to a selection of a Seed which is not able to deal with the handler of the ExposureClass and the Shoot creation might end up in an error. In such case, the Shoot scheduling instructions should be revisited to check that they are not interfering with the ones from the ExposureClass. If this is not feasible, then the combination with the ExposureClass might not be possible and you need to contact your Gardener administrator.\n Example: Shoot and ExposureClass scheduling instructions merge flow Assuming there is the following Shoot which is referencing the ExposureClass below: apiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: abc namespace: garden-dev spec: exposureClassName: abc seedSelectors: matchLabels: env: prod --- apiVersion: core.gardener.cloud/v1beta1 kind: ExposureClass metadata: name: abc handler: abc scheduling: seedSelector: matchLabels: network: internal Both seedSelectors would be merged into the Shoot. 
The result would be the following: apiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: abc namespace: garden-dev spec: exposureClassName: abc seedSelectors: matchLabels: env: prod network: internal Now the Gardener Scheduler would try to find a Seed with those labels. If there are no Seeds with matching labels for the seed selector, then the Shoot will be unschedulable. If there are Seeds with matching labels for the seed selector, then the Shoot will be assigned to the best candidate after the scheduling strategy is applied, see Gardener Scheduler. If the Seed is not able to serve the ExposureClass handler abc, then the Shoot will end up in an error state. If the Seed is able to serve the ExposureClass handler abc, then the Shoot will be created. gardenlet Configuration ExposureClass Handlers The gardenlet is responsible for realizing the control plane exposure strategy defined in the referenced ExposureClass of a Shoot.\nTherefore, the GardenletConfiguration can contain an .exposureClassHandlers list with the respective configuration.\nExample of the GardenletConfiguration:\nexposureClassHandlers: - name: internet-config loadBalancerService: annotations: loadbalancer/network: internet - name: internal-config loadBalancerService: annotations: loadbalancer/network: internal sni: ingress: namespace: ingress-internal labels: network: internal Each gardenlet can define how the handler of a certain ExposureClass needs to be implemented for the Seed(s) for which it is responsible.\nThe .name is the name of the handler config and it must match the .handler in the ExposureClass.\nAll control planes on a Seed are exposed via a load balancer, either a dedicated one or a central shared one. The load balancer service needs to be configured in a way that it is reachable from the target network environment. Therefore, the configuration of the load balancer service needs to be specified, which can be done via the .loadBalancerService section. 
The common way to influence load balancer service behaviour is via annotations, to which the respective cloud-controller-manager reacts by configuring the infrastructure load balancer accordingly.\nThe control planes on a Seed will be exposed via a central load balancer and with Envoy via TLS SNI passthrough proxy. In this case, the gardenlet will install a dedicated ingress gateway (Envoy + load balancer + respective configuration) for each handler on the Seed. The configuration of the ingress gateways can be controlled via the .sni section in the same way as for the default ingress gateways.\n","categories":"","description":"","excerpt":"ExposureClasses The Gardener API server provides a cluster-scoped …","ref":"/docs/gardener/exposureclasses/","tags":"","title":"ExposureClasses"},{"body":"Contract: Extension Resource Gardener defines common procedures which must be passed to create a functioning shoot cluster. Well known steps are represented by special resources like Infrastructure, OperatingSystemConfig or DNS. These resources are typically reconciled by dedicated controllers setting up the infrastructure on the hyperscaler or managing DNS entries, etc.\nHowever, some requirements don’t match with those special resources or don’t depend on being processed at a specific step in the creation / deletion flow of the shoot. They require a more generic hook. Therefore, Gardener offers the Extension resource.\nWhat is required to register and support an Extension type? Gardener creates one Extension resource per registered extension type in ControllerRegistration per shoot.\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration metadata: name: extension-example spec: resources: - kind: Extension type: example globallyEnabled: true workerlessSupported: true If spec.resources[].globallyEnabled is true, then the Extension resource of the given type is created for every shoot cluster. 
Set to false, the Extension resource is only created if configured in the Shoot manifest. In case of workerless Shoot, a globally enabled Extension resource is created only if spec.resources[].workerlessSupported is also set to true. If an extension configured in the spec of a workerless Shoot is not supported yet, the admission request will be rejected.\nThe Extension resources are created in the shoot namespace of the seed cluster.\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: example namespace: shoot--foo--bar spec: type: example providerConfig: {} Your controller needs to reconcile extensions.extensions.gardener.cloud. Since there can exist multiple Extension resources per shoot, each one holds a spec.type field to let controllers check their responsibility (similar to all other extension resources of Gardener).\nProviderConfig It is possible to provide data in the Shoot resource which is copied to spec.providerConfig of the Extension resource.\n--- apiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: bar namespace: garden-foo spec: extensions: - type: example providerConfig: foo: bar ... results in\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: example namespace: shoot--foo--bar spec: type: example providerConfig: foo: bar Shoot Reconciliation Flow and Extension Status Gardener creates Extension resources as part of the Shoot reconciliation. Moreover, it is guaranteed that the Cluster resource exists before the Extension resource is created. Extensions can be reconciled at different stages during Shoot reconciliation depending on the defined extension lifecycle strategy in the respective ControllerRegistration resource. Please consult the Extension Lifecycle section for more information.\nFor an Extension controller it is crucial to maintain the Extension’s status correctly. 
At the end Gardener checks the status of each Extension and only reports a successful shoot reconciliation if the state of the last operation is Succeeded.\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: generation: 1 name: example namespace: shoot--foo--bar spec: type: example status: lastOperation: state: Succeeded observedGeneration: 1 ","categories":"","description":"","excerpt":"Contract: Extension Resource Gardener defines common procedures which …","ref":"/docs/gardener/extensions/extension/","tags":"","title":"Extension"},{"body":"Packages:\n extensions.gardener.cloud/v1alpha1 extensions.gardener.cloud/v1alpha1 Package v1alpha1 is the v1alpha1 version of the API.\nResource Types: BackupBucket BackupEntry Bastion Cluster ContainerRuntime ControlPlane DNSRecord Extension Infrastructure Network OperatingSystemConfig Worker BackupBucket BackupBucket is a specification for backup bucket.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string BackupBucket metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec BackupBucketSpec Specification of the BackupBucket. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n region string Region is the region of this bucket. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n status BackupBucketStatus (Optional) BackupEntry BackupEntry is a specification for backup Entry.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string BackupEntry metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. 
spec BackupEntrySpec Specification of the BackupEntry. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n backupBucketProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) BackupBucketProviderStatus contains the provider status that has been generated by the controller responsible for the BackupBucket resource.\n region string Region is the region of this Entry. This field is immutable.\n bucketName string BucketName is the name of backup bucket for this Backup Entry.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n status BackupEntryStatus (Optional) Bastion Bastion is a bastion or jump host that is dynamically created to provide SSH access to shoot nodes.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Bastion metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec BastionSpec Spec is the specification of this Bastion. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n userData []byte UserData is the base64-encoded user data for the bastion instance. This should contain code to provision the SSH key on the bastion instance. 
This field is immutable.\n ingress []BastionIngressPolicy Ingress controls from where the created bastion host should be reachable.\n status BastionStatus (Optional) Status is the bastion’s status.\n Cluster Cluster is a specification for a Cluster resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Cluster metadata Kubernetes meta/v1.ObjectMeta Refer to the Kubernetes API documentation for the fields of the metadata field. spec ClusterSpec cloudProfile k8s.io/apimachinery/pkg/runtime.RawExtension CloudProfile is a raw extension field that contains the cloudprofile resource referenced by the shoot that has to be reconciled.\n seed k8s.io/apimachinery/pkg/runtime.RawExtension Seed is a raw extension field that contains the seed resource referenced by the shoot that has to be reconciled.\n shoot k8s.io/apimachinery/pkg/runtime.RawExtension Shoot is a raw extension field that contains the shoot resource that has to be reconciled.\n ContainerRuntime ContainerRuntime is a specification for a container runtime resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string ContainerRuntime metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec ContainerRuntimeSpec Specification of the ContainerRuntime. If the object’s deletion timestamp is set, this field is immutable.\n binaryPath string BinaryPath is the Worker’s machine path where container runtime extensions should copy the binaries to.\n workerPool ContainerRuntimeWorkerPool WorkerPool identifies the worker pool of the Shoot. For each worker pool and type, Gardener deploys a ContainerRuntime CRD.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n status ContainerRuntimeStatus (Optional) ControlPlane ControlPlane is a specification for a ControlPlane resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string ControlPlane metadata Kubernetes meta/v1.ObjectMeta Refer to the Kubernetes API documentation for the fields of the metadata field. spec ControlPlaneSpec Specification of the ControlPlane. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n purpose Purpose (Optional) Purpose contains the data if a cloud provider needs additional components in order to expose the control plane. This field is immutable.\n infrastructureProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) InfrastructureProviderStatus contains the provider status that has been generated by the controller responsible for the Infrastructure resource.\n region string Region is the region of this control plane. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n status ControlPlaneStatus (Optional) DNSRecord DNSRecord is a specification for a DNSRecord resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string DNSRecord metadata Kubernetes meta/v1.ObjectMeta Refer to the Kubernetes API documentation for the fields of the metadata field. spec DNSRecordSpec Specification of the DNSRecord. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n region string (Optional) Region is the region of this DNS record. If not specified, the region specified in SecretRef will be used. If that is also not specified, the extension controller will use its default region.\n zone string (Optional) Zone is the DNS hosted zone of this DNS record. If not specified, it will be determined automatically by getting all hosted zones of the account and searching for the longest zone name that is a suffix of Name.\n name string Name is the fully qualified domain name, e.g. “api.”. This field is immutable.\n recordType DNSRecordType RecordType is the DNS record type. Only A, CNAME, and TXT records are currently supported. This field is immutable.\n values []string Values is a list of IP addresses for A records, a single hostname for CNAME records, or a list of texts for TXT records.\n ttl int64 (Optional) TTL is the time to live in seconds. Defaults to 120.\n status DNSRecordStatus (Optional) Extension Extension is a specification for a Extension resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Extension metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec ExtensionSpec Specification of the Extension. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n status ExtensionStatus (Optional) Infrastructure Infrastructure is a specification for cloud provider infrastructure.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Infrastructure metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec InfrastructureSpec Specification of the Infrastructure. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n region string Region is the region of this infrastructure. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider credentials.\n sshPublicKey []byte (Optional) SSHPublicKey is the public SSH key that should be used with this infrastructure.\n status InfrastructureStatus (Optional) Network Network is the specification for cluster networking.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Network metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec NetworkSpec Specification of the Network. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n podCIDR string PodCIDR defines the CIDR that will be used for pods. This field is immutable.\n serviceCIDR string ServiceCIDR defines the CIDR that will be used for services. This field is immutable.\n ipFamilies []IPFamily (Optional) IPFamilies specifies the IP protocol versions to use for shoot networking. 
This field is immutable. See https://github.com/gardener/gardener/blob/master/docs/usage/ipv6.md\n status NetworkStatus (Optional) OperatingSystemConfig OperatingSystemConfig is a specification for an OperatingSystemConfig resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string OperatingSystemConfig metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec OperatingSystemConfigSpec Specification of the OperatingSystemConfig. If the object’s deletion timestamp is set, this field is immutable.\n criConfig CRIConfig (Optional) CRIConfig is a structure that contains configurations of the CRI library.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n purpose OperatingSystemConfigPurpose Purpose describes how the result of this OperatingSystemConfig is used by Gardener. Either it gets sent to the Worker extension controller to bootstrap a VM, or it is downloaded by the gardener-node-agent already running on a bootstrapped VM. This field is immutable.\n units []Unit (Optional) Units is a list of units for the operating system configuration (usually, systemd units).\n files []File (Optional) Files is a list of files that should get written to the host’s file system.\n status OperatingSystemConfigStatus (Optional) Worker Worker is a specification for a Worker resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Worker metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec WorkerSpec Specification of the Worker. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n infrastructureProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) InfrastructureProviderStatus is a raw extension field that contains the provider status that has been generated by the controller responsible for the Infrastructure resource.\n region string Region is the name of the region where the worker pool should be deployed to. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n sshPublicKey []byte (Optional) SSHPublicKey is the public SSH key that should be used with these workers.\n pools []WorkerPool Pools is a list of worker pools.\n status WorkerStatus (Optional) BackupBucketSpec (Appears on: BackupBucket) BackupBucketSpec is the spec for an BackupBucket resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n region string Region is the region of this bucket. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n BackupBucketStatus (Appears on: BackupBucket) BackupBucketStatus is the status for an BackupBucket resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n generatedSecretRef Kubernetes core/v1.SecretReference (Optional) GeneratedSecretRef is reference to the secret generated by backup bucket, which will have object store specific credentials.\n BackupEntrySpec (Appears on: BackupEntry) BackupEntrySpec is the spec for an BackupEntry resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n backupBucketProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) BackupBucketProviderStatus contains the provider status that has been generated by the controller responsible for the BackupBucket resource.\n region string Region is the region of this Entry. This field is immutable.\n bucketName string BucketName is the name of backup bucket for this Backup Entry.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n BackupEntryStatus (Appears on: BackupEntry) BackupEntryStatus is the status for an BackupEntry resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n BastionIngressPolicy (Appears on: BastionSpec) BastionIngressPolicy represents an ingress policy for SSH bastion hosts.\n Field Description ipBlock Kubernetes networking/v1.IPBlock IPBlock defines an IP block that is allowed to access the bastion.\n BastionSpec (Appears on: Bastion) BastionSpec contains the specification for an SSH bastion host.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n userData []byte UserData is the base64-encoded user data for the bastion instance. This should contain code to provision the SSH key on the bastion instance. This field is immutable.\n ingress []BastionIngressPolicy Ingress controls from where the created bastion host should be reachable.\n BastionStatus (Appears on: Bastion) BastionStatus holds the most recently observed status of the Bastion.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) 
DefaultStatus is a structure containing common fields used by all extension resources.\n ingress Kubernetes core/v1.LoadBalancerIngress (Optional) Ingress is the external IP and/or hostname of the bastion host.\n CRIConfig (Appears on: OperatingSystemConfigSpec) CRIConfig contains configurations of the CRI library.\n Field Description name CRIName Name is a mandatory string containing the name of the CRI library. Supported values are containerd.\n cgroupDriver CgroupDriverName (Optional) CgroupDriver configures the CRI’s cgroup driver. Supported values are cgroupfs or systemd.\n containerd ContainerdConfig (Optional) ContainerdConfig is the containerd configuration. Only to be set for OperatingSystemConfigs with purpose ‘reconcile’.\n CRIName (string alias)\n (Appears on: CRIConfig) CRIName is a type alias for the CRI name string.\nCgroupDriverName (string alias)\n (Appears on: CRIConfig) CgroupDriverName is a string denoting the CRI cgroup driver.\nCloudConfig (Appears on: OperatingSystemConfigStatus) CloudConfig contains the generated output for the given operating system config spec. 
It contains a reference to a secret as the result may contain confidential data.\n Field Description secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the actual result of the generated cloud config.\n ClusterAutoscalerOptions (Appears on: WorkerPool) ClusterAutoscalerOptions contains the cluster autoscaler configurations for a worker pool.\n Field Description scaleDownUtilizationThreshold string (Optional) ScaleDownUtilizationThreshold defines the threshold in fraction (0.0 - 1.0) under which a node is being removed.\n scaleDownGpuUtilizationThreshold string (Optional) ScaleDownGpuUtilizationThreshold defines the threshold in fraction (0.0 - 1.0) of gpu resources under which a node is being removed.\n scaleDownUnneededTime Kubernetes meta/v1.Duration (Optional) ScaleDownUnneededTime defines how long a node should be unneeded before it is eligible for scale down.\n scaleDownUnreadyTime Kubernetes meta/v1.Duration (Optional) ScaleDownUnreadyTime defines how long an unready node should be unneeded before it is eligible for scale down.\n maxNodeProvisionTime Kubernetes meta/v1.Duration (Optional) MaxNodeProvisionTime defines how long cluster autoscaler should wait for a node to be provisioned.\n ClusterSpec (Appears on: Cluster) ClusterSpec is the spec for a Cluster resource.\n Field Description cloudProfile k8s.io/apimachinery/pkg/runtime.RawExtension CloudProfile is a raw extension field that contains the cloudprofile resource referenced by the shoot that has to be reconciled.\n seed k8s.io/apimachinery/pkg/runtime.RawExtension Seed is a raw extension field that contains the seed resource referenced by the shoot that has to be reconciled.\n shoot k8s.io/apimachinery/pkg/runtime.RawExtension Shoot is a raw extension field that contains the shoot resource that has to be reconciled.\n ContainerRuntimeSpec (Appears on: ContainerRuntime) ContainerRuntimeSpec is the spec for a ContainerRuntime resource.\n Field Description 
binaryPath string BinaryPath is the Worker’s machine path where container runtime extensions should copy the binaries to.\n workerPool ContainerRuntimeWorkerPool WorkerPool identifies the worker pool of the Shoot. For each worker pool and type, Gardener deploys a ContainerRuntime CRD.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n ContainerRuntimeStatus (Appears on: ContainerRuntime) ContainerRuntimeStatus is the status for a ContainerRuntime resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n ContainerRuntimeWorkerPool (Appears on: ContainerRuntimeSpec) ContainerRuntimeWorkerPool identifies a Shoot worker pool by its name and selector.\n Field Description name string Name specifies the name of the worker pool the container runtime should be available for. This field is immutable.\n selector Kubernetes meta/v1.LabelSelector Selector is the label selector used by the extension to match the nodes belonging to the worker pool.\n ContainerdConfig (Appears on: CRIConfig) ContainerdConfig contains configuration options for containerd.\n Field Description registries []RegistryConfig (Optional) Registries configures the registry hosts for containerd.\n sandboxImage string SandboxImage configures the sandbox image for containerd.\n plugins []PluginConfig (Optional) Plugins configures the plugins section in containerd’s config.toml.\n ControlPlaneSpec (Appears on: ControlPlane) ControlPlaneSpec is the spec of a ControlPlane resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n purpose Purpose (Optional) Purpose contains the data if a cloud provider needs additional components in order to expose the control plane. This field is immutable.\n infrastructureProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) InfrastructureProviderStatus contains the provider status that has been generated by the controller responsible for the Infrastructure resource.\n region string Region is the region of this control plane. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n ControlPlaneStatus (Appears on: ControlPlane) ControlPlaneStatus is the status of a ControlPlane resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n DNSRecordSpec (Appears on: DNSRecord) DNSRecordSpec is the spec of a DNSRecord resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n region string (Optional) Region is the region of this DNS record. If not specified, the region specified in SecretRef will be used. If that is also not specified, the extension controller will use its default region.\n zone string (Optional) Zone is the DNS hosted zone of this DNS record. If not specified, it will be determined automatically by getting all hosted zones of the account and searching for the longest zone name that is a suffix of Name.\n name string Name is the fully qualified domain name, e.g. “api.”. 
This field is immutable.\n recordType DNSRecordType RecordType is the DNS record type. Only A, CNAME, and TXT records are currently supported. This field is immutable.\n values []string Values is a list of IP addresses for A records, a single hostname for CNAME records, or a list of texts for TXT records.\n ttl int64 (Optional) TTL is the time to live in seconds. Defaults to 120.\n DNSRecordStatus (Appears on: DNSRecord) DNSRecordStatus is the status of a DNSRecord resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n zone string (Optional) Zone is the DNS hosted zone of this DNS record.\n DNSRecordType (string alias)\n (Appears on: DNSRecordSpec) DNSRecordType is a string alias.\nDataVolume (Appears on: WorkerPool) DataVolume contains information about a data volume.\n Field Description name string Name of the volume to make it referenceable.\n type string (Optional) Type is the type of the volume.\n size string Size is the size of the volume.\n encrypted bool (Optional) Encrypted determines if the volume should be encrypted.\n DefaultSpec (Appears on: BackupBucketSpec, BackupEntrySpec, BastionSpec, ContainerRuntimeSpec, ControlPlaneSpec, DNSRecordSpec, ExtensionSpec, InfrastructureSpec, NetworkSpec, OperatingSystemConfigSpec, WorkerSpec) DefaultSpec contains common spec fields for every extension resource.\n Field Description type string Type contains the instance of the resource’s kind.\n class ExtensionClass (Optional) Class holds the extension class used to control the responsibility for multiple provider extensions.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the provider specific configuration.\n DefaultStatus (Appears on: BackupBucketStatus, BackupEntryStatus, BastionStatus, ContainerRuntimeStatus, ControlPlaneStatus, DNSRecordStatus, ExtensionStatus, 
InfrastructureStatus, NetworkStatus, OperatingSystemConfigStatus, WorkerStatus) DefaultStatus contains common status fields for every extension resource.\n Field Description providerStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderStatus contains provider-specific status.\n conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of a Seed’s current state.\n lastError github.com/gardener/gardener/pkg/apis/core/v1beta1.LastError (Optional) LastError holds information about the last occurred error during an operation.\n lastOperation github.com/gardener/gardener/pkg/apis/core/v1beta1.LastOperation (Optional) LastOperation holds information about the last operation on the resource.\n observedGeneration int64 ObservedGeneration is the most recent generation observed for this resource.\n state k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) State can be filled by the operating controller with whatever data it needs.\n resources []github.com/gardener/gardener/pkg/apis/core/v1beta1.NamedResourceReference (Optional) Resources holds a list of named resource references that can be referred to in the state by their names.\n DropIn (Appears on: Unit) DropIn is a drop-in configuration for a systemd unit.\n Field Description name string Name is the name of the drop-in.\n content string Content is the content of the drop-in.\n ExtensionClass (string alias)\n (Appears on: DefaultSpec) ExtensionClass is a string alias for an extension class.\nExtensionSpec (Appears on: Extension) ExtensionSpec is the spec for an Extension resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n ExtensionStatus (Appears on: Extension) ExtensionStatus is the status for an Extension resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n File (Appears on: OperatingSystemConfigSpec, OperatingSystemConfigStatus) File is a file that should get written to the host’s file system. The content can either be inlined or referenced from a secret in the same namespace.\n Field Description path string Path is the path in the file system to which the file should be written.\n permissions int32 (Optional) Permissions describes with which permissions the file should get written to the file system. If no permissions are set, the operating system’s defaults are used.\n content FileContent Content describes the file’s content.\n FileCodecID (string alias)\n FileCodecID is the id of a FileCodec for cloud-init scripts.\nFileContent (Appears on: File) FileContent can either reference a secret or contain inline configuration.\n Field Description secretRef FileContentSecretRef (Optional) SecretRef is a struct that contains information about the referenced secret.\n inline FileContentInline (Optional) Inline is a struct that contains information about the inlined data.\n transmitUnencoded bool (Optional) TransmitUnencoded set to true will ensure that the os-extension does not encode the file content when sent to the node. 
This for example can be used to manipulate the clear-text content before it reaches the node.\n imageRef FileContentImageRef (Optional) ImageRef describes a container image which contains a file.\n FileContentImageRef (Appears on: FileContent) FileContentImageRef describes a container image which contains a file\n Field Description image string Image contains the container image repository with tag.\n filePathInImage string FilePathInImage contains the path in the image to the file that should be extracted.\n FileContentInline (Appears on: FileContent) FileContentInline contains keys for inlining a file content’s data and encoding.\n Field Description encoding string Encoding is the file’s encoding (e.g. base64).\n data string Data is the file’s data.\n FileContentSecretRef (Appears on: FileContent) FileContentSecretRef contains keys for referencing a file content’s data from a secret in the same namespace.\n Field Description name string Name is the name of the secret.\n dataKey string DataKey is the key in the secret’s .data field that should be read.\n IPFamily (string alias)\n (Appears on: NetworkSpec) IPFamily is a type for specifying an IP protocol version to use in Gardener clusters.\nInfrastructureSpec (Appears on: Infrastructure) InfrastructureSpec is the spec for an Infrastructure resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n region string Region is the region of this infrastructure. 
This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider credentials.\n sshPublicKey []byte (Optional) SSHPublicKey is the public SSH key that should be used with this infrastructure.\n InfrastructureStatus (Appears on: Infrastructure) InfrastructureStatus is the status for an Infrastructure resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n nodesCIDR string (Optional) NodesCIDR is the CIDR of the node network that was optionally created by the acting extension controller. This might be needed in environments in which the CIDR for the network for the shoot worker node cannot be statically defined in the Shoot resource but must be computed dynamically.\n egressCIDRs []string (Optional) EgressCIDRs is a list of CIDRs used by the shoot as the source IP for egress traffic. 
For certain environments, the egress IPs may not be stable, in which case the extension controller may opt to not populate this field.\n networking InfrastructureStatusNetworking (Optional) Networking contains information about cluster networking such as CIDRs.\n InfrastructureStatusNetworking (Appears on: InfrastructureStatus) InfrastructureStatusNetworking is a structure containing information about the node, service and pod network ranges.\n Field Description pods []string (Optional) Pods are the CIDRs of the pod network.\n nodes []string (Optional) Nodes are the CIDRs of the node network.\n services []string (Optional) Services are the CIDRs of the service network.\n MachineDeployment (Appears on: WorkerStatus) MachineDeployment is a created machine deployment.\n Field Description name string Name is the name of the MachineDeployment resource.\n minimum int32 Minimum is the minimum number for this machine deployment.\n maximum int32 Maximum is the maximum number for this machine deployment.\n MachineImage (Appears on: WorkerPool) MachineImage contains logical information about the name and the version of the machine image that should be used. The logical information must be mapped to the provider-specific information (e.g., AMIs, …) by the provider itself.\n Field Description name string Name is the logical name of the machine image.\n version string Version is the version of the machine image.\n NetworkSpec (Appears on: Network) NetworkSpec is the spec for a Network resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n podCIDR string PodCIDR defines the CIDR that will be used for pods. This field is immutable.\n serviceCIDR string ServiceCIDR defines the CIDR that will be used for services. This field is immutable.\n ipFamilies []IPFamily (Optional) IPFamilies specifies the IP protocol versions to use for shoot networking. 
This field is immutable. See https://github.com/gardener/gardener/blob/master/docs/usage/ipv6.md\n NetworkStatus (Appears on: Network) NetworkStatus is the status for a Network resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n NodeTemplate (Appears on: WorkerPool) NodeTemplate contains information about the expected node properties.\n Field Description capacity Kubernetes core/v1.ResourceList Capacity represents the expected Node capacity.\n Object Object is an extension object resource.\nOperatingSystemConfigPurpose (string alias)\n (Appears on: OperatingSystemConfigSpec) OperatingSystemConfigPurpose is a string alias.\nOperatingSystemConfigSpec (Appears on: OperatingSystemConfig) OperatingSystemConfigSpec is the spec for an OperatingSystemConfig resource.\n Field Description criConfig CRIConfig (Optional) CRI config is a structure that contains configurations of the CRI library.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n purpose OperatingSystemConfigPurpose Purpose describes how the result of this OperatingSystemConfig is used by Gardener. Either it gets sent to the Worker extension controller to bootstrap a VM, or it is downloaded by the gardener-node-agent already running on a bootstrapped VM. This field is immutable.\n units []Unit (Optional) Units is a list of units for the operating system configuration (usually, systemd units).\n files []File (Optional) Files is a list of files that should get written to the host’s file system.\n OperatingSystemConfigStatus (Appears on: OperatingSystemConfig) OperatingSystemConfigStatus is the status for an OperatingSystemConfig resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) 
DefaultStatus is a structure containing common fields used by all extension resources.\n extensionUnits []Unit (Optional) ExtensionUnits is a list of additional systemd units provided by the extension.\n extensionFiles []File (Optional) ExtensionFiles is a list of additional files provided by the extension.\n cloudConfig CloudConfig (Optional) CloudConfig is a structure for containing the generated output for the given operating system config spec. It contains a reference to a secret as the result may contain confidential data.\n PluginConfig (Appears on: ContainerdConfig) PluginConfig contains configuration values for the containerd plugins section.\n Field Description op PluginPathOperation (Optional) Op is the operation for the given path. Possible values are ‘add’ and ‘remove’, defaults to ‘add’.\n path []string Path is a list of elements that construct the path in the plugins section.\n values k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON (Optional) Values are the values configured at the given path. If defined, it is expected as json format: - A given json object will be put to the given path. - If not configured, only the table entry to be created.\n PluginPathOperation (string alias)\n (Appears on: PluginConfig) PluginPathOperation is a type alias for operations at containerd’s plugin configuration.\nPurpose (string alias)\n (Appears on: ControlPlaneSpec) Purpose is a string alias.\nRegistryCapability (string alias)\n (Appears on: RegistryHost) RegistryCapability specifies an action a client can perform against a registry.\nRegistryConfig (Appears on: ContainerdConfig) RegistryConfig contains registry configuration options.\n Field Description upstream string Upstream is the upstream name of the registry.\n server string (Optional) Server is the URL to registry server of this upstream. 
It corresponds to the server field in the hosts.toml file, see https://github.com/containerd/containerd/blob/c51463010e0682f76dfdc10edc095e6596e2764b/docs/hosts.md#server-field for more information.\n hosts []RegistryHost Hosts are the registry hosts. It corresponds to the host fields in the hosts.toml file, see https://github.com/containerd/containerd/blob/c51463010e0682f76dfdc10edc095e6596e2764b/docs/hosts.md#host-fields-in-the-toml-table-format for more information.\n readinessProbe bool (Optional) ReadinessProbe determines if host registry endpoints should be probed before they are added to the containerd config.\n RegistryHost (Appears on: RegistryConfig) RegistryHost contains configuration values for a registry host.\n Field Description url string URL is the endpoint address of the registry mirror.\n capabilities []RegistryCapability Capabilities determine what operations a host is capable of performing. Defaults to - pull - resolve\n caCerts []string CACerts are paths to public key certificates used for TLS.\n Spec Spec is the spec section of an Object.\nStatus Status is the status of an Object.\nUnit (Appears on: OperatingSystemConfigSpec, OperatingSystemConfigStatus) Unit is a unit for the operating system configuration (usually, a systemd unit).\n Field Description name string Name is the name of a unit.\n command UnitCommand (Optional) Command is the unit’s command.\n enable bool (Optional) Enable describes whether the unit is enabled or not.\n content string (Optional) Content is the unit’s content.\n dropIns []DropIn (Optional) DropIns is a list of drop-ins for this unit.\n filePaths []string FilePaths is a list of files the unit depends on. If any file changes a restart of the dependent unit will be triggered. 
For each FilePath there must exist a File with matching Path in OperatingSystemConfig.Spec.Files.\n UnitCommand (string alias)\n (Appears on: Unit) UnitCommand is a string alias.\nVolume (Appears on: WorkerPool) Volume contains information about the root disks that should be used for worker pools.\n Field Description name string (Optional) Name of the volume to make it referenceable.\n type string (Optional) Type is the type of the volume.\n size string Size is the size of the root volume.\n encrypted bool (Optional) Encrypted determines if the volume should be encrypted.\n WorkerPool (Appears on: WorkerSpec) WorkerPool is the definition of a specific worker pool.\n Field Description machineType string MachineType contains information about the machine type that should be used for this worker pool.\n maximum int32 Maximum is the maximum size of the worker pool.\n maxSurge k8s.io/apimachinery/pkg/util/intstr.IntOrString MaxSurge is the maximum number of VMs that are created during an update.\n maxUnavailable k8s.io/apimachinery/pkg/util/intstr.IntOrString MaxUnavailable is the maximum number of VMs that can be unavailable during an update.\n annotations map[string]string (Optional) Annotations is a map of key/value pairs for annotations for all the Node objects in this worker pool.\n labels map[string]string (Optional) Labels is a map of key/value pairs for labels for all the Node objects in this worker pool.\n taints []Kubernetes core/v1.Taint (Optional) Taints is a list of taints for all the Node objects in this worker pool.\n machineImage MachineImage MachineImage contains logical information about the name and the version of the machine image that should be used. 
The logical information must be mapped to the provider-specific information (e.g., AMIs, …) by the provider itself.\n minimum int32 Minimum is the minimum size of the worker pool.\n name string Name is the name of this worker pool.\n nodeAgentSecretName string (Optional) NodeAgentSecretName is uniquely identifying selected aspects of the OperatingSystemConfig. If it changes, then the worker pool must be rolled.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is a provider specific configuration for the worker pool.\n userDataSecretRef Kubernetes core/v1.SecretKeySelector UserDataSecretRef references a Secret and a data key containing the data that is sent to the provider’s APIs when a new machine/VM that is part of this worker pool shall be spawned.\n volume Volume (Optional) Volume contains information about the root disks that should be used for this worker pool.\n dataVolumes []DataVolume (Optional) DataVolumes contains a list of additional worker volumes.\n kubeletDataVolumeName string (Optional) KubeletDataVolumeName contains the name of a dataVolume that should be used for storing kubelet state.\n zones []string (Optional) Zones contains information about availability zones for this worker pool.\n machineControllerManager github.com/gardener/gardener/pkg/apis/core/v1beta1.MachineControllerManagerSettings (Optional) MachineControllerManagerSettings contains configurations for different worker-pools. Eg. 
MachineDrainTimeout, MachineHealthTimeout.\n kubernetesVersion string (Optional) KubernetesVersion is the kubernetes version in this worker pool\n nodeTemplate NodeTemplate (Optional) NodeTemplate contains resource information of the machine which is used by Cluster Autoscaler to generate nodeTemplate during scaling a nodeGroup from zero\n architecture string (Optional) Architecture is the CPU architecture of the worker pool machines and machine image.\n clusterAutoscaler ClusterAutoscalerOptions (Optional) ClusterAutoscaler contains the cluster autoscaler configurations for the worker pool.\n WorkerSpec (Appears on: Worker) WorkerSpec is the spec for a Worker resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n infrastructureProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) InfrastructureProviderStatus is a raw extension field that contains the provider status that has been generated by the controller responsible for the Infrastructure resource.\n region string Region is the name of the region where the worker pool should be deployed to. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n sshPublicKey []byte (Optional) SSHPublicKey is the public SSH key that should be used with these workers.\n pools []WorkerPool Pools is a list of worker pools.\n WorkerStatus (Appears on: Worker) WorkerStatus is the status for a Worker resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n machineDeployments []MachineDeployment MachineDeployments is a list of created machine deployments. It will be used to e.g. 
configure the cluster-autoscaler properly.\n machineDeploymentsLastUpdateTime Kubernetes meta/v1.Time (Optional) MachineDeploymentsLastUpdateTime is the timestamp when the status.MachineDeployments slice was last updated.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n extensions.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/extensions/","tags":"","title":"Extensions"},{"body":"Feature Gates in Gardener This page contains an overview of the various feature gates an administrator can specify on different Gardener components.\nOverview Feature gates are a set of key=value pairs that describe Gardener features. You can turn these features on or off using the component configuration file for a specific component.\nEach Gardener component lets you enable or disable a set of feature gates that are relevant to that component. For example, this is the configuration of the gardenlet component.\nThe following tables are a summary of the feature gates that you can set on different Gardener components.\n The “Since” column contains the Gardener release when a feature is introduced or its release stage is changed. The “Until” column, if not empty, contains the last Gardener release in which you can still use a feature gate. If a feature is in the Alpha or Beta state, you can find the feature listed in the Alpha/Beta feature gate table. If a feature is stable you can find all stages for that feature listed in the Graduated/Deprecated feature gate table. The Graduated/Deprecated feature gate table also lists deprecated and withdrawn features. 
Feature Gates for Alpha or Beta Features Feature Default Stage Since Until HVPA false Alpha 0.31 HVPAForShootedSeed false Alpha 0.32 DefaultSeccompProfile false Alpha 1.54 IPv6SingleStack false Alpha 1.63 ShootForceDeletion false Alpha 1.81 1.90 ShootForceDeletion true Beta 1.91 UseNamespacedCloudProfile false Alpha 1.92 ShootManagedIssuer false Alpha 1.93 VPAForETCD false Alpha 1.94 1.96 VPAForETCD true Beta 1.97 VPAAndHPAForAPIServer false Alpha 1.95 1.100 VPAAndHPAForAPIServer true Beta 1.101 ShootCredentialsBinding false Alpha 1.98 NewWorkerPoolHash false Alpha 1.98 NewVPN false Alpha 1.104 Feature Gates for Graduated or Deprecated Features Feature Default Stage Since Until NodeLocalDNS false Alpha 1.7 NodeLocalDNS Removed 1.26 KonnectivityTunnel false Alpha 1.6 KonnectivityTunnel Removed 1.27 MountHostCADirectories false Alpha 1.11 1.25 MountHostCADirectories true Beta 1.26 1.27 MountHostCADirectories true GA 1.27 MountHostCADirectories Removed 1.30 DisallowKubeconfigRotationForShootInDeletion false Alpha 1.28 1.31 DisallowKubeconfigRotationForShootInDeletion true Beta 1.32 1.35 DisallowKubeconfigRotationForShootInDeletion true GA 1.36 DisallowKubeconfigRotationForShootInDeletion Removed 1.38 Logging false Alpha 0.13 1.40 Logging Removed 1.41 AdminKubeconfigRequest false Alpha 1.24 1.38 AdminKubeconfigRequest true Beta 1.39 1.41 AdminKubeconfigRequest true GA 1.42 1.49 AdminKubeconfigRequest Removed 1.50 UseDNSRecords false Alpha 1.27 1.38 UseDNSRecords true Beta 1.39 1.43 UseDNSRecords true GA 1.44 1.49 UseDNSRecords Removed 1.50 CachedRuntimeClients false Alpha 1.7 1.33 CachedRuntimeClients true Beta 1.34 1.44 CachedRuntimeClients true GA 1.45 1.49 CachedRuntimeClients Removed 1.50 DenyInvalidExtensionResources false Alpha 1.31 1.41 DenyInvalidExtensionResources true Beta 1.42 1.44 DenyInvalidExtensionResources true GA 1.45 1.49 DenyInvalidExtensionResources Removed 1.50 RotateSSHKeypairOnMaintenance false Alpha 1.28 1.44 RotateSSHKeypairOnMaintenance true 
Beta 1.45 1.47 RotateSSHKeypairOnMaintenance (deprecated) false Beta 1.48 1.50 RotateSSHKeypairOnMaintenance (deprecated) Removed 1.51 ShootMaxTokenExpirationOverwrite false Alpha 1.43 1.44 ShootMaxTokenExpirationOverwrite true Beta 1.45 1.47 ShootMaxTokenExpirationOverwrite true GA 1.48 1.50 ShootMaxTokenExpirationOverwrite Removed 1.51 ShootMaxTokenExpirationValidation false Alpha 1.43 1.45 ShootMaxTokenExpirationValidation true Beta 1.46 1.47 ShootMaxTokenExpirationValidation true GA 1.48 1.50 ShootMaxTokenExpirationValidation Removed 1.51 WorkerPoolKubernetesVersion false Alpha 1.35 1.45 WorkerPoolKubernetesVersion true Beta 1.46 1.49 WorkerPoolKubernetesVersion true GA 1.50 1.51 WorkerPoolKubernetesVersion Removed 1.52 DisableDNSProviderManagement false Alpha 1.41 1.49 DisableDNSProviderManagement true Beta 1.50 1.51 DisableDNSProviderManagement true GA 1.52 1.59 DisableDNSProviderManagement Removed 1.60 SecretBindingProviderValidation false Alpha 1.38 1.50 SecretBindingProviderValidation true Beta 1.51 1.52 SecretBindingProviderValidation true GA 1.53 1.54 SecretBindingProviderValidation Removed 1.55 SeedKubeScheduler false Alpha 1.15 1.54 SeedKubeScheduler false Deprecated 1.55 1.60 SeedKubeScheduler Removed 1.61 ShootCARotation false Alpha 1.42 1.50 ShootCARotation true Beta 1.51 1.56 ShootCARotation true GA 1.57 1.59 ShootCARotation Removed 1.60 ShootSARotation false Alpha 1.48 1.50 ShootSARotation true Beta 1.51 1.56 ShootSARotation true GA 1.57 1.59 ShootSARotation Removed 1.60 ReversedVPN false Alpha 1.22 1.41 ReversedVPN true Beta 1.42 1.62 ReversedVPN true GA 1.63 1.69 ReversedVPN Removed 1.70 ForceRestore Removed 1.66 SeedChange false Alpha 1.12 1.52 SeedChange true Beta 1.53 1.68 SeedChange true GA 1.69 1.72 SeedChange Removed 1.73 CopyEtcdBackupsDuringControlPlaneMigration false Alpha 1.37 1.52 CopyEtcdBackupsDuringControlPlaneMigration true Beta 1.53 1.68 CopyEtcdBackupsDuringControlPlaneMigration true GA 1.69 1.72 
CopyEtcdBackupsDuringControlPlaneMigration Removed 1.73 ManagedIstio false Alpha 1.5 1.18 ManagedIstio true Beta 1.19 ManagedIstio true Deprecated 1.48 1.69 ManagedIstio Removed 1.70 APIServerSNI false Alpha 1.7 1.18 APIServerSNI true Beta 1.19 APIServerSNI true Deprecated 1.48 1.72 APIServerSNI Removed 1.73 HAControlPlanes false Alpha 1.49 1.70 HAControlPlanes true Beta 1.71 1.72 HAControlPlanes true GA 1.73 1.73 HAControlPlanes Removed 1.74 FullNetworkPoliciesInRuntimeCluster false Alpha 1.66 1.70 FullNetworkPoliciesInRuntimeCluster true Beta 1.71 1.72 FullNetworkPoliciesInRuntimeCluster true GA 1.73 1.73 FullNetworkPoliciesInRuntimeCluster Removed 1.74 DisableScalingClassesForShoots false Alpha 1.73 1.78 DisableScalingClassesForShoots true Beta 1.79 1.80 DisableScalingClassesForShoots true GA 1.81 1.81 DisableScalingClassesForShoots Removed 1.82 ContainerdRegistryHostsDir false Alpha 1.77 1.85 ContainerdRegistryHostsDir true Beta 1.86 1.86 ContainerdRegistryHostsDir true GA 1.87 1.87 ContainerdRegistryHostsDir Removed 1.88 WorkerlessShoots false Alpha 1.70 1.78 WorkerlessShoots true Beta 1.79 1.85 WorkerlessShoots true GA 1.86 WorkerlessShoots Removed 1.88 MachineControllerManagerDeployment false Alpha 1.73 MachineControllerManagerDeployment true Beta 1.81 1.81 MachineControllerManagerDeployment true GA 1.82 1.91 MachineControllerManagerDeployment Removed 1.92 APIServerFastRollout true Beta 1.82 1.89 APIServerFastRollout true GA 1.90 1.91 APIServerFastRollout Removed 1.92 UseGardenerNodeAgent false Alpha 1.82 1.88 UseGardenerNodeAgent true Beta 1.89 UseGardenerNodeAgent true GA 1.90 1.91 UseGardenerNodeAgent Removed 1.92 CoreDNSQueryRewriting false Alpha 1.55 1.95 CoreDNSQueryRewriting true Beta 1.96 1.96 CoreDNSQueryRewriting true GA 1.97 1.100 CoreDNSQueryRewriting Removed 1.101 MutableShootSpecNetworkingNodes false Alpha 1.64 1.95 MutableShootSpecNetworkingNodes true Beta 1.96 1.96 MutableShootSpecNetworkingNodes true GA 1.97 1.100 
MutableShootSpecNetworkingNodes Removed 1.101 Using a Feature A feature can be in Alpha, Beta or GA stage. An Alpha feature means:\n Disabled by default. Might be buggy. Enabling the feature may expose bugs. Support for feature may be dropped at any time without notice. The API may change in incompatible ways in a later software release without notice. Recommended for use only in short-lived testing clusters, due to increased risk of bugs and lack of long-term support. A Beta feature means:\n Enabled by default. The feature is well tested. Enabling the feature is considered safe. Support for the overall feature will not be dropped, though details may change. The schema and/or semantics of objects may change in incompatible ways in a subsequent beta or stable release. When this happens, we will provide instructions for migrating to the next version. This may require deleting, editing, and re-creating API objects. The editing process may require some thought. This may require downtime for applications that rely on the feature. Recommended for only non-critical uses because of potential for incompatible changes in subsequent releases. Please do try Beta features and give feedback on them! After they exit beta, it may not be practical for us to make more changes.\n A General Availability (GA) feature is also referred to as a stable feature. It means:\n The feature is always enabled; you cannot disable it. The corresponding feature gate is no longer needed. Stable versions of features will appear in released software for many subsequent versions. List of Feature Gates Feature Relevant Components Description HVPA gardenlet, gardener-operator Enables simultaneous horizontal and vertical scaling in garden or seed clusters. HVPAForShootedSeed gardenlet Enables simultaneous horizontal and vertical scaling in managed seed (aka “shooted seed”) clusters. 
DefaultSeccompProfile gardenlet, gardener-operator Enables the defaulting of the seccomp profile for Gardener-managed workloads in the garden or seed to RuntimeDefault. IPv6SingleStack gardener-apiserver, gardenlet Allows creating seed and shoot clusters with IPv6 single-stack networking enabled in their spec (GEP-21). If enabled in gardenlet, the default behavior is unchanged, but setting ipFamilies=[IPv6] in the seedConfig is allowed. The gardenlet behaves differently only if the ipFamilies setting is changed. ShootForceDeletion gardener-apiserver Allows forceful deletion of Shoots by annotating them with the confirmation.gardener.cloud/force-deletion annotation. UseNamespacedCloudProfile gardener-apiserver Enables usage of NamespacedCloudProfiles in Shoots. ShootManagedIssuer gardenlet Enables the shoot managed issuer functionality described in GEP 24. VPAForETCD gardenlet, gardener-operator Enables VPA for etcd-main and etcd-events, regardless of HVPA enablement. VPAAndHPAForAPIServer gardenlet, gardener-operator Enables an autoscaling mechanism for kube-apiserver of shoot or virtual garden clusters, and the gardener-apiserver. They are scaled simultaneously by VPA and HPA on the same metric (CPU and memory usage). The pod-thrashing cycle between VPA and HPA scaling on the same metric is avoided by configuring the HPA to scale on average usage (not on average utilization) and by picking the target average utilization values in sync with VPA’s allowed maximums. The feature gate takes precedence over the HVPA feature gate when they are both enabled. ShootCredentialsBinding gardener-apiserver Enables usage of CredentialsBindingName in Shoots. NewWorkerPoolHash gardenlet Enables usage of the new worker pool hash calculation. The new calculation supports rolling worker pools if kubeReserved, systemReserved, evictionHard or cpuManagerPolicy in the kubelet configuration are changed. All provider extensions must be upgraded to support this feature first. 
Existing worker pools are not immediately migrated to the new hash variant, since this would trigger the replacement of all nodes. The migration happens when a rolling update is triggered according to the old or new hash version calculation. NewVPN gardenlet Enables usage of the new implementation of the VPN (go rewrite) using an IPv6 transfer network. ","categories":"","description":"","excerpt":"Feature Gates in Gardener This page contains an overview of the …","ref":"/docs/gardener/deployment/feature_gates/","tags":"","title":"Feature Gates"},{"body":"Feature Gates in Etcd-Druid This page contains an overview of the various feature gates an administrator can specify on etcd-druid.\nOverview Feature gates are a set of key=value pairs that describe etcd-druid features. You can turn these features on or off by passing them to the --feature-gates CLI flag in the etcd-druid command.\nThe following tables are a summary of the feature gates that you can set on etcd-druid.\n The “Since” column contains the etcd-druid release when a feature is introduced or its release stage is changed. The “Until” column, if not empty, contains the last etcd-druid release in which you can still use a feature gate. If a feature is in the Alpha or Beta state, you can find the feature listed in the Alpha/Beta feature gate table. If a feature is stable you can find all stages for that feature listed in the Graduated/Deprecated feature gate table. The Graduated/Deprecated feature gate table also lists deprecated and withdrawn features. Feature Gates for Alpha or Beta Features Feature Default Stage Since Until UseEtcdWrapper false Alpha 0.19 0.21 UseEtcdWrapper true Beta 0.22 Feature Gates for Graduated or Deprecated Features Feature Default Stage Since Until Using a Feature A feature can be in Alpha, Beta or GA stage. An Alpha feature means:\n Disabled by default. Might be buggy. Enabling the feature may expose bugs. Support for feature may be dropped at any time without notice. 
The API may change in incompatible ways in a later software release without notice. Recommended for use only in short-lived testing clusters, due to increased risk of bugs and lack of long-term support. A Beta feature means:\n Enabled by default. The feature is well tested. Enabling the feature is considered safe. Support for the overall feature will not be dropped, though details may change. The schema and/or semantics of objects may change in incompatible ways in a subsequent beta or stable release. When this happens, we will provide instructions for migrating to the next version. This may require deleting, editing, and re-creating API objects. The editing process may require some thought. This may require downtime for applications that rely on the feature. Recommended for only non-critical uses because of potential for incompatible changes in subsequent releases. Please do try Beta features and give feedback on them! After they exit beta, it may not be practical for us to make more changes.\n A General Availability (GA) feature is also referred to as a stable feature. It means:\n The feature is always enabled; you cannot disable it. The corresponding feature gate is no longer needed. Stable versions of features will appear in released software for many subsequent versions. List of Feature Gates Feature Description UseEtcdWrapper Enables the use of etcd-wrapper image and a compatible version of etcd-backup-restore, along with component-specific configuration changes necessary for the usage of the etcd-wrapper image. ","categories":"","description":"","excerpt":"Feature Gates in Etcd-Druid This page contains an overview of the …","ref":"/docs/other-components/etcd-druid/deployment/feature-gates/","tags":"","title":"Feature Gates in Etcd-Druid"},{"body":"Reasoning Custom Resource Definition (CRD) is what you use to define a Custom Resource. 
This is a powerful way to extend Kubernetes capabilities beyond the default installation, adding any kind of API objects useful for your application.\nThe CustomResourceDefinition API provides a workflow for introducing and upgrading to new versions of a CustomResourceDefinition. In a scenario where a CRD adds support for a new version and switches its spec.versions.storage field to it (e.g., from v1beta1 to v1), existing objects are not migrated in etcd. For more information, see Versions in CustomResourceDefinitions.\nThis creates a mismatch between the requested and stored version for all clients (kubectl, KCM, etc.). When the CRD also declares the usage of a conversion webhook, it gets called whenever a client requests information about a resource that still exists in the old version. If the CRD is created by the end-user, the webhook runs on the shoot side, whereas controllers and kube-apiservers run separately, as part of the control plane. For the webhook to be reachable, a working VPN connection seed -\u003e shoot is essential. In scenarios where the VPN connection is broken, the kube-controller-manager eventually stops its garbage collection, as that requires it to list v1.PartialObjectMetadata for everything to build a dependency graph. Without the kube-controller-manager’s garbage collector, managed resources get stuck during update/rollout.\nBreaking Situations When a user upgrades to failureTolerance: node|zone, the VPN deployments are replaced by statefulsets. However, as the VPN connection is broken upon teardown of the deployment, garbage collection will fail, leading to a situation that is stuck until an operator resolves it manually.\nSuch a situation can be avoided if the end-user has correctly configured CRDs containing conversion webhooks.\nChecking Problematic CRDs To make sure there are no CRDs with version problems, please run the script below in your shoot. 
It will return the names of the CRDs that have one of the following two problems:\n The returned version of the CR is different from what is maintained in the status.storedVersions field of the CRD. The status.storedVersions field of the CRD has more than 1 version defined. #!/bin/bash set -e -o pipefail echo \"Checking all CRDs in the cluster...\" for p in $(kubectl get crd | awk 'NR\u003e1' | awk '{print $1}'); do strategy=$(kubectl get crd \"$p\" -o json | jq -r .spec.conversion.strategy) if [ \"$strategy\" == \"Webhook\" ]; then crd_name=$(kubectl get crd \"$p\" -o json | jq -r .metadata.name) number_of_stored_versions=$(kubectl get crd \"$crd_name\" -o json | jq '.status.storedVersions | length') if [[ \"$number_of_stored_versions\" == 1 ]]; then returned_cr_version=$(kubectl get \"$crd_name\" -A -o json | jq -r '.items[] | .apiVersion' | sed 's:.*/::') if [ -z \"$returned_cr_version\" ]; then continue else variable=$(echo \"$returned_cr_version\" | xargs -n1 | sort -u | xargs) present_version=$(kubectl get crd \"$crd_name\" -o json | jq -cr '.status.storedVersions |.[]') if [[ $variable != \"$present_version\" ]]; then echo \"ERROR: Stored version differs from the version that CRs are being returned. $crd_name with conversion webhook needs to be fixed\" fi fi fi if [[ \"$number_of_stored_versions\" -gt 1 ]]; then returned_cr_version=$(kubectl get \"$crd_name\" -A -o json | jq -r '.items[] | .apiVersion' | sed 's:.*/::') if [ -z \"$returned_cr_version\" ]; then continue else echo \"ERROR: Too many stored versions defined. $crd_name with conversion webhook needs to be fixed\" fi fi fi done echo \"Problematic CRDs are reported above.\" Resolve CRDs The following steps fix the CRDs reported by the script above.\nInspect all your CRDs that have conversion webhooks in place. 
If a CRD has more than 1 version defined in its status.storedVersions field, then initiate migration as described in Option 2 in the Upgrade existing objects to a new stored version guide.\nFor convenience, we have provided the necessary steps below.\nNote Please test the following steps on a non-production landscape to make sure that the new CR version doesn’t break any of your existing workloads. Please check/set the old CR version to storage:false and set the new CR version to storage:true.\nFor the sake of an example, let’s consider the two versions v1beta1 (old) and v1 (new).\nBefore:\nspec: versions: - name: v1beta1 ...... storage: true - name: v1 ...... storage: false After:\nspec: versions: - name: v1beta1 ...... storage: false - name: v1 ...... storage: true Convert custom resources to the newest version.\nkubectl get \u003ccustom-resource-name\u003e -A -ojson | kubectl apply -f - Patch the CRD to keep only the latest version under storedVersions.\nkubectl patch customresourcedefinitions \u003ccrd-name\u003e --subresource='status' --type='merge' -p '{\"status\":{\"storedVersions\":[\"your-latest-cr-version\"]}}' ","categories":"","description":"","excerpt":"Reasoning Custom Resource Definition (CRD) is what you use to define a …","ref":"/docs/guides/administer-shoots/conversion-webhook/","tags":"","title":"Fix Problematic Conversion Webhooks"},{"body":"Force Deletion From v1.81, Gardener supports Shoot Force Deletion. All extension controllers should also properly support it. This document outlines some important points that extension maintainers should keep in mind to support force deletion in their extensions.\nOverall Principles The following principles should always be upheld:\n All resources pertaining to the extension and managed by it should be appropriately handled and cleaned up by the extension when force deletion is initiated. 
Implementation Details ForceDelete Actuator Methods Most extension controller implementations follow a common pattern where a generic Reconciler implementation delegates to an Actuator interface that contains the methods Reconcile, Delete, Migrate and Restore provided by the extension. A new method, ForceDelete, has been added to all such Actuator interfaces; see the infrastructure Actuator interface as an example. The generic reconcilers call this method if the Shoot has the annotation confirmation.gardener.cloud/force-deletion=true. Thus, it should be implemented by the extension controller to forcefully delete resources if it is not possible to delete them gracefully. If graceful deletion is possible, then the ForceDelete method can simply call the Delete method.\nExtension Controllers Based on Generic Actuators In practice, the implementation of many extension controllers (for example, the controlplane and worker controllers in most provider extensions) is based on a generic Actuator implementation that only delegates to extension methods for behavior that is truly provider-specific. In all such cases, the ForceDelete method has already been implemented with a method that should suit most of the extensions. If it doesn’t suit your extension, then the ForceDelete method needs to be overridden; see the Azure controlplane controller as an example.\nExtension Controllers Not Based on Generic Actuators The implementation of some extension controllers (for example, the infrastructure controllers in all provider extensions) is not based on a generic Actuator implementation. Such extension controllers must always provide a proper implementation of the ForceDelete method according to the above guidelines; see the AWS infrastructure controller as an example. 
In practice, this might result in code duplication between the different extensions, since the ForceDelete code is usually not OS-specific.\nSome General Implementation Examples If the extension deploys only resources in the shoot cluster not backed by infrastructure in third-party systems, then performing the regular deletion code (actuator.Delete) will suffice in the majority of cases. (e.g - https://github.com/gardener/gardener-extension-shoot-networking-filter/blob/1d95a483d803874e8aa3b1de89431e221a7d574e/pkg/controller/lifecycle/actuator.go#L175-L178) If the extension deploys resources which are backed by infrastructure in third-party systems: If the resource is in the Seed cluster, the extension should remove the finalizers and delete the resource. This is needed especially if the resource is a custom resource since gardenlet will not be aware of this resource and cannot take action. If the resource is in the Shoot and if it’s deployed by a ManagedResource, then gardenlet will take care to forcefully delete it in a later step of force-deletion. If the resource is not deployed via a ManagedResource, then it wouldn’t block the deletion flow anyway since it is in the Shoot cluster. In both cases, the extension controller can ignore the resource and return nil. ","categories":"","description":"","excerpt":"Force Deletion From v1.81, Gardener supports Shoot Force Deletion. All …","ref":"/docs/gardener/extensions/force-deletion/","tags":"","title":"Force Deletion"},{"body":"This page gives writing formatting guidelines for the Gardener documentation. For style guidelines, see the Style Guide.\nThese are guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request.\n Formatting of Inline Elements Code Snippet Formatting Related Links Formatting of Inline Elements Type of Text Formatting Markdown Syntax API Objects and Technical Components code Deploy a `Pod`. New Terms and Emphasis bold Do **not** stop it. 
Technical Names code Open file `root.yaml`. User Interface Elements italics Choose *CLUSTERS*. Inline Code and Inline Commands code For declarative management, use `kubectl apply`. Object Field Names and Field Values code Set the value of `image` to `nginx:1.8`. Links and References link Visit the [Gardener website](https://gardener.cloud/) Headers various # API Server API Objects and Technical Components When you refer to an API object, use the same uppercase and lowercase letters that are used in the actual object name, and use backticks (`) to format them. Typically, the names of API objects use camel case.\nDon’t split the API object name into separate words. For example, use PodTemplateList, not Pod Template List.\nRefer to API objects without saying “object,” unless omitting “object” leads to an awkward construction.\n Do Don’t The Pod has two containers. The pod has two containers. The Deployment is responsible for… The Deployment object is responsible for… A PodList is a list of Pods. A Pod List is a list of pods. The gardener-control-manager has control loops… The gardener-control-manager has control loops… The gardenlet starts up with a bootstrap kubeconfig having a bootstrap token that allows to create CertificateSigningRequest (CSR) resources. The gardenlet starts up with a bootstrap kubeconfig having a bootstrap token that allows to create CertificateSigningRequest (CSR) resources. Note Due to the way the website is built from content taken from different repositories, when editing or updating already existing documentation, you should follow the style used in the topic. When contributing new documentation, follow the guidelines outlined in this guide. New Terms and Emphasis Use bold to emphasize something or to introduce a new term.\n Do Don’t A cluster is a set of nodes … A “cluster” is a set of nodes … The system does not delete your objects. The system does not(!) delete your objects. 
Technical Names Use backticks (`) for filenames, technical components, directories, and paths.\n Do Don’t Open file envars.yaml. Open the envars.yaml file. Go to directory /docs/tutorials. Go to the /docs/tutorials directory. Open file /_data/concepts.yaml. Open the /_data/concepts.yaml file. User Interface Elements When referring to UI elements, refrain from using verbs like “Click” or “Select with right mouse button”. This level of detail is hardly ever needed and also invalidates a procedure if other devices are used. For example, for a tablet you’d say “Tap on”.\nUse italics when you refer to UI elements.\n UI Element Standard Formulation Markdown Syntax Button, Menu path Choose UI Element. Choose *UI Element*. Menu path, context menu, navigation path Choose System \u003e User Profile \u003e Own Data. Choose *System* \\\u003e *User Profile* \\\u003e *Own Data*. Entry fields Enter your password. Enter your password. Checkbox, radio button Select Filter. Select *Filter*. Expandable screen elements Expand User Settings.\nCollapse User Settings. Expand *User Settings*.\nCollapse *User Settings*. Inline Code and Inline Commands Use backticks (`) for inline code.\n Do Don’t The kubectl run command creates a Deployment. The “kubectl run” command creates a Deployment. For declarative management, use kubectl apply. For declarative management, use “kubectl apply”. Object Field Names and Field Values Use backticks (`) for field names, and field values.\n Do Don’t Set the value of the replicas field in the configuration file. Set the value of the “replicas” field in the configuration file. The value of the exec field is an ExecAction object. The value of the “exec” field is an ExecAction object. Set the value of imagePullPolicy to Always. Set the value of imagePullPolicy to “Always”. Set the value of image to nginx:1.8. Set the value of image to nginx:1.8. 
Links and References Do Don’t Use a descriptor of the link’s destination: “For more information, visit Gardener’s website.” Use a generic placeholder: “For more information, go here.” Use relative links when linking to content in the same repository: [Style Guide](../style-guide/_index.md) Use absolute links when linking to content in the same repository: [Style Guide](https://github.com/gardener/documentation/blob/master/website/documentation/contribute/documentation/style-guide/_index.md) Another thing to keep in mind is that markdown links do not work in certain shortcodes (e.g., mermaid). To circumvent this problem, you can use HTML links.\nHeaders Use H1 for the title of the topic. (# H1 Title) Use H2 for each main section. (## H2 Title) Use H3 for any sub-section in the main sections. (### H3 Title) Avoid using H4-H6. Try moving the additional information to a new topic instead. Code Snippet Formatting Don’t Include the Command Prompt Do Don’t kubectl get pods $ kubectl get pods Separate Commands from Output Verify that the pod is running on your chosen node: kubectl get pods --output=wide The output is similar to:\nNAME READY STATUS RESTARTS AGE IP NODE nginx 1/1 Running 0 13s 10.200.0.4 worker0 Placeholders Use angle brackets for placeholders. Tell the reader what a placeholder represents, for example:\n Display information about a pod:\nkubectl describe pod \u003cpod-name\u003e \u003cpod-name\u003e is the name of one of your pods.\n Versioning Kubernetes Examples Make code examples and configuration examples that include version information consistent with the accompanying text. 
Identify the Kubernetes version in the Prerequisites section.\nRelated Links Style Guide Contributors Guide ","categories":"","description":"","excerpt":"This page gives writing formatting guidelines for the Gardener …","ref":"/docs/contribute/documentation/formatting-guide/","tags":"","title":"Formatting Guide"},{"body":"Gardener Extension for Garden Linux OS \nThis controller operates on the OperatingSystemConfig resource in the extensions.gardener.cloud/v1alpha1 API group.\nIt manages those objects that are requesting…\n Garden Linux OS configuration (.spec.type=gardenlinux):\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: gardenlinux units: ... files: ... Please find a concrete example in the example folder.\n MemoryOne on Garden Linux configuration (spec.type=memoryone-gardenlinux):\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: memoryone-gardenlinux units: ... files: ... providerConfig: apiVersion: memoryone-gardenlinux.os.extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfiguration memoryTopology: \"2\" systemMemory: \"6x\" Please find a concrete example in the example folder.\n After reconciliation the resulting data will be stored in a secret within the same namespace (as the config itself might contain confidential data). The name of the secret will be written into the resource’s .status field:\n... status: ... 
cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default command: /usr/bin/env bash \u003cpath\u003e units: - docker-monitor.service - kubelet-monitor.service - kubelet.service The secret has one data key cloud_config that stores the generation.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for the Garden Linux operating system","excerpt":"Gardener extension controller for the Garden Linux operating system","ref":"/docs/extensions/os-extensions/gardener-extension-os-gardenlinux/","tags":"","title":"Garden Linux OS"},{"body":"Overview While the Gardener API server works with admission plugins to validate and mutate resources belonging to Gardener related API groups, e.g. 
core.gardener.cloud, the same is needed for resources belonging to non-Gardener API groups as well, e.g. secrets in the core API group. Therefore, the Gardener Admission Controller runs a http(s) server with the following handlers which serve as validating/mutating endpoints for admission webhooks. It is also used to serve http(s) handlers for authorization webhooks.\nAdmission Webhook Handlers This section describes the admission webhook handlers that are currently served.\nAdmission Plugin Secret Validator In Shoot, AdmissionPlugin can have reference to other files. This validation handler validates the referred admission plugin secret and ensures that the secret always contains the required data kubeconfig.\nKubeconfig Secret Validator Malicious Kubeconfigs applied by end users may cause a leakage of sensitive data. This handler checks if the incoming request contains a Kubernetes secret with a .data.kubeconfig field and denies the request if the Kubeconfig structure violates Gardener’s security standards.\nNamespace Validator Namespaces are the backing entities of Gardener projects in which shoot cluster objects reside. This validation handler protects active namespaces against premature deletion requests. Therefore, it denies deletion requests if a namespace still contains shoot clusters or if it belongs to a non-deleting Gardener project (without .metadata.deletionTimestamp).\nResource Size Validator Since users directly apply Kubernetes native objects to the Garden cluster, it also involves the risk of being vulnerable to DoS attacks because these resources are continuously watched and read by controllers. One example is the creation of shoot resources with large annotation values (up to 256 kB per value), which can cause severe out-of-memory issues for the gardenlet component. 
Vertical autoscaling can help to mitigate such situations, but we cannot expect to scale infinitely, and thus need means to block the attack itself.\nThe Resource Size Validator checks arbitrary incoming admission requests against a configured maximum size for the resource’s group-version-kind combination. It denies the request if the object exceeds the quota.\n [!NOTE] The contents of status subresources and metadata.managedFields are not taken into account for the resource size calculation.\n Example for Gardener Admission Controller configuration:\nserver: resourceAdmissionConfiguration: limits: - apiGroups: [\"core.gardener.cloud\"] apiVersions: [\"*\"] resources: [\"shoots\"] size: 100k - apiGroups: [\"\"] apiVersions: [\"v1\"] resources: [\"secrets\"] size: 100k unrestrictedSubjects: - kind: Group name: gardener.cloud:system:seeds apiGroup: rbac.authorization.k8s.io # - kind: User # name: admin # apiGroup: rbac.authorization.k8s.io # - kind: ServiceAccount # name: \"*\" # namespace: garden # apiGroup: \"\" operationMode: block #log With the configuration above, the Resource Size Validator denies requests for shoots with Gardener’s core API group which exceed a size of 100 kB. The same is done for Kubernetes secrets.\nAs this feature is meant to protect the system from malicious requests sent by users, it is recommended to exclude trusted groups, users or service accounts from the size restriction via resourceAdmissionConfiguration.unrestrictedSubjects. For example, the backing user for the gardenlet should always be capable of changing the shoot resource instead of being blocked due to size restrictions. This is because the gardenlet itself occasionally changes the shoot specification, labels or annotations, and might violate the quota if the existing resource is already close to the quota boundary. Also, operators are supposed to be trusted users and subjecting them to a size limitation can inhibit important operational tasks. 
Wildcard (\"*\") in subject name is supported.\nSize limitations depend on the individual Gardener setup and choosing the wrong values can affect the availability of your Gardener service. resourceAdmissionConfiguration.operationMode allows to control if a violating request is actually denied (default) or only logged. It’s recommended to start with log, check the logs for exceeding requests, adjust the limits if necessary and finally switch to block.\nSeedRestriction Please refer to Scoped API Access for Gardenlets for more information.\nAuthorization Webhook Handlers This section describes the authorization webhook handlers that are currently served.\nSeedAuthorization Please refer to Scoped API Access for Gardenlets for more information.\n","categories":"","description":"Functions and list of handlers for the Gardener Admission Controller","excerpt":"Functions and list of handlers for the Gardener Admission Controller","ref":"/docs/gardener/concepts/admission-controller/","tags":"","title":"Gardener Admission Controller"},{"body":"Overview The Gardener API server is a Kubernetes-native extension based on its aggregation layer. It is registered via an APIService object and designed to run inside a Kubernetes cluster whose API it wants to extend.\nAfter registration, it exposes the following resources:\nCloudProfiles CloudProfiles are resources that describe a specific environment of an underlying infrastructure provider, e.g. AWS, Azure, etc. Each shoot has to reference a CloudProfile to declare the environment it should be created in. In a CloudProfile, the gardener operator specifies certain constraints like available machine types, regions, which Kubernetes versions they want to offer, etc. End-users can read CloudProfiles to see these values, but only operators can change the content or create/delete them. 
When a shoot is created or updated, then an admission plugin checks that only allowed values are used via the referenced CloudProfile.\nAdditionally, a CloudProfile may contain a providerConfig, which is a special configuration dedicated for the infrastructure provider. Gardener does not evaluate or understand this config, but extension controllers might need it for declaration of provider-specific constraints, or global settings.\nPlease see this example manifest and consult the documentation of your provider extension controller to get information about its providerConfig.\nNamespacedCloudProfiles In addition to CloudProfiles, NamespacedCloudProfiles exist to enable project level CloudProfiles. Please view GEP-25 for additional information. This feature is currently under development and not ready for productive use. At the moment, only the necessary APIs and validations exist to allow for extensions to adapt to the new NamespacedCloudProfile resource.\nWhen a shoot is created or updated, the cloudprofile reference can be set to point to a directly descendant NamespacedCloudProfile. Updates from one CloudProfile to another CloudProfile or from one NamespacedCloudProfile to another NamespacedCloudProfile or even to another CloudProfile are not allowed.\nProject viewers have the permission to see NamespacedCloudProfiles associated with a particular project. Project members can create, edit or delete NamespacedCloudProfiles, except for the special fields .spec.kubernetes and .spec.machineImages. In order to make changes to these special fields, a user needs to be granted the custom RBAC verbs modify-spec-kubernetes and modify-spec-machineimages respectively, which is typically only granted to landscape operators.\nInternalSecrets End-users can read and/or write Secrets in their project namespaces in the garden cluster. This prevents Gardener components from storing such “Gardener-internal” secrets in the respective project namespace. 
InternalSecrets are resources that contain shoot or project-related secrets that are “Gardener-internal”, i.e., secrets used and managed by the system that end-users don’t have access to. InternalSecrets are defined like plain Kubernetes Secrets, behave exactly like them, and can be used in the same manners. The only difference is, that the InternalSecret resource is a dedicated API resource (exposed by gardener-apiserver). This allows separating access to “normal” secrets and internal secrets by the usual RBAC means.\nGardener uses an InternalSecret per Shoot for syncing the client CA to the project namespace in the garden cluster (named \u003cshoot-name\u003e.ca-client). The shoots/adminkubeconfig subresource signs short-lived client certificates by retrieving the CA from the InternalSecret.\nOperators should configure gardener-apiserver to encrypt the internalsecrets.core.gardener.cloud resource in etcd.\nPlease see this example manifest.\nSeeds Seeds are resources that represent seed clusters. Gardener does not care about how a seed cluster got created - the only requirement is that it is of at least Kubernetes v1.25 and passes the Kubernetes conformance tests. The Gardener operator has to either deploy the gardenlet into the cluster they want to use as seed (recommended, then the gardenlet will create the Seed object itself after bootstrapping) or provide the kubeconfig to the cluster inside a secret (that is referenced by the Seed resource) and create the Seed resource themselves.\nPlease see this, this, and optionally this example manifests.\nShoot Quotas To allow end-users not having their dedicated infrastructure account to try out Gardener, the operator can register an account owned by them that they allow to be used for trial clusters. Trial clusters can be put under quota so that they don’t consume too many resources (resulting in costs) and that one user cannot consume all resources on their own. 
These clusters are automatically terminated after a specified time, but end-users may extend the lifetime manually if needed.\nPlease see this example manifest.\nProjects The first thing before creating a shoot cluster is to create a Project. A project is used to group multiple shoot clusters together. End-users can invite colleagues to the project to enable collaboration, and they can either make them admin or viewer. After an end-user has created a project, they will get a dedicated namespace in the garden cluster for all their shoots.\nPlease see this example manifest.\nSecretBindings Now that the end-user has a namespace, the next step is registering their infrastructure provider account.\nPlease see this example manifest and consult the documentation of the extension controller for the respective infrastructure provider to get information about which keys are required in this secret.\nAfter the secret has been created, the end-user has to create a special SecretBinding resource that binds this secret. Later, when creating shoot clusters, they will reference such a binding.\nPlease see this example manifest.\nShoots Shoot clusters contain various settings that influence what end-user Kubernetes clusters will look like in the end. As Gardener heavily relies on extension controllers for operating system configuration, networking, and infrastructure specifics, the end-user has the possibility (and responsibility) to provide these provider-specific configurations as well. 
Such configurations are not evaluated by Gardener (because it doesn’t know/understand them), but they are only transported to the respective extension controller.\n⚠️ This means that any configuration issues/mistake on the end-user side that relates to a provider-specific flag or setting cannot be caught during the update request itself but only later during the reconciliation (unless a validator webhook has been registered in the garden cluster by an operator).\nPlease see this example manifest and consult the documentation of the provider extension controller to get information about its spec.provider.controlPlaneConfig, .spec.provider.infrastructureConfig, and .spec.provider.workers[].providerConfig.\n(Cluster)OpenIDConnectPresets Please see this separate documentation file.\nOverview Data Model ","categories":["Users"],"description":"Understand the Gardener API server extension and the resources it exposes","excerpt":"Understand the Gardener API server extension and the resources it …","ref":"/docs/gardener/concepts/apiserver/","tags":"","title":"Gardener API Server"},{"body":"Overview The gardener-controller-manager (often referred to as “GCM”) is a component that runs next to the Gardener API server, similar to the Kubernetes Controller Manager. It runs several controllers that do not require talking to any seed or shoot cluster. Also, as of today, it exposes an HTTP server that is serving several health check endpoints and metrics.\nThis document explains the various functionalities of the gardener-controller-manager and their purpose.\nControllers Bastion Controller Bastion resources have a limited lifetime which can be extended up to a certain amount by performing a heartbeat on them. The Bastion controller is responsible for deleting expired or rotten Bastions.\n “expired” means a Bastion has exceeded its status.expirationTimestamp. “rotten” means a Bastion is older than the configured maxLifetime. 
The maxLifetime defaults to 24 hours and is an option in the BastionControllerConfiguration which is part of gardener-controller-manager’s ControllerManagerControllerConfiguration, see the example config file for details.\nThe controller also deletes Bastions in case the referenced Shoot:\n no longer exists is marked for deletion (i.e., has a non-nil .metadata.deletionTimestamp) was migrated to another seed (i.e., Shoot.spec.seedName is different than Bastion.spec.seedName). The deletion of Bastions triggers the gardenlet to perform the necessary cleanups in the Seed cluster, so some time can pass between deletion and the Bastion actually disappearing. Clients like gardenctl are advised to not re-use Bastions whose deletion timestamp has been set already.\nRefer to GEP-15 for more information on the lifecycle of Bastion resources.\nCertificateSigningRequest Controller After the gardenlet gets deployed on the Seed cluster, it needs to establish itself as a trusted party to communicate with the Gardener API server. It runs through a bootstrap flow similar to the kubelet bootstrap process.\nOn startup, the gardenlet uses a kubeconfig with a bootstrap token which authenticates it as being part of the system:bootstrappers group. This kubeconfig is used to create a CertificateSigningRequest (CSR) against the Gardener API server.\nThe controller in gardener-controller-manager checks whether the CertificateSigningRequest has the expected organization, common name and usages which the gardenlet would request.\nIt only auto-approves the CSR if the client making the request is allowed to “create” the certificatesigningrequests/seedclient subresource. Clients with the system:bootstrappers group are bound to the gardener.cloud:system:seed-bootstrapper ClusterRole, hence, they have such privileges. 
As the bootstrap kubeconfig for the gardenlet contains a bootstrap token which is authenticated as being part of the system:bootstrappers group, its created CSR gets auto-approved.\nCloudProfile Controller CloudProfiles are essential when it comes to reconciling Shoots since they contain constraints (like valid machine types, Kubernetes versions, or machine images) and sometimes also some global configuration for the respective environment (typically via provider-specific configuration in .spec.providerConfig).\nConsequently, to ensure that CloudProfiles in-use are always present in the system until the last referring Shoot or NamespacedCloudProfile gets deleted, the controller adds a finalizer which is only released when there is no Shoot or NamespacedCloudProfile referencing the CloudProfile anymore.\nNamespacedCloudProfile Controller NamespacedCloudProfiles provide a project-scoped extension to CloudProfiles, allowing for adjustments of a parent CloudProfile (e.g. by overriding expiration dates of Kubernetes versions or machine images). This allows for modifications without global project visibility. Like CloudProfiles do in their spec, NamespacedCloudProfiles also expose the resulting Shoot constraints as a CloudProfileSpec in their status.\nThe controller ensures that NamespacedCloudProfiles in-use remain present in the system until the last referring Shoot is deleted by adding a finalizer that is only released when there is no Shoot referencing the NamespacedCloudProfile anymore.\nControllerDeployment Controller Extensions are registered in the garden cluster via ControllerRegistration and the deployment of the respective extensions is specified via ControllerDeployment. For more info refer to Registering Extension Controllers.\nThis controller ensures that a ControllerDeployment in-use always exists until the last ControllerRegistration referencing it gets deleted. 
The controller adds a finalizer which is only released when there is no ControllerRegistration referencing the ControllerDeployment anymore.\nControllerRegistration Controller The ControllerRegistration controller makes sure that the required Gardener Extensions specified by the ControllerRegistration resources are present in the seed clusters. It also takes care of the creation and deletion of ControllerInstallation objects for a given seed cluster. The controller has three reconciliation loops.\n“Main” Reconciler This reconciliation loop watches the Seed objects and determines which ControllerRegistrations are required for them and reconciles the corresponding ControllerInstallation resources to reach the determined state. To begin with, it computes the kind/type combinations of extensions required for the seed. For this, the controller examines a live list of ControllerRegistrations, ControllerInstallations, BackupBuckets, BackupEntrys, Shoots, and Secrets from the garden cluster. For example, it examines the shoots running on the seed and deducts the kind/type, like Infrastructure/gcp. The seed (seed.spec.provider.type) and DNS (seed.spec.dns.provider.type) provider types are considered when calculating the list of required ControllerRegistrations, as well. It also decides whether they should always be deployed based on the .spec.deployment.policy. For the configuration options, please see this section.\nBased on these required combinations, each of them are mapped to ControllerRegistration objects and then to their corresponding ControllerInstallation objects (if existing). The controller then creates or updates the required ControllerInstallation objects for the given seed. It also deletes every existing ControllerInstallation whose referenced ControllerRegistration is not part of the required list. 
For example, if the shoots in the seed are no longer using the DNS provider aws-route53, then the controller proceeds to delete the respective ControllerInstallation object.\n\"ControllerRegistration Finalizer\" Reconciler This reconciliation loop watches the ControllerRegistration resource and adds finalizers to it when they are created. In case a deletion request comes in for the resource, i.e., if a .metadata.deletionTimestamp is set, it actively scans for a ControllerInstallation resource using this ControllerRegistration, and decides whether the deletion can be allowed. In case no related ControllerInstallation is present, it removes the finalizer and marks it for deletion.\n\"Seed Finalizer\" Reconciler This loop also watches the Seed object and adds finalizers to it at creation. If a .metadata.deletionTimestamp is set for the seed, then the controller checks for existing ControllerInstallation objects which reference this seed. If no such objects exist, then it removes the finalizer and allows the deletion.\n“Extension ClusterRole” Reconciler This reconciler watches two resources in the garden cluster:\n ClusterRoles labelled with authorization.gardener.cloud/custom-extensions-permissions=true ServiceAccounts in seed namespaces matching the selector provided via the authorization.gardener.cloud/extensions-serviceaccount-selector annotation of such ClusterRoles. Its core task is to maintain a ClusterRoleBinding resource referencing the respective ClusterRole. 
This gets bound to all ServiceAccounts in seed namespaces whose labels match the selector provided via the authorization.gardener.cloud/extensions-serviceaccount-selector annotation of such ClusterRoles.\nYou can read more about the purpose of this reconciler in this document.\nCredentialsBinding Controller CredentialsBindings reference Secrets, WorkloadIdentitys and Quotas and are themselves referenced by Shoots.\nThe controller adds finalizers to the referenced objects to ensure they don’t get deleted while still being referenced. Similarly, to ensure that CredentialsBindings in-use are always present in the system until the last referring Shoot gets deleted, the controller adds a finalizer which is only released when there is no Shoot referencing the CredentialsBinding anymore.\nReferenced Secrets and WorkloadIdentitys will also be labeled with provider.shoot.gardener.cloud/\u003ctype\u003e=true, where \u003ctype\u003e is the value of the .provider.type of the CredentialsBinding. Also, all referenced Secrets and WorkloadIdentitys, as well as Quotas, will be labeled with reference.gardener.cloud/credentialsbinding=true to allow for easily filtering for objects referenced by CredentialsBindings.\nEvent Controller With the Gardener Event Controller, you can prolong the lifespan of events related to Shoot clusters. This is an optional controller which will become active once you provide the below mentioned configuration.\nAll events in K8s are deleted after a configurable time-to-live (controlled via a kube-apiserver argument called --event-ttl (defaulting to 1 hour)). The need to prolong the time-to-live for Shoot cluster events frequently arises when debugging customer issues on live systems. This controller leaves events involving Shoots untouched, while deleting all other events after a configured time. In order to activate it, provide the following configuration:\n concurrentSyncs: The amount of goroutines scheduled for reconciling events. 
ttlNonShootEvents: When an event reaches this time-to-live it gets deleted unless it is a Shoot-related event (defaults to 1h, equivalent to the event-ttl default). ⚠️ In addition, you should also configure the --event-ttl for the kube-apiserver to define an upper-limit of how long Shoot-related events should be stored. The --event-ttl should be larger than the ttlNonShootEvents or this controller will have no effect.\n ExposureClass Controller ExposureClass abstracts the ability to expose a Shoot clusters control plane in certain network environments (e.g. corporate networks, DMZ, internet) on all Seeds or a subset of the Seeds. For more information, see ExposureClasses.\nConsequently, to ensure that ExposureClasses in-use are always present in the system until the last referring Shoot gets deleted, the controller adds a finalizer which is only released when there is no Shoot referencing the ExposureClass anymore.\nManagedSeedSet Controller ManagedSeedSet objects maintain a stable set of replicas of ManagedSeeds, i.e. they guarantee the availability of a specified number of identical ManagedSeeds on an equal number of identical Shoots. The ManagedSeedSet controller creates and deletes ManagedSeeds and Shoots in response to changes to the replicas and selector fields. For more information, refer to the ManagedSeedSet proposal document.\n The reconciler first gets all the replicas of the given ManagedSeedSet in the ManagedSeedSet’s namespace and with the matching selector. Each replica is a struct that contains a ManagedSeed, its corresponding Seed and Shoot objects. Then the pending replica is retrieved, if it exists. Next it determines the ready, postponed, and deletable replicas. A replica is considered ready when a Seed owned by a ManagedSeed has been registered either directly or by deploying gardenlet into a Shoot, the Seed is Ready and the Shoot’s status is Healthy. If a replica is not ready and it is not pending, i.e. 
it is not specified in the ManagedSeedSet’s status.pendingReplica field, then it is added to the postponed replicas. A replica is deletable if it has no scheduled Shoots and the replica’s Shoot and ManagedSeed do not have the seedmanagement.gardener.cloud/protect-from-deletion annotation. Finally, it checks the actual and target replica counts. If the actual count is less than the target count, the controller scales up by creating new replicas to match the desired target count. If the actual count is more than the target count, the controller deletes replicas to match the desired count. Before scale-out or scale-in, the controller first reconciles the pending replica (there can only ever be one) and makes sure the replica is ready before moving on to the next one. Scale-out (actual count \u003c target count) During the scale-out phase, the controller first creates the Shoot object from the ManagedSeedSet’s spec.shootTemplate field and adds the replica to the status.pendingReplica of the ManagedSeedSet. For the subsequent reconciliation steps, the controller makes sure that the pending replica is ready before proceeding to the next replica. Once the Shoot is created successfully, the ManagedSeed object is created from the ManagedSeedSet’s spec.template. The ManagedSeed object is reconciled by the ManagedSeed controller and a Seed object is created for the replica. Once the replica’s Seed becomes ready and the Shoot becomes healthy, the replica also becomes ready. Scale-in (actual count \u003e target count) During the scale-in phase, the controller first determines the replica that can be deleted. From the deletable replicas, it chooses the one with the lowest priority and deletes it. Priority is determined in the following order: First, compare replica statuses. Replicas with “less advanced” status are considered lower priority. For example, a replica with StatusShootReconciling status has a lower value than a replica with StatusShootReconciled status. 
Hence, in this case, a replica with a StatusShootReconciling status will have lower priority and will be considered for deletion. Then, the replicas are compared with the readiness of their Seeds. Replicas with non-ready Seeds are considered lower priority. Then, the replicas are compared with the health statuses of their Shoots. Replicas with “worse” statuses are considered lower priority. Finally, the replica ordinals are compared. Replicas with lower ordinals are considered lower priority. Quota Controller Quota object limits the resources consumed by shoot clusters either per provider secret or per project/namespace.\nConsequently, to ensure that Quotas in-use are always present in the system until the last SecretBinding or CredentialsBinding that references them gets deleted, the controller adds a finalizer which is only released when there is no SecretBinding or CredentialsBinding referencing the Quota anymore.\nProject Controller There are multiple controllers responsible for different aspects of Project objects. Please also refer to the Project documentation.\n“Main” Reconciler This reconciler manages a dedicated Namespace for each Project. The namespace name can either be specified explicitly in .spec.namespace (must be prefixed with garden-) or it will be determined by the controller. If .spec.namespace is set, it tries to create it. If it already exists, it tries to adopt it. This will only succeed if the Namespace was previously labeled with gardener.cloud/role=project and project.gardener.cloud/name=\u003cproject-name\u003e. This is to prevent end-users from being able to adopt arbitrary namespaces and escalate their privileges, e.g. the kube-system namespace.\nAfter the namespace was created/adopted, the controller creates several ClusterRoles and ClusterRoleBindings that allow the project members to access related resources based on their roles. 
These RBAC resources are prefixed with gardener.cloud:system:project{-member,-viewer}:\u003cproject-name\u003e. Gardener administrators and extension developers can define their own roles. For more information, see Extending Project Roles.\nIn addition, operators can configure the Project controller to maintain a default ResourceQuota for project namespaces. Quotas can especially limit the creation of user-facing resources, e.g. Shoots, SecretBindings, CredentialsBindings, Secrets, and thus protect the garden cluster from massive resource exhaustion, but also enable operators to align quotas with respective enterprise policies.\n ⚠️ Gardener itself is not exempted from configured quotas. For example, Gardener creates Secrets for every shoot cluster in the project namespace and at the same time increases the available quota count. Please mind this additional resource consumption.\n The controller configuration provides a template section controllers.project.quotas where such a ResourceQuota (see the example below) can be deposited.\ncontrollers: project: quotas: - config: apiVersion: v1 kind: ResourceQuota spec: hard: count/shoots.core.gardener.cloud: \"100\" count/secretbindings.core.gardener.cloud: \"10\" count/credentialsbindings.security.gardener.cloud: \"10\" count/secrets: \"800\" projectSelector: {} The Project controller takes the specified config and creates a ResourceQuota with the name gardener in the project namespace. If a ResourceQuota resource with the name gardener already exists, the controller will only update fields in spec.hard which are unavailable at that time. This is done to configure a default Quota in all projects but to allow manual quota increases as the projects’ demands increase. spec.hard fields in the ResourceQuota object that are not present in the configuration are removed from the object. Labels and annotations on the ResourceQuota config get merged with the respective fields on existing ResourceQuotas. 
An optional projectSelector narrows down the amount of projects that are equipped with the given config. If multiple configs match for a project, then only the first match in the list is applied to the project namespace.\nThe .status.phase of the Project resources is set to Ready or Failed by the reconciler to indicate whether the reconciliation loop was performed successfully. Also, it generates Events to provide further information about its operations.\nWhen a Project is marked for deletion, the controller ensures that there are no Shoots left in the project namespace. Once all Shoots are gone, the Namespace and Project are released.\n“Stale Projects” Reconciler As Gardener is a large-scale Kubernetes as a Service, it is designed for being used by a large amount of end-users. Over time, it is likely to happen that some of the hundreds or thousands of Project resources are no longer actively used.\nGardener offers the “stale projects” reconciler which will take care of identifying such stale projects, marking them with a “warning”, and eventually deleting them after a certain time period. This reconciler is enabled by default and works as follows:\n Projects are considered as “stale”/not actively used when all of the following conditions apply: The namespace associated with the Project does not have any… Shoot resources. BackupEntry resources. Secret resources that are referenced by a SecretBinding or a CredentialsBinding that is in use by a Shoot (not necessarily in the same namespace). Quota resources that are referenced by a SecretBinding or a CredentialsBinding that is in use by a Shoot (not necessarily in the same namespace). The time period when the project was used for the last time (status.lastActivityTimestamp) is longer than the configured minimumLifetimeDays If a project is considered “stale”, then its .status.staleSinceTimestamp will be set to the time when it was first detected to be stale. 
If it gets actively used again, this timestamp will be removed. After some time, the .status.staleAutoDeleteTimestamp will be set to a timestamp after which Gardener will auto-delete the Project resource if it still is not actively used.\nThe component configuration of the gardener-controller-manager offers to configure the following options:\n minimumLifetimeDays: Don’t consider newly created Projects as “stale” too early to give people/end-users some time to onboard and get familiar with the system. The “stale project” reconciler won’t set any timestamp for Projects younger than minimumLifetimeDays. When you change this value, then projects marked as “stale” may be no longer marked as “stale” in case they are young enough, or vice versa. staleGracePeriodDays: Don’t compute auto-delete timestamps for stale Projects that are unused for less than staleGracePeriodDays. This is to not unnecessarily make people/end-users nervous “just because” they haven’t actively used their Project for a given amount of time. When you change this value, then already assigned auto-delete timestamps may be removed if the new grace period is not yet exceeded. staleExpirationTimeDays: Expiration time after which stale Projects are finally auto-deleted (after .status.staleSinceTimestamp). If this value is changed and an auto-delete timestamp got already assigned to the projects, then the new value will only take effect if it’s increased. Hence, decreasing the staleExpirationTimeDays will not decrease already assigned auto-delete timestamps. Gardener administrators/operators can exclude specific Projects from the stale check by annotating the related Namespace resource with project.gardener.cloud/skip-stale-check=true.\n “Activity” Reconciler Since the other two reconcilers are unable to actively monitor the relevant objects that are used in a Project (Shoot, Secret, etc.), there could be a situation where the user creates and deletes objects in a short period of time. 
In that case, the Stale Project Reconciler could not see that there was any activity on that project and it will still mark it as a Stale, even though it is actively used.\nThe Project Activity Reconciler is implemented to take care of such cases. An event handler will notify the reconciler for any activity and then it will update the status.lastActivityTimestamp. This update will also trigger the Stale Project Reconciler.\nSecretBinding Controller SecretBindings reference Secrets and Quotas and are themselves referenced by Shoots. The controller adds finalizers to the referenced objects to ensure they don’t get deleted while still being referenced. Similarly, to ensure that SecretBindings in-use are always present in the system until the last referring Shoot gets deleted, the controller adds a finalizer which is only released when there is no Shoot referencing the SecretBinding anymore.\nReferenced Secrets will also be labeled with provider.shoot.gardener.cloud/\u003ctype\u003e=true, where \u003ctype\u003e is the value of the .provider.type of the SecretBinding. Also, all referenced Secrets, as well as Quotas, will be labeled with reference.gardener.cloud/secretbinding=true to allow for easily filtering for objects referenced by SecretBindings.\nSeed Controller The Seed controller in the gardener-controller-manager reconciles Seed objects with the help of the following reconcilers.\n“Main” Reconciler This reconciliation loop takes care of seed related operations in the garden cluster. When a new Seed object is created, the reconciler creates a new Namespace in the garden cluster seed-\u003cseed-name\u003e. Namespaces dedicated to single seed clusters allow us to segregate access permissions i.e., a gardenlet must not have permissions to access objects in all Namespaces in the garden cluster. 
There are objects in a Garden environment which are created once by the operator, e.g., default domain secret, alerting credentials, and are required for operations happening in the gardenlet. Therefore, we not only need a seed-specific Namespace but also a copy of these “shared” objects.\nThe “main” reconciler takes care of this replication:\n Kind Namespace Label Selector Secret garden gardener.cloud/role “Backup Buckets Check” Reconciler Every time a BackupBucket object is created or updated, the referenced Seed object is enqueued for reconciliation. It’s the reconciler’s task to check the status subresource of all existing BackupBuckets that reference this Seed. If at least one BackupBucket has .status.lastError != nil, the BackupBucketsReady condition on the Seed will be set to False, and consequently the Seed is considered as NotReady. If the SeedBackupBucketsCheckControllerConfiguration (which is part of gardener-controller-manager’s component configuration) contains a conditionThreshold for the BackupBucketsReady condition, the condition will instead first be set to Progressing and eventually to False once the conditionThreshold expires. See the example config file for details. Once the BackupBucket is healthy again, the seed will be re-queued and the condition will turn True.\n“Extensions Check” Reconciler This reconciler reconciles Seed objects and checks whether all ControllerInstallations referencing them are in a healthy state. Concretely, all three conditions Valid, Installed, and Healthy must have status True and the Progressing condition must have status False. Based on this check, it maintains the ExtensionsReady condition in the respective Seed’s .status.conditions list.\n“Lifecycle” Reconciler The “Lifecycle” reconciler processes Seed objects which are enqueued every 10 seconds in order to check if the responsible gardenlet is still responding and operable. 
To do so, it checks the Lease objects of the seed in the garden cluster, which are renewed regularly by the gardenlet.\nIn case a Lease is not renewed for the duration configured in config.controllers.seed.monitorPeriod.duration:\n The reconciler assumes that the gardenlet stopped operating and updates the GardenletReady condition to Unknown. Additionally, the conditions and constraints of all Shoot resources scheduled on the affected seed are set to Unknown as well, because a non-responding gardenlet won’t be able to maintain these conditions any more. If the gardenlet’s client certificate has expired (identified based on the .status.clientCertificateExpirationTimestamp field in the Seed resource) and if it is managed by a ManagedSeed, then the ManagedSeed will be triggered for a reconciliation. This will trigger the bootstrapping process again and allows gardenlets to obtain a fresh client certificate. Shoot Controller “Conditions” Reconciler In case the reconciled Shoot is registered via a ManagedSeed as a seed cluster, this reconciler merges the conditions in the respective Seed’s .status.conditions into the .status.conditions of the Shoot. This is to provide a holistic view on the status of the registered seed cluster by just looking at the Shoot resource.\n“Hibernation” Reconciler This reconciler is responsible for hibernating or awakening shoot clusters based on the schedules defined in their .spec.hibernation.schedules. It ignores failed Shoots and those marked for deletion.\n“Maintenance” Reconciler This reconciler is responsible for maintaining shoot clusters based on the time window defined in their .spec.maintenance.timeWindow. It might auto-update the Kubernetes version or the operating system versions specified in the worker pools (.spec.provider.workers). It could also add some operation or task annotations. 
For more information, see Shoot Maintenance.\n“Quota” Reconciler This reconciler might auto-delete shoot clusters in case their referenced SecretBinding or CredentialsBinding is itself referencing a Quota with .spec.clusterLifetimeDays != nil. If the shoot cluster is older than the configured lifetime, then it gets deleted. It maintains the expiration time of the Shoot in the value of the shoot.gardener.cloud/expiration-timestamp annotation. This annotation might be overridden, however only by at most twice the value of the .spec.clusterLifetimeDays.\n“Reference” Reconciler Shoot objects may specify references to other objects in the garden cluster which are required for certain features. For example, users can configure various DNS providers via .spec.dns.providers and usually need to refer to a corresponding Secret with valid DNS provider credentials inside. Such objects need a special protection against deletion requests as long as they are still being referenced by one or multiple shoots.\nTherefore, this reconciler checks Shoots for referenced objects and adds the finalizer gardener.cloud/reference-protection to their .metadata.finalizers list. The reconciled Shoot also gets this finalizer to enable a proper garbage collection in case the gardener-controller-manager is offline at the moment of an incoming deletion request. When an object is not actively referenced anymore because the Shoot specification has changed or all related shoots were deleted (are in deletion), the controller will remove the added finalizer again so that the object can safely be deleted or garbage collected.\nThis reconciler inspects the following references:\n DNS provider secrets (.spec.dns.provider) Audit policy configmaps (.spec.kubernetes.kubeAPIServer.auditConfig.auditPolicy.configMapRef) Further checks might be added in the future.\n“Retry” Reconciler This reconciler is responsible for retrying certain failed Shoots. 
Currently, the reconciler retries only failed Shoots with an error code ERR_INFRA_RATE_LIMITS_EXCEEDED. See Shoot Status for more details.\n“Status Label” Reconciler This reconciler is responsible for maintaining the shoot.gardener.cloud/status label on Shoots. See Shoot Status for more details.\n","categories":"","description":"Understand where the gardener-controller-manager runs and its functionalities","excerpt":"Understand where the gardener-controller-manager runs and its …","ref":"/docs/gardener/concepts/controller-manager/","tags":"","title":"Gardener Controller Manager"},{"body":"Overview The goal of the gardener-node-agent is to bootstrap a machine into a worker node and maintain node-specific components, which run on the node and are unmanaged by Kubernetes (e.g. the kubelet service, systemd units, …).\nIt effectively is a Kubernetes controller deployed onto the worker node.\nArchitecture and Basic Design This figure visualizes the overall architecture of the gardener-node-agent. On the left side, it starts with an OperatingSystemConfig resource (OSC) with a corresponding worker pool specific cloud-config-\u003cworker-pool\u003e secret being passed by reference through the userdata to a machine by the machine-controller-manager (MCM).\nOn the right side, the cloud-config secret will be extracted and used by the gardener-node-agent after being installed. Details on this can be found in the next section.\nFinally, the gardener-node-agent runs a systemd service watching on secret resources located in the kube-system namespace like our cloud-config secret that contains the OperatingSystemConfig. 
When gardener-node-agent applies the OSC, it installs the kubelet + configuration on the worker node.\nInstallation and Bootstrapping This section describes how the gardener-node-agent is initially installed onto the worker node.\nIn the beginning, there is a very small bash script called gardener-node-init.sh, which will be copied to /var/lib/gardener-node-agent/init.sh on the node with cloud-init data. This script’s sole purpose is downloading and starting the gardener-node-agent. The binary artifact is extracted from an OCI artifact and lives at /opt/bin/gardener-node-agent.\nAlong with the init script, a configuration for the gardener-node-agent is carried over to the worker node at /var/lib/gardener-node-agent/config.yaml. This configuration contains things like the shoot’s kube-apiserver endpoint, the according certificates to communicate with it, and controller configuration.\nIn a bootstrapping phase, the gardener-node-agent sets itself up as a systemd service. It also executes tasks that need to be executed before any other components are installed, e.g. formatting the data device for the kubelet.\nControllers This section describes the controllers in more details.\nLease Controller This controller creates a Lease for gardener-node-agent in kube-system namespace of the shoot cluster. Each instance of gardener-node-agent creates its own Lease when its corresponding Node was created. It renews the Lease resource every 10 seconds. This indicates a heartbeat to the external world.\nNode Controller This controller watches the Node object for the machine it runs on. The correct Node is identified based on the hostname of the machine (Nodes have the kubernetes.io/hostname label). Whenever the worker.gardener.cloud/restart-systemd-services annotation changes, the controller performs the desired changes by restarting the specified systemd unit files. See also this document for more information. 
After restarting all units, the annotation is removed.\n ℹ️ When the gardener-node-agent systemd service itself is requested to be restarted, the annotation is removed first to ensure it does not restart itself indefinitely.\n Operating System Config Controller This controller contains the main logic of gardener-node-agent. It watches Secrets whose data map contains the OperatingSystemConfig, which consists of all systemd units and files that are relevant for the node configuration. Amongst others, a prominent example is the configuration file for kubelet and its unit file for the kubelet.service.\nThe controller decodes the configuration and computes the files and units that have changed since its last reconciliation. It writes or updates the files and units on the file system, removes files and units that are no longer needed, reloads the systemd daemon, and starts or stops the units accordingly.\nAfter successful reconciliation, it persists the just-applied OperatingSystemConfig into a file on the host. This file will be used for future reconciliations to compute file/unit changes.\nThe controller also maintains two annotations on the Node:\n worker.gardener.cloud/kubernetes-version, describing the version of the installed kubelet. checksum/cloud-config-data, describing the checksum of the applied OperatingSystemConfig (used in future reconciliations to determine whether it needs to reconcile, and to report that this node is up-to-date). Token Controller This controller watches the access token Secrets in the kube-system namespace configured via the gardener-node-agent’s component configuration (.controllers.token.syncConfigs[] field). Whenever the .data.token field changes, it writes the new content to a file on the configured path on the host file system. This mechanism is used to download its own access token for the shoot cluster, but also the access tokens of other systemd components (e.g., valitail). 
Since the underlying client is based on k8s.io/client-go and the kubeconfig points to this token file, it is dynamically reloaded without the necessity of explicit configuration or code changes. This procedure ensures that the most up-to-date tokens are always present on the host and used by the gardener-node-agent and the other systemd components.\nReasoning The gardener-node-agent is a replacement for what was called the cloud-config-downloader and the cloud-config-executor, both written in bash. The gardener-node-agent implements this functionality as a regular controller and feels more uniform in terms of maintenance.\nWith the new architecture we gain a lot; the most important gains are described here.\nDeveloper Productivity Since the Gardener community develops in Go day by day, writing business logic in bash is difficult, hard to maintain, and almost impossible to test. Getting rid of almost all bash scripts which are currently in use for this very important part of the cluster creation process will enhance the speed of adding new features and removing bugs.\nSpeed Until now, the cloud-config-downloader runs in a loop every 60s to check if something changed on the shoot which requires modifications on the worker node. This produces a lot of unneeded traffic on the API server and wastes time: it can take up to 60s until a desired modification is started on the worker node. By writing a “real” Kubernetes controller, we can watch for the Node, the OSC in the Secret, and the shoot-access token in the secret. If any of these objects changes, and only then, the required action will take effect immediately. This will speed up operations and will reduce the load on the API server of the shoot, especially for large clusters.\nScalability The cloud-config-downloader adds a random wait time before restarting the kubelet in case the kubelet was updated or a configuration change was made to it. 
This is required to reduce the load on the API server and the traffic on the internet uplink. It also reduces the overall downtime of the services in the cluster because every kubelet restart puts a node into the NotReady state for several seconds, which potentially interrupts service availability.\nThe decision was made to keep the existing jitter mechanism which calculates the kubelet-download-and-restart-delay-seconds on the controller itself.\nCorrectness The configuration of the cloud-config-downloader is actually done by placing a file for every configuration item on the disk on the worker node. This was done because parsing the content of a single file and using this as a value in bash reduces to something like VALUE=$(cat /the/path/to/the/file). Simple, but it lacks validation, type safety and whatnot. With the gardener-node-agent we introduce a new API which is then stored in the gardener-node-agent secret and stored on disk in a single YAML file for comparison with the previous known state. This brings all benefits of type-safe configuration. Because actual and previous configuration are compared, files and units that are removed from the OSC are also removed and stopped on the worker.\nAvailability Previously, the cloud-config-downloader simply restarted the systemd units on every change to the OSC, regardless of which of the services changed. The gardener-node-agent first checks which systemd units were changed, and will only restart those. This will prevent unneeded kubelet restarts.\n","categories":"","description":"How Gardener bootstraps machines into worker nodes and how it installs and maintains gardener-managed node-specific components","excerpt":"How Gardener bootstraps machines into worker nodes and how it installs …","ref":"/docs/gardener/concepts/node-agent/","tags":"","title":"Gardener Node Agent"},{"body":"Overview The gardener-operator is responsible for the garden cluster environment. 
Without this component, users must deploy ETCD, the Gardener control plane, etc., manually and with separate mechanisms (not maintained in this repository). This is quite unfortunate since this requires separate tooling, processes, etc. A lot of production- and enterprise-grade features were built into Gardener for managing the seed and shoot clusters, so it makes sense to re-use them as much as possible also for the garden cluster.\nDeployment There is a Helm chart which can be used to deploy the gardener-operator. Once deployed and ready, you can create a Garden resource. Note that there can only be one Garden resource per system at a time.\n ℹ️ Similar to seed clusters, garden runtime clusters require a VPA, see this section. By default, gardener-operator deploys the VPA components. However, when there already is a VPA available, then set .spec.runtimeCluster.settings.verticalPodAutoscaler.enabled=false in the Garden resource.\n Garden Resources Please find an exemplary Garden resource here.\nConfiguration For Runtime Cluster Settings The Garden resource offers a few settings that are used to control the behaviour of gardener-operator in the runtime cluster. This section provides an overview over the available settings in .spec.runtimeCluster.settings:\nLoad Balancer Services gardener-operator deploys Istio and relevant resources to the runtime cluster in order to expose the virtual-garden-kube-apiserver service (similar to how the kube-apiservers of shoot clusters are exposed). In most cases, the cloud-controller-manager (responsible for managing these load balancers on the respective underlying infrastructure) supports certain customization and settings via annotations. 
This document provides a good overview and many examples.\nBy setting the .spec.runtimeCluster.settings.loadBalancerServices.annotations field the Gardener administrator can specify a list of annotations which will be injected into the Services of type LoadBalancer.\nVertical Pod Autoscaler gardener-operator heavily relies on the Kubernetes vertical-pod-autoscaler component. By default, the Garden controller deploys the VPA components into the garden namespace of the respective runtime cluster. In case you want to manage the VPA deployment on your own or have a custom one, then you might want to disable the automatic deployment of gardener-operator. Otherwise, you might end up with two VPAs which will cause erratic behaviour. By setting the .spec.runtimeCluster.settings.verticalPodAutoscaler.enabled=false you can disable the automatic deployment.\n⚠️ In any case, there must be a VPA available for your runtime cluster. Using a runtime cluster without VPA is not supported.\nTopology-Aware Traffic Routing Refer to the Topology-Aware Traffic Routing documentation as this document contains the documentation for the topology-aware routing setting for the garden runtime cluster.\nVolumes It is possible to define the minimum size for PersistentVolumeClaims in the runtime cluster created by gardener-operator via the .spec.runtimeCluster.volume.minimumSize field. This can be relevant in case the runtime cluster runs on an infrastructure that does only support disks of at least a certain size.\nCert-Management The operator can deploy the Gardener cert-management component optionally. A default issuer has to be specified and will be deployed, too. Please note that the cert-controller-manager is configured to use DNSRecords for ACME DNS challenges on certificate requests. A suitable provider extension must be deployed in this case, e.g. using an operator Extension resource. 
The default issuer must be set at .spec.runtimeCluster.certManagement.defaultIssuer either specifying an ACME or CA issuer.\nIf the cert-controller-manager should make requests to any ACME servers running with self-signed TLS certificates, the related CA can be provided using a secret with data field bundle.crt referenced with .spec.runtimeCluster.certManagement.config.caCertificatesSecretRef.\nDefault Issuer using an ACME server Please provide at least server and e-mail address.\nspec: runtimeCluster: certManagement: defaultIssuer: ACME: server: https://acme-v02.api.letsencrypt.org/directory email: some.name@my-email-domain.com # secretRef: # name: defaultIssuerPrivateKey # precheckNameservers: # - 1.2.3.4 # - 5.6.7.8 If needed, an existing ACME account can be specified with the secretRef. The referenced secret must contain a field privateKey. Otherwise, an account is auto-registered if supported by the ACME server. If you are using a private DNS server, you may need to set the precheckNameservers used to check the propagation of the DNS challenges.\nDefault Issuer using a root or intermediate CA If you want to use a root or intermediate CA for signing the certificates, provide a TLS secret containing the CA and reference it as shown in the example below.\nspec: runtimeCluster: certManagement: defaultIssuer: CA: secretRef: name: my-ca-tls-secret Configuration For Virtual Cluster ETCD Encryption Config The spec.virtualCluster.kubernetes.kubeAPIServer.encryptionConfig field in the Garden API allows operators to customize encryption configurations for the kube-apiserver of the virtual cluster. It provides options to specify additional resources for encryption. Similarly spec.virtualCluster.gardener.gardenerAPIServer.encryptionConfig field allows operators to customize encryption configurations for the gardener-apiserver.\n The resources field can be used to specify resources that should be encrypted in addition to secrets. 
Secrets are always encrypted for the kube-apiserver. For the gardener-apiserver, the following resources are always encrypted: controllerdeployments.core.gardener.cloud controllerregistrations.core.gardener.cloud internalsecrets.core.gardener.cloud shootstates.core.gardener.cloud Adding an item to any of the lists will cause patch requests for all the resources of that kind to encrypt them in the etcd. See Encrypting Confidential Data at Rest for more details. Removing an item from any of these lists will cause patch requests for all the resources of that type to decrypt and rewrite the resource as plain text. See Decrypt Confidential Data that is Already Encrypted at Rest for more details. ℹ️ Note that configuring encryption for a custom resource for the kube-apiserver is only supported for Kubernetes versions \u003e= 1.26.\n Extension Resource A Gardener installation relies on extensions to provide support for new cloud providers or to add new capabilities. You can find out more about Gardener extensions and how they can be used here.\nThe Extension resource is intended to automate the installation and management of extensions in a Gardener landscape. It contains configuration for the following scenarios:\n The deployment of the extension chart in the garden runtime cluster. The deployment of ControllerRegistration and ControllerDeployment resources in the (virtual) garden cluster. The deployment of extension admissions charts in runtime and virtual clusters. In the near future, the Extension will be used by the gardener-operator to automate the management of the backup bucket for ETCD and DNS records required by the garden cluster. To do that, gardener-operator will leverage extensions that support DNSRecord and BackupBucket resources. As of today, the support for managed DNSRecords and BackupBuckets in the gardener-operator is still being built. 
However, the Extension’s specification already reflects the target picture.\nPlease find an exemplary Extension resource here.\nExtension Deployment The .spec.deployment specifies how an extension can be installed for a Gardener landscape and consists of the following parts:\n .spec.deployment.extension contains the deployment specification of an extension. .spec.deployment.admission contains the deployment specification of an extension admission. Each one is described in more detail below.\nConfiguration for Extension Deployment .spec.deployment.extension contains configuration for the registration of an extension in the garden cluster. gardener-operator follows the same principles described by this document:\n .spec.deployment.extension.helm and .spec.deployment.extension.values are used when creating the ControllerDeployment in the garden cluster. .spec.deployment.extension.policy and .spec.deployment.extension.seedSelector define the extension’s installation policy as per the ControllerDeployment's respective fields. Runtime Extensions can manage resources required by the Garden resource (e.g. BackupBucket, DNSRecord, Extension) in the runtime cluster. Since the environment in the runtime cluster may differ from that of a Seed, the extension is installed in the runtime cluster with a distinct set of Helm chart values specified in .spec.deployment.extension.runtimeValues. If no runtimeValues are provided, the extension deployment for the runtime garden is considered superfluous and the deployment is uninstalled. The configuration allows for precise control over various extension parameters, such as requested resources, priority classes, and more.\nBesides the values configured in .spec.deployment.extension.runtimeValues, a runtime deployment flag and a priority class are merged into the values:\ngardener: runtimeCluster: enabled: true # indicates the extension is enabled for the Garden cluster, e.g. 
for handling `BackupBucket`, `DNSRecord` and `Extension` objects. priorityClassName: gardener-garden-system-200 As soon as a Garden object is created and runtimeValues are configured, the extension is deployed in the runtime cluster.\nExtension Registration When the virtual garden cluster is available, the Extension controller manages ControllerRegistration/ControllerDeployment resources to register extensions for shoots. The fields of .spec.deployment.extension include their configuration options.\nConfiguration for Admission Deployment The .spec.deployment.admission defines how an extension admission may be deployed by the gardener-operator. This deployment is optional and may be omitted. Typically, the admission is split into two parts:\nRuntime The runtime part contains deployment relevant manifests, required to run the admission service in the runtime cluster. The following values are passed to the chart during reconciliation:\ngardener: runtimeCluster: priorityClassName: \u003cClass to be used for extension admission\u003e Virtual The virtual part includes the webhook registration (MutatingWebhookConfiguration/ValidatingWebhookConfiguration) and RBAC configuration. The following values are passed to the chart during reconciliation:\ngardener: virtualCluster: serviceAccount: name: \u003cName of the service account used to connect to the garden cluster\u003e namespace: \u003cNamespace of the service account\u003e Extension admissions often need to retrieve additional context from the garden cluster in order to process validating or mutating requests.\nFor example, the corresponding CloudProfile might be needed to perform a provider specific shoot validation. 
Therefore, Gardener automatically injects a kubeconfig into the admission deployment to interact with the (virtual) garden cluster (see this document for more information).\nConfiguration for Extension Resources The .spec.resources field refers to the extension resources as defined by Gardener in the extensions.gardener.cloud/v1alpha1 API. These include both well-known types such as Infrastructure, Worker etc. and generic resources. The field will be used to populate the respective field in the resulting ControllerRegistration in the garden cluster.\nControllers The gardener-operator controllers are now described in more detail.\nGarden Controller The Garden controller in the operator reconciles Garden objects with the help of the following reconcilers.\nMain Reconciler The reconciler first generates a general CA certificate which is valid for ~30d and auto-rotated when 80% of its lifetime is reached. Afterwards, it brings up the so-called “garden system components”. The gardener-resource-manager is deployed first since its ManagedResource controller will be used to bring up the remaining components.\nOther system components are:\n runtime garden system resources (PriorityClasses for the workload resources) virtual garden system resources (RBAC rules) Vertical Pod Autoscaler (if enabled via .spec.runtimeCluster.settings.verticalPodAutoscaler.enabled=true in the Garden) HVPA Controller (when HVPA feature gate is enabled) ETCD Druid Istio As soon as all system components are up, the reconciler deploys the virtual garden cluster. It consists of two ETCDs (one “main” etcd, one “events” etcd) which are managed by ETCD Druid via druid.gardener.cloud/v1alpha1.Etcd custom resources. 
The whole management works similarly to how it works for Shoots, so you can take a look at this document for more information in general.\nThe virtual garden control plane components are:\n virtual-garden-etcd-main virtual-garden-etcd-events virtual-garden-kube-apiserver virtual-garden-kube-controller-manager virtual-garden-gardener-resource-manager If .spec.virtualCluster.controlPlane.highAvailability={} is set, then these components will be deployed in a “highly available” mode. For ETCD, this means that there will be 3 replicas each. This works similarly to Shoots (see this document) except for the fact that there is no failure tolerance type configurability. The gardener-resource-manager’s HighAvailabilityConfig webhook makes sure that all pods with multiple replicas are spread on nodes, and if there are at least two zones in .spec.runtimeCluster.provider.zones then they also get spread across availability zones.\n Once set, removing .spec.virtualCluster.controlPlane.highAvailability again is not supported.\n The virtual-garden-kube-apiserver Deployment is exposed via Istio, similar to how the kube-apiservers of shoot clusters are exposed.\nSimilar to the Shoot API, the version of the virtual garden cluster is controlled via .spec.virtualCluster.kubernetes.version. Likewise, specific configuration for the control plane components can be provided in the same section, e.g. via .spec.virtualCluster.kubernetes.kubeAPIServer for the kube-apiserver or .spec.virtualCluster.kubernetes.kubeControllerManager for the kube-controller-manager.\nThe kube-controller-manager only runs a few controllers that are necessary in the scenario of the virtual garden. Most prominently, the serviceaccount-token controller is unconditionally disabled. Hence, the usage of static ServiceAccount secrets is not supported generally. Instead, the TokenRequest API should be used. 
Third-party components that need to communicate with the virtual cluster can leverage the gardener-resource-manager’s TokenRequestor controller and the generic kubeconfig, just like it works for Shoots. Please note that this functionality is restricted to the garden namespace. The current Secret name of the generic kubeconfig can be found in the annotations (key: generic-token-kubeconfig.secret.gardener.cloud/name) of the Garden resource.\nFor the virtual cluster, it is essential to provide at least one DNS domain via .spec.virtualCluster.dns.domains. The respective DNS records are not managed by gardener-operator and should be created manually. They should point to the load balancer IP of the istio-ingressgateway Service in namespace virtual-garden-istio-ingress. The DNS records must be prefixed with both gardener. and api. for all domains in .spec.virtualCluster.dns.domains.\nThe first DNS domain in this list is used for the server in the kubeconfig, and for configuring the --external-hostname flag of the API server.\nApart from the control plane components of the virtual cluster, the reconciler also deploys the control plane components of Gardener. gardener-apiserver reuses the same ETCDs as the virtual-garden-kube-apiserver, so all data related to “the garden cluster” is stored together and “isolated” from ETCD data related to the runtime cluster. This drastically simplifies backup and restore capabilities (e.g., moving the virtual garden cluster from one runtime cluster to another).\nThe Gardener control plane components are:\n gardener-apiserver gardener-admission-controller gardener-controller-manager gardener-scheduler Besides those, the gardener-operator is able to deploy the following optional components:\n Gardener Dashboard (and the controller for web terminals) when .spec.virtualCluster.gardener.gardenerDashboard (or .spec.virtualCluster.gardener.gardenerDashboard.terminal, respectively) is set. 
You can read more about it and its configuration in this section. Gardener Discovery Server when .spec.virtualCluster.gardener.gardenerDiscoveryServer is set. The service account issuer of shoots will be calculated in the format https://discovery.\u003c.spec.runtimeCluster.ingress.domains[0]\u003e/projects/\u003cproject-name\u003e/shoots/\u003cshoot-uid\u003e/issuer. This configuration applies for all seeds registered with the Garden cluster. Once set it should not be modified. The reconciler also manages a few observability-related components (more planned as part of GEP-19):\n fluent-operator fluent-bit gardener-metrics-exporter kube-state-metrics plutono vali prometheus-operator alertmanager-garden (read more here) prometheus-garden (read more here) prometheus-longterm (read more here) blackbox-exporter It is also mandatory to provide an IPv4 CIDR for the service network of the virtual cluster via .spec.virtualCluster.networking.services. This range is used by the API server to compute the cluster IPs of Services.\nThe controller maintains the .status.lastOperation which indicates the status of an operation.\nGardener Dashboard .spec.virtualCluster.gardener.gardenerDashboard serves a few configuration options for the dashboard. This section highlights the most prominent fields:\n oidcConfig: The general OIDC configuration is part of .spec.virtualCluster.kubernetes.kubeAPIServer.oidcConfig. This section allows you to define a few specific settings for the dashboard. sessionLifetime is the duration after which a session is terminated (i.e., after which a user is automatically logged out). additionalScopes allows to extend the list of scopes of the JWT token that are to be recognized. 
You must reference a Secret in the garden namespace containing the client ID/secret for the dashboard: apiVersion: v1 kind: Secret metadata: name: gardener-dashboard-oidc namespace: garden type: Opaque stringData: client_id: \u003csecret\u003e client_secret: \u003csecret\u003e enableTokenLogin: This is enabled by default and allows logging into the dashboard with a JWT token. You can disable it in case you want to only allow OIDC-based login. However, at least one of the two login methods must be enabled. frontendConfigMapRef: Reference a ConfigMap in the garden namespace containing the frontend configuration in the data with key frontend-config.yaml, for example apiVersion: v1 kind: ConfigMap metadata: name: gardener-dashboard-frontend namespace: garden data: frontend-config.yaml: |helpMenuItems: - title: Homepage icon: mdi-file-document url: https://gardener.cloud Please take a look at this file to get an idea of which values are configurable. This configuration can also include branding, themes, and colors. Read more about it here. Assets (logos/icons) are configured in a separate ConfigMap, see below. assetsConfigMapRef: Reference a ConfigMap in the garden namespace containing the assets, for example apiVersion: v1 kind: ConfigMap metadata: name: gardener-dashboard-assets namespace: garden binaryData: favicon-16x16.png: base64(favicon-16x16.png) favicon-32x32.png: base64(favicon-32x32.png) favicon-96x96.png: base64(favicon-96x96.png) favicon.ico: base64(favicon.ico) logo.svg: base64(logo.svg) Note that the assets must be provided base64-encoded, hence binaryData (instead of data) must be used. Please take a look at this file to get more information. gitHub: You can connect a GitHub repository that can be used to create issues for shoot clusters in the cluster details page. 
You have to reference a Secret in the garden namespace that contains the GitHub credentials, for example: apiVersion: v1 kind: Secret metadata: name: gardener-dashboard-github namespace: garden type: Opaque stringData: # This is for GitHub token authentication: authentication.token: \u003csecret\u003e # Alternatively, this is for GitHub app authentication: authentication.appId: \u003csecret\u003e authentication.clientId: \u003csecret\u003e authentication.clientSecret: \u003csecret\u003e authentication.installationId: \u003csecret\u003e authentication.privateKey: \u003csecret\u003e # This is the webhook secret, see explanation below webhookSecret: \u003csecret\u003e Note that you can also set up a GitHub webhook to the dashboard such that it receives updates when somebody changes the GitHub issue. The webhookSecret field is the secret that you enter in GitHub in the webhook configuration. The dashboard uses it to verify that received traffic indeed originates from GitHub. If you don’t want to set up such a webhook, or if the dashboard is not reachable by the GitHub webhook (e.g., in restricted environments) you can also configure gitHub.pollInterval. It is the interval of how often the GitHub API is polled for issue updates. This field is used as a fallback mechanism to ensure state synchronization, even when there is a GitHub webhook configuration. If a webhook event is missed or not successfully delivered, the polling will help catch up on any missed updates. If this field is not provided and there is no webhookSecret key in the referenced secret, it will be implicitly defaulted to 15m. The dashboard will use this to regularly poll the GitHub API for updates on issues. terminal: This enables the web terminal feature, read more about it here. When set, the terminal-controller-manager will be deployed to the runtime cluster. The allowedHosts field is explained here. 
The container section allows you to specify a container image and a description that should be used for the web terminals. Observability Garden Prometheus gardener-operator deploys a Prometheus instance in the garden namespace (called “Garden Prometheus”) which fetches metrics and data from garden system components, cAdvisors, the virtual cluster control plane, and the Seeds’ aggregate Prometheus instances. Its purpose is to provide an entrypoint for operators when debugging issues with components running in the garden cluster. It also serves as the top-level aggregator of metering across a Gardener landscape.\nTo extend the configuration of the Garden Prometheus, you can create the prometheus-operator’s custom resources and label them with prometheus=garden, for example:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: garden name: garden-my-component namespace: garden spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics Long-Term Prometheus gardener-operator deploys another Prometheus instance in the garden namespace (called “Long-Term Prometheus”) which federates metrics from Garden Prometheus. Its purpose is to store those with a longer retention than Garden Prometheus would. It is not possible to define different retention periods for different metrics in Prometheus, hence, using another Prometheus instance is the only option. 
This Long-term Prometheus also has an additional Cortex sidecar container for caching some queries to achieve faster processing times.\nTo extend the configuration of the Long-term Prometheus, you can create the prometheus-operator’s custom resources and label them with prometheus=longterm, for example:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: longterm name: longterm-my-component namespace: garden spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics Alertmanager By default, the alertmanager-garden deployed by gardener-operator does not come with any configuration. It is the responsibility of the human operators to design and provide it. This can be done by creating monitoring.coreos.com/v1alpha1.AlertmanagerConfig resources labeled with alertmanager=garden (read more about them here), for example:\napiVersion: monitoring.coreos.com/v1alpha1 kind: AlertmanagerConfig metadata: name: config namespace: garden labels: alertmanager: garden spec: route: receiver: dev-null groupBy: - alertname - landscape routes: - continue: true groupWait: 3m groupInterval: 5m repeatInterval: 12h routes: - receiver: ops matchers: - name: severity value: warning matchType: = - name: topology value: garden matchType: = receivers: - name: dev-null - name: ops slackConfigs: - apiURL: https://\u003cslack-api-url\u003e channel: \u003cchannel-name\u003e username: Gardener-Alertmanager iconEmoji: \":alert:\" title: \"[{{ .Status | toUpper }}] Gardener Alert(s)\" text: \"{{ range .Alerts }}*{{ .Annotations.summary }} ({{ .Status }})*\\n{{ .Annotations.description }}\\n\\n{{ end }}\" sendResolved: true Plutono A Plutono instance is deployed by gardener-operator into the garden namespace for visualizing monitoring metrics and logs via dashboards. 
In order to provide custom dashboards, create a ConfigMap in the garden namespace labelled with dashboard.monitoring.gardener.cloud/garden=true that contains the respective JSON documents, for example:\napiVersion: v1 kind: ConfigMap metadata: labels: dashboard.monitoring.gardener.cloud/garden: \"true\" name: my-custom-dashboard namespace: garden data: my-custom-dashboard.json: \u003cdashboard-JSON-document\u003e Care Reconciler This reconciler performs four “care” actions related to Gardens.\nIt maintains the following conditions:\n VirtualGardenAPIServerAvailable: The /healthz endpoint of the garden’s virtual-garden-kube-apiserver is called and considered healthy when it responds with 200 OK. RuntimeComponentsHealthy: The conditions of the ManagedResources applied to the runtime cluster are checked (e.g., ResourcesApplied). VirtualComponentsHealthy: The virtual components are considered healthy when the respective Deployments (for example virtual-garden-kube-apiserver, virtual-garden-kube-controller-manager), and Etcds (for example virtual-garden-etcd-main) exist and are healthy. Additionally, the conditions of the ManagedResources applied to the virtual cluster are checked (e.g., ResourcesApplied). ObservabilityComponentsHealthy: This condition is considered healthy when the respective Deployments (for example plutono) and StatefulSets (for example prometheus, vali) exist and are healthy. If all checks for a certain condition succeed, then its status will be set to True. Otherwise, it will be set to False or Progressing.\nIf at least one check fails and there is threshold configuration for the conditions (in .controllers.gardenCare.conditionThresholds), then the status will be set:\n to Progressing if it was True before. to Progressing if it was Progressing before and the lastUpdateTime of the condition does not exceed the configured threshold duration yet. 
to False if it was Progressing before and the lastUpdateTime of the condition exceeds the configured threshold duration. The condition thresholds can be used to prevent reporting issues too early just because there is a rollout or a short disruption. Only if the unhealthiness persists for at least the configured threshold duration, then the issues will be reported (by setting the status to False).\nIn order to compute the condition statuses, this reconciler considers ManagedResources (in the garden and istio-system namespace) and their status, see this document for more information. The following table explains which ManagedResources are considered for which condition type:\n Condition Type ManagedResources are considered when RuntimeComponentsHealthy .spec.class=seed and care.gardener.cloud/condition-type label either unset, or set to RuntimeComponentsHealthy VirtualComponentsHealthy .spec.class unset or care.gardener.cloud/condition-type label set to VirtualComponentsHealthy ObservabilityComponentsHealthy care.gardener.cloud/condition-type label set to ObservabilityComponentsHealthy Reference Reconciler Garden objects may specify references to other objects in the Garden cluster which are required for certain features. For example, operators can configure a secret for ETCD backup via .spec.virtualCluster.etcd.main.backup.secretRef.name or an audit policy ConfigMap via .spec.virtualCluster.kubernetes.kubeAPIServer.auditConfig.auditPolicy.configMapRef.name. Such objects need a special protection against deletion requests as long as they are still being referenced by the Garden.\nTherefore, this reconciler checks Gardens for referenced objects and adds the finalizer gardener.cloud/reference-protection to their .metadata.finalizers list. The reconciled Garden also gets this finalizer to enable a proper garbage collection in case the gardener-operator is offline at the moment of an incoming deletion request. 
When an object is not actively referenced anymore because the Garden specification has changed or the Garden is in deletion, the controller will remove the added finalizer again so that the object can safely be deleted or garbage collected.\nThis reconciler inspects the following references:\n ETCD backup Secrets (.spec.virtualCluster.etcd.main.backup.secretRef) Admission plugin kubeconfig Secrets (.spec.virtualCluster.kubernetes.kubeAPIServer.admissionPlugins[].kubeconfigSecretName and .spec.virtualCluster.gardener.gardenerAPIServer.admissionPlugins[].kubeconfigSecretName) Authentication webhook kubeconfig Secrets (.spec.virtualCluster.kubernetes.kubeAPIServer.authentication.webhook.kubeconfigSecretName) Audit webhook kubeconfig Secrets (.spec.virtualCluster.kubernetes.kubeAPIServer.auditWebhook.kubeconfigSecretName and .spec.virtualCluster.gardener.gardenerAPIServer.auditWebhook.kubeconfigSecretName) SNI Secrets (.spec.virtualCluster.kubernetes.kubeAPIServer.sni.secretName) Audit policy ConfigMaps (.spec.virtualCluster.kubernetes.kubeAPIServer.auditConfig.auditPolicy.configMapRef.name and .spec.virtualCluster.gardener.gardenerAPIServer.auditConfig.auditPolicy.configMapRef.name) Further checks might be added in the future.\nController Registrar controller This controller registers controllers, which need to be installed in two contexts. If the Garden cluster is at the same time used as a Seed cluster, the gardener-operator will start these controllers. If the Garden cluster is separate from the Seed cluster, the controllers will be started by gardenlet.\nCurrently, this applies to two controllers:\n NetworkPolicy controller VPA EvictionRequirements controller The registration happens as soon as the Garden resource is created. It contains the networking information of the garden runtime cluster which is required configuration for the NetworkPolicy controller.\nExtension Controller Gardener relies on extensions to provide various capabilities, such as supporting cloud providers. 
This controller automates the management of extensions by managing all necessary resources in the runtime and virtual garden clusters.\nCurrently, this controller handles the following scenarios:\n Extension deployment in the runtime cluster Extension admission deployment for the virtual garden cluster. ControllerDeployment and ControllerRegistration reconciliation in the virtual garden cluster. Gardenlet Controller The Gardenlet controller reconciles a seedmanagement.gardener.cloud/v1alpha1.Gardenlet resource in case there is no Seed yet with the same name. This is used to allow easy deployments of gardenlets into unmanaged seed clusters. For a general overview, see this document.\nOn Gardenlet reconciliation, the controller deploys the gardenlet to the cluster (either its own, or the one provided via the .spec.kubeconfigSecretRef) after downloading the Helm chart specified in .spec.deployment.helm.ociRepository and rendering it with the provided values/configuration.\nOn Gardenlet deletion, nothing happens: gardenlets must always be deleted manually (by deleting the Seed and, once gone, then the gardenlet Deployment).\n [!NOTE] This controller only takes care of the very first gardenlet deployment (since it only reacts when there is no Seed resource yet). After the gardenlet is running, it uses the self-upgrade mechanism by watching the seedmanagement.gardener.cloud/v1alpha1.Gardenlet (see this for more details.)\nAfter a successful Garden reconciliation, gardener-operator also updates the .spec.deployment.helm.ociRepository.ref to its own version in all Gardenlet resources labeled with operator.gardener.cloud/auto-update-gardenlet-helm-chart-ref=true. 
The gardenlets then update themselves.\n⚠️ If you prefer to manage the Gardenlet resources via GitOps, Flux, or similar tools, then you should manage the .spec.deployment.helm.ociRepository.ref field yourself and not label the resources as mentioned above (to prevent gardener-operator from interfering with your desired state). Make sure to apply your Gardenlet resources (potentially containing a new version) after the Garden resource was successfully reconciled (i.e., after Gardener control plane was successfully rolled out, see this for more information.)\n Webhooks As of today, the gardener-operator only has one webhook handler which is now described in more detail.\nValidation This webhook handler validates CREATE/UPDATE/DELETE operations on Garden resources. Simple validation is performed via standard CRD validation. However, more advanced validation is hard to express via these means and is performed by this webhook handler.\nFurthermore, for deletion requests, it is validated that the Garden is annotated with a deletion confirmation annotation, namely confirmation.gardener.cloud/deletion=true. The DELETE operation is allowed to pass only if this annotation is present. This prevents users from accidental/undesired deletions.\nAnother validation is to check that there is only one Garden resource at a time. It prevents creating a second Garden when there is already one in the system.\nDefaulting This webhook handler mutates the Garden resource on CREATE/UPDATE/DELETE operations. Simple defaulting is performed via standard CRD defaulting. However, more advanced defaulting is hard to express via these means and is performed by this webhook handler.\nUsing Garden Runtime Cluster As Seed Cluster In production scenarios, you probably wouldn’t use the Kubernetes cluster running gardener-operator and the Gardener control plane (called “runtime cluster”) as seed cluster at the same time. 
However, such a setup is technically possible and might simplify certain situations (e.g., development, evaluation, …).\nIf the runtime cluster is a seed cluster at the same time, gardenlet’s Seed controller will not manage the components which were already deployed (and reconciled) by gardener-operator. As of today, this applies to:\n gardener-resource-manager vpa-{admission-controller,recommender,updater} hvpa-controller (when HVPA feature gate is enabled) etcd-druid istio control-plane nginx-ingress-controller Those components are so-called “seed system components”. In addition, there are a few observability components:\n fluent-operator fluent-bit vali plutono kube-state-metrics prometheus-operator As all of these components are managed by gardener-operator in this scenario, the gardenlet just skips them.\n ℹ️ There is no need to configure anything - the gardenlet will automatically detect when its seed cluster is the garden runtime cluster at the same time.\n ⚠️ Note that such a setup requires that you upgrade the versions of gardener-operator and gardenlet in lock-step. Otherwise, you might experience unexpected behaviour or issues with your seed or shoot clusters.\nCredentials Rotation The credentials rotation works in the same way as it does for Shoot resources, i.e. there are gardener.cloud/operation annotation values for starting or completing the rotation procedures.\nFor certificate authorities, gardener-operator generates one which is automatically rotated roughly each month (ca-garden-runtime) and several CAs which are NOT automatically rotated but only on demand.\n🚨 Hence, it is the responsibility of the (human) operator to regularly perform the credentials rotation.\nPlease refer to this document for more details. 
As of today, gardener-operator only creates the following types of credentials (i.e., some sections of the document don’t apply for Gardens and can be ignored):\n certificate authorities (and related server and client certificates) ETCD encryption key observability password for Plutono ServiceAccount token signing key WorkloadIdentity token signing key ⚠️ Rotation of static ServiceAccount secrets is not supported since the kube-controller-manager does not enable the serviceaccount-token controller.\nWhen the ServiceAccount token signing key rotation is in Preparing phase, then gardener-operator annotates all Seeds with gardener.cloud/operation=renew-garden-access-secrets. This causes gardenlet to populate new ServiceAccount tokens for the garden cluster to all extensions, which are now signed with the new signing key. Read more about it here.\nSimilarly, when the CA certificate rotation is in Preparing phase, then gardener-operator annotates all Seeds with gardener.cloud/operation=renew-kubeconfig. This causes gardenlet to request a new client certificate for its garden cluster kubeconfig, which is now signed with the new client CA, and which also contains the new CA bundle for the server certificate verification. Read more about it here.\nAlso, when the WorkloadIdentity token signing key rotation is in Preparing phase, then gardener-operator annotates all Seeds with gardener.cloud/operation=renew-workload-identity-tokens. This causes gardenlet to renew all workload identity tokens in the seed cluster with new tokens now signed with the new signing key.\nMigrating an Existing Gardener Landscape to gardener-operator Since gardener-operator was only developed in 2023, six years after the Gardener project initiation, most users probably already have an existing Gardener landscape. 
The most prominent installation procedure is garden-setup, however experience shows that most community members have developed their own tooling for managing the garden cluster and the Gardener control plane components.\n Consequently, providing a general migration guide is not possible since the detailed steps vary heavily based on how the components were set up previously. As a result, this section can only highlight the most important caveats and things to know, while the concrete migration steps must be figured out individually based on the existing installation.\nPlease test your migration procedure thoroughly. Note that in some cases it can be easier to set up a fresh landscape with gardener-operator, restore the ETCD data, switch the DNS records, and issue new credentials for all clients.\n Please make sure that you configure all your desired fields in the Garden resource.\nETCD gardener-operator leverages etcd-druid for managing the virtual-garden-etcd-main and virtual-garden-etcd-events, similar to how shoot cluster control planes are handled. The PersistentVolumeClaim names differ slightly - for virtual-garden-etcd-events it’s virtual-garden-etcd-events-virtual-garden-etcd-events-0, while for virtual-garden-etcd-main it’s main-virtual-garden-etcd-virtual-garden-etcd-main-0. The easiest approach for the migration is to make your existing ETCD volumes follow the same naming scheme. Alternatively, backup your data, let gardener-operator take over ETCD, and then restore your data to the new volume.\nThe backup bucket must be created separately, and its name as well as the respective credentials must be provided via the Garden resource in .spec.virtualCluster.etcd.main.backup.\nvirtual-garden-kube-apiserver Deployment gardener-operator deploys a virtual-garden-kube-apiserver into the runtime cluster. This virtual-garden-kube-apiserver spans a new cluster, called the virtual cluster. 
There are a few certificates and other credentials that should not change during the migration. You have to prepare the environment accordingly by leveraging the secret’s manager capabilities.\n The existing Cluster CA Secret should be labeled with secrets-manager-use-data-for-name=ca. The existing Client CA Secret should be labeled with secrets-manager-use-data-for-name=ca-client. The existing Front Proxy CA Secret should be labeled with secrets-manager-use-data-for-name=ca-front-proxy. The existing Service Account Signing Key Secret should be labeled with secrets-manager-use-data-for-name=service-account-key. The existing ETCD Encryption Key Secret should be labeled with secrets-manager-use-data-for-name=kube-apiserver-etcd-encryption-key. virtual-garden-kube-apiserver Exposure The virtual-garden-kube-apiserver is exposed via a dedicated istio-ingressgateway deployed to namespace virtual-garden-istio-ingress. The virtual-garden-kube-apiserver Service in the garden namespace is only of type ClusterIP. Consequently, DNS records for this API server must target the load balancer IP of the istio-ingressgateway.\nVirtual Garden Kubeconfig gardener-operator does not generate any static token or likewise for access to the virtual cluster. Ideally, human users access it via OIDC only. 
Alternatively, you can create an auto-rotated token that you can use for automation like CI/CD pipelines:\napiVersion: v1 kind: Secret type: Opaque metadata: name: shoot-access-virtual-garden namespace: garden labels: resources.gardener.cloud/purpose: token-requestor resources.gardener.cloud/class: shoot annotations: serviceaccount.resources.gardener.cloud/name: virtual-garden-user serviceaccount.resources.gardener.cloud/namespace: kube-system serviceaccount.resources.gardener.cloud/token-expiration-duration: 3h --- apiVersion: v1 kind: Secret metadata: name: managedresource-virtual-garden-access namespace: garden type: Opaque stringData: clusterrolebinding____gardener.cloud.virtual-garden-access.yaml: |apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: gardener.cloud.sap:virtual-garden roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: cluster-admin subjects: - kind: ServiceAccount name: virtual-garden-user namespace: kube-system --- apiVersion: resources.gardener.cloud/v1alpha1 kind: ManagedResource metadata: name: virtual-garden-access namespace: garden spec: secretRefs: - name: managedresource-virtual-garden-access The shoot-access-virtual-garden Secret will get a .data.token field which can be used to authenticate against the virtual garden cluster. See also this document for more information about the TokenRequestor.\ngardener-apiserver Similar to the virtual-garden-kube-apiserver, the gardener-apiserver also uses a few certificates and other credentials that should not change during the migration. Again, you have to prepare the environment accordingly by leveraging the secret’s manager capabilities.\n The existing ETCD Encryption Key Secret should be labeled with secrets-manager-use-data-for-name=gardener-apiserver-etcd-encryption-key. 
Also note that gardener-operator manages the Service and Endpoints resources for the gardener-apiserver in the virtual cluster within the kube-system namespace (garden-setup uses the garden namespace).\nLocal Development The easiest setup is using a local KinD cluster and the Skaffold based approach to deploy and develop the gardener-operator.\nSetting Up the KinD Cluster (runtime cluster) make kind-operator-up This command sets up a new KinD cluster named gardener-local and stores the kubeconfig in the ./example/gardener-local/kind/operator/kubeconfig file.\n It might be helpful to copy this file to $HOME/.kube/config, since you will need to target this KinD cluster multiple times. Alternatively, make sure to set your KUBECONFIG environment variable to ./example/gardener-local/kind/operator/kubeconfig for all future steps via export KUBECONFIG=$PWD/example/gardener-local/kind/operator/kubeconfig.\n All the following steps assume that you are using this kubeconfig.\nSetting Up Gardener Operator make operator-up This will first build the base images (which might take a bit if you do it for the first time). Afterwards, the Gardener Operator resources will be deployed into the cluster.\nDeveloping Gardener Operator (Optional) make operator-dev This is similar to make operator-up but additionally starts a skaffold dev loop. After the initial deployment, skaffold starts watching source files. Once it has detected changes, press any key to trigger a new build and deployment of the changed components.\nDebugging Gardener Operator (Optional) make operator-debug This is similar to make gardener-debug but for Gardener Operator component. 
Please check Debugging Gardener for details.\nCreating a Garden In order to create a garden, just run:\nkubectl apply -f example/operator/20-garden.yaml You can wait for the Garden to be ready by running:\n./hack/usage/wait-for.sh garden local VirtualGardenAPIServerAvailable VirtualComponentsHealthy Alternatively, you can run kubectl get garden and wait for the RECONCILED status to reach True:\nNAME LAST OPERATION RUNTIME VIRTUAL API SERVER OBSERVABILITY AGE local Processing False False False False 1s (Optional): Instead of creating the above Garden resource manually, you could execute the e2e tests by running:\nmake test-e2e-local-operator Accessing the Virtual Garden Cluster ⚠️ Please note that in this setup, the virtual garden cluster is not accessible by default when you download the kubeconfig and try to communicate with it. The reason is that your host most probably cannot resolve the DNS name of the cluster. Hence, if you want to access the virtual garden cluster, you have to run the following command, which extends your /etc/hosts file with the information required to make the DNS names resolvable:\ncat \u003c\u003cEOF | sudo tee -a /etc/hosts # Manually created to access local Gardener virtual garden cluster. # TODO: Remove this again when the virtual garden cluster access is no longer required. 172.18.255.3 api.virtual-garden.local.gardener.cloud EOF To access the virtual garden, you can acquire a kubeconfig by running:\nkubectl -n garden get secret gardener -o jsonpath={.data.kubeconfig} | base64 -d \u003e /tmp/virtual-garden-kubeconfig kubectl --kubeconfig /tmp/virtual-garden-kubeconfig get namespaces Note that this kubeconfig uses a token that is valid for 12h only, so it might expire and require you to download the kubeconfig again.\nCreating Seeds and Shoots You can also create Seeds and Shoots from your local development setup. 
Please see here for details.\nDeleting the Garden ./hack/usage/delete garden local Tear Down the Gardener Operator Environment make operator-down make kind-operator-down ","categories":"","description":"Understand the component responsible for the garden cluster environment and its various features","excerpt":"Understand the component responsible for the garden cluster …","ref":"/docs/gardener/concepts/operator/","tags":"","title":"Gardener Operator"},{"body":"Overview Initially, the gardener-resource-manager was a project similar to the kube-addon-manager. It manages Kubernetes resources in a target cluster, which means that it creates, updates, and deletes them. Also, it makes sure that manual modifications to these resources are reconciled back to the desired state.\nIn the Gardener project, we had been using the kube-addon-manager for more than two years. As we progressed with our extensibility story (moving cloud providers out-of-tree), we decided that the kube-addon-manager was no longer suitable for this use case. The problem with it is that it needs to have its managed resources on its file system. This requires storing the resources in ConfigMaps or Secrets and mounting them to the kube-addon-manager pod during deployment time. The gardener-resource-manager uses CustomResourceDefinitions, which allow resources to be added, changed, and removed dynamically with immediate effect and without the need to reconfigure the volume mounts or restart the pod.\nMeanwhile, the gardener-resource-manager has evolved into a more generic component comprising several controllers and webhook handlers. It is deployed by gardenlet once per seed (in the garden namespace) and once per shoot (in the respective shoot namespaces in the seed).\nComponent Configuration Similar to other Gardener components, the gardener-resource-manager uses a so-called component configuration file. 
It allows specifying certain central settings like log level and formatting, client connection configuration, server ports and bind addresses, etc. In addition, controllers and webhooks can be configured and sometimes even disabled.\nNote that the very basic ManagedResource and health controllers cannot be disabled.\nYou can find an example configuration file here.\nControllers ManagedResource Controller This controller watches custom objects called ManagedResources in the resources.gardener.cloud/v1alpha1 API group. These objects contain references to secrets, which itself contain the resources to be managed. The reason why a Secret is used to store the resources is that they could contain confidential information like credentials.\n--- apiVersion: v1 kind: Secret metadata: name: managedresource-example1 namespace: default type: Opaque data: objects.yaml: YXBpVmVyc2lvbjogdjEKa2luZDogQ29uZmlnTWFwCm1ldGFkYXRhOgogIG5hbWU6IHRlc3QtMTIzNAogIG5hbWVzcGFjZTogZGVmYXVsdAotLS0KYXBpVmVyc2lvbjogdjEKa2luZDogQ29uZmlnTWFwCm1ldGFkYXRhOgogIG5hbWU6IHRlc3QtNTY3OAogIG5hbWVzcGFjZTogZGVmYXVsdAo= # apiVersion: v1 # kind: ConfigMap # metadata: # name: test-1234 # namespace: default # --- # apiVersion: v1 # kind: ConfigMap # metadata: # name: test-5678 # namespace: default --- apiVersion: resources.gardener.cloud/v1alpha1 kind: ManagedResource metadata: name: example namespace: default spec: secretRefs: - name: managedresource-example1 In the above example, the controller creates two ConfigMaps in the default namespace. 
When a user manually modifies them, they are reconciled back to the desired state stored in the managedresource-example1 secret.\nIt is also possible to inject labels into all the resources:\n--- apiVersion: v1 kind: Secret metadata: name: managedresource-example2 namespace: default type: Opaque data: other-objects.yaml: YXBpVmVyc2lvbjogYXBwcy92MSAjIGZvciB2ZXJzaW9ucyBiZWZvcmUgMS45LjAgdXNlIGFwcHMvdjFiZXRhMgpraW5kOiBEZXBsb3ltZW50Cm1ldGFkYXRhOgogIG5hbWU6IG5naW54LWRlcGxveW1lbnQKc3BlYzoKICBzZWxlY3RvcjoKICAgIG1hdGNoTGFiZWxzOgogICAgICBhcHA6IG5naW54CiAgcmVwbGljYXM6IDIgIyB0ZWxscyBkZXBsb3ltZW50IHRvIHJ1biAyIHBvZHMgbWF0Y2hpbmcgdGhlIHRlbXBsYXRlCiAgdGVtcGxhdGU6CiAgICBtZXRhZGF0YToKICAgICAgbGFiZWxzOgogICAgICAgIGFwcDogbmdpbngKICAgIHNwZWM6CiAgICAgIGNvbnRhaW5lcnM6CiAgICAgIC0gbmFtZTogbmdpbngKICAgICAgICBpbWFnZTogbmdpbng6MS43LjkKICAgICAgICBwb3J0czoKICAgICAgICAtIGNvbnRhaW5lclBvcnQ6IDgwCg== # apiVersion: apps/v1 # kind: Deployment # metadata: # name: nginx-deployment # spec: # selector: # matchLabels: # app: nginx # replicas: 2 # tells deployment to run 2 pods matching the template # template: # metadata: # labels: # app: nginx # spec: # containers: # - name: nginx # image: nginx:1.7.9 # ports: # - containerPort: 80 --- apiVersion: resources.gardener.cloud/v1alpha1 kind: ManagedResource metadata: name: example namespace: default spec: secretRefs: - name: managedresource-example2 injectLabels: foo: bar In this example, the label foo=bar will be injected into the Deployment, as well as into all created ReplicaSets and Pods.\nPreventing Reconciliations If a ManagedResource is annotated with resources.gardener.cloud/ignore=true, then it will be skipped entirely by the controller (no reconciliations or deletions of managed resources at all). However, when the ManagedResource itself is deleted (for example when a shoot is deleted), then the annotation is not respected and all resources will be deleted as usual. 
This feature can be helpful to temporarily patch/change resources managed as part of such a ManagedResource. Condition checks will be skipped for such ManagedResources.\nModes The gardener-resource-manager can manage a resource in the following supported modes:\n Ignore The corresponding resource is removed from the ManagedResource status (.status.resources). No action is performed on the cluster. The resource is no longer “managed” (updated or deleted). The primary use case is a migration of a resource from one ManagedResource to another one. The mode for a resource can be specified with the resources.gardener.cloud/mode annotation. The annotation should be specified in the encoded resource manifest in the Secret that is referenced by the ManagedResource.\nResource Class and Reconciliation Scope By default, the gardener-resource-manager controller watches for ManagedResources in all namespaces. The .sourceClientConnection.namespace field in the component configuration restricts the watch to ManagedResources in a single namespace only. Note that this setting also affects all other controllers and webhooks since it’s a central configuration.\nA ManagedResource has an optional .spec.class field that allows it to indicate that it belongs to a given class of resources. The .controllers.resourceClass field in the component configuration restricts the watch to ManagedResources with the given .spec.class. A default class is assumed if no class is specified.\nFor instance, the gardener-resource-manager which is deployed in the Shoot’s control plane namespace in the Seed does not specify a .spec.class and watches only for resources in the control plane namespace by specifying it in the .sourceClientConnection.namespace field.\nIf the .spec.class changes, the resources have to be handled by a different Gardener Resource Manager. 
That is achieved by:\n Cleaning all referenced resources by the Gardener Resource Manager that was responsible for the old class in its target cluster. Creating all referenced resources by the Gardener Resource Manager that is responsible for the new class in its target cluster. Conditions A ManagedResource has a ManagedResourceStatus, which has an array of Conditions. Conditions currently include:\n Condition Description ResourcesApplied True if all resources are applied to the target cluster ResourcesHealthy True if all resources are present and healthy ResourcesProgressing False if all resources have been fully rolled out ResourcesApplied may be False when:\n the resource apiVersion is not known to the target cluster the resource spec is invalid (for example the label value does not match the required regex for it) … ResourcesHealthy may be False when:\n the resource is not found the resource is a Deployment and the Deployment does not have the minimum availability. … ResourcesProgressing may be True when:\n a Deployment, StatefulSet or DaemonSet has not been fully rolled out yet, i.e. not all replicas have been updated with the latest changes to spec.template. there are still old Pods belonging to an older ReplicaSet of a Deployment which are not terminated yet. Each Kubernetes resource has a different notion of being healthy. For example, a Deployment is considered healthy if the controller observed its current revision and if the number of updated replicas is equal to the number of replicas.\nThe following status.conditions section describes a healthy ManagedResource:\nconditions: - lastTransitionTime: \"2022-05-03T10:55:39Z\" lastUpdateTime: \"2022-05-03T10:55:39Z\" message: All resources are healthy. reason: ResourcesHealthy status: \"True\" type: ResourcesHealthy - lastTransitionTime: \"2022-05-03T10:55:36Z\" lastUpdateTime: \"2022-05-03T10:55:36Z\" message: All resources have been fully rolled out. 
reason: ResourcesRolledOut status: \"False\" type: ResourcesProgressing - lastTransitionTime: \"2022-05-03T10:55:18Z\" lastUpdateTime: \"2022-05-03T10:55:18Z\" message: All resources are applied. reason: ApplySucceeded status: \"True\" type: ResourcesApplied Ignoring Updates In some cases, it is not desirable to update or re-apply some of the cluster components (for example, if customization is required or needs to be applied by the end-user). For these resources, the annotation “resources.gardener.cloud/ignore” needs to be set to “true” or a truthy value (Truthy values are “1”, “t”, “T”, “true”, “TRUE”, “True”) in the corresponding managed resource secrets. This can be done from the components that create the managed resource secrets, for example Gardener extensions or Gardener. Once this is done, the resource will be initially created and later ignored during reconciliation.\nFinalizing Deletion of Resources After Grace Period When a ManagedResource is deleted, the controller deletes all managed resources from the target cluster. In case the resources still have entries in their .metadata.finalizers[] list, they will remain stuck in the system until another entity removes the finalizers. If you want the controller to forcefully finalize the deletion after some grace period (i.e., setting .metadata.finalizers=null), you can annotate the managed resources with resources.gardener.cloud/finalize-deletion-after=\u003cduration\u003e, e.g., resources.gardener.cloud/finalize-deletion-after=1h.\nPreserving replicas or resources in Workload Resources The objects which are part of the ManagedResource can be annotated with:\n resources.gardener.cloud/preserve-replicas=true in case the .spec.replicas field of workload resources like Deployments, StatefulSets, etc., shall be preserved during updates. 
resources.gardener.cloud/preserve-resources=true in case the .spec.containers[*].resources fields of all containers of workload resources like Deployments, StatefulSets, etc., shall be preserved during updates. This can be useful if there are non-standard horizontal/vertical auto-scaling mechanisms in place. Standard mechanisms like HorizontalPodAutoscaler or VerticalPodAutoscaler will be auto-recognized by gardener-resource-manager, i.e., in such cases the annotations are not needed.\n Origin All the objects managed by the resource manager get a dedicated annotation resources.gardener.cloud/origin describing the ManagedResource object that describes this object. The default format is \u003cnamespace\u003e/\u003cobjectname\u003e.\nIn multi-cluster scenarios (the ManagedResource objects are maintained in a cluster different from the one the described objects are managed), it might be useful to include the cluster identity, as well.\nThis can be enforced by setting the .controllers.clusterID field in the component configuration. Here, several possibilities are supported:\n given a direct value: use this as id for the source cluster. \u003ccluster\u003e: read the cluster identity from a cluster-identity config map in the kube-system namespace (attribute cluster-identity). This is automatically maintained in all clusters managed or involved in a gardener landscape. \u003cdefault\u003e: try to read the cluster identity from the config map. If not found, no identity is used. empty string: no cluster identity is used (completely cluster local scenarios). By default, cluster id is not used. If cluster id is specified, the format is \u003ccluster id\u003e:\u003cnamespace\u003e/\u003cobjectname\u003e.\nIn addition to the origin annotation, all objects managed by the resource manager get a dedicated label resources.gardener.cloud/managed-by. This label can be used to describe these objects with a selector. 
By default, it is set to “gardener”, but this can be overridden by setting the .controllers.managedResources.managedByLabelValue field in the component configuration.\nCompression The number and size of manifests for a ManagedResource can accumulate to a considerable amount, which leads to increased Secret data. A decent compression algorithm helps to reduce the footprint of such Secrets and the load they put on etcd, the kube-apiserver, and client caches. We found Brotli to be a suitable candidate for most use cases (see comparison table here). When the gardener-resource-manager detects a data key with the known suffix .br, it automatically decompresses the data before processing the contained manifest.\nhealth Controller This controller processes ManagedResources that were reconciled by the main ManagedResource Controller at least once. Its main job is to perform checks for maintaining the well-known conditions ResourcesHealthy and ResourcesProgressing.\nProgressing Checks In Kubernetes, applied changes must usually be rolled out first, e.g. when changing the base image in a Deployment. Progressing checks detect ongoing roll-outs and report them in the ResourcesProgressing condition of the corresponding ManagedResource.\nThe following object kinds are considered for progressing checks:\n DaemonSet Deployment StatefulSet Prometheus Alertmanager Certificate Issuer Health Checks gardener-resource-manager can evaluate the health of specific resources, often by consulting their conditions. 
Health check results are regularly updated in the ResourcesHealthy condition of the corresponding ManagedResource.\nThe following object kinds are considered for health checks:\n CustomResourceDefinition DaemonSet Deployment Job Pod ReplicaSet ReplicationController Service StatefulSet VerticalPodAutoscaler Prometheus Alertmanager Certificate Issuer Skipping Health Check If a resource owned by a ManagedResource is annotated with resources.gardener.cloud/skip-health-check=true, then the resource will be skipped during health checks by the health controller. The ManagedResource conditions will not reflect the health condition of this resource anymore. The ResourcesProgressing condition will also be set to False.\nGarbage Collector For Immutable ConfigMaps/Secrets In Kubernetes, workload resources (e.g., Pods) can mount ConfigMaps or Secrets or reference them via environment variables in containers. Typically, when the content of such a ConfigMap/Secret gets changed, then the respective workload is usually not dynamically reloading the configuration, i.e., a restart is required. The most commonly used approach is probably having the so-called checksum annotations in the pod template, which makes Kubernetes recreate the pod if the checksum changes. However, it has the downside that old, still running versions of the workload might not be able to properly work with the already updated content in the ConfigMap/Secret, potentially causing application outages.\nIn order to protect users from such outages (and also to improve the performance of the cluster), the Kubernetes community provides the “immutable ConfigMaps/Secrets feature”. Enabling immutability requires ConfigMaps/Secrets to have unique names. 
Having unique names requires the client to delete ConfigMaps/Secrets no longer in use.\nIn order to provide a similarly lightweight experience for clients (compared to the well-established checksum annotation approach), the gardener-resource-manager features an optional garbage collector controller (disabled by default). The purpose of this controller is to clean up such immutable ConfigMaps/Secrets if they are no longer in use.\nHow Does the Garbage Collector Work? The following algorithm is implemented in the GC controller:\n List all ConfigMaps and Secrets labeled with resources.gardener.cloud/garbage-collectable-reference=true. List all Deployments, StatefulSets, DaemonSets, Jobs, CronJobs, Pods, ManagedResources and for each of them: iterate over the .metadata.annotations and for each of them: If the annotation key follows the reference.resources.gardener.cloud/{configmap,secret}-\u003chash\u003e scheme and the value equals \u003cname\u003e, then consider it as “in-use”. Delete all ConfigMaps and Secrets not considered as “in-use”. Consequently, clients need to:\n Create immutable ConfigMaps/Secrets with unique names (e.g., a checksum suffix based on the .data).\n Label such ConfigMaps/Secrets with resources.gardener.cloud/garbage-collectable-reference=true.\n Annotate their workload resources with reference.resources.gardener.cloud/{configmap,secret}-\u003chash\u003e=\u003cname\u003e for all ConfigMaps/Secrets used by the containers of the respective Pods.\n⚠️ Add such annotations to .metadata.annotations, as well as to all templates of other resources (e.g., .spec.template.metadata.annotations in Deployments or .spec.jobTemplate.metadata.annotations and .spec.jobTemplate.spec.template.metadata.annotations for CronJobs). 
This ensures that the GC controller does not unintentionally consider ConfigMaps/Secrets as “not in use” just because there isn’t a Pod referencing them anymore (e.g., they could still be used by a Deployment scaled down to 0).\n ℹ️ For the last step, there is a helper function InjectAnnotations in the pkg/controller/garbagecollector/references, which you can use for your convenience.\nExample:\n--- apiVersion: v1 kind: ConfigMap metadata: name: test-1234 namespace: default labels: resources.gardener.cloud/garbage-collectable-reference: \"true\" --- apiVersion: v1 kind: ConfigMap metadata: name: test-5678 namespace: default labels: resources.gardener.cloud/garbage-collectable-reference: \"true\" --- apiVersion: v1 kind: Pod metadata: name: example namespace: default annotations: reference.resources.gardener.cloud/configmap-82a3537f: test-5678 spec: containers: - name: nginx image: nginx:1.14.2 terminationGracePeriodSeconds: 2 The GC controller would delete the ConfigMap/test-1234 because it is considered as not “in-use”.\nℹ️ If the GC controller is activated then the ManagedResource controller will no longer delete ConfigMaps/Secrets having the above label.\nHow to Activate the Garbage Collector? The GC controller can be activated by setting the .controllers.garbageCollector.enabled field to true in the component configuration.\nTokenInvalidator Controller The Kubernetes community is slowly transitioning from static ServiceAccount token Secrets to ServiceAccount Token Volume Projection. 
Typically, when you create a ServiceAccount\napiVersion: v1 kind: ServiceAccount metadata: name: default then the serviceaccount-token controller (part of kube-controller-manager) auto-generates a Secret with a static token:\napiVersion: v1 kind: Secret metadata: annotations: kubernetes.io/service-account.name: default kubernetes.io/service-account.uid: 86e98645-2e05-11e9-863a-b2d4d086dd5a name: default-token-ntxs9 type: kubernetes.io/service-account-token data: ca.crt: base64(cluster-ca-cert) namespace: base64(namespace) token: base64(static-jwt-token) Unfortunately, when using ServiceAccount Token Volume Projection in a Pod, this static token is actually not used at all:\napiVersion: v1 kind: Pod metadata: name: nginx spec: serviceAccountName: default containers: - image: nginx name: nginx volumeMounts: - mountPath: /var/run/secrets/tokens name: token volumes: - name: token projected: sources: - serviceAccountToken: path: token expirationSeconds: 7200 While the Pod is now using an expiring and auto-rotated token, the static token is still generated and valid.\nThere is neither a way of preventing kube-controller-manager from generating such static tokens, nor a way to proactively remove or invalidate them:\n https://github.com/kubernetes/kubernetes/issues/77599 https://github.com/kubernetes/kubernetes/issues/77600 Disabling the serviceaccount-token controller is an option; however, especially in the Gardener context, it may either break end-users or it may not even be possible to control such settings. Also, even if a future Kubernetes version supports native configuration of the above behaviour, Gardener still supports older versions which won’t get such features but need a solution as well.\nThis is where the TokenInvalidator comes into play: Since it is not possible to prevent kube-controller-manager from generating static ServiceAccount Secrets, the TokenInvalidator, as its name suggests, simply invalidates these tokens. 
It considers all such Secrets belonging to ServiceAccounts with .automountServiceAccountToken=false. By default, all namespaces in the target cluster are watched; however, this can be configured by specifying the .targetClientConnection.namespace field in the component configuration. Note that this setting also affects all other controllers and webhooks since it’s a central configuration.\napiVersion: v1 kind: ServiceAccount metadata: name: my-serviceaccount automountServiceAccountToken: false This will result in a static ServiceAccount token secret whose token value is invalid:\napiVersion: v1 kind: Secret metadata: annotations: kubernetes.io/service-account.name: my-serviceaccount kubernetes.io/service-account.uid: 86e98645-2e05-11e9-863a-b2d4d086dd5a name: my-serviceaccount-token-ntxs9 type: kubernetes.io/service-account-token data: ca.crt: base64(cluster-ca-cert) namespace: base64(namespace) token: AAAA Any attempt to regenerate the token or to create a new such secret will cause the component to invalidate it again.\n You can opt out of this behaviour for ServiceAccounts that set .automountServiceAccountToken=false by labeling them with token-invalidator.resources.gardener.cloud/skip=true.\n In order to enable the TokenInvalidator, you have to set both .controllers.tokenValidator.enabled=true and .webhooks.tokenValidator.enabled=true in the component configuration.\nThe below graphic shows an overview of the Token Invalidator for Service account secrets in the Shoot cluster. TokenRequestor Controller This controller provides the service to create and auto-renew tokens via the TokenRequest API.\nIt provides a functionality similar to the kubelet’s Service Account Token Volume Projection. 
It was created to handle the special case of issuing tokens to pods that run in a different cluster than the API server they communicate with (hence, using the native token volume projection feature is not possible).\nThe controller differentiates between source cluster and target cluster. The source cluster hosts the gardener-resource-manager pod. Secrets in this cluster are watched and modified by the controller. The target cluster can be configured to point to another cluster. The existence of ServiceAccounts is ensured and token requests are issued against the target. When the gardener-resource-manager is deployed next to the Shoot’s control plane in the Seed, the source cluster is the Seed while the target cluster points to the Shoot.\nReconciliation Loop This controller reconciles Secrets in all namespaces in the source cluster with the label: resources.gardener.cloud/purpose=token-requestor. See this YAML file for an example of the secret.\nThe controller ensures a ServiceAccount exists in the target cluster as specified in the annotations of the Secret in the source cluster:\nserviceaccount.resources.gardener.cloud/name: \u003csa-name\u003e serviceaccount.resources.gardener.cloud/namespace: \u003csa-namespace\u003e You can optionally annotate the Secret with serviceaccount.resources.gardener.cloud/labels, e.g. serviceaccount.resources.gardener.cloud/labels={\"some\":\"labels\",\"foo\":\"bar\"}. The ServiceAccount is then labelled accordingly.\nThe requested tokens will act with the privileges which are assigned to this ServiceAccount.\nThe controller will then request a token via the TokenRequest API and populate it into the .data.token field of the Secret in the source cluster.\nAlternatively, the client can provide a raw kubeconfig (in YAML or JSON format) via the Secret’s .data.kubeconfig field. The controller will then populate the requested token in the kubeconfig for the user used in the .current-context. 
For example, if .data.kubeconfig is\napiVersion: v1 clusters: - cluster: certificate-authority-data: AAAA server: some-server-url name: shoot--foo--bar contexts: - context: cluster: shoot--foo--bar user: shoot--foo--bar-token name: shoot--foo--bar current-context: shoot--foo--bar kind: Config preferences: {} users: - name: shoot--foo--bar-token user: token: \"\" then the .users[0].user.token field of the kubeconfig will be updated accordingly.\nThe controller also adds an annotation to the Secret to keep track of when to renew the token before it expires. By default, the tokens are issued to expire after 12 hours. The expiration time can be set with the following annotation:\nserviceaccount.resources.gardener.cloud/token-expiration-duration: 6h The controller automatically renews the token once 80% of its lifetime is reached, or after 24h.\nOptionally, the controller can also populate the token into a Secret in the target cluster. This can be requested by annotating the Secret in the source cluster with:\ntoken-requestor.resources.gardener.cloud/target-secret-name: \"foo\" token-requestor.resources.gardener.cloud/target-secret-namespace: \"bar\" Overall, the TokenRequestor controller provides credentials with limited lifetime (JWT tokens) used by Shoot control plane components running in the Seed to talk to the Shoot API Server. Please see the graphic below:\n ℹ️ Generally, the controller can run with multiple instances in different components. For example, gardener-resource-manager might run the TokenRequestor controller, but gardenlet might run it, too. In order to differentiate which instance of the controller is responsible for a Secret, the Secret can be labeled with resources.gardener.cloud/class=\u003cclass\u003e. 
The \u003cclass\u003e must be configured in the respective controller, otherwise it will be responsible for all Secrets no matter whether they have the label or not.\n Kubelet Server CertificateSigningRequest Approver Gardener configures the kubelets such that they request two certificates via the CertificateSigningRequest API:\n client certificate for communicating with the kube-apiserver server certificate for serving its HTTPS server For client certificates, the kubernetes.io/kube-apiserver-client-kubelet signer is used (see Certificate Signing Requests for more details). The kube-controller-manager’s csrapprover controller is responsible for auto-approving such CertificateSigningRequests so that the respective certificates can be issued.\nFor server certificates, the kubernetes.io/kubelet-serving signer is used. Unfortunately, the kube-controller-manager is not able to auto-approve such CertificateSigningRequests (see kubernetes/kubernetes#73356 for details).\nThat’s the motivation for having this controller as part of gardener-resource-manager. It watches CertificateSigningRequests with the kubernetes.io/kubelet-serving signer and auto-approves them when all the following conditions are met:\n The .spec.username is prefixed with system:node:. There must be at least one DNS name or IP address as part of the certificate SANs. The common name in the CSR must match the .spec.username. The organization in the CSR must only contain system:nodes. There must be a Node object with the same name in the shoot cluster. There must be exactly one Machine for the node in the seed cluster. The DNS names part of the SANs must be equal to all .status.addresses[] of type Hostname in the Node. The IP addresses part of the SANs must be equal to all .status.addresses[] of type InternalIP in the Node. If any one of these requirements is violated, the CertificateSigningRequest will be denied. 
Otherwise, once approved, the kube-controller-manager’s csrsigner controller will issue the requested certificate.\nNetworkPolicy Controller This controller reconciles Services with a non-empty .spec.selector. It creates two NetworkPolicys (one for ingress, one for egress) for each port in the .spec.ports[] list. For example:\napiVersion: v1 kind: Service metadata: name: gardener-resource-manager namespace: a spec: selector: app: gardener-resource-manager ports: - name: server port: 443 protocol: TCP targetPort: 10250 leads to\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows ingress TCP traffic to port 10250 for pods selected by the a/gardener-resource-manager service selector from pods running in namespace a labeled with map[networking.resources.gardener.cloud/to-gardener-resource-manager-tcp-10250:allowed]. name: ingress-to-gardener-resource-manager-tcp-10250 namespace: a spec: ingress: - from: - podSelector: matchLabels: networking.resources.gardener.cloud/to-gardener-resource-manager-tcp-10250: allowed ports: - port: 10250 protocol: TCP podSelector: matchLabels: app: gardener-resource-manager policyTypes: - Ingress --- apiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows egress TCP traffic to port 10250 from pods running in namespace a labeled with map[networking.resources.gardener.cloud/to-gardener-resource-manager-tcp-10250:allowed] to pods selected by the a/gardener-resource-manager service selector. 
name: egress-to-gardener-resource-manager-tcp-10250 namespace: a spec: egress: - to: - podSelector: matchLabels: app: gardener-resource-manager ports: - port: 10250 protocol: TCP podSelector: matchLabels: networking.resources.gardener.cloud/to-gardener-resource-manager-tcp-10250: allowed policyTypes: - Egress A component that initiates the connection to gardener-resource-manager’s tcp/10250 port can now be labeled with networking.resources.gardener.cloud/to-gardener-resource-manager-tcp-10250=allowed. That’s all this component needs to do - it does not need to create any NetworkPolicys itself.\nCross-Namespace Communication Apart from this “simple” case where both communicating components run in the same namespace a, there is also the cross-namespace communication case. With above example, let’s say there are components running in another namespace b, and they would like to initiate the communication with gardener-resource-manager in a. To cover this scenario, the Service can be annotated with networking.resources.gardener.cloud/namespace-selectors='[{\"matchLabels\":{\"kubernetes.io/metadata.name\":\"b\"}}]'.\n Note that you can specify multiple namespace selectors in this annotation which are OR-ed.\n This will make the controller create additional NetworkPolicys as follows:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows ingress TCP traffic to port 10250 for pods selected by the a/gardener-resource-manager service selector from pods running in namespace b labeled with map[networking.resources.gardener.cloud/to-a-gardener-resource-manager-tcp-10250:allowed]. 
name: ingress-to-gardener-resource-manager-tcp-10250-from-b namespace: a spec: ingress: - from: - namespaceSelector: matchLabels: kubernetes.io/metadata.name: b podSelector: matchLabels: networking.resources.gardener.cloud/to-a-gardener-resource-manager-tcp-10250: allowed ports: - port: 10250 protocol: TCP podSelector: matchLabels: app: gardener-resource-manager policyTypes: - Ingress --- apiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows egress TCP traffic to port 10250 from pods running in namespace b labeled with map[networking.resources.gardener.cloud/to-a-gardener-resource-manager-tcp-10250:allowed] to pods selected by the a/gardener-resource-manager service selector. name: egress-to-a-gardener-resource-manager-tcp-10250 namespace: b spec: egress: - to: - namespaceSelector: matchLabels: kubernetes.io/metadata.name: a podSelector: matchLabels: app: gardener-resource-manager ports: - port: 10250 protocol: TCP podSelector: matchLabels: networking.resources.gardener.cloud/to-a-gardener-resource-manager-tcp-10250: allowed policyTypes: - Egress The components in namespace b now need to be labeled with networking.resources.gardener.cloud/to-a-gardener-resource-manager-tcp-10250=allowed, but that’s already it.\n Obviously, this approach also works for namespace selectors different from kubernetes.io/metadata.name to cover scenarios where the namespace name is not known upfront or where multiple namespaces with a similar label are relevant. 
The controller creates two dedicated policies for each namespace matching the selectors.\n Service Targets In Multiple Namespaces Finally, let’s say there is a Service called example which exists in different namespaces whose names are not static (e.g., foo-1, foo-2), and a component in namespace bar wants to initiate connections with all of them.\nThe example Services in these namespaces can now be annotated with networking.resources.gardener.cloud/namespace-selectors='[{\"matchLabels\":{\"kubernetes.io/metadata.name\":\"bar\"}}]'. As a consequence, the component in namespace bar now needs to be labeled with networking.resources.gardener.cloud/to-foo-1-example-tcp-8080=allowed, networking.resources.gardener.cloud/to-foo-2-example-tcp-8080=allowed, etc. This approach does not work in practice, however, since the namespace names are neither static nor known upfront.\nTo overcome this, it is possible to specify an alias for the concrete namespace in the pod label selector via the networking.resources.gardener.cloud/pod-label-selector-namespace-alias annotation.\nIn above case, the example Service in the foo-* namespaces could be annotated with networking.resources.gardener.cloud/pod-label-selector-namespace-alias=all-foos. This would modify the label selector in all NetworkPolicys related to cross-namespace communication, i.e. instead of networking.resources.gardener.cloud/to-foo-{1,2,...}-example-tcp-8080=allowed, networking.resources.gardener.cloud/to-all-foos-example-tcp-8080=allowed would be used. Now the component in namespace bar only needs this single label and is able to talk to all such Services in the different namespaces.\n Real-world examples for this scenario are the kube-apiserver Service (which exists in all shoot namespaces), or the istio-ingressgateway Service (which exists in all istio-ingress* namespaces). 
In both cases, the names of the namespaces are not statically known and depend on user input.\n Overwriting The Pod Selector Label For a component which initiates the connection to many other components, it’s sometimes impractical to specify all the respective labels in its pod template. For example, let’s say a component foo talks to bar{0..9} on ports tcp/808{0..9}. foo would need to have the ten networking.resources.gardener.cloud/to-bar{0..9}-tcp-808{0..9}=allowed labels.\nAs an alternative and to simplify this, it is also possible to annotate the targeted Services with networking.resources.gardener.cloud/from-\u003csome-alias\u003e-allowed-ports. For our example, \u003csome-alias\u003e could be all-bars.\nAs a result, component foo just needs to have the label networking.resources.gardener.cloud/to-all-bars=allowed instead of all the other ten explicit labels.\n⚠️ Note that this also requires to specify the list of allowed container ports as annotation value since the pod selector label will no longer be specific for a dedicated service/port. For our example, the Service for barX with X in {0..9} needs to be annotated with networking.resources.gardener.cloud/from-all-bars-allowed-ports=[{\"port\":808X,\"protocol\":\"TCP\"}] in addition.\n Real-world examples for this scenario are the Prometheis in seed clusters which initiate the communication to a lot of components in order to scrape their metrics. Another example is the kube-apiserver which initiates the communication to webhook servers (potentially of extension components that are not known by Gardener itself).\n Ingress From Everywhere All above scenarios are about components initiating connections to some targets. However, some components also receive incoming traffic from sources outside the cluster. 
This traffic requires adequate ingress policies so that it can be allowed.\nTo cover this scenario, the Service can be annotated with networking.resources.gardener.cloud/from-world-to-ports=[{\"port\":\"10250\",\"protocol\":\"TCP\"}]. As a result, the controller creates the following NetworkPolicy:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: ingress-to-gardener-resource-manager-from-world namespace: a spec: ingress: - from: - namespaceSelector: {} podSelector: {} - ipBlock: cidr: 0.0.0.0/0 - ipBlock: cidr: ::/0 ports: - port: 10250 protocol: TCP podSelector: matchLabels: app: gardener-resource-manager policyTypes: - Ingress The respective pods don’t need any additional labels. If the annotation’s value is empty ([]) then all ports are allowed.\nServices Exposed via Ingress Resources The controller can optionally be configured to watch Ingress resources by specifying the pod and namespace selectors for the Ingress controller. If this information is provided, it automatically creates NetworkPolicy resources allowing the respective ingress/egress traffic for the backends exposed by the Ingresses. 
This way, neither custom NetworkPolicys nor custom labels must be provided.\nThe needed configuration is part of the component configuration:\ncontrollers: networkPolicy: enabled: true concurrentSyncs: 5 # namespaceSelectors: # - matchLabels: # kubernetes.io/metadata.name: default ingressControllerSelector: namespace: default podSelector: matchLabels: foo: bar As an example, let’s assume that above gardener-resource-manager Service was exposed via the following Ingress resource:\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: gardener-resource-manager namespace: a spec: rules: - host: grm.foo.example.com http: paths: - backend: service: name: gardener-resource-manager port: number: 443 path: / pathType: Prefix As a result, the controller would automatically create the following NetworkPolicys:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows ingress TCP traffic to port 10250 for pods selected by the a/gardener-resource-manager service selector from ingress controller pods running in the default namespace labeled with map[foo:bar]. name: ingress-to-gardener-resource-manager-tcp-10250-from-ingress-controller namespace: a spec: ingress: - from: - podSelector: matchLabels: foo: bar namespaceSelector: matchLabels: kubernetes.io/metadata.name: default ports: - port: 10250 protocol: TCP podSelector: matchLabels: app: gardener-resource-manager policyTypes: - Ingress --- apiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows egress TCP traffic to port 10250 from pods running in the default namespace labeled with map[foo:bar] to pods selected by the a/gardener-resource-manager service selector. 
name: egress-to-a-gardener-resource-manager-tcp-10250-from-ingress-controller namespace: default spec: egress: - to: - podSelector: matchLabels: app: gardener-resource-manager namespaceSelector: matchLabels: kubernetes.io/metadata.name: a ports: - port: 10250 protocol: TCP podSelector: matchLabels: foo: bar policyTypes: - Egress ℹ️ Note that Ingress resources reference the service port while NetworkPolicys reference the target port/container port. The controller automatically translates this when reconciling the NetworkPolicy resources.\n Node Controller Critical Components Controller Gardenlet configures kubelet of shoot worker nodes to register the Node object with the node.gardener.cloud/critical-components-not-ready taint (effect NoSchedule). This controller watches newly created Node objects in the shoot cluster and removes the taint once all node-critical components are scheduled and ready. If the controller finds node-critical components that are not scheduled or not ready yet, it checks the Node again after the duration configured in ResourceManagerConfiguration.controllers.node.backoff. Please refer to the feature documentation or proposal issue for more details.\nNode Agent Reconciliation Delay Controller This controller computes a reconciliation delay per node by using a simple linear mapping approach based on the index of the nodes in the list of all nodes in the shoot cluster. This approach ensures that the delays of all instances of gardener-node-agent are distributed evenly.\nThe minimum and maximum delays can be configured, but they are defaulted to 0s and 5m, respectively.\nThis approach works well as long as the number of nodes in the cluster is not higher than the configured maximum delay in seconds. If it is higher, the delay is still computed linearly; however, the more nodes exist in the cluster, the closer the delay times become (which might be of limited use then). 
Consider increasing the maximum delay by annotating the Shoot with shoot.gardener.cloud/cloud-config-execution-max-delay-seconds=\u003cvalue\u003e. The highest possible value is 1800.\nThe controller adds the node-agent.gardener.cloud/reconciliation-delay annotation to nodes whose value is read by the node-agents.\nWebhooks Mutating Webhooks High Availability Config This webhook is used to conveniently apply the configuration to make components deployed to seed or shoot clusters highly available. The details and scenarios are described in High Availability Of Deployed Components.\nThe webhook reacts on creation/update of Deployments, StatefulSets, HorizontalPodAutoscalers and HVPAs in namespaces labeled with high-availability-config.resources.gardener.cloud/consider=true.\nThe webhook performs the following actions:\n The .spec.replicas (or spec.minReplicas respectively) field is mutated based on the high-availability-config.resources.gardener.cloud/type label of the resource and the high-availability-config.resources.gardener.cloud/failure-tolerance-type annotation of the namespace:\n Failure Tolerance Type ➡️\n/\n⬇️ Component Type️ ️ unset empty non-empty controller 2 1 2 server 2 2 2 The replica count values can be overwritten by the high-availability-config.resources.gardener.cloud/replicas annotation. It does NOT mutate the replicas when: the replicas are already set to 0 (hibernation case), or when the resource is scaled horizontally by HorizontalPodAutoscaler or Hvpa, and the current replica count is higher than what was computed above. 
When the high-availability-config.resources.gardener.cloud/zones annotation is NOT empty and either the high-availability-config.resources.gardener.cloud/failure-tolerance-type annotation is set or the high-availability-config.resources.gardener.cloud/zone-pinning annotation is set to true, then it adds a node affinity to the pod template spec:\nspec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: topology.kubernetes.io/zone operator: In values: - \u003czone1\u003e # - ... This ensures that all pods are pinned only to nodes in exactly those zones.\n Topology Spread Constraints are added to the pod template spec when .spec.replicas is greater than 1. When the high-availability-config.resources.gardener.cloud/zones annotation …\n … contains only one zone, then the following is added:\nspec: topologySpreadConstraints: - topologyKey: kubernetes.io/hostname minDomains: 3 # lower value of max replicas or 3 maxSkew: 1 whenUnsatisfiable: ScheduleAnyway labelSelector: ... This ensures that the (multiple) pods are scheduled across nodes. minDomains is set when failure tolerance is configured or the annotation high-availability-config.resources.gardener.cloud/host-spread=\"true\" is given.\n … contains at least two zones, then the following is added:\nspec: topologySpreadConstraints: - topologyKey: kubernetes.io/hostname maxSkew: 1 whenUnsatisfiable: ScheduleAnyway labelSelector: ... - topologyKey: topology.kubernetes.io/zone minDomains: 2 # lower value of max replicas or number of zones maxSkew: 1 whenUnsatisfiable: DoNotSchedule labelSelector: ... This enforces that the (multiple) pods are scheduled across zones. It circumvents a known limitation in Kubernetes for clusters \u003c 1.26 (ref kubernetes/kubernetes#109364). If the number of replicas is larger than twice the number of zones, then maxSkew=2 is set for the second spread constraint. 
The minDomains calculation is based on whichever value is lower: the (maximum) replicas or the number of zones. This is the minimum number of domains required to schedule pods in a highly available manner.\n Independent of the number of zones, when one of the following conditions is true, then the field whenUnsatisfiable is set to DoNotSchedule for the constraint with topologyKey=kubernetes.io/hostname (which enforces the node-spread):\n The high-availability-config.resources.gardener.cloud/host-spread annotation is set to true. The high-availability-config.resources.gardener.cloud/failure-tolerance-type annotation is set and NOT empty. Adds default tolerations for taint-based evictions:\nTolerations for taints node.kubernetes.io/not-ready and node.kubernetes.io/unreachable are added to the handled Deployment and StatefulSet if their podTemplates do not already specify them. The TolerationSeconds are taken from the respective configuration section of the webhook’s configuration (see example).\nWe consider fine-tuned values for those tolerations a matter of high-availability because they often help to reduce recovery times in case of node or zone outages (see also High-Availability Best Practices). In addition, this webhook handling helps to set defaults for many but not all workload components in a cluster. For instance, Gardener can use this webhook to set defaults for nearly every component in seed clusters but only for the system components in shoot clusters. Any customer workload remains unchanged.\n Kubernetes Service Host Injection By default, when Pods are created, Kubernetes implicitly injects the KUBERNETES_SERVICE_HOST environment variable into all containers. The value of this variable points to the default Kubernetes service (i.e., kubernetes.default.svc.cluster.local). 
This allows pods to conveniently talk to the API server of their cluster.\nIn shoot clusters, this network path involves the apiserver-proxy DaemonSet which eventually forwards the traffic to the API server. Hence, it results in an additional network hop.\nThe purpose of this webhook is to explicitly inject the KUBERNETES_SERVICE_HOST environment variable into all containers and set its value to the FQDN of the API server. This way, the additional network hop is avoided.\nAuto-Mounting Projected ServiceAccount Tokens When this webhook is activated, it automatically injects projected ServiceAccount token volumes into Pods and all their containers if all of the following preconditions are fulfilled:\n The Pod is NOT labeled with projected-token-mount.resources.gardener.cloud/skip=true. The Pod’s .spec.serviceAccountName field is NOT empty and NOT set to default. The ServiceAccount specified in the Pod’s .spec.serviceAccountName sets .automountServiceAccountToken=false. The Pod’s .spec.volumes[] does NOT already contain a volume with a name prefixed with kube-api-access-. The projected volume will look as follows:\nspec: volumes: - name: kube-api-access-gardener projected: defaultMode: 420 sources: - serviceAccountToken: expirationSeconds: 43200 path: token - configMap: items: - key: ca.crt path: ca.crt name: kube-root-ca.crt - downwardAPI: items: - fieldRef: apiVersion: v1 fieldPath: metadata.namespace path: namespace The expirationSeconds are defaulted to 12h and can be overwritten with the .webhooks.projectedTokenMount.expirationSeconds field in the component configuration, or with the projected-token-mount.resources.gardener.cloud/expiration-seconds annotation on a Pod resource.\n The volume will be mounted into all containers specified in the Pod at the path /var/run/secrets/kubernetes.io/serviceaccount. This is the default location where client libraries expect to find the tokens, and it mimics the upstream ServiceAccount admission plugin. 
See Managing Service Accounts for more information.\nOverall, this webhook is used to inject projected service account tokens into pods running in the Shoot and the Seed cluster. Hence, it is served from the Seed GRM and each Shoot GRM. Please find an overview below for pods deployed in the Shoot cluster:\nPod Topology Spread Constraints When this webhook is enabled, then it mimics the topologyKey feature for Topology Spread Constraints (TSC) on the label pod-template-hash. Concretely, when a pod is labelled with pod-template-hash, the handler of this webhook extends any topology spread constraint in the pod:\nmetadata: labels: pod-template-hash: 123abc spec: topologySpreadConstraints: - maxSkew: 1 topologyKey: topology.kubernetes.io/zone whenUnsatisfiable: DoNotSchedule labelSelector: matchLabels: pod-template-hash: 123abc # added by webhook The procedure circumvents a known limitation with TSCs which leads to imbalanced deployments after rolling updates. Gardener enables this webhook to schedule pods of deployments across nodes and zones.\nPlease note that the gardener-resource-manager itself as well as pods labelled with topology-spread-constraints.resources.gardener.cloud/skip are excluded from any mutations.\nSystem Components Webhook If enabled, this webhook handles scheduling concerns for system components Pods (except those managed by DaemonSets). The following tasks are performed by this webhook:\n Add pod.spec.nodeSelector as given in the webhook configuration. Add pod.spec.tolerations as given in the webhook configuration. Add pod.spec.tolerations for any existing nodes matching the node selector given in the webhook configuration. Known taints and tolerations used for taint based evictions are disregarded. Gardener enables this webhook for kube-system and kubernetes-dashboard namespaces in shoot clusters, selecting Pods being labelled with resources.gardener.cloud/managed-by: gardener. 
It adds a configuration so that Pods get the worker.gardener.cloud/system-components: true node selector (step 1) and tolerate any custom taint (step 2) that is added to system component worker nodes (shoot.spec.provider.workers[].systemComponents.allow: true). In addition, the webhook merges these tolerations with the ones required for the system component Nodes available in the cluster at that time (step 3). Both are required to ensure system component Pods can be scheduled or executed during an active shoot reconciliation that is happening due to any modifications to shoot.spec.provider.workers[].taints, e.g. Pods must be scheduled while there are still Nodes not having the updated taint configuration.\n You can opt out of this behaviour for Pods by labeling them with system-components-config.resources.gardener.cloud/skip=true.\n EndpointSlice Hints This webhook mutates EndpointSlices. For each endpoint in the EndpointSlice, it sets the endpoint’s hints to the endpoint’s zone.\napiVersion: discovery.k8s.io/v1 kind: EndpointSlice metadata: name: example-hints endpoints: - addresses: - \"10.1.2.3\" conditions: ready: true hostname: pod-1 zone: zone-a hints: forZones: - name: \"zone-a\" # added by webhook - addresses: - \"10.1.2.4\" conditions: ready: true hostname: pod-2 zone: zone-b hints: forZones: - name: \"zone-b\" # added by webhook The webhook aims to circumvent issues with the Kubernetes TopologyAwareHints feature, which does not currently allow deterministic topology-aware traffic routing. For more details, see the issue kubernetes/kubernetes#113731, which describes drawbacks of the TopologyAwareHints feature for our use case. 
If the above-mentioned issue gets resolved and there is native support for deterministic topology-aware traffic routing in Kubernetes, then this webhook can be dropped in favor of the native Kubernetes feature.\nValidating Webhooks Unconfirmed Deletion Prevention For Custom Resources And Definitions As part of Gardener’s extensibility concepts, a lot of CustomResourceDefinitions are deployed to the seed clusters that serve as extension points for provider-specific controllers. For example, the Infrastructure CRD triggers the provider extension to prepare the IaaS infrastructure of the underlying cloud provider for a to-be-created shoot cluster. Consequently, these extension CRDs have a lot of power and control large portions of the end-user’s shoot cluster. Accidental or undesired deletions of those resources can cause tremendous and hard-to-recover-from outages and should be prevented.\nWhen this webhook is activated, it reacts to CustomResourceDefinitions and most of the custom resources in the extensions.gardener.cloud/v1alpha1 API group. It also reacts to the druid.gardener.cloud/v1alpha1.Etcd resources.\nThe webhook prevents DELETE requests for those CustomResourceDefinitions labeled with gardener.cloud/deletion-protected=true, and for all mentioned custom resources if they were not previously annotated with confirmation.gardener.cloud/deletion=true. This prevents undesired kubectl delete \u003c...\u003e requests from being accepted.\nExtension Resource Validation When this webhook is activated, it reacts to most of the custom resources in the extensions.gardener.cloud/v1alpha1 API group. 
It also reacts for the druid.gardener.cloud/v1alpha1.Etcd resources.\nThe webhook validates the resources specifications for CREATE and UPDATE requests.\n","categories":"","description":"Set of controllers with different responsibilities running once per seed and once per shoot","excerpt":"Set of controllers with different responsibilities running once per …","ref":"/docs/gardener/concepts/resource-manager/","tags":"","title":"Gardener Resource Manager"},{"body":"Overview The Gardener Scheduler is in essence a controller that watches newly created shoots and assigns a seed cluster to them. Conceptually, the task of the Gardener Scheduler is very similar to the task of the Kubernetes Scheduler: finding a seed for a shoot instead of a node for a pod.\nEither the scheduling strategy or the shoot cluster purpose hereby determines how the scheduler is operating. The following sections explain the configuration and flow in greater detail.\nWhy Is the Gardener Scheduler Needed? 1. Decoupling Previously, an admission plugin in the Gardener API server conducted the scheduling decisions. This implies changes to the API server whenever adjustments of the scheduling are needed. Decoupling the API server and the scheduler comes with greater flexibility to develop these components independently.\n2. Extensibility It should be possible to easily extend and tweak the scheduler in the future. Possibly, similar to the Kubernetes scheduler, hooks could be provided which influence the scheduling decisions. 
It should be also possible to completely replace the standard Gardener Scheduler with a custom implementation.\nAlgorithm Overview The following sequence describes the steps involved to determine a seed candidate:\n Determine usable seeds with “usable” defined as follows: no .metadata.deletionTimestamp .spec.settings.scheduling.visible is true .status.lastOperation is not nil conditions GardenletReady, BackupBucketsReady (if available) are true Filter seeds: matching .spec.seedSelector in CloudProfile used by the Shoot matching .spec.seedSelector in Shoot having no network intersection with the Shoot’s networks (due to the VPN connectivity between seeds and shoots their networks must be disjoint) whose taints (.spec.taints) are tolerated by the Shoot (.spec.tolerations) whose capacity for shoots would not be exceeded if the shoot is scheduled onto the seed, see Ensuring seeds capacity for shoots is not exceeded which have at least three zones in .spec.provider.zones if shoot requests a high available control plane with failure tolerance type zone. Apply active strategy e.g., Minimal Distance strategy Choose least utilized seed, i.e., the one with the least number of shoot control planes, will be the winner and written to the .spec.seedName field of the Shoot. In order to put the scheduling decision into effect, the scheduler sends an update request for the Shoot resource to the API server. After validation, the gardener-apiserver updates the Shoot to have the spec.seedName field set. Subsequently, the gardenlet picks up and starts to create the cluster on the specified seed.\nConfiguration The Gardener Scheduler configuration has to be supplied on startup. It is a mandatory and also the only available flag. This yaml file holds an example scheduler configuration.\nMost of the configuration options are the same as in the Gardener Controller Manager (leader election, client connection, …). 
However, the Gardener Scheduler on the other hand does not need a TLS configuration, because there are currently no webhooks configurable.\nStrategies The scheduling strategy is defined in the candidateDeterminationStrategy of the scheduler’s configuration and can have the possible values SameRegion and MinimalDistance. The SameRegion strategy is the default strategy.\nSame Region strategy The Gardener Scheduler reads the spec.provider.type and .spec.region fields from the Shoot resource. It tries to find a seed that has the identical .spec.provider.type and .spec.provider.region fields set. If it cannot find a suitable seed, it adds an event to the shoot stating that it is unschedulable.\nMinimal Distance strategy The Gardener Scheduler tries to find a valid seed with minimal distance to the shoot’s intended region. Distances are configured via ConfigMap(s), usually per cloud provider in a Gardener landscape. The configuration is structured like this:\n It refers to one or multiple CloudProfiles via annotation scheduling.gardener.cloud/cloudprofiles. It contains the declaration as region-config via label scheduling.gardener.cloud/purpose. If a CloudProfile is referred by multiple ConfigMaps, only the first one is considered. The data fields configure actual distances, where key relates to the Shoot region and value contains distances to Seed regions. apiVersion: v1 kind: ConfigMap metadata: name: \u003cname\u003e namespace: garden annotations: scheduling.gardener.cloud/cloudprofiles: cloudprofile-name-1{,optional-cloudprofile-name-2,...} labels: scheduling.gardener.cloud/purpose: region-config data: region-1: |region-2: 10 region-3: 20 ... region-2: |region-1: 10 region-3: 10 ... Gardener provider extensions for public cloud providers usually have an example weight ConfigMap in their repositories. 
We suggest checking them out before defining your own data.\n If a valid seed candidate cannot be found after consulting the distance configuration, the scheduler will fall back to the Levenshtein distance to find the closest region. To do so, the region name is split into a base name and an orientation. Possible orientations are north, south, east, west and central. The distance is then twice the Levenshtein distance of the region’s base name plus a correction value based on the orientation and the provider.\nIf the orientations of shoot and seed candidate match, the correction value is 0, if they differ it is 2, and if either the seed’s or the shoot’s region does not have an orientation it is 1. If the provider differs, the correction value is additionally incremented by 2.\nBecause of this, a matching region with a matching provider is always preferred.\nSpecial handling based on shoot cluster purpose Every shoot cluster can have a purpose that describes what the cluster is used for, and also influences how the cluster is set up (see Shoot Cluster Purpose for more information).\nIn case the shoot has the testing purpose, then the scheduler only reads the .spec.provider.type from the Shoot resource and tries to find a Seed that has the identical .spec.provider.type. The region does not matter, i.e., testing shoots may also be scheduled on a seed in a completely different region if it is better for balancing the whole Gardener system.\nshoots/binding Subresource The shoots/binding subresource is used to bind a Shoot to a Seed. On creation of a shoot cluster, the scheduler updates the binding automatically if an appropriate seed cluster is available. Only an operator with the necessary RBAC can update this binding manually. This can be done by changing the .spec.seedName of the shoot. However, if a different seed is already assigned to the shoot, this will trigger a control-plane migration. 
For required steps, please see Triggering the Migration.\nspec.schedulerName Field in the Shoot Specification Similar to the spec.schedulerName field in Pods, the Shoot specification has an optional .spec.schedulerName field. If this field is set on creation, only the scheduler which relates to the configured name is responsible for scheduling the shoot. The default-scheduler name is reserved for the default scheduler of Gardener. Affected Shoots will remain in Pending state if the mentioned scheduler is not present in the landscape.\nspec.seedName Field in the Shoot Specification Similar to the .spec.nodeName field in Pods, the Shoot specification has an optional .spec.seedName field. If this field is set on creation, the shoot will be scheduled to this seed. However, this field can only be set by users having RBAC for the shoots/binding subresource. If this field is not set, the scheduler will assign a suitable seed automatically and populate this field with the seed name.\nseedSelector Field in the Shoot Specification Similar to the .spec.nodeSelector field in Pods, the Shoot specification has an optional .spec.seedSelector field. It allows the user to provide a label selector that must match the labels of the Seeds in order to be scheduled to one of them. The labels on the Seeds are usually controlled by Gardener administrators/operators - end users cannot add arbitrary labels themselves. If provided, the Gardener Scheduler will only consider as “suitable” those seeds whose labels match those provided in the .spec.seedSelector of the Shoot.\nBy default, only seeds with the same provider as the shoot are selected. By adding a providerTypes field to the seedSelector, a dedicated set of possible providers (* means all provider types) can be selected.\nEnsuring a Seed’s Capacity for Shoots Is Not Exceeded Seeds have a practical limit of how many shoots they can accommodate. Exceeding this limit is undesirable, as the system performance will be noticeably impacted. 
Therefore, the scheduler ensures that a seed’s capacity for shoots is not exceeded by taking into account a maximum number of shoots that can be scheduled onto a seed.\nThis mechanism works as follows:\n The gardenlet is configured with certain resources and their total capacity (and, for certain resources, the amount reserved for Gardener), see /example/20-componentconfig-gardenlet.yaml. Currently, the only such resource is the maximum number of shoots that can be scheduled onto a seed. The gardenlet seed controller updates the capacity and allocatable fields in the Seed status with the capacity of each resource and how much of it is actually available to be consumed by shoots. The allocatable value of a resource is equal to capacity minus reserved. When scheduling shoots, the scheduler filters out all candidate seeds whose allocatable capacity for shoots would be exceeded if the shoot is scheduled onto the seed. Failure to Determine a Suitable Seed In case the scheduler fails to find a suitable seed, the operation is being retried with exponential backoff. The reason for the failure will be reported in the Shoot’s .status.lastOperation field as well as a Kubernetes event (which can be retrieved via kubectl -n \u003cnamespace\u003e describe shoot \u003cshoot-name\u003e).\nCurrent Limitation / Future Plans Azure unfortunately has a geographically non-hierarchical naming pattern and does not start with the continent. This is the reason why we will exchange the implementation of the MinimalDistance strategy with a more suitable one in the future. 
","categories":"","description":"Understand the configuration and flow of the controller that assigns a seed cluster to newly created shoots","excerpt":"Understand the configuration and flow of the controller that assigns a …","ref":"/docs/gardener/concepts/scheduler/","tags":"","title":"Gardener Scheduler"},{"body":"As we ramp up more and more friends of Gardener, I thought it worthwhile to explore and write a tutorial about how to simply:\n create a Gardener managed Kubernetes Cluster (Shoot) via kubectl install Istio as a preferred, production ready Ingress/Service Mesh (instead of the Nginx Ingress addon) attach your own custom domain to be managed by Gardener combine everything with certificates from Let’s Encrypt Here are some pre-pointers that you will need to go deeper:\n CRUD Gardener Shoot DNS Management Certificate Management Tutorial Domain Names Tutorial Certificates Tip If you try my instructions and fail, then read the alternative title of this tutorial as “Shoot yourself in the foot with Gardener, custom Domains, Istio and Certificates”. First Things First Login to your Gardener landscape, setup a project with adequate infrastructure credentials and then navigate to your account. Note down the name of your secret. I chose the GCP infrastructure from the vast possible options that my Gardener provides me with, so i had named the secret as shoot-operator-gcp.\nFrom the Access widget (leave the default settings) download your personalized kubeconfig into ~/.kube/kubeconfig-garden-myproject. 
Follow the instructions to set up kubelogin:\nFor convenience, let us set an alias command with\nalias kgarden=\"kubectl --kubeconfig ~/.kube/kubeconfig-garden-myproject.yaml\" kgarden now gives you all botanical powers and connects you directly with your Gardener.\nYou should now be able to run kgarden get shoots, automatically get an OIDC token, and list already running clusters/shoots.\nPrepare your Custom Domain I am going to use Cloudflare as the programmatic DNS for my custom domain mydomain.io. Please follow the detailed instructions from Cloudflare on how to delegate your domain (the free account does not support delegating subdomains). Alternatively, AWS Route53 (and most others) supports delegating subdomains.\nI needed to follow these instructions and created the following secret:\napiVersion: v1 kind: Secret metadata: name: cloudflare-mydomain-io type: Opaque data: CLOUDFLARE_API_TOKEN: useYOURownDAMITzNDU2Nzg5MDEyMzQ1Njc4OQ== Apply this secret into your project with kgarden create -f cloudflare-mydomain-io.yaml.\nOur External DNS Manager also supports Amazon Route53, Google CloudDNS, AliCloud DNS, Azure DNS, or OpenStack Designate. Check it out.\nPrepare Gardener Extensions I now need to prepare the Gardener extensions shoot-dns-service and shoot-cert-service and set the parameters accordingly.\nPlease note that the availability of Gardener Extensions depends on how your administrator has configured the Gardener landscape. Please contact your Gardener administrator in case you experience any issues during activation. 
The following snippet allows Gardener to manage my entire custom domain, whereas with the include: attribute I restrict all dynamic entries to the subdomain gsicdc.mydomain.io:\n dns: providers: - domains: include: - gsicdc.mydomain.io primary: false secretName: cloudflare-mydomain-io type: cloudflare-dns extensions: - type: shoot-dns-service The next snippet allows Gardener to manage certificates automatically from Let’s Encrypt on mydomain.io for me:\n extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 issuers: - email: me@mail.com name: mydomain server: 'https://acme-v02.api.letsencrypt.org/directory' - email: me@mail.com name: mydomain-staging server: 'https://acme-staging-v02.api.letsencrypt.org/directory' Adjust the snippets with your parameters (don’t forget your email). And please use the mydomain-staging issuer while you are testing and learning. Otherwise, Let’s Encrypt will rate limit your frequent requests and you may have to wait a week before you can continue. References for Let’s Encrypt:\n Rate limit Staging environment Challenge Types Wildcard Certificates Create the Gardener Shoot Cluster Remember I chose to create the Shoot on GCP, so below is the simplest declarative shoot or cluster order document. 
Notice that I am referring to the infrastructure credentials with shoot-operator-gcp and I combined the above snippets into the yaml file:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: gsicdc spec: dns: providers: - domains: include: - gsicdc.mydomain.io primary: false secretName: cloudflare-mydomain-io type: cloudflare-dns extensions: - type: shoot-dns-service - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 issuers: - email: me@mail.com name: mydomain server: 'https://acme-v02.api.letsencrypt.org/directory' - email: me@mail.com name: mydomain-staging server: 'https://acme-staging-v02.api.letsencrypt.org/directory' cloudProfileName: gcp kubernetes: allowPrivilegedContainers: true version: 1.24.8 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true networking: nodes: 10.250.0.0/16 pods: 100.96.0.0/11 services: 100.64.0.0/13 type: calico provider: controlPlaneConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig zone: europe-west1-d infrastructureConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: workers: 10.250.0.0/16 type: gcp workers: - machine: image: name: gardenlinux version: 576.9.0 type: n1-standard-2 maxSurge: 1 maxUnavailable: 0 maximum: 2 minimum: 1 name: my-workerpool volume: size: 50Gi type: pd-standard zones: - europe-west1-d purpose: testing region: europe-west1 secretBindingName: shoot-operator-gcp Create your cluster and wait for it to be ready (about 5 to 7min).\n$ kgarden create -f gsicdc.yaml shoot.core.gardener.cloud/gsicdc created $ kgarden get shoot gsicdc --watch NAME CLOUDPROFILE VERSION SEED DOMAIN HIBERNATION OPERATION PROGRESS APISERVER CONTROL NODES SYSTEM AGE gsicdc gcp 1.24.8 gcp gsicdc.myproject.shoot.devgarden.cloud Awake Processing 38 Progressing Progressing Unknown Unknown 83s ... 
gsicdc gcp 1.24.8 gcp gsicdc.myproject.shoot.devgarden.cloud Awake Succeeded 100 True True True False 6m7s Get access to your freshly baked cluster and set your KUBECONFIG:\n$ kgarden get secrets gsicdc.kubeconfig -o jsonpath={.data.kubeconfig} | base64 -d \u003ekubeconfig-gsicdc.yaml $ export KUBECONFIG=$(pwd)/kubeconfig-gsicdc.yaml $ kubectl get all NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 100.64.0.1 \u003cnone\u003e 443/TCP 89m Install Istio Please follow the Istio installation instructions and download istioctl. If you are on a Mac, I recommend:\nbrew install istioctl I want to install Istio with a default profile and SDS enabled. Furthermore, I pass the following annotations to the service object istio-ingressgateway in the istio-system namespace.\n annotations: cert.gardener.cloud/issuer: mydomain-staging cert.gardener.cloud/secretname: wildcard-tls dns.gardener.cloud/class: garden dns.gardener.cloud/dnsnames: \"*.gsicdc.mydomain.io\" dns.gardener.cloud/ttl: \"120\" With these annotations three things now happen automatically:\n The External DNS Manager, provided to you as a service (dns.gardener.cloud/class: garden), picks up the request and creates the wildcard DNS entry *.gsicdc.mydomain.io with a time to live of 120 seconds at your DNS provider. My provider Cloudflare is very quick (as opposed to some other services). You should be able to verify the entry with dig lovemygardener.gsicdc.mydomain.io within seconds. The Certificate Management picks up the request as well and initiates a DNS01 protocol exchange with Let’s Encrypt, using the staging environment referred to with the issuer behind mydomain-staging. After approximately 70 seconds (give or take) you will receive the wildcard certificate in the wildcard-tls secret in the namespace istio-system. Note that the namespace for the certificate secret is often the cause of many troubleshooting sessions: the secret must reside in the same namespace as the gateway. 
Here is the istio-install script:\n$ export domainname=\"*.gsicdc.mydomain.io\" $ export issuer=\"mydomain-staging\" $ cat \u003c\u003cEOF | istioctl install -y -f - apiVersion: install.istio.io/v1alpha1 kind: IstioOperator spec: profile: default components: ingressGateways: - name: istio-ingressgateway enabled: true k8s: serviceAnnotations: cert.gardener.cloud/issuer: \"${issuer}\" cert.gardener.cloud/secretname: wildcard-tls dns.gardener.cloud/class: garden dns.gardener.cloud/dnsnames: \"${domainname}\" dns.gardener.cloud/ttl: \"120\" EOF Verify that setup is working and that DNS and certificates have been created/delivered:\n$ kubectl -n istio-system describe service istio-ingressgateway \u003csnip\u003e Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal EnsuringLoadBalancer 58s service-controller Ensuring load balancer Normal reconcile 58s cert-controller-manager created certificate object istio-system/istio-ingressgateway-service-pwqdm Normal cert-annotation 58s cert-controller-manager wildcard-tls: cert request is pending Normal cert-annotation 54s cert-controller-manager wildcard-tls: certificate pending: certificate requested, preparing/waiting for successful DNS01 challenge Normal cert-annotation 28s cert-controller-manager wildcard-tls: certificate ready Normal EnsuredLoadBalancer 26s service-controller Ensured load balancer Normal reconcile 26s dns-controller-manager created dns entry object shoot--core--gsicdc/istio-ingressgateway-service-p9qqb Normal dns-annotation 26s dns-controller-manager *.gsicdc.mydomain.io: dns entry is pending Normal dns-annotation 21s (x3 over 21s) dns-controller-manager *.gsicdc.mydomain.io: dns entry active $ dig lovemygardener.gsicdc.mydomain.io ; \u003c\u003c\u003e\u003e DiG 9.10.6 \u003c\u003c\u003e\u003e lovemygardener.gsicdc.mydomain.io \u003csnip\u003e ;; ANSWER SECTION: lovemygardener.gsicdc.mydomain.io. 
120 IN A\t35.195.120.62 \u003csnip\u003e There you have it, the wildcard-tls certificate is ready and the *.gsicdc.mydomain.io DNS entry is active. Traffic will be going your way.\nHandy Tools to Install Another set of fine tools to use are kapp (formerly known as k14s), k9s and HTTPie. While we are at it, let’s install them all. If you are on a Mac, I recommend:\nbrew tap vmware-tanzu/carvel brew install ytt kbld kapp kwt imgpkg vendir brew install derailed/k9s/k9s brew install httpie Ingress at Your Service Networking is a central part of Kubernetes, but it can be challenging to understand exactly how it is expected to work. You should learn about Kubernetes networking, and first try to debug problems yourself. With a solid managed cluster from Gardener, it is always PEBCAK! Kubernetes Ingress is a subject that is evolving into a much broader standard. Please watch Evolving the Kubernetes Ingress APIs to GA and Beyond for a good introduction. In this example, I did not want to use the Kubernetes Ingress compatibility option of Istio. 
Instead, I used VirtualService and Gateway from the Istio’s API group networking.istio.io/v1 directly, and enabled istio-injection generically for the namespace.\nI use httpbin as service that I want to expose to the internet, or where my ingress should be routed to (depends on your point of view, I guess).\napiVersion: v1 kind: Namespace metadata: name: production labels: istio-injection: enabled --- apiVersion: v1 kind: Service metadata: name: httpbin namespace: production labels: app: httpbin spec: ports: - name: http port: 8000 targetPort: 80 selector: app: httpbin --- apiVersion: apps/v1 kind: Deployment metadata: name: httpbin namespace: production spec: replicas: 1 selector: matchLabels: app: httpbin template: metadata: labels: app: httpbin spec: containers: - image: docker.io/kennethreitz/httpbin imagePullPolicy: IfNotPresent name: httpbin ports: - containerPort: 80 --- apiVersion: networking.istio.io/v1 kind: Gateway metadata: name: httpbin-gw namespace: production spec: selector: istio: ingressgateway #! use istio default ingress gateway servers: - port: number: 80 name: http protocol: HTTP tls: httpsRedirect: true hosts: - \"httpbin.gsicdc.mydomain.io\" - port: number: 443 name: https protocol: HTTPS tls: mode: SIMPLE credentialName: wildcard-tls hosts: - \"httpbin.gsicdc.mydomain.io\" --- apiVersion: networking.istio.io/v1 kind: VirtualService metadata: name: httpbin-vs namespace: production spec: hosts: - \"httpbin.gsicdc.mydomain.io\" gateways: - httpbin-gw http: - match: - uri: regex: /.* route: - destination: port: number: 8000 host: httpbin --- Let us now deploy the whole package of Kubernetes primitives using kapp:\n$ kapp deploy -a httpbin -f httpbin-kapp.yaml Target cluster 'https://api.gsicdc.myproject.shoot.devgarden.cloud' (nodes: shoot--myproject--gsicdc-my-workerpool-z1-6586c8f6cb-x24kh) Changes Namespace Name Kind Conds. 
Age Op Wait to Rs Ri (cluster) production Namespace - - create reconcile - - production httpbin Deployment - - create reconcile - - ^ httpbin Service - - create reconcile - - ^ httpbin-gw Gateway - - create reconcile - - ^ httpbin-vs VirtualService - - create reconcile - - Op: 5 create, 0 delete, 0 update, 0 noop Wait to: 5 reconcile, 0 delete, 0 noop Continue? [yN]: y 5:36:31PM: ---- applying 1 changes [0/5 done] ---- \u003csnip\u003e 5:37:00PM: ok: reconcile deployment/httpbin (apps/v1) namespace: production 5:37:00PM: ---- applying complete [5/5 done] ---- 5:37:00PM: ---- waiting complete [5/5 done] ---- Succeeded Let’s finally test the service (Of course you can use the browser as well):\n$ http httpbin.gsicdc.mydomain.io HTTP/1.1 301 Moved Permanently content-length: 0 date: Wed, 13 May 2020 21:29:13 GMT location: https://httpbin.gsicdc.mydomain.io/ server: istio-envoy $ curl -k https://httpbin.gsicdc.mydomain.io/ip { \"origin\": \"10.250.0.2\" } Quod erat demonstrandum. The proof of exchanging the issuer is now left to the reader.\nTip Remember that the certificate is actually not valid because it is issued from the Let’s encrypt staging environment. Thus, we needed “curl -k” or “http –verify no”. Hint: use the interactive k9s tool. Cleanup Remove the cloud native application:\n$ kapp ls Apps in namespace 'default' Name Namespaces Lcs Lca httpbin (cluster),production true 17m $ kapp delete -a httpbin ... Continue? [yN]: y ... 11:47:47PM: ---- waiting complete [8/8 done] ---- Succeeded Remove Istio:\n$ istioctl x uninstall --purge clusterrole.rbac.authorization.k8s.io \"prometheus-istio-system\" deleted clusterrolebinding.rbac.authorization.k8s.io \"prometheus-istio-system\" deleted ... 
Delete your Shoot:\nkgarden annotate shoot gsicdc confirmation.gardener.cloud/deletion=true --overwrite kgarden delete shoot gsicdc --wait=false ","categories":"","description":"","excerpt":"As we ramp up more and more friends of Gardener, I thought it …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/tutorials/tutorial-custom-domain-with-istio/","tags":"","title":"Gardener yourself a Shoot with Istio, custom Domains, and Certificates"},{"body":"Overview Gardener is implemented using the operator pattern: It uses custom controllers that act on our own custom resources, and apply Kubernetes principles to manage clusters instead of containers. Following this analogy, you can recognize components of the Gardener architecture as well-known Kubernetes components, for example, shoot clusters can be compared with pods, and seed clusters can be seen as worker nodes.\nThe following Gardener components play a similar role as the corresponding components in the Kubernetes architecture:\n Gardener Component Kubernetes Component gardener-apiserver kube-apiserver gardener-controller-manager kube-controller-manager gardener-scheduler kube-scheduler gardenlet kubelet Similar to how the kube-scheduler of Kubernetes finds an appropriate node for newly created pods, the gardener-scheduler of Gardener finds an appropriate seed cluster to host the control plane for newly ordered clusters. By providing multiple seed clusters for a region or provider, and distributing the workload, Gardener also reduces the blast radius of potential issues.\nKubernetes runs a primary “agent” on every node, the kubelet, which is responsible for managing pods and containers on its particular node. Decentralizing the responsibility to the kubelet has the advantage that the overall system is scalable. 
Gardener achieves the same for cluster management by using a gardenlet as a primary “agent” on every seed cluster, which is only responsible for shoot clusters located in its particular seed cluster.\nThe gardener-controller-manager has controllers to manage resources of the Gardener API. However, instead of letting the gardener-controller-manager talk directly to seed clusters or shoot clusters, the responsibility isn’t only delegated to the gardenlet, but also managed using a reversed control flow: It’s up to the gardenlet to contact the Gardener API server, for example, to share a status for its managed seed clusters.\nReversing the control flow allows placing seed clusters or shoot clusters behind firewalls without requiring direct access via VPN tunnels.\nTLS Bootstrapping Kubernetes doesn’t manage worker nodes itself, and it’s also not responsible for the lifecycle of the kubelet running on the workers. Similarly, Gardener doesn’t manage seed clusters itself, so it is also not responsible for the lifecycle of the gardenlet running on the seeds. 
As a consequence, both the gardenlet and the kubelet need to prepare a trusted connection to the Gardener API server and the Kubernetes API server correspondingly.\nTo prepare a trusted connection between the gardenlet and the Gardener API server, the gardenlet initializes a bootstrapping process after you deployed it into your seed clusters:\n The gardenlet starts up with a bootstrap kubeconfig having a bootstrap token that allows to create CertificateSigningRequest (CSR) resources.\n After the CSR is signed, the gardenlet downloads the created client certificate, creates a new kubeconfig with it, and stores it inside a Secret in the seed cluster.\n The gardenlet deletes the bootstrap kubeconfig secret, and starts up with its new kubeconfig.\n The gardenlet starts normal operation.\n The gardener-controller-manager runs a control loop that automatically signs CSRs created by gardenlets.\n The gardenlet bootstrapping process is based on the kubelet bootstrapping process. More information: Kubelet’s TLS bootstrapping.\n If you don’t want to run this bootstrap process, you can create a kubeconfig pointing to the garden cluster for the gardenlet yourself, and use the field gardenClientConnection.kubeconfig in the gardenlet configuration to share it with the gardenlet.\ngardenlet Certificate Rotation The certificate used to authenticate the gardenlet against the API server has a certain validity based on the configuration of the garden cluster (--cluster-signing-duration flag of the kube-controller-manager (default 1y)).\n You can also configure the validity for the client certificate by specifying .gardenClientConnection.kubeconfigValidity.validity in the gardenlet’s component configuration. Note that changing this value will only take effect when the kubeconfig is rotated again (it is not picked up immediately). 
The minimum validity is 10m (that’s what is enforced by the CertificateSigningRequest API in Kubernetes which is used by the gardenlet).\n By default, after about 70-90% of the validity has expired, the gardenlet tries to automatically replace the current certificate with a new one (certificate rotation).\n You can change these boundaries by specifying .gardenClientConnection.kubeconfigValidity.autoRotationJitterPercentage{Min,Max} in the gardenlet’s component configuration.\n To use a certificate rotation, you need to specify the secret to store the kubeconfig with the rotated certificate in the field .gardenClientConnection.kubeconfigSecret of the gardenlet component configuration.\nRotate Certificates Using Bootstrap kubeconfig If the gardenlet created the certificate during the initial TLS Bootstrapping using the Bootstrap kubeconfig, certificates can be rotated automatically. The same control loop in the gardener-controller-manager that signs the CSRs during the initial TLS Bootstrapping also automatically signs the CSR during a certificate rotation.\nℹ️ You can trigger an immediate renewal by annotating the Secret in the seed cluster stated in the .gardenClientConnection.kubeconfigSecret field with gardener.cloud/operation=renew. Within 10s, gardenlet detects this and terminates itself to request new credentials. After it has booted up again, gardenlet will issue a new certificate independent of the remaining validity of the existing one.\nℹ️ Alternatively, annotate the respective Seed with gardener.cloud/operation=renew-kubeconfig. 
This will make gardenlet annotate its own kubeconfig secret with gardener.cloud/operation=renew and triggers the process described in the previous paragraph.\nRotate Certificates Using Custom kubeconfig When trying to rotate a custom certificate that wasn’t created by gardenlet as part of the TLS Bootstrap, the x509 certificate’s Subject field needs to conform to the following:\n the Common Name (CN) is prefixed with gardener.cloud:system:seed: the Organization (O) equals gardener.cloud:system:seeds Otherwise, the gardener-controller-manager doesn’t automatically sign the CSR. In this case, an external component or user needs to approve the CSR manually, for example, using the command kubectl certificate approve seed-csr-\u003c...\u003e). If that doesn’t happen within 15 minutes, the gardenlet repeats the process and creates another CSR.\nConfiguring the Seed to Work with gardenlet The gardenlet works with a single seed, which must be configured in the GardenletConfiguration under .seedConfig. This must be a copy of the Seed resource, for example:\napiVersion: gardenlet.config.gardener.cloud/v1alpha1 kind: GardenletConfiguration seedConfig: metadata: name: my-seed spec: provider: type: aws # ... settings: scheduling: visible: true (see this yaml file for a more complete example)\nOn startup, gardenlet registers a Seed resource using the given template in the seedConfig if it’s not present already.\nComponent Configuration In the component configuration for the gardenlet, it’s possible to define:\n settings for the Kubernetes clients interacting with the various clusters settings for the controllers inside the gardenlet settings for leader election and log levels, feature gates, and seed selection or seed configuration. More information: Example gardenlet Component Configuration.\nHeartbeats Similar to how Kubernetes uses Lease objects for node heart beats (see KEP), the gardenlet is using Lease objects for heart beats of the seed cluster. 
Every two seconds, the gardenlet checks that the seed cluster’s /healthz endpoint returns HTTP status code 200. If that is the case, the gardenlet renews the lease in the Garden cluster in the gardener-system-seed-lease namespace and updates the GardenletReady condition in the status.conditions field of the Seed resource. For more information, see this section.\nSimilar to the node-lifecycle-controller inside the kube-controller-manager, the gardener-controller-manager features a seed-lifecycle-controller that sets the GardenletReady condition to Unknown in case the gardenlet fails to renew the lease. As a consequence, the gardener-scheduler doesn’t consider this seed cluster for newly created shoot clusters anymore.\n/healthz Endpoint The gardenlet includes an HTTP server that serves a /healthz endpoint. It’s used as a liveness probe in the Deployment of the gardenlet. If the gardenlet fails to renew its lease, then the endpoint returns 500 Internal Server Error, otherwise it returns 200 OK.\nPlease note that the /healthz endpoint only indicates whether the gardenlet could successfully probe the Seed’s API server and renew the lease with the Garden cluster. It does not show that the Gardener extension API server (with the Gardener resource groups) is available. However, the gardenlet is designed to withstand such connection outages and retries until the connection is reestablished.\nControllers The gardenlet consists of several controllers, which are described in more detail in the following sections.\nBackupBucket Controller The BackupBucket controller reconciles those core.gardener.cloud/v1beta1.BackupBucket resources whose .spec.seedName value is equal to the name of the Seed the respective gardenlet is responsible for. A core.gardener.cloud/v1beta1.BackupBucket resource is created by the Seed controller if .spec.backup is defined in the Seed.\nThe controller adds finalizers to the BackupBucket and the secret mentioned in the .spec.secretRef of the BackupBucket. 
The controller also copies this secret to the seed cluster. Additionally, it creates an extensions.gardener.cloud/v1alpha1.BackupBucket resource (non-namespaced) in the seed cluster and waits until the responsible extension controller reconciles it (see Contract: BackupBucket Resource for more details). The status from the reconciliation is reported in the .status.lastOperation field. Once the extension resource is ready and the .status.generatedSecretRef is set by the extension controller, the gardenlet copies the referenced secret to the garden namespace in the garden cluster. An owner reference to the core.gardener.cloud/v1beta1.BackupBucket is added to this secret.\nIf the core.gardener.cloud/v1beta1.BackupBucket is deleted, the controller deletes the generated secret in the garden cluster and the extensions.gardener.cloud/v1alpha1.BackupBucket resource in the seed cluster and it waits for the respective extension controller to remove its finalizers from the extensions.gardener.cloud/v1alpha1.BackupBucket. Then it deletes the secret in the seed cluster and finally removes the finalizers from the core.gardener.cloud/v1beta1.BackupBucket and the referred secret.\nBackupEntry Controller The BackupEntry controller reconciles those core.gardener.cloud/v1beta1.BackupEntry resources whose .spec.seedName value is equal to the name of a Seed the respective gardenlet is responsible for. Those resources are created by the Shoot controller (only if backup is enabled for the respective Seed) and there is exactly one BackupEntry per Shoot.\nThe controller creates an extensions.gardener.cloud/v1alpha1.BackupEntry resource (non-namespaced) in the seed cluster and waits until the responsible extension controller reconciled it (see Contract: BackupEntry Resource for more details). The status is populated in the .status.lastOperation field.\nThe core.gardener.cloud/v1beta1.BackupEntry resource has an owner reference pointing to the corresponding Shoot. 
Hence, if the Shoot is deleted, the BackupEntry resource also gets deleted. In this case, the controller deletes the extensions.gardener.cloud/v1alpha1.BackupEntry resource in the seed cluster and waits until the responsible extension controller has deleted it. Afterwards, the finalizer of the core.gardener.cloud/v1beta1.BackupEntry resource is released so that it finally disappears from the system.\nIf the spec.seedName and .status.seedName of the core.gardener.cloud/v1beta1.BackupEntry are different, the controller will migrate it by annotating the extensions.gardener.cloud/v1alpha1.BackupEntry in the Source Seed with gardener.cloud/operation: migrate, waiting for it to be migrated successfully and eventually deleting it from the Source Seed cluster. Afterwards, the controller will recreate the extensions.gardener.cloud/v1alpha1.BackupEntry in the Destination Seed, annotate it with gardener.cloud/operation: restore and wait for the restore operation to finish. For more details about control plane migration, please read Shoot Control Plane Migration.\nKeep Backup for Deleted Shoots In some scenarios it might be beneficial to not immediately delete the BackupEntrys (and with them, the etcd backup) for deleted Shoots.\nIn this case you can configure the .controllers.backupEntry.deletionGracePeriodHours field in the component configuration of the gardenlet. For example, if you set it to 48, then the BackupEntrys for deleted Shoots will only be deleted 48 hours after the Shoot was deleted.\nAdditionally, you can limit the shoot purposes for which this applies by setting .controllers.backupEntry.deletionGracePeriodShootPurposes[]. For example, if you set it to [production] then only the BackupEntrys for Shoots with .spec.purpose=production will be deleted after the configured grace period. 
All others will be deleted immediately after the Shoot deletion.\nIn case a BackupEntry is scheduled for future deletion but you want to delete it immediately, add the annotation backupentry.core.gardener.cloud/force-deletion=true.\nBastion Controller The Bastion controller reconciles those operations.gardener.cloud/v1alpha1.Bastion resources whose .spec.seedName value is equal to the name of a Seed the respective gardenlet is responsible for.\nThe controller creates an extensions.gardener.cloud/v1alpha1.Bastion resource in the seed cluster in the shoot namespace with the same name as the operations.gardener.cloud/v1alpha1.Bastion. Then it waits until the responsible extension controller has reconciled it (see Contract: Bastion Resource for more details). The status is populated in the .status.conditions and .status.ingress fields.\nDuring the deletion of operations.gardener.cloud/v1alpha1.Bastion resources, the controller first sets the Ready condition to False and then deletes the extensions.gardener.cloud/v1alpha1.Bastion resource in the seed cluster. Once this resource is gone, the finalizer of the operations.gardener.cloud/v1alpha1.Bastion resource is released, so it finally disappears from the system.\nControllerInstallation Controller The ControllerInstallation controller in the gardenlet reconciles ControllerInstallation objects with the help of the following reconcilers.\n“Main” Reconciler This reconciler is responsible for ControllerInstallations referencing a ControllerDeployment whose type=helm.\nFor each ControllerInstallation, it creates a namespace in the seed cluster named extension-\u003ccontroller-installation-name\u003e. Then, it creates a generic garden kubeconfig and garden access secret for the extension for accessing the garden cluster.\nAfter that, it unpacks the Helm chart tarball in the ControllerDeployment’s .providerConfig.chart field and deploys the rendered resources to the seed cluster. 
The Helm chart values in .providerConfig.values will be used and extended with some information about the Gardener environment and the seed cluster:\ngardener: version: \u003cgardenlet-version\u003e garden: clusterIdentity: \u003cidentity-of-garden-cluster\u003e genericKubeconfigSecretName: \u003csecret-name\u003e gardenlet: featureGates: Foo: true Bar: false # ... seed: name: \u003cseed-name\u003e clusterIdentity: \u003cidentity-of-seed-cluster\u003e annotations: \u003cseed-annotations\u003e labels: \u003cseed-labels\u003e spec: \u003cseed-specification\u003e As of today, there are a few more fields in .gardener.seed, but it is recommended to use the .gardener.seed.spec if the Helm chart needs more information about the seed configuration.\nThe rendered chart will be deployed via a ManagedResource created in the garden namespace of the seed cluster. It is labeled with controllerinstallation-name=\u003cname\u003e so that one can easily find the owning ControllerInstallation for an existing ManagedResource.\nThe reconciler maintains the Installed condition of the ControllerInstallation and sets it to False if the rendering or deployment fails.\n“Care” Reconciler This reconciler reconciles ControllerInstallation objects and checks whether they are in a healthy state. It checks the .status.conditions of the backing ManagedResource created in the garden namespace of the seed cluster.\n If the ResourcesApplied condition of the ManagedResource is True, then the Installed condition of the ControllerInstallation will be set to True. If the ResourcesHealthy condition of the ManagedResource is True, then the Healthy condition of the ControllerInstallation will be set to True. If the ResourcesProgressing condition of the ManagedResource is True, then the Progressing condition of the ControllerInstallation will be set to True. 
A ControllerInstallation is considered “healthy” if Installed=Healthy=True and Progressing=False.\n“Required” Reconciler This reconciler watches all resources in the extensions.gardener.cloud API group in the seed cluster. It is responsible for maintaining the Required condition on ControllerInstallations. Concretely, when there is at least one extension resource in the seed cluster that a ControllerInstallation is responsible for, then the status of the Required condition will be True. If there are no extension resources anymore, its status will be False.\nThis condition is taken into account by the ControllerRegistration controller, which is part of gardener-controller-manager, when it computes which extensions have to be deployed to which seed cluster. See Gardener Controller Manager for more details.\nGardenlet Controller The Gardenlet controller reconciles a Gardenlet resource with the same name as the Seed the gardenlet is responsible for. This is used to implement self-upgrades of gardenlet based on information pulled from the garden cluster. For a general overview, see this document.\nOn Gardenlet reconciliation, the controller deploys the gardenlet within its own cluster after downloading the Helm chart specified in .spec.deployment.helm.ociRepository and rendering it with the provided values/configuration.\nOn Gardenlet deletion, nothing happens: The gardenlet does not terminate itself; deleting a Gardenlet object effectively means that self-upgrades are stopped.\nManagedSeed Controller The ManagedSeed controller in the gardenlet reconciles ManagedSeeds that refer to Shoots scheduled on the Seed the gardenlet is responsible for. Additionally, the controller monitors Seeds, which are owned by ManagedSeeds for which the gardenlet is responsible.\nOn ManagedSeed reconciliation, the controller first waits for the referenced Shoot to undergo a reconciliation process. 
Once the Shoot is successfully reconciled, the controller sets the ShootReconciled status of the ManagedSeed to true. Then, it creates the garden namespace within the target shoot cluster. The controller also manages secrets related to Seeds, such as the backup and kubeconfig secrets. It ensures that these secrets are created and updated according to the ManagedSeed spec. Finally, it deploys the gardenlet within the specified shoot cluster, which then registers the Seed cluster.\nOn ManagedSeed deletion, the controller first deletes the corresponding Seed that was originally created by the controller. Subsequently, it deletes the gardenlet instance within the shoot cluster. The controller also ensures the deletion of related Seed secrets. Finally, the dedicated garden namespace within the shoot cluster is deleted.\nNetworkPolicy Controller The NetworkPolicy controller reconciles NetworkPolicys in all relevant namespaces in the seed cluster and provides so-called “general” policies for access to the runtime cluster’s API server, DNS, public networks, etc.\nThe controller resolves the IP address of the Kubernetes service in the default namespace and creates an egress NetworkPolicy for it.\nFor more details about NetworkPolicys in Gardener, please see NetworkPolicys In Garden, Seed, Shoot Clusters.\nSeed Controller The Seed controller in the gardenlet reconciles Seed objects with the help of the following reconcilers.\n“Main” Reconciler This reconciler is responsible for managing the seed’s system components. Those comprise CA certificates, the various CustomResourceDefinitions, the logging and monitoring stacks, and a few central components like gardener-resource-manager, etcd-druid, istio, etc.\nThe reconciler also deploys a BackupBucket resource in the garden cluster in case the Seed's .spec.backup is set. 
It also checks whether the seed cluster’s Kubernetes version is at least the minimum supported version and errors in case this constraint is not met.\nThis reconciler maintains the .status.lastOperation field, i.e. it sets it:\n to state=Progressing before it executes its reconciliation flow. to state=Error in case an error occurs. to state=Succeeded in case the reconciliation succeeded. “Care” Reconciler This reconciler checks whether the seed system components (deployed by the “main” reconciler) are healthy. It checks the .status.conditions of the backing ManagedResource created in the garden namespace of the seed cluster. A ManagedResource is considered “healthy” if the conditions ResourcesApplied=ResourcesHealthy=True and ResourcesProgressing=False.\nIf all ManagedResources are healthy, then the SeedSystemComponentsHealthy condition of the Seed will be set to True. Otherwise, it will be set to False.\nIf at least one ManagedResource is unhealthy and there is threshold configuration for the conditions (in .controllers.seedCare.conditionThresholds), then the status of the SeedSystemComponentsHealthy condition will be set:\n to Progressing if it was True before. to Progressing if it was Progressing before and the lastUpdateTime of the condition does not exceed the configured threshold duration yet. to False if it was Progressing before and the lastUpdateTime of the condition exceeds the configured threshold duration. The condition thresholds can be used to prevent reporting issues too early just because there is a rollout or a short disruption. Only if the unhealthiness persists for at least the configured threshold duration, then the issues will be reported (by setting the status to False).\nIn order to compute the condition statuses, this reconciler considers ManagedResources (in the garden and istio-system namespace) and their status, see this document for more information. 
The following table explains which ManagedResources are considered for which condition type:\n Condition Type ManagedResources are considered when SeedSystemComponentsHealthy .spec.class is set “Lease” Reconciler This reconciler checks whether the connection to the seed cluster’s /healthz endpoint works. If this succeeds, then it renews a Lease resource in the garden cluster’s gardener-system-seed-lease namespace. This indicates a heartbeat to the external world, and internally the gardenlet sets its health status to true. In addition, the GardenletReady condition in the status of the Seed is set to True. The whole process is similar to what the kubelet does to report heartbeats for its Node resource and its KubeletReady condition. For more information, see this section.\nIf the connection to the /healthz endpoint or the update of the Lease fails, then the internal health status of gardenlet is set to false. Also, this internal health status is set to false automatically after some time, in case the controller gets stuck for whatever reason. This internal health status is available via the gardenlet’s /healthz endpoint and is used for the livenessProbe in the gardenlet pod.\nShoot Controller The Shoot controller in the gardenlet reconciles Shoot objects with the help of the following reconcilers.\n“Main” Reconciler This reconciler is responsible for managing all shoot cluster components and implements the core logic for creating, updating, hibernating, deleting, and migrating shoot clusters. It is also responsible for syncing the Cluster resource to the seed cluster before and after each successful shoot reconciliation.\nThe main reconciliation logic is performed in three different task flows dedicated to specific operation types:\n reconcile (operations: create, reconcile, restore): this is the main flow responsible for creation and regular reconciliation of shoots. Hibernating a shoot also triggers this flow. 
It is also used for restoration of the shoot control plane on the new seed (second half of a Control Plane Migration) migrate: this flow is triggered when spec.seedName specifies a different seed than status.seedName. It performs the first half of the Control Plane Migration, i.e., a backup (migrate operation) of all control plane components followed by a “shallow delete”. delete: this flow is triggered when the shoot’s deletionTimestamp is set, i.e., when it is deleted. The gardenlet takes special care to prevent unnecessary shoot reconciliations. This is important for several reasons, e.g., to not overload the seed API servers and to not exhaust infrastructure rate limits too fast. The gardenlet performs shoot reconciliations according to the following rules:\n If status.observedGeneration is less than metadata.generation: this is the case, e.g., when the spec was changed, a manual reconciliation operation was triggered, or the shoot was deleted. If the last operation was not successful. If the shoot is in a failed state, the gardenlet does not perform any reconciliation on the shoot (unless the retry operation was triggered). However, it syncs the Cluster resource to the seed in order to inform the extension controllers about the failed state. Regular reconciliations are performed with every GardenletConfiguration.controllers.shoot.syncPeriod (defaults to 1h). Shoot reconciliations are not performed if the assigned seed cluster is not healthy or has not been reconciled by the current gardenlet version yet (determined by the Seed.status.gardener section). This is done to make sure that shoots are reconciled with fully rolled out seed system components after a Gardener upgrade. Otherwise, the gardenlet might perform operations of the new version that doesn’t match the old version of the deployed seed system components, which might lead to unspecified behavior. 
There are a few special cases that overwrite or confine how often and under which circumstances periodic shoot reconciliations are performed:\n In case the gardenlet config allows it (controllers.shoot.respectSyncPeriodOverwrite, disabled by default), the sync period for a shoot can be increased individually by setting the shoot.gardener.cloud/sync-period annotation. This is always allowed for shoots in the garden namespace. Shoots are not reconciled with a higher frequency than specified in GardenletConfiguration.controllers.shoot.syncPeriod. In case the gardenlet config allows it (controllers.shoot.respectSyncPeriodOverwrite, disabled by default), shoots can be marked as “ignored” by setting the shoot.gardener.cloud/ignore annotation. In this case, the gardenlet does not perform any reconciliation for the shoot. In case GardenletConfiguration.controllers.shoot.reconcileInMaintenanceOnly is enabled (disabled by default), the gardenlet performs regular shoot reconciliations only once in the respective maintenance time window (GardenletConfiguration.controllers.shoot.syncPeriod is ignored). The gardenlet randomly distributes shoot reconciliations over the maintenance time window to avoid high bursts of reconciliations (see Shoot Maintenance). In case Shoot.spec.maintenance.confineSpecUpdateRollout is enabled (disabled by default), changes to the shoot specification are not rolled out immediately but only during the respective maintenance time window (see Shoot Maintenance). “Care” Reconciler This reconciler performs three “care” actions related to Shoots.\nConditions It maintains the following conditions:\n APIServerAvailable: The /healthz endpoint of the shoot’s kube-apiserver is called and considered healthy when it responds with 200 OK. ControlPlaneHealthy: The control plane is considered healthy when the respective Deployments (for example kube-apiserver,kube-controller-manager), and Etcds (for example etcd-main) exist and are healthy. 
ObservabilityComponentsHealthy: This condition is considered healthy when the respective Deployments (for example plutono) and StatefulSets (for example prometheus, vali) exist and are healthy. EveryNodeReady: The conditions of the worker nodes are checked (e.g., Ready, MemoryPressure). Also, it’s checked whether the Kubernetes version of the installed kubelet matches the desired version specified in the Shoot resource. SystemComponentsHealthy: The conditions of the ManagedResources are checked (e.g., ResourcesApplied). Also, it is verified whether the VPN tunnel connection is established (which is required for the kube-apiserver to communicate with the worker nodes). Sometimes, ManagedResources can have both Healthy and Progressing conditions set to True (e.g., when a DaemonSet rolls out one-by-one on a large cluster with many nodes) while this is not reflected in the Shoot status. In order to catch issues where the rollout gets stuck, one can set .controllers.shootCare.managedResourceProgressingThreshold in the gardenlet’s component configuration. If the Progressing condition remains True for longer than the configured duration, the SystemComponentsHealthy condition in the Shoot is eventually set to False.\nEach condition can optionally also have error codes in order to indicate which type of issue was detected (see Shoot Status for more details).\nApart from the above, extension controllers can also contribute to the status or error codes of these conditions (see Contributing to Shoot Health Status Conditions for more details).\nIf all checks for a certain condition succeed, then its status will be set to True. Otherwise, it will be set to False.\nIf at least one check fails and there is a threshold configuration for the conditions (in .controllers.shootCare.conditionThresholds), then the status will be set:\n to Progressing if it was True before. 
to Progressing if it was Progressing before and the lastUpdateTime of the condition does not exceed the configured threshold duration yet. to False if it was Progressing before and the lastUpdateTime of the condition exceeds the configured threshold duration. The condition thresholds can be used to prevent reporting issues too early just because there is a rollout or a short disruption. Only if the unhealthiness persists for at least the configured threshold duration, then the issues will be reported (by setting the status to False).\nBesides directly checking the status of Deployments, Etcds, StatefulSets in the shoot namespace, this reconciler also considers ManagedResources (in the shoot namespace) and their status in order to compute the condition statuses, see this document for more information. The following table explains which ManagedResources are considered for which condition type:\n Condition Type ManagedResources are considered when ControlPlaneHealthy .spec.class=seed and care.gardener.cloud/condition-type label either unset, or set to ControlPlaneHealthy ObservabilityComponentsHealthy care.gardener.cloud/condition-type label set to ObservabilityComponentsHealthy SystemComponentsHealthy .spec.class unset or care.gardener.cloud/condition-type label set to SystemComponentsHealthy Constraints And Automatic Webhook Remediation Please see Shoot Status for more details.\nGarbage Collection Stale pods in the shoot namespace in the seed cluster and in the kube-system namespace in the shoot cluster are deleted. A pod is considered stale when:\n it was terminated with reason Evicted. it was terminated with reason starting with OutOf (e.g., OutOfCpu). it was terminated with reason NodeAffinity. it is stuck in termination (i.e., if its deletionTimestamp is more than 5m ago). 
“State” Reconciler This reconciler periodically (default: every 6h) performs backups of the state of Shoot clusters and persists them into ShootState resources into the same namespace as the Shoots in the garden cluster. It is only started in case the gardenlet is responsible for an unmanaged Seed, i.e. a Seed which is not backed by a seedmanagement.gardener.cloud/v1alpha1.ManagedSeed object. Alternatively, it can be disabled by setting the concurrentSyncs=0 for the controller in the gardenlet’s component configuration.\nPlease refer to GEP-22: Improved Usage of the ShootState API for all information.\nTokenRequestor Controller For ServiceAccounts The gardenlet uses an instance of the TokenRequestor controller which initially was developed in the context of the gardener-resource-manager, please read this document for further information.\ngardenlet uses it for requesting tokens for components running in the seed cluster that need to communicate with the garden cluster. The mechanism works the same way as for shoot control plane components running in the seed which need to communicate with the shoot cluster. However, gardenlet’s instance of the TokenRequestor controller is restricted to Secrets labeled with resources.gardener.cloud/class=garden. Furthermore, it doesn’t respect the serviceaccount.resources.gardener.cloud/namespace annotation. Instead, it always uses the seed’s namespace in the garden cluster for managing ServiceAccounts and their tokens.\nTokenRequestor Controller For WorkloadIdentitys The TokenRequestorWorkloadIdentity controller in the gardenlet reconciles Secrets labeled with security.gardener.cloud/purpose=workload-identity-token-requestor. When it encounters such Secret, it associates the Secret with a specific WorkloadIdentity using the annotations workloadidentity.security.gardener.cloud/name and workloadidentity.security.gardener.cloud/namespace. Any workload creating such Secrets is responsible to label and annotate the Secrets accordingly. 
After the association is made, the gardenlet requests a token for the specific WorkloadIdentity from the Gardener API Server and writes it back to the Secret’s data under the token key. The gardenlet is responsible for keeping this token valid by refreshing it periodically. The token is then used by components running in the seed cluster in order to present the said WorkloadIdentity to external systems, e.g., when calling cloud provider APIs.\nPlease refer to GEP-26: Workload Identity - Trust Based Authentication for more details.\nVPAEvictionRequirements Controller The VPAEvictionRequirements controller in the gardenlet reconciles VerticalPodAutoscaler objects labeled with autoscaling.gardener.cloud/eviction-requirements: managed-by-controller. It manages the EvictionRequirements on a VPA object, which are used to restrict when and how a Pod can be evicted to apply a new resource recommendation. Specifically, the following actions will be taken for the respective label and annotation configuration:\n If the VPA has the annotation eviction-requirements.autoscaling.gardener.cloud/downscale-restriction: never, an EvictionRequirement is added to the VPA object that allows evictions for upscaling only. If the VPA has the annotation eviction-requirements.autoscaling.gardener.cloud/downscale-restriction: in-maintenance-window-only, the same EvictionRequirement is added to the VPA object when the Shoot is currently outside of its maintenance window. When the Shoot is inside its maintenance window, the EvictionRequirement is removed. Information about the Shoot maintenance window times is stored in the annotation shoot.gardener.cloud/maintenance-window on the VPA. Managed Seeds Gardener users can use shoot clusters as seed clusters, so-called “managed seeds” (aka “shooted seeds”), by creating ManagedSeed resources. 
By default, the gardenlet that manages this shoot cluster then automatically creates a clone of itself with the same version and the same configuration that it currently has. Then it deploys the gardenlet clone into the managed seed cluster.\nFor more information, see ManagedSeeds: Register Shoot as Seed.\nMigrating from Previous Gardener Versions If your Gardener version doesn’t support gardenlets yet, no special migration is required, but the following prerequisites must be met:\n Your Gardener version is at least 0.31 before upgrading to v1. You have to make sure that your garden cluster is exposed in a way that it’s reachable from all your seed clusters. With previous Gardener versions, you had deployed the Gardener Helm chart (incorporating the API server, controller-manager, and scheduler). With v1, this stays the same, but you now have to deploy the gardenlet Helm chart as well into all of your seeds (if they aren’t managed, as mentioned earlier).\nSee Deploy a gardenlet for all instructions.\nRelated Links Gardener Architecture #356: Implement Gardener Scheduler #2309: Add /healthz endpoint for gardenlet ","categories":"","description":"Understand how the gardenlet, the primary \"agent\" on every seed cluster, works and learn more about the different Gardener components","excerpt":"Understand how the gardenlet, the primary \"agent\" on every seed …","ref":"/docs/gardener/concepts/gardenlet/","tags":"","title":"gardenlet"},{"body":"Using annotated Gateway API Gateway and/or HTTPRoutes as Source This tutorial describes how to use annotated Gateway API resources as a source for Certificate resources.\nInstall Istio on your cluster Follow the Istio Kubernetes Gateway API to install the Gateway API and to install Istio.\nThese are the typical commands for the Istio installation with the Kubernetes Gateway API:\nexport KUBECONFIG=... 
curl -L https://istio.io/downloadIstio | sh - kubectl get crd gateways.gateway.networking.k8s.io \u0026\u003e /dev/null || \\ { kubectl kustomize \"github.com/kubernetes-sigs/gateway-api/config/crd?ref=v1.0.0\" | kubectl apply -f -; } istioctl install --set profile=minimal -y kubectl label namespace default istio-injection=enabled Verify that Gateway Source works Install a sample service With automatic sidecar injection:\n$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/httpbin/httpbin.yaml Note: The sample service is not used in the following steps. It is deployed for illustration purposes only. To use it with certificates, you have to add an HTTPS port for it.\nUsing a Gateway as a source Deploy the Gateway API configuration including a single exposed route (i.e., /get):\nkubectl create namespace istio-ingress kubectl apply -f - \u003c\u003cEOF apiVersion: gateway.networking.k8s.io/v1beta1 kind: Gateway metadata: name: gateway namespace: istio-ingress annotations: #cert.gardener.cloud/dnsnames: \"*.example.com\" # alternative if you want to control the dns names explicitly. 
cert.gardener.cloud/purpose: managed spec: gatewayClassName: istio listeners: - name: default hostname: \"*.example.com\" # this is used by cert-controller-manager to extract DNS names port: 443 protocol: HTTPS allowedRoutes: namespaces: from: All tls: # important: tls section must be defined with exactly one certificateRefs item certificateRefs: - name: foo-example-com --- apiVersion: gateway.networking.k8s.io/v1beta1 kind: HTTPRoute metadata: name: http namespace: default spec: parentRefs: - name: gateway namespace: istio-ingress hostnames: [\"httpbin.example.com\"] # this is used by cert-controller-manager to extract DNS names too rules: - matches: - path: type: PathPrefix value: /get backendRefs: - name: httpbin port: 8000 EOF You should now see a created Certificate resource similar to:\n$ kubectl -n istio-ingress get cert -oyaml apiVersion: v1 items: - apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: generateName: gateway-gateway- name: gateway-gateway-kdw6h namespace: istio-ingress ownerReferences: - apiVersion: gateway.networking.k8s.io/v1 blockOwnerDeletion: true controller: true kind: Gateway name: gateway spec: commonName: '*.example.com' secretName: foo-example-com status: ... kind: List metadata: resourceVersion: \"\" Using a HTTPRoute as a source If the Gateway resource is annotated with cert.gardener.cloud/purpose: managed, hostnames from all referencing HTTPRoute resources are automatically extracted. 
These resources don’t need an additional annotation.\nDeploy the Gateway API configuration including a single exposed route (i.e., /get):\nkubectl create namespace istio-ingress kubectl apply -f - \u003c\u003cEOF apiVersion: gateway.networking.k8s.io/v1beta1 kind: Gateway metadata: name: gateway namespace: istio-ingress annotations: cert.gardener.cloud/purpose: managed spec: gatewayClassName: istio listeners: - name: default hostname: null # not set port: 443 protocol: HTTPS allowedRoutes: namespaces: from: All tls: # important: tls section must be defined with exactly one certificateRefs item certificateRefs: - name: foo-example-com --- apiVersion: gateway.networking.k8s.io/v1beta1 kind: HTTPRoute metadata: name: http namespace: default spec: parentRefs: - name: gateway namespace: istio-ingress hostnames: [\"httpbin.example.com\"] # this is used by cert-controller-manager to extract DNS names too rules: - matches: - path: type: PathPrefix value: /get backendRefs: - name: httpbin port: 8000 EOF This should show a similar Certificate resource as above.\n","categories":"","description":"","excerpt":"Using annotated Gateway API Gateway and/or HTTPRoutes as Source This …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/tutorials/gateway-api-gateways/","tags":"","title":"Gateway Api Gateways"},{"body":"Using annotated Gateway API Gateway and/or HTTPRoutes as Source This tutorial describes how to use annotated Gateway API resources as source for DNSEntries with the Gardener shoot-dns-service extension.\nThe dns-controller-manager supports the resources Gateway and HTTPRoute.\nInstall Istio on your cluster Using a new or existing shoot cluster, follow the Istio Kubernetes Gateway API to install the Gateway API and to install Istio.\nThese are the typical commands for the Istio installation with the Kubernetes Gateway API:\nexport KUBECONFIG=... 
curl -L https://istio.io/downloadIstio | sh - kubectl get crd gateways.gateway.networking.k8s.io \u0026\u003e /dev/null || \\ { kubectl kustomize \"github.com/kubernetes-sigs/gateway-api/config/crd?ref=v1.0.0\" | kubectl apply -f -; } istioctl install --set profile=minimal -y kubectl label namespace default istio-injection=enabled Verify that Gateway Source works Install a sample service With automatic sidecar injection:\n$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/httpbin/httpbin.yaml Using a Gateway as a source Deploy the Gateway API configuration including a single exposed route (i.e., /get):\nkubectl create namespace istio-ingress kubectl apply -f - \u003c\u003cEOF apiVersion: gateway.networking.k8s.io/v1 kind: Gateway metadata: name: gateway namespace: istio-ingress annotations: dns.gardener.cloud/dnsnames: \"*.example.com\" dns.gardener.cloud/class: garden spec: gatewayClassName: istio listeners: - name: default hostname: \"*.example.com\" # this is used by dns-controller-manager to extract DNS names port: 80 protocol: HTTP allowedRoutes: namespaces: from: All --- apiVersion: gateway.networking.k8s.io/v1 kind: HTTPRoute metadata: name: http namespace: default spec: parentRefs: - name: gateway namespace: istio-ingress hostnames: [\"httpbin.example.com\"] # this is used by dns-controller-manager to extract DNS names too rules: - matches: - path: type: PathPrefix value: /get backendRefs: - name: httpbin port: 8000 EOF You should now see events in the namespace of the gateway:\n$ kubectl -n istio-system get events --sort-by={.metadata.creationTimestamp} LAST SEEN TYPE REASON OBJECT MESSAGE ... 
38s Normal dns-annotation service/gateway-istio httpbin.example.com: created dns entry object shoot--foo--bar/gateway-istio-service-zpf8n 38s Normal dns-annotation service/gateway-istio httpbin.example.com: dns entry pending: waiting for dns reconciliation 38s Normal dns-annotation service/gateway-istio httpbin.example.com: dns entry is pending 36s Normal dns-annotation service/gateway-istio httpbin.example.com: dns entry active Using a HTTPRoute as a source If the Gateway resource is annotated with dns.gardener.cloud/dnsnames: \"*\", hostnames from all referencing HTTPRoute resources are automatically extracted. These resources don’t need an additional annotation.\nDeploy the Gateway API configuration including a single exposed route (i.e., /get):\nkubectl create namespace istio-ingress kubectl apply -f - \u003c\u003cEOF apiVersion: gateway.networking.k8s.io/v1 kind: Gateway metadata: name: gateway namespace: istio-ingress annotations: dns.gardener.cloud/dnsnames: \"*\" dns.gardener.cloud/class: garden spec: gatewayClassName: istio listeners: - name: default hostname: null # not set port: 80 protocol: HTTP allowedRoutes: namespaces: from: All --- apiVersion: gateway.networking.k8s.io/v1 kind: HTTPRoute metadata: name: http namespace: default spec: parentRefs: - name: gateway namespace: istio-ingress hostnames: [\"httpbin.example.com\"] # this is used by dns-controller-manager to extract DNS names too rules: - matches: - path: type: PathPrefix value: /get backendRefs: - name: httpbin port: 8000 EOF This should show similar events as above.\nAccess the sample service using curl $ curl -I http://httpbin.example.com/get HTTP/1.1 200 OK server: istio-envoy date: Tue, 13 Feb 2024 08:09:41 GMT content-type: application/json content-length: 701 access-control-allow-origin: * access-control-allow-credentials: true x-envoy-upstream-service-time: 19 Accessing any other URL that has not been explicitly exposed should return an HTTP 404 error:\n$ curl -I 
http://httpbin.example.com/headers HTTP/1.1 404 Not Found date: Tue, 13 Feb 2024 08:09:41 GMT server: istio-envoy transfer-encoding: chunked ","categories":"","description":"","excerpt":"Using annotated Gateway API Gateway and/or HTTPRoutes as Source This …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/tutorials/gateway-api-gateways/","tags":"","title":"Gateway Api Gateways"},{"body":"Overview To troubleshoot certain problems in a Kubernetes cluster, operators need access to the host of the Kubernetes node. This can be required if a node misbehaves or fails to join the cluster in the first place.\nWith access to the host, it is for instance possible to check the kubelet logs and interact with common tools such as systemctl and journalctl.\nThe first section of this guide explores options to get a shell to the node of a Gardener Kubernetes cluster. The options described in the second section do not rely on Kubernetes capabilities to get shell access to a node and thus can also be used if an instance failed to join the cluster.\nThis guide only covers how to get access to the host, but does not cover troubleshooting methods.\n Overview Get a Shell to an Operational Cluster Node Gardener Dashboard Result Gardener Ops Toolbelt Custom Root Pod SSH Access to a Node That Failed to Join the Cluster Identifying the Problematic Instance gardenctl ssh SSH with a Manually Created Bastion on AWS Create the Bastion Security Group Create the Bastion Instance Connecting to the Target Instance Cleanup Get a Shell to an Operational Cluster Node The following describes four different approaches to get a shell to an operational Shoot worker node. As a prerequisite to troubleshooting a Kubernetes node, the node must have joined the cluster successfully and be able to run a pod. 
All of the described approaches involve scheduling a pod with root permissions and mounting the root filesystem.\nGardener Dashboard Prerequisite: the terminal feature is configured for the Gardener dashboard.\n Navigate to the cluster overview page and find the Terminal in the Access tile. Select the target Cluster (Garden, Seed / Control Plane, Shoot cluster) depending on the requirements and access rights (only certain users have access to the Seed Control Plane).\n To open the terminal configuration, interact with the top right-hand corner of the screen. Set the Terminal Runtime to “Privileged”. Also, specify the target node from the drop-down menu. Result The Dashboard then schedules a pod and opens a shell session to the node.\nTo get access to the common binaries installed on the host, prefix the command with chroot /hostroot. Note that the path depends on where the root path is mounted in the container. In the default image used by the Dashboard, it is under /hostroot.\nGardener Ops Toolbelt Prerequisite: kubectl is available.\nThe Gardener ops-toolbelt can be used as a convenient way to deploy a root pod to a node. The pod uses an image that is bundled with a bunch of useful troubleshooting tools. This is also the same image that is used by default when using the Gardener Dashboard terminal feature as described in the previous section.\nThe easiest way to use the Gardener ops-toolbelt is to execute the ops-pod script in the hacks folder. To get root shell access to a node, execute the aforementioned script by supplying the target node name as an argument:\n\u003cpath-to-ops-toolbelt-repo\u003e/hacks/ops-pod \u003ctarget-node\u003e Custom Root Pod Alternatively, a pod can be assigned to a target node and a shell can be opened via standard Kubernetes means. 
To enable root access to the node, the pod specification requires proper securityContext and volume properties.\nFor instance, you can use the following pod manifest, after replacing \u003ctarget-node-name\u003e with the name of the node you want this pod attached to:\napiVersion: v1 kind: Pod metadata: name: privileged-pod namespace: default spec: nodeSelector: kubernetes.io/hostname: \u003ctarget-node-name\u003e containers: - name: busybox image: busybox stdin: true securityContext: privileged: true volumeMounts: - name: host-root-volume mountPath: /host readOnly: true volumes: - name: host-root-volume hostPath: path: / hostNetwork: true hostPID: true restartPolicy: Never SSH Access to a Node That Failed to Join the Cluster This section explores two options that can be used to get SSH access to a node that failed to join the cluster. As it is not possible to schedule a pod on the node, the Kubernetes-based methods explored so far cannot be used in this scenario.\nAdditionally, Gardener typically provisions worker instances in a private subnet of the VPC, hence there is no public IP address that could be used for direct SSH access.\nFor this scenario, cloud providers typically have extensive documentation (e.g., AWS \u0026 GCP) and in some cases tooling support. However, these approaches are mostly cloud provider specific, require interaction via their CLI and API or sometimes the installation of a cloud provider specific agent on the node.\nAlternatively, gardenctl can be used, providing cloud provider agnostic, out-of-the-box support for getting SSH access to an instance in a private subnet. Currently gardenctl supports AWS, GCP, Openstack, Azure and Alibaba Cloud.\nIdentifying the Problematic Instance First, the problematic instance has to be identified. 
In Gardener, worker pools can be created in different cloud provider regions, zones, and accounts.\nThe instance would typically show up as successfully started / running in the cloud provider dashboard or API and it is not immediately obvious which one has a problem. Instead, we can use the Gardener API / CRDs to obtain the faulty instance identifier in a cloud-agnostic way.\nGardener uses the Machine Controller Manager to create the Shoot worker nodes. For each worker node, the Machine Controller Manager creates a Machine CRD in the Shoot namespace in the respective Seed cluster. Usually the problematic instance can be identified, as the respective Machine CRD has status pending.\nThe instance / node name can be obtained from the Machine .status field:\nkubectl get machine \u003cmachine-name\u003e -o json | jq -r .status.node This is all the information needed to go ahead and use gardenctl ssh to get a shell to the node. In addition, the used cloud provider, the specific identifier of the instance, and the instance region can be identified from the Machine CRD.\nGet the identifier of the instance via:\nkubectl get machine \u003cmachine-name\u003e -o json | jq -r .spec.providerID // e.g aws:///eu-north-1/i-069733c435bdb4640 The identifier shows that the instance belongs to the cloud provider aws with the ec2 instance-id i-069733c435bdb4640 in region eu-north-1.\nTo get more information about the instance, check out the MachineClass (e.g., AWSMachineClass) that is associated with each Machine CRD in the Shoot namespace of the Seed cluster.\nThe AWSMachineClass contains the machine image (ami), machine-type, iam information, network-interfaces, subnets, security groups and attached volumes.\nOf course, the information can also be used to get the instance with the cloud provider CLI / API.\ngardenctl ssh Using the node name of the problematic instance, we can use the gardenctl ssh command to get SSH access to the cloud provider instance via an automatically set up 
bastion host. gardenctl takes care of spinning up the bastion instance, setting up the SSH keys, ports and security groups and opens a root shell on the target instance. After the SSH session has ended, gardenctl deletes the created cloud provider resources.\nUse the following commands:\n First, target a Garden cluster containing all the Shoot definitions. gardenctl target garden \u003ctarget-garden\u003e Target an available Shoot by name. This sets up the context, configures the kubeconfig file of the Shoot cluster and downloads the cloud provider credentials. Subsequent commands will execute in this context. gardenctl target shoot \u003ctarget-shoot\u003e This uses the cloud provider credentials to spin up the bastion and to open a shell on the target instance. gardenctl ssh \u003ctarget-node\u003e SSH with a Manually Created Bastion on AWS In case you are not using gardenctl or want to control the bastion instance yourself, you can also manually set it up. The steps described here are generally the same as those used by gardenctl internally. Despite some cloud provider specifics, they can be generalized to the following list:\n Open port 22 on the target instance. Create an instance / VM in a public subnet (the bastion instance needs to have a public IP address). Set-up security groups and roles, and open port 22 for the bastion instance. The following diagram shows an overview of how the SSH access to the target instance works:\nThis guide demonstrates the setup of a bastion on AWS.\nPrerequisites:\n The AWS CLI is set up.\n Obtain target instance-id (see Identifying the Problematic Instance).\n Obtain the VPC ID the Shoot resources are created in. 
This can be found in the Infrastructure CRD in the Shoot namespace in the Seed.\n Make sure that port 22 on the target instance is open (default for Gardener deployed instances).\n Extract the security group via: aws ec2 describe-instances --instance-ids \u003cinstance-id\u003e Check for a rule that allows inbound connections on port 22: aws ec2 describe-security-groups --group-ids=\u003csecurity-group-id\u003e If not available, create the rule with the following command: aws ec2 authorize-security-group-ingress --group-id \u003csecurity-group-id\u003e --protocol tcp --port 22 --cidr 0.0.0.0/0 Create the Bastion Security Group The common name of the security group is \u003cshoot-name\u003e-bsg. Create the security group: aws ec2 create-security-group --group-name \u003cbastion-security-group-name\u003e --description ssh-access --vpc-id \u003cVPC-ID\u003e Optionally, create identifying tags for the security group: aws ec2 create-tags --resources \u003cbastion-security-group-id\u003e --tags Key=component,Value=\u003ctag\u003e Create a permission in the bastion security group that allows ssh access on port 22: aws ec2 authorize-security-group-ingress --group-id \u003cbastion-security-group-id\u003e --protocol tcp --port 22 --cidr 0.0.0.0/0 Create an IAM role for the bastion instance with the name \u003cshoot-name\u003e-bastions: aws iam create-role --role-name \u003cshoot-name\u003e-bastions The content should be:\n{ \"Version\": \"2012-10-17\", \"Statement\": [ { \"Effect\": \"Allow\", \"Action\": [ \"ec2:DescribeRegions\" ], \"Resource\": [ \"*\" ] } ] } Create the instance profile and name it \u003cshoot-name\u003e-bastions: aws iam create-instance-profile --instance-profile-name \u003cname\u003e Add the created role to the instance profile: aws iam add-role-to-instance-profile --instance-profile-name \u003cinstance-profile-name\u003e --role-name \u003crole-name\u003e Create the Bastion Instance Next, in order to be able to ssh into the bastion instance, the instance has 
to be set up with a user with a public ssh key. Create a user gardener that has the same Gardener-generated public ssh key as the target instance.\n First, we need to get the public part of the Shoot ssh-key. The ssh-key is stored in a secret in the project namespace in the Garden cluster. The name is: \u003cshoot-name\u003e-ssh-publickey. Get the key via: kubectl get secret aws-gvisor.ssh-keypair -o json | jq -r .data.\\\"id_rsa.pub\\\" A script handed over as user-data to the bastion ec2 instance can be used to create the gardener user and add the ssh-key. For your convenience, you can use the following script to generate the user-data. #!/bin/bash -eu saveUserDataFile () { ssh_key=$1 cat \u003e gardener-bastion-userdata.sh \u003c\u003cEOF #!/bin/bash -eu id gardener || useradd gardener -mU mkdir -p /home/gardener/.ssh echo \"$ssh_key\" \u003e /home/gardener/.ssh/authorized_keys chown gardener:gardener /home/gardener/.ssh/authorized_keys echo \"gardener ALL=(ALL) NOPASSWD:ALL\" \u003e/etc/sudoers.d/99-gardener-user EOF } if [ -p /dev/stdin ]; then read -r input cat | saveUserDataFile \"$input\" else pbpaste | saveUserDataFile \"$input\" fi Use the script by handing over the public ssh-key of the Shoot cluster: kubectl get secret aws-gvisor.ssh-keypair -o json | jq -r .data.\\\"id_rsa.pub\\\" | ./generate-userdata.sh This generates a file called gardener-bastion-userdata.sh in the same directory containing the user-data.\n The following information is needed to create the bastion instance: bastion-IAM-instance-profile-name - Use the created instance profile with the name \u003cshoot-name\u003e-bastions\nimage-id - It is possible to use the same image-id as the one used for the target instance (or any other image). Has cloud provider specific format (AWS: ami).\nssh-public-key-name\n- This is the ssh key pair already created in the Shoot's cloud provider account by Gardener during the `Infrastructure` CRD reconciliation. 
- The name is usually: `\u003cshoot-name\u003e-ssh-publickey` subnet-id - Choose a subnet that is attached to an Internet Gateway and NAT Gateway (bastion instance must have a public IP). - The Gardener created public subnet with the name \u003cshoot-name\u003e-public-utility-\u003cxy\u003e can be used. Please check the created subnets with the cloud provider.\nbastion-security-group-id - Use the id of the created bastion security group.\nfile-path-to-userdata - Use the filepath to the user-data file generated in the previous step.\n bastion-instance-name Optionally, you can tag the instance. Usually \u003cshoot-name\u003e-bastions Create the bastion instance via: aws ec2 run-instances --iam-instance-profile Name=\u003cbastion-IAM-instance-profile-name\u003e --image-id \u003cimage-id\u003e --count 1 --instance-type t3.nano --key-name \u003cssh-public-key-name\u003e --security-group-ids \u003cbastion-security-group-id\u003e --subnet-id \u003csubnet-id\u003e --associate-public-ip-address --user-data \u003cfile-path-to-userdata\u003e --tag-specifications ResourceType=instance,Tags=[{Key=Name,Value=\u003cbastion-instance-name\u003e},{Key=component,Value=\u003cmytag\u003e}] ResourceType=volume,Tags=[{Key=component,Value=\u003cmytag\u003e}] Capture the instance-id from the response and wait until the ec2 instance is running and has a public IP address.\nConnecting to the Target Instance Save the private key of the ssh-key-pair in a temporary local file for later use: umask 077 kubectl get secret \u003cshoot-name\u003e.ssh-keypair -o json | jq -r .data.\\\"id_rsa\\\" | base64 -d \u003e id_rsa.key Use the private ssh key to ssh into the bastion instance: ssh -i \u003cpath-to-private-key\u003e gardener@\u003cpublic-bastion-instance-ip\u003e  If that works, connect from your local terminal to the target instance via the bastion: ssh -i \u003cpath-to-private-key\u003e -o ProxyCommand=\"ssh -W %h:%p -i \u003cprivate-key\u003e -o IdentitiesOnly=yes -o StrictHostKeyChecking=no 
gardener@\u003cpublic-ip-bastion\u003e\" gardener@\u003cprivate-ip-target-instance\u003e -o IdentitiesOnly=yes -o StrictHostKeyChecking=no Cleanup Do not forget to cleanup the created resources. Otherwise Gardener will eventually fail to delete the Shoot.\n","categories":"","description":"Describes the methods for getting shell access to worker nodes","excerpt":"Describes the methods for getting shell access to worker nodes","ref":"/docs/guides/monitoring-and-troubleshooting/shell-to-node/","tags":"","title":"Get a Shell to a Gardener Shoot Worker Node"},{"body":"Deploying Rsyslog Relp Extension Locally This document will walk you through running the Rsyslog Relp extension and a fake rsyslog relp service on your local machine for development purposes. This guide uses Gardener’s local development setup and builds on top of it.\nIf you encounter difficulties, please open an issue so that we can make this process easier.\nPrerequisites Make sure that you have a running local Gardener setup. The steps to complete this can be found here. Make sure you are running Gardener version \u003e= 1.74.0 or the latest version of the master branch. Setting up the Rsyslog Relp Extension Important: Make sure that your KUBECONFIG env variable is targeting the local Gardener cluster!\nmake extension-up This will build the shoot-rsyslog-relp, shoot-rsyslog-relp-admission, and shoot-rsyslog-relp-echo-server images and deploy the needed resources and configurations in the garden cluster. 
The shoot-rsyslog-relp-echo-server will act as development replacement of a real rsyslog relp server.\nCreating a Shoot Cluster Once the above step is completed, we can deploy and configure a Shoot cluster with default rsyslog relp settings.\nkubectl apply -f ./example/shoot.yaml Once the Shoot’s namespace is created, we can create a networkpolicy that will allow egress traffic from the rsyslog on the Shoot’s nodes to the rsyslog-relp-echo-server that serves as a fake rsyslog target server.\nkubectl apply -f ./example/local/allow-machine-to-rsyslog-relp-echo-server-netpol.yaml Currently, the Shoot’s nodes run Ubuntu, which does not have the rsyslog-relp and auditd packages installed, so the configuration done by the extension has no effect. Once the Shoot is created, we have to manually install the rsyslog-relp and auditd packages:\nkubectl -n shoot--local--local exec -it $(kubectl -n shoot--local--local get po -l app=machine,machine-provider=local -o name) -- bash -c \" apt-get update \u0026\u0026 \\ apt-get install -y rsyslog-relp auditd \u0026\u0026 \\ systemctl enable rsyslog.service \u0026\u0026 \\ systemctl start rsyslog.service\" Once that is done we can verify that log messages are forwarded to the rsyslog-relp-echo-server by checking its logs.\nkubectl -n rsyslog-relp-echo-server logs deployment/rsyslog-relp-echo-server Making Changes to the Rsyslog Relp Extension Changes to the rsyslog relp extension can be applied to the local environment by repeatedly running the make recipe.\nmake extension-up Tearing Down the Development Environment To tear down the development environment, delete the Shoot cluster or disable the shoot-rsyslog-relp extension in the Shoot’s spec. 
When the extension is not used by the Shoot anymore, you can run:\nmake extension-down This will delete the ControllerRegistration and ControllerDeployment of the extension, the shoot-rsyslog-relp-admission deployment, and the rsyslog-relp-echo-server deployment.\nMaintaining the Publicly Available Image for the rsyslog-relp Echo Server The testmachinery tests use an rsyslog-relp-echo-server image from a publicly available repository. The one which is currently used is eu.gcr.io/gardener-project/gardener/extensions/shoot-rsyslog-relp-echo-server:v0.1.0.\nSometimes it might be necessary to update the image and publish it, e.g. when updating the alpine base image version specified in the repository’s Dockerfile.\nTo do that:\n Bump the version with which the image is built in the Makefile.\n Build the shoot-rsyslog-relp-echo-server image:\nmake echo-server-docker-image Once the image is built, push it to gcr with:\nmake push-echo-server-image Finally, bump the version of the image used by the testmachinery tests here.\n Create a PR with the changes.\n ","categories":"","description":"","excerpt":"Deploying Rsyslog Relp Extension Locally This document will walk you …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/getting-started/","tags":"","title":"Getting Started"},{"body":"Deploying Gardener Locally This document will walk you through deploying Gardener on your local machine. If you encounter difficulties, please open an issue so that we can make this process easier.\nOverview Gardener runs in any Kubernetes cluster. 
In this guide, we will start a KinD cluster which is used as both garden and seed cluster (please refer to the architecture overview) for simplicity.\nBased on Skaffold, the container images for all required components will be built and deployed into the cluster (via their Helm charts).\nAlternatives When deploying Gardener on your local machine you might face several limitations:\n Your machine doesn’t have enough compute resources (see prerequisites) for hosting a second seed cluster or multiple shoot clusters. Testing Gardener’s IPv6 features requires a Linux machine and native IPv6 connectivity to the internet, but you’re on macOS or don’t have IPv6 connectivity in your office environment or via your home ISP. In these cases, you might want to check out one of the following options that run the setup described in this guide elsewhere for circumventing these limitations:\n remote local setup: deploy on a remote pod for more compute resources dev box on Google Cloud: deploy on a Google Cloud machine for more compute resource and/or simple IPv4/IPv6 dual-stack networking Prerequisites Make sure that you have followed the Local Setup guide up until the Get the sources step. Make sure your Docker daemon is up-to-date, up and running and has enough resources (at least 8 CPUs and 8Gi memory; see here how to configure the resources for Docker for Mac). Please note that 8 CPU / 8Gi memory might not be enough for more than two Shoot clusters, i.e., you might need to increase these values if you want to run additional Shoots. If you plan on following the optional steps to create a second seed cluster, the required resources will be more - at least 10 CPUs and 18Gi memory. Additionally, please configure at least 120Gi of disk size for the Docker daemon. 
Tip: You can clean up unused data with docker system df and docker system prune -a.\n Setting Up the KinD Cluster (Garden and Seed) make kind-up If you want to set up an IPv6 KinD cluster, use make kind-up IPFAMILY=ipv6 instead.\n This command sets up a new KinD cluster named gardener-local and stores the kubeconfig in the ./example/gardener-local/kind/local/kubeconfig file.\n It might be helpful to copy this file to $HOME/.kube/config, since you will need to target this KinD cluster multiple times. Alternatively, make sure to set your KUBECONFIG environment variable to ./example/gardener-local/kind/local/kubeconfig for all future steps via export KUBECONFIG=$PWD/example/gardener-local/kind/local/kubeconfig.\n All of the following steps assume that you are using this kubeconfig.\nAdditionally, this command also deploys a local container registry to the cluster, as well as a few registry mirrors, that are set up as a pull-through cache for all upstream registries Gardener uses by default. This is done to speed up image pulls across local clusters. The local registry can be accessed as localhost:5001 for pushing and pulling. The storage directories of the registries are mounted to the host machine under dev/local-registry. With this, mirrored images don’t have to be pulled again after recreating the cluster.\nThe command also deploys a default calico installation as the cluster’s CNI implementation with NetworkPolicy support (the default kindnet CNI doesn’t provide NetworkPolicy support). Furthermore, it deploys the metrics-server in order to support HPA and VPA on the seed cluster.\nSetting Up IPv6 Single-Stack Networking (optional) First, ensure that your /etc/hosts file contains an entry resolving localhost to the IPv6 loopback address:\n::1 localhost Typically, only ip6-localhost is mapped to ::1 on Linux machines. 
However, we need localhost to resolve to both 127.0.0.1 and ::1 so that we can talk to our registry via a single address (localhost:5001).\nNext, we need to configure NAT for outgoing traffic from the kind network to the internet. After executing make kind-up IPFAMILY=ipv6, execute the following command to set up the corresponding iptables rules:\nip6tables -t nat -A POSTROUTING -o $(ip route show default | awk '{print $5}') -s fd00:10::/64 -j MASQUERADE Setting Up Gardener make gardener-up If you want to set up an IPv6-ready Gardener, use make gardener-up IPFAMILY=ipv6 instead.\n This will first build the base images (which might take a bit if you do it for the first time). Afterwards, the Gardener resources will be deployed into the cluster.\nDeveloping Gardener make gardener-dev This is similar to make gardener-up but additionally starts a skaffold dev loop. After the initial deployment, skaffold starts watching source files. Once it has detected changes, press any key to trigger a new build and deployment of the changed components.\nTip: you can set the SKAFFOLD_MODULE environment variable to select specific modules of the skaffold configuration (see skaffold.yaml) that skaffold should watch, build, and deploy. This significantly reduces turnaround times during development.\nFor example, if you want to develop changes to gardenlet:\n# initial deployment of all components make gardener-up # start iterating on gardenlet without deploying other components make gardener-dev SKAFFOLD_MODULE=gardenlet Debugging Gardener make gardener-debug This uses the skaffold debugging features. In the Gardener case, Go debugging using Delve is the most relevant use case. Please see the skaffold debugging documentation for how to set up your IDE accordingly.\nThe SKAFFOLD_MODULE environment variable works the same way as described for Developing Gardener. 
However, skaffold does not watch for changes when debugging, so that it does not interrupt your debugging session.\nFor example, if you want to debug gardenlet:\n# initial deployment of all components make gardener-up # start debugging gardenlet without deploying other components make gardener-debug SKAFFOLD_MODULE=gardenlet In the debugging flow, skaffold builds your container images, reconfigures your pods and creates port forwardings for the Delve debugging ports to your localhost. The default port is 56268. If you debug multiple pods at the same time, the port of the second pod will be forwarded to 56269 and so on. Please check your console output for the concrete port-forwarding on your machine.\n Note: Resuming or stopping only a single goroutine (Go Issue 25578, 31132) is currently not supported, so the action will cause all the goroutines to get activated or paused. (vscode-go wiki)\n This means that when a goroutine of gardenlet (or any other gardener-core component you try to debug) is paused on a breakpoint, all the other goroutines are paused. Hence, when the whole gardenlet process is paused, it cannot renew its lease and cannot respond to the liveness and readiness probes. Skaffold automatically increases timeoutSeconds of liveness and readiness probes to 600. 
Even so, we still observed pods being killed after a while during debugging sessions.\nThus, leader election, health and readiness checks for gardener-admission-controller, gardener-apiserver, gardener-controller-manager, gardener-scheduler, gardenlet and operator are disabled when debugging.\nIf you have similar problems with other components which are not deployed by skaffold, you could temporarily turn off the leader election and disable liveness and readiness probes there too.\nCreating a Shoot Cluster You can wait for the Seed to be ready by running:\n./hack/usage/wait-for.sh seed local GardenletReady SeedSystemComponentsHealthy ExtensionsReady Alternatively, you can run kubectl get seed local and wait for the STATUS to indicate readiness:\nNAME STATUS PROVIDER REGION AGE VERSION K8S VERSION local Ready local local 4m42s vX.Y.Z-dev v1.25.1 In order to create a first shoot cluster, just run:\nkubectl apply -f example/provider-local/shoot.yaml You can wait for the Shoot to be ready by running:\nNAMESPACE=garden-local ./hack/usage/wait-for.sh shoot local APIServerAvailable ControlPlaneHealthy ObservabilityComponentsHealthy EveryNodeReady SystemComponentsHealthy Alternatively, you can run kubectl -n garden-local get shoot local and wait for the LAST OPERATION to reach 100%:\nNAME CLOUDPROFILE PROVIDER REGION K8S VERSION HIBERNATION LAST OPERATION STATUS AGE local local local local 1.25.1 Awake Create Processing (43%) healthy 94s If you don’t need any worker pools, you can create a workerless Shoot by running:\nkubectl apply -f example/provider-local/shoot-workerless.yaml (Optional): You could also execute a simple e2e test (creating and deleting a shoot) by running:\nmake test-e2e-local-simple KUBECONFIG=\"$PWD/example/gardener-local/kind/local/kubeconfig\" Accessing the Shoot Cluster ⚠️ Please note that in this setup, shoot clusters are not accessible by default when you download the kubeconfig and try to communicate with them. 
The reason is that your host most probably cannot resolve the DNS names of the clusters since provider-local extension runs inside the KinD cluster (for more details, see DNSRecord). Hence, if you want to access the shoot cluster, you have to run the following command which will extend your /etc/hosts file with the required information to make the DNS names resolvable:\ncat \u003c\u003cEOF | sudo tee -a /etc/hosts # Begin of Gardener local setup section # Shoot API server domains 172.18.255.1 api.local.local.external.local.gardener.cloud 172.18.255.1 api.local.local.internal.local.gardener.cloud # Ingress 172.18.255.1 p-seed.ingress.local.seed.local.gardener.cloud 172.18.255.1 g-seed.ingress.local.seed.local.gardener.cloud 172.18.255.1 gu-local--local.ingress.local.seed.local.gardener.cloud 172.18.255.1 p-local--local.ingress.local.seed.local.gardener.cloud 172.18.255.1 v-local--local.ingress.local.seed.local.gardener.cloud # E2e tests 172.18.255.1 api.e2e-managedseed.garden.external.local.gardener.cloud 172.18.255.1 api.e2e-managedseed.garden.internal.local.gardener.cloud 172.18.255.1 api.e2e-hib.local.external.local.gardener.cloud 172.18.255.1 api.e2e-hib.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-hib-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-hib-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-unpriv.local.external.local.gardener.cloud 172.18.255.1 api.e2e-unpriv.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-wake-up.local.external.local.gardener.cloud 172.18.255.1 api.e2e-wake-up.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-wake-up-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-wake-up-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-wake-up-ncp.local.external.local.gardener.cloud 172.18.255.1 api.e2e-wake-up-ncp.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-migrate.local.external.local.gardener.cloud 172.18.255.1 
api.e2e-migrate.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-migrate-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-migrate-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-mgr-hib.local.external.local.gardener.cloud 172.18.255.1 api.e2e-mgr-hib.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-rotate.local.external.local.gardener.cloud 172.18.255.1 api.e2e-rotate.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-rotate-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-rotate-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-default.local.external.local.gardener.cloud 172.18.255.1 api.e2e-default.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-default-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-default-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-force-delete.local.external.local.gardener.cloud 172.18.255.1 api.e2e-force-delete.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-fd-hib.local.external.local.gardener.cloud 172.18.255.1 api.e2e-fd-hib.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upd-node.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upd-node.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upd-node-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upd-node-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upgrade.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upgrade.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upgrade-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upgrade-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upg-hib.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upg-hib.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upg-hib-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upg-hib-wl.local.internal.local.gardener.cloud 172.18.255.1 gu-local--e2e-rotate.ingress.local.seed.local.gardener.cloud 
172.18.255.1 gu-local--e2e-rotate-wl.ingress.local.seed.local.gardener.cloud # End of Gardener local setup section EOF To access the Shoot, you can acquire a kubeconfig by using the shoots/adminkubeconfig subresource.\nFor convenience a helper script is provided in the hack directory. By default the script will generate a kubeconfig for a Shoot named “local” in the garden-local namespace valid for one hour.\n./hack/usage/generate-admin-kubeconf.sh \u003e admin-kubeconf.yaml If you want to change the default namespace or shoot name, you can do so by passing different values as arguments.\n./hack/usage/generate-admin-kubeconf.sh --namespace \u003cnamespace\u003e --shoot-name \u003cshootname\u003e \u003e admin-kubeconf.yaml To access an Ingress resource from the Seed, use the Ingress host with port 8448 (https://\u003cingress-host\u003e:8448, for example https://gu-local--local.ingress.local.seed.local.gardener.cloud:8448).\n(Optional): Setting Up a Second Seed Cluster There are cases where you would want to create a second seed cluster in your local setup. For example, if you want to test the control plane migration feature. The following steps describe how to do that.\nIf you are on macOS, add a new IP address on your loopback device which will be necessary for the new KinD cluster that you will create. 
On macOS, the default loopback device is lo0.\nsudo ip addr add 172.18.255.2 dev lo0 # adding 172.18.255.2 ip to the loopback interface Next, set up the second KinD cluster:\nmake kind2-up This command sets up a new KinD cluster named gardener-local2 and stores its kubeconfig in the ./example/gardener-local/kind/local2/kubeconfig file.\nIn order to deploy required resources in the KinD cluster that you just created, run:\nmake gardenlet-kind2-up The following steps assume that you are using the kubeconfig that points to the gardener-local cluster (first KinD cluster): export KUBECONFIG=$PWD/example/gardener-local/kind/local/kubeconfig.\nYou can wait for the local2 Seed to be ready by running:\n./hack/usage/wait-for.sh seed local2 GardenletReady SeedSystemComponentsHealthy ExtensionsReady Alternatively, you can run kubectl get seed local2 and wait for the STATUS to indicate readiness:\nNAME STATUS PROVIDER REGION AGE VERSION K8S VERSION local2 Ready local local 4m42s vX.Y.Z-dev v1.25.1 If you want to perform control plane migration, you can follow the steps outlined in Control Plane Migration to migrate the shoot cluster to the second seed you just created.\nDeleting the Shoot Cluster ./hack/usage/delete shoot local garden-local (Optional): Tear Down the Second Seed Cluster make kind2-down Tear Down the Gardener Environment make kind-down Alternative Way to Set Up Garden and Seed Leveraging gardener-operator Instead of starting Garden and Seed via make kind-up gardener-up, you can also use gardener-operator to create your local dev landscape. In this setup, the virtual garden cluster has its own load balancer, so you have to create your own DNS entry in your /etc/hosts:\ncat \u003c\u003cEOF | sudo tee -a /etc/hosts # Manually created to access local Gardener virtual garden cluster. # TODO: Remove this again when the virtual garden cluster access is no longer required. 
172.18.255.3 api.virtual-garden.local.gardener.cloud EOF You can bring up gardener-operator with this command:\nmake kind-operator-up operator-up Afterwards, you can create your local Garden and install gardenlet into the KinD cluster with this command:\nmake operator-seed-up You find the kubeconfig for the KinD cluster at ./example/gardener-local/kind/operator/kubeconfig. The one for the virtual garden is accessible at ./example/operator/virtual-garden/kubeconfig.\n [!IMPORTANT] When you create non-HA shoot clusters (i.e., Shoots with .spec.controlPlane.highAvailability.failureTolerance != zone), then they are not exposed via 172.18.255.1 (ref). Instead, you need to find out under which Istio instance they got exposed, and put the corresponding IP address into your /etc/hosts file:\n# replace \u003cshoot-namespace\u003e with your shoot namespace (e.g., `shoot--foo--bar`): kubectl -n \"$(kubectl -n \u003cshoot-namespace\u003e get gateway kube-apiserver -o jsonpath={.spec.selector.istio} | sed 's/.*--/istio-ingress--/')\" get svc istio-ingressgateway -o jsonpath={.status.loadBalancer.ingress..ip} When the shoot cluster is HA (i.e., .spec.controlPlane.highAvailability.failureTolerance == zone), then you can access it via 172.18.255.1.\n Please use this command to tear down your environment:\nmake kind-operator-down This setup supports creating shoots and managed seeds the same way as explained in the previous chapters. 
However, the development and debugging setups are not working yet.\nRemote Local Setup Just like Prow is executing the KinD based integration tests in a K8s pod, it is possible to interactively run this KinD based Gardener development environment, aka “local setup”, in a “remote” K8s pod.\nk apply -f docs/deployment/content/remote-local-setup.yaml k exec -it remote-local-setup-0 -- sh tmux a Caveats Please refer to the TMUX documentation for working effectively inside the remote-local-setup pod.\nTo access Plutono, Prometheus or other components in a browser, two port forwards are needed:\nThe port forward from the laptop to the pod:\nk port-forward remote-local-setup-0 3000 The port forward in the remote-local-setup pod to the respective component:\nk port-forward -n shoot--local--local deployment/plutono 3000 Related Links Local Provider Extension ","categories":"","description":"","excerpt":"Deploying Gardener Locally This document will walk you through …","ref":"/docs/gardener/deployment/getting_started_locally/","tags":"","title":"Getting Started Locally"},{"body":"Developing Gardener Locally This document explains how to set up a kind based environment for developing Gardener locally.\nFor the best development experience you should especially check the Developing Gardener section.\nIn case you plan a debugging session please check the Debugging Gardener section.\n","categories":"","description":"","excerpt":"Developing Gardener Locally This document explains how to set up a kind …","ref":"/docs/gardener/getting_started_locally/","tags":"","title":"Getting Started Locally"},{"body":"Etcd-Druid Local Setup This page aims to provide steps on how to set up Etcd-Druid locally with and without storage providers.\nClone the etcd-druid github repo # clone the repo git clone https://github.com/gardener/etcd-druid.git # cd into etcd-druid folder cd etcd-druid Note:\n Etcd-druid uses kind as its local Kubernetes engine. 
The local setup is configured for kind due to its convenience but any other kubernetes setup would also work. To set up etcd-druid with backups enabled on a LocalStack provider, refer to this document. In the section Annotate Etcd CR with the reconcile annotation, the flag --enable-etcd-spec-auto-reconcile is set to false, which means a special annotation is required on the Etcd CR, for etcd-druid to reconcile it. To disable this behavior and allow auto-reconciliation of the Etcd CR for any change in the Etcd spec, set the controllers.etcd.enableEtcdSpecAutoReconcile value to true in the values.yaml located at charts/druid/values.yaml. Or if etcd-druid is being run as a process, then while starting the process, set the CLI flag --enable-etcd-spec-auto-reconcile=true for it. Setting up the kind cluster # Create a kind cluster make kind-up This creates a new kind cluster and stores the kubeconfig in the ./hack/e2e-test/infrastructure/kind/kubeconfig file.\nTo target this newly created cluster, set the KUBECONFIG environment variable to the kubeconfig file located at ./hack/e2e-test/infrastructure/kind/kubeconfig by running the following:\nexport KUBECONFIG=$PWD/hack/e2e-test/infrastructure/kind/kubeconfig Setting up etcd-druid Any one of these commands may be used to deploy etcd-druid to the configured k8s cluster.\n The following command deploys etcd-druid to the configured k8s cluster:\nmake deploy The following command deploys etcd-druid to the configured k8s cluster using Skaffold dev mode, such that changes in the etcd-druid code are automatically picked up and applied to the deployment. This helps with local development and quick iterative changes:\nmake deploy-dev The following command deploys etcd-druid to the configured k8s cluster using Skaffold debug mode, so that a debugger can be attached to the running etcd-druid deployment. 
Please refer to this guide for more information on Skaffold-based debugging:\nmake deploy-debug This generates the Etcd and EtcdCopyBackupsTask CRDs and deploys an etcd-druid pod into the cluster.\n Note: Before calling any of the make deploy* commands, certain environment variables may be set in order to enable/disable certain functionalities of etcd-druid. These are:\n DRUID_ENABLE_ETCD_COMPONENTS_WEBHOOK=true : enables the etcdcomponents webhook DRUID_E2E_TEST=true : sets specific configuration for etcd-druid for optimal e2e test runs, like a lower sync period for the etcd controller. USE_ETCD_DRUID_FEATURE_GATES=false : enables etcd-druid feature gates. Prepare the Etcd CR The Etcd CR can be configured in two ways: either to take backups to the store, or with backups disabled. Follow the appropriate section below based on the requirement.\nThe Etcd CR can be found at this location: $PWD/config/samples/druid_v1alpha1_etcd.yaml\n Without Backups enabled\nTo set up etcd-druid without backups enabled, make sure the spec.backup.store section of the Etcd CR is commented out.\n With Backups enabled (On Cloud Provider Object Stores)\n Prepare the secret\nCreate a secret for cloud provider access. 
Find the secret yaml templates for different cloud providers here.\nReplace the dummy values with the actual configurations and make sure to add a name and a namespace to the secret as intended.\n Note 1: The secret should be applied in the same namespace as druid.\nNote 2: All the values in the data field of secret yaml should be in base64 encoded format.\n Apply the secret\nkubectl apply -f path/to/secret Adapt Etcd resource\nUncomment the spec.backup.store section of the druid yaml and set the keys to allow backuprestore to take backups by connecting to an object store.\n# Configuration for storage provider store: secretRef: name: etcd-backup-secret-name container: object-storage-container-name provider: aws # options: aws,azure,gcp,openstack,alicloud,dell,openshift,local prefix: etcd-test Brief explanation of keys:\n secretRef.name is the name of the secret that was applied as mentioned above store.container is the object storage bucket name store.provider is the bucket provider. Pick from the options mentioned in comment store.prefix is the folder name that you want to use for your snapshots inside the bucket. 
Applying the Etcd CR Note: With backups enabled, make sure the bucket is created in the corresponding cloud provider before applying the Etcd yaml\n Create the Etcd CR (Custom Resource) by applying the Etcd yaml to the cluster\n# Apply the prepared etcd CR yaml kubectl apply -f config/samples/druid_v1alpha1_etcd.yaml Verify the Etcd cluster To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the cluster size, readiness status of its members, and various other attributes.\nkubectl get etcd -o=wide Verify Etcd Member Pods To check the etcd member pods, do the following and look out for pods starting with the name etcd-\nkubectl get pods Verify Etcd Pods’ Functionality Verify the working conditions of the etcd pods by putting data through an etcd container and accessing the db from the same or another container, depending on whether the etcd cluster is single-node or multi-node.\nIdeally, you can exec into the etcd container using kubectl exec -it \u003cetcd_pod\u003e -c etcd -- bash if it utilizes a base image containing a shell. However, note that the etcd-wrapper Docker image employs a distroless image, which lacks a shell. To interact with etcd, use an Ephemeral container as a debug container. Refer to this documentation for building and using an ephemeral container by attaching it to the etcd container.\n# Put a key-value pair into the etcd etcdctl put key1 value1 # Retrieve all key-value pairs from the etcd db etcdctl get --prefix \"\" For a multi-node etcd cluster, insert the key-value pair from the etcd container of one etcd member and retrieve it from the etcd container of another member to verify consensus among the multiple etcd members.\nView Etcd Database File The Etcd database file is located at var/etcd/data/new.etcd/snap/db inside the backup-restore container. In versions with an alpine base image, you can exec directly into the container. 
However, in recent versions where the backup-restore docker image started using a distroless image, a debug container is required to communicate with it, as mentioned in the previous section.\nUpdating the Etcd CR The Etcd spec can be updated with new changes, such as etcd cluster configuration or backup-restore configuration, and etcd-druid will reconcile these changes as expected, under certain conditions:\n If the --enable-etcd-spec-auto-reconcile flag is set to true, the spec change is automatically picked up and reconciled by etcd-druid. If the --enable-etcd-spec-auto-reconcile flag is unset, or set to false, then etcd-druid will expect an additional annotation gardener.cloud/operation: reconcile on the Etcd resource in order to pick it up for reconciliation. Upon successful reconciliation, this annotation is removed by etcd-druid. The annotation can be added as follows: # Annotate etcd-test CR to reconcile kubectl annotate etcd etcd-test gardener.cloud/operation=\"reconcile\" Cleaning the setup # Delete the cluster make kind-down This cleans up the entire setup as the kind cluster gets deleted. It deletes the created Etcd, all pods that got created along the way and also other resources such as statefulsets, services, PV’s, PVC’s, etc.\n","categories":"","description":"","excerpt":"Etcd-Druid Local Setup This page aims to provide steps on how to setup …","ref":"/docs/other-components/etcd-druid/getting-started-locally/","tags":"","title":"Getting Started Locally"},{"body":"Getting started with etcd-druid using Azurite, and kind This document is a step-by-step guide to run etcd-druid with Azurite, the Azure Blob Storage emulator, within a kind cluster. This setup is ideal for local development and testing.\nPrerequisites Docker with the daemon running, or Docker Desktop running. Azure CLI (\u003e=2.55.0) Environment setup Step 1: Provisioning the kind cluster Execute the command below to provision a kind cluster. 
This command also forwards port 10000 from the kind cluster to your local machine, enabling Azurite access:\nmake kind-up Export the KUBECONFIG file after running the above command.\nStep 2: Deploy Azurite To start up the Azurite emulator in a pod in the kind cluster, run:\nmake deploy-azurite Step 3: Set up an ABS Container To use the Azure CLI with the Azurite emulator running as a pod in the kind cluster, export the connection string for the Azure CLI. export AZURE_STORAGE_CONNECTION_STRING=\"DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;\" Create an Azure Blob Storage Container in Azurite az storage container create -n etcd-bucket Step 4: Deploy etcd-druid make deploy Step 5: Configure the Secret and the Etcd manifests Apply the Kubernetes Secret manifest through: kubectl apply -f config/samples/etcd-secret-azurite.yaml Apply the Etcd manifest through: kubectl apply -f config/samples/druid_v1alpha1_etcd_azurite.yaml Step 6: Make use of the Azurite emulator however you wish etcd-backup-restore will now use Azurite running in kind as the remote store to store snapshots if all the previous steps were followed correctly.\nCleanup make kind-down unset AZURE_STORAGE_CONNECTION_STRING KUBECONFIG ","categories":"","description":"","excerpt":"Getting started with etcd-druid using Azurite, and kind This document …","ref":"/docs/other-components/etcd-druid/getting-started-locally-azurite/","tags":"","title":"Getting Started Locally Azurite"},{"body":"Getting Started with etcd-druid, LocalStack, and Kind This guide provides step-by-step instructions on how to set up etcd-druid with LocalStack and Kind on your local machine. LocalStack emulates AWS services locally, which allows the etcd cluster to interact with AWS S3 without the need for an actual AWS connection. 
This setup is ideal for local development and testing.\nPrerequisites Docker (installed and running) AWS CLI (version \u003e=1.29.0 or \u003e=2.13.0) Environment Setup Step 1: Provision the Kind Cluster Execute the command below to provision a kind cluster. This command also forwards port 4566 from the kind cluster to your local machine, enabling LocalStack access:\nmake kind-up Step 2: Deploy LocalStack Deploy LocalStack onto the Kubernetes cluster using the command below:\nmake deploy-localstack Step 3: Set up an S3 Bucket Set up the AWS CLI to interact with LocalStack by setting the necessary environment variables. This configuration redirects S3 commands to the LocalStack endpoint and provides the required credentials for authentication: export AWS_ENDPOINT_URL_S3=\"http://localhost:4566\" export AWS_ACCESS_KEY_ID=ACCESSKEYAWSUSER export AWS_SECRET_ACCESS_KEY=sEcreTKey export AWS_DEFAULT_REGION=us-east-2 Create an S3 bucket for etcd-druid backup purposes: aws s3api create-bucket --bucket etcd-bucket --region us-east-2 --create-bucket-configuration LocationConstraint=us-east-2 --acl private Step 4: Deploy etcd-druid Deploy etcd-druid onto the Kind cluster using the command below:\nmake deploy Step 5: Configure etcd with LocalStack Store Apply the required Kubernetes manifests to create an etcd custom resource (CR) and a secret for AWS credentials, facilitating LocalStack access:\nexport KUBECONFIG=hack/e2e-test/infrastructure/kind/kubeconfig kubectl apply -f config/samples/druid_v1alpha1_etcd_localstack.yaml -f config/samples/etcd-secret-localstack.yaml Step 6: Reconcile the etcd Initiate etcd reconciliation by annotating the etcd resource with the gardener.cloud/operation=reconcile annotation:\nkubectl annotate etcd etcd-test gardener.cloud/operation=reconcile Congratulations! You have successfully configured etcd-druid, LocalStack, and kind on your local machine. 
Inspect the etcd-druid logs and LocalStack to ensure the setup operates as anticipated.\nTo validate the buckets, execute the following command:\naws s3 ls etcd-bucket/etcd-test/v2/ Cleanup To dismantle the setup, execute the following command:\nmake kind-down unset AWS_ENDPOINT_URL_S3 AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION KUBECONFIG ","categories":"","description":"","excerpt":"Getting Started with etcd-druid, LocalStack, and Kind This guide …","ref":"/docs/other-components/etcd-druid/getting-started-locally-localstack/","tags":"","title":"Getting Started Locally Localstack"},{"body":"Deploying Gardener Locally and Enabling Provider-Extensions This document will walk you through deploying Gardener on your local machine and bootstrapping your own seed clusters on an existing Kubernetes cluster. It is intended for running your local Gardener developments on real infrastructure. To run Gardener entirely locally, please check the getting started locally documentation. If you encounter difficulties, please open an issue so that we can make this process easier.\nOverview Gardener runs in any Kubernetes cluster. In this guide, we will start a KinD cluster which is used as garden cluster. Any Kubernetes cluster could be used as a seed cluster in order to support provider extensions (please refer to the architecture overview). So far, this guide has been tested with Kubernetes clusters provided by Gardener, AWS, Azure, and GCP as seed.\nBased on Skaffold, the container images for all required components will be built and deployed into the clusters (via their Helm charts).\nPrerequisites Make sure that you have followed the Local Setup guide up until the Get the sources step. Make sure your Docker daemon is up-to-date, up and running and has enough resources (at least 8 CPUs and 8Gi memory; see the Docker documentation for how to configure the resources for Docker for Mac). 
Additionally, please configure at least 120Gi of disk size for the Docker daemon. Tip: You can inspect disk usage with docker system df and clean up unused data with docker system prune -a.\n Make sure that you have access to a Kubernetes cluster you can use as a seed cluster in this setup. The seed cluster requires at least 16 CPUs in total to run one shoot cluster. You could use any Kubernetes cluster for your seed cluster. However, using a Gardener shoot cluster for your seed simplifies some configuration steps. When bootstrapping gardenlet to the cluster, your new seed will have the same provider type as the shoot cluster you use - an AWS shoot will become an AWS seed, a GCP shoot will become a GCP seed, etc. (only relevant when using a Gardener shoot as seed). Provide Infrastructure Credentials and Configuration As this setup is running on a real infrastructure, you have to provide credentials for DNS, the infrastructure, and the kubeconfig for the Kubernetes cluster you want to use as seed.\n There are .gitignore entries for all files and directories which include credentials. Nevertheless, please double check and make sure that credentials are not committed to the version control system.\n DNS The Gardener control plane requires DNS for default and internal domains. 
Thus, you have to configure a valid DNS provider for your setup.\nPlease maintain your DNS provider configuration and credentials at ./example/provider-extensions/garden/controlplane/domain-secrets.yaml.\nYou can find a template for the file at ./example/provider-extensions/garden/controlplane/domain-secrets.yaml.tmpl.\nInfrastructure Infrastructure secrets and the corresponding secret bindings should be maintained at:\n ./example/provider-extensions/garden/project/credentials/infrastructure-secrets.yaml ./example/provider-extensions/garden/project/credentials/secretbindings.yaml There are templates with .tmpl suffixes for the files in the same folder.\nProjects The projects and the namespaces associated with them should be maintained at ./example/provider-extensions/garden/project/project.yaml.\nYou can find a template for the file at ./example/provider-extensions/garden/project/project.yaml.tmpl.\nSeed Cluster Preparation The kubeconfig of your Kubernetes cluster you would like to use as seed should be placed at ./example/provider-extensions/seed/kubeconfig. Additionally, please maintain the configuration of your seed in ./example/provider-extensions/gardenlet/values.yaml. It is automatically copied from values.yaml.tmpl in the same directory when you run make gardener-extensions-up for the first time. It also includes explanations of the properties you should set.\nUsing a Gardener Shoot cluster as seed simplifies the process, because some configuration options can be taken from shoot-info and creating DNS entries and TLS certificates is automated.\nHowever, you can use different Kubernetes clusters for your seed too and configure these things manually. Please configure the options of ./example/provider-extensions/gardenlet/values.yaml upfront. 
For configuring DNS and TLS certificates, make gardener-extensions-up, which is explained later, will pause and tell you what to do.\nExternal Controllers You might plan to deploy and register external controllers for networking, operating system, providers, etc. Please put ControllerDeployments and ControllerRegistrations into the ./example/provider-extensions/garden/controllerregistrations directory. The whole content of this folder will be applied to your KinD cluster.\nCloudProfiles There are no demo CloudProfiles yet. Thus, please copy CloudProfiles from another landscape to the ./example/provider-extensions/garden/cloudprofiles directory or create your own CloudProfiles based on the gardener examples. Please check the GitHub repository of your desired provider-extension. Most of them include example CloudProfiles. All files you place in this folder will be applied to your KinD cluster.\nSetting Up the KinD Cluster make kind-extensions-up This command sets up a new KinD cluster named gardener-extensions and stores the kubeconfig in the ./example/gardener-local/kind/extensions/kubeconfig file.\n It might be helpful to copy this file to $HOME/.kube/config, since you will need to target this KinD cluster multiple times. Alternatively, make sure to set your KUBECONFIG environment variable to ./example/gardener-local/kind/extensions/kubeconfig for all future steps via export KUBECONFIG=$PWD/example/gardener-local/kind/extensions/kubeconfig.\n All of the following steps assume that you are using this kubeconfig.\nAdditionally, this command deploys a local container registry to the cluster as well as a few registry mirrors that are set up as a pull-through cache for all upstream registries Gardener uses by default. This is done to speed up image pulls across local clusters. The local registry can be accessed as localhost:5001 for pushing and pulling. The storage directories of the registries are mounted to your machine under dev/local-registry. 
With this, mirrored images don’t have to be pulled again after recreating the cluster.\nThe command also deploys a default calico installation as the cluster’s CNI implementation with NetworkPolicy support (the default kindnet CNI doesn’t provide NetworkPolicy support). Furthermore, it deploys the metrics-server in order to support HPA and VPA on the seed cluster.\nSetting Up Gardener (Garden on KinD, Seed on Gardener Cluster) make gardener-extensions-up This will first prepare the basic configuration of your KinD and Gardener clusters. Afterwards, the images for the Garden cluster are built and deployed into the KinD cluster. Finally, the images for the Seed cluster are built, pushed to a container registry on the Seed, and the gardenlet is started.\nAdding Additional Seeds Additional seed(s) can be added by running\nmake gardener-extensions-up SEED_NAME=\u003cseed-name\u003e The seed cluster preparations are similar to the first seed:\nThe kubeconfig of your Kubernetes cluster you would like to use as seed should be placed at ./example/provider-extensions/seed/kubeconfig-\u003cseed-name\u003e. Additionally, please maintain the configuration of your seed in ./example/provider-extensions/gardenlet/values-\u003cseed-name\u003e.yaml. It is automatically copied from values.yaml.tmpl in the same directory when you run make gardener-extensions-up SEED_NAME=\u003cseed-name\u003e for the first time. It also includes explanations of the properties you should set.\nRemoving a Seed If you have multiple seeds and want to remove one, just use\nmake gardener-extensions-down SEED_NAME=\u003cseed-name\u003e If it is not the last seed, this command will only remove the seed, but leave the local Gardener cluster and the other seeds untouched. 
To remove all seeds and to clean up the local Gardener cluster, you have to run the command for each seed.\nRotate credentials of container image registry in a Seed There is a container image registry in each Seed cluster to which the Gardener images required for the Seed and the Shoot nodes are pushed. This registry is password protected. The password is generated when the Seed is deployed via make gardener-extensions-up. Afterward, it is not rotated automatically, because rotation could break the update of gardener-node-agent: it might not be able to pull its own new image anymore. This is not a general issue of gardener-node-agent, but a limitation of the provider-extensions setup. Gardener does not support protected container images out of the box. The function was added for this scenario only.\nHowever, if you want to rotate the credentials for any reason, there are two options for it.\n run make gardener-extensions-up (to ensure that your images are up-to-date) reconcile all shoots on the seed where you want to rotate the registry password run kubectl delete secrets -n registry registry-password on your seed cluster run make gardener-extensions-up reconcile the shoots again or\n reconcile all shoots on the seed where you want to rotate the registry password run kubectl delete secrets -n registry registry-password on your seed cluster run ./example/provider-extensions/registry-seed/deploy-registry.sh \u003cpath to seed kubeconfig\u003e \u003cseed registry hostname\u003e reconcile the shoots again Pause and Unpause the KinD Cluster The KinD cluster can be paused by stopping and keeping its docker container. 
This can be done by running:\nmake kind-extensions-down When you run make kind-extensions-up again, you will start the docker container with your previous Gardener configuration again.\nThis provides the option to switch off your local KinD cluster fast without leaving orphaned infrastructure elements behind.\nCreating a Shoot Cluster You can wait for the Seed to be ready by running:\nkubectl wait --for=condition=gardenletready seed provider-extensions --timeout=5m make kind-extensions-up already includes such a check. However, running it manually might be useful when you wake up your Seed from hibernation or unpause your KinD cluster.\nAlternatively, you can run kubectl get seed provider-extensions and wait for the STATUS to indicate readiness:\nNAME STATUS PROVIDER REGION AGE VERSION K8S VERSION provider-extensions Ready gcp europe-west1 111m v1.61.0-dev v1.24.7 In order to create a first shoot cluster, please create your own Shoot definition and apply it to your KinD cluster. gardener-scheduler includes the candidateDeterminationStrategy: MinimalDistance configuration, so you are able to schedule Shoots of different providers on your Seed.\nYou can wait for your Shoots to be ready by running kubectl -n garden-local get shoots and waiting for the LAST OPERATION to reach 100%. The output depends on your Shoot definition. 
This is an example output:\nNAME CLOUDPROFILE PROVIDER REGION K8S VERSION HIBERNATION LAST OPERATION STATUS AGE aws aws aws eu-west-1 1.24.3 Awake Create Processing (43%) healthy 84s aws-arm64 aws aws eu-west-1 1.24.3 Awake Create Processing (43%) healthy 65s azure az azure westeurope 1.24.2 Awake Create Processing (43%) healthy 57s gcp gcp gcp europe-west1 1.24.3 Awake Create Processing (43%) healthy 94s Accessing the Shoot Cluster Your shoot clusters will have public DNS entries for their API servers, so that they can be reached over the Internet with kubectl after you have created their kubeconfig.\nWe encourage you to use the adminkubeconfig subresource for accessing your shoot cluster. You can find an example of how to use it in Accessing Shoot Clusters.\nDeleting the Shoot Clusters Before tearing down your environment, you have to delete your shoot clusters. This is highly recommended because otherwise you would leave orphaned items on your infrastructure accounts.\n./hack/usage/delete shoot \u003cyour-shoot\u003e garden-local Tear Down the Gardener Environment Before you delete your local KinD cluster, you should shut down your Shoots and Seed in a clean way to avoid orphaned infrastructure elements in your projects.\nPlease ensure that your KinD and Seed clusters are online (not paused or hibernated) and run:\nmake gardener-extensions-down This will delete all Shoots first (this could take a couple of minutes), then uninstall gardenlet from the Seed and the gardener components from the KinD cluster. 
Finally, the additional components like container registry, etc., are deleted from both clusters.\nWhen this is done, you can safely delete your local KinD cluster by running:\nmake kind-extensions-clean ","categories":"","description":"","excerpt":"Deploying Gardener Locally and Enabling Provider-Extensions This …","ref":"/docs/gardener/deployment/getting_started_locally_with_extensions/","tags":"","title":"Getting Started Locally With Extensions"},{"body":"Disclaimer Be aware that the following sections might be opinionated. Kubernetes, and the GPU support in particular, are rapidly evolving, which means that this guide is likely to be outdated sometime soon. For this reason, contributions are highly appreciated to update this guide.\nCreate a Cluster First things first, let’s create a Kubernetes (K8s) cluster with GPU accelerated nodes. In this example we will use an AWS p2.xlarge EC2 instance because it’s the cheapest available option at the moment. Use such cheap instances for learning to limit your resource costs. 
This costs around 1€/hour per GPU\nInstall NVidia Driver as Daemonset apiVersion: apps/v1 kind: DaemonSet metadata: name: nvidia-driver-installer namespace: kube-system labels: k8s-app: nvidia-driver-installer spec: selector: matchLabels: name: nvidia-driver-installer k8s-app: nvidia-driver-installer template: metadata: labels: name: nvidia-driver-installer k8s-app: nvidia-driver-installer spec: hostPID: true initContainers: - image: squat/modulus:4a1799e7aa0143bcbb70d354bab3e419b1f54972 name: modulus args: - compile - nvidia - \"410.104\" securityContext: privileged: true env: - name: MODULUS_CHROOT value: \"true\" - name: MODULUS_INSTALL value: \"true\" - name: MODULUS_INSTALL_DIR value: /opt/drivers - name: MODULUS_CACHE_DIR value: /opt/modulus/cache - name: MODULUS_LD_ROOT value: /root - name: IGNORE_MISSING_MODULE_SYMVERS value: \"1\" volumeMounts: - name: etc-coreos mountPath: /etc/coreos readOnly: true - name: usr-share-coreos mountPath: /usr/share/coreos readOnly: true - name: ld-root mountPath: /root - name: module-cache mountPath: /opt/modulus/cache - name: module-install-dir-base mountPath: /opt/drivers - name: dev mountPath: /dev containers: - image: \"gcr.io/google-containers/pause:3.1\" name: pause tolerations: - key: \"nvidia.com/gpu\" effect: \"NoSchedule\" operator: \"Exists\" volumes: - name: etc-coreos hostPath: path: /etc/coreos - name: usr-share-coreos hostPath: path: /usr/share/coreos - name: ld-root hostPath: path: / - name: module-cache hostPath: path: /opt/modulus/cache - name: dev hostPath: path: /dev - name: module-install-dir-base hostPath: path: /opt/drivers Install Device Plugin apiVersion: apps/v1 kind: DaemonSet metadata: name: nvidia-gpu-device-plugin namespace: kube-system labels: k8s-app: nvidia-gpu-device-plugin #addonmanager.kubernetes.io/mode: Reconcile spec: selector: matchLabels: k8s-app: nvidia-gpu-device-plugin template: metadata: labels: k8s-app: nvidia-gpu-device-plugin annotations: 
scheduler.alpha.kubernetes.io/critical-pod: '' spec: priorityClassName: system-node-critical volumes: - name: device-plugin hostPath: path: /var/lib/kubelet/device-plugins - name: dev hostPath: path: /dev containers: - image: \"k8s.gcr.io/nvidia-gpu-device-plugin@sha256:08509a36233c5096bb273a492251a9a5ca28558ab36d74007ca2a9d3f0b61e1d\" command: [\"/usr/bin/nvidia-gpu-device-plugin\", \"-logtostderr\", \"-host-path=/opt/drivers/nvidia\"] name: nvidia-gpu-device-plugin resources: requests: cpu: 50m memory: 10Mi limits: cpu: 50m memory: 10Mi securityContext: privileged: true volumeMounts: - name: device-plugin mountPath: /device-plugin - name: dev mountPath: /dev updateStrategy: type: RollingUpdate Test To run an example training on a GPU node, first start a base image with Tensorflow with GPU support \u0026 Keras:\napiVersion: apps/v1 kind: Deployment metadata: name: deeplearning-workbench namespace: default spec: replicas: 1 selector: matchLabels: app: deeplearning-workbench template: metadata: labels: app: deeplearning-workbench spec: containers: - name: deeplearning-workbench image: afritzler/deeplearning-workbench resources: limits: nvidia.com/gpu: 1 tolerations: - key: \"nvidia.com/gpu\" effect: \"NoSchedule\" operator: \"Exists\" Note the tolerations section above is not required if you deploy the ExtendedResourceToleration admission controller to your cluster. You can do this in the kubernetes section of your Gardener cluster shoot.yaml as follows:\n kubernetes: kubeAPIServer: admissionPlugins: - name: ExtendedResourceToleration Now exec into the container and start an example Keras training:\nkubectl exec -it deeplearning-workbench-8676458f5d-p4d2v -- /bin/bash cd /keras/example python imdb_cnn.py Related Links Andreas Fritzler from the Gardener Core team for the R\u0026D, who has provided this setup. 
Build and install NVIDIA driver on CoreOS Nvidia Device Plugin ","categories":"","description":"Setting up a GPU Enabled Cluster for Deep Learning","excerpt":"Setting up a GPU Enabled Cluster for Deep Learning","ref":"/docs/guides/administer-shoots/gpu/","tags":"","title":"GPU Enabled Cluster"},{"body":"GRPC based implementation of Cloud Providers - WIP Goal: Currently the Cloud Providers’ (CP) functionalities ( Create(), Delete(), List() ) are part of the Machine Controller Manager’s (MCM) repository. Because of this, adding support for new CPs into MCM requires merging code into MCM which may not be required for core functionalities of MCM itself. Also, for various reasons it may not be feasible for all CPs to merge their code with MCM which is an Open Source project.\nBecause of these reasons, it was decided that the CPs’ code will be moved out into separate repositories so that it can be maintained separately by the respective teams. The idea is to make MCM act as a GRPC server, and CPs as GRPC clients. The CPs can register themselves with the MCM using a GRPC service exposed by the MCM. Details of this approach are discussed below.\nHow it works: MCM acts as GRPC server and listens on a pre-defined port 5000. It implements the GRPC services below. Details of each of these services are described in the next section.\n Register() GetMachineClass() GetSecret() GRPC services exposed by MCM: Register() rpc Register(stream DriverSide) returns (stream MCMside) {}\nThe CP GRPC client calls this service to register itself with the MCM. The CP passes the kind and the APIVersion which it implements, and MCM maintains an internal map for all the registered clients. A GRPC stream is returned in response which is kept open throughout the life of both the processes. MCM uses this stream to communicate with the client for machine operations: Create(), Delete() or List(). 
The CP client is responsible for reading the incoming messages continuously, and based on the operationType parameter embedded in the message, it is supposed to take the required action. This part is already handled in the package grpc/infraclient. To add a new CP client, import the package, and implement the ExternalDriverProvider interface:\ntype ExternalDriverProvider interface { Create(machineclass *MachineClassMeta, credentials, machineID, machineName string) (string, string, error) Delete(machineclass *MachineClassMeta, credentials, machineID string) error List(machineclass *MachineClassMeta, credentials, machineID string) (map[string]string, error) } GetMachineClass() rpc GetMachineClass(MachineClassMeta) returns (MachineClass) {}\nAs part of the message from MCM for various machine operations, the name of the machine class is sent instead of the full machine class spec. The CP client is expected to use this GRPC service to get the full spec of the machine class. This optionally enables the client to cache the machine class spec, and make the call only if the machine class spec is not already cached.\nGetSecret() rpc GetSecret(SecretMeta) returns (Secret) {}\nAs part of the message from MCM for various machine operations, the Cloud Config (CC) and CP credentials are not sent. The CP client is expected to use this GRPC service to get the secret which has CC and CP’s credentials from MCM. 
This enables the client to cache the CC and credentials, and to make the call only if the data is not already cached.\nHow to add a new Cloud Provider’s support Import the packages grpc/infraclient and grpc/infrapb from MCM (currently in MCM’s “grpc-driver” branch)\n Implement the interface ExternalDriverProvider Create(): Creates a new machine Delete(): Deletes a machine List(): Lists machines Use the interface MachineClassDataProvider GetMachineClass(): Makes the call to MCM to get the machine class spec GetSecret(): Makes the call to MCM to get the secret containing Cloud Config and CP’s credentials Example implementation: Refer to the GRPC based implementation for the AWS client: https://github.com/ggaurav10/aws-driver-grpc\n","categories":"","description":"","excerpt":"GRPC based implementation of Cloud Providers - WIP Goal: Currently the …","ref":"/docs/other-components/machine-controller-manager/proposals/external_providers_grpc/","tags":"","title":"GRPC Based Implementation of Cloud Providers"},{"body":"Gardener Extension for the gVisor Container Runtime Sandbox \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file.\nStatic code checks and tests can be executed by running make verify. 
We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-10 (Additional Container Runtimes) Extensibility API documentation ","categories":"","description":"Gardener extension controller for the gVisor container runtime sandbox","excerpt":"Gardener extension controller for the gVisor container runtime sandbox","ref":"/docs/extensions/container-runtime-extensions/gardener-extension-runtime-gvisor/","tags":"","title":"GVisor container runtime"},{"body":"Health Check Library Goal Typically, an extension reconciles a specific resource (Custom Resource Definitions (CRDs)) and creates / modifies resources in the cluster (via helm, managed resources, kubectl, …). We call these API Objects ‘dependent objects’ - as they are bound to the lifecycle of the extension.\nThe goal of this library is to enable extensions to set up health checks for their ‘dependent objects’ with minimal effort.\nUsage The library provides a generic controller with the ability to register any resource that satisfies the extension object interface. 
An example is the Worker CRD.\nHealth check functions for commonly used dependent objects can be reused and registered with the controller, such as:\n Deployment DaemonSet StatefulSet ManagedResource (Gardener specific) See the below example taken from the provider-aws.\nhealth.DefaultRegisterExtensionForHealthCheck( aws.Type, extensionsv1alpha1.SchemeGroupVersion.WithKind(extensionsv1alpha1.WorkerResource), func() runtime.Object { return \u0026extensionsv1alpha1.Worker{} }, mgr, // controller runtime manager opts, // options for the health check controller nil, // custom predicates map[extensionshealthcheckcontroller.HealthCheck]string{ general.CheckManagedResource(genericactuator.McmShootResourceName): string(gardencorev1beta1.ShootSystemComponentsHealthy), general.CheckSeedDeployment(aws.MachineControllerManagerName): string(gardencorev1beta1.ShootEveryNodeReady), worker.SufficientNodesAvailable(): string(gardencorev1beta1.ShootEveryNodeReady), }) This creates a health check controller that reconciles the extensions.gardener.cloud/v1alpha1.Worker resource with the spec.type ‘aws’. Three health check functions are registered that are executed during reconciliation. Each health check is mapped to a single HealthConditionType that results in conditions with the same condition.type (see below). To contribute to the Shoot’s health, the following conditions can be used: SystemComponentsHealthy, EveryNodeReady, ControlPlaneHealthy, ObservabilityComponentsHealthy. In case of workerless Shoot the EveryNodeReady condition is not present, so it can’t be used.\nThe Gardener/Gardenlet checks each extension for conditions matching these types. However, extensions are free to choose any HealthConditionType. For more information, see Contributing to Shoot Health Status Conditions.\nA health check has to satisfy the below interface. 
You can find implementation examples in the healthcheck folder.\ntype HealthCheck interface { // Check is the function that executes the actual health check Check(context.Context, types.NamespacedName) (*SingleCheckResult, error) // InjectSeedClient injects the seed client InjectSeedClient(client.Client) // InjectShootClient injects the shoot client InjectShootClient(client.Client) // SetLoggerSuffix injects the logger SetLoggerSuffix(string, string) // DeepCopy clones the healthCheck DeepCopy() HealthCheck } The health check controller regularly (default: 30s) reconciles the extension resource and executes the registered health checks for the dependent objects. As a result, the controller writes condition(s) to the status of the extension containing the health check result. In our example, two checks are mapped to ShootEveryNodeReady and one to ShootSystemComponentsHealthy, leading to conditions with two distinct HealthConditionTypes (condition.type):\nstatus: conditions: - lastTransitionTime: \"20XX-10-28T08:17:21Z\" lastUpdateTime: \"20XX-11-28T08:17:21Z\" message: (1/1) Health checks successful reason: HealthCheckSuccessful status: \"True\" type: SystemComponentsHealthy - lastTransitionTime: \"20XX-10-28T08:17:21Z\" lastUpdateTime: \"20XX-11-28T08:17:21Z\" message: (2/2) Health checks successful reason: HealthCheckSuccessful status: \"True\" type: EveryNodeReady Please note that there are four statuses: True, False, Unknown, and Progressing.\n True should be used for successful health checks. False should be used for unsuccessful/failing health checks. Unknown should be used when there was an error trying to determine the health status. Progressing should be used to indicate that the health status did not succeed but for expected reasons (e.g., a cluster scale up/down could make the standard health check fail because something is wrong with the Machines, however, it’s actually an expected situation and known to be completed within a few minutes.) 
Health checks that report Progressing should also provide a timeout, after which this “progressing situation” is expected to be completed. The health check library will automatically transition the status to False if the timeout was exceeded.\nAdditional Considerations It is up to the extension to decide how to conduct health checks, though it is recommended to make use of the built-in health check functionality of managed-resources for trivial checks. By deploying the dependent resources via managed resources, the gardener resource manager conducts basic checks for different API objects out-of-the-box (e.g., Deployments, DaemonSets, …) - and writes health conditions.\nBy default, Gardener performs health checks for all the ManagedResources created in the shoot namespaces. Their status will be aggregated to the Shoot conditions according to the following rules:\n Health checks of ManagedResource with .spec.class=nil are aggregated to the SystemComponentsHealthy condition Health checks of ManagedResource with .spec.class!=nil are aggregated to the ControlPlaneHealthy condition unless the ManagedResource is labeled with care.gardener.cloud/condition-type=\u003cother-condition-type\u003e. In such case, it is aggregated to the \u003cother-condition-type\u003e. More sophisticated health checks should be implemented by the extension controller itself (implementing the HealthCheck interface).\n","categories":"","description":"","excerpt":"Health Check Library Goal Typically, an extension reconciles a …","ref":"/docs/gardener/extensions/healthcheck-library/","tags":"","title":"Healthcheck Library"},{"body":"Heartbeat Controller The heartbeat controller renews a dedicated Lease object named gardener-extension-heartbeat at regular intervals (30 seconds by default). 
This Lease is used for heartbeats similar to how gardenlet uses Lease objects for seed heartbeats (see gardenlet heartbeats).\nThe gardener-extension-heartbeat Lease can be checked by other controllers to verify that the corresponding extension controller is still running. Currently, gardenlet checks this Lease when performing shoot health checks and expects to find the Lease inside the namespace where the extension controller is deployed by the corresponding ControllerInstallation. For each extension resource deployed in the Shoot control plane, gardenlet finds the corresponding gardener-extension-heartbeat Lease resource and checks whether the Lease’s .spec.renewTime is older than the allowed threshold for stale extension health checks - in this case, gardenlet considers the health check report for an extension resource as “outdated” and reflects this in the Shoot status.\n","categories":"","description":"","excerpt":"Heartbeat Controller The heartbeat controller renews a dedicated Lease …","ref":"/docs/gardener/extensions/heartbeat/","tags":"","title":"Heartbeat"},{"body":"High Availability of Deployed Components gardenlets and extension controllers are deploying components via Deployments, StatefulSets, etc., as part of the shoot control plane, or the seed or shoot system components.\nSome of the above component deployments must be further tuned to improve fault tolerance / resilience of the service. This document outlines what needs to be done to achieve this goal.\nPlease jump to the Convenient Application Of These Rules section if you want to take a shortcut to the list of actions that require developers’ attention.\nSeed Clusters The worker nodes of seed clusters can be deployed to one or multiple availability zones. 
The Seed specification allows you to provide the information which zones are available:\nspec: provider: region: europe-1 zones: - europe-1a - europe-1b - europe-1c Independent of the number of zones, seed system components like the gardenlet or the extension controllers themselves, or others like etcd-druid, dependency-watchdog, etc., should always be running with multiple replicas.\nConcretely, all seed system components should respect the following conventions:\n Replica Counts\n Component Type \u003c 3 Zones \u003e= 3 Zones Comment Observability (Monitoring, Logging) 1 1 Downtimes accepted due to cost reasons Controllers 2 2 / (Webhook) Servers 2 2 / Apart from the above, there might be special cases where these rules do not apply, for example:\n istio-ingressgateway is scaled horizontally, hence the above numbers are the minimum values. nginx-ingress-controller in the seed cluster is used to advertise all shoot observability endpoints, so due to performance reasons it runs with 2 replicas at all times. In the future, this component might disappear in favor of the istio-ingressgateway anyways. Topology Spread Constraints\nWhen the component has \u003e= 2 replicas …\n … then it should also have a topologySpreadConstraint, ensuring the replicas are spread over the nodes:\nspec: topologySpreadConstraints: - topologyKey: kubernetes.io/hostname minDomains: 3 # lower value of max replicas or 3 maxSkew: 1 whenUnsatisfiable: ScheduleAnyway matchLabels: ... minDomains is set when failure tolerance is configured or annotation high-availability-config.resources.gardener.cloud/host-spread=\"true\" is given.\n … and the seed cluster has \u003e= 2 zones, then the component should also have a second topologySpreadConstraint, ensuring the replicas are spread over the zones:\nspec: topologySpreadConstraints: - topologyKey: topology.kubernetes.io/zone minDomains: 2 # lower value of max replicas or number of zones maxSkew: 1 whenUnsatisfiable: DoNotSchedule matchLabels: ... 
According to these conventions, even seed clusters with only one availability zone try to be highly available “as good as possible” by spreading the replicas across multiple nodes. Hence, while such seed clusters obviously cannot handle zone outages, they can at least handle node failures.\n Shoot Clusters The Shoot specification allows configuring “high availability” as well as the failure tolerance type for the control plane components, see Highly Available Shoot Control Plane for details.\nRegarding the seed cluster selection, the only constraint is that shoot clusters with failure tolerance type zone are only allowed to run on seed clusters with at least three zones. All other shoot clusters (non-HA or those with failure tolerance type node) can run on seed clusters with any number of zones.\nControl Plane Components All control plane components should respect the following conventions:\n Replica Counts\n Component Type w/o HA w/ HA (node) w/ HA (zone) Comment Observability (Monitoring, Logging) 1 1 1 Downtimes accepted due to cost reasons Controllers 1 2 2 / (Webhook) Servers 2 2 2 / Apart from the above, there might be special cases where these rules do not apply, for example:\n etcd is a server, though the most critical component of a cluster requiring a quorum to survive failures. Hence, it should have 3 replicas even when the failure tolerance is node only. kube-apiserver is scaled horizontally, hence the above numbers are the minimum values (even when the shoot cluster is not HA, there might be multiple replicas). Topology Spread Constraints\nWhen the component has \u003e= 2 replicas …\n … then it should also have a topologySpreadConstraint ensuring the replicas are spread over the nodes:\nspec: topologySpreadConstraints: - maxSkew: 1 topologyKey: kubernetes.io/hostname whenUnsatisfiable: ScheduleAnyway matchLabels: ... 
Hence, the node spread is done on best-effort basis only.\nHowever, if the shoot cluster has defined a failure tolerance type, the whenUnsatisfiable field should be set to DoNotSchedule.\n … and the failure tolerance type of the shoot cluster is zone, then the component should also have a second topologySpreadConstraint ensuring the replicas are spread over the zones:\nspec: topologySpreadConstraints: - maxSkew: 1 minDomains: 2 # lower value of max replicas or number of zones topologyKey: topology.kubernetes.io/zone whenUnsatisfiable: DoNotSchedule matchLabels: ... Node Affinity\nThe gardenlet annotates the shoot namespace in the seed cluster with the high-availability-config.resources.gardener.cloud/zones annotation.\n If the shoot cluster is non-HA or has failure tolerance type node, then the value will be always exactly one zone (e.g., high-availability-config.resources.gardener.cloud/zones=europe-1b). If the shoot cluster has failure tolerance type zone, then the value will always contain exactly three zones (e.g., high-availability-config.resources.gardener.cloud/zones=europe-1a,europe-1b,europe-1c). For backwards-compatibility, this annotation might contain multiple zones for shoot clusters created before gardener/gardener@v1.60 and not having failure tolerance type zone. This is because their volumes might already exist in multiple zones, hence pinning them to only one zone would not work.\nHence, in case this annotation is present, the components should have the following node affinity:\nspec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: topology.kubernetes.io/zone operator: In values: - europe-1a # - ... 
This is to ensure all pods are running in the same (set of) availability zone(s) such that cross-zone network traffic is avoided as much as possible (such traffic is typically charged by the underlying infrastructure provider).\n System Components The availability of system components is independent of the control plane since they run on the shoot worker nodes while the control plane components run on the seed worker nodes (for more information, see the Kubernetes architecture overview). Hence, it only depends on the number of availability zones configured in the shoot worker pools via .spec.provider.workers[].zones. Concretely, the highest number of zones of a worker pool with systemComponents.allow=true is considered.\nAll system components should respect the following conventions:\n Replica Counts\n Component Type 1 or 2 Zones \u003e= 3 Zones Controllers 2 2 (Webhook) Servers 2 2 Apart from the above, there might be special cases where these rules do not apply, for example:\n coredns is scaled horizontally (today), hence the above numbers are the minimum values (possibly, scaling these components vertically may be more appropriate, but that’s unrelated to the HA subject matter). Optional addons like nginx-ingress or kubernetes-dashboard are only provided on best-effort basis for evaluation purposes, hence they run with 1 replica at all times. Topology Spread Constraints\nWhen the component has \u003e= 2 replicas …\n … then it should also have a topologySpreadConstraint ensuring the replicas are spread over the nodes:\nspec: topologySpreadConstraints: - maxSkew: 1 topologyKey: kubernetes.io/hostname whenUnsatisfiable: ScheduleAnyway matchLabels: ... 
Hence, the node spread is done on best-effort basis only.\n … and the cluster has \u003e= 2 zones, then the component should also have a second topologySpreadConstraint ensuring the replicas are spread over the zones:\nspec: topologySpreadConstraints: - maxSkew: 1 minDomains: 2 # lower value of max replicas or number of zones topologyKey: topology.kubernetes.io/zone whenUnsatisfiable: DoNotSchedule matchLabels: ... Convenient Application of These Rules According to the above scenarios and conventions, the replicas, topologySpreadConstraints or affinity settings of the deployed components might need to be adapted.\nIn order to apply those conveniently and easily for developers, Gardener installs a mutating webhook into both seed and shoot clusters which reacts to Deployments and StatefulSets deployed to namespaces with the high-availability-config.resources.gardener.cloud/consider=true label set.\nThe following actions have to be taken by developers:\n Check if components are prepared to run concurrently with multiple replicas, e.g. controllers usually use leader election to achieve this.\n All components should be generally equipped with PodDisruptionBudgets with .spec.maxUnavailable=1 and unhealthyPodEvictionPolicy=AlwaysAllow:\n spec: maxUnavailable: 1 unhealthyPodEvictionPolicy: AlwaysAllow selector: matchLabels: ... 
Add the label high-availability-config.resources.gardener.cloud/type to deployments or statefulsets, as well as optionally involved horizontalpodautoscalers or HVPAs where the following two values are possible: controller server Type server is also preferred if a component is a controller and (webhook) server at the same time.\nYou can read more about the webhook’s internals in High Availability Config.\ngardenlet Internals Make sure you have read the above document about the webhook internals before continuing reading this section.\nSeed Controller The gardenlet performs the following changes on all namespaces running seed system components:\n adds the label high-availability-config.resources.gardener.cloud/consider=true. adds the annotation high-availability-config.resources.gardener.cloud/zones=\u003czones\u003e, where \u003czones\u003e is the list provided in .spec.provider.zones[] in the Seed specification. Note that neither the high-availability-config.resources.gardener.cloud/failure-tolerance-type, nor the high-availability-config.resources.gardener.cloud/zone-pinning annotations are set, hence the node affinity would never be touched by the webhook.\nThe only exception to this rule are the istio ingress gateway namespaces. This includes the default istio ingress gateway when SNI is enabled, as well as analogous namespaces for exposure classes and zone-specific istio ingress gateways. Those namespaces will additionally be annotated with high-availability-config.resources.gardener.cloud/zone-pinning set to true, resulting in the node affinities and the topology spread constraints being set. The replicas are not touched, as the istio ingress gateways are scaled by a horizontal autoscaler instance.\nShoot Controller Control Plane The gardenlet performs the following changes on the namespace running the shoot control plane components:\n adds the label high-availability-config.resources.gardener.cloud/consider=true. 
This makes the webhook mutate the replica count and the topology spread constraints. adds the annotation high-availability-config.resources.gardener.cloud/failure-tolerance-type with value equal to .spec.controlPlane.highAvailability.failureTolerance.type (or \"\", if .spec.controlPlane.highAvailability=nil). This makes the webhook mutate the node affinity according to the specified zone(s). adds the annotation high-availability-config.resources.gardener.cloud/zones=\u003czones\u003e, where \u003czones\u003e is a … … random zone chosen from the .spec.provider.zones[] list in the Seed specification (always only one zone (even if there are multiple available in the seed cluster)) in case the Shoot has no HA setting (i.e., spec.controlPlane.highAvailability=nil) or when the Shoot has HA setting with failure tolerance type node. … list of three randomly chosen zones from the .spec.provider.zones[] list in the Seed specification in case the Shoot has HA setting with failure tolerance type zone. System Components The gardenlet performs the following changes on all namespaces running shoot system components:\n adds the label high-availability-config.resources.gardener.cloud/consider=true. This makes the webhook mutate the replica count and the topology spread constraints. adds the annotation high-availability-config.resources.gardener.cloud/zones=\u003czones\u003e where \u003czones\u003e is the merged list of zones provided in .zones[] with systemComponents.allow=true for all worker pools in .spec.provider.workers[] in the Shoot specification. 
Note that neither the high-availability-config.resources.gardener.cloud/failure-tolerance-type, nor the high-availability-config.resources.gardener.cloud/zone-pinning annotations are set, hence the node affinity would never be touched by the webhook.\n","categories":"","description":"","excerpt":"High Availability of Deployed Components gardenlets and extension …","ref":"/docs/gardener/high-availability/","tags":"","title":"High Availability"},{"body":"Hot-Update VirtualMachine tags without triggering a rolling-update Hot-Update VirtualMachine tags without triggering a rolling-update Motivation Boundary Condition What is available today? What are the problems with the current approach? MachineClass Update and its impact Proposal Shoot YAML changes Provider specific WorkerConfig API changes Gardener provider extension changes Driver interface changes Machine Class reconciliation Reconciliation Changes Motivation MCM Issue#750 There is a requirement to provide a way for consumers to add tags which can be hot-updated onto VMs. This requirement can be generalized to also offer a convenient way to specify tags which can be applied to VMs, NICs, Devices etc.\n MCM Issue#635 which in turn points to MCM-Provider-AWS Issue#36 - The issue hints at other fields like enable/disable source/destination checks for NAT instances which needs to be hot-updated on network interfaces.\n In GCP provider - instance.ServiceAccounts can be updated without the need to roll-over the instance. See\n Boundary Condition All tags that are added via means other than MachineClass.ProviderSpec should be preserved as-is. Only updates done to tags in MachineClass.ProviderSpec should be applied to the infra resources (VM/NIC/Disk).\nWhat is available today? WorkerPool configuration inside shootYaml provides a way to set labels. As per the definition these labels will be applied on Node resources. Currently these labels are also passed to the VMs as tags. 
There is no distinction made between Node labels and VM tags.\nMachineClass has a field which holds provider specific configuration and one such configuration is tags. Gardener provider extensions update the tags in MachineClass.\n AWS provider extension directly passes the labels to the tag section of machineClass. Azure provider extension sanitizes the worker pool labels and adds them as tags in MachineClass. GCP provider extension sanitizes them, and then sets them as labels in the MachineClass. In GCP tags only have keys and are currently hard coded. Let us look at an example of MachineClass.ProviderSpec in AWS:\nproviderSpec: ami: ami-02fe00c0afb75bbd3 tags: #[section-1] pool labels added by gardener extension ######################################################### kubernetes.io/arch: amd64 networking.gardener.cloud/node-local-dns-enabled: \"true\" node.kubernetes.io/role: node worker.garden.sapcloud.io/group: worker-ser234 worker.gardener.cloud/cri-name: containerd worker.gardener.cloud/pool: worker-ser234 worker.gardener.cloud/system-components: \"true\" #[section-2] Tags defined in the gardener-extension-provider-aws ########################################################### kubernetes.io/cluster/cluster-full-name: \"1\" kubernetes.io/role/node: \"1\" #[section-3] ########################################################### user-defined-key1: user-defined-val1 user-defined-key2: user-defined-val2 Refer src for tags defined in section-1. Refer src for tags defined in section-2. 
Tags in section-3 are defined by the user.\n Out of the above three tag categories, MCM depends on section-2 tags (mandatory-tags) for its orphan collection and Driver’s DeleteMachine and GetMachineStatus to work.\nProviderSpec.Tags are transported to the provider specific resources as follows:\n Provider Resources Tags are set on Code Reference Comment AWS Instance(VM), Volume, Network-Interface aws-VM-Vol-NIC No distinction is made between tags set on VM, NIC or Volume Azure Instance(VM), Network-Interface azure-VM-parameters \u0026 azureNIC-Parameters GCP Instance(VM), 1 tag: name (denoting the name of the worker) is added to Disk gcp-VM \u0026 gcp-Disk In GCP key-value pairs are called labels while network tags have only keys AliCloud Instance(VM) aliCloud-VM What are the problems with the current approach? There are a few shortcomings in the way tags/labels are handled:\n Tags can only be set at the time a machine is created. There is no distinction made amongst tags/labels that are added to VMs, disks or network interfaces. As stated above, for AWS the same set of tags is added to all. There is a limit defined on the number of tags/labels that can be associated to the devices (disks, VMs, NICs etc). Example: In AWS a max of 50 user created tags are allowed. Similar restrictions are applied on different resources across providers. Therefore, adding all tags to all devices, even when a subset of the tags is not meant for that resource, exhausts the total allowed tags/labels for that resource. The only placeholder in shoot yaml as mentioned above is meant to only hold labels that should be applied primarily on the Node objects. So while you could use the node labels for extended resources, using it also for tags is not clean. There is no provision in the shoot YAML today to add tags only to a subset of resources. MachineClass Update and its impact When Worker.ProviderConfig is changed then a worker-hash is computed which includes the raw ProviderConfig. 
This hash value is then used as a suffix when constructing the name for a MachineClass. See aws-extension-provider as an example. A change in the name of the MachineClass will then in-turn trigger a rolling update of machines. Since tags are provider specific and therefore will be part of ProviderConfig, any update to them will result in a rolling-update of machines.\nProposal Shoot YAML changes Provider specific configuration is set via providerConfig section for each worker pool.\nExample worker provider config (current):\nproviderConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: iops: 10000 dataVolumes: - name: kubelet-dir snapshotID: snap-13234 iamInstanceProfile: # (specify either ARN or name) name: my-profile arn: my-instance-profile-arn It is proposed that an additional field be added for tags under providerConfig. Proposed changed YAML:\nproviderConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: iops: 10000 dataVolumes: - name: kubelet-dir snapshotID: snap-13234 iamInstanceProfile: # (specify either ARN or name) name: my-profile arn: my-instance-profile-arn tags: vm: key1: val1 key2: val2 .. # for GCP network tags are just keys (there is no value associated to them). # What is shown below will work for AWS provider. network: key3: val3 key4: val4 Under tags clear distinction is made between tags for VMs, Disks, network interface etc. Each provider has a different allowed-set of characters that it accepts as key names, has different limits on the tags that can be set on a resource (disk, NIC, VM etc.) and also has a different format (GCP network tags are only keys).\n TODO:\n Check if worker.labels are getting added as tags on infra resources. 
We should continue to support it and double check that these should only be added to VMs and not to other resources.\n Should we support users adding VM tags as node labels?\n Provider specific WorkerConfig API changes Taking AWS provider extension as an example to show the changes.\n WorkerConfig will now have the following changes:\n A new field for tags will be introduced. Additional metadata for struct fields will now be added via struct tags. type WorkerConfig struct { metav1.TypeMeta Volume *Volume // .. all fields are not mentioned here. // Tags are a collection of tags to be set on provider resources (e.g. VMs, Disks, Network Interfaces etc.) Tags *Tags `hotupdatable:\"true\"` } // Tags is a placeholder for all tags that can be set/updated on VMs, Disks and Network Interfaces. type Tags struct { // VM tags set on the VM instances. VM map[string]string // Network tags set on the network interfaces. Network map[string]string // Disk tags set on the volumes/disks. Disk map[string]string } There is a need to distinguish fields within ProviderSpec (which is then mapped to the above WorkerConfig) which can be updated without the need to change the hash suffix for MachineClass and thus trigger a rolling update on machines.\nTo achieve that we propose to use struct tag hotupdatable whose value indicates if the field can be updated without the need to do a rolling update. To ensure backward compatibility, all fields which do not have this tag or have hotupdatable set to false will be considered as immutable and will require a rolling update to take effect.\nGardener provider extension changes Taking AWS provider extension as an example. The following changes should be made to all gardener provider extensions:\n AWS Gardener Extension generates machine config using worker pool configuration. 
As part of that it also computes the workerPoolHash which is then used to create the name of the MachineClass.\nCurrently WorkerPoolHash function uses the entire providerConfig to compute the hash. Proposal is to do the following:\n Remove the code from function WorkerPoolHash. Add another function to compute hash using all immutable fields in the provider config struct and then pass that to worker.WorkerPoolHash as additionalData. The above will ensure that tags and any other field in WorkerConfig which is marked as hotupdatable is not considered for hash computation and will therefore not contribute to changing the name of MachineClass object thus preventing a rolling update.\nWorkerConfig and therefore the contained tags will be set as ProviderSpec in MachineClass.\nIf only fields which are marked as hotupdatable are changed then it should result in update/patch of MachineClass and not creation.\nDriver interface changes Driver interface which is a facade to provider specific API implementations will have one additional method.\ntype Driver interface { // .. existing methods are not mentioned here for brevity. UpdateMachine(context.Context, *UpdateMachineRequest) error } // UpdateMachineRequest is the request to update machine tags. type UpdateMachineRequest struct { ProviderID string LastAppliedProviderSpec raw.Extension MachineClass *v1alpha1.MachineClass Secret *corev1.Secret } If any machine-controller-manager-provider-\u003cprovidername\u003e has not implemented UpdateMachine then updates of tags on Instances/NICs/Disks will not be done. An error message will be logged instead.\n Machine Class reconciliation Current MachineClass reconciliation does not reconcile MachineClass resource updates but it only enqueues associated machines. The reason is that it is assumed that anything that is changed in a MachineClass will result in a creation of a new MachineClass with a different name. 
This will result in a rolling update of all machines using the MachineClass as a template.\nHowever, it is possible that there is data that all machines in a MachineSet share which does not require a rolling update (e.g. tags), therefore there is a need to reconcile the MachineClass as well.\nReconciliation Changes In order to ensure that machines get updated eventually with changes to the hot-updatable fields defined in the MachineClass.ProviderConfig as raw.Extension, we should only fix MCM Issue#751 in the MachineClass reconciliation and let it enqueue the machines as it does today. We additionally propose the following two things:\n Introduce a new annotation last-applied-providerspec on every machine resource. This will capture the last successfully applied MachineClass.ProviderSpec on this instance.\n Enhance the machine reconciliation to include code to hot-update machine.\n In machine-reconciliation there are currently two flows triggerDeletionFlow and triggerCreationFlow. When a machine gets enqueued due to changes in MachineClass then in this method the following changes need to be introduced:\nCheck if the machine has last-applied-providerspec annotation.\nCase 1.1\nIf the annotation is not present then there can be just 2 possibilities:\n It is a fresh/new machine and no backing resources (VM/NIC/Disk) exist yet. The current flow checks whether the providerID and Status.CurrentStatus.Phase are empty and, if so, enters the triggerCreationFlow.\n It is an existing machine which does not yet have this annotation. In this case call Driver.UpdateMachine. If the driver returns no error then add last-applied-providerspec annotation with the value of MachineClass.ProviderSpec to this machine.\n Case 1.2\nIf the annotation is present then compare the last applied provider-spec with the current provider-spec. If there are changes (check their hash values) then call Driver.UpdateMachine. 
If the driver returns no error then add last-applied-providerspec annotation with the value of MachineClass.ProviderSpec to this machine.\n NOTE: It is assumed that if there are changes to the fields which are not marked as hotupdatable then it will result in the change of name for MachineClass resulting in a rolling update of machines. If the name has not changed + machine is enqueued + there is a change in machine-class then it must be a change to hotupdatable fields in the spec.\n Trigger update flow can be done after reconcileMachineHealth and syncMachineNodeTemplates in machine-reconciliation.\nThere are 2 edge cases that need attention and special handling:\n Premise: It is identified that there is an update done to one or more hotupdatable fields in the MachineClass.ProviderSpec.\n Edge-Case-1\nIn the machine reconciliation, an update-machine-flow is triggered which in-turn calls Driver.UpdateMachine. Consider the case where the hot update needs to be done to all VM, NIC and Disk resources. The driver returns an error which indicates a partial-failure. As we have mentioned above only when Driver.UpdateMachine returns no error will last-applied-providerspec be updated. In case of partial failure the annotation will not be updated. This event will be re-queued for a re-attempt. However, consider a case where before the item is re-queued, another update is done to MachineClass reverting back the changes to the original spec.\n At T1: last-applied-providerspec=S1, MachineClass.ProviderSpec = S1. At T2 (T2 \u003e T1): last-applied-providerspec=S1, MachineClass.ProviderSpec = S2; another update to MachineClass.ProviderConfig = S3 is enqueued (S3 == S1). At T3 (T3 \u003e T2): last-applied-providerspec=S1; Driver.UpdateMachine for the S1-S2 update returns a partial failure; the machine key is requeued. At T4 (T4 \u003e T3), when the machine is reconciled, it checks that last-applied-providerspec is S1 and the current MachineClass.ProviderSpec is S3, and since S3 is the same as S1, no update is done. 
At T2 Driver.UpdateMachine was called to update the machine with S2 but it partially failed. So now you will have resources which are partially updated with S2 and no further updates will be attempted.\nEdge-Case-2\nThe above situation can also happen when Driver.UpdateMachine is in the process of updating resources. It has hot-updated, let’s say, 1 resource. But now MCM crashes. By the time it comes up, another update to MachineClass.ProviderSpec is done, essentially reverting back the previous change (same case as above). In this case the reconciliation loop never got a chance to get any response from the driver.\nTo handle the above edge cases there are 2 options:\nOption #1\nIntroduce a new annotation inflight-providerspec-hash. The value of this annotation will be the hash value of the MachineClass.ProviderSpec that is in the process of getting applied on this machine. The machine will be updated with this annotation just before calling Driver.UpdateMachine (in the trigger-update-machine-flow). If the driver returns no error then (in a single update):\n last-applied-providerspec will be updated\n inflight-providerspec-hash annotation will be removed.\n Option #2 - Preferred\nLeverage Machine.Status.LastOperation with Type set to MachineOperationUpdate and State set to MachineStateProcessing. This status will be updated just before calling Driver.UpdateMachine.\nSemantically LastOperation captures the details of the operation post-operation and not pre-operation. 
So this solution would be a divergence from the norm.\n","categories":"","description":"","excerpt":"Hot-Update VirtualMachine tags without triggering a rolling-update …","ref":"/docs/other-components/machine-controller-manager/proposals/hotupdate-instances/","tags":"","title":"Hotupdate Instances"},{"body":"There are two ways to get the health information of a shoot API server.\n Try to reach the public endpoint of the shoot API server via \"https://api.\u003cshoot-name\u003e.\u003cproject-name\u003e.shoot.\u003ccanary|office|live\u003e.k8s-hana.ondemand.com/healthz\" The endpoint is secured, therefore you need to authenticate via basic auth or client cert. Both are available in the admin kubeconfig of the shoot cluster. Note that with those credentials you have full (admin) access to the cluster, therefore it is highly recommended to create custom credentials with some RBAC rules and bindings which only allow access to the /healthz endpoint.\n Fetch the shoot resource of your cluster via the programmatic API of the Gardener and get the availability information from the status. You need a kubeconfig for the Garden cluster, which you can get via the Gardener dashboard. Then you could fetch your shoot resource and query for the availability information via: kubectl get shoot \u003cshoot-name\u003e -o json | jq -r '.status.conditions[] | select(.type==\"APIServerAvailable\")' The availability information in the second scenario is collected by the Gardener. 
If you want to collect the information independently from Gardener, you should choose the first scenario.\nIf you want to achieve a simple pull monitor in the AvS for a shoot cluster, you also need to use the first scenario, because with it you have a stable endpoint for the API server which you can query.\n","categories":"","description":"","excerpt":"There are two ways to get the health information of a shoot API …","ref":"/docs/faq/clusterhealthz/","tags":"","title":"How can you get the status of a shoot API server?"},{"body":"Configuration of Multi-AZ worker pools depends on the infrastructure.\nThe zone distribution for the worker pools can be configured generically across all infrastructures. You can find provider-specific details in the InfrastructureConfig section of each extension provider repository:\n AWS (a VPC with a subnet is required in each zone you want to support) GCP Azure AliCloud OpenStack ","categories":"","description":"","excerpt":"Configuration of Multi-AZ worker pools depends on the infrastructure. …","ref":"/docs/faq/configure-worker-pools/","tags":"","title":"How do you configure Multi-AZ worker pools for different extensions?"},{"body":"End-users must provide credentials such that Gardener and Kubernetes controllers can communicate with the respective cloud provider APIs in order to perform infrastructure operations. 
These credentials should be regularly rotated.\nHow to do so is explained in Shoot Credentials Rotation.\n","categories":"","description":"","excerpt":"End-users must provide credentials such that Gardener and Kubernetes …","ref":"/docs/faq/rotate-iaas-keys/","tags":"","title":"How do you rotate IaaS keys for a running cluster?"},{"body":"Adding a Feature Gate In order to add a feature gate, add it as enabled to the appropriate section of the shoot.yaml file:\nSectionName: featureGates: SomeKubernetesFeature: true The available sections are kubelet, kubernetes, kubeAPIServer, kubeControllerManager, kubeScheduler, and kubeProxy.\nFor more details, see the example shoot.yaml file.\nWhat is the expected downtime when updating the shoot.yaml? No downtime is expected after executing a shoot.yaml update.\n","categories":"","description":"","excerpt":"Adding a Feature Gate In order to add a feature gate, add it as …","ref":"/docs/faq/add-feature-gates/","tags":"","title":"How to add K8S feature gates to my shoot cluster?"},{"body":"Introduction Kubernetes offers powerful options to get more details about startup or runtime failures of pods as e.g. described in Application Introspection and Debugging or Debug Pods and Replication Controllers.\nIn order to identify pods with potential issues, you could, e.g., run kubectl get pods --all-namespaces | grep -iv Running to list only the pods which are not in the state Running. One frequent error state is CrashLoopBackOff, which indicates that a pod crashes right after it starts. Kubernetes then tries to restart the pod, but often the pod startup fails again.\nHere is a short list of possible reasons which might lead to a pod crash:\n Error during image pull caused by e.g. wrong/missing secrets or wrong/missing image The app runs in an error state caused e.g. 
by missing environmental variables (ConfigMaps) or secrets Liveness probe failed Too high resource consumption (memory and/or CPU) or too strict quota settings Persistent volumes can’t be created/mounted The container image is not updated Basically, the commands kubectl logs ... and kubectl describe ... with different parameters are used to get more detailed information. By calling e.g. kubectl logs --help you can get more detailed information about the command and its parameters.\nIn the next sections you’ll find some basic approaches to get an idea of what went wrong.\nRemarks:\n Even if the pods seem to be running, as the status Running indicates, a high counter of the Restarts shows potential problems You can get a good overview of the troubleshooting process with the interactive tutorial Troubleshooting with Kubectl, which explains basic debugging activities The examples below are deployed into the namespace default. In case you want to change it, use the optional parameter --namespace \u003cyour-namespace\u003e to select the target namespace. The examples require a Kubernetes release ≥ 1.8. Prerequisites Your deployment was successful (no logical/syntactical errors in the manifest files), but the pod(s) aren’t running.\nError Caused by Wrong Image Name Start by running kubectl describe pod \u003cyour-pod\u003e --namespace \u003cyour-namespace\u003e to get detailed information about the pod startup.\nIn the Events section, you should get an error message like Failed to pull image ... and Reason: Failed. The pod is in state ImagePullBackOff.\nThe example below is based on a demo in the Kubernetes documentation. 
In all examples, the default namespace is used.\nFirst, perform a cleanup with:\nkubectl delete pod termination-demo\nNext, create a resource based on the yaml content below:\napiVersion: v1 kind: Pod metadata: name: termination-demo spec: containers: - name: termination-demo-container image: debiann command: [\"/bin/sh\"] args: [\"-c\", \"sleep 10 \u0026\u0026 echo Sleep expired \u003e /dev/termination-log\"] kubectl describe pod termination-demo lists in the Event section the content\nEvents: FirstSeen\tLastSeen\tCount\tFrom\tSubObjectPath\tType\tReason\tMessage ---------\t--------\t-----\t----\t-------------\t--------\t------\t------- 2m\t2m\t1\tdefault-scheduler\tNormal\tScheduled\tSuccessfully assigned termination-demo to ip-10-250-17-112.eu-west-1.compute.internal 2m\t2m\t1\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tNormal\tSuccessfulMountVolume\tMountVolume.SetUp succeeded for volume \"default-token-sgccm\" 2m\t1m\t4\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tPulling\tpulling image \"debiann\" 2m\t1m\t4\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tWarning\tFailed\tFailed to pull image \"debiann\": rpc error: code = Unknown desc = Error: image library/debiann:latest not found 2m\t54s\t10\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tWarning\tFailedSync\tError syncing pod 2m\t54s\t6\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tBackOff\tBack-off pulling image \"debiann\" The error message with Reason: Failed tells you that there is an error during pulling the image. 
A closer look at the image name indicates a misspelling.\nThe App Runs in an Error State Caused, e.g., by Missing Environmental Variables (ConfigMaps) or Secrets This example illustrates the behavior in the case when the app expects environment variables but the corresponding Kubernetes artifacts are missing.\nFirst, perform a cleanup with:\nkubectl delete deployment termination-demo kubectl delete configmaps app-env Next, deploy the following manifest:\napiVersion: apps/v1beta2 kind: Deployment metadata: name: termination-demo labels: app: termination-demo spec: replicas: 1 selector: matchLabels: app: termination-demo template: metadata: labels: app: termination-demo spec: containers: - name: termination-demo-container image: debian command: [\"/bin/sh\"] args: [\"-c\", \"sed \\\"s/foo/bar/\\\" \u003c $MYFILE\"] Now, the command kubectl get pods lists the pod termination-demo-xxx in the state Error or CrashLoopBackOff. The command kubectl describe pod termination-demo-xxx tells you that there is no error during startup but gives no clue about what caused the crash.\nEvents: FirstSeen\tLastSeen\tCount\tFrom\tSubObjectPath\tType\tReason\tMessage ---------\t--------\t-----\t----\t-------------\t--------\t------\t------- 19m\t19m\t1\tdefault-scheduler\tNormal\tScheduled\tSuccessfully assigned termination-demo-5fb484867d-xz2x9 to ip-10-250-17-112.eu-west-1.compute.internal 19m\t19m\t1\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tNormal\tSuccessfulMountVolume\tMountVolume.SetUp succeeded for volume \"default-token-sgccm\" 19m\t19m\t4\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tPulling\tpulling image \"debian\" 19m\t19m\t4\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tPulled\tSuccessfully pulled image \"debian\" 19m\t19m\t4\tkubelet, 
ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tCreated\tCreated container 19m\t19m\t4\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tStarted\tStarted container 19m\t14m\t24\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tWarning\tBackOff\tBack-off restarting failed container 19m\t4m\t69\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tWarning\tFailedSync\tError syncing pod The command kubectl logs termination-demo-xxx gives access to the output the application writes to stderr and stdout. In this case, you should get an output similar to:\n/bin/sh: 1: cannot open : No such file So you need to have a closer look at the application. In this case, the environmental variable MYFILE is missing. To fix this issue, you could e.g. add a ConfigMap to your deployment as is shown in the manifest listed below:\napiVersion: v1 kind: ConfigMap metadata: name: app-env data: MYFILE: \"/etc/profile\" --- apiVersion: apps/v1beta2 kind: Deployment metadata: name: termination-demo labels: app: termination-demo spec: replicas: 1 selector: matchLabels: app: termination-demo template: metadata: labels: app: termination-demo spec: containers: - name: termination-demo-container image: debian command: [\"/bin/sh\"] args: [\"-c\", \"sed \\\"s/foo/bar/\\\" \u003c $MYFILE\"] envFrom: - configMapRef: name: app-env Note that once you fix the error and re-run the scenario, you might still see the pod in a CrashLoopBackOff status. This is because the container finishes the command sed ... and runs to completion. 
In order to keep the container in a Running status, a long running task is required, e.g.:\napiVersion: v1 kind: ConfigMap metadata: name: app-env data: MYFILE: \"/etc/profile\" SLEEP: \"5\" --- apiVersion: apps/v1beta2 kind: Deployment metadata: name: termination-demo labels: app: termination-demo spec: replicas: 1 selector: matchLabels: app: termination-demo template: metadata: labels: app: termination-demo spec: containers: - name: termination-demo-container image: debian command: [\"/bin/sh\"] # args: [\"-c\", \"sed \\\"s/foo/bar/\\\" \u003c $MYFILE\"] args: [\"-c\", \"while true; do sleep $SLEEP; echo sleeping; done;\"] envFrom: - configMapRef: name: app-env Too High Resource Consumption (Memory and/or CPU) or Too Strict Quota Settings You can optionally specify the amount of memory and/or CPU your container gets during runtime. In case these settings are missing, the default requests settings are taken: CPU: 0m (in Milli CPU) and RAM: 0Gi, which indicate no other limits other than the ones of the node(s) itself. For more details, e.g. about how to configure limits, see Configure Default Memory Requests and Limits for a Namespace.\nIn case your application needs more resources, Kubernetes distinguishes between requests and limit settings: requests specify the guaranteed amount of resource, whereas limit tells Kubernetes the maximum amount of resource the container might need. Mathematically, both settings could be described by the relation 0 \u003c= requests \u003c= limit. For both settings you need to consider the total amount of resources your nodes provide. For a detailed description of the concept, see Resource Quality of Service in Kubernetes.\nUse kubectl describe nodes to get a first overview of the resource consumption in your cluster. 
Of special interest are the figures indicating the amount of CPU and Memory Requests at the bottom of the output.\nThe next example demonstrates what happens when the CPU request is too high to be satisfied by your cluster.\nFirst, perform a cleanup with:\nkubectl delete deployment termination-demo kubectl delete configmaps app-env Next, adapt the cpu value in the YAML below to be slightly higher than the remaining CPU resources in your cluster and deploy this manifest. In this example, 600m (milli CPUs) are requested in a Kubernetes system with a single 2 core worker node which results in an error message.\napiVersion: apps/v1 kind: Deployment metadata: name: termination-demo labels: app: termination-demo spec: replicas: 1 selector: matchLabels: app: termination-demo template: metadata: labels: app: termination-demo spec: containers: - name: termination-demo-container image: debian command: [\"/bin/sh\"] args: [\"-c\", \"sleep 10 \u0026\u0026 echo Sleep expired \u003e /dev/termination-log\"] resources: requests: cpu: \"600m\" The command kubectl get pods lists the pod termination-demo-xxx in the state Pending. More details on why this happens can be found by using the command kubectl describe pod termination-demo-xxx:\n$ kubectl describe po termination-demo-fdb7bb7d9-mzvfw Name: termination-demo-fdb7bb7d9-mzvfw Namespace: default ... Containers: termination-demo-container: Image: debian Port: \u003cnone\u003e Host Port: \u003cnone\u003e Command: /bin/sh Args: -c sleep 10 \u0026\u0026 echo Sleep expired \u003e /dev/termination-log Requests: cpu: 6 Environment: \u003cnone\u003e Mounts: /var/run/secrets/kubernetes.io/serviceaccount from default-token-t549m (ro) Conditions: Type Status PodScheduled False Events: Type Reason Age From Message ---- ------ ---- ---- ------- Warning FailedScheduling 9s (x7 over 40s) default-scheduler 0/2 nodes are available: 2 Insufficient cpu. 
You can find more details in:\n Managing Compute Resources for Containers Resource Quality of Service in Kubernetes Remarks:\n This example works similarly when specifying a too high request for memory In case you configured an autoscaler range when creating your Kubernetes cluster, another worker node will be spun up automatically if you didn’t reach the maximum number of worker nodes In case your app is running out of memory (the memory settings are too small), you will typically find an OOMKilled (Out Of Memory) message in the Events section of the kubectl describe pod ... output The Container Image Is Not Updated You applied a fix in your app, created a new container image and pushed it into your container repository. After redeploying your Kubernetes manifests, you expected to get the updated app, but the same bug is still present in the new deployment.\nThis behavior is related to how Kubernetes decides whether to pull a new container image or to use the cached one.\nIn case you didn’t change the image tag, the default image policy IfNotPresent tells Kubernetes to use the cached image (see Images).\nAs a best practice, you should not use the tag latest and change the image tag in case you changed anything in your image (see Configuration Best Practices).\nFor more information, see Container Image Not Updating.\nRelated Links Application Introspection and Debugging Debug Pods and Replication Controllers Logging Architecture Configure Default Memory Requests and Limits for a Namespace Managing Compute Resources for Containers Resource Quality of Service in Kubernetes Interactive Tutorial Troubleshooting with Kubectl Images Kubernetes Best Practices ","categories":"","description":"Your pod doesn't run as expected. Are there any log files? Where? How could I debug a pod?","excerpt":"Your pod doesn't run as expected. Are there any log files? Where? 
How …","ref":"/docs/guides/monitoring-and-troubleshooting/debug-a-pod/","tags":"","title":"How to Debug a Pod"},{"body":"How to provide credentials for upstream registry? In Kubernetes, to pull images from private container image registries you either have to specify an image pull Secret (see Pull an Image from a Private Registry) or you have to configure the kubelet to dynamically retrieve credentials using a credential provider plugin (see Configure a kubelet image credential provider). When pulling an image, the kubelet provides the credentials to the CRI implementation. The CRI implementation uses the provided credentials against the upstream registry to pull the image.\nThe registry-cache extension uses the Distribution project as its pull through cache implementation. The Distribution project does not use the provided credentials from the CRI implementation while fetching an image from the upstream. Hence, the above-described scenarios such as configuring image pull Secret for a Pod or configuring kubelet credential provider plugins don’t work out of the box with the pull through cache provided by the registry-cache extension. Instead, the Distribution project supports configuring only one set of credentials for a given pull through cache instance (for a given upstream).\nThis document describes how to supply credentials for the private upstream registry in order to pull private images with the registry cache.\nHow to configure the registry cache to use upstream registry credentials? Create an immutable Secret with the upstream registry credentials in the Garden cluster:\nkubectl create -f - \u003c\u003cEOF apiVersion: v1 kind: Secret metadata: name: ro-docker-secret-v1 namespace: garden-dev type: Opaque immutable: true data: username: $(echo -n $USERNAME | base64 -w0) password: $(echo -n $PASSWORD | base64 -w0) EOF For Artifact Registry, the username is _json_key and the password is the service account key in JSON format. 
To base64 encode the service account key, copy it and run:\necho -n $SERVICE_ACCOUNT_KEY_JSON | base64 -w0 Add the newly created Secret as a reference to the Shoot spec, and then to the registry-cache extension configuration.\nIn the registry-cache configuration, set the secretReferenceName field. It should point to a resource reference under spec.resources. The resource reference itself points to the Secret in project namespace.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot # ... spec: extensions: - type: registry-cache providerConfig: apiVersion: registry.extensions.gardener.cloud/v1alpha3 kind: RegistryConfig caches: - upstream: docker.io secretReferenceName: docker-secret # ... resources: - name: docker-secret resourceRef: apiVersion: v1 kind: Secret name: ro-docker-secret-v1 # ... [!WARNING] Do not delete the referenced Secret when there is a Shoot still using it.\n How to rotate the registry credentials? To rotate registry credentials perform the following steps:\n Generate a new pair of credentials in the cloud provider account. Do not invalidate the old ones. Create a new Secret (e.g., ro-docker-secret-v2) with the newly generated credentials as described in step 1. in How to configure the registry cache to use upstream registry credentials?. Update the Shoot spec with newly created Secret as described in step 2. in How to configure the registry cache to use upstream registry credentials?. The above step will trigger a Shoot reconciliation. Wait for it to complete. Make sure that the old Secret is no longer referenced by any Shoot cluster. Finally, delete the Secret containing the old credentials (e.g., ro-docker-secret-v1). Delete the corresponding old credentials from the cloud provider account. Possible Pitfalls The registry cache is not protected by any authentication/authorization mechanism. The cached images (incl. private images) can be fetched from the registry cache without authentication/authorization. 
Note that the registry cache itself is not exposed publicly. The registry cache provides the credentials for every request against the corresponding upstream. In some cases, misconfigured credentials can prevent the registry cache from pulling even public images from the upstream (for example: invalid service account key for Artifact Registry). However, this behaviour is controlled by the server-side logic of the upstream registry. Do not remove the image pull Secrets when configuring credentials for the registry cache. When the registry-cache is not available, containerd falls back to the upstream registry. containerd still needs the image pull Secret to pull the image so that the fallback mechanism keeps working. ","categories":"","description":"","excerpt":"How to provide credentials for upstream registry? In Kubernetes, to …","ref":"/docs/extensions/others/gardener-extension-registry-cache/registry-cache/upstream-credentials/","tags":"","title":"How to provide credentials for upstream registry?"},{"body":"Image Vector The Gardener components are deploying several different container images into the garden, seed, and the shoot clusters. The image repositories and tags are defined in a central image vector file. 
Obviously, the image versions defined there must fit together with the deployment manifests (e.g., some command-line flags do only exist in certain versions).\nExample images: - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile repository: registry.k8s.io/pause tag: \"3.4\" targetVersion: \"1.20.x\" architectures: - amd64 - arm64 - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile ref: registry.k8s.io/pause:3.5 targetVersion: \"\u003e= 1.21\" architectures: - amd64 - arm64 That means that Gardener will use the pause-container with tag 3.4 for all clusters with Kubernetes version 1.20.x, and the image with ref registry.k8s.io/pause:3.5 for all clusters with Kubernetes \u003e= 1.21.\n [!NOTE] As you can see, it is possible to provide the full image reference via the ref field. Another option is to use the repository and tag fields. tag may also be a digest only (starting with sha256:...), or it can contain both tag and digest (v1.2.3@sha256:...).\n Architectures images: - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile repository: registry.k8s.io/pause tag: \"3.5\" architectures: - amd64 - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile ref: registry.k8s.io/pause:3.5 architectures: - arm64 - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile ref: registry.k8s.io/pause:3.5 architectures: - amd64 - arm64 architectures is an optional field of image. It is a list of strings specifying CPU architecture of machines on which this image can be used. The valid options for the architectures field are as follows:\n amd64 : This specifies that the image can run only on machines having CPU architecture amd64. 
arm64 : This specifies that the image can run only on machines having CPU architecture arm64. If an image doesn’t specify any architectures, then by default it is considered to support both amd64 and arm64 architectures.\nOverwriting Image Vector In some environments it is not possible to use these “pre-defined” images that come with a Gardener release. A prominent example for that is Alicloud in China, which does not allow access to Google’s GCR. In these cases, you might want to overwrite certain images, e.g., point the pause-container to a different registry.\n⚠️ If you specify an image that does not fit to the resource manifest, then the reconciliations might fail.\nIn order to overwrite the images, you must provide a similar file to the Gardener component:\nimages: - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile repository: my-custom-image-registry/pause tag: \"3.4\" version: \"1.20.x\" - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile ref: my-custom-image-registry/pause:3.5 version: \"\u003e= 1.21\" [!IMPORTANT] When the overwriting file contains ref for an image but the source file doesn’t, then this invalidates both repository and tag of the source. When it contains repository for an image but the source file uses ref, then this invalidates ref of the source.\n For gardenlet, you can create a ConfigMap containing the above content and mount it as a volume into the gardenlet pod. Next, specify the environment variable IMAGEVECTOR_OVERWRITE, whose value must be the path to the file you just mounted. The approach works similarly for gardener-operator.\napiVersion: v1 kind: ConfigMap metadata: name: gardenlet-images-overwrite namespace: garden data: images_overwrite.yaml: |images: - ... 
--- apiVersion: apps/v1 kind: Deployment metadata: name: gardenlet namespace: garden spec: template: spec: containers: - name: gardenlet env: - name: IMAGEVECTOR_OVERWRITE value: /imagevector-overwrite/images_overwrite.yaml volumeMounts: - name: gardenlet-images-overwrite mountPath: /imagevector-overwrite volumes: - name: gardenlet-images-overwrite configMap: name: gardenlet-images-overwrite Image Vectors for Dependent Components Gardener is deploying a lot of different components that might deploy other images themselves. These components might use an image vector as well. Operators might want to customize the image locations for these transitive images as well, hence, they might need to specify an image vector overwrite for the components directly deployed by Gardener.\nIt is possible to specify the IMAGEVECTOR_OVERWRITE_COMPONENTS environment variable to Gardener that points to a file with the following content:\ncomponents: - name: etcd-druid imageVectorOverwrite: |images: - name: etcd tag: v1.2.3 repository: etcd/etcd Gardener will, if supported by the directly deployed component (etcd-druid in this example), inject the given imageVectorOverwrite into the Deployment manifest. The respective component is responsible for using the overwritten images instead of its defaults.\nHelm Chart Image Vector Some Gardener components might also deploy packaged Helm charts which are pulled from an OCI repository. The concepts are the very same as for the container images. The only difference is that the environment variable for overwriting this chart image vector is called IMAGEVECTOR_OVERWRITE_CHARTS.\n","categories":"","description":"","excerpt":"Image Vector The Gardener components are deploying several different …","ref":"/docs/gardener/deployment/image_vector/","tags":"","title":"Image Vector"},{"body":"Contract: Infrastructure Resource Every Kubernetes cluster requires some low-level infrastructure to be setup in order to work properly. 
Examples for that are networks, routing entries, security groups, IAM roles, etc. Before introducing the Infrastructure extension resource, Gardener was using Terraform in order to create and manage these provider-specific resources (e.g., see here). Now, Gardener commissions an external, provider-specific controller to take over this task.\nWhich infrastructure resources are required? Unfortunately, there is no general answer to this question as it is highly provider specific. Consider the above mentioned resources, i.e. VPC, subnets, route tables, security groups, IAM roles, SSH key pairs. Most of the resources are required in order to create VMs (the shoot cluster worker nodes), load balancers, and volumes.\nWhat needs to be implemented to support a new infrastructure provider? As part of the shoot flow Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: type: azure region: eu-west-1 secretRef: name: cloudprovider namespace: shoot--foo--bar providerConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig resourceGroup: name: mygroup networks: vnet: # specify either 'name' or 'cidr' # name: my-vnet cidr: 10.250.0.0/16 workers: 10.250.0.0/19 The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used to create the needed resources. However, the most important section is the .spec.providerConfig. It contains an embedded declaration of the provider specific configuration for the infrastructure (that cannot be known by Gardener itself). You are responsible for designing what this configuration looks like. 
Gardener does not evaluate it but just copies this part from what has been provided by the end-user in the Shoot resource.\nAfter your controller has created the required resources in your provider’s infrastructure it needs to generate an output that can be used by other controllers in subsequent steps. An example for that is the Worker extension resource controller. It is responsible for creating virtual machines (shoot worker nodes) in this prepared infrastructure. Everything that it needs to know in order to do that (e.g. the network IDs, security group names, etc. (again: provider-specific)) needs to be provided as output in the Infrastructure resource:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: ... status: lastOperation: ... providerStatus: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus resourceGroup: name: mygroup networks: vnet: name: my-vnet subnets: - purpose: nodes name: my-subnet availabilitySets: - purpose: nodes id: av-set-id name: av-set-name routeTables: - purpose: nodes name: route-table-name securityGroups: - purpose: nodes name: sec-group-name In order to support a new infrastructure provider you need to write a controller that watches all Infrastructures with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the Azure provider.\nDynamic nodes network for shoot clusters Some environments do not allow end-users to statically define a CIDR for the network that shall be used for the shoot worker nodes. In these cases it is possible for the extension controllers to dynamically provision a network for the nodes (as part of their reconciliation loops), and to provide the CIDR in the status of the Infrastructure resource:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: ... 
status: lastOperation: ... providerStatus: ... nodesCIDR: 10.250.0.0/16 Gardener will pick this nodesCIDR and use it to configure the VPN components to establish network connectivity between the control plane and the worker nodes. If the Shoot resource already specifies a nodes CIDR in .spec.networking.nodes and the extension controller provides also a value in .status.nodesCIDR in the Infrastructure resource then the latter one will always be considered with higher priority by Gardener.\nNon-provider specific information required for infrastructure creation Some providers might require further information that is not provider specific but already part of the shoot resource. One example for this is the GCP infrastructure controller which needs the pod and the service network of the cluster in order to prepare and configure the infrastructure correctly. As Gardener cannot know which information is required by providers it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information that is not part of the Infrastructure resource itself.\nImplementation details Actuator interface Most existing infrastructure controller implementations follow a common pattern where a generic Reconciler delegates to an Actuator interface that contains the methods Reconcile, Delete, Migrate, and Restore. These methods are called by the generic Reconciler for the respective operations, and should be implemented by the extension according to the contract described here and the migration guidelines.\nConfigValidator interface For infrastructure controllers, the generic Reconciler also delegates to a ConfigValidator interface that contains a single Validate method. 
This method is called by the generic Reconciler at the beginning of every reconciliation, and can be implemented by the extension to validate the .spec.providerConfig part of the Infrastructure resource with the respective cloud provider, typically the existence and validity of cloud provider resources such as AWS VPCs or GCP Cloud NAT IPs.\nThe Validate method returns a list of errors. If this list is non-empty, the generic Reconciler will fail with an error. This error will have the error code ERR_CONFIGURATION_PROBLEM, unless there is at least one error in the list that has its ErrorType field set to field.ErrorTypeInternal.\nReferences and additional resources Infrastructure API (Golang specification) Sample implementation for the Azure provider Sample ConfigValidator implementation ","categories":"","description":"","excerpt":"Contract: Infrastructure Resource Every Kubernetes cluster requires …","ref":"/docs/gardener/extensions/infrastructure/","tags":"","title":"Infrastructure"},{"body":"Post-Create Initialization of Machine Instance Background Today the driver.Driver facade represents the boundary between the the machine-controller and its various provider specific implementations.\nWe have abstract operations for creation/deletion and listing of machines (actually compute instances) but we do not correctly handle post-creation initialization logic. Nor do we provide an abstract operation to represent the hot update of an instance after creation.\nWe have found this to be necessary for several use cases. Today in the MCM AWS Provider, we already misuse driver.GetMachineStatus which is supposed to be a read-only operation obtaining the status of an instance.\n Each AWS EC2 instance performs source/destination checks by default. For EC2 NAT instances these should be disabled. This is done by issuing a ModifyInstanceAttribute request with the SourceDestCheck set to false. 
The MCM AWS Provider decodes the AWSProviderSpec, reads providerSpec.SrcAndDstChecksEnabled and correspondingly issues the call to modify the already launched instance. However, this should be done as an action after creating the instance and should not be part of the VM status retrieval.\n Similarly, there is a pending PR to add the Ipv6AddressCount and Ipv6PrefixCount to enable the assignment of an ipv6 address and an ipv6 prefix to instances. This requires constructing and issuing an AssignIpv6Addresses request after the EC2 instance is available.\n We have other use cases such as MCM Issue#750 where there is a requirement to provide a way for consumers to add tags which can be hot-updated onto instances. This requirement can be generalized to also offer a convenient way to specify tags which can be applied to VMs, NICs, Devices etc.\n We have a need for “machine-instance-not-ready” taint as described in MCM#740 which should only get removed once the post creation updates are finished.\n Objectives We will split the fulfilment of this overall need into 2 stages of implementation.\n Stage-A: Support post-VM creation initialization logic of the instance using a proposed Driver.InitializeMachine by permitting provider implementors to add initialization logic after VM creation, return with special new error code codes.Initialization for initialization errors and correspondingly support a new machine operation stage InstanceInitialization which will be updated in the machine LastOperation. The triggerCreationFlow - a reconciliation sub-flow of the MCM responsible for orchestrating instance creation and updating machine status will be changed to support this behaviour.\n Stage-B: Introduction of Driver.UpdateMachine and enhancing the MCM, MCM providers and gardener extension providers to support hot update of instances through Driver.UpdateMachine. 
The MCM triggerUpdationFlow - a reconciliation sub-flow of the MCM which is supposed to be responsible for orchestrating instance update - but currently not used, will be updated to invoke the provider Driver.UpdateMachine on hot-updates to the Machine object\n Stage-A Proposal Current MCM triggerCreationFlow Today, reconcileClusterMachine which is the main routine for the Machine object reconciliation invokes triggerCreationFlow at the end when the machine.Spec.ProviderID is empty or if the machine.Status.CurrentStatus.Phase is empty or in CrashLoopBackOff\n%%{ init: { 'themeVariables': { 'fontSize': '12px'} } }%% flowchart LR other[\"...\"] --\u003echk{\"machine ProviderID empty OR Phase empty or CrashLoopBackOff ? \"}--yes--\u003etriggerCreationFlow chk--noo--\u003eLongRetry[\"return machineutils.LongRetry\"] Today, the triggerCreationFlow is illustrated below with some minor details omitted/compressed for brevity\nNOTES\n The lastop below is an abbreviation for machine.Status.LastOperation. This, along with the machine phase is generally updated on the Machine object just before returning from the method. regarding phase=CrashLoopBackOff|Failed. the machine phase may either be CrashLoopBackOff or move to Failed if the difference between current time and the machine.CreationTimestamp has exceeded the configured MachineCreationTimeout. 
%%{ init: { 'themeVariables': { 'fontSize': '12px'} } }%% flowchart TD end1((\"end\")) begin((\" \")) medretry[\"return MediumRetry, err\"] shortretry[\"return ShortRetry, err\"] medretry--\u003eend1 shortretry--\u003eend1 begin--\u003eAddBootstrapTokenToUserData --\u003egms[\"statusResp,statusErr=driver.GetMachineStatus(...)\"] --\u003echkstatuserr{\"Check statusErr\"} chkstatuserr--notFound--\u003echknodelbl{\"Chk Node Label\"} chkstatuserr--else--\u003ecreateFailed[\"lastop.Type=Create,lastop.state=Failed,phase=CrashLoopBackOff|Failed\"]--\u003emedretry chkstatuserr--nil--\u003einitnodename[\"nodeName = statusResp.NodeName\"]--\u003esetnodename chknodelbl--notset--\u003ecreatemachine[\"createResp, createErr=driver.CreateMachine(...)\"]--\u003echkCreateErr{\"Check createErr\"} chkCreateErr--notnil--\u003ecreateFailed chkCreateErr--nil--\u003egetnodename[\"nodeName = createResp.NodeName\"] --\u003echkstalenode{\"nodeName != machine.Name\\n//chk stale node\"} chkstalenode--false--\u003esetnodename[\"if unset machine.Labels['node']= nodeName\"] --\u003emachinepending[\"if empty/crashloopbackoff lastop.type=Create,lastop.State=Processing,phase=Pending\"] --\u003eshortretry chkstalenode--true--\u003edelmachine[\"driver.DeleteMachine(...)\"] --\u003epermafail[\"lastop.type=Create,lastop.state=Failed,Phase=Failed\"] --\u003eshortretry subgraph noteA [\" \"] permafail -.- note1([\"VM was referring to stale node obj\"]) end style noteA opacity:0 subgraph noteB [\" \"] setnodename-.- note2([\"Proposal: Introduce Driver.InitializeMachine after this\"]) end Enhancement of MCM triggerCreationFlow Relevant Observations on Current Flow Observe that we always perform a call to Driver.GetMachineStatus and only then conditionally perform a call to Driver.CreateMachine if there was no machine found. 
Observe that after the call to a successful Driver.CreateMachine, the machine phase is set to Pending, the LastOperation.Type is currently set to Create and the LastOperation.State set to Processing before returning with a ShortRetry. The LastOperation.Description is (unfortunately) set to the fixed message: Creating machine on cloud provider. Observe that after an erroneous call to Driver.CreateMachine, the machine phase is set to CrashLoopBackOff or Failed (in case of creation timeout). The following changes are proposed with a view towards minimal impact on current code and no introduction of a new Machine Phase.\nMCM Changes We propose introducing a new machine operation Driver.InitializeMachine with the following signature type Driver interface { // .. existing methods are omitted for brevity. // InitializeMachine call is responsible for post-create initialization of the provider instance. InitializeMachine(context.Context, *InitializeMachineRequest) error } // InitializeMachineRequest is the initialization request for machine instance initialization type InitializeMachineRequest struct { // Machine object whose VM instance should be initialized Machine *v1alpha1.Machine // MachineClass backing the machine object MachineClass *v1alpha1.MachineClass // Secret backing the machineClass object Secret *corev1.Secret } We propose introducing a new MC error code codes.Initialization indicating that the VM Instance was created but there was an error in initialization after VM creation. The implementor of Driver.InitializeMachine can return this error code, indicating that InitializeMachine needs to be called again. The Machine Controller will change the phase to CrashLoopBackOff as usual when encountering a codes.Initialization error. We will introduce a new machine operation stage InstanceInitialization. 
In case of a codes.Initialization error the machine.Status.LastOperation.Description will be set to InstanceInitialization, machine.Status.LastOperation.ErrorCode will be set to codes.Initialization the LastOperation.Type will be set to Create the LastOperation.State set to Failed before returning with a ShortRetry The semantics of Driver.GetMachineStatus will be changed. If the instance associated with machine exists, but the instance was not initialized as expected, the provider implementations of GetMachineStatus should return an error: status.Error(codes.Initialization). If Driver.GetMachineStatus returned an error encapsulating codes.Initialization then Driver.InitializeMachine will be invoked again in the triggerCreationFlow. According to the usual logic, the main machine controller reconciliation loop will now re-invoke the triggerCreationFlow if the machine phase is CrashLoopBackOff. Illustration AWS Provider Changes Driver.InitializeMachine The implementation for the AWS Provider will look something like:\n After the VM instance is available, check providerSpec.SrcAndDstChecksEnabled, construct ModifyInstanceAttributeInput and call ModifyInstanceAttribute. In case of an error return codes.Initialization instead of the current codes.Internal Check providerSpec.NetworkInterfaces and if Ipv6PrefixCount is not nil, then construct AssignIpv6AddressesInput and call AssignIpv6Addresses. In case of an error return codes.Initialization. Don’t use the generic codes.Internal The existing Ipv6 PR will need modifications.\nDriver.GetMachineStatus If providerSpec.SrcAndDstChecksEnabled is false, check ec2.Instance.SourceDestCheck. If it does not match then return status.Error(codes.Initialization) Check providerSpec.NetworkInterfaces and if Ipv6PrefixCount is not nil, check ec2.Instance.NetworkInterfaces and check if InstanceNetworkInterface.Ipv6Addresses has a non-nil slice. 
If this is not the case then return status.Error(codes.Initialization) Instance Not Ready Taint Because the creation flow for machines will now be enhanced to correctly support post-creation startup logic, we should not schedule workloads until this startup logic is complete. Even without this feature we have a need for such a taint as described in MCM#740 We propose a new taint node.machine.sapcloud.io/instance-not-ready which will be added as a node startup taint in gardener core KubeletConfiguration.RegisterWithTaints The taint will then be removed by MCM in health check reconciliation, once the machine becomes fully ready (when moving to the Running phase). We will add this taint as part of --ignore-taint in CA We will introduce a disclaimer / prerequisite in the MCM FAQ, to add this taint as part of kubelet config under --register-with-taints, otherwise workloads could get scheduled before the machine becomes Running Stage-B Proposal Enhancement of Driver Interface for Hot Updation Kindly refer to the Hot-Update Instances design which provides elaborate detail.\n","categories":"","description":"","excerpt":"Post-Create Initialization of Machine Instance Background Today the …","ref":"/docs/other-components/machine-controller-manager/proposals/initialize-machine/","tags":"","title":"Initialize Machine"},{"body":"Overview This guide walks you through the installation of the latest version of Knative using pre-built images on a Gardener created cluster environment. To set up your own Gardener, see the documentation or have a look at the landscape-setup-template project. To learn more about this open source project, read the blog on kubernetes.io.\nPrerequisites Knative requires a Kubernetes cluster v1.15 or newer.\nSteps Install and Configure kubectl If you already have kubectl CLI, run kubectl version --short to check the version. You need v1.10 or newer. 
If your kubectl is older, follow the next step to install a newer version.\n Install the kubectl CLI.\n Access Gardener Create a project in the Gardener dashboard. This will essentially create a Kubernetes namespace with the name garden-\u003cmy-project\u003e.\n Configure access to your Gardener project using a kubeconfig.\nIf you are not the Gardener Administrator already, you can create a technical user in the Gardener dashboard. Go to the “Members” section and add a service account. You can then download the kubeconfig for your project. You can skip this step if you create your cluster using the user interface; it is only needed for programmatic access. If you use programmatic access, make sure you set export KUBECONFIG=garden-my-project.yaml in your shell.\n Creating a Kubernetes Cluster You can create your cluster using the kubectl CLI by providing a cluster specification yaml file. You can find an example for GCP in the gardener/gardener repository. Make sure the namespace matches that of your project. Then just apply the prepared so-called “shoot” cluster CRD with kubectl:\nkubectl apply --filename my-cluster.yaml The easier alternative is to create the cluster following the cluster creation wizard in the Gardener dashboard. Configure kubectl for Your Cluster You can now download the kubeconfig for your freshly created cluster in the Gardener dashboard or via the CLI as follows:\nkubectl --namespace shoot--my-project--my-cluster get secret kubecfg --output jsonpath={.data.kubeconfig} | base64 --decode \u003e my-cluster.yaml This kubeconfig file has full administrator access to your cluster. For the rest of this guide, be sure you have export KUBECONFIG=my-cluster.yaml set.\nInstalling Istio Knative depends on Istio. 
If your cloud platform offers a managed Istio installation, we recommend installing Istio that way, unless you need the ability to customize your installation.\nOtherwise, see the Installing Istio for Knative guide to install Istio.\nYou must install Istio on your Kubernetes cluster before continuing with these instructions to install Knative.\nInstalling cluster-local-gateway for Serving Cluster-Internal Traffic If you installed Istio, you can install a cluster-local-gateway within your Knative cluster so that you can serve cluster-internal traffic. If you want to configure your revisions to use routes that are visible only within your cluster, install and use the cluster-local-gateway.\nInstalling Knative The following commands install all available Knative components as well as the standard set of observability plugins. Knative’s installation guide - Installing Knative.\n If you are upgrading from Knative 0.3.x: Update your domain and static IP address to be associated with the LoadBalancer istio-ingressgateway instead of knative-ingressgateway. Then run the following to clean up leftover resources:\nkubectl delete svc knative-ingressgateway -n istio-system kubectl delete deploy knative-ingressgateway -n istio-system If you have the Knative Eventing Sources component installed, you will also need to delete the following resource before upgrading:\nkubectl delete statefulset/controller-manager -n knative-sources While the deletion of this resource during the upgrade process will not prevent modifications to Eventing Source resources, those changes will not be completed until the upgrade process finishes.\n To install Knative, first install the CRDs by running the kubectl apply command once with the -l knative.dev/crd-install=true flag. 
This prevents race conditions during the install, which cause intermittent errors:\nkubectl apply --selector knative.dev/crd-install=true \\ --filename https://github.com/knative/serving/releases/download/v0.12.1/serving.yaml \\ --filename https://github.com/knative/eventing/releases/download/v0.12.1/eventing.yaml \\ --filename https://github.com/knative/serving/releases/download/v0.12.1/monitoring.yaml To complete the installation of Knative and its dependencies, run the kubectl apply command again, this time without the --selector flag:\nkubectl apply --filename https://github.com/knative/serving/releases/download/v0.12.1/serving.yaml \\ --filename https://github.com/knative/eventing/releases/download/v0.12.1/eventing.yaml \\ --filename https://github.com/knative/serving/releases/download/v0.12.1/monitoring.yaml Monitor the Knative components until all of the components show a STATUS of Running:\nkubectl get pods --namespace knative-serving kubectl get pods --namespace knative-eventing kubectl get pods --namespace knative-monitoring Set Your Custom Domain Fetch the external IP or CNAME of the knative-ingressgateway: kubectl --namespace istio-system get service knative-ingressgateway NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE knative-ingressgateway LoadBalancer 100.70.219.81 35.233.41.212 80:32380/TCP,443:32390/TCP,32400:32400/TCP 4d Create a wildcard DNS entry in your custom domain to point to the above IP or CNAME: *.knative.\u003cmy domain\u003e == A 35.233.41.212 # or CNAME if you are on AWS *.knative.\u003cmy domain\u003e == CNAME a317a278525d111e89f272a164fd35fb-1510370581.eu-central-1.elb.amazonaws.com Adapt your Knative config-domain (set your domain in the data field): kubectl --namespace knative-serving get configmaps config-domain --output yaml apiVersion: v1 data: knative.\u003cmy domain\u003e: \"\" kind: ConfigMap name: config-domain namespace: knative-serving What’s Next Now that your cluster has Knative installed, you can see what Knative has to 
offer.\nDeploy your first app with the Getting Started with Knative App Deployment guide.\nGet started with Knative Eventing by walking through one of the Eventing Samples.\nInstall Cert-Manager if you want to use the automatic TLS cert provisioning feature.\nCleaning Up Use the Gardener dashboard to delete your cluster, or execute the following with kubectl pointing to your garden-my-project.yaml kubeconfig:\nkubectl --kubeconfig garden-my-project.yaml --namespace garden--my-project annotate shoot my-cluster confirmation.gardener.cloud/deletion=true kubectl --kubeconfig garden-my-project.yaml --namespace garden--my-project delete shoot my-cluster ","categories":"","description":"A walkthrough of the steps for installing Knative in Gardener shoot clusters.","excerpt":"A walkthrough of the steps for installing Knative in Gardener shoot …","ref":"/docs/guides/applications/knative-install/","tags":"","title":"Install Knative in Gardener Clusters"},{"body":"Integration tests Usage General setup \u0026 configurations Integration tests for machine-controller-manager-provider-{provider-name} can be executed manually by following the steps below.\n Clone the repository machine-controller-manager-provider-{provider-name} on the local system. Navigate to machine-controller-manager-provider-{provider-name} directory and create a dev sub-directory in it. If the tags on instances \u0026 associated resources on the provider are of String type (for example, GCP tags on its instances are of type String and not key-value pairs) then add TAGS_ARE_STRINGS := true in the Makefile and export it. For GCP this has already been hard coded in the Makefile. Running the tests There is a rule test-integration in the Makefile of the provider repository, which can be used to start the integration test: $ make test-integration This will ask for additional inputs. Most of them are self-explanatory except: The script assumes that both the control and target clusters have already been created. 
In case of a non-gardener setup (control cluster is not a gardener seed), the name of the machineclass must be test-mc-v1 and the value of providerSpec.secretRef.name should be test-mc-secret. In case of Azure, TARGET_CLUSTER_NAME must be the same as the name of the Azure ResourceGroup for the cluster. If you are deploying the secret manually, a Secret named test-mc-secret (that contains the provider secret and cloud-config) in the default namespace of the Control Cluster should be created. The controllers' log files (mcm_process.log and mc_process.log) are stored in the .ci/controllers-test/logs directory and can be used later. Adding Integration Tests for new providers For a new provider, running the integration tests works with no changes. But for the orphan resource test cases to work correctly, the provider-specific API calls and the Resource Tracker Interface (RTI) should be implemented. Please check machine-controller-manager-provider-aws for reference.\nExtending integration tests Update ControllerTests to extend the test cases for all providers. Common test cases for machine|machineDeployment creation|deletion|scaling are packaged into ControllerTests. To extend the provider-specific test cases, the changes should be done in the machine-controller-manager-provider-{provider-name} repository. For example, to extend the test cases for machine-controller-manager-provider-aws, make changes to test/integration/controller/controller_test.go inside the machine-controller-manager-provider-aws repository. commons contains the Cluster and Clientset objects that make it easy to extend the tests. ","categories":"","description":"","excerpt":"Integration tests Usage General setup \u0026 configurations Integration …","ref":"/docs/other-components/machine-controller-manager/integration_tests/","tags":"","title":"Integration Tests"},{"body":"Introduction When transferring data among networked systems, trust is a central concern. 
In particular, when communicating over an untrusted medium such as the internet, it is critical to ensure the integrity and immutability of all the data a system operates on. Especially if you use Docker Engine to push and pull images (data) to a public registry.\nThis immutability offers you a guarantee that any and all containers that you instantiate will be absolutely identical at inception. Surprise surprise, deterministic operations.\nA Lesson in Deterministic Ops Docker Tags are about as reliable and disposable as this guy down here.\nSeems simple enough. You have probably already deployed hundreds of YAML’s or started endless counts of Docker containers.\ndocker run --name mynginx1 -P -d nginx:1.13.9 or\napiVersion: apps/v1 kind: Deployment metadata: name: rss-site spec: replicas: 1 selector: matchLabels: app: web template: metadata: labels: app: web spec: containers: - name: front-end image: nginx:1.13.9 ports: - containerPort: 80 But Tags are mutable and humans are prone to error. Not a good combination. Here, we’ll dig into why the use of tags can be dangerous and how to deploy your containers across a pipeline and across environments with determinism in mind.\nLet’s say that you want to ensure that whether it’s today or 5 years from now, that specific deployment uses the very same image that you have defined. Any updates or newer versions of an image should be executed as a new deployment. The solution: digest\nA digest takes the place of the tag when pulling an image. For example, to pull the above image by digest, run the following command:\ndocker run --name mynginx1 -P -d nginx@sha256:4771d09578c7c6a65299e110b3ee1c0a2592f5ea2618d23e4ffe7a4cab1ce5de You can now make sure that the same image is always loaded at every deployment. It doesn’t matter if the TAG of the image has been changed or not. This solves the problem of repeatability.\nContent Trust However, there’s an additionally hidden danger. 
It is possible for an attacker to replace a server image with another one infected with malware.\nDocker Content trust gives you the ability to verify both the integrity and the publisher of all the data received from a registry over any channel.\nPrior to version 1.8, Docker didn’t have a way to verify the authenticity of a server image. But in v1.8, a new feature called Docker Content Trust was introduced to automatically sign and verify the signature of a publisher.\nSo, as soon as a server image is downloaded, it is cross-checked with the signature of the publisher to see if someone tampered with it in any way. This solves the problem of trust.\nIn addition, you should scan all images for known vulnerabilities.\n","categories":"","description":"Ensure that you always get the right image","excerpt":"Ensure that you always get the right image","ref":"/docs/guides/applications/content_trust/","tags":"","title":"Integrity and Immutability"},{"body":"IPv6 in Gardener Clusters 🚧 IPv6 networking is currently under development.\n IPv6 Single-Stack Networking GEP-21 proposes IPv6 Single-Stack Support in the local Gardener environment. This documentation will be enhanced while implementing GEP-21, see gardener/gardener#7051.\nTo use IPv6 single-stack networking, the feature gate IPv6SingleStack must be enabled on gardener-apiserver and gardenlet.\nDevelopment/Testing Setup Developing or testing IPv6-related features requires a Linux machine (docker only supports IPv6 on Linux) and native IPv6 connectivity to the internet. 
If you’re on a different OS or don’t have IPv6 connectivity in your office environment or via your home ISP, make sure to check out gardener-community/dev-box-gcp, which allows you to circumvent these limitations.\nTo get started with the IPv6 setup and create a local IPv6 single-stack shoot cluster, run the following commands:\nmake kind-up gardener-up IPFAMILY=ipv6 k apply -f example/provider-local/shoot-ipv6.yaml Please also take a look at the guide on Deploying Gardener Locally for more details on setting up an IPv6 gardener for testing or development purposes.\nContainer Images If you plan on using custom images, make sure your registry supports IPv6 access.\nCheck the component checklist for tips concerning container registries and how to handle their IPv6 support.\n","categories":"","description":"","excerpt":"IPv6 in Gardener Clusters 🚧 IPv6 networking is currently under …","ref":"/docs/gardener/ipv6/","tags":"","title":"Ipv6"},{"body":"Istio Istio offers a service mesh implementation with focus on several important features - traffic, observability, security, and policy.\nPrerequisites Third-party JWT is used, therefore each Seed cluster where this feature is enabled must have Service Account Token Volume Projection enabled. Kubernetes 1.16+ Differences with Istio’s Default Profile The default profile which is recommended for production deployment, is not suitable for the Gardener use case, as it offers more functionality than desired. The current installation goes through heavy refactorings due to the IstioOperator and the mixture of Helm values + Kubernetes API specification makes configuring and fine-tuning it very hard. A more simplistic deployment is used by Gardener. The differences are the following:\n Telemetry is not deployed. istiod is deployed. istio-ingress-gateway is deployed in a separate istio-ingress namespace. istio-egress-gateway is not deployed. None of the Istio addons are deployed. Mixer (deprecated) is not deployed. 
Mixer CDRs are not deployed. Kubernetes Service, Istio’s VirtualService and ServiceEntry are NOT advertised in the service mesh. This means that if a Service needs to be accessed directly from the Istio Ingress Gateway, it should have networking.istio.io/exportTo: \"*\" annotation. VirtualService and ServiceEntry must have .spec.exportTo: [\"*\"] set on them respectively. Istio injector is not enabled. mTLS is enabled by default. Handling Multiple Availability Zones with Istio For various reasons, e.g., improved resiliency to certain failures, it may be beneficial to use multiple availability zones in a seed cluster. While availability zones have advantages in being able to cover some failure domains, they also come with some additional challenges. Most notably, the latency across availability zone boundaries is higher than within an availability zone. Furthermore, there might be additional cost implied by network traffic crossing an availability zone boundary. Therefore, it may be useful to try to keep traffic within an availability zone if possible. The istio deployment as part of Gardener has been adapted to allow this.\nA seed cluster spanning multiple availability zones may be used for highly-available shoot control planes. Those control planes may use a single or multiple availability zones. In addition to that, ordinary non-highly-available shoot control planes may be scheduled to such a seed cluster as well. The result is that the seed cluster may have control planes spanning multiple availability zones and control planes that are pinned to exactly one availability zone. These two types need to be handled differently when trying to prevent unnecessary cross-zonal traffic.\nThe goal is achieved by using multiple istio ingress gateways. The default istio ingress gateway spans all availability zones. It is used for multi-zonal shoot control planes. 
For each availability zone, there is an additional istio ingress gateway, which is utilized only for single-zone shoot control planes pinned to this availability zone. This is illustrated in the following diagram.\nPlease note that operators may need to perform additional tuning to prevent cross-zonal traffic completely. The loadbalancer settings in the seed specification offer various options, e.g., by setting the external traffic policy to local or using infrastructure specific loadbalancer annotations.\nFurthermore, note that this approach is also taken in case ExposureClasses are used. For each exposure class, additional zonal istio ingress gateways may be deployed to cover for single-zone shoot control planes using the exposure class.\n","categories":"","description":"","excerpt":"Istio Istio offers a service mesh implementation with focus on several …","ref":"/docs/gardener/istio/","tags":"","title":"Istio"},{"body":"Using annotated Istio Gateway and/or Istio Virtual Service as Source This tutorial describes how to use annotated Istio Gateway resources as source for Certificate resources.\nInstall Istio on your cluster Follow the Istio Getting Started to download and install Istio.\nThese are the typical commands for the istio demo installation\nexport KUBECONFIG=... 
curl -L https://istio.io/downloadIstio | sh - istioctl install --set profile=demo -y kubectl label namespace default istio-injection=enabled Note: If you are using a KinD cluster, the istio-ingressgateway service may be pending forever.\n$ kubectl -n istio-system get svc istio-ingressgateway NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE istio-ingressgateway LoadBalancer 10.96.88.189 \u003cpending\u003e 15021:30590/TCP,80:30185/TCP,443:30075/TCP,31400:30129/TCP,15443:30956/TCP 13m In this case, you may patch the status for demo purposes (of course it still would not accept connections)\nkubectl -n istio-system patch svc istio-ingressgateway --type=merge --subresource status --patch '{\"status\":{\"loadBalancer\":{\"ingress\":[{\"ip\":\"1.2.3.4\"}]}}}' Verify that Istio Gateway/VirtualService Source works Install a sample service With automatic sidecar injection:\n$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/httpbin/httpbin.yaml Using a Gateway as a source Create an Istio Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1 kind: Gateway metadata: name: httpbin-gateway namespace: istio-system annotations: #cert.gardener.cloud/dnsnames: \"*.example.com\" # alternative if you want to control the dns names explicitly. 
cert.gardener.cloud/purpose: managed spec: selector: istio: ingressgateway # use Istio default gateway implementation servers: - port: number: 443 name: http protocol: HTTPS hosts: - \"httpbin.example.com\" # this is used by the dns-controller-manager to extract DNS names tls: credentialName: my-tls-secret EOF You should now see a created Certificate resource similar to:\n$ kubectl -n istio-system get cert -oyaml apiVersion: v1 items: - apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: generateName: httpbin-gateway-gateway- name: httpbin-gateway-gateway-hdbjb namespace: istio-system ownerReferences: - apiVersion: networking.istio.io/v1 blockOwnerDeletion: true controller: true kind: Gateway name: httpbin-gateway spec: commonName: httpbin.example.com secretName: my-tls-secret status: ... kind: List metadata: resourceVersion: \"\" Using a VirtualService as a source If the Gateway resource is annotated with cert.gardener.cloud/purpose: managed, hosts from all referencing VirtualServices resources are automatically extracted. 
These resources don’t need an additional annotation.\nCreate an Istio Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1 kind: Gateway metadata: name: httpbin-gateway namespace: istio-system annotations: cert.gardener.cloud/purpose: managed spec: selector: istio: ingressgateway # use Istio default gateway implementation servers: - port: number: 443 name: https protocol: HTTPS hosts: - \"*\" tls: credentialName: my-tls-secret EOF Configure routes for traffic entering via the Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1 kind: VirtualService metadata: name: httpbin namespace: default spec: hosts: - \"httpbin.example.com\" # this is used by dns-controller-manager to extract DNS names gateways: - istio-system/httpbin-gateway http: - match: - uri: prefix: /status - uri: prefix: /delay route: - destination: port: number: 8000 host: httpbin EOF This should show a similar Certificate resource as above.\n","categories":"","description":"","excerpt":"Using annotated Istio Gateway and/or Istio Virtual Service as Source …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/tutorials/istio-gateways/","tags":"","title":"Istio Gateways"},{"body":"Using annotated Istio Gateway and/or Istio Virtual Service as Source This tutorial describes how to use annotated Istio Gateway resources as source for DNSEntries with the Gardener shoot-dns-service extension.\nInstall Istio on your cluster Using a new or existing shoot cluster, follow the Istio Getting Started to download and install Istio.\nThese are the typical commands for the istio demo installation\nexport KUBECONFIG=... 
curl -L https://istio.io/downloadIstio | sh - istioctl install --set profile=demo -y kubectl label namespace default istio-injection=enabled Verify that Istio Gateway/VirtualService Source works Install a sample service With automatic sidecar injection:\n$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/httpbin/httpbin.yaml Using a Gateway as a source Create an Istio Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1alpha3 kind: Gateway metadata: name: httpbin-gateway namespace: istio-system annotations: dns.gardener.cloud/dnsnames: \"*\" dns.gardener.cloud/class: garden spec: selector: istio: ingressgateway # use Istio default gateway implementation servers: - port: number: 80 name: http protocol: HTTP hosts: - \"httpbin.example.com\" # this is used by the dns-controller-manager to extract DNS names EOF Configure routes for traffic entering via the Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1alpha3 kind: VirtualService metadata: name: httpbin namespace: default spec: hosts: - \"httpbin.example.com\" # this is also used by the dns-controller-manager to extract DNS names gateways: - istio-system/httpbin-gateway http: - match: - uri: prefix: /status - uri: prefix: /delay route: - destination: port: number: 8000 host: httpbin EOF You should now see events in the namespace of the gateway:\n$ kubectl -n istio-system get events --sort-by={.metadata.creationTimestamp} LAST SEEN TYPE REASON OBJECT MESSAGE ... 
38s Normal dns-annotation gateway/httpbin-gateway httpbin.example.com: created dns entry object shoot--foo--bar/httpbin-gateway-gateway-zpf8n 38s Normal dns-annotation gateway/httpbin-gateway httpbin.example.com: dns entry pending: waiting for dns reconciliation 38s Normal dns-annotation gateway/httpbin-gateway httpbin.example.com: dns entry is pending 36s Normal dns-annotation gateway/httpbin-gateway httpbin.example.com: dns entry active Using a VirtualService as a source If the Gateway resource is annotated with dns.gardener.cloud/dnsnames: \"*\", hosts from all referencing VirtualServices resources are automatically extracted. These resources don’t need an additional annotation.\nCreate an Istio Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1alpha3 kind: Gateway metadata: name: httpbin-gateway namespace: istio-system annotations: dns.gardener.cloud/dnsnames: \"*\" dns.gardener.cloud/class: garden spec: selector: istio: ingressgateway # use Istio default gateway implementation servers: - port: number: 80 name: http protocol: HTTP hosts: - \"*\" EOF Configure routes for traffic entering via the Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1alpha3 kind: VirtualService metadata: name: httpbin namespace: default spec: hosts: - \"httpbin.example.com\" # this is used by dns-controller-manager to extract DNS names gateways: - istio-system/httpbin-gateway http: - match: - uri: prefix: /status - uri: prefix: /delay route: - destination: port: number: 8000 host: httpbin EOF This should show similar events as above.\nTo get the targets to the extracted DNS names, the shoot-dns-service controller is able to gather information from the kubernetes service of the Istio Ingress Gateway.\nNote: It is also possible to set the targets by specifying an Ingress resource using the dns.gardener.cloud/ingress annotation on the Istio Ingress Gateway resource.\nNote: It is also possible to set the targets 
manually by using the dns.gardener.cloud/targets annotation on the Istio Ingress Gateway resource.\nAccess the sample service using curl $ curl -I http://httpbin.example.com/status/200 HTTP/1.1 200 OK server: istio-envoy date: Tue, 13 Feb 2024 07:49:37 GMT content-type: text/html; charset=utf-8 access-control-allow-origin: * access-control-allow-credentials: true content-length: 0 x-envoy-upstream-service-time: 15 Accessing any other URL that has not been explicitly exposed should return an HTTP 404 error:\n$ curl -I http://httpbin.example.com/headers HTTP/1.1 404 Not Found date: Tue, 13 Feb 2024 08:09:41 GMT server: istio-envoy transfer-encoding: chunked ","categories":"","description":"","excerpt":"Using annotated Istio Gateway and/or Istio Virtual Service as Source …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/tutorials/istio-gateways/","tags":"","title":"Istio Gateways"},{"body":"Overview Use the Kubernetes command-line tool, kubectl, to deploy and manage applications on Kubernetes. Using kubectl, you can inspect cluster resources, as well as create, delete, and update components.\nBy default, the kubectl configuration is located at ~/.kube/config.\nLet us suppose that you have two clusters, one for development work and one for scratch work.\nHow to handle this easily without copying the used configuration always to the right place?\nExport the KUBECONFIG Environment Variable bash$ export KUBECONFIG=\u003cPATH-TO-M\u003e-CONFIG\u003e/kubeconfig-dev.yaml How to determine which cluster is used by the kubectl command?\nDetermine Active Cluster bash$ kubectl cluster-info Kubernetes master is running at https://api.dev.garden.shoot.canary.k8s-hana.ondemand.com KubeDNS is running at https://api.dev.garden.shoot.canary.k8s-hana.ondemand.com/api/v1/proxy/namespaces/kube-system/services/kube-dns To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'. 
bash$ Display Cluster in the bash - Linux and Alike I found this tip on Stack Overflow and think it is worth adding here.\nEdit your ~/.bash_profile and add the following code snippet to show the current K8s context in the shell’s prompt:\nprompt_k8s(){ k8s_current_context=$(kubectl config current-context 2\u003e /dev/null) if [[ $? -eq 0 ]] ; then echo -e \"(${k8s_current_context}) \"; fi } PS1+='$(prompt_k8s)' After this, your bash command prompt contains the active KUBECONFIG context and you always know which cluster is active - develop or production.\nFor example:\nbash$ export KUBECONFIG=/Users/d023280/Documents/workspace/gardener-ui/kubeconfig_gardendev.yaml bash (garden_dev)$ Note the (garden_dev) prefix in the bash command prompt.\nThis helps immensely to avoid thoughtless mistakes.\nDisplay Cluster in the PowerShell - Windows Display the current K8s cluster in the title of the PowerShell window.\nCreate a profile file for your shell under %UserProfile%\\Documents\\WindowsPowerShell\\Microsoft.PowerShell_profile.ps1\nCopy the following code to Microsoft.PowerShell_profile.ps1\n function prompt_k8s { $k8s_current_context = (kubectl config current-context) | Out-String if($?) { return $k8s_current_context }else { return \"No K8S context found\" } } $host.ui.rawui.WindowTitle = prompt_k8s If you want to switch to a different cluster, you can set KUBECONFIG to a new value, and re-run the file Microsoft.PowerShell_profile.ps1\n","categories":"","description":"Expose the active kubeconfig into bash","excerpt":"Expose the active kubeconfig into bash","ref":"/docs/guides/client-tools/bash-kubeconfig/","tags":"","title":"Kubeconfig Context as bash Prompt"},{"body":"This HowTo covers common Kubernetes antipatterns that we have seen over the past months.\nRunning as Root User Whenever possible, do not run containers as root user. One could be tempted to say that Kubernetes pods and nodes are well separated. However, the host and the containers running on it share the same kernel. 
If a container is compromised, the root user in the container has full control over the underlying node.\nWatch the very good presentation by Liz Rice at the KubeCon 2018\n Use RUN groupadd -r anygroup \u0026\u0026 useradd -r -g anygroup myuser to create a group and add a user to it. Use the USER command to switch to this user. Note that you may also consider to provide an explicit UID/GID if required.\nFor example:\nARG GF_UID=\"500\" ARG GF_GID=\"500\" # add group \u0026 user RUN groupadd -r -g $GF_GID appgroup \u0026\u0026 \\ useradd appuser -r -u $GF_UID -g appgroup USER appuser Store Data or Logs in Containers Containers are ideal for stateless applications and should be transient. This means that no data or logs should be stored in the container, as they are lost when the container is closed. Use persistence volumes instead to persist data outside of containers. Using an ELK stack is another good option for storing and processing logs.\nUsing Pod IP Addresses Each pod is assigned an IP address. It is necessary for pods to communicate with each other to build an application, e.g. an application must communicate with a database. Existing pods are terminated and new pods are constantly started. If you would rely on the IP address of a pod or container, you would need to update the application configuration constantly. This makes the application fragile.\nCreate services instead. They provide a logical name that can be assigned independently of the varying number and IP addresses of containers. Services are the basic concept for load balancing within Kubernetes.\nMore Than One Process in a Container A docker file provides a CMD and ENTRYPOINT to start the image. CMD is often used around a script that makes a configuration and then starts the container. Do not try to start multiple processes with this script. It is important to consider the separation of concerns when creating docker images. 
Running multiple processes in a single pod makes managing your containers, collecting logs and updating each process more difficult.\nYou can split the image into multiple containers and manage them independently - even in one pod. Bear in mind that Kubernetes only monitors the process with PID=1. If more than one process is started within a container, then these no longer fall under the control of Kubernetes.\nCreating Images in a Running Container A new image can be created with the docker commit command. This is useful if changes have been made to the container and you want to persist them for later error analysis. However, images created like this are not reproducible and completely worthless for a CI/CD environment. Furthermore, another developer cannot recognize which components the image contains. Instead, always make changes to the docker file, close existing containers and start a new container with the updated image.\nSaving Passwords in a docker Image 💀 Do not save passwords in a Docker file! They are in plain text and are checked into a repository. That makes them completely vulnerable even if you are using a private repository like the Artifactory.\nAlways use Secrets or ConfigMaps to provision passwords or inject them by mounting a persistent volume.\nUsing the ’latest’ Tag Starting an image with tomcat is tempting. If no tags are specified, a container is started with the tomcat:latest image. This image may no longer be up to date and refer to an older version instead. Running a production application requires complete control of the environment with exact versions of the image.\nMake sure you always use a tag or even better the sha256 hash of the image, e.g., tomcat@sha256:c34ce3c1fcc0c7431e1392cc3abd0dfe2192ffea1898d5250f199d3ac8d8720f.\nWhy Use the sha256 Hash? Tags are not immutable and can be overwritten by a developer at any time. 
In this case you don’t have complete control over your image - which is bad.\nDifferent Images per Environment Don’t create different images for development, testing, staging and production environments. The image should be the source of truth and should only be created once and pushed to the repository. This image:tag should be used for different environments in the future.\nDepend on Start Order of Pods Applications often depend on containers being started in a certain order. For example, a database container must be up and running before an application can connect to it. The application should be resilient to such changes, as the db pod can be unreachable or restarted at any time. The application container should be able to handle such situations without terminating or crashing.\nAdditional Anti-Patterns and Patterns In the community, vast experience has been collected to improve the stability and usability of Docker and Kubernetes.\nRefer to Kubernetes Production Patterns for more information.\n","categories":"","description":"Common antipatterns for Kubernetes and Docker","excerpt":"Common antipatterns for Kubernetes and Docker","ref":"/docs/guides/applications/antipattern/","tags":"","title":"Kubernetes Antipatterns"},{"body":"Kubernetes Clients in Gardener This document aims at providing a general developer guideline on different aspects of using Kubernetes clients in a large-scale distributed system and project like Gardener. The points included here are not meant to be consulted as absolute rules, but rather as general rules of thumb that allow developers to get a better feeling about certain gotchas and caveats. It should be updated with lessons learned from maintaining the project and running Gardener in production.\nPrerequisites: Please familiarize yourself with the following basic Kubernetes API concepts first, if you’re new to Kubernetes. 
A good understanding of these basics will help you better comprehend the following document.\n Kubernetes API Concepts (including terminology, watch basics, etc.) Extending the Kubernetes API (including Custom Resources and aggregation layer / extension API servers) Extend the Kubernetes API with CustomResourceDefinitions Working with Kubernetes Objects Sample Controller (the diagram helps to build an understanding of a controller’s basic structure) Client Types: Client-Go, Generated, Controller-Runtime For historical reasons, you will find different kinds of Kubernetes clients in Gardener:\nClient-Go Clients client-go is the default/official client for talking to the Kubernetes API in Golang. It features the so-called “client sets” for all built-in Kubernetes API groups and versions (e.g. v1 (aka core/v1), apps/v1). client-go clients are generated from the built-in API types using client-gen and are composed of interfaces for every known API GroupVersionKind. A typical client-go usage looks like this:\nvar ( ctx context.Context c kubernetes.Interface // \"k8s.io/client-go/kubernetes\" deployment *appsv1.Deployment // \"k8s.io/api/apps/v1\" ) updatedDeployment, err := c.AppsV1().Deployments(\"default\").Update(ctx, deployment, metav1.UpdateOptions{}) Important characteristics of client-go clients:\n clients are specific to a given API GroupVersionKind, i.e., clients are hard-coded to corresponding API-paths (don’t need to use the discovery API to map GVK to a REST endpoint path). clients don’t modify the passed in-memory object (e.g. deployment in the above example). Instead, they return a new in-memory object. This means that controllers have to continue working with the new in-memory object or overwrite the shared object to not lose any state updates. 
Generated Client Sets for Gardener APIs Gardener’s APIs extend the Kubernetes API by registering an extension API server (in the garden cluster) and CustomResourceDefinitions (on Seed clusters), meaning that the Kubernetes API will expose additional REST endpoints to manage Gardener resources in addition to the built-in API resources. In order to talk to these extended APIs in our controllers and components, client-gen is used to generate client-go-style clients to pkg/client/{core,extensions,seedmanagement,...}.\nUsage of these clients is equivalent to client-go clients, and the same characteristics apply. For example:\nvar ( ctx context.Context c gardencoreclientset.Interface // \"github.com/gardener/gardener/pkg/client/core/clientset/versioned\" shoot *gardencorev1beta1.Shoot // \"github.com/gardener/gardener/pkg/apis/core/v1beta1\" ) updatedShoot, err := c.CoreV1beta1().Shoots(\"garden-my-project\").Update(ctx, shoot, metav1.UpdateOptions{}) Controller-Runtime Clients controller-runtime is a Kubernetes community project (kubebuilder subproject) for building controllers and operators for custom resources. Therefore, it features a generic client that follows a different approach and does not rely on generated client sets. Instead, the client can be used for managing any Kubernetes resources (built-in or custom) homogeneously. For example:\nvar ( ctx context.Context c client.Client // \"sigs.k8s.io/controller-runtime/pkg/client\" deployment *appsv1.Deployment // \"k8s.io/api/apps/v1\" shoot *gardencorev1beta1.Shoot // \"github.com/gardener/gardener/pkg/apis/core/v1beta1\" ) err := c.Update(ctx, deployment) // or err = c.Update(ctx, shoot) A brief introduction to controller-runtime and its basic constructs can be found at the official Go documentation.\nImportant characteristics of controller-runtime clients:\n The client functions take a generic client.Object or client.ObjectList value. 
These interfaces are implemented by all Golang types, that represent Kubernetes API objects or lists respectively which can be interacted with via usual API requests. [1] The client first consults a runtime.Scheme (configured during client creation) for recognizing the object’s GroupVersionKind (this happens on the client-side only). A runtime.Scheme is basically a registry for Golang API types, defaulting and conversion functions. Schemes are usually provided per GroupVersion (see this example for apps/v1) and can be combined to one single scheme for further usage (example). In controller-runtime clients, schemes are used only for mapping a typed API object to its GroupVersionKind. It then consults a meta.RESTMapper (also configured during client creation) for mapping the GroupVersionKind to a RESTMapping, which contains the GroupVersionResource and Scope (namespaced or cluster-scoped). From these values, the client can unambiguously determine the REST endpoint path of the corresponding API resource. For instance: appsv1.DeploymentList is available at /apis/apps/v1/deployments or /apis/apps/v1/namespaces/\u003cnamespace\u003e/deployments respectively. There are different RESTMapper implementations, but generally they are talking to the API server’s discovery API for retrieving RESTMappings for all API resources known to the API server (either built-in, registered via API extension or CustomResourceDefinitions). The default implementation of a controller-runtime (which Gardener uses as well) is the dynamic RESTMapper. It caches discovery results (i.e. RESTMappings) in-memory and only re-discovers resources from the API server when a client tries to use an unknown GroupVersionKind, i.e., when it encounters a No{Kind,Resource}MatchError. The client writes back results from the API server into the passed in-memory object. This means that controllers don’t have to worry about copying back the results and should just continue to work on the given in-memory object. 
This is a nice and flexible pattern, and helper functions should try to follow it wherever applicable. Meaning, if possible accept an object param, pass it down to clients and keep working on the same in-memory object instead of creating a new one in your helper function. The benefit is that you don’t lose updates to the API object and always have the last-known state in memory. Therefore, you don’t have to read it again, e.g., for getting the current resourceVersion when working with optimistic locking, and thus minimize the chances for running into conflicts. However, controllers must not use the same in-memory object concurrently in multiple goroutines. For example, decoding results from the API server in multiple goroutines into the same maps (e.g., labels, annotations) will cause panics because of “concurrent map writes”. Also, reading from an in-memory API object in one goroutine while decoding into it in another goroutine will yield non-atomic reads, meaning data might be corrupt and represent a non-valid/non-existing API object. Therefore, if you need to use the same in-memory object in multiple goroutines concurrently (e.g., shared state), remember to leverage proper synchronization techniques like channels, mutexes, atomic.Value and/or copy the object prior to use. The average controller however, will not need to share in-memory API objects between goroutines, and it’s typically an indicator that the controller’s design should be improved. The client decoder erases the object’s TypeMeta (apiVersion and kind fields) after retrieval from the API server, see kubernetes/kubernetes#80609, kubernetes-sigs/controller-runtime#1517. Unstructured and metadata-only requests objects are an exception to this because the contained TypeMeta is the only way to identify the object’s type. Because of this behavior, obj.GetObjectKind().GroupVersionKind() is likely to return an empty GroupVersionKind. 
I.e., you must not rely on TypeMeta being set or GetObjectKind() to return something usable. If you need to identify an object’s GroupVersionKind, use a scheme and its ObjectKinds function instead (or the helper function apiutil.GVKForObject). This is not specific to controller-runtime clients and applies to client-go clients as well. [1] Other lower level, config or internal API types (e.g., such as AdmissionReview) don’t implement client.Object. However, you also can’t interact with such objects via the Kubernetes API and thus also not via a client, so this can be disregarded at this point.\nMetadata-Only Clients Additionally, controller-runtime clients can be used to easily retrieve metadata-only objects or lists. This is useful for efficiently checking if at least one object of a given kind exists, or retrieving metadata of an object, if one is not interested in the rest (e.g., spec/status). The Accept header sent to the API server then contains application/json;as=PartialObjectMetadataList;g=meta.k8s.io;v=v1, which makes the API server only return metadata of the retrieved object(s). This saves network traffic and CPU/memory load on the API server and client side. If the client fully lists all objects of a given kind including their spec/status, the resulting list can be quite large and easily exceed the controller’s available memory. 
That’s why it’s important to carefully check if a full list is actually needed, or if a metadata-only list can be used instead.\nFor example:\nvar ( ctx context.Context c client.Client // \"sigs.k8s.io/controller-runtime/pkg/client\" shootList = \u0026metav1.PartialObjectMetadataList{} // \"k8s.io/apimachinery/pkg/apis/meta/v1\" ) shootList.SetGroupVersionKind(gardencorev1beta1.SchemeGroupVersion.WithKind(\"ShootList\")) if err := c.List(ctx, shootList, client.InNamespace(\"garden-my-project\"), client.Limit(1)); err != nil { return err } if len(shootList.Items) \u003e 0 { // project has at least one shoot } else { // project doesn't have any shoots } Gardener’s Client Collection, ClientMaps The Gardener codebase has a collection of clients (kubernetes.Interface), which can return all the above-mentioned client types. Additionally, it contains helpers for rendering and applying helm charts (ChartRender, ChartApplier) and retrieving the API server’s version (Version). Client sets are managed by so-called ClientMaps, which are a form of registry for all client sets for a given type of cluster, i.e., Garden, Seed and Shoot. ClientMaps manage the whole lifecycle of clients: they take care of creating them if they don’t exist already, running their caches, refreshing their cached server version and invalidating them when they are no longer needed.\nvar ( ctx context.Context cm clientmap.ClientMap // \"github.com/gardener/gardener/pkg/client/kubernetes/clientmap\" shoot *gardencorev1beta1.Shoot ) cs, err := cm.GetClient(ctx, keys.ForShoot(shoot)) // kubernetes.Interface if err != nil { return err } c := cs.Client() // client.Client The client collection mainly exists for historical reasons (there used to be a lot of code using the client-go style clients). However, Gardener is in the process of moving more towards controller-runtime and only using their clients, as they provide many benefits and are much easier to use. 
Also, gardener/gardener#4251 aims at refactoring our controller and admission components to native controller-runtime components.\n ⚠️ Please always prefer controller-runtime clients over other clients when writing new code or refactoring existing code.\n Cache Types: Informers, Listers, Controller-Runtime Caches Similar to the different types of client(set)s, there are also different kinds of Kubernetes client caches. However, all of them are based on the same concept: Informers. An Informer is a watch-based cache implementation, meaning it opens watch connections to the API server and continuously updates cached objects based on the received watch events (ADDED, MODIFIED, DELETED). Informers offer to add indices to the cache for efficient object lookup (e.g., by name or labels) and to add EventHandlers for the watch events. The latter is used by controllers to fill queues with objects that should be reconciled on watch events.\nInformers are used in and created via several higher-level constructs:\nSharedInformerFactories, Listers The generated clients (built-in as well as extended) feature a SharedInformerFactory for every API group, which can be used to create and retrieve Informers for all GroupVersionKinds. Similarly, it can be used to retrieve Listers that allow getting and listing objects from the Informer’s cache. However, both of these constructs are only used for historical reasons, and we are in the process of migrating away from them in favor of cached controller-runtime clients (see gardener/gardener#2414, gardener/gardener#2822). Thus, they are described only briefly here.\nImportant characteristics of Listers:\n Objects read from Informers and Listers can always be slightly out-of-date (i.e., stale) because the client has to first observe changes to API objects via watch events (which can intermittently lag behind by a second or even more). 
Thus, don’t make any decisions based on data read from Listers if the consequences of deciding wrongfully based on stale state might be catastrophic (e.g. leaking infrastructure resources). In such cases, read directly from the API server via a client instead. Objects retrieved from Informers or Listers are pointers to the cached objects, so they must not be modified without copying them first, otherwise the objects in the cache are also modified. Controller-Runtime Caches controller-runtime features a cache implementation that can be used equivalently as their clients. In fact, it implements a subset of the client.Client interface containing the Get and List functions. Under the hood, a cache.Cache dynamically creates Informers (i.e., opens watches) for every object GroupVersionKind that is being retrieved from it.\nNote that the underlying Informers of a controller-runtime cache (cache.Cache) and the ones of a SharedInformerFactory (client-go) are not related in any way. Both create Informers and watch objects on the API server individually. This means that if you read the same object from different cache implementations, you may receive different versions of the object because the watch connections of the individual Informers are not synced.\n ⚠️ Because of this, controllers/reconcilers should get the object from the same cache in the reconcile loop, where the EventHandler was also added to set up the controller. For example, if a SharedInformerFactory is used for setting up the controller then read the object in the reconciler from the Lister instead of from a cached controller-runtime client.\n By default, the client.Client created by a controller-runtime Manager is a DelegatingClient. It delegates Get and List calls to a Cache, and all other calls to a client that talks directly to the API server. 
Exceptions are requests with *unstructured.Unstructured objects and object kinds that were configured to be excluded from the cache in the DelegatingClient.\n ℹ️ kubernetes.Interface.Client() returns a DelegatingClient that uses the cache returned from kubernetes.Interface.Cache() under the hood. This means that all Client() usages need to be ready for cached clients and should be able to cope with stale cache reads.\n Important characteristics of cached controller-runtime clients:\n As with Listers, objects read from a controller-runtime cache can always be slightly out of date. Hence, don’t base any important decisions on data read from the cache (see above). In contrast to Listers, controller-runtime caches fill the passed in-memory object with the state of the object in the cache (i.e., they perform something like a “deep copy into”). This means that objects read from a controller-runtime cache can safely be modified without unintended side effects. Reading from a controller-runtime cache or a cached controller-runtime client implicitly starts a watch for the given object kind under the hood. This has important consequences: Reading a given object kind from the cache for the first time can take up to a few seconds depending on size and amount of objects as well as API server latency. This is because the cache has to do a full list operation and wait for an initial watch sync before returning results. ⚠️ Controllers need appropriate RBAC permissions for the object kinds they retrieve via cached clients (i.e., list and watch). ⚠️ By default, watches started by a controller-runtime cache are cluster-scoped, meaning it watches and caches objects across all namespaces. Thus, be careful which objects to read from the cache as it might significantly increase the controller’s memory footprint. There is no interaction with the cache on writing calls (Create, Update, Patch and Delete), see below. 
Uncached objects, filtered caches, APIReaders:\nIn order to allow more granular control over which object kinds should be cached and which calls should bypass the cache, controller-runtime offers a few mechanisms to further tweak the client/cache behavior:\n When creating a DelegatingClient, certain object kinds can be configured to always be read directly from the API instead of from the cache. Note that this does not prevent starting a new Informer when retrieving them directly from the cache. Watches can be restricted to a given (set of) namespace(s) by setting cache.Options.Namespaces. Watches can be filtered (e.g., by label) per object kind by configuring cache.Options.SelectorsByObject on creation of the cache. Retrieving metadata-only objects or lists from a cache results in a metadata-only watch/cache for that object kind. The APIReader can be used to always talk directly to the API server for a given Get or List call (use with care and only as a last resort!). To Cache or Not to Cache Although watch-based caches are an important factor for the immense scalability of Kubernetes, it definitely comes at a price (mainly in terms of memory consumption). Thus, developers need to be careful when introducing new API calls and caching new object kinds. Here are some general guidelines on choosing whether to read from a cache or not:\n Always try to use the cache wherever possible and make your controller able to tolerate stale reads. Leverage optimistic locking: use deterministic naming for objects you create (this is what the Deployment controller does [2]). Leverage optimistic locking / concurrency control of the API server: send updates/patches with the last-known resourceVersion from the cache (see below). This will make the request fail, if there were concurrent updates to the object (conflict error), which indicates that we have operated on stale data and might have made wrong decisions. 
In this case, let the controller handle the error with exponential backoff. This will make the controller eventually consistent. Track the actions you took, e.g., when creating objects with generateName (this is what the ReplicaSet controller does [3]). The actions can be tracked in memory and repeated if the expected watch events don’t occur after a given amount of time. Always try to write controllers with the assumption that data will only be eventually correct and can be slightly out of date (even if read directly from the API server!). If there is already some other code that needs a cache (e.g., a controller watch), reuse it instead of doing extra direct reads. Don’t read an object again if you just sent a write request. Write requests (Create, Update, Patch and Delete) don’t interact with the cache. Hence, use the current state that the API server returned (filled into the passed in-memory object), which is basically a “free direct read” instead of reading the object again from a cache, because this will probably set back the object to an older resourceVersion. If you are concerned about the impact of the resulting cache, try to minimize that by using filtered or metadata-only watches. If watching and caching an object type is not feasible, for example because there will be a lot of updates, and you are only interested in the object every ~5m, or because it will blow up the controller’s memory footprint, fall back to a direct read. This can either be done by disabling caching the object type generally or doing a single request via an APIReader. In any case, please bear in mind that every direct API call results in a quorum read from etcd, which can be costly in a heavily-utilized cluster and impose significant scalability limits. Thus, always try to minimize the impact of direct calls by filtering results by namespace or labels, limiting the number of results and/or using metadata-only calls. 
[2] The Deployment controller uses the pattern \u003cdeployment-name\u003e-\u003cpodtemplate-hash\u003e for naming ReplicaSets. This means, the name of a ReplicaSet it tries to create/update/delete at any given time is deterministically calculated based on the Deployment object. By this, it is insusceptible to stale reads from its ReplicaSets cache.\n[3] In simple terms, the ReplicaSet controller tracks its CREATE pod actions as follows: when creating new Pods, it increases a counter of expected ADDED watch events for the corresponding ReplicaSet. As soon as such events arrive, it decreases the counter accordingly. It only creates new Pods for a given ReplicaSet once all expected events occurred (counter is back to zero) or a timeout has occurred. This way, it prevents creating more Pods than desired because of stale cache reads and makes the controller eventually consistent.\nConflicts, Concurrency Control, and Optimistic Locking Every Kubernetes API object contains the metadata.resourceVersion field, which identifies an object’s version in the backing data store, i.e., etcd. Every write to an object in etcd results in a newer resourceVersion. This field is mainly used for concurrency control on the API server in an optimistic locking fashion, but also for efficient resumption of interrupted watch connections.\nOptimistic locking in the Kubernetes API sense means that when a client wants to update an API object, then it includes the object’s resourceVersion in the request to indicate the object’s version the modifications are based on. If the resourceVersion in etcd has not changed in the meantime, the update request is accepted by the API server and the updated object is written to etcd. If the resourceVersion sent by the client does not match the one of the object stored in etcd, there were concurrent modifications to the object. 
Consequently, the request is rejected with a conflict error (status code 409, API reason Conflict), for example:\n{ \"kind\": \"Status\", \"apiVersion\": \"v1\", \"metadata\": {}, \"status\": \"Failure\", \"message\": \"Operation cannot be fulfilled on configmaps \\\"foo\\\": the object has been modified; please apply your changes to the latest version and try again\", \"reason\": \"Conflict\", \"details\": { \"name\": \"foo\", \"kind\": \"configmaps\" }, \"code\": 409 } This concurrency control is an important mechanism in Kubernetes as there are typically multiple clients acting on API objects at the same time (humans, different controllers, etc.). If a client receives a conflict error, it should read the object’s latest version from the API server, make the modifications based on the newest changes, and retry the update. The reasoning behind this is that a client might choose to make different decisions based on the concurrent changes made by other actors compared to the outdated version that it operated on.\nImportant points about concurrency control and conflicts:\n The resourceVersion field carries a string value and clients must not assume numeric values (the type and structure of versions depend on the backing data store). This means clients may compare resourceVersion values to detect whether objects were changed. But they must not compare resourceVersions to figure out which one is newer/older, i.e., no greater/less-than comparisons are allowed. By default, update calls (e.g. via client-go and controller-runtime clients) use optimistic locking as the passed in-memory object usually contains the latest resourceVersion known to the controller, which is then also sent to the API server. API servers can also choose to accept update calls without optimistic locking (i.e., without a resourceVersion in the object’s metadata) for any given resource. 
However, sending update requests without optimistic locking is strongly discouraged, as doing so overwrites the entire object, discarding any concurrent changes made to it. On the other side, patch requests can always be executed either with or without optimistic locking, by (not) including the resourceVersion in the patched object’s metadata. Sending patch requests without optimistic locking might be safe and even desirable as a patch typically updates only a specific section of the object. However, there are also situations where patching without optimistic locking is not safe (see below). Don’t Retry on Conflict Similar to how a human would typically handle a conflict error, there are helper functions implementing RetryOnConflict-semantics, i.e., try an update call, then re-read the object if a conflict occurs, apply the modification again and retry the update. However, controllers should generally not use RetryOnConflict-semantics. Instead, controllers should abort their current reconciliation run and let the queue handle the conflict error with exponential backoff. The reasoning behind this is that a conflict error indicates that the controller has operated on stale data and might have made wrong decisions earlier on in the reconciliation. When using a helper function that implements RetryOnConflict-semantics, the controller doesn’t check which fields were changed and doesn’t revise its previous decisions accordingly. Instead, retrying on conflict basically just ignores any conflict error and blindly applies the modification.\nTo properly solve the conflict situation, controllers should immediately return with the error from the update call. This will cause retries with exponential backoff so that the cache has a chance to observe the latest changes to the object. In a later run, the controller will then make correct decisions based on the newest version of the object, not run into conflict errors, and will then be able to successfully reconcile the object. 
This way, the controller becomes eventually consistent.\nThe other way to solve the situation is to modify objects without optimistic locking in order to avoid running into a conflict in the first place (only if this is safe). This can be a preferable solution for controllers with long-running reconciliations (which is actually an anti-pattern but quite unavoidable in some of Gardener’s controllers). Aborting the entire reconciliation run is rather undesirable in such cases, as it will add a lot of unnecessary waiting time for end users and overhead in terms of compute and network usage.\nHowever, in any case, retrying on conflict is probably not the right option to solve the situation (there are some correct use cases for it, though they are very rare). Hence, don’t retry on conflict.\nTo Lock or Not to Lock As explained before, conflicts are actually important and prevent clients from doing wrongful concurrent updates. This means that conflicts are not something we generally want to avoid or ignore. However, in many cases controllers are exclusive owners of the fields they want to update and thus it might be safe to run without optimistic locking.\nFor example, the gardenlet is the exclusive owner of the spec section of the Extension resources it creates on behalf of a Shoot (e.g., the Infrastructure resource for creating a VPC). Meaning, it knows the exact desired state and no other actor is supposed to update the Infrastructure’s spec fields. When the gardenlet now updates the Infrastructure’s spec section as part of the Shoot reconciliation, it can simply issue a PATCH request that only updates the spec and runs without optimistic locking. If another controller concurrently updated the object in the meantime (e.g., the status section), the resourceVersion got changed, which would cause a conflict error if running with optimistic locking. 
However, concurrent status updates would not change the gardenlet’s mind on the desired spec of the Infrastructure resource as it is determined only by looking at the Shoot’s specification. If the spec section was changed concurrently, it’s still fine to overwrite it because the gardenlet should reconcile the spec back to its desired state.\nGenerally speaking, if a controller is the exclusive owner of a given set of fields and they are independent of concurrent changes to other fields in that object, it can patch these fields without optimistic locking. This might ignore concurrent changes to other fields or blindly overwrite changes to the same fields, but this is fine if the mentioned conditions apply. Obviously, this applies only to patch requests that modify only a specific set of fields but not to update requests that replace the entire object.\nIn such cases, it’s even desirable to run without optimistic locking as it will be more performant and save retries. If certain requests are made with high frequency and have a good chance of causing conflicts, retries because of optimistic locking can cause a lot of additional network traffic in a large-scale Gardener installation.\nUpdates, Patches, Server-Side Apply There are different ways of modifying Kubernetes API objects. 
The following snippet demonstrates how to do a given modification with the most frequently used options using a controller-runtime client:\nvar ( ctx context.Context c client.Client shoot *gardencorev1beta1.Shoot ) // update shoot.Spec.Kubernetes.Version = \"1.26\" err := c.Update(ctx, shoot) // json merge patch patch := client.MergeFrom(shoot.DeepCopy()) shoot.Spec.Kubernetes.Version = \"1.26\" err = c.Patch(ctx, shoot, patch) // strategic merge patch patch = client.StrategicMergeFrom(shoot.DeepCopy()) shoot.Spec.Kubernetes.Version = \"1.26\" err = c.Patch(ctx, shoot, patch) Important characteristics of the shown request types:\n Update requests always send the entire object to the API server and update all fields accordingly. By default, optimistic locking is used (resourceVersion is included). Both patch types run without optimistic locking by default. However, it can be enabled explicitly if needed: // json merge patch + optimistic locking patch := client.MergeFromWithOptions(shoot.DeepCopy(), client.MergeFromWithOptimisticLock{}) // ... // strategic merge patch + optimistic locking patch = client.StrategicMergeFrom(shoot.DeepCopy(), client.MergeFromWithOptimisticLock{}) // ... Patch requests only contain the changes made to the in-memory object between the copy passed to client.*MergeFrom and the object passed to Client.Patch(). The diff is calculated on the client-side based on the in-memory objects only. This means that if in the meantime some fields were changed on the API server to a different value than the one on the client-side, the fields will not be changed back as long as they are not changed on the client-side as well (there will be no diff in memory). Thus, if you want to ensure a given state using patch requests, always read the object first before patching it, as there will be no diff otherwise, meaning the patch will be empty. For more information, see gardener/gardener#4057 and the comments in gardener/gardener#4027. 
Also, always send updates and patch requests even if your controller hasn’t made any changes to the current state on the API server. I.e., don’t make any optimization for preventing empty patches or no-op updates. There might be mutating webhooks in the system that will modify the object and that rely on update/patch requests being sent (even if they are no-op). Gardener’s extension concept makes heavy use of mutating webhooks, so it’s important to keep this in mind. JSON merge patches always replace lists as a whole and don’t merge them. Keep this in mind when operating on lists with merge patch requests. If the controller is the exclusive owner of the entire list, it’s safe to run without optimistic locking. Though, if you want to prevent overwriting concurrent changes to the list or its items made by other actors (e.g., additions/removals to the metadata.finalizers list), enable optimistic locking. Strategic merge patches are able to make more granular modifications to lists and their elements without replacing the entire list. It uses Golang struct tags of the API types to determine which and how lists should be merged. See Update API Objects in Place Using kubectl patch or the strategic merge patch documentation for more in-depth explanations and comparison with JSON merge patches. With this, controllers might be able to issue patch requests for individual list items without optimistic locking, even if they are not exclusive owners of the entire list. Remember to check the patchStrategy and patchMergeKey struct tags of the fields you want to modify before blindly adding patch requests without optimistic locking. Strategic merge patches are only supported by built-in Kubernetes resources and custom resources served by Extension API servers. Strategic merge patches are not supported by custom resources defined by CustomResourceDefinitions (see this comparison). In that case, fallback to JSON merge patches. 
Server-side Apply is yet another mechanism to modify API objects, which is supported by all API resources (in newer Kubernetes versions). However, it has a few problems and more caveats preventing us from using it in Gardener at the time of writing. See gardener/gardener#4122 for more details. Generally speaking, patches are often the better option compared to update requests because they can save network traffic, encoding/decoding effort, and avoid conflicts under the presented conditions. If choosing a patch type, consider which type is supported by the resource you’re modifying and what will happen in case of a conflict. Consider whether your modification is safe to run without optimistic locking. However, there is no simple rule of thumb on which patch type to choose.\n On Helper Functions Here is a note on some helper functions, that should be avoided and why:\ncontrollerutil.CreateOrUpdate does a basic get, mutate and create or update call chain, which is often used in controllers. We should avoid using this helper function in Gardener, because it is likely to cause conflicts for cached clients and doesn’t send no-op requests if nothing was changed, which can cause problems because of the heavy use of webhooks in Gardener extensions (see above). That’s why usage of this function was completely replaced in gardener/gardener#4227 and similar PRs.\ncontrollerutil.CreateOrPatch is similar to CreateOrUpdate but does a patch request instead of an update request. It has the same drawback as CreateOrUpdate regarding no-op updates. Also, controllers can’t use optimistic locking or strategic merge patches when using CreateOrPatch. Another reason for avoiding use of this function is that it also implicitly patches the status section if it was changed, which is confusing for others reading the code. 
To accomplish this, the func does some back and forth conversion, comparison and checks, which are unnecessary in most of our cases and simply wasted CPU cycles and complexity we want to avoid.\nThere were some Try{Update,UpdateStatus,Patch,PatchStatus} helper functions in Gardener that were already removed by gardener/gardener#4378 but are still used in some extension code at the time of writing. The reason for eliminating these functions is that they implement RetryOnConflict-semantics. Meaning, they first get the object, mutate it, then try to update and retry if a conflict error occurs. As explained above, retrying on conflict is a controller anti-pattern and should be avoided in almost every situation. The other problem with these functions is that they read the object first from the API server (always do a direct call), although in most cases we already have a recent version of the object at hand. So, using this function generally does unnecessary API calls and therefore causes unwanted compute and network load.\nFor the reasons explained above, there are similar helper functions that accomplish similar things but address the mentioned drawbacks: controllerutils.{GetAndCreateOrMergePatch,GetAndCreateOrStrategicMergePatch}. These can be safely used as replacements for the aforementioned helper funcs. 
If they do not fit your use case, for example because you need to use optimistic locking, make the appropriate calls in the controller directly.\nRelated Links Kubernetes Client usage in Gardener (Community Meeting talk, 2020-06-26) These resources are only partially related to the topics covered in this doc, but might still be interesting for developers seeking a deeper understanding of Kubernetes API machinery, architecture and foundational concepts.\n API Conventions The Kubernetes Resource Model ","categories":"","description":"","excerpt":"Kubernetes Clients in Gardener This document aims at providing a …","ref":"/docs/gardener/kubernetes-clients/","tags":"","title":"Kubernetes Clients"},{"body":"KUBERNETES_SERVICE_HOST Environment Variable Injection In each Shoot cluster’s kube-system namespace a DaemonSet called apiserver-proxy is deployed. It routes traffic to the upstream Shoot Kube APIServer. See the APIServer SNI GEP for more details.\nTo skip this extra network hop, a mutating webhook called apiserver-proxy.networking.gardener.cloud is deployed next to the API server in the Seed. It adds a KUBERNETES_SERVICE_HOST environment variable to each container and init container that does not specify it. See the webhook repository for more information.\nOpt-Out of Pod Injection In some cases it’s desirable to opt out of Pod injection:\n DNS is disabled on that individual Pod, but it still needs to talk to the kube-apiserver. You want to test the kube-proxy and kubelet in-cluster discovery. 
Opt-Out of Pod Injection for Specific Pods To opt out of the injection, the Pod should be labeled with apiserver-proxy.networking.gardener.cloud/inject: disable, e.g.:\napiVersion: apps/v1 kind: Deployment metadata: name: nginx labels: app: nginx spec: replicas: 1 selector: matchLabels: app: nginx template: metadata: labels: app: nginx apiserver-proxy.networking.gardener.cloud/inject: disable spec: containers: - name: nginx image: nginx:1.14.2 ports: - containerPort: 80 Opt-Out of Pod Injection on Namespace Level To opt out of the injection for all Pods in a namespace, you should label your namespace with apiserver-proxy.networking.gardener.cloud/inject: disable, e.g.:\napiVersion: v1 kind: Namespace metadata: labels: apiserver-proxy.networking.gardener.cloud/inject: disable name: my-namespace or via kubectl for an existing namespace:\nkubectl label namespace my-namespace apiserver-proxy.networking.gardener.cloud/inject=disable Note: Please be aware that it’s not possible to disable injection on a namespace level and enable it for individual pods in it.\n Opt-Out of Pod Injection for the Entire Cluster If the injection is causing problems for different workloads and ignoring individual pods or namespaces is not possible, then the feature can be disabled for the entire cluster with the alpha.featuregates.shoot.gardener.cloud/apiserver-sni-pod-injector annotation with value disable on the Shoot resource itself:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: annotations: alpha.featuregates.shoot.gardener.cloud/apiserver-sni-pod-injector: 'disable' name: my-cluster or via kubectl for an existing shoot cluster:\nkubectl annotate shoot my-cluster alpha.featuregates.shoot.gardener.cloud/apiserver-sni-pod-injector=disable Note: Please be aware that it’s not possible to disable injection on a cluster level and enable it for individual pods in it.\n ","categories":"","description":"","excerpt":"KUBERNETES_SERVICE_HOST Environment Variable Injection In each Shoot 
…","ref":"/docs/gardener/shoot_kubernetes_service_host_injection/","tags":"","title":"KUBERNETES_SERVICE_HOST Environment Variable Injection"},{"body":"Introduction Lakom is a Kubernetes admission controller whose purpose is to implement cosign image signature verification with a public cosign key. It also resolves image tags to sha256 digests. A built-in cache mechanism can be enabled to reduce the load on the OCI registry.\nFlags The Lakom admission controller is configurable via command line flags. The trusted cosign public keys and their associated algorithms are set via a configuration file provided with the flag --lakom-config-path.\n Flag Name Description Default Value --bind-address Address to bind to “0.0.0.0” --cache-refresh-interval Refresh interval for the cached objects 30s --cache-ttl TTL for the cached objects. Set to 0, if cache has to be disabled 10m0s --contention-profiling Enable lock contention profiling, if profiling is enabled false --health-bind-address Bind address for the health server “:8081” -h, --help help for lakom --insecure-allow-insecure-registries If set, communication via HTTP with registries will be allowed. false --insecure-allow-untrusted-images If set, the webhook will just return warning for the images without trusted signatures. false --kubeconfig Paths to a kubeconfig. Only required if out-of-cluster. --lakom-config-path Path to file with lakom configuration containing cosign public keys used to verify the image signatures --metrics-bind-address Bind address for the metrics server “:8080” --port Webhook server port 9443 --profiling Enable profiling via web interface host:port/debug/pprof/ false --tls-cert-dir Directory with server TLS certificate and key (must contain a tls.crt and tls.key file) --use-only-image-pull-secrets If set, only the credentials from the image pull secrets of the pod are used to access the OCI registry. Otherwise, the node identity and docker config are also used. 
false --version prints version information and quits; --version=vX.Y.Z… sets the reported version Lakom Cosign Public Keys Configuration File The Lakom cosign public keys configuration file should be YAML or JSON formatted. It can define multiple trusted keys, each of which must be given a name. The supported types of public keys are RSA, ECDSA and Ed25519. RSA keys can additionally be configured with a signature verification algorithm specifying the scheme and hash function used during signature verification. As of now, ECDSA and Ed25519 keys cannot be configured with a specific algorithm.\npublicKeys: - name: example-public-key algorithm: RSASSA-PSS-SHA256 key: |- -----BEGIN PUBLIC KEY----- MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBAPeQXbIWMMXYV+9+j9b4jXTflnpfwn4E GMrmqYVhm0sclXb2FPP5aV/NFH6SZdHDZcT8LCNsNgxzxV4N+UE/JIsCAwEAAQ== -----END PUBLIC KEY----- Supported RSA Signature Verification Algorithms RSASSA-PKCS1-v1_5-SHA256: uses RSASSA-PKCS1-v1_5 scheme with SHA256 hash func RSASSA-PKCS1-v1_5-SHA384: uses RSASSA-PKCS1-v1_5 scheme with SHA384 hash func RSASSA-PKCS1-v1_5-SHA512: uses RSASSA-PKCS1-v1_5 scheme with SHA512 hash func RSASSA-PSS-SHA256: uses RSASSA-PSS scheme with SHA256 hash func RSASSA-PSS-SHA384: uses RSASSA-PSS scheme with SHA384 hash func RSASSA-PSS-SHA512: uses RSASSA-PSS scheme with SHA512 hash func ","categories":"","description":"","excerpt":"Introduction Lakom is a Kubernetes admission controller whose purpose is …","ref":"/docs/extensions/others/gardener-extension-shoot-lakom-service/lakom/","tags":"","title":"Lakom"},{"body":"Gardener Extension for lakom services \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor-specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. 
With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the shoot-lakom-service extension.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nLakom Admission Controller Lakom is a Kubernetes admission controller whose purpose is to implement cosign image signature verification against a public cosign key. It also resolves image tags to sha256 digests and caches all OCI artifacts to reduce the load on the OCI registry.\nExtension Resources Example extension resource:\n apiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: extension-shoot-lakom-service namespace: shoot--project--abc spec: type: shoot-lakom-service When an extension resource is reconciled, the extension controller will create an instance of the lakom admission controller. These resources are placed inside the shoot namespace on the seed. Also, the controller takes care of generating the necessary RBAC resources for the seed as well as for the shoot.\nPlease note, this extension controller relies on the Gardener-Resource-Manager to deploy k8s resources to seed and shoot clusters.\nHow to start using or developing this extension controller locally The Lakom admission controller can be configured with make dev-setup and started with make start-lakom. You can run the lakom extension controller locally on your machine by executing make start.\nIf you’d like to develop Lakom using a local cluster such as KinD, make sure your KUBECONFIG environment variable is targeting the local Garden cluster. Add 127.0.0.1 garden.local.gardener.cloud to your /etc/hosts. 
You can then run:\nmake extension-up This will trigger a skaffold deployment that builds the images, pushes them to the registry and installs the helm charts from /charts.\nWe are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"A k8s admission controller verifying pods are using signed images (cosign signatures) and a gardener extension to install it for shoots and seeds.","excerpt":"A k8s admission controller verifying pods are using signed images …","ref":"/docs/extensions/others/gardener-extension-shoot-lakom-service/","tags":"","title":"Lakom service"},{"body":"e2e Test Suite Developers can run extended e2e tests, in addition to unit tests, for Etcd-Druid in or from their local environments. This is recommended to verify the desired behavior of several features and to avoid regressions in future releases.\nThe very same tests typically run as part of the component’s release job as well as on demand, e.g., when triggered by Etcd-Druid maintainers for open pull requests.\nTesting Etcd-Druid automatically involves a certain test coverage for gardener/etcd-backup-restore which is deployed as a side-car to the actual etcd container.\nPrerequisites The e2e test lifecycle is managed with the help of skaffold. 
Every involved step like setup, deploy, undeploy or cleanup is executed against a Kubernetes cluster which makes it a mandatory prerequisite at the same time. Only skaffold itself with involved docker, helm and kubectl executions as well as the e2e-tests are executed locally. Required binaries are automatically downloaded if you use the corresponding make target, as described in this document.\nIt’s expected that especially the deploy step is run against a Kubernetes cluster which doesn’t contain a Druid deployment or any left-overs like druid.gardener.cloud CRDs. The deploy step will likely fail in such scenarios.\n Tip: Create a fresh KinD cluster or a similar one with a small footprint before executing the tests.\n Providers The following providers are supported for e2e tests:\n AWS Azure GCP Local Valid credentials need to be provided when tests are executed with mentioned cloud providers.\n Flow An e2e test execution involves the following steps:\n Step Description setup Create a storage bucket which is used for etcd backups (only with cloud providers). deploy Build Docker image, upload it to registry (if remote cluster - see Docker build), deploy Helm chart (charts/druid) to Kubernetes cluster. test Execute e2e tests as defined in test/e2e. undeploy Remove the deployed artifacts from Kubernetes cluster. cleanup Delete storage bucket and Druid deployment from test cluster. Make target Executing e2e-tests is as easy as executing the following command with defined Env-Vars as described in the following section and as needed for your test scenario.\nmake test-e2e Common Env Variables The following environment variables influence how the flow described above is executed:\n PROVIDERS: Providers used for testing (all, aws, azure, gcp, local). Multiple entries must be comma separated. Note: Some tests will use the very first entry from env PROVIDERS for e2e testing (e.g., multi-node tests). 
So for multi-node tests to use a specific provider, specify that provider as the first entry in env PROVIDERS.\n KUBECONFIG: Kubeconfig pointing to cluster where Etcd-Druid will be deployed (preferably KinD). TEST_ID: Some ID which is used to create assets for and during testing. STEPS: Steps executed by make target (setup, deploy, test, undeploy, cleanup - default: all steps). AWS Env Variables AWS_ACCESS_KEY_ID: Key ID of the user. AWS_SECRET_ACCESS_KEY: Access key of the user. AWS_REGION: Region in which the test bucket is created. Example:\nmake \\ AWS_ACCESS_KEY_ID=\"abc\" \\ AWS_SECRET_ACCESS_KEY=\"xyz\" \\ AWS_REGION=\"eu-central-1\" \\ KUBECONFIG=\"$HOME/.kube/config\" \\ PROVIDERS=\"aws\" \\ TEST_ID=\"some-test-id\" \\ STEPS=\"setup,deploy,test,undeploy,cleanup\" \\ test-e2e Azure Env Variables STORAGE_ACCOUNT: Storage account used for managing the storage container. STORAGE_KEY: Key of storage account. Example:\nmake \\ STORAGE_ACCOUNT=\"abc\" \\ STORAGE_KEY=\"eHl6Cg==\" \\ KUBECONFIG=\"$HOME/.kube/config\" \\ PROVIDERS=\"azure\" \\ TEST_ID=\"some-test-id\" \\ STEPS=\"setup,deploy,test,undeploy,cleanup\" \\ test-e2e GCP Env Variables GCP_SERVICEACCOUNT_JSON_PATH: Path to the service account json file used for this test. GCP_PROJECT_ID: ID of the GCP project. Example:\nmake \\ GCP_SERVICEACCOUNT_JSON_PATH=\"/var/lib/secrets/serviceaccount.json\" \\ GCP_PROJECT_ID=\"xyz-project\" \\ KUBECONFIG=\"$HOME/.kube/config\" \\ PROVIDERS=\"gcp\" \\ TEST_ID=\"some-test-id\" \\ STEPS=\"setup,deploy,test,undeploy,cleanup\" \\ test-e2e Local Env Variables No special environment variables are required for running e2e tests with Local provider.\nExample:\nmake \\ KUBECONFIG=\"$HOME/.kube/config\" \\ PROVIDERS=\"local\" \\ TEST_ID=\"some-test-id\" \\ STEPS=\"setup,deploy,test,undeploy,cleanup\" \\ test-e2e e2e test with localstack The above-mentioned e2e tests need storage from real cloud providers to be set up. 
But there is a tool named localstack that makes it possible to run e2e tests with mock AWS storage. We can also provision a KIND cluster for e2e tests. So, with localstack and a KIND cluster together, we don’t need any actual cloud provider infrastructure to be set up to run e2e tests.\nHow are the KIND cluster and localstack set up KIND or Kubernetes-In-Docker is a Kubernetes cluster that is set up inside a docker container. This cluster has limited capability as it does not have much compute power. But it can easily be set up inside a container and torn down just by removing that container. That’s why a KIND cluster is very easy to use for e2e tests. A Makefile command helps to spin up a KIND cluster and use the cluster to run e2e tests.\nThere is a docker image for localstack. The image is deployed as a pod inside the KIND cluster through hack/e2e-test/infrastructure/localstack/localstack.yaml. The Makefile takes care of deploying the yaml file in a KIND cluster.\nThe developer needs to run the make ci-e2e-kind command. This command in turn runs hack/ci-e2e-kind.sh, which spins up the KIND cluster, deploys localstack in it, and then runs the e2e tests using localstack as a mock AWS storage provider. The e2e tests actually run on the host machine but deploy the druid controller inside the KIND cluster. The druid controller spawns multi-node etcd clusters inside the KIND cluster. The e2e tests verify whether the druid controller performs its jobs correctly or not. Mock localstack storage is cleaned up after every e2e test. That’s why the e2e tests need to access the localstack pod running inside the KIND cluster. 
The network traffic between the host machine and the localstack pod is enabled by mapping the localstack pod port to a host port while setting up the KIND cluster via hack/e2e-test/infrastructure/kind/cluster.yaml\nHow to execute e2e tests with localstack and KIND cluster Run the following make command to spin up a KinD cluster, deploy localstack and run the e2e tests with provider aws:\nmake ci-e2e-kind ","categories":"","description":"","excerpt":"e2e Test Suite Developers can run extended e2e tests, in addition to …","ref":"/docs/other-components/etcd-druid/local-e2e-tests/","tags":"","title":"Local e2e Tests"},{"body":"Local development Purpose Develop new features and fix bugs on the Gardener Dashboard.\nRequirements Yarn. For the required version, refer to .engines.yarn in package.json. Node.js. For the required version, refer to .engines.node in package.json. Steps 1. Clone repository Clone the gardener/dashboard repository\ngit clone git@github.com:gardener/dashboard.git 2. Install dependencies Run yarn at the repository root to install all dependencies.\ncd dashboard yarn 3. Configuration Place the Gardener Dashboard configuration under ${HOME}/.gardener/config.yaml or alternatively set the path to the configuration file using the GARDENER_CONFIG environment variable.\nA local configuration example could look as follows:\nport: 3030 logLevel: debug logFormat: text apiServerUrl: https://my-local-cluster # garden cluster kube-apiserver url - kubectl config view --minify -ojsonpath='{.clusters[].cluster.server}' sessionSecret: c2VjcmV0 # symmetric key used for encryption frontend: dashboardUrl: pathname: /api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy/ defaultHibernationSchedule: evaluation: - start: 00 17 * * 1,2,3,4,5 development: - start: 00 17 * * 1,2,3,4,5 end: 00 08 * * 1,2,3,4,5 production: ~ 4. Run it locally The Gardener Dashboard backend server requires a kubeconfig for the Garden cluster. You can set it e.g. 
by using the KUBECONFIG environment variable.\nIf you want to run the Garden cluster locally, follow the getting started locally documentation. Gardener Dashboard supports the local infrastructure provider that comes with the local Gardener cluster setup. See 5. Login to the dashboard for more information on how to use the Dashboard with a local Gardener or any other Gardener landscape.\nStart the backend server (http://localhost:3030).\ncd backend export KUBECONFIG=/path/to/garden/cluster/kubeconfig.yaml yarn serve To start the frontend server, you have two options for handling the server certificate:\n Recommended Method: Run yarn setup in the frontend directory to generate a new self-signed CA and TLS server certificate before starting the frontend server for the first time. The CA is automatically added to the keychain on macOS. If you prefer not to add it to the keychain, you can use the --skip-keychain flag. For other operating systems, you will need to manually add the generated certificates to the local trust store.\n Alternative Method: If you prefer not to run yarn setup, a temporary self-signed certificate will be generated automatically. This certificate will not be added to the keychain. Note that you will need to click through the insecure warning in your browser to access the dashboard.\n We need to start a TLS dev server because we use cookie names with __Host- prefix. This requires the secure attribute to be set. For more information, see OWASP Host Prefix.\nStart the frontend dev server (https://localhost:8443) with https and hot reload enabled.\ncd frontend # yarn setup yarn serve You can now access the UI on https://localhost:8443/\n5. Login to the dashboard To log in to the dashboard you can either configure oidc, or alternatively log in using a token:\nTo log in using a token, first create a service account.\nkubectl -n garden create serviceaccount dashboard-user Assign it a role, e.g. 
cluster-admin.\nkubectl set subject clusterrolebinding cluster-admin --serviceaccount=garden:dashboard-user Get the token of the service account.\nkubectl -n garden create token dashboard-user --duration 24h Copy the token and login to the dashboard.\nBuild Build docker image locally.\nmake build Push Push docker image to Google Container Registry.\nmake push This command expects a valid gcloud configuration named gardener.\ngcloud config configurations describe gardener is_active: true name: gardener properties: core: account: john.doe@example.org project: johndoe-1008 ","categories":"","description":"","excerpt":"Local development Purpose Develop new feature and fix bug on the …","ref":"/docs/dashboard/local-setup/","tags":"","title":"Local Setup"},{"body":"admission-alicloud admission-alicloud is an admission webhook server which is responsible for the validation of the cloud provider (Alicloud in this case) specific fields and resources. The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.\nFollow the steps below to run the admission webhook server locally.\n Start the Gardener API server.\nFor details, check the Gardener local setup.\n Start the webhook server\nMake sure that the KUBECONFIG environment variable is pointing to the local garden cluster.\nmake start-admission Setup the ValidatingWebhookConfiguration.\nhack/dev-setup-admission-alicloud.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. 
It will also apply the ValidatingWebhookConfiguration manifest.\n./hack/dev-setup-admission-alicloud.sh You are now ready to experiment with the admission-alicloud webhook server locally.\n","categories":"","description":"","excerpt":"admission-alicloud admission-alicloud is an admission webhook server …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/local-setup/","tags":"","title":"Local Setup"},{"body":"admission-aws admission-aws is an admission webhook server which is responsible for the validation of the cloud provider (AWS in this case) specific fields and resources. The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.\nFollow the steps below to run the admission webhook server locally.\n Start the Gardener API server.\nFor details, check the Gardener local setup.\n Start the webhook server\nMake sure that the KUBECONFIG environment variable is pointing to the local garden cluster.\nmake start-admission Setup the ValidatingWebhookConfiguration.\nhack/dev-setup-admission-aws.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. It will also apply the ValidatingWebhookConfiguration manifest.\n./hack/dev-setup-admission-aws.sh You are now ready to experiment with the admission-aws webhook server locally.\n","categories":"","description":"","excerpt":"admission-aws admission-aws is an admission webhook server which is …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/local-setup/","tags":"","title":"Local Setup"},{"body":"admission-azure admission-azure is an admission webhook server which is responsible for the validation of the cloud provider (Azure in this case) specific fields and resources. 
The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.\nFollow the steps below to run the admission webhook server locally.\n Start the Gardener API server.\nFor details, check the Gardener local setup.\n Start the webhook server\nMake sure that the KUBECONFIG environment variable is pointing to the local garden cluster.\nmake start-admission Setup the ValidatingWebhookConfiguration.\nhack/dev-setup-admission-azure.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. It will also apply the ValidatingWebhookConfiguration manifest.\n./hack/dev-setup-admission-azure.sh You are now ready to experiment with the admission-azure webhook server locally.\n","categories":"","description":"","excerpt":"admission-azure admission-azure is an admission webhook server which …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/local-setup/","tags":"","title":"Local Setup"},{"body":"admission-gcp admission-gcp is an admission webhook server which is responsible for the validation of the cloud provider (GCP in this case) specific fields and resources. The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.\nFollow the steps below to run the admission webhook server locally.\n Start the Gardener API server.\nFor details, check the Gardener local setup.\n Start the webhook server\nMake sure that the KUBECONFIG environment variable is pointing to the local garden cluster.\nmake start-admission Setup the ValidatingWebhookConfiguration.\nhack/dev-setup-admission-gcp.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. 
It will also apply the ValidatingWebhookConfiguration manifest.\n./hack/dev-setup-admission-gcp.sh You are now ready to experiment with the admission-gcp webhook server locally.\n","categories":"","description":"","excerpt":"admission-gcp admission-gcp is an admission webhook server which is …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/local-setup/","tags":"","title":"Local Setup"},{"body":"admission-openstack admission-openstack is an admission webhook server which is responsible for the validation of the cloud provider (OpenStack in this case) specific fields and resources. The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.\nFollow the steps below to run the admission webhook server locally.\n Start the Gardener API server.\nFor details, check the Gardener local setup.\n Start the webhook server\nMake sure that the KUBECONFIG environment variable is pointing to the local garden cluster.\nmake start-admission Setup the ValidatingWebhookConfiguration.\nhack/dev-setup-admission-openstack.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. It will also apply the ValidatingWebhookConfiguration manifest.\n./hack/dev-setup-admission-openstack.sh You are now ready to experiment with the admission-openstack webhook server locally.\n","categories":"","description":"","excerpt":"admission-openstack admission-openstack is an admission webhook server …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-openstack/local-setup/","tags":"","title":"Local Setup"},{"body":"Overview Conceptually, all Gardener components are designed to run as a Pod inside a Kubernetes cluster. The Gardener API server extends the Kubernetes API via the user-aggregated API server concepts. 
However, if you want to develop it, you may want to work locally with the Gardener without building a Docker image and deploying it to a cluster each and every time. That means that the Gardener runs outside a Kubernetes cluster which requires providing a Kubeconfig in your local filesystem and point the Gardener to it when starting it (see below).\nFurther details can be found in\n Principles of Kubernetes, and its components Kubernetes Development Guide Architecture of Gardener This guide is split into two main parts:\n Preparing your setup by installing all dependencies and tools Getting the Gardener source code locally Preparing the Setup [macOS only] Installing homebrew The copy-paste instructions in this guide are designed for macOS and use the package manager Homebrew.\nOn macOS run\n/bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\" [macOS only] Installing GNU bash Built-in apple-darwin bash is missing some features that could cause shell scripts to fail locally.\nbrew install bash Installing git We use git as VCS which you need to install. On macOS run\nbrew install git For other OS, please check the Git installation documentation.\nInstalling Go Install the latest version of Go. On macOS run\nbrew install go For other OS, please check Go installation documentation.\nInstalling kubectl Install kubectl. Please make sure that the version of kubectl is at least v1.25.x. On macOS run\nbrew install kubernetes-cli For other OS, please check the kubectl installation documentation.\nInstalling Docker You need to have docker installed and running. On macOS run\nbrew install --cask docker For other OS please check the docker installation documentation.\nInstalling iproute2 iproute2 provides a collection of utilities for network administration and configuration. On macOS run\nbrew install iproute2mac Installing jq jq is a lightweight and flexible command-line JSON processor. 
On macOS run\nbrew install jq Installing yq yq is a lightweight and portable command-line YAML processor. On macOS run\nbrew install yq Installing GNU Parallel GNU Parallel is a shell tool for executing jobs in parallel, used by the code generation scripts (make generate). On macOS run\nbrew install parallel [macOS only] Install GNU Core Utilities When running on macOS, install the GNU core utilities and friends:\nbrew install coreutils gnu-sed gnu-tar grep gzip This will create symbolic links for the GNU utilities with g prefix on your PATH, e.g., gsed or gbase64. To allow using them without the g prefix, add the gnubin directories to the beginning of your PATH environment variable (brew install and brew info will print out instructions for each formula):\nexport PATH=$(brew --prefix)/opt/coreutils/libexec/gnubin:$PATH export PATH=$(brew --prefix)/opt/gnu-sed/libexec/gnubin:$PATH export PATH=$(brew --prefix)/opt/gnu-tar/libexec/gnubin:$PATH export PATH=$(brew --prefix)/opt/grep/libexec/gnubin:$PATH export PATH=$(brew --prefix)/opt/gzip/bin:$PATH [Windows Only] WSL2 Apart from Linux distributions and macOS, the local gardener setup can also run on the Windows Subsystem for Linux 2.\nWhile WSL1, plain docker for Windows and various Linux distributions and local Kubernetes environments may be supported, this setup was verified with:\n WSL2 Docker Desktop WSL2 Engine Ubuntu 18.04 LTS on WSL2 Nodeless local garden (see below) The Gardener repository and all the above-mentioned tools (git, golang, kubectl, …) should be installed in your WSL2 distro, according to the distribution-specific Linux installation instructions.\nGet the Sources Clone the repository from GitHub into your $GOPATH.\nmkdir -p $(go env GOPATH)/src/github.com/gardener cd $(go env GOPATH)/src/github.com/gardener git clone git@github.com:gardener/gardener.git cd gardener Note: Gardener is using Go modules and cloning the repository into $GOPATH is not a hard requirement. 
However, it is still recommended to clone into $GOPATH because k8s.io/code-generator does not work yet outside of $GOPATH - kubernetes/kubernetes#86753.\n Start the Gardener Please see getting_started_locally.md for how to build and deploy Gardener from your local sources.\n","categories":"","description":"","excerpt":"Overview Conceptually, all Gardener components are designed to run as …","ref":"/docs/gardener/local_setup/","tags":"","title":"Local Setup"},{"body":"Preparing the Local Development Setup (Mac OS X) Preparing the Local Development Setup (Mac OS X) Installing Golang environment Installing Docker (Optional) Setup Docker Hub account (Optional) Local development Installing the Machine Controller Manager locally Prepare the cluster Getting started Testing Machine Classes Usage Conceptually, the Machine Controller Manager is designed to run in a container within a Pod inside a Kubernetes cluster. For development purposes, you can run the Machine Controller Manager as a Go process on your local machine. This process connects to your remote cluster to manage VMs for that cluster. That means that the Machine Controller Manager runs outside a Kubernetes cluster, which requires providing a Kubeconfig in your local filesystem and pointing the Machine Controller Manager to it when running it (see below).\nAlthough the following installation instructions are for Mac OS X, similar commands can be found for any Linux distribution.\nInstalling Golang environment Install the latest version of Golang (at least v1.8.3 is required) by using Homebrew:\n$ brew install golang In order to perform linting on the Go source code, install Golint:\n$ go get -u golang.org/x/lint/golint Installing Docker (Optional) In case you want to build Docker images for the Machine Controller Manager you have to install Docker itself. 
We recommend using Docker for Mac OS X which can be downloaded from here.\nSetup Docker Hub account (Optional) Create a Docker hub account at Docker Hub if you don’t already have one.\nLocal development ⚠️ Before you start developing, please ensure to comply with the following requirements:\n You have understood the principles of Kubernetes, and its components, what their purpose is and how they interact with each other. You have understood the architecture of the Machine Controller Manager The development of the Machine Controller Manager could happen by targeting any cluster. You basically need a Kubernetes cluster running on a set of machines. You just need the Kubeconfig file with the required access permissions attached to it.\nInstalling the Machine Controller Manager locally Clone the repository from GitHub.\n$ git clone git@github.com:gardener/machine-controller-manager.git $ cd machine-controller-manager Prepare the cluster Connect to the remote kubernetes cluster where you plan to deploy the Machine Controller Manager using kubectl. Set the environment variable KUBECONFIG to the path of the yaml file containing your cluster info Now, create the required CRDs on the remote cluster using the following command, $ kubectl apply -f kubernetes/crds.yaml Getting started Setup and Restore with Gardener\nSetup\nIn gardener access to static kubeconfig files is no longer supported due to security reasons. One needs to generate short-lived (max TTL = 1 day) admin kube configs for target and control clusters. A convenience script/Makefile target has been provided to do the required initial setup which includes:\n Creating a temporary directory where target and control kubeconfigs will be stored. Create a request to generate the short lived admin kubeconfigs. These are downloaded and stored in the temporary folder created above. In gardener clusters DWD (Dependency Watchdog) runs as an additional component which can interfere when MCM/CA is scaled down. 
To prevent that an annotation dependency-watchdog.gardener.cloud/ignore-scaling is added to machine-controller-manager deployment which prevents DWD from scaling up the deployment replicas. Scales down machine-controller-manager deployment in the control cluster to 0 replica. Creates the required .env file and populates required environment variables which are then used by the Makefile in both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. Copies the generated and downloaded kubeconfig files for the target and control clusters to machine-controller-manager-provider-\u003cprovider-name\u003e project as well. To do the above you can either invoke make gardener-setup or you can directly invoke the script ./hack/gardener_local_setup.sh. If you invoke the script with -h or --help option then it will give you all CLI options that one can pass.\nRestore\nOnce the testing is over you can invoke a convenience script/Makefile target which does the following:\n Removes all generated admin kubeconfig files from both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. Removes the .env file that was generated as part of the setup from both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. Scales up machine-controller-manager deployment in the control cluster back to 1 replica. Removes the annotation dependency-watchdog.gardener.cloud/ignore-scaling that was added to prevent DWD to scale up MCM. To do the above you can either invoke make gardener-restore or you can directly invoke the script ./hack/gardener_local_restore.sh. 
If you invoke the script with -h or --help option then it will give you all CLI options that one can pass.\nSetup and Restore without Gardener\nSetup\nIf you are not running MCM components in a gardener cluster, then it is assumed that there is not going to be any DWD (Dependency Watchdog) component. A convenience script/Makefile target has been provided to do the required initial setup which includes:\n Copies the provided control and target kubeconfig files to machine-controller-manager-provider-\u003cprovider-name\u003e project. Scales down machine-controller-manager deployment in the control cluster to 0 replicas. Creates the required .env file and populates required environment variables which are then used by the Makefile in both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. To do the above you can either invoke make non-gardener-setup or you can directly invoke the script ./hack/non_gardener_local_setup.sh. If you invoke the script with -h or --help option then it will give you all CLI options that one can pass.\nRestore\nOnce the testing is over you can invoke a convenience script/Makefile target which does the following:\n Removes all provided kubeconfig files from both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. Removes the .env file that was generated as part of the setup from both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. Scales up machine-controller-manager deployment in the control cluster back to 1 replica. To do the above you can either invoke make non-gardener-restore or you can directly invoke the script ./hack/non_gardener_local_restore.sh. 
If you invoke the script with -h or --help option then it will give you all CLI options that one can pass.\nOnce the setup is done then you can start the machine-controller-manager as a local process using the following Makefile target:\n$ make start I1227 11:08:19.963638 55523 controllermanager.go:204] Starting shared informers I1227 11:08:20.766085 55523 controller.go:247] Starting machine-controller-manager ⚠️ The file dev/target-kubeconfig.yaml points to the cluster whose nodes you want to manage. dev/control-kubeconfig.yaml points to the cluster from where you want to manage the nodes from. However, dev/control-kubeconfig.yaml is optional.\nThe Machine Controller Manager should now be ready to manage the VMs in your kubernetes cluster.\n⚠️ This is assuming that your MCM is built to manage machines for any in-tree supported providers. There is a new way to deploy and manage out of tree (external) support for providers whose development can be found here\nTesting Machine Classes To test the creation/deletion of a single instance for one particular machine class you can use the managevm cli. The corresponding INFRASTRUCTURE-machine-class.yaml and the INFRASTRUCTURE-secret.yaml need to be defined upfront. 
To build and run it\nGO111MODULE=on go build -o managevm cmd/machine-controller-manager-cli/main.go # create machine ./managevm --secret PATH_TO/INFRASTRUCTURE-secret.yaml --machineclass PATH_TO/INFRASTRUCTURE-machine-class.yaml --classkind INFRASTRUCTURE --machinename test # delete machine ./managevm --secret PATH_TO/INFRASTRUCTURE-secret.yaml --machineclass PATH_TO/INFRASTRUCTURE-machine-class.yaml --classkind INFRASTRUCTURE --machinename test --machineid INFRASTRUCTURE:///REGION/INSTANCE_ID Usage To start using Machine Controller Manager, follow the links given at usage here.\n","categories":"","description":"","excerpt":"Preparing the Local Development Setup (Mac OS X) Preparing the Local …","ref":"/docs/other-components/machine-controller-manager/local_setup/","tags":"","title":"Local Setup"},{"body":"How to Create Log Parser for Container into fluent-bit If our log message is parsed correctly, it should be shown in Plutono like this:\n{\"log\":\"OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io\",\"pid\":\"1\",\"severity\":\"INFO\",\"source\":\"controller.go:107\"} Otherwise, it will look like this:\n{ \"log\":\"{ \\\"level\\\":\\\"info\\\",\\\"ts\\\":\\\"2020-06-01T11:23:26.679Z\\\",\\\"logger\\\":\\\"gardener-resource-manager.health-reconciler\\\",\\\"msg\\\":\\\"Finished ManagedResource health checks\\\",\\\"object\\\":\\\"garden/provider-aws-dsm9r\\\" }\\n\" } } Create a Custom Parser First of all, we need to know what the log for the specific container looks like (for example, let’s take a log from the alertmanager: level=info ts=2019-01-28T12:33:49.362015626Z caller=main.go:175 build_context=\"(go=go1.11.2, user=root@4ecc17c53d26, date=20181109-15:40:48))\n We can see that this log contains 4 subfields (severity=info, timestamp=2019-01-28T12:33:49.362015626Z, source=main.go:175 and the actual message). So we have to write a regex which matches this log in 4 groups (we can use https://regex101.com/ as a helper tool). 
So, for this purpose our regex looks like this:\n ^level=(?\u003cseverity\u003e\\w+)\\s+ts=(?\u003ctime\u003e\\d{4}-\\d{2}-\\d{2}[Tt].*[zZ])\\s+caller=(?\u003csource\u003e[^\\s]*+)\\s+(?\u003clog\u003e.*) Now we have to create correct time format for the timestamp (We can use this site for this purpose: http://ruby-doc.org/stdlib-2.4.1/libdoc/time/rdoc/Time.html#method-c-strptime). So our timestamp matches correctly the following format: %Y-%m-%dT%H:%M:%S.%L It’s time to apply our new regex into fluent-bit configuration. To achieve that we can just deploy in the cluster where the fluent-operator is deployed the following custom resources: apiVersion: fluentbit.fluent.io/v1alpha2 kind: ClusterFilter metadata: labels: fluentbit.gardener/type: seed name: \u003c\u003c pod-name \u003e\u003e--(\u003c\u003c container-name \u003e\u003e) spec: filters: - parser: keyName: log parser: \u003c\u003c container-name \u003e\u003e-parser reserveData: true match: kubernetes.\u003c\u003c pod-name \u003e\u003e*\u003c\u003c container-name \u003e\u003e* EXAMPLE apiVersion: fluentbit.fluent.io/v1alpha2 kind: ClusterFilter metadata: labels: fluentbit.gardener/type: seed name: alertmanager spec: filters: - parser: keyName: log parser: alertmanager-parser reserveData: true match: \"kubernetes.alertmanager*alertmanager*\" Now lets check if there already exists ClusterParser with such a regex and time format that we need. 
If it doesn’t, create one: apiVersion: fluentbit.fluent.io/v1alpha2 kind: ClusterParser metadata: name: \u003c\u003c container-name \u003e\u003e-parser labels: fluentbit.gardener/type: \"seed\" spec: regex: timeKey: time timeFormat: \u003c\u003c time-format \u003e\u003e regex: \"\u003c\u003c regex \u003e\u003e\" EXAMPLE apiVersion: fluentbit.fluent.io/v1alpha2 kind: ClusterParser metadata: name: alertmanager-parser labels: fluentbit.gardener/type: \"seed\" spec: regex: timeKey: time timeFormat: \"%Y-%m-%dT%H:%M:%S.%L\" regex: \"^level=(?\u003cseverity\u003e\\\\w+)\\\\s+ts=(?\u003ctime\u003e\\\\d{4}-\\\\d{2}-\\\\d{2}[Tt].*[zZ])\\\\s+caller=(?\u003csource\u003e[^\\\\s]*+)\\\\s+(?\u003clog\u003e.*)\" Follow your development setup to validate that the parsers are working correctly. ","categories":"","description":"","excerpt":"How to Create Log Parser for Container into fluent-bit If our log …","ref":"/docs/gardener/log_parsers/","tags":"","title":"Log Parsers"},{"body":"Logging in Gardener Components This document aims at providing a general developer guideline on different aspects of logging practices and conventions used in the Gardener codebase. It contains mostly Gardener-specific points, and references other existing and commonly accepted logging guidelines for general advice. Developers and reviewers should consult this guide when writing, refactoring, and reviewing Gardener code. If parts are unclear or new learnings arise, this guide should be adapted accordingly.\nLogging Libraries / Implementations Historically, Gardener components have been using logrus. There is a global logrus logger (logger.Logger) that is initialized by components on startup and used across the codebase. 
In most places, it is used as a printf-style logger and only in some instances we make use of logrus’ structured logging functionality.\nIn the process of migrating our components to native controller-runtime components (see gardener/gardener#4251), we also want to make use of controller-runtime’s built-in mechanisms for streamlined logging. controller-runtime uses logr, a simple structured logging interface, for library-internal logging and logging in controllers.\nlogr itself is only an interface and doesn’t provide an implementation out of the box. Instead, it needs to be backed by a logging implementation like zapr. Code that uses the logr interface is thereby not tied to a specific logging implementation and makes the implementation easily exchangeable. controller-runtime already provides a set of helpers for constructing zapr loggers, i.e., logr loggers backed by zap, which is a popular logging library in the go community. Hence, we are migrating our component logging from logrus to logr (backed by zap) as part of gardener/gardener#4251.\n ⚠️ logger.Logger (logrus logger) is deprecated in Gardener and shall not be used in new code – use logr loggers when writing new code! (also see Migration from logrus to logr)\nℹ️ Don’t use zap loggers directly, always use the logr interface in order to avoid tight coupling to a specific logging implementation.\n gardener-apiserver differs from the other components as it is based on the apiserver library and therefore uses klog – just like kube-apiserver. As gardener-apiserver writes (almost) no logs in our coding (outside the apiserver library), there is currently no plan for switching the logging implementation. Hence, the following sections focus on logging in the controller and admission components only.\nlogcheck Tool To ensure a smooth migration to logr and make logging in Gardener components more consistent, the logcheck tool was added. 
It enforces (parts of) this guideline and detects programmer-level errors early on in order to prevent bugs. Please check out the tool’s documentation for a detailed description.\nStructured Logging Similar to efforts in the Kubernetes project, we want to migrate our component logs to structured logging. As motivated above, we will use the logr interface instead of klog though.\nYou can read more about the motivation behind structured logging in logr’s background and FAQ (also see this blog post by Dave Cheney). Also, make sure to check out controller-runtime’s logging guideline with specifics for projects using the library. The following sections will focus on the most important takeaways from those guidelines and give general instructions on how to apply them to Gardener and its controller-runtime components.\n Note: Some parts in this guideline differ slightly from controller-runtime’s document.\n TL;DR of Structured Logging ❌ Stop using printf-style logging:\nvar logger *logrus.Logger logger.Infof(\"Scaling deployment %s/%s to %d replicas\", deployment.Namespace, deployment.Name, replicaCount) ✅ Instead, write static log messages and enrich them with additional structured information in form of key-value pairs:\nvar logger logr.Logger logger.Info(\"Scaling deployment\", \"deployment\", client.ObjectKeyFromObject(deployment), \"replicas\", replicaCount) Log Configuration Gardener components can be configured to either log in json (default) or text format: json format is supposed to be used in production, while text format might be nicer for development.\n# json {\"level\":\"info\",\"ts\":\"2021-12-16T08:32:21.059+0100\",\"msg\":\"Hello botanist\",\"garden\":\"eden\"} # text 2021-12-16T08:32:21.059+0100 INFO Hello botanist {\"garden\": \"eden\"} Components can be set to one of the following log levels (with increasing verbosity): error, info (default), debug.\nLog Levels logr uses V-levels (numbered log levels), higher V-level means higher verbosity. 
V-levels are relative (in contrast to klog’s absolute V-levels), i.e., V(1) creates a logger, that is one level more verbose than its parent logger.\nIn Gardener components, the mentioned log levels in the component config (error, info, debug) map to the zap levels with the same names (see here). Hence, our loggers follow the same mapping from numerical logr levels to named zap levels like described in zapr, i.e.:\n component config specifies debug ➡️ both V(0) and V(1) are enabled component config specifies info ➡️ V(0) is enabled, V(1) will not be shown component config specifies error ➡️ neither V(0) nor V(1) will be shown Error() logs will always be shown This mapping applies to the components’ root loggers (the ones that are not “derived” from any other logger; constructed on component startup). If you derive a new logger with e.g. V(1), the mapping will shift by one. For example, V(0) will then log at zap’s debug level.\nThere is no warning level (see Dave Cheney’s post). If there is an error condition (e.g., unexpected error received from a called function), the error should either be handled or logged at error if it is neither handled nor returned. If you have an error value at hand that doesn’t represent an actual error condition, but you still want to log it as an informational message, log it at info level with key err.\nWe might consider to make use of a broader range of log levels in the future when introducing more logs and common command line flags for our components (comparable to --v of Kubernetes components). 
For now, we stick to the mentioned two log levels like controller-runtime: info (V(0)) and debug (V(1)).\nLogging in Controllers Named Loggers Controllers should use named loggers that include their name, e.g.:\ncontrollerLogger := rootLogger.WithName(\"controller\").WithName(\"shoot\") controllerLogger.Info(\"Deploying kube-apiserver\") results in\n2021-12-16T09:27:56.550+0100 INFO controller.shoot Deploying kube-apiserver Logger names are hierarchical. You can make use of it, where controllers are composed of multiple “subcontrollers”, e.g., controller.shoot.hibernation or controller.shoot.maintenance.\nUsing the global logger logf.Log directly is discouraged and should be rather exceptional because it makes correlating logs with code harder. Preferably, all parts of the code should use some named logger.\nReconciler Loggers In your Reconcile function, retrieve a logger from the given context.Context. It inherits from the controller’s logger (i.e., is already named) and is preconfigured with name and namespace values for the reconciliation request:\nfunc (r *reconciler) Reconcile(ctx context.Context, request reconcile.Request) (reconcile.Result, error) { log := logf.FromContext(ctx) log.Info(\"Reconciling Shoot\") // ... return reconcile.Result{}, nil } results in\n2021-12-16T09:35:59.099+0100 INFO controller.shoot Reconciling Shoot {\"name\": \"sunflower\", \"namespace\": \"garden-greenhouse\"} The logger is injected by controller-runtime’s Controller implementation. The logger returned by logf.FromContext is never nil. If the context doesn’t carry a logger, it falls back to the global logger (logf.Log), which might discard logs if not configured, but is also never nil.\n ⚠️ Make sure that you don’t overwrite the name or namespace value keys for such loggers, otherwise you will lose information about the reconciled object.\n The controller implementation (controller-runtime) itself takes care of logging the error returned by reconcilers. 
Hence, don’t log an error that you are returning. Generally, functions should not return an error if they already logged it, because that means the error is already handled and not an error anymore. See Dave Cheney’s post for more on this.\nMessages Log messages should be static. Don’t put variable content in there, i.e., no fmt.Sprintf or string concatenation (+). Use key-value pairs instead. Log messages should be capitalized. Note: This contrasts with error messages, which should not be capitalized. However, neither should end with a punctuation mark. Keys and Values Use WithValues instead of repeatedly adding key-value pairs for multiple log statements. WithValues creates a new logger from the parent that carries the given key-value pairs. E.g., use it when acting on one object in multiple steps and logging something for each step:\nlog := parentLog.WithValues(\"infrastructure\", client.ObjectKeyFromObject(infrastructure)) // ... log.Info(\"Creating Infrastructure\") // ... log.Info(\"Waiting for Infrastructure to be reconciled\") // ... Note: WithValues bypasses controller-runtime’s special zap encoder that nicely encodes ObjectKey/NamespacedName and runtime.Object values, see kubernetes-sigs/controller-runtime#1290. Thus, the end result might look different depending on the value and its Stringer implementation.\n Use lowerCamelCase for keys. 
Don’t put spaces in keys, as it will make log processing with simple tools like jq harder.\n Keys should be constant, human-readable, consistent across the codebase and naturally match parts of the log message, see logr guideline.\n When logging object keys (name and namespace), use the object’s type as the log key and a client.ObjectKey/types.NamespacedName value as value, e.g.:\nvar deployment *appsv1.Deployment log.Info(\"Creating Deployment\", \"deployment\", client.ObjectKeyFromObject(deployment)) which results in\n{\"level\":\"info\",\"ts\":\"2021-12-16T08:32:21.059+0100\",\"msg\":\"Creating Deployment\",\"deployment\":{\"name\": \"bar\", \"namespace\": \"foo\"}} There are cases where you don’t have the full object key or the object itself at hand, e.g., if an object references another object (in the same namespace) by name (think secretRef or similar). In such cases, either construct the full object key including the implied namespace or log the object name under a key ending in Name, e.g.:\nvar ( // object to reconcile shoot *gardencorev1beta1.Shoot // retrieved via logf.FromContext, preconfigured by controller with namespace and name of reconciliation request log logr.Logger ) // option a: full object key, manually constructed log.Info(\"Shoot uses SecretBinding\", \"secretBinding\", client.ObjectKey{Namespace: shoot.Namespace, Name: *shoot.Spec.SecretBindingName}) // option b: only name under respective *Name log key log.Info(\"Shoot uses SecretBinding\", \"secretBindingName\", *shoot.Spec.SecretBindingName) Both options result in well-structured logs that are easy to interpret and process:\n{\"level\":\"info\",\"ts\":\"2022-01-18T18:00:56.672+0100\",\"msg\":\"Shoot uses SecretBinding\",\"name\":\"my-shoot\",\"namespace\":\"garden-project\",\"secretBinding\":{\"namespace\":\"garden-project\",\"name\":\"aws\"}} {\"level\":\"info\",\"ts\":\"2022-01-18T18:00:56.673+0100\",\"msg\":\"Shoot uses 
SecretBinding\",\"name\":\"my-shoot\",\"namespace\":\"garden-project\",\"secretBindingName\":\"aws\"} When handling generic client.Object values (e.g. in helper funcs), use object as key.\n When adding timestamps to key-value pairs, use time.Time values. By this, they will be encoded in the same format as the log entry’s timestamp.\nDon’t use metav1.Time values, as they will be encoded in a different format by their Stringer implementation. Pass \u003csomeTimestamp\u003e.Time to loggers in case you have a metav1.Time value at hand.\n Same applies to durations. Use time.Duration values instead of *metav1.Duration. Durations can be handled specially by zap just like timestamps.\n Event recorders not only create Event objects but also log them. However, both Gardener’s manually instantiated event recorders and the ones that controller-runtime provides log to debug level and use generic formats, that are not very easy to interpret or process (no structured logs). Hence, don’t use event recorders as replacements for well-structured logs. If a controller records an event for a completed action or important information, it should probably log it as well, e.g.:\nlog.Info(\"Creating ManagedSeed\", \"replica\", r.GetObjectKey()) a.recorder.Eventf(managedSeedSet, corev1.EventTypeNormal, EventCreatingManagedSeed, \"Creating ManagedSeed %s\", r.GetFullName()) Logging in Test Code If the tested production code requires a logger, you can pass logr.Discard() or logf.NullLogger{} in your test, which simply discards all logs.\n logf.Log is safe to use in tests and will not cause a nil pointer deref, even if it’s not initialized via logf.SetLogger. 
It is initially set to a NullLogger by default, which means all logs are discarded, unless logf.SetLogger is called in the first 30 seconds of execution.\n Pass zap.WriteTo(GinkgoWriter) in tests where you want to see the logs on test failure but not on success, for example:\nlogf.SetLogger(logger.MustNewZapLogger(logger.DebugLevel, logger.FormatJSON, zap.WriteTo(GinkgoWriter))) log := logf.Log.WithName(\"test\") ","categories":"","description":"","excerpt":"Logging in Gardener Components This document aims at providing a …","ref":"/docs/gardener/logging/","tags":"","title":"Logging"},{"body":"Logging and Monitoring for Extensions Gardener provides an integrated logging and monitoring stack for alerting, monitoring, and troubleshooting of its managed components by operators or end users. For further information on how to make use of it in these roles, refer to the corresponding guides for exploring logs and for monitoring with Plutono.\nThe components that constitute the logging and monitoring stack are managed by Gardener. By default, it deploys Prometheus and Alertmanager (managed via prometheus-operator), and Plutono into the garden namespace of all seed clusters. If logging is enabled in the gardenlet configuration (logging.enabled), it will deploy fluent-operator and Vali in the garden namespace too.\nEach shoot namespace hosts managed logging and monitoring components. As part of the shoot reconciliation flow, Gardener deploys a shoot-specific Prometheus, blackbox-exporter, Plutono, and, if configured, an Alertmanager into the shoot namespace, next to the other control plane components. If logging is enabled in the gardenlet configuration (logging.enabled) and the shoot purpose is not testing, it deploys a shoot-specific Vali in the shoot namespace too.\nThe logging and monitoring stack is extensible by configuration. 
Gardener extensions can take advantage of that and contribute monitoring configurations encoded in ConfigMaps for their own, specific dashboards, alerts and other supported assets and integrate with it. As with other Gardener resources, they will be continuously reconciled. The extensions can also deploy directly fluent-operator custom resources which will be created in the seed cluster and plugged into the fluent-bit instance.\nThis guide is about the roles and extensibility options of the logging and monitoring stack components, and how to integrate extensions with:\n Monitoring Logging Monitoring Seed Cluster Cache Prometheus The central Prometheus instance in the garden namespace (called “cache Prometheus”) fetches metrics and data from all seed cluster nodes and all seed cluster pods. It uses the federation concept to allow the shoot-specific instances to scrape only the metrics for the pods of the control plane they are responsible for. This mechanism allows to scrape the metrics for the nodes/pods once for the whole cluster, and to have them distributed afterwards. For more details, continue reading here.\nTypically, this is not necessary, but in case an extension wants to extend the configuration for this cache Prometheus, they can create the prometheus-operator’s custom resources and label them with prometheus=cache, for example:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: cache name: cache-my-component namespace: garden spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics Seed Prometheus Another Prometheus instance in the garden namespace (called “seed Prometheus”) fetches metrics and data from seed system components, kubelets, cAdvisors, and extensions. 
If you want your extension pods to be scraped then they must be annotated with prometheus.io/scrape=true and prometheus.io/port=\u003cmetrics-port\u003e. For more details, continue reading here.\nTypically, this is not necessary, but in case an extension wants to extend the configuration for this seed Prometheus, they can create the prometheus-operator’s custom resources and label them with prometheus=seed, for example:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: seed name: seed-my-component namespace: garden spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics Aggregate Prometheus Another Prometheus instance in the garden namespace (called “aggregate Prometheus”) stores pre-aggregated data from the cache Prometheus and shoot Prometheus. An ingress exposes this Prometheus instance allowing it to be scraped from another cluster. For more details, continue reading here.\nTypically, this is not necessary, but in case an extension wants to extend the configuration for this aggregate Prometheus, they can create the prometheus-operator’s custom resources and label them with prometheus=aggregate, for example:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: aggregate name: aggregate-my-component namespace: garden spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics Plutono A Plutono instance is deployed by gardenlet into the seed cluster’s garden namespace for visualizing monitoring metrics and logs via dashboards. 
In order to provide custom dashboards, create a ConfigMap in the garden namespace labelled with dashboard.monitoring.gardener.cloud/seed=true that contains the respective JSON documents, for example:\napiVersion: v1 kind: ConfigMap metadata: labels: dashboard.monitoring.gardener.cloud/seed: \"true\" name: extension-foo-my-custom-dashboard namespace: garden data: my-custom-dashboard.json: \u003cdashboard-JSON-document\u003e Shoot Cluster Shoot Prometheus The shoot-specific metrics are then made available to operators and users in the shoot Plutono, using the shoot Prometheus as data source.\nExtension controllers might deploy components as part of their reconciliation next to the shoot’s control plane. Examples for this would be a cloud-controller-manager or CSI controller deployments. Extensions that want to have their managed control plane components integrated with monitoring can contribute their per-shoot configuration for scraping Prometheus metrics, Alertmanager alerts or Plutono dashboards.\nExtensions Monitoring Integration In case an extension wants to extend the configuration for the shoot Prometheus, they can create the prometheus-operator’s custom resources and label them with prometheus=shoot.\nServiceMonitor When the component runs in the seed cluster (e.g., as part of the shoot control plane), ServiceMonitor resources should be used:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: shoot name: shoot-my-controlplane-component namespace: shoot--foo--bar spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics In case HTTPS scheme is used, the CA certificate should be provided like this:\nspec: scheme: HTTPS tlsConfig: ca: secret: name: \u003cname-of-ca-bundle-secret\u003e key: bundle.crt In case the component requires credentials when contacting its metrics endpoint, provide them like this:\nspec: 
authorization: credentials: name: \u003cname-of-secret-containing-credentials\u003e key: \u003cdata-keyin-secret\u003e If the component delegates authorization to the kube-apiserver of the shoot cluster, you can use the shoot-access-prometheus-shoot secret:\nspec: authorization: credentials: name: shoot-access-prometheus-shoot key: token # in case the component's server certificate is signed by the cluster CA: scheme: HTTPS tlsConfig: ca: secret: name: \u003cname-of-ca-bundle-secret\u003e key: bundle.crt ScrapeConfigs If the component runs in the shoot cluster itself, metrics are scraped via the kube-apiserver proxy. In this case, Prometheus needs to authenticate itself with the API server. This can be done like this:\napiVersion: monitoring.coreos.com/v1alpha1 kind: ScrapeConfig metadata: labels: prometheus: shoot name: shoot-my-cluster-component namespace: shoot--foo--bar spec: authorization: credentials: name: shoot-access-prometheus-shoot key: token scheme: HTTPS tlsConfig: ca: secret: name: \u003cname-of-ca-bundle-secret\u003e key: bundle.crt kubernetesSDConfigs: - apiServer: https://kube-apiserver authorization: credentials: name: shoot-access-prometheus-shoot key: token followRedirects: true namespaces: names: - kube-system role: endpoints tlsConfig: ca: secret: name: \u003cname-of-ca-bundle-secret\u003e key: bundle.crt cert: {} metricRelabelings: - sourceLabels: - __name__ action: keep regex: ^(metric1|metric2)$ - sourceLabels: - namespace action: keep regex: kube-system relabelings: - action: replace replacement: my-cluster-component targetLabel: job - sourceLabels: [__meta_kubernetes_service_name, __meta_kubernetes_pod_container_port_name] separator: ; regex: my-component-service;metrics replacement: $1 action: keep - sourceLabels: [__meta_kubernetes_endpoint_node_name] separator: ; regex: (.*) targetLabel: node replacement: $1 action: replace - sourceLabels: [__meta_kubernetes_pod_name] separator: ; regex: (.*) targetLabel: pod replacement: $1 action: 
replace - targetLabel: __address__ replacement: kube-apiserver:443 - sourceLabels: [__meta_kubernetes_pod_name, __meta_kubernetes_pod_container_port_number] separator: ; regex: (.+);(.+) targetLabel: __metrics_path__ replacement: /api/v1/namespaces/kube-system/pods/${1}:${2}/proxy/metrics action: replace [!TIP] Developers can make use of the pkg/component/observability/monitoring/prometheus/shoot.ClusterComponentScrapeConfigSpec function in order to generate a ScrapeConfig like above.\n PrometheusRule Similar to ServiceMonitors, PrometheusRules can be created with the prometheus=shoot label:\napiVersion: monitoring.coreos.com/v1 kind: PrometheusRule metadata: labels: prometheus: shoot name: shoot-my-component namespace: shoot--foo--bar spec: groups: - name: my.rules rules: # ... Plutono Dashboards A Plutono instance is deployed by gardenlet into the shoot cluster’s namespace for visualizing monitoring metrics and logs via dashboards. In order to provide custom dashboards, create a ConfigMap in the shoot cluster’s namespace labelled with dashboard.monitoring.gardener.cloud/shoot=true that contains the respective JSON documents, for example:\napiVersion: v1 kind: ConfigMap metadata: labels: dashboard.monitoring.gardener.cloud/shoot: \"true\" name: extension-foo-my-custom-dashboard namespace: shoot--project--name data: my-custom-dashboard.json: \u003cdashboard-JSON-document\u003e Logging In Kubernetes clusters, container logs are non-persistent and do not survive stopped and destroyed containers. Gardener addresses this problem for the components hosted in a seed cluster by introducing its own managed logging solution. It is integrated with the Gardener monitoring stack to have all troubleshooting context in one place.\nGardener logging consists of components in three roles - log collectors and forwarders, log persistency and exploration/consumption interfaces. 
All of them live in the seed clusters in multiple instances:\n Logs are persisted by Vali instances deployed as StatefulSets - one per shoot namespace, if the logging is enabled in the gardenlet configuration (logging.enabled) and the shoot purpose is not testing, and one in the garden namespace. The shoot instances store logs from the control plane components hosted there. The garden Vali instance is responsible for logs from the rest of the seed namespaces - kube-system, garden, extension-*, and others. Fluent-bit DaemonSets deployed by the fluent-operator on each seed node collect logs from it. A custom plugin takes care to distribute the collected log messages to the Vali instances that they are intended for. This allows to fetch the logs once for the whole cluster, and to distribute them afterwards. Plutono is the UI component used to explore monitoring and log data together for easier troubleshooting and in context. Plutono instances are configured to use the corresponding Vali instances, sharing the same namespace as data providers. There is one Plutono Deployment in the garden namespace and one Deployment per shoot namespace (exposed to the end users and to the operators). Logs can be produced from various sources, such as containers or systemd, and in different formats. The fluent-bit design supports configurable data pipeline to address that problem. Gardener provides such configuration for logs produced by all its core managed components as ClusterFilters and ClusterParsers . Extensions can contribute their own, specific configurations as fluent-operator custom resources too. See for example the logging configuration for the Gardener AWS provider extension.\nFluent-bit Log Parsers and Filters To integrate with Gardener logging, extensions can and should specify how fluent-bit will handle the logs produced by the managed components that they contribute to Gardener. 
Normally, that would require to configure a parser for the specific logging format, if none of the available is applicable, and a filter defining how to apply it. For a complete reference for the configuration options, refer to fluent-bit’s documentation.\nTo contribute its own configuration to the fluent-bit agents data pipelines, an extension must deploy a fluent-operator custom resource labeled with fluentbit.gardener/type: seed in the seed cluster.\n Note: Take care to provide the correct data pipeline elements in the corresponding fields and not to mix them.\n Example: Logging configuration for provider-specific cloud-controller-manager deployed into shoot namespaces that reuses the kube-apiserver-parser defined in logging.go to parse the component logs:\napiVersion: fluentbit.fluent.io/v1alpha2 kind: ClusterFilter metadata: labels: fluentbit.gardener/type: \"seed\" name: cloud-controller-manager-aws-cloud-controller-manager spec: filters: - parser: keyName: log parser: kube-apiserver-parser reserveData: true match: kubernetes.*cloud-controller-manager*aws-cloud-controller-manager* Further details how to define parsers and use them with examples can be found in the following guide.\nPlutono The two types of Plutono instances found in a seed cluster are configured to expose logs of different origin in their dashboards:\n Garden Plutono dashboards expose logs from non-shoot namespaces of the seed clusters Pod Logs Extensions Systemd Logs Shoot Plutono dashboards expose logs from the shoot cluster namespace where they belong Kube Apiserver Kube Controller Manager Kube Scheduler Cluster Autoscaler VPA components Kubernetes Pods If the type of logs exposed in the Plutono instances needs to be changed, it is necessary to update the corresponding instance dashboard configurations.\nTips Be careful to create ClusterFilters and ClusterParsers with unique names because they are not namespaced. 
We use pod_name for filters with one container and pod_name--container_name for pods with multiple containers. Be careful to match exactly the log names that you need for a particular parser in your filters configuration. The regular expression you will supply will match names in the form kubernetes.pod_name.\u003cmetadata\u003e.container_name. If there are extensions with the same container and pod names, they will all match the same parser in a filter. That may be a desired effect, if they all share the same log format. But it will be a problem if they don’t. To solve it, either the pod or container names must be unique, and the regular expression in the filter has to match that unique pattern. A recommended approach is to prefix containers with the extension name and tune the regular expression to match it. For example, using myextension-container as container name and a regular expression kubernetes.mypod.*myextension-container will guarantee match of the right log name. Make sure that the regular expression does not match more than you expect. For example, kubernetes.systemd.*systemd.* will match both systemd-service and systemd-monitor-service. You will want to be as specific as possible. It’s a good idea to put the logging configuration into the Helm chart that also deploys the extension controller, while the monitoring configuration can be part of the Helm chart/deployment routine that deploys the component managed by the controller. 
References and Additional Resources GitHub Issue Describing the Concept Exemplary Implementation (Monitoring) for the GCP Provider Exemplary Implementation (ClusterFilter) for the AWS Provider Exemplary Implementation (ClusterParser) for the Shoot DNS Service ","categories":"","description":"","excerpt":"Logging and Monitoring for Extensions Gardener provides an integrated …","ref":"/docs/gardener/extensions/logging-and-monitoring/","tags":"","title":"Logging And Monitoring"},{"body":"Logging Stack Motivation Kubernetes uses the underlying container runtime logging, which does not persist logs for stopped and destroyed containers. This makes it difficult to investigate issues in the very common case of containers that are no longer running. Gardener provides a solution to this problem for the managed cluster components by introducing its own logging stack.\nComponents A Fluent-bit daemonset which works as a log collector and a custom Golang plugin which spreads log messages to their Vali instances. One Vali Statefulset in the garden namespace which contains logs for the seed cluster and one per shoot namespace which contains logs for the shoot’s control plane. One Plutono Deployment in the garden namespace and two Deployments per shoot namespace (one exposed to the end users and one for the operators). Plutono is the UI component used in the logging stack. Container Logs Rotation and Retention Container log rotation in Kubernetes is a subtle but important implementation detail that depends on the type of high-level container runtime in use. When the container runtime is not CRI compliant (such as dockershim), the kubelet does not provide any rotation or retention implementation, leaving those aspects to the downstream components. 
When the container runtime is CRI compliant (such as containerd), the kubelet provides the necessary implementation with two configuration options:\n ContainerLogMaxSize for rotation ContainerLogMaxFiles for retention ContainerD Runtime In this case, it is possible to configure the containerLogMaxSize and containerLogMaxFiles fields in the Shoot specification. Both fields are optional and if nothing is specified, then the kubelet rotates on the size 100M. Those fields are part of the provider’s workers definition. Here is an example:\nspec: provider: workers: - cri: name: containerd kubernetes: kubelet: # accepted values are of resource.Quantity containerLogMaxSize: 150Mi containerLogMaxFiles: 10 The values of the containerLogMaxSize and containerLogMaxFiles fields need to be considered with care since container log files claim disk space from the host. On the other hand, rotating at too small a size may result in frequent rotations which can be missed by other components (log shippers) observing these rotations.\nIn the majority of cases, the defaults should do just fine. Custom configuration might be of use under rare conditions.\nExtension of the Logging Stack The logging stack is extended to scrape logs from the systemd services of each shoot’s nodes and from all Gardener components in the shoot kube-system namespace. These logs are exposed only to the Gardener operators.\nAlso, in the shoot control plane an event-logger pod is deployed, which scrapes events from the shoot kube-system namespace and the shoot control-plane namespace in the seed. The event-logger logs the events to the standard output. Fluent-bit then picks up these events as container logs and sends them to the Vali in the shoot control plane (similar to how it works for any other control plane component). How to Access the Logs The logs are accessible via Plutono. To access them:\n Authenticate via basic auth to gain access to Plutono. 
The Plutono URL can be found in the Logging and Monitoring section of a cluster in the Gardener Dashboard alongside the credentials. The secret containing the credentials is stored in the project namespace following the naming pattern \u003cshoot-name\u003e.monitoring. For Gardener operators, the credentials are also stored in the control-plane (shoot--\u003cproject-name\u003e--\u003cshoot-name\u003e) namespace in the observability-ingress-users-\u003chash\u003e secret in the seed.\n Plutono contains several dashboards that aim to facilitate the work of operators and users. From the Explore tab, users and operators have unlimited abilities to extract and manipulate logs.\n Note: Gardener operators are members of the Gardener team with operator permissions, not operators of the end-user cluster!\n How to Use the Explore Tab If you click on the Log browser \u003e button, you will see all of the available labels. By clicking on a label, you can see all of its available values for the period of time you have specified. If you are searching for logs for the past one hour, do not expect to see labels or values for which there were no logs for that period of time. When you click on a value, Plutono automatically eliminates all other labels and/or values with which no valid log stream can be made. After choosing the right labels and their values, click on the Show logs button. This builds the log query and executes it. This approach is convenient when you don’t know the label names or their values. Once you feel comfortable, you can start to use the LogQL language to search for logs. Next to the Log browser \u003e button is the place where you can type log queries.\nExamples:\n If you want to get logs for the calico-node-\u003chash\u003e pod in the kube-system namespace of the cluster: The name of the node on which calico-node was running is known, but not the hash suffix of the calico-node pod. 
We also want to search for errors in the logs.\n{pod_name=~\"calico-node-.+\", nodename=\"ip-10-222-31-182.eu-central-1.compute.internal\"} |~ \"error\"\nHere, Plutono helps you as much as possible with suggestions and auto-completion.\n If you want to get the logs from the kubelet systemd service of a given node and search for a pod name in the logs:\n{unit=\"kubelet.service\", nodename=\"ip-10-222-31-182.eu-central-1.compute.internal\"} |~ \"pod name\"\n Note: Under the unit label there are only the docker, containerd, kubelet, and kernel logs.\n If you want to get the logs from the gardener-node-agent systemd service of a given node and search for a string in the logs:\n{job=\"systemd-combine-journal\",nodename=\"ip-10-222-31-182.eu-central-1.compute.internal\"} | unpack | unit=\"gardener-node-agent.service\"\n Note: The {job=\"systemd-combine-journal\",nodename=\"\u003cnode name\u003e\"} stream packs all logs from systemd services except docker, containerd, kubelet, and kernel. To filter those logs by unit, you have to unpack them first.\n Retrieving events: If you want to get the events from the shoot kube-system namespace generated by kubelet and related to the node-problem-detector:\n{job=\"event-logging\"} | unpack | origin_extracted=\"shoot\",source=\"kubelet\",object=~\".*node-problem-detector.*\"\n If you want to get the events generated by MCM in the shoot control plane in the seed:\n{job=\"event-logging\"} | unpack | origin_extracted=\"seed\",source=~\".*machine-controller-manager.*\"\n Note: In order to group events by origin, one has to specify origin_extracted because the origin label is reserved for all of the logs from the seed and the event-logger resides in the seed, so all of its logs appear to originate only from the seed. The actual origin is embedded in the unpacked event. 
When unpacked, the embedded origin becomes origin_extracted.\n Expose Logs for Component to User Plutono Exposing logs for a new component to the User’s Plutono is described in the How to Expose Logs to the Users section.\nConfiguration Fluent-bit The Fluent-bit configurations can be found on pkg/component/observability/logging/fluentoperator/customresources There are six different specifications:\n FluentBit: Defines the fluent-bit DaemonSet specifications ClusterFluentBitConfig: Defines the labelselectors of the resources which fluent-bit will match ClusterInput: Defines the location of the input stream of the logs ClusterOutput: Defines the location of the output source (Vali for example) ClusterFilter: Defines filters which match specific keys ClusterParser: Defines parsers which are used by the filters Vali The Vali configurations can be found on charts/seed-bootstrap/charts/vali/templates/vali-configmap.yaml\nThe main specifications there are:\n Index configuration: Currently the following one is used: schema_config: configs: - from: 2018-04-15 store: boltdb object_store: filesystem schema: v11 index: prefix: index_ period: 24h from: Is the date from which logs collection is started. Using a date in the past is okay. store: The DB used for storing the index. object_store: Where the data is stored. schema: Schema version which should be used (v11 is currently recommended). index.prefix: The prefix for the index. index.period: The period for updating the indices. Adding a new index happens with new config block definition. The from field should start from the current day + previous index.period and should not overlap with the current index. 
The prefix also should be different.\n schema_config: configs: - from: 2018-04-15 store: boltdb object_store: filesystem schema: v11 index: prefix: index_ period: 24h - from: 2020-06-18 store: boltdb object_store: filesystem schema: v11 index: prefix: index_new_ period: 24h chunk_store_config Configuration chunk_store_config: max_look_back_period: 336h chunk_store_config.max_look_back_period should be the same as the retention_period\n table_manager Configuration table_manager: retention_deletes_enabled: true retention_period: 336h table_manager.retention_period is the living time for each log message. Vali will keep messages for (table_manager.retention_period - index.period) time due to specification in the Vali implementation.\nPlutono This is the Vali configuration that Plutono uses:\n - name: vali type: vali access: proxy url: http://logging.{{ .Release.Namespace }}.svc:3100 jsonData: maxLines: 5000 name: Is the name of the datasource. type: Is the type of the datasource. access: Should be set to proxy. url: Vali’s url svc: Vali’s port jsonData.maxLines: The limit of the log messages which Plutono will show to the users. 
Decrease this value if the browser works slowly!\n","categories":"","description":"","excerpt":"Logging Stack Motivation Kubernetes uses the underlying container …","ref":"/docs/gardener/logging-usage/","tags":"","title":"Logging Usage"},{"body":"Creating/Deleting machines (VM) Creating/Deleting machines (VM) Setting up your usage environment Important : Creating machine Inspect status of machine Delete machine Setting up your usage environment Follow the steps described here Important : Make sure that the kubernetes/machine_objects/machine.yaml points to the same class name as the kubernetes/machine_classes/aws-machine-class.yaml.\n Similarly kubernetes/machine_objects/aws-machine-class.yaml secret name and namespace should be same as that mentioned in kubernetes/secrets/aws-secret.yaml\n Creating machine Modify kubernetes/machine_objects/machine.yaml as per your requirement and create the VM as shown below: $ kubectl apply -f kubernetes/machine_objects/machine.yaml You should notice that the Machine Controller Manager has immediately picked up your manifest and started to create a new machine by talking to the cloud provider.\n Check Machine Controller Manager machines in the cluster $ kubectl get machine NAME STATUS AGE test-machine Running 5m A new machine is created with the name provided in the kubernetes/machine_objects/machine.yaml file.\n After a few minutes (~3 minutes for AWS), you should notice a new node joining the cluster. You can verify this by running: $ kubectl get nodes NAME STATUS AGE VERSION ip-10-250-14-52.eu-east-1.compute.internal. 
Ready 1m v1.8.0 This shows that a new node has successfully joined the cluster.\nInspect status of machine To inspect the status of any created machine, run the command given below.\n$ kubectl get machine test-machine -o yaml apiVersion: machine.sapcloud.io/v1alpha1 kind: Machine metadata: annotations: kubectl.kubernetes.io/last-applied-configuration: | {\"apiVersion\":\"machine.sapcloud.io/v1alpha1\",\"kind\":\"Machine\",\"metadata\":{\"annotations\":{},\"labels\":{\"test-label\":\"test-label\"},\"name\":\"test-machine\",\"namespace\":\"\"},\"spec\":{\"class\":{\"kind\":\"AWSMachineClass\",\"name\":\"test-aws\"}}} clusterName: \"\" creationTimestamp: 2017-12-27T06:58:21Z finalizers: - machine.sapcloud.io/operator generation: 0 initializers: null labels: node: ip-10-250-14-52.eu-east-1.compute.internal test-label: test-label name: test-machine namespace: \"\" resourceVersion: \"12616948\" selfLink: /apis/machine.sapcloud.io/v1alpha1/test-machine uid: 535e596c-ead3-11e7-a6c0-828f843e4186 spec: class: kind: AWSMachineClass name: test-aws providerID: aws:///eu-east-1/i-00bef3f2618ffef23 status: conditions: - lastHeartbeatTime: 2017-12-27T07:00:46Z lastTransitionTime: 2017-12-27T06:59:16Z message: kubelet has sufficient disk space available reason: KubeletHasSufficientDisk status: \"False\" type: OutOfDisk - lastHeartbeatTime: 2017-12-27T07:00:46Z lastTransitionTime: 2017-12-27T06:59:16Z message: kubelet has sufficient memory available reason: KubeletHasSufficientMemory status: \"False\" type: MemoryPressure - lastHeartbeatTime: 2017-12-27T07:00:46Z lastTransitionTime: 2017-12-27T06:59:16Z message: kubelet has no disk pressure reason: KubeletHasNoDiskPressure status: \"False\" type: DiskPressure - lastHeartbeatTime: 2017-12-27T07:00:46Z lastTransitionTime: 2017-12-27T07:00:06Z message: kubelet is posting ready status reason: KubeletReady status: \"True\" type: Ready currentStatus: lastUpdateTime: 2017-12-27T07:00:06Z phase: Running lastOperation: description: Machine 
is now ready lastUpdateTime: 2017-12-27T07:00:06Z state: Successful type: Create node: ip-10-250-14-52.eu-west-1.compute.internal Delete machine Delete the VM using kubernetes/machine_objects/machine.yaml as shown below:\n$ kubectl delete -f kubernetes/machine_objects/machine.yaml Now the Machine Controller Manager picks up the manifest immediately and starts to delete the existing VM by talking to the cloud provider. The node should be detached from the cluster in a few minutes (~1min for AWS).\n","categories":"","description":"","excerpt":"Creating/Deleting machines (VM) Creating/Deleting machines (VM) …","ref":"/docs/other-components/machine-controller-manager/machine/","tags":"","title":"Machine"},{"body":"machine-controller-manager-provider-local Out of tree (controller-based) implementation for local as a new provider. The local out-of-tree provider implements the interface defined at MCM OOT driver.\nFundamental Design Principles Following are the basic principles kept in mind while developing the external plugin.\n Communication between this Machine Controller (MC) and Machine Controller Manager (MCM) is achieved using the Kubernetes native declarative approach. Machine Controller (MC) behaves as the controller used to interact with the local provider and manage the VMs corresponding to the machine objects. Machine Controller Manager (MCM) deals with higher-level objects such as machine-set and machine-deployment objects. 
","categories":"","description":"","excerpt":"machine-controller-manager-provider-local Out of tree …","ref":"/docs/gardener/extensions/machine-controller-provider-local/","tags":"","title":"Machine Controller Provider Local"},{"body":"Maintaining machine replicas using machines-deployments Maintaining machine replicas using machines-deployments Setting up your usage environment Important ⚠️ Creating machine-deployment Inspect status of machine-deployment Health monitoring Update your machines Inspect existing cluster configuration Perform a rolling update Re-check cluster configuration More variants of updates Undo an update Pause an update Delete machine-deployment Setting up your usage environment Follow the steps described here\nImportant ⚠️ Make sure that the kubernetes/machine_objects/machine-deployment.yaml points to the same class name as the kubernetes/machine_classes/aws-machine-class.yaml.\n Similarly kubernetes/machine_classes/aws-machine-class.yaml secret name and namespace should be same as that mentioned in kubernetes/secrets/aws-secret.yaml\n Creating machine-deployment Modify kubernetes/machine_objects/machine-deployment.yaml as per your requirement. Modify the number of replicas to the desired number of machines. Then, create an machine-deployment. 
$ kubectl apply -f kubernetes/machine_objects/machine-deployment.yaml Now the Machine Controller Manager picks up the manifest immediately and starts to create new machines based on the number of replicas you have provided in the manifest.\n Check Machine Controller Manager machine-deployments in the cluster $ kubectl get machinedeployment NAME READY DESIRED UP-TO-DATE AVAILABLE AGE test-machine-deployment 3 3 3 0 10m You will notice a new machine-deployment with your given name\n Check Machine Controller Manager machine-sets in the cluster $ kubectl get machineset NAME DESIRED CURRENT READY AGE test-machine-deployment-5bc6dd7c8f 3 3 0 10m You will notice a new machine-set backing your machine-deployment\n Check Machine Controller Manager machines in the cluster $ kubectl get machine NAME STATUS AGE test-machine-deployment-5bc6dd7c8f-5d24b Pending 5m test-machine-deployment-5bc6dd7c8f-6mpn4 Pending 5m test-machine-deployment-5bc6dd7c8f-dpt2q Pending 5m Now you will notice N (number of replicas specified in the manifest) new machines whose names are prefixed with the machine-deployment object name that you created.\n After a few minutes (~3 minutes for AWS), you should see that new nodes have joined the cluster. 
You can see this using $ kubectl get nodes NAME STATUS AGE VERSION ip-10-250-20-19.eu-west-1.compute.internal Ready 1m v1.8.0 ip-10-250-27-123.eu-west-1.compute.internal Ready 1m v1.8.0 ip-10-250-31-80.eu-west-1.compute.internal Ready 1m v1.8.0 This shows how new nodes have joined your cluster\nInspect status of machine-deployment To inspect the status of any created machine-deployment run the command below,\n$ kubectl get machinedeployment test-machine-deployment -o yaml You should get the following output.\napiVersion: machine.sapcloud.io/v1alpha1 kind: MachineDeployment metadata: annotations: deployment.kubernetes.io/revision: \"1\" kubectl.kubernetes.io/last-applied-configuration: | {\"apiVersion\":\"machine.sapcloud.io/v1alpha1\",\"kind\":\"MachineDeployment\",\"metadata\":{\"annotations\":{},\"name\":\"test-machine-deployment\",\"namespace\":\"\"},\"spec\":{\"minReadySeconds\":200,\"replicas\":3,\"selector\":{\"matchLabels\":{\"test-label\":\"test-label\"}},\"strategy\":{\"rollingUpdate\":{\"maxSurge\":1,\"maxUnavailable\":1},\"type\":\"RollingUpdate\"},\"template\":{\"metadata\":{\"labels\":{\"test-label\":\"test-label\"}},\"spec\":{\"class\":{\"kind\":\"AWSMachineClass\",\"name\":\"test-aws\"}}}}} clusterName: \"\" creationTimestamp: 2017-12-27T08:55:56Z generation: 0 initializers: null name: test-machine-deployment namespace: \"\" resourceVersion: \"12634168\" selfLink: /apis/machine.sapcloud.io/v1alpha1/test-machine-deployment uid: c0b488f7-eae3-11e7-a6c0-828f843e4186 spec: minReadySeconds: 200 replicas: 3 selector: matchLabels: test-label: test-label strategy: rollingUpdate: maxSurge: 1 maxUnavailable: 1 type: RollingUpdate template: metadata: creationTimestamp: null labels: test-label: test-label spec: class: kind: AWSMachineClass name: test-aws status: availableReplicas: 3 conditions: - lastTransitionTime: 2017-12-27T08:57:22Z lastUpdateTime: 2017-12-27T08:57:22Z message: Deployment has minimum availability. 
reason: MinimumReplicasAvailable status: \"True\" type: Available readyReplicas: 3 replicas: 3 updatedReplicas: 3 Health monitoring Health monitor is also applied similar to how it’s described for machine-sets\nUpdate your machines Let us consider the scenario where you wish to update all nodes of your cluster from t2.xlarge machines to m5.xlarge machines. Assume that your current test-aws has its spec.machineType: t2.xlarge and your deployment test-machine-deployment points to this AWSMachineClass.\nInspect existing cluster configuration Check Nodes present in the cluster $ kubectl get nodes NAME STATUS AGE VERSION ip-10-250-20-19.eu-west-1.compute.internal Ready 1m v1.8.0 ip-10-250-27-123.eu-west-1.compute.internal Ready 1m v1.8.0 ip-10-250-31-80.eu-west-1.compute.internal Ready 1m v1.8.0 Check Machine Controller Manager machine-sets in the cluster. You will notice one machine-set backing your machine-deployment $ kubectl get machineset NAME DESIRED CURRENT READY AGE test-machine-deployment-5bc6dd7c8f 3 3 3 10m Login to your cloud provider (AWS). In the VM management console, you will find N VMs created of type t2.xlarge. Perform a rolling update To update this machine-deployment VMs to m5.xlarge, we would do the following:\n Copy your existing aws-machine-class.yaml cp kubernetes/machine_classes/aws-machine-class.yaml kubernetes/machine_classes/aws-machine-class-new.yaml Modify aws-machine-class-new.yaml, and update its metadata.name: test-aws2 and spec.machineType: m5.xlarge Now create this modified MachineClass kubectl apply -f kubernetes/machine_classes/aws-machine-class-new.yaml Edit your existing machine-deployment kubectl edit machinedeployment test-machine-deployment Update from spec.template.spec.class.name: test-aws to spec.template.spec.class.name: test-aws2 Re-check cluster configuration After a few minutes (~3mins)\n Check nodes present in cluster now. They are different nodes. 
$ kubectl get nodes NAME STATUS AGE VERSION ip-10-250-11-171.eu-west-1.compute.internal Ready 4m v1.8.0 ip-10-250-17-213.eu-west-1.compute.internal Ready 5m v1.8.0 ip-10-250-31-81.eu-west-1.compute.internal Ready 5m v1.8.0 Check Machine Controller Manager machine-sets in the cluster. You will notice two machine-sets backing your machine-deployment $ kubectl get machineset NAME DESIRED CURRENT READY AGE test-machine-deployment-5bc6dd7c8f 0 0 0 1h test-machine-deployment-86ff45cc5 3 3 3 20m Login to your cloud provider (AWS). In the VM management console, you will find N VMs created of type t2.xlarge in terminated state, and N new VMs of type m5.xlarge in running state. This shows how a rolling update of a cluster from nodes with t2.xlarge to m5.xlarge went through.\nMore variants of updates The above demonstration was a simple use case. This could be more complex like - updating the system disk image versions/ kubelet versions/ security patches etc. You can also play around with the maxSurge and maxUnavailable fields in machine-deployment.yaml You can also change the update strategy from rollingupdate to recreate Undo an update Edit the existing machine-deployment $ kubectl edit machinedeployment test-machine-deployment Edit the deployment to have this new field of spec.rollbackTo.revision: 0 as shown as comments in kubernetes/machine_objects/machine-deployment.yaml This will undo your update to the previous version. 
Pause an update You can also pause the update while update is going on by editing the existing machine-deployment $ kubectl edit machinedeployment test-machine-deployment Edit the deployment to have this new field of spec.paused: true as shown as comments in kubernetes/machine_objects/machine-deployment.yaml\n This will pause the rollingUpdate if it’s in process\n To resume the update, edit the deployment as mentioned above and remove the field spec.paused: true updated earlier\n Delete machine-deployment To delete the VM using the kubernetes/machine_objects/machine-deployment.yaml $ kubectl delete -f kubernetes/machine_objects/machine-deployment.yaml The Machine Controller Manager picks up the manifest and starts to delete the existing VMs by talking to the cloud provider. The nodes should be detached from the cluster in a few minutes (~1min for AWS).\n","categories":"","description":"","excerpt":"Maintaining machine replicas using machines-deployments Maintaining …","ref":"/docs/other-components/machine-controller-manager/machine_deployment/","tags":"","title":"Machine Deployment"},{"body":"Machine Error code handling Notational Conventions The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” are to be interpreted as described in RFC 2119 (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, March 1997).\nThe key words “unspecified”, “undefined”, and “implementation-defined” are to be interpreted as described in the rationale for the C99 standard.\nAn implementation is not compliant if it fails to satisfy one or more of the MUST, REQUIRED, or SHALL requirements for the protocols it implements. 
An implementation is compliant if it satisfies all the MUST, REQUIRED, and SHALL requirements for the protocols it implements.\nTerminology Term Definition CR Custom Resource (CR) is defined by a cluster admin using the Kubernetes Custom Resource Definition primitive. VM A Virtual Machine (VM) provisioned and managed by a provider. It could also refer to a physical machine in case of a bare metal provider. Machine Machine refers to a VM that is provisioned/managed by MCM. It typically describes the metadata used to store/represent a Virtual Machine Node Native kubernetes Node object. The objects you get to see when you do a “kubectl get nodes”. Although nodes can be either physical/virtual machines, for the purposes of our discussions it refers to a VM. MCM Machine Controller Manager (MCM) is the controller used to manage higher level Machine Custom Resource (CR) such as machine-set and machine-deployment CRs. Provider/Driver/MC Provider (or) Driver (or) Machine Controller (MC) is the driver responsible for managing machine objects present in the cluster from whom it manages these machines. A simple example could be creation/deletion of VM on the provider. Pre-requisite MachineClass Resources MCM introduces the CRD MachineClass. This is a blueprint for creating machines that join a certain cluster as nodes in a certain role. The provider only works with MachineClass resources that have the structure described here.\nProviderSpec The MachineClass resource contains a providerSpec field that is passed in the ProviderSpec request field to CMI methods such as CreateMachine. The ProviderSpec can be thought of as a machine template from which the VM specification must be adopted. It can contain key-value pairs of these specs. An example for these key-value pairs are given below.\n Parameter Mandatory Type Description vmPool Yes string VM pool name, e.g. TEST-WOKER-POOL size Yes string VM size, e.g. xsmall, small, etc. Each size maps to a number of CPUs and memory size. 
rootFsSize No int Root (/) filesystem size in GB tags Yes map Tags to be put on the created VM Most of the ProviderSpec fields are not mandatory. If not specified, the provider passes an empty value in the respective Create VM parameter.\nThe tags can be used to map a VM to its corresponding machine object’s Name\nThe ProviderSpec is validated by methods that receive it as a request field for presence of all mandatory parameters and tags, and for validity of all parameters.\nSecrets The MachineClass resource also contains a secretRef field that contains a reference to a secret. The keys of this secret are passed in the Secrets request field to CMI methods.\nThe secret can contain sensitive data such as\n cloud-credentials secret data used to authenticate at the provider cloud-init scripts used to initialize a new VM. The cloud-init script is expected to contain scripts to initialize the Kubelet and make it join the cluster. Identifying Cluster Machines To implement certain methods, the provider should be able to identify all machines associated with a particular Kubernetes cluster. This can be achieved using one/more of the below mentioned ways:\n Names of VMs created by the provider are prefixed by the cluster ID specified in the ProviderSpec. VMs created by the provider are tagged with the special tags like kubernetes.io/cluster (for the cluster ID) and kubernetes.io/role (for the role), specified in the ProviderSpec. Mapping Resource Groups to individual cluster. Error Scheme All provider API calls defined in this spec MUST return a machine error status, which is very similar to standard machine status.\nMachine Provider Interface The provider MUST have a unique way to map a machine object to a VM which triggers the deletion for the corresponding VM backing the machine object. The provider SHOULD have a unique way to map the ProviderSpec of a machine-class to a unique Cluster. This avoids deletion of other machines, not backed by the MCM. 
CreateMachine A Provider is REQUIRED to implement this interface method. This interface method will be called by the MCM to provision a new VM on behalf of the requesting machine object.\n This call requests the provider to create a VM backing the machine-object.\n If a VM backing the Machine.Name already exists, and is compatible with the specified Machine object in the CreateMachineRequest, the Provider MUST reply 0 OK with the corresponding CreateMachineResponse.\n The provider can OPTIONALLY make use of the MachineClass supplied in the CreateMachineRequest to communicate with the provider.\n The provider can OPTIONALLY make use of the secrets supplied in the Secret field of the CreateMachineRequest to communicate with the provider.\n The provider can OPTIONALLY make use of the Status.LastKnownState in the Machine object to decode the state of the VM operation based on the last known state of the VM. This can be useful to restart/continue operations which are meant to be atomic.\n The provider MUST have a unique way to map a machine object to a VM. This could be implicitly provided by the provider by letting you set VM-names (or) could be explicitly specified by the provider using appropriate tags to map the same.\n This operation SHOULD be idempotent.\n The CreateMachineResponse returned by this method is expected to return\n ProviderID that uniquely identifies the VM at the provider. This is expected to match with the node.Spec.ProviderID on the node object. NodeName that is the expected name of the machine when it joins the cluster. It must match with the node name. LastKnownState is an OPTIONAL field that can store details of the last known state of the VM. It can be used by future operation calls to determine current infrastructure state. This state is saved on the machine object. 
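The idempotency and adoption rules above can be sketched with an in-memory stand-in for a provider. Everything here is an assumption for illustration (`fakeProvider`, `vm`, the `size` comparison used as the compatibility check, and the `fake:///` ProviderID format); only the codes 0 OK and 6 ALREADY_EXISTS come from this document.

```go
package main

import "fmt"

// Machine codes used in this sketch (values from the CreateMachine error table).
const (
	codeOK            = 0
	codeAlreadyExists = 6
)

// vm is a hypothetical record of a VM held by the fake provider.
type vm struct {
	providerID string
	size       string
}

// fakeProvider is an in-memory stand-in; a real provider would call a cloud API.
type fakeProvider struct {
	vms    map[string]vm // keyed by machine name
	nextID int
}

// createMachine returns the ProviderID of the VM backing the machine name
// together with a machine code.
func (p *fakeProvider) createMachine(name, size string) (string, int) {
	if existing, ok := p.vms[name]; ok {
		if existing.size == size {
			// A compatible VM already exists: adopt it and reply 0 OK.
			return existing.providerID, codeOK
		}
		// The VM exists but does not match the requested parameters.
		return "", codeAlreadyExists
	}
	p.nextID++
	id := fmt.Sprintf("fake:///vm-%d", p.nextID)
	p.vms[name] = vm{providerID: id, size: size}
	return id, codeOK
}

func main() {
	p := &fakeProvider{vms: map[string]vm{}}
	first, _ := p.createMachine("test-machine", "small")
	second, code := p.createMachine("test-machine", "small")
	fmt.Println(first == second, code)

	_, code = p.createMachine("test-machine", "large")
	fmt.Println(code)
}
```

Repeating the call with the same name and size returns the same ProviderID, which is what lets the MCM retry a create without leaking duplicate VMs.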
// CreateMachine call is responsible for VM creation on the provider CreateMachine(context.Context, *CreateMachineRequest) (*CreateMachineResponse, error)// CreateMachineRequest is the create request for VM creation type CreateMachineRequest struct {\t// Machine object from whom VM is to be created \tMachine *v1alpha1.Machine\t// MachineClass backing the machine object \tMachineClass *v1alpha1.MachineClass\t// Secret backing the machineClass object \tSecret *corev1.Secret}// CreateMachineResponse is the create response for VM creation type CreateMachineResponse struct {\t// ProviderID is the unique identification of the VM at the cloud provider. \t// ProviderID typically matches with the node.Spec.ProviderID on the node object. \t// Eg: gce://project-name/region/vm-ID \tProviderID string\t// NodeName is the name of the node-object registered to kubernetes. \tNodeName string\t// LastKnownState represents the last state of the VM during a creation/deletion error \tLastKnownState string}CreateMachine Errors If the provider is unable to complete the CreateMachine call successfully, it MUST return a non-ok machine code in the machine status. If the conditions defined below are encountered, the provider MUST return the specified machine error code. The MCM MUST implement the specified error recovery behavior when it encounters the machine error code.\n machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call was successful in creating/adopting a VM that matches supplied creation request. The CreateMachineResponse is returned with desired values N 1 CANCELED Cancelled Call was cancelled. Perform any pending clean-up tasks and return the call N 2 UNKNOWN Something went wrong Not enough information on what went wrong Retry operation after some time Y 3 INVALID_ARGUMENT Re-check supplied parameters Re-check the supplied Machine.Name and ProviderSpec. Make sure all parameters are in permitted range of values. 
Exact issue to be given in .message Update providerSpec to fix issues. N 4 DEADLINE_EXCEEDED Timeout The call processing exceeded supplied deadline Retry operation after some time Y 6 ALREADY_EXISTS Already exists but desired parameters don’t match Parameters of the existing VM don’t match the ProviderSpec Create machine with a different name N 7 PERMISSION_DENIED Insufficient permissions The requestor doesn’t have enough permissions to create a VM and its required dependencies Update requestor permissions to grant the same N 8 RESOURCE_EXHAUSTED Resource limits have been reached The requestor doesn’t have enough resource limits to process this creation request Enhance resource limits associated with the user/account to process this N 9 PRECONDITION_FAILED VM is in inconsistent state The VM is in a state that is invalid for this operation Manual intervention might be needed to fix the state of the VM N 10 ABORTED Operation is pending Indicates that there is already an operation pending for the specified machine Wait until previous pending operation is processed Y 11 OUT_OF_RANGE Resources were out of range The requested number of CPUs, memory size, or FS size in ProviderSpec falls outside of the corresponding valid range Update request parameters to make valid resource requests N 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Retry with an alternate logic or implement this method at the provider. Most methods by default are in this state N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. Needs manual intervention to fix this N 14 UNAVAILABLE Not Available Unavailable indicates the service is currently unavailable. 
Retry operation after some time Y 16 UNAUTHENTICATED Missing provider credentials Request does not have valid authentication credentials for the operation Fix the provider credentials N The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nInitializeMachine A Provider can OPTIONALLY implement this driver call. Otherwise, it should return an UNIMPLEMENTED status in error.\nThis interface method will be called by the MCM to initialize a new VM just after creation. This can be used to configure network settings, etc.\n This call requests the provider to initialize a newly created VM backing the machine-object. The InitializeMachineResponse returned by this method is expected to return ProviderID that uniquely identifies the VM at the provider. This is expected to match with the node.Spec.ProviderID on the node object. NodeName that is the expected name of the machine when it joins the cluster. It must match with the node name. // InitializeMachine call is responsible for VM initialization on the provider. InitializeMachine(context.Context, *InitializeMachineRequest) (*InitializeMachineResponse, error)// InitializeMachineRequest encapsulates params for the VM Initialization operation (Driver.InitializeMachine). type InitializeMachineRequest struct {\t// Machine object representing VM that must be initialized \tMachine *v1alpha1.Machine\t// MachineClass backing the machine object \tMachineClass *v1alpha1.MachineClass\t// Secret backing the machineClass object \tSecret *corev1.Secret}// InitializeMachineResponse is the response for VM instance initialization (Driver.InitializeMachine). type InitializeMachineResponse struct {\t// ProviderID is the unique identification of the VM at the cloud provider. \t// ProviderID typically matches with the node.Spec.ProviderID on the node object. 
\t// Eg: gce://project-name/region/vm-ID \tProviderID string\t// NodeName is the name of the node-object registered to kubernetes. \tNodeName string}InitializeMachine Errors If the provider is unable to complete the InitializeMachine call successfully, it MUST return a non-ok machine code in the machine status.\nIf the conditions defined below are encountered, the provider MUST return the specified machine error code. The MCM MUST implement the specified error recovery behavior when it encounters the machine error code.\n machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call was successful in initializing a VM that matches supplied initialization request. The InitializeMachineResponse is returned with desired values N 5 NOT_FOUND Timeout VM Instance for Machine isn’t found at provider Skip Initialization and Continue N 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Skip Initialization and continue N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. Needs investigation and possible intervention to fix this Y 17 UNINITIALIZED Failed Initialization VM Instance could not be initialized Initialization is reattempted in next reconcile cycle Y The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nDeleteMachine A Provider is REQUIRED to implement this driver call. 
This driver call will be called by the MCM to deprovision/delete/terminate a VM backed by the requesting machine object.\n If a VM corresponding to the specified machine-object’s name does not exist or the artifacts associated with the VM do not exist anymore (after deletion), the Provider MUST reply 0 OK.\n The provider SHALL only act on machines belonging to the cluster-id/cluster-name obtained from the ProviderSpec.\n The provider can OPTIONALLY make use of the secrets supplied in the Secrets map in the DeleteMachineRequest to communicate with the provider.\n The provider can OPTIONALLY make use of the Spec.ProviderID map in the Machine object.\n The provider can OPTIONALLY make use of the Status.LastKnownState in the Machine object to decode the state of the VM operation based on the last known state of the VM. This can be useful to restart/continue operations which are meant to be atomic.\n This operation SHOULD be idempotent.\n The provider must have a unique way to map a machine object to a VM which triggers the deletion for the corresponding VM backing the machine object.\n The DeleteMachineResponse returned by this method is expected to return\n LastKnownState is an OPTIONAL field that can store details of the last known state of the VM. It can be used by future operation calls to determine current infrastructure state. This state is saved on the machine object. 
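The rule that deleting a missing VM still replies 0 OK can be sketched as follows. The in-memory `fakeProvider` and its `deleteMachine` method are hypothetical names for illustration only; a real provider would call its cloud API and would also restrict itself to VMs of its own cluster.

```go
package main

import "fmt"

// codeOK is the machine code 0 OK from the DeleteMachine error table.
const codeOK = 0

// fakeProvider is an in-memory stand-in for a cloud provider.
type fakeProvider struct {
	vms map[string]string // machine name -> ProviderID
}

// deleteMachine removes the VM backing the machine name.
// A missing VM is not an error: the desired end state (no VM) already holds.
func (p *fakeProvider) deleteMachine(name string) int {
	delete(p.vms, name) // deleting an absent key is a no-op in Go
	return codeOK
}

func main() {
	p := &fakeProvider{vms: map[string]string{"test-machine": "fake:///vm-1"}}
	fmt.Println(p.deleteMachine("test-machine"), len(p.vms))
	// A second call finds nothing to delete and still succeeds.
	fmt.Println(p.deleteMachine("test-machine"), len(p.vms))
}
```

Treating "already gone" as success is what makes a retried delete safe after a timeout or a controller restart.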
// DeleteMachine call is responsible for VM deletion/termination on the provider DeleteMachine(context.Context, *DeleteMachineRequest) (*DeleteMachineResponse, error)// DeleteMachineRequest is the delete request for VM deletion type DeleteMachineRequest struct {\t// Machine object from whom VM is to be deleted \tMachine *v1alpha1.Machine\t// MachineClass backing the machine object \tMachineClass *v1alpha1.MachineClass\t// Secret backing the machineClass object \tSecret *corev1.Secret}// DeleteMachineResponse is the delete response for VM deletion type DeleteMachineResponse struct {\t// LastKnownState represents the last state of the VM during a creation/deletion error \tLastKnownState string}DeleteMachine Errors If the provider is unable to complete the DeleteMachine call successfully, it MUST return a non-ok machine code in the machine status. If the conditions defined below are encountered, the provider MUST return the specified machine error code.\n machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call was successful in deleting a VM that matches supplied deletion request. N 1 CANCELED Cancelled Call was cancelled. Perform any pending clean-up tasks and return the call N 2 UNKNOWN Something went wrong Not enough information on what went wrong Retry operation after some time Y 3 INVALID_ARGUMENT Re-check supplied parameters Re-check the supplied Machine.Name and make sure that it is in the desired format and not a blank value. Exact issue to be given in .message Update Machine.Name to fix issues. 
N 4 DEADLINE_EXCEEDED Timeout The call processing exceeded supplied deadline Retry operation after some time Y 7 PERMISSION_DENIED Insufficient permissions The requestor doesn’t have enough permissions to delete a VM and its required dependencies Update requestor permissions to grant the same N 9 PRECONDITION_FAILED VM is in inconsistent state The VM is in a state that is invalid for this operation Manual intervention might be needed to fix the state of the VM N 10 ABORTED Operation is pending Indicates that there is already an operation pending for the specified machine Wait until previous pending operation is processed Y 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Retry with an alternate logic or implement this method at the provider. Most methods by default are in this state N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. Needs manual intervention to fix this N 14 UNAVAILABLE Not Available Unavailable indicates the service is currently unavailable. Retry operation after some time Y 16 UNAUTHENTICATED Missing provider credentials Request does not have valid authentication credentials for the operation Fix the provider credentials N The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nGetMachineStatus A Provider can OPTIONALLY implement this driver call. Otherwise, it should return an UNIMPLEMENTED status in error. This call will be invoked by the MC to get the status of a machine. 
This optional driver call helps in optimizing the working of the provider by avoiding unwanted calls to CreateMachine() and DeleteMachine().\n If a VM corresponding to the specified machine object’s Machine.Name exists on the provider, the GetMachineStatusResponse fields are to be filled similar to the CreateMachineResponse. The provider SHALL only act on machines belonging to the cluster-id/cluster-name obtained from the ProviderSpec. The provider can OPTIONALLY make use of the secrets supplied in the Secrets map in the GetMachineStatusRequest to communicate with the provider. The provider can OPTIONALLY make use of the VM unique ID (returned by the provider on machine creation) passed in the ProviderID map in the GetMachineStatusRequest. This operation MUST be idempotent. // GetMachineStatus call gets the status of the VM backing the machine object on the provider GetMachineStatus(context.Context, *GetMachineStatusRequest) (*GetMachineStatusResponse, error)// GetMachineStatusRequest is the get request for VM info type GetMachineStatusRequest struct {\t// Machine object from whom VM status is to be fetched \tMachine *v1alpha1.Machine\t// MachineClass backing the machine object \tMachineClass *v1alpha1.MachineClass\t// Secret backing the machineClass object \tSecret *corev1.Secret}// GetMachineStatusResponse is the get response for VM info type GetMachineStatusResponse struct {\t// ProviderID is the unique identification of the VM at the cloud provider. \t// ProviderID typically matches with the node.Spec.ProviderID on the node object. \t// Eg: gce://project-name/region/vm-ID \tProviderID string\t// NodeName is the name of the node-object registered to kubernetes. \tNodeName string}GetMachineStatus Errors If the provider is unable to complete the GetMachineStatus call successfully, it MUST return a non-ok machine code in the machine status. 
If the conditions defined below are encountered, the provider MUST return the specified machine error code.\n machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call was successful in getting machine details for given machine Machine.Name N 1 CANCELED Cancelled Call was cancelled. Perform any pending clean-up tasks and return the call N 2 UNKNOWN Something went wrong Not enough information on what went wrong Retry operation after some time Y 3 INVALID_ARGUMENT Re-check supplied parameters Re-check the supplied Machine.Name and make sure that it is in the desired format and not a blank value. Exact issue to be given in .message Update Machine.Name to fix issues. N 4 DEADLINE_EXCEEDED Timeout The call processing exceeded supplied deadline Retry operation after some time Y 5 NOT_FOUND Machine isn’t found at provider The machine could not be found at provider Not required N 7 PERMISSION_DENIED Insufficient permissions The requestor doesn’t have enough permissions to get details for the VM and its required dependencies Update requestor permissions to grant the same N 9 PRECONDITION_FAILED VM is in inconsistent state The VM is in a state that is invalid for this operation Manual intervention might be needed to fix the state of the VM N 11 OUT_OF_RANGE Multiple VMs found Multiple VMs found with matching machine object names Orphan VM handler to clean up orphan VMs / Manual intervention may be required if orphan VM handler isn’t enabled. Y 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Retry with an alternate logic or implement this method at the provider. Most methods by default are in this state N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. 
Needs manual intervention to fix this N 14 UNAVAILABLE Not Available Unavailable indicates the service is currently unavailable. Retry operation after some time Y 16 UNAUTHENTICATED Missing provider credentials Request does not have valid authentication credentials for the operation Fix the provider credentials N 17 UNINITIALIZED Failed Initialization VM Instance could not be initialized Initialization is reattempted in next reconcile cycle N The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nListMachines A Provider can OPTIONALLY implement this driver call. Otherwise, it should return an UNIMPLEMENTED status in error. The Provider SHALL return the information about all the machines associated with the MachineClass. Make sure to use appropriate filters to achieve the same to avoid data transfer overheads. This optional driver call helps in cleaning up orphan VMs present in the cluster. If not implemented, any orphan VM that might have been created incorrectly by the MCM/Provider (due to bugs in code/infra) might require manual clean up.\n If the Provider succeeded in returning a list of Machine.Name with their corresponding ProviderID, then return 0 OK. The ListMachineResponse contains a map of MachineList whose Key is expected to contain the ProviderID \u0026 Value is expected to contain the Machine.Name corresponding to its kubernetes machine CR object The provider can OPTIONALLY make use of the secrets supplied in the Secrets map in the ListMachinesRequest to communicate with the provider. 
// ListMachines lists all the machines that might have been created by the supplied machineClass ListMachines(context.Context, *ListMachinesRequest) (*ListMachinesResponse, error)// ListMachinesRequest is the request object to get a list of VMs belonging to a machineClass type ListMachinesRequest struct {\t// MachineClass object \tMachineClass *v1alpha1.MachineClass\t// Secret backing the machineClass object \tSecret *corev1.Secret}// ListMachinesResponse is the response object of the list of VMs belonging to a machineClass type ListMachinesResponse struct {\t// MachineList is the map of list of machines. Format for the map should be \u003cProviderID, MachineName\u003e. \tMachineList map[string]string}ListMachines Errors If the provider is unable to complete the ListMachines call successfully, it MUST return a non-ok machine code in the machine status. If the conditions defined below are encountered, the provider MUST return the specified machine error code. The MCM MUST implement the specified error recovery behavior when it encounters the machine error code.\n machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call for listing all VMs associated with ProviderSpec was successful. N 1 CANCELED Cancelled Call was cancelled. Perform any pending clean-up tasks and return the call N 2 UNKNOWN Something went wrong Not enough information on what went wrong Retry operation after sometime Y 3 INVALID_ARGUMENT Re-check supplied parameters Re-check the supplied ProviderSpec and make sure that all required fields are present in their desired value format. Exact issue to be given in .message Update ProviderSpec to fix issues. 
N 4 DEADLINE_EXCEEDED Timeout The call processing exceeded supplied deadline Retry operation after some time Y 7 PERMISSION_DENIED Insufficient permissions The requestor doesn’t have enough permissions to list VMs and its required dependencies Update requestor permissions to grant the same N 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Retry with an alternate logic or implement this method at the provider. Most methods by default are in this state N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. Needs manual intervention to fix this N 14 UNAVAILABLE Not Available Unavailable indicates the service is currently unavailable. Retry operation after some time Y 16 UNAUTHENTICATED Missing provider credentials Request does not have valid authentication credentials for the operation Fix the provider credentials N The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nGetVolumeIDs A Provider can OPTIONALLY implement this driver call. Otherwise, it should return an UNIMPLEMENTED status in error. This driver call will be called by the MCM to get the VolumeIDs for the list of PersistentVolumes (PVs) supplied. This OPTIONAL (but recommended) driver call helps in serialized eviction of pods with PVs while draining machines. This implies applications backed by PVs would be evicted one by one, leading to shorter application downtimes.\n On successfully returning a list of Volume-IDs for all supplied PVSpecs, the Provider MUST reply 0 OK. The GetVolumeIDsResponse is expected to return a repeated list of strings consisting of the VolumeIDs for PVSpec that could be extracted. 
If for any PV the Provider wasn’t able to identify the Volume-ID, the provider MAY chose to ignore it and return the Volume-IDs for the rest of the PVs for whom the Volume-ID was found. Getting the VolumeID from the PVSpec depends on the Cloud-provider. You can extract this information by parsing the PVSpec based on the ProviderType https://github.com/kubernetes/api/blob/release-1.15/core/v1/types.go#L297-L339 https://github.com/kubernetes/api/blob/release-1.15//core/v1/types.go#L175-L257 This operation MUST be idempotent. // GetVolumeIDsRequest is the request object to get a list of VolumeIDs for a PVSpec type GetVolumeIDsRequest struct {\t// PVSpecsList is a list of PV specs for whom volume-IDs are required \t// Plugin should parse this raw data into pre-defined list of PVSpecs \tPVSpecs []*corev1.PersistentVolumeSpec}// GetVolumeIDsResponse is the response object of the list of VolumeIDs for a PVSpec type GetVolumeIDsResponse struct {\t// VolumeIDs is a list of VolumeIDs. \tVolumeIDs []string}GetVolumeIDs Errors machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call getting list of VolumeIDs for the list of PersistentVolumes was successful. N 1 CANCELED Cancelled Call was cancelled. Perform any pending clean-up tasks and return the call N 2 UNKNOWN Something went wrong Not enough information on what went wrong Retry operation after sometime Y 3 INVALID_ARGUMENT Re-check supplied parameters Re-check the supplied PVSpecList and make sure that it is in the desired format. Exact issue to be given in .message Update PVSpecList to fix issues. N 4 DEADLINE_EXCEEDED Timeout The call processing exceeded supplied deadline Retry operation after sometime Y 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Retry with an alternate logic or implement this method at the provider. 
Most methods by default are in this state N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. Needs manual intervention to fix this N 14 UNAVAILABLE Not Available Unavailable indicates the service is currently unavailable. Retry operation after some time Y The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nGenerateMachineClassForMigration A Provider SHOULD implement this driver call, else it MUST return an UNIMPLEMENTED status in error. This driver call will be called by the Machine Controller to try to perform a machineClass migration for an unknown machineClass Kind. This helps in migration of one kind of machineClass to another kind. For instance, a machineClass custom resource of AWSMachineClass to MachineClass.\n On successful generation of machine class the Provider MUST reply 0 OK (or) nil error. GenerateMachineClassForMigrationRequest expects the provider-specific machine class (e.g. AWSMachineClass) to be supplied as the ProviderSpecificMachineClass. The provider is responsible for unmarshalling the golang struct. It also passes a reference to an existing MachineClass object. The provider is expected to fill in this MachineClass object based on the conversions. An optional ClassSpec containing the type ClassSpec struct is also provided to decode the provider info. GenerateMachineClassForMigration is only responsible for filling up the passed MachineClass object. The task of creating the new CR of the new kind (MachineClass) with the same name as the previous one and also annotating the old machineClass CR with a migrated annotation and migrating existing references is done by the calling library implicitly. This operation MUST be idempotent. 
// GenerateMachineClassForMigrationRequest is the request for generating the generic machineClass // for the provider specific machine class type GenerateMachineClassForMigrationRequest struct {\t// ProviderSpecificMachineClass is provider specific machine class object. \t// E.g. AWSMachineClass \tProviderSpecificMachineClass interface{}\t// MachineClass is the machine class object generated that is to be filled up \tMachineClass *v1alpha1.MachineClass\t// ClassSpec contains the class spec object to determine the machineClass kind \tClassSpec *v1alpha1.ClassSpec}// GenerateMachineClassForMigrationResponse is the response for generating the generic machineClass // for the provider specific machine class type GenerateMachineClassForMigrationResponse struct{}MigrateMachineClass Errors machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful Migration of provider specific machine class was successful Machine reconciliation is retried once the new class has been created Y 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this provider. None N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. Might need manual intervention to fix this Y The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nConfiguration and Operation Supervised Lifecycle Management For Providers packaged in software form: Provider Packages SHOULD use a well-documented container image format (e.g., Docker, OCI). The chosen package image format MAY expose configurable Provider properties as environment variables, unless otherwise indicated in the section below. Variables so exposed SHOULD be assigned default values in the image manifest. 
A Provider Supervisor MAY programmatically evaluate or otherwise scan a Provider Package’s image manifest in order to discover configurable environment variables. A Provider SHALL NOT assume that an operator or Provider Supervisor will scan an image manifest for environment variables. Environment Variables Variables defined by this specification SHALL be identifiable by their MC_ name prefix. Configuration properties not defined by the MC specification SHALL NOT use the same MC_ name prefix; this prefix is reserved for common configuration properties defined by the MC specification. The Provider Supervisor SHOULD supply all RECOMMENDED MC environment variables to a Provider. The Provider Supervisor SHALL supply all REQUIRED MC environment variables to a Provider. Logging Providers SHOULD generate log messages to ONLY standard output and/or standard error. In this case the Provider Supervisor SHALL assume responsibility for all log lifecycle management. Provider implementations that deviate from the above recommendation SHALL clearly and unambiguously document the following: Logging configuration flags and/or variables, including working sample configurations. Default log destination(s) (where do the logs go if no configuration is specified?) Log lifecycle management ownership and related guidance (size limits, rate limits, rolling, archiving, expunging, etc.) applicable to the logging mechanism embedded within the Provider. Providers SHOULD NOT write potentially sensitive data to logs (e.g. secrets). Available Services Provider Packages MAY support all or a subset of CMI services; service combinations MAY be configurable at runtime by the Provider Supervisor. This specification does not dictate the mechanism by which mode of operation MUST be discovered, and instead places that burden upon the VM Provider. Misconfigured provider software SHOULD fail-fast with an OS-appropriate error code. 
Linux Capabilities Providers SHOULD clearly document any additionally required capabilities and/or security context. Cgroup Isolation A Provider MAY be constrained by cgroups. Resource Requirements VM Providers SHOULD unambiguously document all of a Provider’s resource requirements. Deploying Recommended: The MCM and Provider are typically expected to run as two containers inside a common Pod. However, for security reasons they could run in separate Pods, provided they have a secure way to exchange data between them. ","categories":"","description":"","excerpt":"Machine Error code handling Notational Conventions The keywords …","ref":"/docs/other-components/machine-controller-manager/machine_error_codes/","tags":"","title":"Machine Error Codes"},{"body":"Maintaining machine replicas using machines-sets Maintaining machine replicas using machines-sets Setting up your usage environment Important ⚠️ Creating machine-set Inspect status of machine-set Health monitoring Delete machine-set Setting up your usage environment Follow the steps described here Important ⚠️ Make sure that the kubernetes/machine_objects/machine-set.yaml points to the same class name as the kubernetes/machine_classes/aws-machine-class.yaml.\n Similarly, the secret name and namespace in kubernetes/machine_classes/aws-machine-class.yaml should be the same as those in kubernetes/secrets/aws-secret.yaml.\n Creating machine-set Modify kubernetes/machine_objects/machine-set.yaml as per your requirement. You can modify the number of replicas to the desired number of machines. 
Then, create a machine-set: $ kubectl apply -f kubernetes/machine_objects/machine-set.yaml You should notice that the Machine Controller Manager has immediately picked up your manifest and started to create new machines based on the number of replicas you have provided in the manifest.\n Check Machine Controller Manager machine-sets in the cluster $ kubectl get machineset NAME DESIRED CURRENT READY AGE test-machine-set 3 3 0 1m You will see a new machine-set with your given name\n Check Machine Controller Manager machines in the cluster: $ kubectl get machine NAME STATUS AGE test-machine-set-b57zs Pending 5m test-machine-set-c4bg8 Pending 5m test-machine-set-kvskg Pending 5m Now you will see N (number of replicas specified in the manifest) new machines whose names are prefixed with the machine-set object name that you created.\n After a few minutes (~3 minutes for AWS), you should notice new nodes joining the cluster. You can verify this by running: $ kubectl get nodes NAME STATUS AGE VERSION ip-10-250-0-234.eu-west-1.compute.internal Ready 3m v1.8.0 ip-10-250-15-98.eu-west-1.compute.internal Ready 3m v1.8.0 ip-10-250-6-21.eu-west-1.compute.internal Ready 2m v1.8.0 This shows how new nodes have joined your cluster\nInspect status of machine-set To inspect the status of any created machine-set run the following command: $ kubectl get machineset test-machine-set -o yaml apiVersion: machine.sapcloud.io/v1alpha1 kind: MachineSet metadata: annotations: kubectl.kubernetes.io/last-applied-configuration: | {\"apiVersion\":\"machine.sapcloud.io/v1alpha1\",\"kind\":\"MachineSet\",\"metadata\":{\"annotations\":{},\"name\":\"test-machine-set\",\"namespace\":\"\",\"test-label\":\"test-label\"},\"spec\":{\"minReadySeconds\":200,\"replicas\":3,\"selector\":{\"matchLabels\":{\"test-label\":\"test-label\"}},\"template\":{\"metadata\":{\"labels\":{\"test-label\":\"test-label\"}},\"spec\":{\"class\":{\"kind\":\"AWSMachineClass\",\"name\":\"test-aws\"}}}}} clusterName: \"\" 
creationTimestamp: 2017-12-27T08:37:42Z finalizers: - machine.sapcloud.io/operator generation: 0 initializers: null name: test-machine-set namespace: \"\" resourceVersion: \"12630893\" selfLink: /apis/machine.sapcloud.io/v1alpha1/test-machine-set uid: 3469faaa-eae1-11e7-a6c0-828f843e4186 spec: machineClass: {} minReadySeconds: 200 replicas: 3 selector: matchLabels: test-label: test-label template: metadata: creationTimestamp: null labels: test-label: test-label spec: class: kind: AWSMachineClass name: test-aws status: availableReplicas: 3 fullyLabeledReplicas: 3 machineSetCondition: null lastOperation: lastUpdateTime: null observedGeneration: 0 readyReplicas: 3 replicas: 3 Health monitoring If you try to delete/terminate any of the machines backing the machine-set by either talking to the Machine Controller Manager or from the cloud provider, the Machine Controller Manager recreates a matching healthy machine to replace the deleted machine. Similarly, if any of your machines are unreachable or in an unhealthy state (kubelet not ready / disk pressure) for longer than the configured timeout (~5 minutes), the Machine Controller Manager recreates the nodes to replace the unhealthy nodes. Delete machine-set To delete the VMs using the kubernetes/machine_objects/machine-set.yaml: $ kubectl delete -f kubernetes/machine_objects/machine-set.yaml Now the Machine Controller Manager has immediately picked up your manifest and started to delete the existing VMs by talking to the cloud provider. Your nodes should be detached from the cluster in a few minutes (~1min for AWS).\n","categories":"","description":"","excerpt":"Maintaining machine replicas using machines-sets Maintaining machine …","ref":"/docs/other-components/machine-controller-manager/machine_set/","tags":"","title":"Machine Set"},{"body":"Manage certificates with Gardener for public domain Introduction Dealing with applications on Kubernetes which offer secure service endpoints (e.g. 
HTTPS) also requires you to enable secure communication via SSL/TLS. With the certificate extension enabled, Gardener can manage commonly trusted X.509 certificates for your application endpoint. Beyond initially requesting certificates, it also handles their timely renewal using the free Let’s Encrypt API.\nThere are two scenarios in which you can use the certificate extension:\n You want to use a certificate for a subdomain of the shoot’s default DNS (see .spec.dns.domain of your shoot resource, e.g. short.ingress.shoot.project.default-domain.gardener.cloud). If this is your case, please see Manage certificates with Gardener for default domain You want to use a certificate for a custom domain. If this is your case, please keep reading this article. Prerequisites Before you start this guide there are a few requirements you need to fulfill:\n You have an existing shoot cluster Your custom domain is under a public top level domain (e.g. .com) Your custom zone is resolvable with a public resolver via the internet (e.g. 8.8.8.8) You have a custom DNS provider configured and working (see “DNS Providers”) As part of the Let’s Encrypt ACME challenge validation process, Gardener sets a DNS TXT entry and Let’s Encrypt checks if it can both resolve and authenticate it. Therefore, it’s important that your DNS-entries are publicly resolvable. You can check this by querying e.g. Google’s public DNS server and if it returns an entry your DNS is publicly visible:\n# returns the A record for cert-example.example.com using Google's DNS server (8.8.8.8) dig cert-example.example.com @8.8.8.8 A DNS provider In order to issue certificates for a custom domain you need to specify a DNS provider which is permitted to create DNS records for subdomains of your requested domain in the certificate. For example, if you request a certificate for host.example.com your DNS provider must be capable of managing subdomains of host.example.com.\nDNS providers are normally specified in the shoot manifest. 
To learn more on how to configure one, please see the DNS provider documentation.\nIssue a certificate Every X.509 certificate is represented by a Kubernetes custom resource certificate.cert.gardener.cloud in your cluster. A Certificate resource may be used to initiate a new certificate request as well as to manage its lifecycle. Gardener’s certificate service regularly checks the expiration timestamp of Certificates, triggers a renewal process if necessary and replaces the existing X.509 certificate with a new one.\n Your application should be able to reload replaced certificates in a timely manner to avoid service disruptions.\n Certificates can be requested via 3 resources type\n Ingress Service (type LoadBalancer) Gateways (both Istio gateways and from the Gateway API) Certificate (Gardener CRD) If either of the first 2 are used, a corresponding Certificate resource will be created automatically.\nUsing an Ingress Resource apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed # Optional but recommended, this is going to create the DNS entry at the same time dns.gardener.cloud/class: garden dns.gardener.cloud/ttl: \"600\" #cert.gardener.cloud/commonname: \"*.example.com\" # optional, if not specified the first name from spec.tls[].hosts is used as common name #cert.gardener.cloud/dnsnames: \"\" # optional, if not specified the names from spec.tls[].hosts are used #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to 
specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" spec: tls: - hosts: # Must not exceed 64 characters. - amazing.example.com # Certificate and private key reside in this secret. secretName: tls-secret rules: - host: amazing.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Replace the hosts and rules[].host value again with your own domain and adjust the remaining Ingress attributes in accordance with your deployment (e.g. the above is for an istio Ingress controller and forwards traffic to a service1 on port 80).\nUsing a Service of type LoadBalancer apiVersion: v1 kind: Service metadata: annotations: cert.gardener.cloud/secretname: tls-secret dns.gardener.cloud/dnsnames: example.example.com dns.gardener.cloud/class: garden # Optional dns.gardener.cloud/ttl: \"600\" cert.gardener.cloud/commonname: \"*.example.example.com\" cert.gardener.cloud/dnsnames: \"\" #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" name: test-service namespace: default spec: ports: - name: http port: 80 protocol: 
TCP targetPort: 8080 type: LoadBalancer Using a Gateway resource Please see Istio Gateways or Gateway API for details.\nUsing the custom Certificate resource apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: commonName: amazing.example.com secretRef: name: tls-secret namespace: default # Optional if using the default issuer issuerRef: name: garden # If delegated domain for DNS01 challenge should be used. This has only an effect if a CNAME record is set for # '_acme-challenge.amazing.example.com'. # For example: If a CNAME record exists '_acme-challenge.amazing.example.com' =\u003e '_acme-challenge.writable.domain.com', # the DNS challenge will be written to '_acme-challenge.writable.domain.com'. #followCNAME: true # optionally set labels for the secret #secretLabels: # key1: value1 # key2: value2 # Optionally specify the preferred certificate chain: if the CA offers multiple certificate chains, prefer the chain with an issuer matching this Subject Common Name. If no match, the default offered chain will be used. #preferredChain: \"ISRG Root X1\" # Optionally specify algorithm and key size for private key. Allowed algorithms: \"RSA\" (allowed sizes: 2048, 3072, 4096) and \"ECDSA\" (allowed sizes: 256, 384) # If not specified, RSA with 2048 is used. #privateKey: # algorithm: ECDSA # size: 384 Supported attributes Here is a list of all supported annotations regarding the certificate extension:\n Path Annotation Value Required Description N/A cert.gardener.cloud/purpose: managed Yes when using annotations Flag for Gardener that this specific Ingress or Service requires a certificate spec.commonName cert.gardener.cloud/commonname: E.g. “*.demo.example.com” or “special.example.com” Certificate and Ingress : No Service: Yes, if DNS names unset Specifies for which domain the certificate request will be created. If not specified, the names from spec.tls[].hosts are used. 
This entry must comply with the 64 character limit. spec.dnsNames cert.gardener.cloud/dnsnames: E.g. “special.example.com” Certificate and Ingress : No Service: Yes, if common name unset Additional domains the certificate should be valid for (Subject Alternative Name). If not specified, the names from spec.tls[].hosts are used. Entries in this list can be longer than 64 characters. spec.secretRef.name cert.gardener.cloud/secretname: any-name Yes for certificate and Service Specifies the secret which contains the certificate/key pair. If the secret is not available yet, it’ll be created automatically as soon as the certificate has been issued. spec.issuerRef.name cert.gardener.cloud/issuer: E.g. gardener No Specifies the issuer you want to use. Only necessary if you request certificates for custom domains. N/A cert.gardener.cloud/revoked: true otherwise always false No Use only to revoke a certificate, see reference for more details spec.followCNAME cert.gardener.cloud/follow-cname E.g. true No Specifies that the usage of a delegated domain for DNS challenges is allowed. Details see Follow CNAME. spec.preferredChain cert.gardener.cloud/preferred-chain E.g. ISRG Root X1 No Specifies the Common Name of the issuer for selecting the certificate chain. Details see Preferred Chain. spec.secretLabels cert.gardener.cloud/secret-labels for annotation use e.g. key1=value1,key2=value2 No Specifies labels for the certificate secret. spec.privateKey.algorithm cert.gardener.cloud/private-key-algorithm RSA, ECDSA No Specifies algorithm for private key generation. The default value is depending on configuration of the extension (default of the default is RSA). You may request a new certificate without privateKey settings to find out the concrete defaults in your Gardener. spec.privateKey.size cert.gardener.cloud/private-key-size \"256\", \"384\", \"2048\", \"3072\", \"4096\" No Specifies size for private key generation. Allowed values for RSA are 2048, 3072, and 4096. 
For ECDSA allowed values are 256 and 384. The default values depend on the configuration of the extension (defaults of the default values are 3072 for RSA and 384 for ECDSA respectively). Request a wildcard certificate In order to avoid the creation of multiple certificates, one for every single endpoint, you may want to create a wildcard certificate for your shoot’s default cluster.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/commonname: \"*.example.com\" spec: tls: - hosts: - amazing.example.com secretName: tls-secret rules: - host: amazing.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Please note that this can also be achieved by directly adding an annotation to a Service type LoadBalancer. You could also create a Certificate object with a wildcard domain.\nUsing a custom Issuer Most Gardener deployments with the certificate extension enabled have a preconfigured garden issuer. It is also usually configured to use Let’s Encrypt as the certificate provider.\nIf you need a custom issuer for a specific cluster, please see Using a custom Issuer\nQuotas For security reasons there may be a default quota on the certificate requests per day set globally in the controller registration of the shoot-cert-service.\nThe default quota only applies if there is no explicit quota defined for the issuer itself with the field requestsPerDayQuota, e.g.:\nkind: Shoot ... 
spec: extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 kind: CertConfig issuers: - email: your-email@example.com name: custom-issuer # issuer name must be specified in every custom issuer request, must not be \"garden\" server: 'https://acme-v02.api.letsencrypt.org/directory' requestsPerDayQuota: 10 DNS Propagation As stated before, cert-manager uses the ACME challenge protocol to authenticate that you are the DNS owner for the domain’s certificate you are requesting. This works by creating a DNS TXT record in your DNS provider under _acme-challenge.example.example.com containing a token to compare with. The TXT record is only applied during the domain validation. Typically, the record is propagated within a few minutes. But if the record is not visible to the ACME server for any reasons, the certificate request is retried again after several minutes. This means you may have to wait up to one hour after the propagation problem has been resolved before the certificate request is retried. 
Take a look at the events with kubectl describe ingress example for troubleshooting.\nCharacter Restrictions Because the common name is restricted to 64 characters, you may have to leave the common name unset for longer domain names and list them under dnsNames instead.\nFor example, the following request is invalid:\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-invalid namespace: default spec: commonName: morethan64characters.ingress.shoot.project.default-domain.gardener.cloud But it is valid to request a certificate for this domain if you have left the common name unset:\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: dnsNames: - morethan64characters.ingress.shoot.project.default-domain.gardener.cloud References Gardener cert-management Managing DNS with Gardener ","categories":"","description":"Use the Gardener cert-management to get fully managed, publicly trusted TLS certificates","excerpt":"Use the Gardener cert-management to get fully managed, publicly …","ref":"/docs/guides/networking/certificate-extension/","tags":["task"],"title":"Manage Certificates with Gardener"},{"body":"Manage certificates with Gardener for default domain Introduction Dealing with applications on Kubernetes which offer secure service endpoints (e.g. HTTPS) also requires you to enable secure communication via SSL/TLS. With the certificate extension enabled, Gardener can manage commonly trusted X.509 certificates for your application endpoint. Beyond initially requesting certificates, it also handles their timely renewal using the free Let’s Encrypt API.\nThere are two scenarios in which you can use the certificate extension:\n You want to use a certificate for a subdomain of the shoot’s default DNS (see .spec.dns.domain of your shoot resource, e.g. short.ingress.shoot.project.default-domain.gardener.cloud). If this is your case, please keep reading this article. You want to use a certificate for a custom domain. 
If this is your case, please see Manage certificates with Gardener for public domain Prerequisites Before you start this guide there are a few requirements you need to fulfill:\n You have an existing shoot cluster Since you are using the default DNS name, all DNS configuration should already be done and ready.\nIssue a certificate Every X.509 certificate is represented by a Kubernetes custom resource certificate.cert.gardener.cloud in your cluster. A Certificate resource may be used to initiate a new certificate request as well as to manage its lifecycle. Gardener’s certificate service regularly checks the expiration timestamp of Certificates, triggers a renewal process if necessary and replaces the existing X.509 certificate with a new one.\n Your application should be able to reload replaced certificates in a timely manner to avoid service disruptions.\n Certificates can be requested via 3 resources type\n Ingress Service (type LoadBalancer) certificate (Gardener CRD) If either of the first 2 are used, a corresponding Certificate resource will automatically be created.\nUsing an ingress Resource apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and 
for ECDSA \"256\" and \"384\"spec: tls: - hosts: # Must not exceed 64 characters. - short.ingress.shoot.project.default-domain.gardener.cloud # Certificate and private key reside in this secret. secretName: tls-secret rules: - host: short.ingress.shoot.project.default-domain.gardener.cloud http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Using a service type LoadBalancer apiVersion: v1 kind: Service metadata: annotations: cert.gardener.cloud/purpose: managed # Certificate and private key reside in this secret. cert.gardener.cloud/secretname: tls-secret # You may add more domains separated by commas (e.g. \"service.shoot.project.default-domain.gardener.cloud, amazing.shoot.project.default-domain.gardener.cloud\") dns.gardener.cloud/dnsnames: \"service.shoot.project.default-domain.gardener.cloud\" dns.gardener.cloud/ttl: \"600\" #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" name: test-service namespace: default spec: ports: - name: http port: 80 protocol: TCP targetPort: 8080 type: LoadBalancer Using the custom Certificate resource apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: commonName: 
short.ingress.shoot.project.default-domain.gardener.cloud secretRef: name: tls-secret namespace: default # Optional if using the default issuer issuerRef: name: garden If you’re interested in the current progress of your request, you’re advised to consult the description, more specifically the status attribute in case the issuance failed.\nRequest a wildcard certificate In order to avoid the creation of multiple certificates, one for every single endpoint, you may want to create a wildcard certificate for your shoot’s default cluster.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/commonname: \"*.ingress.shoot.project.default-domain.gardener.cloud\" spec: tls: - hosts: - amazing.ingress.shoot.project.default-domain.gardener.cloud secretName: tls-secret rules: - host: amazing.ingress.shoot.project.default-domain.gardener.cloud http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Please note that this can also be achieved by directly adding an annotation to a Service type LoadBalancer. You could also create a Certificate object with a wildcard domain.\nMore information For more information and more examples about using the certificate extension, please see Manage certificates with Gardener for public domain\n","categories":"","description":"Use the Gardener cert-management to get fully managed, publicly trusted TLS certificates","excerpt":"Use the Gardener cert-management to get fully managed, publicly …","ref":"/docs/guides/networking/certificate-extension-default-domain/","tags":["task"],"title":"Manage Certificates with Gardener for Default Domain"},{"body":"ManagedSeeds: Register Shoot as Seed An existing shoot can be registered as a seed by creating a ManagedSeed resource. This resource contains:\n The name of the shoot that should be registered as seed. 
A gardenlet section that contains: gardenlet deployment parameters, such as the number of replicas, the image, etc. The GardenletConfiguration resource that contains controllers configuration, feature gates, and a seedConfig section that contains the Seed spec and parts of its metadata. Additional configuration parameters, such as the garden connection bootstrap mechanism (see TLS Bootstrapping), and whether to merge the provided configuration with the configuration of the parent gardenlet. gardenlet is deployed to the shoot, and it registers a new seed upon startup based on the seedConfig section.\n Note: Earlier Gardener allowed specifying a seedTemplate directly in the ManagedSeed resource. This feature is discontinued, any seed configuration must be via the GardenletConfiguration.\n Note the following important aspects:\n Unlike the Seed resource, the ManagedSeed resource is namespaced. Currently, managed seeds are restricted to the garden namespace. The newly created Seed resource always has the same name as the ManagedSeed resource. Attempting to specify a different name in the seedConfig will fail. The ManagedSeed resource must always refer to an existing shoot. Attempting to create a ManagedSeed referring to a non-existing shoot will fail. A shoot that is being referred to by a ManagedSeed cannot be deleted. Attempting to delete such a shoot will fail. You can omit practically everything from the gardenlet section, including all or most of the Seed spec fields. Proper defaults will be supplied in all cases, based either on the most common use cases or the information already available in the Shoot resource. Also, if your seed is configured to host HA shoot control planes, then gardenlet will be deployed with multiple replicas across nodes or availability zones by default. 
Some Seed spec fields, for example the provider type and region, networking CIDRs for pods, services, and nodes, etc., must be the same as the corresponding Shoot spec fields of the shoot that is being registered as seed. Attempting to use different values (except empty ones, so that they are supplied by the defaulting mechanism) will fail. Deploying gardenlet to the Shoot To register a shoot as a seed and deploy gardenlet to the shoot using a default configuration, create a ManagedSeed resource similar to the following:\napiVersion: seedmanagement.gardener.cloud/v1alpha1 kind: ManagedSeed metadata: name: my-managed-seed namespace: garden spec: shoot: name: crazy-botany gardenlet: {} For an example that uses non-default configuration, see 55-managed-seed-gardenlet.yaml\nRenewing the Gardenlet Kubeconfig Secret In order to make the ManagedSeed controller renew the gardenlet’s kubeconfig secret, annotate the ManagedSeed with gardener.cloud/operation=renew-kubeconfig. This will trigger a reconciliation during which the kubeconfig secret is deleted and the bootstrapping is performed again (during which gardenlet obtains a new client certificate).\nIt is also possible to trigger the renewal on the secret directly, see Rotate Certificates Using Bootstrap kubeconfig.\nSpecifying apiServer replicas and autoscaler Options There are a few configuration options that are not supported in a Shoot resource but, for backward compatibility reasons, can be specified for a Shoot that is referred to by a ManagedSeed. These options are:\n Option Description apiServer.autoscaler.minReplicas Controls the minimum number of kube-apiserver replicas for the shoot registered as seed cluster. apiServer.autoscaler.maxReplicas Controls the maximum number of kube-apiserver replicas for the shoot registered as seed cluster. apiServer.replicas Controls how many kube-apiserver replicas the shoot registered as seed cluster gets by default. 
It is possible to specify these options via the shoot.gardener.cloud/managed-seed-api-server annotation on the Shoot resource. Example configuration:\n annotations: shoot.gardener.cloud/managed-seed-api-server: \"apiServer.replicas=3,apiServer.autoscaler.minReplicas=3,apiServer.autoscaler.maxReplicas=6\" Enforced Configuration Options The following configuration options are enforced by the Gardener API server for the ManagedSeed resources:\n The vertical pod autoscaler should be enabled from the Shoot specification.\nThe vertical pod autoscaler is a prerequisite for a Seed cluster. It is possible to enable the VPA feature for a Seed (using the Seed spec) and for a Shoot (using the Shoot spec). In the context of ManagedSeeds, enabling the VPA in the Seed spec (instead of the Shoot spec) offers less flexibility and increases the network transfer and cost. For these reasons, the Gardener API server enforces the vertical pod autoscaler to be enabled from the Shoot specification.\n The nginx-ingress addon should not be enabled for a Shoot referred to by a ManagedSeed.\nAn Ingress controller is also a prerequisite for a Seed cluster. For a Seed cluster, it is possible to enable the Gardener-managed Ingress controller or to deploy a self-managed Ingress controller. There is also the nginx-ingress addon that can be enabled for a Shoot (using the Shoot spec). However, the Shoot nginx-ingress addon is deprecated and is not recommended for production clusters. For these reasons, the Gardener API server does not allow the Shoot nginx-ingress addon to be enabled for ManagedSeeds.\n ","categories":"","description":"","excerpt":"ManagedSeeds: Register Shoot as Seed An existing shoot can be …","ref":"/docs/gardener/managed_seed/","tags":"","title":"Managed Seed"},{"body":"Deploy Resources to the Shoot Cluster We have introduced a component called gardener-resource-manager that is deployed as part of every shoot control plane in the seed. 
One of its tasks is to manage CRDs, so called ManagedResources. Managed resources contain Kubernetes resources that shall be created, reconciled, updated, and deleted by the gardener-resource-manager.\nExtension controllers may create these ManagedResources in the shoot namespace if they need to create any resource in the shoot cluster itself, for example RBAC roles (or anything else).\nWhere can I find more examples and more information how to use ManagedResources? Please take a look at the respective documentation.\n","categories":"","description":"","excerpt":"Deploy Resources to the Shoot Cluster We have introduced a component …","ref":"/docs/gardener/extensions/managedresources/","tags":"","title":"Managedresources"},{"body":"Request DNS Names in Shoot Clusters Introduction Within a shoot cluster, it is possible to request DNS records via the following resource types:\n Ingress Service DNSEntry It is necessary that the Gardener installation your shoot cluster runs in is equipped with a shoot-dns-service extension. This extension uses the seed’s dns management infrastructure to maintain DNS names for shoot clusters. Please ask your Gardener operator if the extension is available in your environment.\nShoot Feature Gate In some Gardener setups the shoot-dns-service extension is not enabled globally and thus must be configured per shoot cluster. Please adapt the shoot specification by the configuration shown below to activate the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-dns-service ... Before you start You should :\n Have created a shoot cluster Have created and correctly configured a DNS Provider (Please consult this page for more information) Have a basic understanding of DNS (see link under References) There are 2 types of DNS that you can use within Kubernetes :\n internal (usually managed by coreDNS) external (managed by a public DNS provider). 
This page, and the extension, work exclusively for external DNS handling.\nGardener allows 2 ways of managing your external DNS:\n Manually, which means you are in charge of creating / maintaining your Kubernetes related DNS entries Via the Gardener DNS extension Gardener DNS extension The managed external DNS records feature of the Gardener clusters makes all this easier. You do not need DNS service provider specific knowledge, and in fact you do not need to leave your cluster at all to achieve that. You simply annotate the Ingress / Service that needs its DNS records managed and it will be automatically created / managed by Gardener.\nManaged external DNS records are supported with the following DNS provider types:\n aws-route53 azure-dns azure-private-dns google-clouddns openstack-designate alicloud-dns cloudflare-dns Request DNS records for Ingress resources To request a DNS name for Ingress, Service or Gateway (Istio or Gateway API) objects in the shoot cluster, the object must be annotated with the DNS class garden and an annotation denoting the desired DNS names.\nExample for an annotated Ingress resource:\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: # Let Gardener manage external DNS records for this Ingress. dns.gardener.cloud/dnsnames: special.example.com # Use \"*\" to collect domain names from .spec.rules[].host dns.gardener.cloud/ttl: \"600\" dns.gardener.cloud/class: garden # If you are delegating the certificate management to Gardener, uncomment the following line #cert.gardener.cloud/purpose: managed spec: rules: - host: special.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 # Uncomment the following part if you are delegating the certificate management to Gardener #tls: # - hosts: # - special.example.com # secretName: my-cert-secret-name For an Ingress, the DNS names are already declared in the specification. 
Nevertheless the dnsnames annotation must be present. Here a subset of the DNS names of the ingress can be specified. If DNS names for all names are desired, the value all can be used.\nKeep in mind that ingress resources are ignored unless an ingress controller is set up. Gardener does not provide an ingress controller by default. For more details, see Ingress Controllers and Service in the Kubernetes documentation.\nRequest DNS records for service type LoadBalancer Example for an annotated Service (it must have the type LoadBalancer) resource:\napiVersion: v1 kind: Service metadata: name: amazing-svc annotations: # Let Gardener manage external DNS records for this Service. dns.gardener.cloud/dnsnames: special.example.com dns.gardener.cloud/ttl: \"600\" dns.gardener.cloud/class: garden spec: selector: app: amazing-app ports: - protocol: TCP port: 80 targetPort: 8080 type: LoadBalancer Request DNS records for Gateway resources Please see Istio Gateways or Gateway API for details.\nCreating a DNSEntry resource explicitly It is also possible to create a DNS entry via the Kubernetes resource called DNSEntry:\napiVersion: dns.gardener.cloud/v1alpha1 kind: DNSEntry metadata: annotations: # Let Gardener manage this DNS entry. dns.gardener.cloud/class: garden name: special-dnsentry namespace: default spec: dnsName: special.example.com ttl: 600 targets: - 1.2.3.4 If one of the accepted DNS names is a direct subname of the shoot’s ingress domain, this is already handled by the standard wildcard entry for the ingress domain. Therefore this name should be excluded from the dnsnames list in the annotation. 
If only this DNS name is configured in the ingress, no explicit DNS entry is required, and the DNS annotations should be omitted altogether.\nYou can check the status of the DNSEntry with\n$ kubectl get dnsentry NAME DNS TYPE PROVIDER STATUS AGE mydnsentry special.example.com aws-route53 default/aws Ready 24s As soon as the status of the entry is Ready, the provider has accepted the new DNS record. Depending on the provider and your DNS settings and cache, it may take up to 24 hours for the new entry to be propagated across the internet.\nMore examples can be found here\nRequest DNS records for Service/Ingress resources using a DNSAnnotation resource In rare cases it may not be possible to add annotations to a Service or Ingress resource object.\nFor example, the Helm chart used to deploy the resource may not be adaptable for some reason, or some automation is used that always restores the original content of the resource object by dropping any additional annotations.\nIn these cases, it is recommended to use an additional DNSAnnotation resource in order to have more flexibility than DNSEntry resources. The DNSAnnotation resource makes the DNS shoot service behave as if annotations have been added to the referenced resource.\nFor the Ingress example shown above, you can create a DNSAnnotation resource alternatively to provide the annotations.\napiVersion: dns.gardener.cloud/v1alpha1 kind: DNSAnnotation metadata: annotations: dns.gardener.cloud/class: garden name: test-ingress-annotation namespace: default spec: resourceRef: kind: Ingress apiVersion: networking.k8s.io/v1 name: test-ingress namespace: default annotations: dns.gardener.cloud/dnsnames: '*' dns.gardener.cloud/class: garden Note that the DNSAnnotation resource itself needs the dns.gardener.cloud/class=garden annotation. 
This also only works for annotations known to the DNS shoot service (see Accepted External DNS Records Annotations).\nFor more details, see also DNSAnnotation objects\nAccepted External DNS Records Annotations Here are all of the accepted annotations related to the DNS extension:\n Annotation Description dns.gardener.cloud/dnsnames Mandatory for service and ingress resources, accepts a comma-separated list of DNS names if multiple names are required. For ingress you can use the special value '*'. In this case, the DNS names are collected from .spec.rules[].host. dns.gardener.cloud/class Mandatory, in the context of the shoot-dns-service it must always be set to garden. dns.gardener.cloud/ttl Recommended, overrides the default Time-To-Live of the DNS record. dns.gardener.cloud/cname-lookup-interval Only relevant if multiple domain name targets are specified. It specifies the lookup interval for CNAMEs to map them to IP addresses (in seconds) dns.gardener.cloud/realms Internal, for restricting provider access for shoot DNS entries. Typically not set by users of the shoot-dns-service. dns.gardener.cloud/ip-stack Only relevant for provider type aws-route53 if target is an AWS load balancer domain name. Can be set for service, ingress and DNSEntry resources. It specifies which DNS records with alias targets are created instead of the usual CNAME records. If the annotation is not set (or has the value ipv4), only an A record is created. With value dual-stack, both A and AAAA records are created. With value ipv6 only an AAAA record is created. service.beta.kubernetes.io/aws-load-balancer-ip-address-type=dualstack For services, behaves similarly to dns.gardener.cloud/ip-stack=dual-stack. loadbalancer.openstack.org/load-balancer-address Internal, for services only: support for PROXY protocol on OpenStack (which needs a hostname as ingress). Typically not set by users of the shoot-dns-service. 
If one of the accepted DNS names is a direct subdomain of the shoot’s ingress domain, this is already handled by the standard wildcard entry for the ingress domain. Therefore, this name should be excluded from the dnsnames list in the annotation. If only this DNS name is configured in the ingress, no explicit DNS entry is required, and the DNS annotations should be omitted altogether.\nTroubleshooting General DNS tools To check the DNS resolution, use the nslookup or dig command.\n$ nslookup special.your-domain.com or with dig\n$ dig +short special.example.com Depending on your network settings, you may get a successful response faster using a public DNS server (e.g. 8.8.8.8, 8.8.4.4, or 1.1.1.1) dig @8.8.8.8 +short special.example.com DNS record events The DNS controller publishes Kubernetes events for the resource which requested the DNS record (Ingress, Service, DNSEntry). These events reveal more information about the DNS requests being processed and are especially useful to check any kind of misconfiguration, e.g. 
requests for a domain you don’t own.\nEvents for a successfully created DNS record:\n$ kubectl describe service my-service Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal dns-annotation 19s dns-controller-manager special.example.com: dns entry is pending Normal dns-annotation 19s (x3 over 19s) dns-controller-manager special.example.com: dns entry pending: waiting for dns reconciliation Normal dns-annotation 9s (x3 over 10s) dns-controller-manager special.example.com: dns entry active Please note, events vanish after their retention period (usually 1h).\nDNSEntry status DNSEntry resources offer a .status sub-resource which can be used to check the current state of the object.\nStatus of an erroneous DNSEntry.\n status: message: No responsible provider found observedGeneration: 3 provider: remote state: Error References Understanding DNS Kubernetes Internal DNS DNSEntry API (Golang) Managing Certificates with Gardener ","categories":"","description":"Setup Gardener-managed DNS records in cluster.","excerpt":"Setup Gardener-managed DNS records in cluster.","ref":"/docs/guides/networking/dns-extension/","tags":"","title":"Managing DNS with Gardener"},{"body":"Hugo uses Markdown for its simple content format. However, there are a lot of things that Markdown doesn’t support well. You could use pure HTML to expand possibilities. A typical example is reducing the original dimensions of an image.\nHowever, use HTML judiciously and to the minimum extent possible. Using HTML in Markdown files makes it harder to maintain and publish coherent documentation bundles. This is a job typically performed by a publishing platform’s mechanisms, such as Hugo’s layouts. Considering that the source documentation might be published by multiple platforms you should be considerate in using markup that may bind it to a particular one.\nFor the same reason, avoid inline scripts and styles in your content. 
If you absolutely need to use them and they are not working as expected, please create a documentation issue and describe your case.\nTip Markdown is great for its simplicity but may also be constraining for the same reason. Before looking at HTML to make up for that, first check the shortcodes for alternatives. ","categories":"","description":"","excerpt":"Hugo uses Markdown for its simple content format. However, there are a …","ref":"/docs/contribute/documentation/markup/","tags":"","title":"Markdown"},{"body":"Monitoring etcd-druid uses Prometheus for metrics reporting. The metrics can be used for real-time monitoring and debugging of compaction jobs.\nThe simplest way to see the available metrics is to cURL the metrics endpoint /metrics. The format is described here.\nFollow the Prometheus getting started doc to spin up a Prometheus server to collect etcd metrics.\nThe naming of metrics follows the suggested Prometheus best practices. All compaction related metrics are put under namespace etcddruid and the respective subsystems.\nSnapshot Compaction These metrics provide information about the compaction jobs that run after some interval in shoot control planes. Studying the metrics, we can deduce how many compaction jobs ran successfully, how many failed, how many delta events were compacted, etc.\n Name Description Type etcddruid_compaction_jobs_total Total number of compaction jobs initiated by compaction controller. Counter etcddruid_compaction_jobs_current Number of currently running compaction jobs. Gauge etcddruid_compaction_job_duration_seconds Total time taken in seconds to finish a running compaction job. Histogram etcddruid_compaction_num_delta_events Total number of etcd events to be compacted by a compaction job. Gauge There are two labels for etcddruid_compaction_jobs_total metrics. 
The label succeeded shows how many of the compaction jobs succeeded and the label failed shows how many of the compaction jobs failed.\nThere are two labels for the etcddruid_compaction_job_duration_seconds metric. The label succeeded shows how much time a successful job took to complete and the label failed shows how much time a failed compaction job took.\nThe etcddruid_compaction_jobs_current metric comes with the label etcd_namespace, which indicates the namespace of the Etcd running in the control plane of a shoot cluster.\nEtcd These metrics are exposed by the etcd process that runs in each etcd pod.\nThe following list of metrics is applicable to the clustering of a multi-node etcd cluster. The full list of metrics exposed by etcd is available here.\n No. Metrics Name Description Comments 1 etcd_disk_wal_fsync_duration_seconds latency distributions of fsync called by WAL. High disk operation latencies indicate disk issues. 2 etcd_disk_backend_commit_duration_seconds latency distributions of commit called by backend. High disk operation latencies indicate disk issues. 3 etcd_server_has_leader whether or not a leader exists. 1: leader exists, 0: no leader exists. To capture quorum loss or to check the availability of etcd cluster. 4 etcd_server_is_leader whether or not this member is a leader. 1 if it is, 0 otherwise. 5 etcd_server_leader_changes_seen_total number of leader changes seen. Helpful in fine-tuning the zonal cluster, e.g. the etcd heartbeat time. It can also indicate etcd load and network issues. 6 etcd_server_is_learner whether or not this member is a learner. 1 if it is, 0 otherwise. 7 etcd_server_learner_promote_successes total number of successful learner promotions while this member is leader. Might be helpful in checking the success of API calls called by backup-restore. 8 etcd_network_client_grpc_received_bytes_total total number of bytes received from grpc clients. Client Traffic In. 
9 etcd_network_client_grpc_sent_bytes_total total number of bytes sent to grpc clients. Client Traffic Out. 10 etcd_network_peer_sent_bytes_total total number of bytes sent to peers. Useful for network usage. 11 etcd_network_peer_received_bytes_total total number of bytes received from peers. Useful for network usage. 12 etcd_network_active_peers current number of active peer connections. Might be useful in detecting issues like network partition. 13 etcd_server_proposals_committed_total total number of consensus proposals committed. A consistently large lag between a single member and its leader indicates that member is slow or unhealthy. 14 etcd_server_proposals_pending current number of pending proposals to commit. Pending proposals suggest there is a high client load or the member cannot commit proposals. 15 etcd_server_proposals_failed_total total number of failed proposals seen. Might indicate downtime caused by a loss of quorum. 16 etcd_server_proposals_applied_total total number of consensus proposals applied. Difference between etcd_server_proposals_committed_total and etcd_server_proposals_applied_total should usually be small. 17 etcd_mvcc_db_total_size_in_bytes total size of the underlying database physically allocated in bytes. 18 etcd_server_heartbeat_send_failures_total total number of leader heartbeat send failures. Might be helpful in fine-tuning the cluster or detecting slow disk or any network issues. 19 etcd_network_peer_round_trip_time_seconds round-trip-time histogram between peers. Might be helpful in fine-tuning network usage, especially for a zonal etcd cluster. 20 etcd_server_slow_apply_total total number of slow apply requests. Might indicate an overload caused by a slow disk. 21 etcd_server_slow_read_indexes_total total number of pending read indexes not in sync with leader’s or timed out read index requests. 
The full list of metrics is available here.\nEtcd-Backup-Restore These metrics are exposed by the etcd-backup-restore container in each etcd pod.\nThe following list of metrics is applicable to the clustering of a multi-node etcd cluster. The full list of metrics exposed by etcd-backup-restore is available here.\n No. Metrics Name Description 1. etcdbr_cluster_size to capture the scale-up/scale-down scenarios. 2. etcdbr_is_learner whether or not this member is a learner. 1 if it is, 0 otherwise. 3. etcdbr_is_learner_count_total total number of times the member was added as a learner. 4. etcdbr_restoration_duration_seconds total latency distribution required to restore the etcd member. 5. etcdbr_add_learner_duration_seconds total latency distribution of adding the etcd member as a learner to the cluster. 6. etcdbr_member_remove_duration_seconds total latency distribution of removing the etcd member from the cluster. 7. etcdbr_member_promote_duration_seconds total latency distribution of promoting the learner to the voting member. 8. etcdbr_defragmentation_duration_seconds total latency distribution of defragmentation of each etcd cluster member. Prometheus supplied metrics The Prometheus client library provides a number of metrics under the go and process namespaces.\n","categories":"","description":"","excerpt":"Monitoring etcd-druid uses Prometheus for metrics reporting. The …","ref":"/docs/other-components/etcd-druid/metrics/","tags":"","title":"Metrics"},{"body":"Migrate Azure Shoot Load Balancer from basic to standard SKU This guide describes how to migrate the Load Balancer of an Azure Shoot cluster from the basic SKU to the standard SKU. Be aware: You need to delete and recreate all services of type Load Balancer, which means that the public IP addresses of your service endpoints will change. Please do this only if the Stakeholder really needs to migrate this Shoot to use standard Load Balancers. 
All new Shoot clusters will automatically use Azure Standard Load Balancers.\n Temporarily disable Gardener’s reconciliation.\nThe Gardener Controller Manager needs to be configured to allow ignoring Shoot clusters. This can be configured in its ControllerManagerConfiguration via the field .controllers.shoot.respectSyncPeriodOverwrite=\"true\". # In the Garden cluster. kubectl annotate shoot \u003cshoot-name\u003e shoot.garden.sapcloud.io/ignore=\"true\" # In the Seed cluster. kubectl -n \u003cshoot-namespace\u003e scale deployment gardener-resource-manager --replicas=0 Backup all Kubernetes services of type Load Balancer. # In the Shoot cluster. # Determine all Load Balancer services. kubectl get service --all-namespaces | grep LoadBalancer # Backup each Load Balancer service. echo \"---\" \u003e\u003e service-backup.yaml \u0026\u0026 kubectl -n \u003cnamespace\u003e get service \u003cservice-name\u003e -o yaml \u003e\u003e service-backup.yaml Delete all Load Balancer services. # In the Shoot cluster. kubectl -n \u003cnamespace\u003e delete service \u003cservice-name\u003e Wait until the Load Balancer is deleted. Wait until all services of type Load Balancer are deleted and the Azure Load Balancer resource is also deleted. Check via the Azure Portal if the Load Balancer within the Shoot Resource Group has been deleted. This should happen automatically within a few minutes after all Kubernetes Load Balancer services are gone. Alternatively, the Azure CLI can be used to check the Load Balancer in the Shoot Resource Group. The credentials to configure the CLI are available on the Seed cluster in the Shoot namespace.\n# In the Seed cluster. # Fetch the credentials from cloudprovider secret. kubectl -n \u003cshoot-namespace\u003e get secret cloudprovider -o yaml # Configure the Azure cli, with the base64 decoded values of the cloudprovider secret. 
az login --service-principal --username \u003cclientID\u003e --password \u003cclientSecret\u003e --tenant \u003ctenantID\u003e az account set -s \u003csubscriptionID\u003e # Continuously fetch the Shoot Load Balancer in the Shoot Resource Group. Wait until the resource is gone. watch 'az network lb show -g shoot--\u003cproject-name\u003e--\u003cshoot-name\u003e -n shoot--\u003cproject-name\u003e--\u003cshoot-name\u003e' # Logout. az logout Modify the cloud-provider-config configmap in the Seed namespace of the Shoot. The key cloudprovider.conf contains the Kubernetes cloud-provider configuration. The value is a multiline string. Please change the value of the field loadBalancerSku from basic to standard. If the field does not exist, then append loadBalancerSku: \\\"standard\\\"\\n to the value/string. # In the Seed cluster. kubectl -n \u003cshoot-namespace\u003e edit cm cloud-provider-config Enable Gardener’s reconciliation and trigger a reconciliation. # In the Garden cluster # Enable reconciliation kubectl annotate shoot \u003cshoot-name\u003e shoot.garden.sapcloud.io/ignore- # Trigger reconciliation kubectl annotate shoot \u003cshoot-name\u003e shoot.garden.sapcloud.io/operation=\"reconcile\" Wait until the cluster has been reconciled.\nRecreate the services from the backup file. You will probably need to remove some fields from the service definitions, e.g. .spec.clusterIP, .metadata.uid or .status etc. kubectl apply -f service-backup.yaml If successful, remove the backup file. # Delete the backup file. 
rm -f service-backup.yaml ","categories":"","description":"","excerpt":"Migrate Azure Shoot Load Balancer from basic to standard SKU This …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/migrate-loadbalancer/","tags":"","title":"Migrate Loadbalancer"},{"body":"Control Plane Migration Control Plane Migration is a new Gardener feature that has been recently implemented as proposed in GEP-7 Shoot Control Plane Migration. 
It should be properly supported by all extension controllers. This document outlines some important points that extension maintainers should keep in mind to properly support migration in their extensions.\nOverall Principles The following principles should always be upheld:\n All state maintained by the extension that is external to the seed cluster, for example infrastructure resources in a cloud provider, DNS entries, etc., should be kept during the migration. No such state should be deleted and then recreated, as this might cause disruption in the availability of the shoot cluster. All Kubernetes resources maintained by the extension in the shoot cluster itself should also be kept during the migration. No such resources should be deleted and then recreated. Migrate and Restore Operations Two new operations have been introduced in Gardener. They can be specified as values of the gardener.cloud/operation annotation on an extension resource to indicate that an operation different from a normal reconcile should be performed by the corresponding extension controller:\n The migrate operation is used to ask the extension controller in the source seed to stop reconciling extension resources (in case they are requeued due to errors) and perform cleanup activities, if such are required. These cleanup activities might involve removing finalizers on resources in the shoot namespace that have been previously created by the extension controller and deleting them without actually deleting any resources external to the seed cluster. This is also the last opportunity for extensions to persist their state into the .status.state field of the reconciled extension resource before it is restored in the new destination seed cluster. The restore operation is used to ask the extension controller in the destination seed to restore any state saved in the extension resource status, before performing the actual reconciliation. 
Unlike the reconcile operation, extension controllers must remove the gardener.cloud/operation annotation at the end of a successful reconciliation when the current operation is migrate or restore, not at the beginning of a reconciliation.\nCleaning-Up Source Seed Resources All resources in the source seed that have been created by an extension controller, for example secrets, config maps, managed resources, etc., should be properly cleaned up by the extension controller when the current operation is migrate. As mentioned above, such resources should be deleted without actually deleting any resources external to the seed cluster.\nThere is one exception to this: Secrets labeled with persist=true created via the secrets manager. They should be kept (i.e., the Cleanup function of secrets manager should not be called) and will be garbage collected automatically at the end of the migrate operation. This ensures that they can be properly persisted in the ShootState resource and get restored on the new destination seed cluster.\nFor many custom resources, for example MCM resources, the above requirement means in practice that any finalizers should be removed before deleting the resource, in addition to ensuring that the resource deletion is not reconciled by its respective controller if there is no finalizer. For managed resources, the above requirement means in practice that the spec.keepObjects field should be set to true before deleting the extension resource.\nHere it is assumed that any resources that contain state needed by the extension controller can be safely deleted, since any such state has been saved as described in Saving and Restoring Extension States at the end of the last successful reconciliation.\nSaving and Restoring Extension States Some extension controllers create and maintain their own state when reconciling extension resources. 
For example, most infrastructure controllers use Terraform and maintain the terraform state in a special config map in the shoot namespace. This state must be properly migrated to the new seed cluster during control plane migration, so that subsequent reconciliations in the new seed could find and use it appropriately.\nAll extension controllers that require such state migration must save their state in the status.state field of their extension resource at the end of a successful reconciliation. They must also restore their state from that same field upon reconciling an extension resource when the current operation is restore, as specified by the gardener.cloud/operation annotation, before performing the actual reconciliation.\nAs an example, an infrastructure controller that uses Terraform must save the terraform state in the status.state field of the Infrastructure resource. An Infrastructure resource with a properly saved state might look as follows:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: type: azure region: eu-west-1 secretRef: name: cloudprovider namespace: shoot--foo--bar providerConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig resourceGroup: name: mygroup ... status: state: |{ \"version\": 3, \"terraform_version\": \"0.11.14\", \"serial\": 2, \"lineage\": \"3a1e2faa-e7b6-f5f0-5043-368dd8ea6c10\", ... 
} Extension controllers that do not use a saved state and therefore do not require state migration could leave the status.state field as nil at the end of a successful reconciliation, and just perform a normal reconciliation when the current operation is restore.\nIn addition, extension controllers that use referenced resources (usually secrets) must also make sure that these resources are added to the status.resources field of their extension resource at the end of a successful reconciliation, so they could be properly migrated by Gardener to the destination seed.\nImplementation Details Migrate and Restore Actuator Methods Most extension controller implementations follow a common pattern where a generic Reconciler implementation delegates to an Actuator interface that contains the methods Reconcile and Delete, provided by the extension. Two methods Migrate and Restore are available in all such Actuator interfaces, see the infrastructure Actuator interface as an example. These methods are called by the generic reconcilers for the migrate and restore operations respectively, and should be implemented by the extension according to the above guidelines.\nExtension Controllers Based on Generic Actuators In practice, the implementation of many extension controllers (for example, the ControlPlane and Worker controllers in most provider extensions) are based on a generic Actuator implementation that only delegates to extension methods for behavior that is truly provider specific. In all such cases, the Migrate and Restore methods have already been implemented properly in the generic actuators and there is nothing more to do in the extension itself.\nIn some rare cases, extension controllers based on a generic actuator might still introduce a custom Actuator implementation to override some of the generic actuator methods in order to enhance or change their behavior in a certain way. 
In such cases, the Migrate and Restore methods might need to be overridden as well, see the Azure controlplane controller as an example.\nWorker State Note that the machine state is handled specially by gardenlet (i.e., all relevant objects in the machine.sapcloud.io/v1alpha1 API are directly persisted by gardenlet and NOT by the generic actuators). In the past, they were persisted to the Worker’s .status.state field by the so-called “worker state reconciler”, however, this reconciler was dropped and changed as part of GEP-22. Nowadays, gardenlet directly writes the state to the ShootState resource during the Migrate phase of a Shoot (without the detour of the Worker’s .status.state field). On restoration, unlike for other extension kinds, gardenlet no longer populates the machine state into the Worker’s .status.state field. Instead, the extension controller should read the machine state directly from the ShootState in the garden cluster (see this document for information how to access the garden cluster) and use it to subsequently restore the relevant machine.sapcloud.io/v1alpha1 resources. This flow is implemented in the generic Worker actuator. As a result, Extension controllers using this generic actuator do not need to implement any custom logic.\nExtension Controllers Not Based on Generic Actuators The implementation of some extension controllers (for example, the infrastructure controllers in all provider extensions) are not based on a generic Actuator implementation. Such extension controllers must always provide a proper implementation of the Migrate and Restore methods according to the above guidelines, see the AWS infrastructure controller as an example. 
In practice, this might result in code duplication between the different extensions, since the Migrate and Restore code is usually not provider or OS-specific.\n If you do not use the generic Worker actuator, see this section for information how to handle the machine state related to the Worker resource.\n ","categories":"","description":"","excerpt":"Control Plane Migration Control Plane Migration is a new Gardener …","ref":"/docs/gardener/extensions/migration/","tags":"","title":"Migration"},{"body":"Migration from Gardener v0 to v1 Please refer to the document for older Gardener versions.\n","categories":"","description":"","excerpt":"Migration from Gardener v0 to v1 Please refer to the document for …","ref":"/docs/gardener/deployment/migration_v0_to_v1/","tags":"","title":"Migration V0 To V1"},{"body":"Monitoring Work In Progress We will be introducing metrics for Dependency-Watchdog-Prober and Dependency-Watchdog-Weeder. These metrics will be pushed to prometheus. Once that is completed we will provide details on all the metrics that will be supported here.\n","categories":"","description":"","excerpt":"Monitoring Work In Progress We will be introducing metrics for …","ref":"/docs/other-components/dependency-watchdog/deployment/monitor/","tags":"","title":"Monitor"},{"body":"Monitoring The shoot-rsyslog-relp extension exposes metrics for the rsyslog service running on a Shoot’s nodes so that they can be easily viewed by cluster owners and operators in the Shoot’s Prometheus and Plutono instances. The exposed monitoring data offers valuable insights into the operation of the rsyslog service and can be used to detect and debug ongoing issues. This guide describes the various metrics, alerts and logs available to cluster owners and operators.\nMetrics Metrics for the rsyslog service originate from its impstats module. 
These include the number of messages in the various queues, the number of ingested messages, the number of messages processed by configured actions, system resources used by the rsyslog service, and others. More information about them can be found in the impstats documentation and the statistics counter documentation. They are exposed via the node-exporter running on each Shoot node and are scraped by the Shoot’s Prometheus instance.\nThese metrics can also be viewed in a dedicated dashboard named Rsyslog Stats in the Shoot’s Plutono instance. You can select the node for which you wish the metrics to be displayed from the Node dropdown menu (by default metrics are summed over all nodes).\nThe following is a list of all exposed rsyslog metrics. The name and origin labels can be used to determine whether the metric is for a queue, an action, plugins or system stats; the node label can be used to determine the node the metric originates from:\nrsyslog_pstat_submitted Number of messages that were submitted to the rsyslog service from its input. Currently rsyslog uses the /run/systemd/journal/syslog socket as input.\n Type: Counter Labels: name node origin rsyslog_pstat_processed Number of messages that are successfully processed by an action and sent to the target server.\n Type: Counter Labels: name node origin rsyslog_pstat_failed Number of messages that could not be processed by an action nor sent to the target server.\n Type: Counter Labels: name node origin rsyslog_pstat_suspended Total number of times an action suspended itself. Note that this counts the number of times the action transitioned from active to suspended state. The counter is no indication of how long the action was suspended or how often it was retried.\n Type: Counter Labels: name node origin rsyslog_pstat_suspended_duration The total number of seconds this action was disabled.\n Type: Counter Labels: name node origin rsyslog_pstat_resumed The total number of times this action resumed itself. 
A resumption occurs after the action has detected that a failure condition no longer exists.\n Type: Counter Labels: name node origin rsyslog_pstat_utime User time used in microseconds.\n Type: Counter Labels: name node origin rsyslog_pstat_stime System time used in microseconds.\n Type: Counter Labels: name node origin rsyslog_pstat_maxrss Maximum resident set size.\n Type: Gauge Labels: name node origin rsyslog_pstat_minflt Total number of minor faults the task has made per second, those which have not required loading a memory page from disk.\n Type: Counter Labels: name node origin rsyslog_pstat_majflt Total number of major faults the task has made per second, those which have required loading a memory page from disk.\n Type: Counter Labels: name node origin rsyslog_pstat_inblock Filesystem input operations.\n Type: Counter Labels: name node origin rsyslog_pstat_oublock Filesystem output operations.\n Type: Counter Labels: name node origin rsyslog_pstat_nvcsw Voluntary context switches.\n Type: Counter Labels: name node origin rsyslog_pstat_nivcsw Involuntary context switches.\n Type: Counter Labels: name node origin rsyslog_pstat_openfiles Number of open files.\n Type: Counter Labels: name node origin rsyslog_pstat_size Messages currently in queue.\n Type: Gauge Labels: name node origin rsyslog_pstat_enqueued Total messages enqueued.\n Type: Counter Labels: name node origin rsyslog_pstat_full Times queue was full.\n Type: Counter Labels: name node origin rsyslog_pstat_discarded_full Messages discarded due to queue being full.\n Type: Counter Labels: name node origin rsyslog_pstat_discarded_nf Messages discarded when queue not full.\n Type: Counter Labels: name node origin rsyslog_pstat_maxqsize Maximum size queue has reached.\n Type: Gauge Labels: name node origin rsyslog_augenrules_load_success Shows whether the augenrules --load command was executed successfully or not on the node.\n Type: Gauge Labels: node Alerts There are three alerts defined for the 
rsyslog service in the Shoot’s Prometheus instance:\nRsyslogTooManyRelpActionFailures This indicates that the cumulative failure rate in processing relp action messages is greater than 2%. In other words, it compares the rate of failed relp action messages to the rate of processed relp action messages and fires an alert when the following expression evaluates to true:\nsum(rate(rsyslog_pstat_failed{origin=\"core.action\",name=\"rsyslog-relp\"}[5m])) / sum(rate(rsyslog_pstat_processed{origin=\"core.action\",name=\"rsyslog-relp\"}[5m])) \u003e bool 0.02 RsyslogRelpActionProcessingRateIsZero This indicates that no messages are being sent to the upstream rsyslog target by the relp action. An alert is fired when the following expression evaluates to true:\nrate(rsyslog_pstat_processed{origin=\"core.action\",name=\"rsyslog-relp\"}[5m]) == 0 RsyslogRelpAuditRulesNotLoadedSuccessfully This indicates that augenrules --load was not executed successfully when called to load the configured audit rules. You should check if the auditd configuration you provided is valid. An alert is fired when the following expression evaluates to true:\nabsent(rsyslog_augenrules_load_success == 1) Users can subscribe to these alerts by following the Gardener alerting guide.\nLogging There are two ways to view the logs of the rsyslog service running on the Shoot’s nodes: either using the Explore tab of the Shoot’s Plutono instance, or ssh-ing directly to a node.\nTo view logs in Plutono, navigate to the Explore tab and select vali from the Explore dropdown menu. Afterwards enter the following vali query:\n{nodename=\"\u003cname-of-node\u003e\"} |~ \"\\\"unit\\\":\\\"rsyslog.service\\\"\"\nNotice that you cannot use the unit label to filter for the rsyslog.service unit logs. 
Instead, you have to grep for the service as displayed in the example above.\nTo view logs when directly ssh-ing to a node in the Shoot cluster, use either of the following commands on the node:\nsystemctl status rsyslog\njournalctl -u rsyslog\n","categories":"","description":"","excerpt":"Monitoring The shoot-rsyslog-relp extension exposes metrics for the …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/monitoring/","tags":"","title":"Monitoring"},{"body":"Monitoring Roles of the different Prometheus instances Cache Prometheus Deployed in the garden namespace. Important scrape targets:\n cadvisor node-exporter kube-state-metrics Purpose: Act as a reverse proxy that supports server-side filtering, which is not supported by Prometheus exporters but by federation. Metrics in this Prometheus are kept for a short amount of time (~1 day) since other Prometheus instances are expected to federate from it and move metrics over. For example, the shoot Prometheus queries this Prometheus to retrieve metrics corresponding to the shoot’s control plane. This way, we achieve isolation so that shoot owners are only able to query metrics for their shoots. Please note Prometheus does not support isolation features. Another example is if another Prometheus needs access to cadvisor metrics, which does not support server-side filtering, so it will query this Prometheus instead of the cadvisor. This strategy also reduces load on the kubelets and API Server.\nNote some of these Prometheus’ metrics have high cardinality (e.g., metrics related to all shoots managed by the seed). Some of these are aggregated with recording rules. These pre-aggregated metrics are scraped by the aggregate Prometheus.\nThis Prometheus is not used for alerting.\nAggregate Prometheus Deployed in the garden namespace. Important scrape targets:\n other Prometheus instances logging components Purpose: Store pre-aggregated data from the cache Prometheus and shoot Prometheus. 
An ingress exposes this Prometheus allowing it to be scraped from another cluster. Such pre-aggregated data is also used for alerting.\nSeed Prometheus Deployed in the garden namespace. Important scrape targets:\n pods in extension namespaces annotated with: prometheus.io/scrape=true prometheus.io/port=\u003cport\u003e prometheus.io/name=\u003cname\u003e cadvisor metrics from pods in the garden and extension namespaces The job name label will be applied to all metrics from that service.\nPurpose: Entrypoint for operators when debugging issues with extensions or other garden components.\nThis Prometheus is not used for alerting.\nShoot Prometheus Deployed in the shoot control plane namespace. Important scrape targets:\n control plane components shoot nodes (node-exporter) blackbox-exporter used to measure connectivity Purpose: Monitor all relevant components belonging to a shoot cluster managed by Gardener. Shoot owners can view the metrics in Plutono dashboards and receive alerts based on these metrics. For alerting internals refer to this document.\nCollect all shoot Prometheus with remote write An optional collection of all shoot Prometheus metrics to a central Prometheus (or cortex) instance is possible with the monitoring.shoot setting in GardenletConfiguration:\nmonitoring: shoot: remoteWrite: url: https://remoteWriteUrl # remote write URL keep:# metrics that should be forwarded to the external write endpoint. If empty all metrics get forwarded - kube_pod_container_info externalLabels: # add additional labels to metrics to identify it on the central instance additional: label If basic auth is needed it can be set via secret in garden namespace (Gardener API Server). 
Example secret\nDisable Gardener Monitoring If you wish to disable metric collection for every shoot and roll your own, you can simply set:\nmonitoring: shoot: enabled: false ","categories":"","description":"","excerpt":"Monitoring Roles of the different Prometheus instances Cache …","ref":"/docs/gardener/monitoring/readme/","tags":"","title":"Monitoring"},{"body":"Extending the Monitoring Stack This document provides instructions to extend the Shoot cluster monitoring stack by integrating new scrape targets, alerts and dashboards.\nPlease ensure that you have understood the basic principles of Prometheus and its ecosystem before you continue.\n‼️ The purpose of the monitoring stack is to observe the behaviour of the control plane and the system components deployed by Gardener onto the worker nodes. Monitoring of custom workloads running in the cluster is out of scope.\nOverview Each Shoot cluster comes with its own monitoring stack. The following components are deployed into the seed and shoot:\n Seed Prometheus Plutono blackbox-exporter kube-state-metrics (Seed metrics) kube-state-metrics (Shoot metrics) Alertmanager (Optional) Shoot node-exporter(s) kube-state-metrics blackbox-exporter In each Seed cluster there is a Prometheus in the garden namespace responsible for collecting metrics from the Seed kubelets and cAdvisors. These metrics are provided to each Shoot Prometheus via federation.\nThe alerts for all Shoot clusters hosted on a Seed are routed to a central Alertmanager running in the garden namespace of the Seed. The purpose of this central Alertmanager is to forward all important alerts to the operators of the Gardener setup.\nThe Alertmanager in the Shoot namespace on the Seed is only responsible for forwarding alerts from its Shoot cluster to a cluster owner/cluster alert receiver via email. 
The Alertmanager is optional and the conditions for a deployment are already described in Alerting.\nThe node-exporter’s textfile collector is enabled and configured to parse all *.prom files in the /var/lib/node-exporter/textfile-collector directory on each Shoot node. Scripts and programs which run on Shoot nodes and cannot expose an endpoint to be scraped by prometheus can use this directory to export metrics in files that match the glob *.prom using the text format.\nAdding New Monitoring Targets After exploring the metrics which your component provides or adding new metrics, you should be aware which metrics are required to write the needed alerts and dashboards.\nPrometheus prefers a pull based metrics collection approach and therefore the targets to observe need to be defined upfront. The targets are defined in charts/seed-monitoring/charts/core/charts/prometheus/templates/config.yaml. New scrape jobs can be added in the section scrape_configs. Detailed information how to configure scrape jobs and how to use the kubernetes service discovery are available in the Prometheus documentation.\nThe job_name of a scrape job should be the name of the component e.g. kube-apiserver or vpn. The collection interval should be the default of 30s. You do not need to specify this in the configuration.\nPlease do not ingest all metrics which are provided by a component. Rather, collect only those metrics which are needed to define the alerts and dashboards (i.e. whitelist). This can be achieved by adding the following metric_relabel_configs statement to your scrape jobs (replace exampleComponent with component name).\n - job_name: example-component ... 
metric_relabel_configs: {{ include \"prometheus.keep-metrics.metric-relabel-config\" .Values.allowedMetrics.exampleComponent | indent 6 }} The whitelist for the metrics of your job can be maintained in charts/seed-monitoring/charts/core/charts/prometheus/values.yaml in section allowedMetrics.exampleComponent (replace exampleComponent with component name). Check the following example:\nallowedMetrics: ... exampleComponent: - metrics_name_1 - metrics_name_2 ... Adding Alerts The alert definitions are located in charts/seed-monitoring/charts/core/charts/prometheus/rules. There are two approaches for adding new alerts.\n Adding additional alerts for a component which already has a set of alerts. In this case you have to extend the existing rule file for the component. Adding alerts for a new component. In this case a new rule file with name scheme example-component.rules.yaml needs to be added. Add the new alert to alertInhibitionGraph.dot, add any required inhibition flows and render the new graph. To render the graph, run: dot -Tpng ./content/alertInhibitionGraph.dot -o ./content/alertInhibitionGraph.png Create a test for the new alert. See Alert Tests. Example alert:\ngroups: - name: example.rules rules: - alert: ExampleAlert expr: absent(up{job=\"exampleJob\"} == 1) for: 20m labels: service: example severity: critical # How severe is the alert? (blocker|critical|info|warning) type: shoot # For which topology is the alert relevant? (seed|shoot) visibility: all # Who should receive the alerts? (all|operator|owner) annotations: description: A longer description of the example alert that should also explain the impact of the alert. summary: Short summary of an example alert. If the deployment of the component is optional then the alert definitions need to be added to charts/seed-monitoring/charts/core/charts/prometheus/optional-rules instead. 
Furthermore the alerts for component need to be activatable in charts/seed-monitoring/charts/core/charts/prometheus/values.yaml via rules.optional.example-component.enabled. The default should be true.\nBasic instruction how to define alert rules can be found in the Prometheus documentation.\nRouting Tree The Alertmanager is grouping incoming alerts based on labels into buckets. Each bucket has its own configuration like alert receivers, initial delaying duration or resending frequency, etc. You can find more information about Alertmanager routing in the Prometheus/Alertmanager documentation. The routing trees for the Alertmanagers deployed by Gardener are depicted below.\nCentral Seed Alertmanager\n∟ main route (all alerts for all shoots on the seed will enter) ∟ group by project and shoot name ∟ group by visibility \"all\" and \"operator\" ∟ group by severity \"blocker\", \"critical\", and \"info\" → route to Garden operators ∟ group by severity \"warning\" (dropped) ∟ group by visibility \"owner\" (dropped) Shoot Alertmanager\n∟ main route (only alerts for one Shoot will enter) ∟ group by visibility \"all\" and \"owner\" ∟ group by severity \"blocker\", \"critical\", and \"info\" → route to cluster alert receiver ∟ group by severity \"warning\" (dropped, will change soon → route to cluster alert receiver) ∟ group by visibility \"operator\" (dropped) Alert Inhibition All alerts related to components running on the Shoot workers are inhibited in case of an issue with the vpn connection, because those components can’t be scraped anymore and Prometheus will fire alerts in consequence. The components running on the workers are probably healthy and the alerts are presumably false positives. The inhibition flow is shown in the figure below. If you add a new alert, make sure to add it to the diagram.\nAlert Attributes Each alert rule definition has to contain the following annotations:\n summary: A short description of the issue. 
description: A detailed explanation of the issue with hints to the possible root causes and the impact assessment of the issue. In addition, each alert must contain the following labels:\n type shoot: Components running on the Shoot worker nodes in the kube-system namespace. seed: Components running on the Seed in the Shoot namespace as part of/next to the control plane. service Name of the component (in lowercase) e.g. kube-apiserver, alertmanager or vpn. severity blocker: All issues which make the cluster entirely unusable, e.g. KubeAPIServerDown or KubeSchedulerDown critical: All issues which affect single functionalities/components but do not affect the cluster in its core functionality e.g. VPNDown or KubeletDown. info: All issues that do not affect the cluster or its core functionality, but if this component is down we cannot determine if a blocker alert is firing. (i.e. A component with an info level severity is a dependency for a component with a blocker severity) warning: No current existing issue, rather a hint for situations which could lead to real issue in the close future e.g. HighLatencyApiServerToWorkers or ApiServerResponseSlow. Adding Plutono Dashboards The dashboard definition files are located in charts/seed-monitoring/charts/plutono/dashboards. Every dashboard needs its own file.\nIf you are adding a new component dashboard please also update the overview dashboard by adding a chart for its current up/down status and with a drill down option to the component dashboard.\nDashboard Structure The dashboards should be structured in the following way. The assignment of the component dashboards to the categories should be handled via dashboard tags.\n Kubernetes control plane components (Tag: control-plane) All components which are part of the Kubernetes control plane e. g. 
Kube API Server, Kube Controller Manager, Kube Scheduler and Cloud Controller Manager ETCD + Backup/Restore Kubernetes Addon Manager Node/Machine components (Tag: node/machine) All metrics which are related to the behaviour/control of the Kubernetes nodes and kubelets Machine-Controller-Manager + Cluster Autoscaler Networking components (Tag: network) CoreDNS, KubeProxy, Calico, VPN, Nginx Ingress Addon components (Tag: addon) Cert Broker Monitoring components (Tag: monitoring) Logging components (Tag: logging) Mandatory Charts for Component Dashboards For each new component, its corresponding dashboard should contain the following charts in the first row, before adding custom charts for the component in the subsequent rows.\n Pod up/down status up{job=\"example-component\"} Pod/containers cpu utilization Pod/containers memory consumption Pod/containers network i/o That information is provided by the cAdvisor metrics. These metrics are already integrated. Please check the other dashboards for detailed information on how to query.\nChart Requirements Each chart needs to contain:\n a meaningful name a detailed description (for non trivial charts) appropriate x/y axis descriptions appropriate scaling levels for the x/y axis proper units for the x/y axis Dashboard Parameters The following parameters should be added to all dashboards to ensure a homogeneous experience across all dashboards.\nDashboards have to:\n contain a title which refers to the component name(s) contain a timezone statement which should be the browser time contain tags which express where the component is running (seed or shoot) and to which category the component belong (see dashboard structure) contain a version statement with a value of 1 be immutable Example dashboard configuration:\n{ \"title\": \"example-component\", \"timezone\": \"utc\", \"tags\": [ \"seed\", \"control-plane\" ], \"version\": 1, \"editable\": \"false\" } Furthermore, all dashboards should contain the following time 
options:\n{ \"time\": { \"from\": \"now-1h\", \"to\": \"now\" }, \"timepicker\": { \"refresh_intervals\": [ \"30s\", \"1m\", \"5m\" ], \"time_options\": [ \"5m\", \"15m\", \"1h\", \"6h\", \"12h\", \"24h\", \"2d\", \"10d\" ] } } ","categories":"","description":"","excerpt":"Extending the Monitoring Stack This document provides instructions to …","ref":"/docs/gardener/monitoring-stack/","tags":"","title":"Monitoring Stack"},{"body":"Overview You can configure a NetworkPolicy to deny all the traffic from other namespaces while allowing all the traffic coming from the same namespace the pod was deployed into.\nThere are many reasons why you may choose to employ Kubernetes network policies:\n Isolate multi-tenant deployments Regulatory compliance Ensure containers assigned to different environments (e.g. dev/staging/prod) cannot interfere with each other Kubernetes network policies are application centric compared to infrastructure/network centric standard firewalls. There are no explicit CIDRs or IP addresses used for matching source or destination IPs. Network policies build upon labels and selectors, which are key concepts of Kubernetes that are used to organize (for example, all DB tier pods of an app) and select subsets of objects.\nExample We create two nginx HTTP-Servers in two namespaces and block all traffic between the two namespaces. E.g. 
you are unable to get content from namespace1 if you are sitting in namespace2.\nSetup the Namespaces # create two namespaces for test purpose kubectl create ns customer1 kubectl create ns customer2 # create a standard HTTP web server kubectl run nginx --image=nginx --replicas=1 --port=80 -n=customer1 kubectl run nginx --image=nginx --replicas=1 --port=80 -n=customer2 # expose the port 80 for external access kubectl expose deployment nginx --port=80 --type=NodePort -n=customer1 kubectl expose deployment nginx --port=80 --type=NodePort -n=customer2 Test Without NP Create a pod with curl preinstalled inside the namespace customer1:\n# create a \"bash\" pod in one namespace kubectl run -i --tty client --image=tutum/curl -n=customer1 Try to curl the exposed nginx server to get the default index.html page. Execute this in the bash prompt of the pod created above.\n# get the index.html from the nginx of the namespace \"customer1\" =\u003e success curl http://nginx.customer1 # get the index.html from the nginx of the namespace \"customer2\" =\u003e success curl http://nginx.customer2 Both calls are done in a pod within the namespace customer1 and both nginx servers are always reachable, no matter in what namespace.\n Test with NP Install the NetworkPolicy from your shell:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: deny-from-other-namespaces spec: podSelector: matchLabels: ingress: - from: - podSelector: {} it applies the policy to ALL pods in the named namespace as the spec.podSelector.matchLabels is empty and therefore selects all pods. it allows traffic from ALL pods in the named namespace, as spec.ingress.from.podSelector is empty and therefore selects all pods. 
kubectl apply -f ./network-policy.yaml -n=customer1 kubectl apply -f ./network-policy.yaml -n=customer2 After this, curl http://nginx.customer2 shouldn’t work anymore if you are a service inside the namespace customer1 and vice versa. Note This policy, once applied, will also disable all external traffic to these pods. For example, you can create a service of type LoadBalancer in namespace customer1 that matches the nginx pod. When you request the service by its \u003cEXTERNAL_IP\u003e:\u003cPORT\u003e, the network policy will deny the ingress traffic from the service and the request will time out. Related Links You can get more information on how to configure the NetworkPolicies at:\n Calico WebSite Kubernetes NP Recipes ","categories":"","description":"Deny all traffic from other namespaces","excerpt":"Deny all traffic from other namespaces","ref":"/docs/guides/applications/network-isolation/","tags":"","title":"Namespace Isolation"},{"body":"Necessary Labeling for Custom CSI Components Some provider extensions for Gardener are using CSI components to manage persistent volumes in the shoot clusters. Additionally, most of the provider extensions are deploying controllers for taking volume snapshots (CSI snapshotter).\nEnd-users can deploy their own CSI components and controllers into shoot clusters. In such situations, there are multiple controllers acting on the VolumeSnapshot custom resources (each responsible for those instances associated with their respective driver provisioner types).\nHowever, this might lead to operational conflicts that cannot be overcome by Gardener alone. Concretely, Gardener cannot know which custom CSI components were installed by end-users which can lead to issues, especially during shoot cluster deletion. You can add a label to your custom CSI components indicating that Gardener should not try to remove them during shoot cluster deletion. 
This means you have to take care of the lifecycle for these components yourself!\nRecommendations Custom CSI components are typically regular Deployments running in the shoot clusters.\nPlease label them with the shoot.gardener.cloud/no-cleanup=true label.\nBackground Information When a shoot cluster is deleted, Gardener deletes most Kubernetes resources (Deployments, DaemonSets, StatefulSets, etc.). Gardener will also try to delete CSI components if they are not marked with the above-mentioned label.\nThis can result in VolumeSnapshot resources still having finalizers that will never be cleaned up. Consequently, manual intervention is required to clean them up before the cluster deletion can continue.\n","categories":"","description":"","excerpt":"Necessary Labeling for Custom CSI Components Some provider extensions …","ref":"/docs/gardener/csi_components/","tags":"","title":"Necessary Labeling for Custom CSI Components"},{"body":"Gardener Network Extension Gardener is an open-source project that provides a nested user model. Basically, there are two types of services provided by Gardener to its users:\n Managed: end-users only request a Kubernetes cluster (Clusters-as-a-Service) Hosted: operators utilize Gardener to provide their own managed version of Kubernetes (Cluster-Provisioner-as-a-service) Whether a user is an operator or an end-user, it makes sense to provide choice. For example, for an end-user it might make sense to choose a network-plugin that would support enforcing network policies (some plugins do not come with network-policy support by default). For operators however, choice only matters for delegation purposes i.e., when providing an own managed-service, it becomes important to also provide choice over which network-plugins to use.\nFurthermore, Gardener provisions clusters on different cloud-providers with different networking requirements. 
For example, Azure does not support Calico overlay networking with IP in IP [1]. This leads to the introduction of manual exceptions in static add-on charts, which is error-prone and can lead to failures during upgrades.\nFinally, every provider is different, and thus the network always needs to adapt to the infrastructure needs to provide better performance. Consistency does not necessarily lie in the implementation but in the interface.\nMotivation Prior to the Network Extensibility concept, Gardener followed a mono network-plugin support model (i.e., Calico). Although this seemed to be the easier approach, it did not completely reflect the real use-case. The goal of the Gardener Network Extensions is to support different network plugins. Therefore, the specification for the network resource won’t be fixed and will be customized based on the underlying network plugin.\nTo do so, a ProviderConfig field will be provided in the spec, in which each plugin defines its own configuration. Below is an example for how to deploy Calico as the cluster network plugin.\nThe Network Extensions Resource Here is what a typical Network resource looks like:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Network metadata: name: my-network spec: ipFamilies: - IPv4 podCIDR: 100.244.0.0/16 serviceCIDR: 100.32.0.0/13 type: calico providerConfig: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig backend: bird ipam: cidr: usePodCIDR type: host-local The above resource is divided into two parts (more information can be found at Using the Networking Calico Extension):\n global configuration (e.g., podCIDR, serviceCIDR, and type) provider specific config (e.g., for calico we can choose to configure a bird backend) Note: Certain cloud-provider extensions might have webhooks that would modify the network-resource to fit into their network specific context. 
As previously mentioned, Azure does not support IPIP. As a result, the Azure provider extension implements a webhook to mutate the backend and set it to None instead of bird.\n Supporting a New Network Extension Provider To add support for another networking provider (e.g., weave, Cilium, Flannel), a network extension controller needs to be implemented, which would optionally have its own custom configuration specified in the spec.providerConfig in the Network resource. For example, if support for a network plugin named gardenet is required, the following Network resource would be created:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Network metadata: name: my-network spec: ipFamilies: - IPv4 podCIDR: 100.244.0.0/16 serviceCIDR: 100.32.0.0/13 type: gardenet providerConfig: apiVersion: gardenet.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig gardenetCustomConfigField: \u003cvalue\u003e ipam: cidr: usePodCIDR type: host-local Once applied, the presumably implemented Gardenet extension controller would pick the configuration up, parse the providerConfig, and create the necessary resources in the shoot.\nFor additional reference, please have a look at the networking-calico provider extension, which provides more information on how to configure the necessary charts, as well as the actuators required to reconcile networking inside the Shoot cluster to the desired state.\nSupporting kube-proxy-less Service Routing Some networking extensions support service routing without the kube-proxy component. This is why Gardener supports disabling of kube-proxy for service routing by setting .spec.kubernetes.kubeproxy.enabled to false in the Shoot specification. 
The implicit contract of the flag is:\nIf kube-proxy is disabled, then the networking extension is responsible for the service routing.\nThe networking extensions need to handle this twofold:\n During the reconciliation of the networking resources, the extension needs to check whether kube-proxy takes care of the service routing or the networking extension itself should handle it. In case the networking extension should be responsible according to .spec.kubernetes.kubeproxy.enabled (but is unable to perform the service routing), it should raise an error during the reconciliation. If the networking extension should handle the service routing, it may reconfigure itself accordingly. (Optional) In case the networking extension does not support taking over the service routing (in some scenarios), it is recommended to also provide a validating admission webhook to reject corresponding changes early on. The validation may take the current operating mode of the networking extension into consideration. Related Links [1] Calico overlay networking on Azure ","categories":"","description":"","excerpt":"Gardener Network Extension Gardener is an open-source project that …","ref":"/docs/gardener/extensions/network/","tags":"","title":"Network"},{"body":"NetworkPolicys In Garden, Seed, Shoot Clusters This document describes which Kubernetes NetworkPolicys are deployed by Gardener into the various clusters.\nGarden Cluster (via gardener-operator and gardener-resource-manager)\nThe gardener-operator runs a NetworkPolicy controller which is responsible for the following namespaces:\n garden istio-system *istio-ingress-* shoot-* extension-* (in case the garden cluster is a seed cluster at the same time) It deploys the following so-called “general NetworkPolicys”:\n Name Purpose deny-all Denies all ingress and egress traffic for all pods in this namespace. Hence, all traffic must be explicitly allowed. 
allow-to-dns Allows egress traffic from pods labeled with networking.gardener.cloud/to-dns=allowed to DNS pods running in the kube-system namespace. In practice, most of the pods performing network egress traffic need this label. allow-to-runtime-apiserver Allows egress traffic from pods labeled with networking.gardener.cloud/to-runtime-apiserver=allowed to the API server of the runtime cluster. allow-to-blocked-cidrs Allows egress traffic from pods labeled with networking.gardener.cloud/to-blocked-cidrs=allowed to explicitly blocked addresses configured by human operators (configured via .spec.networking.blockedCIDRs in the Seed). For instance, this can be used to block the cloud provider’s metadata service. allow-to-public-networks Allows egress traffic from pods labeled with networking.gardener.cloud/to-public-networks=allowed to all public network IPs, except for private networks (RFC1918), carrier-grade NAT (RFC6598), and explicitly blocked addresses configured by human operators. In practice, this blocks egress traffic to all networks in the cluster and only allows egress traffic to public IPv4 addresses. allow-to-private-networks Allows egress traffic from pods labeled with networking.gardener.cloud/to-private-networks=allowed to the private networks (RFC1918) and carrier-grade NAT (RFC6598) except for cluster-specific networks (configured via .spec.networks in the Seed). Apart from those, the gardener-operator also enables the NetworkPolicy controller of gardener-resource-manager. Please find more information in the linked document. In summary, most of the pods that initiate connections with other pods will have labels with networking.resources.gardener.cloud/ prefixes. This way, they leverage the NetworkPolicys automatically created by the controller. 
As a result, in most cases no special/custom-crafted NetworkPolicys need to be created anymore.\nSeed Cluster (via gardenlet and gardener-resource-manager)\nIn seed clusters it works the same way as in the garden cluster managed by gardener-operator. When a seed cluster is the garden cluster at the same time, gardenlet does not enable the NetworkPolicy controller (since gardener-operator already runs it). Otherwise, it uses the exact same controller and code as gardener-operator, resulting in the same behaviour in both garden and seed clusters.\nLogging \u0026 Monitoring Seed System Namespaces As part of the seed reconciliation flow, the gardenlet deploys various Prometheus instances into the garden namespace. See also this document for more information. Each pod that should be scraped for metrics by these instances must have a Service which is annotated with\nannotations: networking.resources.gardener.cloud/from-all-seed-scrape-targets-allowed-ports: '[{\"port\":\u003cmetrics-port-on-pod\u003e,\"protocol\":\"\u003cprotocol, typically TCP\u003e\"}]' If the respective pod is not running in the garden namespace, the Service needs this annotation in addition:\nannotations: networking.resources.gardener.cloud/namespace-selectors: '[{\"matchLabels\":{\"kubernetes.io/metadata.name\":\"garden\"}}]' If the respective pod is running in an extension-* namespace, the Service needs this annotation in addition:\nannotations: networking.resources.gardener.cloud/pod-label-selector-namespace-alias: extensions This automatically allows the needed network traffic from the respective Prometheus pods.\nShoot Namespaces As part of the shoot reconciliation flow, the gardenlet deploys a shoot-specific Prometheus into the shoot namespace. 
Each pod that should be scraped for metrics must have a Service which is annotated with\nannotations: networking.resources.gardener.cloud/from-all-scrape-targets-allowed-ports: '[{\"port\":\u003cmetrics-port-on-pod\u003e,\"protocol\":\"\u003cprotocol, typically TCP\u003e\"}]' This automatically allows the network traffic from the Prometheus pod.\nWebhook Servers Components serving webhook handlers that must be reached by kube-apiservers of the virtual garden cluster or shoot clusters just need to annotate their Service as follows:\nannotations: networking.resources.gardener.cloud/from-all-webhook-targets-allowed-ports: '[{\"port\":\u003cserver-port-on-pod\u003e,\"protocol\":\"\u003cprotocol, typically TCP\u003e\"}]' This automatically allows the network traffic from the API server pods.\nIn case the servers run in a different namespace than the kube-apiservers, the following annotations are needed:\nannotations: networking.resources.gardener.cloud/from-all-webhook-targets-allowed-ports: '[{\"port\":\u003cserver-port-on-pod\u003e,\"protocol\":\"\u003cprotocol, typically TCP\u003e\"}]' networking.resources.gardener.cloud/pod-label-selector-namespace-alias: extensions # for the virtual garden cluster: networking.resources.gardener.cloud/namespace-selectors: '[{\"matchLabels\":{\"kubernetes.io/metadata.name\":\"garden\"}}]' # for shoot clusters: networking.resources.gardener.cloud/namespace-selectors: '[{\"matchLabels\":{\"gardener.cloud/role\":\"shoot\"}}]' Additional Namespace Coverage in Garden/Seed Cluster In some cases, garden or seed clusters might run components in dedicated namespaces which are not covered by the controller by default (see list above). 
Still, it might(/should) be desired to also include such “custom namespaces” into the control of the NetworkPolicy controllers.\nIn order to do so, human operators can adapt the component configs of gardener-operator or gardenlet by providing label selectors for additional namespaces:\ncontrollers: networkPolicy: additionalNamespaceSelectors: - matchLabels: foo: bar Communication With kube-apiserver For Components In Custom Namespaces Egress Traffic Components running in such custom namespaces might need to initiate the communication with the kube-apiservers of the virtual garden cluster or a shoot cluster. In order to achieve this, their custom namespace must be labeled with networking.gardener.cloud/access-target-apiserver=allowed. This will make the NetworkPolicy controllers automatically provision the required policies into their namespace.\nAs a result, the respective component pods just need to be labeled with\n networking.resources.gardener.cloud/to-garden-virtual-garden-kube-apiserver-tcp-443=allowed (virtual garden cluster) networking.resources.gardener.cloud/to-all-shoots-kube-apiserver-tcp-443=allowed (shoot clusters) Ingress Traffic Components running in such custom namespaces might serve webhook handlers that must be reached by the kube-apiservers of the virtual garden cluster or a shoot cluster. In order to achieve this, their Service must be annotated. Please refer to this section for more information.\nShoot Cluster (via gardenlet)\nFor shoot clusters, the concepts mentioned above don’t apply and are not enabled. Instead, gardenlet only deploys a few “custom” NetworkPolicys for the shoot system components running in the kube-system namespace. All other namespaces in the shoot cluster do not contain network policies deployed by gardenlet.\nAs a best practice, every pod deployed into the kube-system namespace should use appropriate NetworkPolicys in order to only allow required network traffic. 
Therefore, pods should have labels matching the selectors of the available network policies.\ngardenlet deploys the following NetworkPolicys:\nNAME POD-SELECTOR gardener.cloud--allow-dns k8s-app in (kube-dns) gardener.cloud--allow-from-seed networking.gardener.cloud/from-seed=allowed gardener.cloud--allow-to-dns networking.gardener.cloud/to-dns=allowed gardener.cloud--allow-to-apiserver networking.gardener.cloud/to-apiserver=allowed gardener.cloud--allow-to-from-nginx app=nginx-ingress gardener.cloud--allow-to-kubelet networking.gardener.cloud/to-kubelet=allowed gardener.cloud--allow-to-public-networks networking.gardener.cloud/to-public-networks=allowed gardener.cloud--allow-vpn app=vpn-shoot Note that a deny-all policy will not be created by gardenlet. Shoot owners can create it manually if needed/desired. The NetworkPolicys listed above ensure that the traffic for the shoot system components is allowed in case such a deny-all policy is created.\nWebhook Servers in Shoot Clusters Shoot components serving webhook handlers must be reached by kube-apiservers of the shoot cluster. However, the control plane components, e.g. kube-apiserver, run on the seed cluster decoupled by a VPN connection. Therefore, shoot components serving webhook handlers need to allow the VPN endpoints in the shoot cluster as clients to allow kube-apiservers to call them.\nFor the kube-system namespace, the network policy gardener.cloud--allow-from-seed fulfils the purpose to allow pods to mark themselves as targets for such calls, allowing corresponding traffic to pass through.\nFor custom namespaces, operators can use the network policy gardener.cloud--allow-from-seed as a template. Please note that the label selector may change over time, i.e. with Gardener version updates. 
This is why a simpler variant with a reduced label selector like the example below is recommended:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: allow-from-seed namespace: custom-namespace spec: ingress: - from: - namespaceSelector: matchLabels: gardener.cloud/purpose: kube-system podSelector: matchLabels: app: vpn-shoot Implications for Gardener Extensions Gardener extensions sometimes need to deploy additional components into the shoot namespace in the seed cluster hosting the control plane. For example, the gardener-extension-provider-aws deploys the cloud-controller-manager into the shoot namespace. In most cases, such pods require network policy labels to allow the traffic they are initiating.\nFor components deployed in the kube-system namespace of the shoots (e.g., CNI plugins or CSI drivers, etc.), custom NetworkPolicys might be required to ensure the respective components can still communicate in case the user creates a deny-all policy.\n","categories":"","description":"","excerpt":"NetworkPolicys In Garden, Seed, Shoot Clusters This document describes …","ref":"/docs/gardener/network_policies/","tags":"","title":"Network Policies"},{"body":"Gardener Extension for Network Problem Detector \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the shoot-networking-problemdetector extension.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nExtension Resources Currently there is nothing to specify in the extension spec.\nExample extension resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: extension-shoot-networking-problemdetector namespace: shoot--project--abc spec: When an extension resource is reconciled, the extension controller will create two daemonsets nwpd-agent-pod-net and nwpd-agent-node-net deploying the “network problem detector agent”. These daemon sets perform and collect various checks between all nodes of the Kubernetes cluster, to its Kube API server and/or external endpoints. Checks are performed using TCP connections, PING (ICMP) or mDNS (UDP). More details about the network problem detector agent can be found in its repository gardener/network-problem-detector.\nPlease note, this extension controller relies on the Gardener-Resource-Manager to deploy k8s resources to seed and shoot clusters.\nHow to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nWe are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension for deploying network problem detector","excerpt":"Gardener extension for deploying network problem detector","ref":"/docs/extensions/others/gardener-extension-shoot-networking-problemdetector/","tags":"","title":"Networking problemdetector"},{"body":"Adding Cloud Providers This document provides an overview of how to integrate a new cloud provider into Gardener. Each component that requires integration has a detailed description of how to integrate it and the steps required.\nCloud Components Gardener is composed of 2 or more Kubernetes clusters:\n Shoot: These are the end-user clusters, the regular Kubernetes clusters you have seen. They provide places for your workloads to run. Seed: This is the “management” cluster. It manages the control planes of shoots by running them as native Kubernetes workloads. These two clusters can run in the same cloud provider, but they do not need to. For example, you could run your Seed in AWS, while having one shoot in Azure, two in Google, two in Alicloud, and three in Equinix Metal.\nThe Seed cluster deploys and manages the Shoot clusters. Importantly, for this discussion, the etcd data store backing each Shoot runs as workloads inside the Seed. 
Thus, to use the above example, the clusters in Azure, Google, Alicloud and Equinix Metal will have their worker nodes and master nodes running in those clouds, but the etcd clusters backing them will run as separate deployments in the Seed Kubernetes cluster on AWS.\nThis distinction becomes important when preparing the integration to a new cloud provider.\nGardener Cloud Integration Gardener and its related components integrate with cloud providers at the following key lifecycle elements:\n Create/destroy/get/list machines for the Shoot. Create/destroy/get/list infrastructure components for the Shoot, e.g. VPCs, subnets, routes, etc. Backup/restore etcd for the Seed via writing files to and reading them from object storage. Thus, the integrations you need for your cloud provider depend on whether you want to deploy Shoot clusters to the provider, Seed or both.\n Shoot Only: machine lifecycle management, infrastructure Seed: etcd backup/restore Gardener API In addition to the requirements to integrate with the cloud provider, you also need to enable the core Gardener app to receive, validate, and process requests to use that cloud provider.\n Expose the cloud provider to the consumers of the Gardener API, so it can be told to use that cloud provider as an option. Validate that API as requests come in. Write cloud provider specific implementation (called “provider extension”). Cloud Provider API Requirements In order for a cloud provider to integrate with Gardener, the provider must have an API to perform machine lifecycle events, specifically:\n Create a machine Destroy a machine Get information about a machine and its state List machines In addition, if the Seed is to run on the given provider, it also must have an API to save files to object storage and retrieve them, for etcd backup/restore.\nThe current integration with cloud providers is to add their API calls to Gardener and the Machine Controller Manager. 
As both Gardener and the Machine Controller Manager are written in Go, the cloud provider should have a Go SDK. However, if it has an API that is wrappable in Go, e.g. a REST API, then you can use that to integrate.\nThe Gardener team is working on bringing cloud provider integrations out-of-tree, making them pluggable, which should simplify the process and make it possible to use other SDKs.\nSummary To add a new cloud provider, you need some or all of the following. Each repository contains instructions on how to extend it to a new cloud provider.\n Type Purpose Location Documentation Seed or Shoot Machine Lifecycle machine-controller-manager MCM new cloud provider Seed only etcd backup/restore etcd-backup-restore In process All Extension implementation gardener Extension controller ","categories":"","description":"","excerpt":"Adding Cloud Providers This document provides an overview of how to …","ref":"/docs/gardener/new-cloud-provider/","tags":"","title":"New Cloud Provider"},{"body":"Adding Support For a New Kubernetes Version This document describes the steps that need to be performed in order to confidently add support for a new Kubernetes minor version.\n ⚠️ Typically, once a minor Kubernetes version vX.Y is supported by Gardener, then all patch versions vX.Y.Z are also automatically supported without any required action. This is because patch versions do not introduce any new feature or API changes, so there is nothing that needs to be adapted in gardener/gardener code.\n The Kubernetes community releases a new minor version roughly every 4 months. Please refer to the official documentation about their release cycles for any additional information.\nShortly before a new release, an “umbrella” issue should be opened which is used to collect the required adaptations and to track the work items. For example, #5102 can be used as a template for the issue description. 
As you can see, the task of supporting a new Kubernetes version also includes the provider extensions maintained in the gardener GitHub organization and is not restricted to gardener/gardener only.\nGenerally, the work items can be split into two groups: The first group contains tasks specific to the changes in the given Kubernetes release, the second group contains Kubernetes release-independent tasks.\n ℹ️ Upgrading the k8s.io/* and sigs.k8s.io/controller-runtime Golang dependencies is typically tracked and worked on separately (see e.g. #4772 or #5282).\n Deriving Release-Specific Tasks Most new minor Kubernetes releases incorporate API changes, deprecations, or new features. The community announces them via their change logs. In order to derive the release-specific tasks, the respective change log for the new version vX.Y has to be read and understood (for example, the changelog for v1.24).\nAs already mentioned, typical changes to watch out for are:\n API version promotions or deprecations Feature gate promotions or deprecations CLI flag changes for Kubernetes components New default values in resources New available fields in resources New features potentially relevant for the Gardener system Changes of labels or annotations Gardener relies on … Obviously, this requires a certain experience and understanding of the Gardener project so that all “relevant changes” can be identified. While reading the change log, add the tasks (along with the respective PR in kubernetes/kubernetes) to the umbrella issue.\n ℹ️ Some of the changes might be specific to certain cloud providers. 
Pay attention to those as well and add related tasks to the issue.\n List Of Release-Independent Tasks The following paragraphs describe recurring tasks that need to be performed for each new release.\nMake Sure a New hyperkube Image Is Released The gardener/hyperkube repository is used to release container images consisting of the kubectl and kubelet binaries.\nThere is a CI/CD job that runs periodically and releases a new hyperkube image when there is a new Kubernetes release. Before proceeding with the next steps, make sure that a new hyperkube image is released for the corresponding new Kubernetes minor version. Make sure that container image is present in GCR.\nAdapting Gardener Allow instantiation of a Kubernetes client for the new minor version and update the README.md: See this example commit. The list of supported versions is meanwhile maintained here in the SupportedVersions variable. Maintain the Kubernetes feature gates used for validation of Shoot resources: The feature gates are maintained in this file. To maintain this list for new Kubernetes versions, run hack/compare-k8s-feature-gates.sh \u003cold-version\u003e \u003cnew-version\u003e (e.g. hack/compare-k8s-feature-gates.sh v1.26 v1.27). It will present 3 lists of feature gates: those added and those removed in \u003cnew-version\u003e compared to \u003cold-version\u003e and feature gates that got locked to default in \u003cnew-version\u003e. Add all added feature gates to the map with \u003cnew-version\u003e as AddedInVersion and no RemovedInVersion. For any removed feature gates, add \u003cnew-version\u003e as RemovedInVersion to the already existing feature gate in the map. For feature gates locked to default, add \u003cnew-version\u003e as LockedToDefaultInVersion to the already existing feature gate in the map. See this example commit. Maintain the Kubernetes kube-apiserver admission plugins used for validation of Shoot resources: The admission plugins are maintained in this file. 
To maintain this list for new Kubernetes versions, run hack/compare-k8s-admission-plugins.sh \u003cold-version\u003e \u003cnew-version\u003e (e.g. hack/compare-k8s-admission-plugins.sh 1.26 1.27). It will present 2 lists of admission plugins: those added and those removed in \u003cnew-version\u003e compared to \u003cold-version\u003e. Add all added admission plugins to the admissionPluginsVersionRanges map with \u003cnew-version\u003e as AddedInVersion and no RemovedInVersion. For any removed admission plugins, add \u003cnew-version\u003e as RemovedInVersion to the already existing admission plugin in the map. Flag any admission plugins that are required (plugins that must not be disabled in the Shoot spec) by setting the Required boolean variable to true for the admission plugin in the map. Flag any admission plugins that are forbidden by setting the Forbidden boolean variable to true for the admission plugin in the map. Maintain the Kubernetes kube-apiserver API groups used for validation of Shoot resources: The API groups are maintained in this file. To maintain this list for new Kubernetes versions, run hack/compare-k8s-api-groups.sh \u003cold-version\u003e \u003cnew-version\u003e (e.g. hack/compare-k8s-api-groups.sh 1.26 1.27). It will present 2 lists of API GroupVersions and 2 lists of API GroupVersionResources: those added and those removed in \u003cnew-version\u003e compared to \u003cold-version\u003e. Add all added group versions to the apiGroupVersionRanges map and group version resources to the apiGVRVersionRanges map with \u003cnew-version\u003e as AddedInVersion and no RemovedInVersion. For any removed APIs, add \u003cnew-version\u003e as RemovedInVersion to the already existing API in the corresponding map. Flag any APIs that are required (APIs that must not be disabled in the Shoot spec) by setting the Required boolean variable to true for the API in the apiGVRVersionRanges map. 
If this API also should not be disabled for Workerless Shoots, then set RequiredForWorkerless boolean variable also to true. If the API is required for both Shoot types, then both of these booleans need to be set to true. If the whole API Group is required, then mark it correspondingly in the apiGroupVersionRanges map. Maintain the Kubernetes kube-controller-manager controllers for each API group used in deploying required KCM controllers based on active APIs: The API groups are maintained in this file. To maintain this list for new Kubernetes versions, run hack/compute-k8s-controllers.sh \u003cold-version\u003e \u003cnew-version\u003e (e.g. hack/compute-k8s-controllers.sh 1.28 1.29). If it complains that the path for the controller is not present in the map, check the release branch of the new Kubernetes version and find the correct path for the missing/wrong controller. You can do so by checking the file cmd/kube-controller-manager/app/controllermanager.go and where the controller is initialized from. As of now, there is no straight-forward way to map each controller to its file. If this has improved, please enhance the script. If the paths are correct, it will present 2 lists of controllers: those added and those removed for each API group in \u003cnew-version\u003e compared to \u003cold-version\u003e. Add all added controllers to the APIGroupControllerMap map and under the corresponding API group with \u003cnew-version\u003e as AddedInVersion and no RemovedInVersion. For any removed controllers, add \u003cnew-version\u003e as RemovedInVersion to the already existing controller in the corresponding API group map. If you are unable to find the removed controller name, then check for its alias. Either in the staging/src/k8s.io/cloud-provider/names/controller_names.go file (example) or in the cmd/kube-controller-manager/app/* files (example for apps API group). 
This is because for Kubernetes versions starting from v1.28, we don’t maintain the aliases in the controller, but the controller names themselves, since some controllers can be initialized without aliases as well (example). The old alias should still be working since it should be backwards compatible as explained here. Once the support for Kubernetes versions \u003c v1.28 is dropped, we can drop the usages of these aliases and move completely to controller names. Make sure that the API groups in this file are in sync with the groups in this file. For example, core/v1 is replaced by the script as v1 and apiserverinternal as internal. This is because the API groups registered by the apiserver (example) and the file path imported by the controllers (example) might be slightly different in some cases. Maintain the ServiceAccount names for the controllers part of kube-controller-manager: The names are maintained in this file. To maintain this list for new Kubernetes versions, run hack/compare-k8s-controllers.sh \u003cold-version\u003e \u003cnew-version\u003e (e.g. hack/compare-k8s-controllers.sh 1.26 1.27). It will present 2 lists of controllers: those added and those removed in \u003cnew-version\u003e compared to \u003cold-version\u003e. Double check whether such ServiceAccount indeed appears in the kube-system namespace when creating a cluster with \u003cnew-version\u003e. Note that it sometimes might be hidden behind a default-off feature gate. You can create a local cluster with the new version using the local provider. It can happen that the name of the controller is used in the form of a constant and not a string (see example). In that case, note the value of the constant separately. You could also cross check the names with the result of the compute-k8s-controllers.sh script used in the previous step. If it appears, add all added controllers to the list based on the Kubernetes version (example). 
For any removed controllers, add them only to the Kubernetes version if it is low enough. Maintain the names of controllers used for workerless Shoots, here after carefully evaluating whether they are needed if there are no workers. Maintain copies of the DaemonSet controller’s scheduling logic: gardener-resource-manager’s Node controller uses a copy of parts of the DaemonSet controller’s logic for determining whether a specific Node should run a daemon pod of a given DaemonSet: see this file. Check the referenced upstream files for changes to the DaemonSet controller’s logic and adapt our copies accordingly. This might include introducing version-specific checks in our codebase to handle different shoot cluster versions. Maintain version specific defaulting logic in shoot admission plugin: Sometimes default values for shoots are intentionally changed with the introduction of a new Kubernetes version. The final Kubernetes version for a shoot is determined in the Shoot Validator Admission Plugin. Any defaulting logic that depends on the version should be placed in this admission plugin (example). Ensure that maintenance-controller is able to auto-update shoots to the new Kubernetes version. Changes to the shoot spec required for the Kubernetes update should be enforced in such cases (examples). Bump the used Kubernetes version for local e2e test. See this example commit. Filing the Pull Request Work on all the tasks you have collected and validate them using the local provider. Execute the e2e tests and if everything looks good, then go ahead and file the PR (example PR). 
Generally, it is great if you also add the PRs to the umbrella issue so that they can be tracked more easily.\nAdapting Provider Extensions After the PR in gardener/gardener for the support of the new version has been merged, you can go ahead and work on the provider extensions.\n Actually, you can already start even if the PR is not yet merged and use the branch of your fork.\n Update the github.com/gardener/gardener dependency in the extension and update the README.md. Work on release-specific tasks related to this provider. Maintaining the cloud-controller-manager Images Some of the cloud providers are not yet using upstream cloud-controller-manager images. Instead, we build and maintain them ourselves:\n cloud-provider-gcp Until we switch to upstream images, you need to update the Kubernetes dependencies and release a new image. The required steps are as follows:\n Check out the legacy-cloud-provider branch of the respective repository. Bump the versions in the Dockerfile (example commit). Update the VERSION to vX.Y.Z-dev where Z is the latest available Kubernetes patch version for the vX.Y minor version. Update the k8s.io/* dependencies in the go.mod file to vX.Y.Z and run go mod tidy (example commit). Check out a new release-vX.Y branch and release it (example). While you are at it, it is great if you also bump the k8s.io/* dependencies for the last three minor releases. In this case, you need to check out the release-vX.{Y-{1,2,3}} branches and only perform the last three steps (example branch, example commit).\n Now you need to update the new releases in the imagevector/images.yaml of the respective provider extension so that they are used (see this example commit for reference).\nFiling the Pull Request Again, work on all the tasks you have collected. This time, you cannot use the local provider for validation but should create real clusters on the various infrastructures. 
Typically, the following validations should be performed:\n Create new clusters with versions \u003c vX.Y Create new clusters with version = vX.Y Upgrade old clusters from version vX.{Y-1} to version vX.Y Delete clusters with versions \u003c vX.Y Delete clusters with version = vX.Y If everything looks good, then go ahead and file the PR (example PR). Generally, it is again great if you add the PRs also to the umbrella issue so that they can be tracked more easily.\n","categories":"","description":"","excerpt":"Adding Support For a New Kubernetes Version This document describes …","ref":"/docs/gardener/new-kubernetes-version/","tags":"","title":"New Kubernetes Version"},{"body":"Gardener Extension to configure rsyslog with relp module \nGardener extension controller which configures the rsyslog and auditd services installed on shoot nodes.\nUsage Configuring the Rsyslog Relp Extension - learn what is the use-case for rsyslog-relp, how to enable it and configure it Local Setup and Development Deploying the Rsyslog Relp Extension Locally - learn how to set up a local development environment Developer Docs for Gardener Shoot Rsyslog Relp Extension - learn about the inner workings ","categories":"","description":"Gardener extension controller which configures the rsyslog and auditd services installed on shoot nodes.","excerpt":"Gardener extension controller which configures the rsyslog and auditd …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/","tags":"","title":"Node Audit Logging"},{"body":"NodeLocalDNS Configuration This is a short guide describing how to enable DNS caching on the shoot cluster nodes.\nBackground Currently in Gardener we are using CoreDNS as a deployment that is auto-scaled horizontally to cover for QPS-intensive applications. However, doing so does not seem to be enough to completely circumvent DNS bottlenecks such as:\n Cloud provider limits for DNS lookups. 
Unreliable UDP connections that force a timeout period in case packets are dropped. Unnecessary node hopping since CoreDNS is not deployed on all nodes, and as a result DNS queries end up traversing multiple nodes before reaching the destination server. Inefficient load-balancing of services (e.g., round-robin might not be enough when using IPTables mode) and more … To work around the issues described above, node-local-dns was introduced. The architecture is described below. The idea is simple:\n For new queries, the connection is upgraded from UDP to TCP and forwarded towards the cluster IP for the original CoreDNS server. For previously resolved queries, an immediate response from the same node where the requester workload / pod resides is provided. Configuring NodeLocalDNS All that needs to be done to enable the usage of the node-local-dns feature is to set the corresponding option (spec.systemComponents.nodeLocalDNS.enabled) in the Shoot resource to true:\n... spec: ... systemComponents: nodeLocalDNS: enabled: true ... It is worth noting that:\n When migrating from IPVS to IPTables, existing pods will continue to leverage the node-local-dns cache. When migrating from IPTables to IPVS, only newer pods will be switched to the node-local-dns cache. During the reconfiguration of node-local-dns, there might be a short disruption in terms of domain name resolution depending on the setup. Usually, DNS requests are repeated for some time as UDP is an unreliable protocol, but that strictly depends on the application and the way the domain name resolution happens. It is recommended to let the shoot be reconciled during the next maintenance period. Enabling or disabling node-local-dns triggers a rollout of all shoot worker nodes, see also this document. For more information about node-local-dns, please refer to the KEP or to the usage documentation.\nKnown Issues Custom DNS configuration may not work as expected in conjunction with NodeLocalDNS. 
Please refer to Custom DNS Configuration.\n","categories":"","description":"","excerpt":"NodeLocalDNS Configuration This is a short guide describing how to …","ref":"/docs/gardener/node-local-dns/","tags":"","title":"NodeLocalDNS Configuration"},{"body":"Gardener Extension for openid connect services \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the shoot-oidc-service extension.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nCompatibility The following lists compatibility requirements of this extension controller with regards to other Gardener components.\n OIDC Extension Gardener Notes == v0.15.0 \u003e= 1.60.0 \u003c= v1.64.0 A typical side-effect when running Gardener \u003c v1.63.0 is an unexpected scale-down of the OIDC webhook from 2 -\u003e 1. == v0.16.0 \u003e= 1.65.0 Extension Resources Example extension resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: extension-shoot-oidc-service namespace: shoot--project--abc spec: type: shoot-oidc-service When an extension resource is reconciled, the extension controller will create an instance of OIDC Webhook Authenticator. These resources are placed inside the shoot namespace on the seed. 
Also, the controller takes care of generating the necessary RBAC resources for the seed as well as for the shoot.\nPlease note, this extension controller relies on the Gardener-Resource-Manager to deploy k8s resources to seed and shoot clusters.\nHow to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nWe are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for OpenID Connect services for shoot clusters","excerpt":"Gardener extension controller for OpenID Connect services for shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-oidc-service/","tags":"","title":"OpenID Connect services"},{"body":"ClusterOpenIDConnectPreset and OpenIDConnectPreset This page provides an overview of ClusterOpenIDConnectPresets and OpenIDConnectPresets, which are objects for injecting OpenIDConnect Configuration into Shoot at creation time. The injected information contains configuration for the Kube API Server and optionally configuration for kubeconfig generation using said configuration.\nOpenIDConnectPreset An OpenIDConnectPreset is an API resource for injecting additional runtime OIDC requirements into a Shoot at creation time. 
You use label selectors to specify the Shoot to which a given OpenIDConnectPreset applies.\nUsing OpenIDConnectPresets allows project owners to avoid explicitly providing the same OIDC configuration for every Shoot in their Project.\nFor more information about the background, see the issue for OpenIDConnectPreset.\nHow OpenIDConnectPreset Works Gardener provides an admission controller (OpenIDConnectPreset) which, when enabled, applies OpenIDConnectPresets to incoming Shoot creation requests. When a Shoot creation request occurs, the system does the following:\n Retrieve all OpenIDConnectPresets available for use in the Shoot namespace.\n Check if the shoot label selectors of any OpenIDConnectPreset match the labels on the Shoot being created.\n If multiple presets are matched then only one is chosen and results are sorted based on:\n .spec.weight value. lexicographical ordering of their names (e.g., 002preset \u003e 001preset) If the Shoot already has a .spec.kubernetes.kubeAPIServer.oidcConfig, then no mutation occurs.\n Simple OpenIDConnectPreset Example This is a simple example to show how a Shoot is modified by the OpenIDConnectPreset:\napiVersion: settings.gardener.cloud/v1alpha1 kind: OpenIDConnectPreset metadata: name: test-1 namespace: default spec: shootSelector: matchLabels: oidc: enabled server: clientID: test-1 issuerURL: https://foo.bar # caBundle: | # -----BEGIN CERTIFICATE----- # Li4u # -----END CERTIFICATE----- groupsClaim: groups-claim groupsPrefix: groups-prefix usernameClaim: username-claim usernamePrefix: username-prefix signingAlgs: - RS256 requiredClaims: key: value weight: 90 Create the OpenIDConnectPreset:\nkubectl apply -f preset.yaml Examine the created OpenIDConnectPreset:\nkubectl get openidconnectpresets NAME ISSUER SHOOT-SELECTOR AGE test-1 https://foo.bar oidc=enabled 1s Simple Shoot example:\nThis is a sample of a Shoot with some fields omitted:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: preset 
namespace: default labels: oidc: enabled spec: kubernetes: version: 1.20.2 Create the Shoot:\nkubectl apply -f shoot.yaml Examine the created Shoot:\nkubectl get shoot preset -o yaml apiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: preset namespace: default labels: oidc: enabled spec: kubernetes: kubeAPIServer: oidcConfig: clientID: test-1 groupsClaim: groups-claim groupsPrefix: groups-prefix issuerURL: https://foo.bar requiredClaims: key: value signingAlgs: - RS256 usernameClaim: username-claim usernamePrefix: username-prefix version: 1.20.2 Disable OpenIDConnectPreset The OpenIDConnectPreset admission control is enabled by default. To disable it, use the --disable-admission-plugins flag on the gardener-apiserver.\nFor example:\n--disable-admission-plugins=OpenIDConnectPreset ClusterOpenIDConnectPreset A ClusterOpenIDConnectPreset is an API resource for injecting additional runtime OIDC requirements into a Shoot at creation time. In contrast to OpenIDConnectPreset, it is a cluster-scoped resource. You use label selectors to specify the Project and Shoot to which a given ClusterOpenIDConnectPreset applies.\nUsing ClusterOpenIDConnectPresets allows cluster owners to avoid explicitly providing the same OIDC configuration for every Shoot in specific Projects.\nFor more information about the background, see the issue for ClusterOpenIDConnectPreset.\nHow ClusterOpenIDConnectPreset Works Gardener provides an admission controller (ClusterOpenIDConnectPreset) which, when enabled, applies ClusterOpenIDConnectPresets to incoming Shoot creation requests. 
When a Shoot creation request occurs, the system does the following:\n Retrieve all ClusterOpenIDConnectPresets available.\n Check if the project label selector of any ClusterOpenIDConnectPreset matches the labels of the Project in which the Shoot is being created.\n Check if the shoot label selectors of any ClusterOpenIDConnectPreset matches the labels on the Shoot being created.\n If multiple presets are matched then only one is chosen and results are sorted based on:\n .spec.weight value. lexicographically ordering their names ( e.g. 002preset \u003e 001preset ) If the Shoot already has a .spec.kubernetes.kubeAPIServer.oidcConfig then no mutation occurs.\n Note: Due to the previous requirement, if a Shoot is matched by both OpenIDConnectPreset and ClusterOpenIDConnectPreset, then OpenIDConnectPreset takes precedence over ClusterOpenIDConnectPreset.\n Simple ClusterOpenIDConnectPreset Example This is a simple example to show how a Shoot is modified by the ClusterOpenIDConnectPreset:\napiVersion: settings.gardener.cloud/v1alpha1 kind: ClusterOpenIDConnectPreset metadata: name: test spec: shootSelector: matchLabels: oidc: enabled projectSelector: {} # selects all projects. 
server: clientID: cluster-preset issuerURL: https://foo.bar # caBundle: | # -----BEGIN CERTIFICATE----- # Li4u # -----END CERTIFICATE----- groupsClaim: groups-claim groupsPrefix: groups-prefix usernameClaim: username-claim usernamePrefix: username-prefix signingAlgs: - RS256 requiredClaims: key: value weight: 90 Create the ClusterOpenIDConnectPreset:\nkubectl apply -f preset.yaml Examine the created ClusterOpenIDConnectPreset:\nkubectl get clusteropenidconnectpresets NAME ISSUER PROJECT-SELECTOR SHOOT-SELECTOR AGE test https://foo.bar \u003cnone\u003e oidc=enabled 1s This is a sample of a Shoot, with some fields omitted:\nkind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: preset namespace: default labels: oidc: enabled spec: kubernetes: version: 1.20.2 Create the Shoot:\nkubectl apply -f shoot.yaml Examine the created Shoot:\nkubectl get shoot preset -o yaml apiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: preset namespace: default labels: oidc: enabled spec: kubernetes: kubeAPIServer: oidcConfig: clientID: cluster-preset groupsClaim: groups-claim groupsPrefix: groups-prefix issuerURL: https://foo.bar requiredClaims: key: value signingAlgs: - RS256 usernameClaim: username-claim usernamePrefix: username-prefix version: 1.20.2 Disable ClusterOpenIDConnectPreset The ClusterOpenIDConnectPreset admission control is enabled by default. To disable it, use the --disable-admission-plugins flag on the gardener-apiserver.\nFor example:\n--disable-admission-plugins=ClusterOpenIDConnectPreset ","categories":"","description":"","excerpt":"ClusterOpenIDConnectPreset and OpenIDConnectPreset This page provides …","ref":"/docs/gardener/openidconnect-presets/","tags":"","title":"OpenIDConnect Presets"},{"body":"Register OpenID Connect provider in Shoot Clusters Introduction Within a shoot cluster, it is possible to dynamically register OpenID Connect providers. 
It is necessary that the Gardener installation your shoot cluster runs in is equipped with a shoot-oidc-service extension. Please ask your Gardener operator if the extension is available in your environment.\nShoot Feature Gate In most Gardener setups, the shoot-oidc-service extension is not enabled globally and thus must be configured per shoot cluster. Please adapt the shoot specification with the configuration shown below to activate the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-oidc-service ... OpenID Connect provider In order to register an OpenID Connect provider, an openidconnect resource should be deployed in the shoot cluster.\nIt is strongly recommended to NOT disable prefixing since it may result in unwanted impersonations. The rule of thumb is to always use meaningful and unique prefixes for both username and groups. A good way to ensure this is to use the name of the openidconnect resource as shown in the example below.\napiVersion: authentication.gardener.cloud/v1alpha1 kind: OpenIDConnect metadata: name: abc spec: # issuerURL is the URL the provider signs ID Tokens as. # This will be the \"iss\" field of all tokens produced by the provider and is used for configuration discovery. issuerURL: https://abc-oidc-provider.example # clientID is the audience for which the JWT must be issued, the \"aud\" field. clientID: my-shoot-cluster # usernameClaim is the JWT field to use as the user's username. usernameClaim: sub # usernamePrefix, if specified, causes claims mapping to username to be prefixed with the provided value. # A value \"oidc:\" would result in usernames like \"oidc:john\". # If not provided, the prefix defaults to \"( .metadata.name )/\". The value \"-\" can be used to disable all prefixing. usernamePrefix: \"abc:\" # groupsClaim, if specified, causes the OIDCAuthenticator to try to populate the user's groups with an ID Token field. 
# If the groupsClaim field is present in an ID Token the value must be a string or list of strings. # groupsClaim: groups # groupsPrefix, if specified, causes claims mapping to group names to be prefixed with the value. # A value \"oidc:\" would result in groups like \"oidc:engineering\" and \"oidc:marketing\". # If not provided, the prefix defaults to \"( .metadata.name )/\". # The value \"-\" can be used to disable all prefixing. # groupsPrefix: \"abc:\" # caBundle is a PEM encoded CA bundle which will be used to validate the OpenID server's certificate. If unspecified, the system's trusted certificates are used. # caBundle: \u003cbase64 encoded bundle\u003e # supportedSigningAlgs sets the accepted set of JOSE signing algorithms that can be used by the provider to sign tokens. # The default value is RS256. # supportedSigningAlgs: # - RS256 # requiredClaims, if specified, causes the OIDCAuthenticator to verify that all the # required claims key value pairs are present in the ID Token. # requiredClaims: # customclaim: requiredvalue # maxTokenExpirationSeconds, if specified, sets a limit in seconds to the maximum validity duration of a token. # Tokens issued with validity greater than this value will not be verified. # Setting this will require that the tokens have the \"iat\" and \"exp\" claims. # maxTokenExpirationSeconds: 3600 # jwks, if specified, provides an option to specify JWKS keys offline. # jwks: # keys is a base64 encoded JSON webkey Set. If specified, the OIDCAuthenticator skips the request to the issuer's jwks_uri endpoint to retrieve the keys. 
# keys: \u003cbase64 encoded jwks\u003e ","categories":"","description":"","excerpt":"Register OpenID Connect provider in Shoot Clusters Introduction Within …","ref":"/docs/extensions/others/gardener-extension-shoot-oidc-service/openidconnects/","tags":"","title":"Openidconnects"},{"body":"Contract: OperatingSystemConfig Resource Gardener uses the machine API and leverages the functionalities of the machine-controller-manager (MCM) in order to manage the worker nodes of a shoot cluster. The machine-controller-manager itself simply takes a reference to an OS-image and (optionally) some user-data (a script or configuration that is executed when a VM is bootstrapped), and forwards both to the provider’s API when creating VMs. MCM does not have any restrictions regarding supported operating systems as it does not modify or influence the machine’s configuration in any way - it just creates/deletes machines with the provided metadata.\nConsequently, Gardener needs to provide this information when interacting with the machine-controller-manager. This means that basically every operating system is possible to be used, as long as there is some implementation that generates the OS-specific configuration in order to provision/bootstrap the machines.\n⚠️ Currently, there are a few requirements of pre-installed components that must be present in all OS images:\n containerd ctr (client CLI) containerd must listen on its default socket path: unix:///run/containerd/containerd.sock containerd must be configured to work with the default configuration file in: /etc/containerd/config.toml (eventually created by Gardener). systemd The reasons for that will become evident later.\nWhat does the user-data bootstrapping the machines contain? Gardener installs a few components onto every worker machine in order to allow it to join the shoot cluster. 
There is the kubelet process, some scripts for continuously checking the health of kubelet and containerd, but also configuration for log rotation, CA certificates, etc. You can find the complete configuration in the components folder. We are calling this the “original” user-data.\nHow does Gardener bootstrap the machines? gardenlet makes use of gardener-node-agent to perform the bootstrapping and reconciliation of systemd units and files on the machine. Please refer to this document for a first overview.\nUsually, you would submit all the components you want to install onto the machine as part of the user-data during creation time. However, some providers do have a size limitation (around 16 KB) for that user-data. That’s why we do not send the “original” user-data to the machine-controller-manager (which then forwards it to the provider’s API). Instead, we only send a small “init” script that bootstraps the gardener-node-agent. The gardener-node-agent then fetches the “original” content from a Secret and applies it on the machine directly. This way we can extend the “original” user-data without any size restrictions (except for the 1 MB limit for Secrets).\nThe high-level flow is as follows:\n For every worker pool X in the Shoot specification, Gardener creates a Secret named cloud-config-\u003cX\u003e in the kube-system namespace of the shoot cluster. The secret contains the “original” OperatingSystemConfig (i.e., systemd units and files for kubelet, etc.). Gardener generates a kubeconfig with minimal permissions just allowing reading these secrets. It is used by the gardener-node-agent later. Gardener provides the gardener-node-init.sh bash script and the machine image stated in the Shoot specification to the machine-controller-manager. Based on this information, the machine-controller-manager creates the VM. After the VM has been provisioned, the gardener-node-init.sh script starts, fetches the gardener-node-agent binary, and starts it. 
The gardener-node-agent will read the gardener-node-agent-\u003cX\u003e Secret for its worker pool (containing the “original” OperatingSystemConfig), and reconciles it. The gardener-node-agent can update itself in case of newer Gardener versions, and it performs a continuous reconciliation of the systemd units and files in the provided OperatingSystemConfig (just like any other Kubernetes controller).\nWhat needs to be implemented to support a new operating system? As part of the Shoot reconciliation flow, gardenlet will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: \u003cmy-operating-system\u003e purpose: reconcile units: - name: containerd.service dropIns: - name: 10-containerd-opts.conf content: |[Service] Environment=\"SOME_OPTS=--foo=bar\" - name: containerd-monitor.service command: start enable: true content: |[Unit] Description=Containerd-monitor daemon After=kubelet.service [Install] WantedBy=multi-user.target [Service] Restart=always EnvironmentFile=/etc/environment ExecStart=/opt/bin/health-monitor containerd files: - path: /var/lib/kubelet/ca.crt permissions: 0644 encoding: b64 content: secretRef: name: default-token-5dtjz dataKey: token - path: /etc/sysctl.d/99-k8s-general.conf permissions: 0644 content: inline: data: |# A higher vm.max_map_count is great for elasticsearch, mongo, or other mmap users # See https://github.com/kubernetes/kops/issues/1340 vm.max_map_count = 135217728 In order to support a new operating system, you need to write a controller that watches all OperatingSystemConfigs with .spec.type=\u003cmy-operating-system\u003e. 
For those it shall generate a configuration blob that fits to your operating system.\nOperatingSystemConfigs can have two purposes: either provision or reconcile.\nprovision Purpose The provision purpose is used by gardenlet for the user-data that it later passes to the machine-controller-manager (and then to the provider’s API) when creating new VMs. It contains the gardener-node-init.sh script and systemd unit.\nThe OS controller has to translate the .spec.units and .spec.files into configuration that fits to the operating system. For example, a Flatcar controller might generate a CoreOS cloud-config or Ignition, SLES might generate cloud-init, and others might simply generate a bash script translating the .spec.units into systemd units, and .spec.files into real files on the disk.\n ⚠️ Please avoid mixing in additional systemd units or files - this step should just translate what gardenlet put into .spec.units and .spec.files.\n After generation, extension controllers are asked to store their OS config inside a Secret (as it might contain confidential data) in the same namespace. The secret’s .data could look like this:\napiVersion: v1 kind: Secret metadata: name: osc-result-pool-01-original namespace: default ownerReferences: - apiVersion: extensions.gardener.cloud/v1alpha1 blockOwnerDeletion: true controller: true kind: OperatingSystemConfig name: pool-01-original uid: 99c0c5ca-19b9-11e9-9ebd-d67077b40f82 data: cloud_config: base64(generated-user-data) Finally, the secret’s metadata must be provided in the OperatingSystemConfig’s .status field:\n... 
status: cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default lastOperation: description: Successfully generated cloud config lastUpdateTime: \"2019-01-23T07:45:23Z\" progress: 100 state: Succeeded type: Reconcile observedGeneration: 5 reconcile Purpose The reconcile purpose contains the “original” OperatingSystemConfig (which is later stored in Secrets in the shoot’s kube-system namespace (see step 1)).\nThe OS controller does not need to translate anything here, but it has the option to provide additional systemd units or files via the .status field:\nstatus: extensionUnits: - name: my-custom-service.service command: start enable: true content: |[Unit] // some systemd unit content extensionFiles: - path: /etc/some/file permissions: 0644 content: inline: data: some-file-content lastOperation: description: Successfully generated cloud config lastUpdateTime: \"2019-01-23T07:45:23Z\" progress: 100 state: Succeeded type: Reconcile observedGeneration: 5 The gardener-node-agent will merge .spec.units and .status.extensionUnits as well as .spec.files and .status.extensionFiles when applying.\nYou can find an example implementation here.\nBootstrap Tokens gardenlet adds a file with the content \u003c\u003cBOOTSTRAP_TOKEN\u003e\u003e to the OperatingSystemConfig with purpose provision and sets transmitUnencoded=true. This instructs the responsible OS extension to pass this file (with its content in clear-text) to the corresponding Worker resource.\nmachine-controller-manager makes sure that\n a bootstrap token gets created per machine the \u003c\u003cBOOTSTRAP_TOKEN\u003e\u003e string in the user data of the machine gets replaced by the generated token. After the machine has been bootstrapped, the token secret in the shoot cluster gets deleted again.\nThe token is used to bootstrap Gardener Node Agent and kubelet.\nWhat needs to be implemented to support a new operating system? 
As part of the shoot flow Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: \u003cmy-operating-system\u003e purpose: reconcile units: - name: docker.service dropIns: - name: 10-docker-opts.conf content: |[Service] Environment=\"DOCKER_OPTS=--log-opt max-size=60m --log-opt max-file=3\" - name: docker-monitor.service command: start enable: true content: |[Unit] Description=Containerd-monitor daemon After=kubelet.service [Install] WantedBy=multi-user.target [Service] Restart=always EnvironmentFile=/etc/environment ExecStart=/opt/bin/health-monitor docker files: - path: /var/lib/kubelet/ca.crt permissions: 0644 encoding: b64 content: secretRef: name: default-token-5dtjz dataKey: token - path: /etc/sysctl.d/99-k8s-general.conf permissions: 0644 content: inline: data: |# A higher vm.max_map_count is great for elasticsearch, mongo, or other mmap users # See https://github.com/kubernetes/kops/issues/1340 vm.max_map_count = 135217728 In order to support a new operating system, you need to write a controller that watches all OperatingSystemConfigs with .spec.type=\u003cmy-operating-system\u003e. For those it shall generate a configuration blob that fits to your operating system. For example, a CoreOS controller might generate a CoreOS cloud-config or Ignition, SLES might generate cloud-init, and others might simply generate a bash script translating the .spec.units into systemd units, and .spec.files into real files on the disk.\nOperatingSystemConfigs can have two purposes which can be used (or ignored) by the extension controllers: either provision or reconcile.\n The provision purpose is used by Gardener for the user-data that it later passes to the machine-controller-manager (and then to the provider’s API) when creating new VMs. 
It contains the gardener-node-init unit. The reconcile purpose contains the “original” user-data, which is then stored in Secrets in the shoot’s kube-system namespace (see step 1). This is downloaded and applied later (see step 5). As described above, the “original” user-data must be re-applicable to allow in-place updates. How this is done is specific to the generated operating system config (e.g., for CoreOS cloud-init the command is /usr/bin/coreos-cloudinit --from-file=\u003cpath\u003e, whereas SLES would run cloud-init --file \u003cpath\u003e single -n write_files --frequency=once). Consequently, besides the generated OS config, the extension controller must also provide a command for re-applying an updated version of the user-data. As the examples above show, the command requires a path to the user-data file. As soon as Gardener detects that the user data has changed, it will reload the systemd daemon and restart all the units provided in the .status.units[] list (see the example below). The same logic applies during the very first application of the whole configuration.\nAfter generation, extension controllers are asked to store their OS config inside a Secret (as it might contain confidential data) in the same namespace. The secret’s .data could look like this:\napiVersion: v1 kind: Secret metadata: name: osc-result-pool-01-original namespace: default ownerReferences: - apiVersion: extensions.gardener.cloud/v1alpha1 blockOwnerDeletion: true controller: true kind: OperatingSystemConfig name: pool-01-original uid: 99c0c5ca-19b9-11e9-9ebd-d67077b40f82 data: cloud_config: base64(generated-user-data) Finally, the secret’s metadata, the OS-specific command to re-apply the configuration, and the list of systemd units that shall be considered for a restart if an updated version of the user-data is re-applied must be provided in the OperatingSystemConfig’s .status field:\n... 
status: cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default lastOperation: description: Successfully generated cloud config lastUpdateTime: \"2019-01-23T07:45:23Z\" progress: 100 state: Succeeded type: Reconcile observedGeneration: 5 units: - docker-monitor.service Once the .status indicates that the extension controller finished reconciling Gardener will continue with the next step of the shoot reconciliation flow.\nCRI Support Gardener supports specifying a Container Runtime Interface (CRI) configuration in the OperatingSystemConfig resource. If the .spec.cri section exists, then the name property is mandatory. The only supported value for cri.name at the moment is: containerd. For example:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: \u003cmy-operating-system\u003e purpose: reconcile cri: name: containerd # cgroupDriver: cgroupfs # or systemd containerd: sandboxImage: registry.k8s.io/pause # registries: # - upstream: docker.io # server: https://registry-1.docker.io # hosts: # - url: http://\u003cservice-ip\u003e:\u003cport\u003e] # plugins: # - op: add # add (default) or remove # path: [io.containerd.grpc.v1.cri, containerd] # values: '{\"default_runtime_name\": \"runc\"}' ... To support containerd, an OS extension must satisfy the following criteria:\n The operating system must have built-in containerd and ctr (client CLI). containerd must listen on its default socket path: unix:///run/containerd/containerd.sock containerd must be configured to work with the default configuration file in: /etc/containerd/config.toml (Created by Gardener). For a convenient handling, gardener-node-agent can manage various aspects of containerd’s config, e.g. the registry configuration, if given in the OperatingSystemConfig. Any Gardener extension which needs to modify the config, should check the functionality exposed through this API first. 
If applicable, adjustments can be implemented through mutating webhooks, acting on the created or updated OperatingSystemConfig resource.\nIf CRI configurations are not supported, it is recommended to create a validating webhook running in the garden cluster that prevents specifying the .spec.providers.workers[].cri section in the Shoot objects.\nReferences and Additional Resources OperatingSystemConfig API (Golang Specification) Gardener Node Agent ","categories":"","description":"","excerpt":"Contract: OperatingSystemConfig Resource Gardener uses the machine API …","ref":"/docs/gardener/extensions/operatingsystemconfig/","tags":"","title":"Operatingsystemconfig"},{"body":"Using the Alicloud provider extension with Gardener as operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration. The core.gardener.cloud/v1beta1.Seed resource is structured similarly. Additionally, it allows configuring settings for the backups of the main etcds’ data of shoot clusters control planes running in this seed cluster.\nThis document explains the necessary configuration for this provider extension. In addition, this document also describes how to enable the use of customized machine images for Alicloud.\nCloudProfile resource This section describes, how the configuration for CloudProfile looks like for Alicloud by providing an example CloudProfile manifest with minimal configuration that can be used to allow the creation of Alicloud shoot clusters.\nCloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the Alicloud environment (AMIs). 
You have to map every version that you specify in .spec.machineImages[].versions here such that the Alicloud extension knows the AMI for every version you want to offer.\nAn example CloudProfileConfig for the Alicloud extension looks as follows:\napiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2023.4.0 regions: - name: eu-central-1 id: coreos_2023_4_0_64_30G_alibase_20190319.vhd Example CloudProfile manifest Please find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: alicloud spec: type: alicloud kubernetes: versions: - version: 1.27.3 - version: 1.26.8 expirationDate: \"2022-10-31T23:59:59Z\" machineImages: - name: coreos versions: - version: 2023.4.0 machineTypes: - name: ecs.sn2ne.large cpu: \"2\" gpu: \"0\" memory: 8Gi volumeTypes: - name: cloud_efficiency class: standard - name: cloud_essd class: premium regions: - name: eu-central-1 zones: - name: eu-central-1a - name: eu-central-1b providerConfig: apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2023.4.0 regions: - name: eu-central-1 id: coreos_2023_4_0_64_30G_alibase_20190319.vhd Enable customized machine images for the Alicloud extension Customized machine images can be created for an Alicloud account and shared with other Alicloud accounts. The same customized machine image has different image ID in different regions on Alicloud. If you need to enable encrypted system disk, you must provide customized machine images. Administrators/Operators need to explicitly declare them per imageID per region as below:\nmachineImages: - name: customized_coreos regions: - imageID: \u003cimage_id_in_eu_central_1\u003e region: eu-central-1 - imageID: \u003cimage_id_in_cn_shanghai\u003e region: cn-shanghai ... version: 2191.4.1 ... 
End-users have to have the permission to use the customized image from its creator Alicloud account. To enable end-users to use customized images, the images are shared from Alicloud account of Seed operator with end-users’ Alicloud accounts. Administrators/Operators need to explicitly provide Seed operator’s Alicloud account access credentials (base64 encoded) as below:\nmachineImageOwnerSecret: name: machine-image-owner accessKeyID: \u003cbase64_encoded_access_key_id\u003e accessKeySecret: \u003cbase64_encoded_access_key_secret\u003e As a result, a Secret named machine-image-owner by default will be created in namespace of Alicloud provider extension.\nOperators should also maintain custom image IDs which are to be shared with end-users as below:\ntoBeSharedImageIDs: - \u003cimage_id_1\u003e - \u003cimage_id_2\u003e - \u003cimage_id_3\u003e Example ControllerDeployment manifest for enabling customized machine images apiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment metadata: name: extension-provider-alicloud spec: type: helm providerConfig: chart: | H4sIFAAAAAAA/yk... values: config: machineImageOwnerSecret: accessKeyID: \u003cbase64_encoded_access_key_id\u003e accessKeySecret: \u003cbase64_encoded_access_key_secret\u003e toBeSharedImageIDs: - \u003cimage_id_1\u003e - \u003cimage_id_2\u003e ... machineImages: - name: customized_coreos regions: - imageID: \u003cimage_id_in_eu_central_1\u003e region: eu-central-1 - imageID: \u003cimage_id_in_cn_shanghai\u003e region: cn-shanghai ... version: 2191.4.1 ... csi: enableADController: true resources: limits: cpu: 500m memory: 1Gi requests: memory: 128Mi Seed resource This provider extension does not support any provider configuration for the Seed’s .spec.provider.providerConfig field. 
However, it supports managing backup infrastructure, i.e., you can specify a configuration for the .spec.backup field.\nBackup configuration A Seed of type alicloud can be configured to perform backups for the main etcds of the shoot clusters’ control planes using Alicloud Object Storage Service.\nThe location/region where the backups will be stored defaults to the region of the Seed (spec.provider.region).\nPlease find below an example Seed manifest (partly) that configures backups using Alicloud Object Storage Service.\n--- apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: my-seed spec: provider: type: alicloud region: cn-shanghai backup: provider: alicloud secretRef: name: backup-credentials namespace: garden ... An example of the referenced secret containing the credentials for the Alicloud Object Storage Service can be found in the example folder.\nPermissions for Alicloud Object Storage Service Please make sure the RAM user associated with the provided AccessKey pair has the following permission.\n AliyunOSSFullAccess ","categories":"","description":"","excerpt":"Using the Alicloud provider extension with Gardener as operator The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/operations/","tags":"","title":"Operations"},{"body":"Using the AWS provider extension with Gardener as operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration. The core.gardener.cloud/v1beta1.Seed resource is structured similarly. 
Additionally, it allows to configure settings for the backups of the main etcds’ data of shoot clusters control planes running in this seed cluster.\nThis document explains what is necessary to configure for this provider extension.\nCloudProfile resource In this section we are describing how the configuration for CloudProfiles looks like for AWS and provide an example CloudProfile manifest with minimal configuration that you can use to allow creating AWS shoot clusters.\nCloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the AWS environment (AMIs). You have to map every version that you specify in .spec.machineImages[].versions here such that the AWS extension knows the AMI for every version you want to offer. For each AMI an architecture field can be specified which specifies the CPU architecture of the machine on which given machine image can be used.\nAn example CloudProfileConfig for the AWS extension looks as follows:\napiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2135.6.0 regions: - name: eu-central-1 ami: ami-034fd8c3f4026eb39 # architecture: amd64 # optional Example CloudProfile manifest Please find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: aws spec: type: aws kubernetes: versions: - version: 1.27.3 - version: 1.26.8 expirationDate: \"2022-10-31T23:59:59Z\" machineImages: - name: coreos versions: - version: 2135.6.0 machineTypes: - name: m5.large cpu: \"2\" gpu: \"0\" memory: 8Gi usable: true volumeTypes: - name: gp2 class: standard usable: true - name: io1 class: premium usable: true regions: - name: eu-central-1 zones: - name: eu-central-1a - name: eu-central-1b - name: eu-central-1c providerConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 
2135.6.0 regions: - name: eu-central-1 ami: ami-034fd8c3f4026eb39 # architecture: amd64 # optional Seed resource This provider extension does not support any provider configuration for the Seed’s .spec.provider.providerConfig field. However, it supports managing backup infrastructure, i.e., you can specify a configuration for the .spec.backup field.\nBackup configuration Please find below an example Seed manifest (partly) that configures backups. As you can see, the location/region where the backups will be stored can be different from the region where the seed cluster is running.\napiVersion: v1 kind: Secret metadata: name: backup-credentials namespace: garden type: Opaque data: accessKeyID: base64(access-key-id) secretAccessKey: base64(secret-access-key) --- apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: my-seed spec: provider: type: aws region: eu-west-1 backup: provider: aws region: eu-central-1 secretRef: name: backup-credentials namespace: garden ... Please look up https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys as well.\nPermissions for AWS IAM user Please make sure that the provided credentials have the correct privileges. You can use the following AWS IAM policy document and attach it to the IAM user backed by the credentials you provided (please check the official AWS documentation as well):\n Click to expand the AWS IAM policy document! 
{ \"Version\": \"2012-10-17\", \"Statement\": [ { \"Effect\": \"Allow\", \"Action\": \"s3:*\", \"Resource\": \"*\" } ] } ","categories":"","description":"","excerpt":"Using the AWS provider extension with Gardener as operator The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/operations/","tags":"","title":"Operations"},{"body":"Using the Azure provider extension with Gardener as an operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration. The core.gardener.cloud/v1beta1.Seed resource is structured similarly. Additionally, it allows configuring settings for the backups of the main etcds’ data of shoot clusters’ control planes running in this seed cluster.\nThis document explains the necessary configuration for the Azure provider extension.\nCloudProfile resource This section describes what the configuration for CloudProfiles looks like for Azure by providing an example CloudProfile manifest with minimal configuration that can be used to allow the creation of Azure shoot clusters.\nCloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the Azure environment (image urn, id, communityGalleryImageID or sharedGalleryImageID). You have to map every version that you specify in .spec.machineImages[].versions to an available VM image in your subscription. The VM image can come from the Azure Marketplace, in which case it is identified via a urn; it can be a custom VM image from a shared image gallery, in which case it is identified by its sharedGalleryImageID; or it can come from a community image gallery, in which case it is identified by its communityGalleryImageID. You can also use the id field to specify the image location in the Azure compute gallery (in which case it would have a different kind of path), but this is not recommended because it sometimes causes problems with cross-subscription image sharing. 
For each machine image version, an architecture field can be specified that defines the CPU architecture of the machines on which the given machine image can be used.\nAn example CloudProfileConfig for the Azure extension looks as follows:\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig countUpdateDomains: - region: westeurope count: 5 countFaultDomains: - region: westeurope count: 3 machineTypes: - name: Standard_D3_v2 acceleratedNetworking: true - name: Standard_X machineImages: - name: coreos versions: - version: 2135.6.0 urn: \"CoreOS:CoreOS:Stable:2135.6.0\" # architecture: amd64 # optional acceleratedNetworking: true - name: myimage versions: - version: 1.0.0 id: \"/subscriptions/\u003csubscription ID where the gallery is located\u003e/resourceGroups/myGalleryRG/providers/Microsoft.Compute/galleries/myGallery/images/myImageDefinition/versions/1.0.0\" - name: GardenLinuxCommunityImage versions: - version: 1.0.0 communityGalleryImageID: \"/CommunityGalleries/gardenlinux-567905d8-921f-4a85-b423-1fbf4e249d90/Images/gardenlinux/Versions/576.1.1\" - name: SharedGalleryImageName versions: - version: 1.0.0 sharedGalleryImageID: \"/SharedGalleries/sharedGalleryName/Images/sharedGalleryImageName/Versions/sharedGalleryImageVersionName\" The cloud profile configuration contains the update domain counts via .countUpdateDomains[] and the fault domain counts via .countFaultDomains[] for the Azure regions you want to offer.\nThe .machineTypes[] list contains provider-specific information about the machine types, e.g., whether a machine type supports Azure Accelerated Networking (see .machineTypes[].acceleratedNetworking).\nAdditionally, it contains the real machine image identifiers in the Azure environment. You can provide either the URN for Azure Marketplace images or the id of Shared Image Gallery images. 
When a Shared Image Gallery is used, you have to ensure that the image is available in the desired regions and that the end-user subscriptions have access to the image or to the whole gallery. You have to map every version that you specify in .spec.machineImages[].versions here such that the Azure extension knows the machine image identifiers for every version you want to offer. Furthermore, you can specify for each image version via .machineImages[].versions[].acceleratedNetworking whether Azure Accelerated Networking is supported.\nExample CloudProfile manifest The possible values for .spec.volumeTypes[].name on Azure are Standard_LRS, StandardSSD_LRS and Premium_LRS. There is another volume type called UltraSSD_LRS, but this type cannot be used as an OS disk. If an end user selects a volume type whose name is not equal to one of the valid values, then the machine will be created with the default volume type that belongs to the selected machine type. Therefore, it is recommended to configure only the valid values for .spec.volumeTypes[].name in the CloudProfile.\nPlease find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: azure spec: type: azure kubernetes: versions: - version: 1.28.2 - version: 1.23.8 expirationDate: \"2022-10-31T23:59:59Z\" machineImages: - name: coreos versions: - version: 2135.6.0 machineTypes: - name: Standard_D3_v2 cpu: \"4\" gpu: \"0\" memory: 14Gi - name: Standard_D4_v3 cpu: \"4\" gpu: \"0\" memory: 16Gi volumeTypes: - name: Standard_LRS class: standard usable: true - name: StandardSSD_LRS class: premium usable: false - name: Premium_LRS class: premium usable: false regions: - name: westeurope providerConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineTypes: - name: Standard_D3_v2 acceleratedNetworking: true - name: Standard_D4_v3 countUpdateDomains: - region: westeurope count: 5 countFaultDomains: - region: westeurope count: 3 
machineImages: - name: coreos versions: - version: 2303.3.0 urn: CoreOS:CoreOS:Stable:2303.3.0 # architecture: amd64 # optional acceleratedNetworking: true - version: 2135.6.0 urn: \"CoreOS:CoreOS:Stable:2135.6.0\" # architecture: amd64 # optional Seed resource This provider extension does not support any provider configuration for the Seed’s .spec.provider.providerConfig field. However, it supports managing backup infrastructure, i.e., you can specify a configuration for the .spec.backup field.\nBackup configuration A Seed of type azure can be configured to perform backups for the main etcds of the shoot clusters’ control planes using Azure Blob storage.\nThe location/region where the backups will be stored defaults to the region of the Seed (spec.provider.region), but can also be explicitly configured via the field spec.backup.region. The region of the backup can be different from where the Seed cluster is running. However, usually it makes sense to pick the same region for the backup bucket as used for the Seed cluster.\nPlease find below an example Seed manifest (partly) that configures backups using Azure Blob storage.\n--- apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: my-seed spec: provider: type: azure region: westeurope backup: provider: azure region: westeurope # default region secretRef: name: backup-credentials namespace: garden ... The referenced secret has to contain the provider credentials of the Azure subscription. Please take a look here on how to create an Azure Application and Service Principal and how to obtain credentials. 
The example below demonstrates what the secret has to look like.\napiVersion: v1 kind: Secret metadata: name: core-azure namespace: garden-dev type: Opaque data: clientID: base64(client-id) clientSecret: base64(client-secret) subscriptionID: base64(subscription-id) tenantID: base64(tenant-id) Permissions for Azure Blob storage Please make sure the Azure application has the following IAM roles.\n Contributor Miscellaneous Gardener managed Service Principals The operators of the Gardener Azure extension can provide a list of managed service principals (technical users) that can be used for Azure Shoots. This eliminates the need for users to provide their own service principals for their clusters.\nThe user would need to grant the managed service principal access to their subscription with proper permissions.\nAs service principals are managed in an Azure Active Directory, a separate service principal needs to be provided for each supported Active Directory.\nIn case the user provides their own service principal in the Shoot secret, this one will be used instead of the managed one provided by the operator.\nEach managed service principal will be maintained in a Secret like this:\napiVersion: v1 kind: Secret metadata: name: service-principal-my-tenant namespace: extension-provider-azure labels: azure.provider.extensions.gardener.cloud/purpose: tenant-service-principal-secret data: tenantID: base64(my-tenant) clientID: base64(my-service-principal-id) clientSecret: base64(my-service-principal-secret) type: Opaque The user needs to provide a tenantID and subscriptionID in their Shoot secret.\nThe managed service principal will be assigned based on the tenantID. In case there is a managed service principal secret with a matching tenantID, this one will be used for the Shoot. 
If there is no matching managed service principal secret then the next Shoot operation will fail.\nOne of the benefits of having managed service principals is that the operator controls the lifecycle of the service principal and can rotate its secrets.\nAfter the service principal secret has been rotated and the corresponding secret is updated, all Shoot clusters using it need to be reconciled or the last operation to be retried.\n","categories":"","description":"","excerpt":"Using the Azure provider extension with Gardener as an operator The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/operations/","tags":"","title":"Operations"},{"body":"Using the Equinix Metal provider extension with Gardener as operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration.\nIn this document we are describing how this configuration looks like for Equinix Metal and provide an example CloudProfile manifest with minimal configuration that you can use to allow creating Equinix Metal shoot clusters.\nExample CloudProfile manifest Please find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: equinix-metal spec: type: equinixmetal kubernetes: versions: - version: 1.27.2 - version: 1.26.7 - version: 1.25.10 #expirationDate: \"2023-03-15T23:59:59Z\" machineImages: - name: flatcar versions: - version: 0.0.0-stable machineTypes: - name: t1.small cpu: \"4\" gpu: \"0\" memory: 8Gi usable: true regions: # List of offered metros - name: ny zones: # List of offered facilities within the respective metro - name: ewr1 - name: ny5 - name: ny7 providerConfig: apiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: flatcar versions: - version: 0.0.0-stable id: flatcar_stable - version: 3510.2.2 ipxeScriptUrl: 
https://stable.release.flatcar-linux.net/amd64-usr/3510.2.2/flatcar_production_packet.ipxe CloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the Equinix Metal environment (IDs). You have to map every version that you specify in .spec.machineImages[].versions here such that the Equinix Metal extension knows the ID for every version you want to offer.\nEquinix Metal supports two different options to specify the image:\n Supported Operating System: Images that are provided by Equinix Metal. They are referenced by their ID (slug). See the Operating Systems Reference (https://deploy.equinix.com/developers/docs/metal/operating-systems/supported/#operating-systems-reference) for all supported operating systems and their IDs. Custom iPXE Boot: Equinix Metal supports passing custom iPXE scripts during provisioning, which allows you to install a custom operating system manually. This is useful if you want to have a custom image or want to pin to a specific version. See Custom iPXE Boot for details. An example CloudProfileConfig for the Equinix Metal extension looks as follows:\napiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: flatcar versions: - version: 0.0.0-stable id: flatcar_stable - version: 3510.2.2 ipxeScriptUrl: https://stable.release.flatcar-linux.net/amd64-usr/3510.2.2/flatcar_production_packet.ipxe NOTE: CloudProfileConfig is not a Custom Resource, so you cannot create it directly.\n ","categories":"","description":"","excerpt":"Using the Equinix Metal provider extension with Gardener as operator …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-equinix-metal/operations/","tags":"","title":"Operations"},{"body":"Using the GCP provider extension with Gardener as operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration. 
The core.gardener.cloud/v1beta1.Seed resource is structured similarly. Additionally, it allows configuring settings for the backups of the main etcds’ data of shoot clusters control planes running in this seed cluster.\nThis document explains the necessary configuration for this provider extension.\nCloudProfile resource This section describes, how the configuration for CloudProfiles looks like for GCP by providing an example CloudProfile manifest with minimal configuration that can be used to allow the creation of GCP shoot clusters.\nCloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the GCP environment (image URLs). You have to map every version that you specify in .spec.machineImages[].versions here such that the GCP extension knows the image URL for every version you want to offer. For each machine image version an architecture field can be specified which specifies the CPU architecture of the machine on which given machine image can be used.\nAn example CloudProfileConfig for the GCP extension looks as follows:\napiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2135.6.0 image: projects/coreos-cloud/global/images/coreos-stable-2135-6-0-v20190801 # architecture: amd64 # optional Example CloudProfile manifest If you want to allow that shoots can create VMs with local SSDs volumes then you have to specify the type of the disk with SCRATCH in the .spec.volumeTypes[] list. 
Please find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: gcp spec: type: gcp kubernetes: versions: - version: 1.27.3 - version: 1.26.8 expirationDate: \"2022-10-31T23:59:59Z\" machineImages: - name: coreos versions: - version: 2135.6.0 machineTypes: - name: n1-standard-4 cpu: \"4\" gpu: \"0\" memory: 15Gi volumeTypes: - name: pd-standard class: standard - name: pd-ssd class: premium - name: SCRATCH class: standard regions: - name: europe-west1 zones: - name: europe-west1-b - name: europe-west1-c - name: europe-west1-d providerConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2135.6.0 image: projects/coreos-cloud/global/images/coreos-stable-2135-6-0-v20190801 # architecture: amd64 # optional Seed resource This provider extension does not support any provider configuration for the Seed’s .spec.provider.providerConfig field. However, it supports managing backup infrastructure, i.e., you can specify a configuration for the .spec.backup field.\nBackup configuration A Seed of type gcp can be configured to perform backups for the main etcds of the shoot clusters’ control planes using Google Cloud Storage buckets.\nThe location/region where the backups will be stored defaults to the region of the Seed (spec.provider.region), but can also be explicitly configured via the field spec.backup.region. The region of the backup can be different from where the seed cluster is running. 
However, usually it makes sense to pick the same region for the backup bucket as used for the Seed cluster.\nPlease find below an example Seed manifest (partly) that configures backups using Google Cloud Storage buckets.\n--- apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: my-seed spec: provider: type: gcp region: europe-west1 backup: provider: gcp region: europe-west1 # default region secretRef: name: backup-credentials namespace: garden ... An example of the referenced secret containing the credentials for the GCP Cloud storage can be found in the example folder.\nPermissions for GCP Cloud Storage Please make sure the service account associated with the provided credentials has the following IAM roles.\n Storage Admin ","categories":"","description":"","excerpt":"Using the GCP provider extension with Gardener as operator The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/operations/","tags":"","title":"Operations"},{"body":"Using the OpenStack provider extension with Gardener as operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration.\nIn this document we are describing how this configuration looks like for OpenStack and provide an example CloudProfile manifest with minimal configuration that you can use to allow creating OpenStack shoot clusters.\nCloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the OpenStack environment (image names). You have to map every version that you specify in .spec.machineImages[].versions here such that the OpenStack extension knows the image ID for every version you want to offer.\nIt also contains optional default values for DNS servers that shall be used for shoots. 
In the dnsServers[] list you can specify IP addresses that are used as DNS configuration for created shoot subnets.\nAlso, you have to specify the keystone URL of your environment in the keystoneURL field.\nAdditionally, you can influence the HTTP request timeout when talking to the OpenStack API in the requestTimeout field. This may help when you have, for example, a long list of load balancers in your environment.\nIn case your OpenStack system uses Octavia for network load balancing, you have to set the useOctavia field to true such that the cloud-controller-manager for OpenStack gets correctly configured (it defaults to false).\nSome hypervisors (especially those which are VMware-based) don’t automatically send a new volume size to a Linux kernel when a volume is resized while in use. For those hypervisors, you can have the storage plugin that interacts with Cinder tell the SCSI block device to refresh its information so that the kernel learns about the updated size. You might need to enable this behavior depending on the underlying hypervisor of your OpenStack installation. The rescanBlockStorageOnResize field controls this. Please note that it only applies to Kubernetes versions where CSI is used.\nSome OpenStack configurations do not allow attaching more than a specific number of volumes to a single node. To keep the Kubernetes scheduler from over-scheduling volumes on a node, you can set nodeVolumeAttachLimit, which defaults to 256. Some OpenStack configurations have different names for volume and compute availability zones, which might cause pods to go into pending state as there are no nodes available in the detected volume AZ. To ignore the volume AZ when scheduling pods, you can set ignoreVolumeAZ to true (it defaults to false). 
See CSI Cinder driver.\nThe cloud profile config also contains constraints for floating pools and load balancer providers that can be used in shoots.\nIf your OpenStack system supports server groups, the serverGroupPolicies property will enable your end-users to create shoots with workers where the nodes are managed by Nova’s server groups. Specifying serverGroupPolicies is optional and can be omitted. If enabled, the end-user can choose whether or not to use this feature for a shoot’s workers. Gardener will handle the creation of the server group and node assignment.\nTo enable this feature, an operator should:\n specify the allowed policy values (e.g. affinity, anti-affinity) in this section. Only the policies in the allow-list will be available for end-users. make sure that your OpenStack project has enough server group capacity. Otherwise, shoot creation will fail. If your OpenStack system has multiple volume types, the storageClasses property enables the creation of Kubernetes storageClasses for shoots. Set storageClasses[].parameters.type to map a storage class to an OpenStack volume type. Specifying storageClasses is optional and can be omitted.\nAn example CloudProfileConfig for the OpenStack extension looks as follows:\napiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2135.6.0 # Fallback to image name if no region mapping is found # Only works for amd64 and is strongly discouraged. Prefer image IDs! 
image: coreos-2135.6.0 regions: - name: europe id: \"1234-amd64\" architecture: amd64 # optional, defaults to amd64 - name: europe id: \"1234-arm64\" architecture: arm64 - name: asia id: \"5678-amd64\" architecture: amd64 # keystoneURL: https://url-to-keystone/v3/ # keystoneURLs: # - region: europe # url: https://europe.example.com/v3/ # - region: asia # url: https://asia.example.com/v3/ # dnsServers: # - 10.10.10.11 # - 10.10.10.12 # requestTimeout: 60s # useOctavia: true # useSNAT: true # rescanBlockStorageOnResize: true # ignoreVolumeAZ: true # nodeVolumeAttachLimit: 30 # serverGroupPolicies: # - soft-anti-affinity # - anti-affinity # resolvConfOptions: # - rotate # - timeout:1 # storageClasses: # - name: example-sc # default: false # provisioner: cinder.csi.openstack.org # volumeBindingMode: WaitForFirstConsumer # parameters: # type: storage_premium_perf0 constraints: floatingPools: - name: fp-pool-1 # region: europe # loadBalancerClasses: # - name: lb-class-1 # floatingSubnetID: \"1234\" # floatingNetworkID: \"4567\" # subnetID: \"7890\" # - name: \"fp-pool-*\" # region: europe # loadBalancerClasses: # - name: lb-class-1 # floatingSubnetID: \"1234\" # floatingNetworkID: \"4567\" # subnetID: \"7890\" # - name: \"fp-pool-eu-demo\" # region: europe # domain: demo # loadBalancerClasses: # - name: lb-class-1 # floatingSubnetID: \"1234\" # floatingNetworkID: \"4567\" # subnetID: \"7890\" # - name: \"fp-pool-eu-dev\" # region: europe # domain: dev # nonConstraining: true # loadBalancerClasses: # - name: lb-class-1 # floatingSubnetID: \"1234\" # floatingNetworkID: \"4567\" # subnetID: \"7890\" loadBalancerProviders: - name: haproxy # - name: f5 # region: asia # - name: haproxy # region: asia Please note that it is possible to configure a region mapping for keystone URLs, floating pools, and load balancer providers. Additionally, floating pools can be constrained to a keystone domain by specifying the domain field. 
Floating pool names may also contain simple wildcard expressions, like * or fp-pool-* or *-fp-pool. Please note that the * must either stand alone or appear at the beginning or at the end. Consequently, fp-*-pool is not allowed. The default behavior is that, if found, the regional (and/or domain restricted) entry is taken. If no entry for the given region exists, then the fallback value is the best-matching entry (w.r.t. wildcard matching) in the list without a region field (or the keystoneURL value for the keystone URLs). If an additional floating pool should be selectable for a region and/or domain, you can mark it as non-constraining by setting the optional field nonConstraining to true. Multiple loadBalancerProviders can be specified in the CloudProfile. Each provider may specify a region constraint for where it can be used. If at least one region-specific entry exists in the CloudProfile, the shoot’s specified loadBalancerProvider must adhere to the list of the available providers of that region. Otherwise, one of the non-region-specific providers should be used. Each entry in the loadBalancerProviders must be uniquely identified by its name and, if applicable, its region.\nThe loadBalancerClasses field is an optional list of load balancer classes which can be used when the corresponding floating pool network is chosen. The load balancer classes can be configured in the same way as in the ControlPlaneConfig in the Shoot resource, therefore see here for more details.\nSome OpenStack environments don’t need these regional mappings, hence, the region and keystoneURLs fields are optional. 
If your OpenStack environment only has regional values and it doesn’t make sense to provide a (non-regional) fallback then simply omit keystoneURL and always specify region.\nIf Gardener creates and manages the router of a shoot cluster, it is additionally possible to specify that the enable_snat field is set to true via useSNAT: true in the CloudProfileConfig.\nOn some OpenStack environments, there may be the need to set options in the file /etc/resolv.conf on worker nodes. If the field resolvConfOptions is set, a systemd service will be installed which copies /run/systemd/resolve/resolv.conf on every change to /etc/resolv.conf and appends the given options.\nExample CloudProfile manifest Please find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: openstack spec: type: openstack kubernetes: versions: - version: 1.27.3 - version: 1.26.8 expirationDate: \"2022-10-31T23:59:59Z\" machineImages: - name: coreos versions: - version: 2135.6.0 architectures: # optional, defaults to [amd64] - amd64 - arm64 machineTypes: - name: medium_4_8 cpu: \"4\" gpu: \"0\" memory: 8Gi architecture: amd64 # optional, defaults to amd64 storage: class: standard type: default size: 40Gi - name: medium_4_8_arm cpu: \"4\" gpu: \"0\" memory: 8Gi architecture: arm64 storage: class: standard type: default size: 40Gi regions: - name: europe-1 zones: - name: europe-1a - name: europe-1b - name: europe-1c providerConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2135.6.0 # Fallback to image name if no region mapping is found # Only works for amd64 and is strongly discouraged. Prefer image IDs! 
image: coreos-2135.6.0 regions: - name: europe id: \"1234-amd64\" architecture: amd64 # optional, defaults to amd64 - name: europe id: \"1234-arm64\" architecture: arm64 - name: asia id: \"5678-amd64\" architecture: amd64 keystoneURL: https://url-to-keystone/v3/ constraints: floatingPools: - name: fp-pool-1 loadBalancerProviders: - name: haproxy ","categories":"","description":"","excerpt":"Using the OpenStack provider extension with Gardener as operator The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-openstack/operations/","tags":"","title":"Operations"},{"body":"Using the Calico networking extension with Gardener as operator This document explains configuration options supported by the networking-calico extension.\nRun calico-node in non-privileged and non-root mode Feature State: Alpha\nMotivation Running containers in privileged mode is not recommended as privileged containers run with all Linux capabilities enabled and can access the host’s resources. Running containers in privileged mode opens up a number of security threats, such as a breakout to the underlying host OS.\nSupport for non-privileged and non-root mode The Calico project has preliminary support for running the calico-node component in non-privileged mode (see this guide). Similar to the Tigera Calico operator, the networking-calico extension can also run calico-node in non-privileged and non-root mode. This feature is controlled via a feature gate named NonPrivilegedCalicoNode. The feature gates are configured in the ControllerConfiguration of networking-calico. 
The corresponding ControllerDeployment configuration that enables the NonPrivilegedCalicoNode feature gate would look like:\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment metadata: name: networking-calico type: helm providerConfig: values: chart: \u003comitted\u003e config: featureGates: NonPrivilegedCalicoNode: true Limitations The support for the non-privileged mode in the Calico project is not ready for production use. The upstream documentation states that in non-privileged mode the support for features added after Calico v3.21 is not guaranteed. Calico in non-privileged mode does not support the eBPF dataplane. That’s why, when the eBPF dataplane is enabled, calico-node has to run in privileged mode (even when the NonPrivilegedCalicoNode feature gate is enabled). (At the time of writing this guide) there is the following issue projectcalico/calico#5348 that is not addressed. (At the time of writing this guide) the upstream adoption seems to be low. The Calico charts and manifests in projectcalico/calico run calico-node in privileged mode. ","categories":"","description":"","excerpt":"Using the Calico networking extension with Gardener as operator This …","ref":"/docs/extensions/network-extensions/gardener-extension-networking-calico/operations/","tags":"","title":"Operations"},{"body":"Packages:\n operations.gardener.cloud/v1alpha1 operations.gardener.cloud/v1alpha1 Package v1alpha1 is a version of the API.\nResource Types: Bastion Bastion Bastion holds details about an SSH bastion for a shoot cluster.\n Field Description apiVersion string operations.gardener.cloud/v1alpha1 kind string Bastion metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec BastionSpec Specification of the Bastion.\n shootRef Kubernetes core/v1.LocalObjectReference ShootRef defines the target shoot for a Bastion. 
The name field of the ShootRef is immutable.\n seedName string (Optional) SeedName is the name of the seed to which this Bastion is currently scheduled. This field is populated at the beginning of a create/reconcile operation.\n providerType string (Optional) ProviderType is cloud provider used by the referenced Shoot.\n sshPublicKey string SSHPublicKey is the user’s public key. This field is immutable.\n ingress []BastionIngressPolicy Ingress controls from where the created bastion host should be reachable.\n status BastionStatus (Optional) Most recently observed status of the Bastion.\n BastionIngressPolicy (Appears on: BastionSpec) BastionIngressPolicy represents an ingress policy for SSH bastion hosts.\n Field Description ipBlock Kubernetes networking/v1.IPBlock IPBlock defines an IP block that is allowed to access the bastion.\n BastionSpec (Appears on: Bastion) BastionSpec is the specification of a Bastion.\n Field Description shootRef Kubernetes core/v1.LocalObjectReference ShootRef defines the target shoot for a Bastion. The name field of the ShootRef is immutable.\n seedName string (Optional) SeedName is the name of the seed to which this Bastion is currently scheduled. This field is populated at the beginning of a create/reconcile operation.\n providerType string (Optional) ProviderType is cloud provider used by the referenced Shoot.\n sshPublicKey string SSHPublicKey is the user’s public key. 
This field is immutable.\n ingress []BastionIngressPolicy Ingress controls from where the created bastion host should be reachable.\n BastionStatus (Appears on: Bastion) BastionStatus holds the most recently observed status of the Bastion.\n Field Description ingress Kubernetes core/v1.LoadBalancerIngress (Optional) Ingress holds the public IP and/or hostname of the bastion instance.\n conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of a Bastion’s current state.\n lastHeartbeatTimestamp Kubernetes meta/v1.Time (Optional) LastHeartbeatTimestamp is the time when the bastion was last marked as not to be deleted. When this is set, the ExpirationTimestamp is advanced as well.\n expirationTimestamp Kubernetes meta/v1.Time (Optional) ExpirationTimestamp is the time after which a Bastion is supposed to be garbage collected.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this Bastion. 
It corresponds to the Bastion’s generation, which is updated on mutation by the API Server.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n operations.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/operations/","tags":"","title":"Operations"},{"body":"Packages:\n operator.gardener.cloud/v1alpha1 operator.gardener.cloud/v1alpha1 Package v1alpha1 contains the configuration of the Gardener Operator.\nResource Types: ACMEIssuer (Appears on: DefaultIssuer) ACMEIssuer specifies an issuer using an ACME server.\n Field Description email string Email is the e-mail for the ACME user.\n server string Server is the ACME server endpoint.\n secretRef Kubernetes core/v1.LocalObjectReference (Optional) SecretRef is a reference to a secret containing a private key of the issuer (data key ‘privateKey’).\n precheckNameservers []string (Optional) PrecheckNameservers overwrites the default precheck nameservers used for checking DNS propagation. Format host or host:port, e.g. “8.8.8.8” same as “8.8.8.8:53” or “google-public-dns-a.google.com:53”.\n AdmissionDeploymentSpec (Appears on: Deployment) AdmissionDeploymentSpec contains the deployment specification for the admission controller of an extension.\n Field Description runtimeCluster DeploymentSpec (Optional) RuntimeCluster is the deployment configuration for the admission in the runtime cluster. The runtime deployment is responsible for creating the admission controller in the runtime cluster.\n virtualCluster DeploymentSpec (Optional) VirtualCluster is the deployment configuration for the admission deployment in the garden cluster. The garden deployment installs necessary resources in the virtual garden cluster e.g. RBAC that are necessary for the admission controller.\n values k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON (Optional) Values are the deployment values. 
The values will be applied to both admission deployments.\n AuditWebhook (Appears on: GardenerAPIServerConfig, KubeAPIServerConfig) AuditWebhook contains settings related to an audit webhook configuration.\n Field Description batchMaxSize int32 (Optional) BatchMaxSize is the maximum size of a batch.\n kubeconfigSecretName string KubeconfigSecretName specifies the name of a secret containing the kubeconfig for this webhook.\n version string (Optional) Version is the API version to send and expect from the webhook.\n Authentication (Appears on: KubeAPIServerConfig) Authentication contains settings related to authentication.\n Field Description webhook AuthenticationWebhook (Optional) Webhook contains settings related to an authentication webhook configuration.\n AuthenticationWebhook (Appears on: Authentication) AuthenticationWebhook contains settings related to an authentication webhook configuration.\n Field Description cacheTTL Kubernetes meta/v1.Duration (Optional) CacheTTL is the duration to cache responses from the webhook authenticator.\n kubeconfigSecretName string KubeconfigSecretName specifies the name of a secret containing the kubeconfig for this webhook.\n version string (Optional) Version is the API version to send and expect from the webhook.\n Backup (Appears on: ETCDMain) Backup contains the object store configuration for backups for the virtual garden etcd.\n Field Description provider string Provider is a provider name. This field is immutable.\n bucketName string BucketName is the name of the backup bucket.\n secretRef Kubernetes core/v1.LocalObjectReference SecretRef is a reference to a Secret object containing the cloud provider credentials for the object store where backups should be stored. 
It should have enough privileges to manipulate the objects as well as buckets.\n CAIssuer (Appears on: DefaultIssuer) CAIssuer specifies an issuer using a root or intermediate CA to be used for signing.\n Field Description secretRef Kubernetes core/v1.LocalObjectReference SecretRef is a reference to a TLS secret containing the CA for signing certificates.\n CertManagement (Appears on: RuntimeCluster) CertManagement configures the cert-management component for issuing TLS certificates from an ACME server.\n Field Description config CertManagementConfig (Optional) Config contains configuration for deploying the cert-controller-manager.\n defaultIssuer DefaultIssuer DefaultIssuer is the default issuer used for requesting TLS certificates.\n CertManagementConfig (Appears on: CertManagement) CertManagementConfig contains information for deploying the cert-controller-manager.\n Field Description caCertificatesSecretRef Kubernetes core/v1.LocalObjectReference (Optional) CACertificatesSecretRef are additional root certificates to access ACME servers with private TLS certificates. 
The certificates are expected at key ‘bundle.crt’.\n ControlPlane (Appears on: VirtualCluster) ControlPlane holds information about the general settings for the control plane of the virtual garden cluster.\n Field Description highAvailability HighAvailability (Optional) HighAvailability holds the configuration settings for high availability settings.\n Credentials (Appears on: GardenStatus) Credentials contains information about the virtual garden cluster credentials.\n Field Description rotation CredentialsRotation (Optional) Rotation contains information about the credential rotations.\n CredentialsRotation (Appears on: Credentials) CredentialsRotation contains information about the rotation of credentials.\n Field Description certificateAuthorities github.com/gardener/gardener/pkg/apis/core/v1beta1.CARotation (Optional) CertificateAuthorities contains information about the certificate authority credential rotation.\n serviceAccountKey github.com/gardener/gardener/pkg/apis/core/v1beta1.ServiceAccountKeyRotation (Optional) ServiceAccountKey contains information about the service account key credential rotation.\n etcdEncryptionKey github.com/gardener/gardener/pkg/apis/core/v1beta1.ETCDEncryptionKeyRotation (Optional) ETCDEncryptionKey contains information about the ETCD encryption key credential rotation.\n observability github.com/gardener/gardener/pkg/apis/core/v1beta1.ObservabilityRotation (Optional) Observability contains information about the observability credential rotation.\n workloadIdentityKey WorkloadIdentityKeyRotation (Optional) WorkloadIdentityKey contains information about the workload identity key credential rotation.\n DNS (Appears on: VirtualCluster) DNS holds information about DNS settings.\n Field Description domains []string (Optional) Domains are the external domains of the virtual garden cluster. 
The first given domain in this list is immutable.\n DashboardGitHub (Appears on: GardenerDashboardConfig) DashboardGitHub contains configuration for the GitHub ticketing feature.\n Field Description apiURL string APIURL is the URL to the GitHub API.\n organisation string Organisation is the name of the GitHub organisation.\n repository string Repository is the name of the GitHub repository.\n secretRef Kubernetes core/v1.LocalObjectReference SecretRef is the reference to a secret in the garden namespace containing the GitHub credentials.\n pollInterval Kubernetes meta/v1.Duration (Optional) PollInterval is the interval of how often the GitHub API is polled for issue updates. This field is used as a fallback mechanism to ensure state synchronization, even when there is a GitHub webhook configuration. If a webhook event is missed or not successfully delivered, the polling will help catch up on any missed updates. If this field is not provided and there is no ‘webhookSecret’ key in the referenced secret, it will be implicitly defaulted to 15m.\n DashboardOIDC (Appears on: GardenerDashboardConfig) DashboardOIDC contains configuration for the OIDC settings.\n Field Description sessionLifetime Kubernetes meta/v1.Duration (Optional) SessionLifetime is the maximum duration of a session.\n additionalScopes []string (Optional) AdditionalScopes is the list of additional OIDC scopes.\n secretRef Kubernetes core/v1.LocalObjectReference SecretRef is the reference to a secret in the garden namespace containing the OIDC client ID and secret for the dashboard.\n DashboardTerminal (Appears on: GardenerDashboardConfig) DashboardTerminal contains configuration for the terminal settings.\n Field Description container DashboardTerminalContainer Container contains configuration for the dashboard terminal container.\n allowedHosts []string (Optional) AllowedHosts should consist of permitted hostnames (without the scheme) for terminal connections. 
It is important to consider that the usage of wildcards follows the rules defined by the content security policy. ‘.seed.local.gardener.cloud’, or ‘.other-seeds.local.gardener.cloud’. For more information, see https://github.com/gardener/dashboard/blob/master/docs/operations/webterminals.md#allowlist-for-hosts.\n DashboardTerminalContainer (Appears on: DashboardTerminal) DashboardTerminalContainer contains configuration for the dashboard terminal container.\n Field Description image string Image is the container image for the dashboard terminal container.\n description string (Optional) Description is a description for the dashboard terminal container with hints for the user.\n DefaultIssuer (Appears on: CertManagement) DefaultIssuer specifies an issuer to be created on the cluster.\n Field Description acme ACMEIssuer (Optional) ACME is the ACME protocol specific spec. Either ACME or CA must be specified.\n ca CAIssuer (Optional) CA is the CA specific spec. Either ACME or CA must be specified.\n Deployment (Appears on: ExtensionSpec) Deployment specifies how an extension can be installed for a Gardener landscape. 
It includes the specification for installing an extension and/or an admission controller.\n Field Description extension ExtensionDeploymentSpec (Optional) ExtensionDeployment contains the deployment configuration for an extension.\n admission AdmissionDeploymentSpec (Optional) AdmissionDeployment contains the deployment configuration for an admission controller.\n DeploymentSpec (Appears on: AdmissionDeploymentSpec, ExtensionDeploymentSpec) DeploymentSpec is the specification for the deployment of a component.\n Field Description helm ExtensionHelm Helm contains the specification for a Helm deployment.\n ETCD (Appears on: VirtualCluster) ETCD contains configuration for the etcds of the virtual garden cluster.\n Field Description main ETCDMain (Optional) Main contains configuration for the main etcd.\n events ETCDEvents (Optional) Events contains configuration for the events etcd.\n ETCDEvents (Appears on: ETCD) ETCDEvents contains configuration for the events etcd.\n Field Description storage Storage (Optional) Storage contains storage configuration.\n ETCDMain (Appears on: ETCD) ETCDMain contains configuration for the main etcd.\n Field Description backup Backup (Optional) Backup contains the object store configuration for backups for the virtual garden etcd.\n storage Storage (Optional) Storage contains storage configuration.\n Extension Extension describes a Gardener extension.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec ExtensionSpec Spec contains the specification of this extension.\n resources []github.com/gardener/gardener/pkg/apis/core/v1beta1.ControllerResource (Optional) Resources is a list of combinations of kinds (DNSRecord, Backupbucket, …) and their actual types (aws-route53, gcp).\n deployment Deployment (Optional) Deployment contains deployment configuration for an extension and its admission controller.\n status ExtensionStatus Status contains the status of this extension.\n ExtensionDeploymentSpec (Appears on: Deployment) ExtensionDeploymentSpec specifies how to install the extension in a Gardener landscape. The installation is split into two parts: - installing the extension in the virtual garden cluster by creating the ControllerRegistration and ControllerDeployment - installing the extension in the runtime cluster (if necessary).\n Field Description DeploymentSpec DeploymentSpec (Members of DeploymentSpec are embedded into this type.) (Optional) DeploymentSpec is the deployment configuration for the extension.\n values k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON (Optional) Values are the deployment values used in the creation of the ControllerDeployment in the virtual garden cluster.\n runtimeClusterValues k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON (Optional) RuntimeClusterValues are the deployment values for the extension deployment running in the runtime garden cluster. If no values are specified, a runtime deployment is considered deactivated.\n policy github.com/gardener/gardener/pkg/apis/core/v1beta1.ControllerDeploymentPolicy (Optional) Policy controls how the controller is deployed. It defaults to ‘OnDemand’.\n seedSelector Kubernetes meta/v1.LabelSelector (Optional) SeedSelector contains an optional label selector for seeds. Only if the labels match then this controller will be considered for a deployment. 
An empty list means that all seeds are selected.\n ExtensionHelm (Appears on: DeploymentSpec) ExtensionHelm is the configuration for a helm deployment.\n Field Description ociRepository github.com/gardener/gardener/pkg/apis/core/v1.OCIRepository (Optional) OCIRepository defines where to pull the chart from.\n ExtensionSpec (Appears on: Extension) ExtensionSpec contains the specification of a Gardener extension.\n Field Description resources []github.com/gardener/gardener/pkg/apis/core/v1beta1.ControllerResource (Optional) Resources is a list of combinations of kinds (DNSRecord, Backupbucket, …) and their actual types (aws-route53, gcp).\n deployment Deployment (Optional) Deployment contains deployment configuration for an extension and its admission controller.\n ExtensionStatus (Appears on: Extension) ExtensionStatus is the status of a Gardener extension.\n Field Description observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this resource.\n conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of an Extension’s current state.\n providerStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderStatus contains type-specific status.\n Garden Garden describes a list of gardens.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec GardenSpec Spec contains the specification of this garden.\n runtimeCluster RuntimeCluster RuntimeCluster contains configuration for the runtime cluster.\n virtualCluster VirtualCluster VirtualCluster contains configuration for the virtual cluster.\n status GardenStatus Status contains the status of this garden.\n GardenSpec (Appears on: Garden) GardenSpec contains the specification of a garden environment.\n Field Description runtimeCluster RuntimeCluster RuntimeCluster contains configuration for the runtime cluster.\n virtualCluster VirtualCluster VirtualCluster contains configuration for the virtual cluster.\n GardenStatus (Appears on: Garden) GardenStatus is the status of a garden environment.\n Field Description gardener github.com/gardener/gardener/pkg/apis/core/v1beta1.Gardener (Optional) Gardener holds information about the Gardener which last acted on the Garden.\n conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition Conditions is a list of conditions.\n lastOperation github.com/gardener/gardener/pkg/apis/core/v1beta1.LastOperation (Optional) LastOperation holds information about the last operation on the Garden.\n observedGeneration int64 ObservedGeneration is the most recent generation observed for this resource.\n credentials Credentials (Optional) Credentials contains information about the virtual garden cluster credentials.\n encryptedResources []string (Optional) EncryptedResources is the list of resources which are currently encrypted in the virtual garden by the virtual kube-apiserver. Resources which are encrypted by default will not appear here. See https://github.com/gardener/gardener/blob/master/docs/concepts/operator.md#etcd-encryption-config for more details.\n Gardener (Appears on: VirtualCluster) Gardener contains the configuration settings for the Gardener components.\n Field Description clusterIdentity string ClusterIdentity is the identity of the garden cluster. 
This field is immutable.\n gardenerAPIServer GardenerAPIServerConfig (Optional) APIServer contains configuration settings for the gardener-apiserver.\n gardenerAdmissionController GardenerAdmissionControllerConfig (Optional) AdmissionController contains configuration settings for the gardener-admission-controller.\n gardenerControllerManager GardenerControllerManagerConfig (Optional) ControllerManager contains configuration settings for the gardener-controller-manager.\n gardenerScheduler GardenerSchedulerConfig (Optional) Scheduler contains configuration settings for the gardener-scheduler.\n gardenerDashboard GardenerDashboardConfig (Optional) Dashboard contains configuration settings for the gardener-dashboard.\n gardenerDiscoveryServer GardenerDiscoveryServerConfig (Optional) DiscoveryServer contains configuration settings for the gardener-discovery-server.\n GardenerAPIServerConfig (Appears on: Gardener) GardenerAPIServerConfig contains configuration settings for the gardener-apiserver.\n Field Description KubernetesConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.KubernetesConfig (Members of KubernetesConfig are embedded into this type.) 
admissionPlugins []github.com/gardener/gardener/pkg/apis/core/v1beta1.AdmissionPlugin (Optional) AdmissionPlugins contains the list of user-defined admission plugins (additional to those managed by Gardener), and, if desired, the corresponding configuration.\n auditConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.AuditConfig (Optional) AuditConfig contains configuration settings for the audit of the kube-apiserver.\n auditWebhook AuditWebhook (Optional) AuditWebhook contains settings related to an audit webhook configuration.\n logging github.com/gardener/gardener/pkg/apis/core/v1beta1.APIServerLogging (Optional) Logging contains configuration for the log level and HTTP access logs.\n requests github.com/gardener/gardener/pkg/apis/core/v1beta1.APIServerRequests (Optional) Requests contains configuration for request-specific settings for the kube-apiserver.\n watchCacheSizes github.com/gardener/gardener/pkg/apis/core/v1beta1.WatchCacheSizes (Optional) WatchCacheSizes contains configuration of the API server’s watch cache sizes. Configuring these flags might be useful for large-scale Garden clusters with a lot of parallel update requests and a lot of watching controllers (e.g. large ManagedSeed clusters). When the API server’s watch cache’s capacity is too small to cope with the amount of update requests and watchers for a particular resource, it might happen that controller watches are permanently stopped with too old resource version errors. 
Starting from kubernetes v1.19, the API server’s watch cache size is adapted dynamically and setting the watch cache size flags will have no effect, except when setting it to 0 (which disables the watch cache).\n encryptionConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.EncryptionConfig (Optional) EncryptionConfig contains customizable encryption configuration of the Gardener API server.\n GardenerAdmissionControllerConfig (Appears on: Gardener) GardenerAdmissionControllerConfig contains configuration settings for the gardener-admission-controller.\n Field Description logLevel string (Optional) LogLevel is the configured log level for the gardener-admission-controller. Must be one of [info,debug,error]. Defaults to info.\n resourceAdmissionConfiguration ResourceAdmissionConfiguration (Optional) ResourceAdmissionConfiguration is the configuration for resource size restrictions for arbitrary Group-Version-Kinds.\n GardenerControllerManagerConfig (Appears on: Gardener) GardenerControllerManagerConfig contains configuration settings for the gardener-controller-manager.\n Field Description KubernetesConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.KubernetesConfig (Members of KubernetesConfig are embedded into this type.) defaultProjectQuotas []ProjectQuotaConfiguration (Optional) DefaultProjectQuotas is the default configuration matching projects are set up with if a quota is not already specified.\n logLevel string (Optional) LogLevel is the configured log level for the gardener-controller-manager. Must be one of [info,debug,error]. Defaults to info.\n GardenerDashboardConfig (Appears on: Gardener) GardenerDashboardConfig contains configuration settings for the gardener-dashboard.\n Field Description enableTokenLogin bool (Optional) EnableTokenLogin specifies whether it is possible to log into the dashboard with a JWT token. 
If disabled, OIDC must be configured.\n frontendConfigMapRef Kubernetes core/v1.LocalObjectReference (Optional) FrontendConfigMapRef is the reference to a ConfigMap in the garden namespace containing the frontend configuration.\n assetsConfigMapRef Kubernetes core/v1.LocalObjectReference (Optional) AssetsConfigMapRef is the reference to a ConfigMap in the garden namespace containing the assets (logos/icons).\n gitHub DashboardGitHub (Optional) GitHub contains configuration for the GitHub ticketing feature.\n logLevel string (Optional) LogLevel is the configured log level. Must be one of [trace,debug,info,warn,error]. Defaults to info.\n oidcConfig DashboardOIDC (Optional) OIDC contains configuration for the OIDC provider. This field must be provided when EnableTokenLogin is false.\n terminal DashboardTerminal (Optional) Terminal contains configuration for the terminal settings.\n GardenerDiscoveryServerConfig (Appears on: Gardener) GardenerDiscoveryServerConfig contains configuration settings for the gardener-discovery-server.\nGardenerSchedulerConfig (Appears on: Gardener) GardenerSchedulerConfig contains configuration settings for the gardener-scheduler.\n Field Description KubernetesConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.KubernetesConfig (Members of KubernetesConfig are embedded into this type.) logLevel string (Optional) LogLevel is the configured log level for the gardener-scheduler. Must be one of [info,debug,error]. 
Defaults to info.\n GroupResource (Appears on: KubeAPIServerConfig) GroupResource contains a list of resources which should be stored in etcd-events instead of etcd-main.\n Field Description group string Group is the API group name.\n resource string Resource is the resource name.\n HighAvailability (Appears on: ControlPlane) HighAvailability specifies the configuration settings for high availability for a resource.\nIngress (Appears on: RuntimeCluster) Ingress configures the Ingress specific settings of the runtime cluster.\n Field Description domains []string (Optional) Domains specify the ingress domains of the cluster pointing to the ingress controller endpoint. They will be used to construct ingress URLs for system applications running in runtime cluster.\n controller github.com/gardener/gardener/pkg/apis/core/v1beta1.IngressController Controller configures a Gardener managed Ingress Controller listening on the ingressDomain.\n KubeAPIServerConfig (Appears on: Kubernetes) KubeAPIServerConfig contains configuration settings for the kube-apiserver.\n Field Description KubeAPIServerConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.KubeAPIServerConfig (Members of KubeAPIServerConfig are embedded into this type.) (Optional) KubeAPIServerConfig contains all configuration values not specific to the virtual garden cluster.\n auditWebhook AuditWebhook (Optional) AuditWebhook contains settings related to an audit webhook configuration.\n authentication Authentication (Optional) Authentication contains settings related to authentication.\n resourcesToStoreInETCDEvents []GroupResource (Optional) ResourcesToStoreInETCDEvents contains a list of resources which should be stored in etcd-events instead of etcd-main. The ‘events’ resource is always stored in etcd-events. 
Note that adding or removing resources from this list will not migrate them automatically from the etcd-main to etcd-events or vice versa.\n sni SNI (Optional) SNI contains configuration options for the TLS SNI settings.\n KubeControllerManagerConfig (Appears on: Kubernetes) KubeControllerManagerConfig contains configuration settings for the kube-controller-manager.\n Field Description KubeControllerManagerConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.KubeControllerManagerConfig (Members of KubeControllerManagerConfig are embedded into this type.) (Optional) KubeControllerManagerConfig contains all configuration values not specific to the virtual garden cluster.\n certificateSigningDuration Kubernetes meta/v1.Duration (Optional) CertificateSigningDuration is the maximum length of duration signed certificates will be given. Individual CSRs may request shorter certs by setting spec.expirationSeconds.\n Kubernetes (Appears on: VirtualCluster) Kubernetes contains the version and configuration options for the Kubernetes components of the virtual garden cluster.\n Field Description kubeAPIServer KubeAPIServerConfig (Optional) KubeAPIServer contains configuration settings for the kube-apiserver.\n kubeControllerManager KubeControllerManagerConfig (Optional) KubeControllerManager contains configuration settings for the kube-controller-manager.\n version string Version is the semantic Kubernetes version to use for the virtual garden cluster.\n Maintenance (Appears on: VirtualCluster) Maintenance contains information about the time window for maintenance operations.\n Field Description timeWindow github.com/gardener/gardener/pkg/apis/core/v1beta1.MaintenanceTimeWindow TimeWindow contains information about the time window for maintenance operations.\n Networking (Appears on: VirtualCluster) Networking defines networking parameters for the virtual garden cluster.\n Field Description services string Services is the CIDR of the service network. 
This field is immutable.\n ProjectQuotaConfiguration (Appears on: GardenerControllerManagerConfig) ProjectQuotaConfiguration defines quota configurations.\n Field Description config k8s.io/apimachinery/pkg/runtime.RawExtension Config is the quota specification used for the project set-up. Only v1.ResourceQuota resources are supported.\n projectSelector Kubernetes meta/v1.LabelSelector (Optional) ProjectSelector is an optional setting to select the projects considered for quotas. Defaults to empty LabelSelector, which matches all projects.\n Provider (Appears on: RuntimeCluster) Provider defines the provider-specific information for this cluster.\n Field Description zones []string (Optional) Zones is the list of availability zones the cluster is deployed to.\n ResourceAdmissionConfiguration (Appears on: GardenerAdmissionControllerConfig) ResourceAdmissionConfiguration contains settings about arbitrary kinds and the size each resource should have at most.\n Field Description limits []ResourceLimit Limits contains configuration for resources which are subjected to size limitations.\n unrestrictedSubjects []Kubernetes rbac/v1.Subject (Optional) UnrestrictedSubjects contains references to users, groups, or service accounts which aren’t subjected to any resource size limit.\n operationMode ResourceAdmissionWebhookMode (Optional) OperationMode specifies the mode the webhooks operates in. Allowed values are “block” and “log”. Defaults to “block”.\n ResourceAdmissionWebhookMode (string alias)\n (Appears on: ResourceAdmissionConfiguration) ResourceAdmissionWebhookMode is an alias type for the resource admission webhook mode.\nResourceLimit (Appears on: ResourceAdmissionConfiguration) ResourceLimit contains settings about a kind and the size each resource should have at most.\n Field Description apiGroups []string (Optional) APIGroups is the name of the APIGroup that contains the limited resource. 
WildcardAll represents all groups.\n apiVersions []string (Optional) APIVersions is the version of the resource. WildcardAll represents all versions.\n resources []string Resources is the name of the resource this rule applies to. WildcardAll represents all resources.\n size k8s.io/apimachinery/pkg/api/resource.Quantity Size specifies the imposed limit.\n RuntimeCluster (Appears on: GardenSpec) RuntimeCluster contains configuration for the runtime cluster.\n Field Description ingress Ingress Ingress configures Ingress specific settings for the Garden cluster.\n networking RuntimeNetworking Networking defines the networking configuration of the runtime cluster.\n provider Provider Provider defines the provider-specific information for this cluster.\n settings Settings (Optional) Settings contains certain settings for this cluster.\n volume Volume (Optional) Volume contains settings for persistent volumes created in the runtime cluster.\n certManagement CertManagement (Optional) CertManagement configures the cert-management component for issuing TLS certificates from an ACME server.\n RuntimeNetworking (Appears on: RuntimeCluster) RuntimeNetworking defines the networking configuration of the runtime cluster.\n Field Description nodes string (Optional) Nodes is the CIDR of the node network. This field is immutable.\n pods string Pods is the CIDR of the pod network. This field is immutable.\n services string Services is the CIDR of the service network. This field is immutable.\n blockCIDRs []string (Optional) BlockCIDRs is a list of network addresses that should be blocked.\n SNI (Appears on: KubeAPIServerConfig) SNI contains configuration options for the TLS SNI settings.\n Field Description secretName string SecretName is the name of a secret containing the TLS certificate and private key.\n domainPatterns []string (Optional) DomainPatterns is a list of fully qualified domain names, possibly with prefixed wildcard segments. 
The domain patterns also allow IP addresses, but IPs should only be used if the apiserver has visibility to the IP address requested by a client. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names.\n SettingLoadBalancerServices (Appears on: Settings) SettingLoadBalancerServices controls certain settings for services of type load balancer that are created in the runtime cluster.\n Field Description annotations map[string]string (Optional) Annotations is a map of annotations that will be injected/merged into every load balancer service object.\n SettingTopologyAwareRouting (Appears on: Settings) SettingTopologyAwareRouting controls certain settings for topology-aware traffic routing in the cluster. See https://github.com/gardener/gardener/blob/master/docs/operations/topology_aware_routing.md.\n Field Description enabled bool Enabled controls whether certain Services deployed in the cluster should be topology-aware. These Services are virtual-garden-etcd-main-client, virtual-garden-etcd-events-client and virtual-garden-kube-apiserver. Additionally, other components that are deployed to the runtime cluster via other means can read this field and according to its value enable/disable topology-aware routing for their Services.\n SettingVerticalPodAutoscaler (Appears on: Settings) SettingVerticalPodAutoscaler controls certain settings for the vertical pod autoscaler components deployed in the seed.\n Field Description enabled bool (Optional) Enabled controls whether the VPA components shall be deployed into this cluster. It is true by default because the operator (and Gardener) heavily rely on a VPA being deployed. You should only disable this if your runtime cluster already has another, manually/custom managed VPA deployment. 
If this is not the case, but you still disable it, then reconciliation will fail.\n Settings (Appears on: RuntimeCluster) Settings contains certain settings for this cluster.\n Field Description loadBalancerServices SettingLoadBalancerServices (Optional) LoadBalancerServices controls certain settings for services of type load balancer that are created in the runtime cluster.\n verticalPodAutoscaler SettingVerticalPodAutoscaler (Optional) VerticalPodAutoscaler controls certain settings for the vertical pod autoscaler components deployed in the cluster.\n topologyAwareRouting SettingTopologyAwareRouting (Optional) TopologyAwareRouting controls certain settings for topology-aware traffic routing in the cluster. See https://github.com/gardener/gardener/blob/master/docs/operations/topology_aware_routing.md.\n Storage (Appears on: ETCDEvents, ETCDMain) Storage contains storage configuration.\n Field Description capacity k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) Capacity is the storage capacity for the volumes.\n className string (Optional) ClassName is the name of a storage class.\n VirtualCluster (Appears on: GardenSpec) VirtualCluster contains configuration for the virtual cluster.\n Field Description controlPlane ControlPlane (Optional) ControlPlane holds information about the general settings for the control plane of the virtual cluster.\n dns DNS DNS holds information about DNS settings.\n etcd ETCD (Optional) ETCD contains configuration for the etcds of the virtual garden cluster.\n gardener Gardener Gardener contains the configuration options for the Gardener control plane components.\n kubernetes Kubernetes Kubernetes contains the version and configuration options for the Kubernetes components of the virtual garden cluster.\n maintenance Maintenance Maintenance contains information about the time window for maintenance operations.\n networking Networking Networking contains information about cluster networking such as CIDRs, etc.\n Volume (Appears 
on: RuntimeCluster) Volume contains settings for persistent volumes created in the runtime cluster.\n Field Description minimumSize k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) MinimumSize defines the minimum size that should be used for PVCs in the runtime cluster.\n WorkloadIdentityKeyRotation (Appears on: CredentialsRotation) WorkloadIdentityKeyRotation contains information about the workload identity key credential rotation.\n Field Description phase github.com/gardener/gardener/pkg/apis/core/v1beta1.CredentialsRotationPhase Phase describes the phase of the workload identity key credential rotation.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the workload identity key credential rotation was successfully completed.\n lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the workload identity key credential rotation was initiated.\n lastInitiationFinishedTime Kubernetes meta/v1.Time (Optional) LastInitiationFinishedTime is the recent time when the workload identity key credential rotation initiation was completed.\n lastCompletionTriggeredTime Kubernetes meta/v1.Time (Optional) LastCompletionTriggeredTime is the recent time when the workload identity key credential rotation completion was triggered.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n operator.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/operator/","tags":"","title":"Operator"},{"body":"DEP-05: Operator Out-of-band Tasks Table of Contents DEP-05: Operator Out-of-band Tasks Table of Contents Summary Terminology Motivation Goals Non-Goals Proposal Custom Resource Golang API Spec Status Custom Resource YAML API Lifecycle Creation Execution Deletion Use Cases Recovery from permanent quorum loss Task Config Pre-Conditions Trigger on-demand snapshot compaction Possible scenarios Task Config Pre-Conditions Trigger 
on-demand full/delta snapshot Possible scenarios Task Config Pre-Conditions Trigger on-demand maintenance of etcd cluster Possible Scenarios Task Config Pre-Conditions Copy Backups Task Possible Scenarios Task Config Pre-Conditions Metrics Summary This DEP proposes an enhancement to etcd-druid’s capabilities to handle out-of-band tasks, which are presently performed manually or invoked programmatically via suboptimal APIs. The document proposes the establishment of a unified interface by defining a well-structured API to harmonize the initiation of any out-of-band task, monitor its status, and simplify the process of adding new tasks and managing their lifecycles.\nTerminology etcd-druid: etcd-druid is an operator to manage the etcd clusters.\n backup-sidecar: It is the etcd-backup-restore sidecar container running in each etcd-member pod of etcd cluster.\n leading-backup-sidecar: A backup-sidecar that is associated to an etcd leader of an etcd cluster.\n out-of-band task: Any on-demand tasks/operations that can be executed on an etcd cluster without modifying the Etcd custom resource spec (desired state).\n Motivation Today, etcd-druid mainly acts as an etcd cluster provisioner (creation, maintenance and deletion). In future, capabilities of etcd-druid will be enhanced via etcd-member proposal by providing it access to much more detailed information about each etcd cluster member. While we enhance the reconciliation and monitoring capabilities of etcd-druid, it still lacks the ability to allow users to invoke out-of-band tasks on an existing etcd cluster.\nThere are new learnings while operating etcd clusters at scale. It has been observed that we regularly need capabilities to trigger out-of-band tasks which are outside of the purview of a regular etcd reconciliation run. Many of these tasks are multi-step processes, and performing them manually is error-prone, even if an operator follows a well-written step-by-step guide. 
Thus, there is a need to automate these tasks. Some examples of on-demand/out-of-band tasks:\n Recover from a permanent quorum loss of an etcd cluster. Trigger an on-demand full/delta snapshot. Trigger an on-demand snapshot compaction. Trigger an on-demand maintenance of an etcd cluster. Copy the backups from one object store to another object store. Goals Establish a unified interface for operator tasks by defining a single dedicated custom resource for out-of-band tasks. Define a contract (in terms of prerequisites) which needs to be adhered to by any task implementation. Facilitate the easy addition of new out-of-band task(s) through this custom resource. Provide CLI capabilities to operators, making it easy to invoke supported out-of-band tasks. Non-Goals In the current scope, the capability to abort/suspend an out-of-band task will not be provided. This could be considered as a future enhancement if there is demand for it. Ordering (by establishing dependency) of out-of-band tasks submitted for the same etcd cluster has not been considered in the first increment. In a future version, based on how operator tasks are used, we will enhance this proposal and the implementation. Proposal The authors propose the creation of a single dedicated custom resource to represent an out-of-band task. Etcd-druid will be enhanced to process the task requests and update their status, which can then be tracked/observed.\nCustom Resource Golang API EtcdOperatorTask is the new custom resource that will be introduced. This API will be in the v1alpha1 version and will be subject to change. We will respect the Kubernetes Deprecation Policy.\n// EtcdOperatorTask represents an out-of-band operator task resource. type EtcdOperatorTask struct { metav1.TypeMeta metav1.ObjectMeta // Spec is the specification of the EtcdOperatorTask resource. Spec EtcdOperatorTaskSpec `json:\"spec\"` // Status is the most recently observed status of the EtcdOperatorTask resource. 
Status EtcdOperatorTaskStatus `json:\"status,omitempty\"` } Spec The authors propose that the following fields should be specified in the spec (desired state) of the EtcdOperatorTask custom resource.\n To capture the type of out-of-band operator task to be performed, a .spec.type field should be defined. It can take any of the supported out-of-band task types, e.g. “OnDemandSnapshotTask”, “QuorumLossRecoveryTask” etc. To capture the configuration specific to each task, a .spec.config field of type string should be defined, as each task can have a different input configuration. // EtcdOperatorTaskSpec is the spec for an EtcdOperatorTask resource. type EtcdOperatorTaskSpec struct { // Type specifies the type of out-of-band operator task to be performed. Type string `json:\"type\"` // Config is a task specific configuration. Config string `json:\"config,omitempty\"` // TTLSecondsAfterFinished is the time-to-live to garbage collect the // related resource(s) of the task once it has been completed. // +optional TTLSecondsAfterFinished *int32 `json:\"ttlSecondsAfterFinished,omitempty\"` // OwnerEtcdReference refers to the name and namespace of the corresponding // Etcd owner for which the task has been invoked. OwnerEtcdRefrence types.NamespacedName `json:\"ownerEtcdRefrence\"` } Status The authors propose the following fields for the Status (current state) of the EtcdOperatorTask custom resource to monitor the progress of the task.\n// EtcdOperatorTaskStatus is the status for an EtcdOperatorTask resource. type EtcdOperatorTaskStatus struct { // ObservedGeneration is the most recent generation observed for the resource. ObservedGeneration *int64 `json:\"observedGeneration,omitempty\"` // State is the last known state of the task. State TaskState `json:\"state\"` // InitiatedAt is the time at which the task moved from the \"pending\" state to any other state. InitiatedAt metav1.Time `json:\"initiatedAt\"` // LastErrors represents the errors encountered when processing the task. 
// +optional LastErrors []LastError `json:\"lastErrors,omitempty\"` // LastOperation captures the status of the last operation if the task involves multiple stages. // +optional LastOperation *LastOperation `json:\"lastOperation,omitempty\"` } type LastOperation struct { // Name of the LastOperation. Name opsName `json:\"name\"` // State of the last operation, one of Pending, InProgress, Completed, Failed. State OperationState `json:\"state\"` // LastTransitionTime is the time at which the operation state last transitioned from one state to another. LastTransitionTime metav1.Time `json:\"lastTransitionTime\"` // Reason is a human-readable message indicating details about the last operation. Reason string `json:\"reason\"` } // LastError stores details of the most recent error encountered for the task. type LastError struct { // Code is an error code that uniquely identifies an error. Code ErrorCode `json:\"code\"` // Description is a human-readable message indicating details of the error. Description string `json:\"description\"` // ObservedAt is the time at which the error was observed. ObservedAt metav1.Time `json:\"observedAt\"` } // TaskState represents the state of the task. type TaskState string const ( TaskStateFailed TaskState = \"Failed\" TaskStatePending TaskState = \"Pending\" TaskStateRejected TaskState = \"Rejected\" TaskStateSucceeded TaskState = \"Succeeded\" TaskStateInProgress TaskState = \"InProgress\" ) // OperationState represents the state of the last operation. 
type OperationState string const ( OperationStateFailed OperationState = \"Failed\" OperationStatePending OperationState = \"Pending\" OperationStateCompleted OperationState = \"Completed\" OperationStateInProgress OperationState = \"InProgress\" ) Custom Resource YAML API apiVersion: druid.gardener.cloud/v1alpha1 kind: EtcdOperatorTask metadata: name: \u003cname of operator task resource\u003e namespace: \u003ccluster namespace\u003e generation: \u003cspecific generation of the desired state\u003e spec: type: \u003ctype/category of supported out-of-band task\u003e ttlSecondsAfterFinished: \u003ctime-to-live to garbage collect the custom resource after it has been completed\u003e config: \u003ctask specific configuration\u003e ownerEtcdRefrence: \u003crefer to corresponding etcd owner name and namespace for which task has been invoked\u003e status: observedGeneration: \u003cspecific observedGeneration of the resource\u003e state: \u003clast known current state of the out-of-band task\u003e initiatedAt: \u003ctime at which the task moved to any other state from the \"pending\" state\u003e lastErrors: - code: \u003cerror-code\u003e description: \u003cdescription of the error\u003e observedAt: \u003ctime the error was observed\u003e lastOperation: name: \u003coperation-name\u003e state: \u003ctask state as seen at the completion of last operation\u003e lastTransitionTime: \u003ctime of transition to this state\u003e reason: \u003creason/message if any\u003e Lifecycle Creation A task is created by creating an instance of the EtcdOperatorTask custom resource specific to that task.\n Note: In the future, either a kubectl extension plugin or a druidctl tool will be introduced. Dedicated sub-commands will be created for each out-of-band task. 
This will greatly improve usability for operators performing such tasks, as the CLI extension will automatically create the relevant instance(s) of EtcdOperatorTask with the provided configuration.\n Execution The authors propose introducing a new controller that watches EtcdOperatorTask custom resources. Each out-of-band task may have some task specific configuration defined in .spec.config. The controller needs to parse this task specific config, which comes as a string, according to the schema defined for each task. For every out-of-band task, a set of pre-conditions can be defined. These pre-conditions are evaluated against the current state of the target etcd cluster. Based on the evaluation result (boolean), the task is permitted or denied execution. If multiple tasks are invoked simultaneously or are in the pending state, they will be executed in a First-In-First-Out (FIFO) manner. Note: Dependent ordering among tasks will be addressed later, which will enable concurrent execution of tasks when possible.\n Deletion Upon completion of the task, irrespective of its final state, Etcd-druid will ensure the garbage collection of the task custom resource and any other Kubernetes resources created to execute the task. This will be done according to .spec.ttlSecondsAfterFinished if it is defined in the spec; otherwise a default expiry time will be assumed.\nUse Cases Recovery from permanent quorum loss Recovery from permanent quorum loss involves two phases - identification and recovery - both of which are done manually today. This proposal intends to automate the latter. Recovery today is a multi-step process and needs to be performed carefully by a human operator. Automating these steps would make recovery quicker and less error-prone. 
The identification of the permanent quorum loss would remain a manual process, requiring a human operator to investigate and confirm that there is indeed a permanent quorum loss with no possibility of auto-healing.\nTask Config This task does not need any config. When creating an instance of EtcdOperatorTask for this scenario, .spec.config is left unset (nil).\nPre-Conditions There should be a quorum loss in a multi-member etcd cluster. For a single-member etcd cluster, invoking this task is unnecessary as the restoration of the single member is automatically handled by the backup-restore process. There should not already be a permanent-quorum-loss-recovery-task running for the same etcd cluster. Trigger on-demand snapshot compaction Etcd-druid provides a configurable etcd-events-threshold flag. When this threshold is breached, a snapshot compaction is triggered for the etcd cluster. However, there are scenarios where an ad-hoc snapshot compaction may be required.\nPossible scenarios If an operator anticipates a scenario of permanent quorum loss, they can trigger an on-demand snapshot compaction to create a compacted full-snapshot. This can potentially reduce the recovery time from a permanent quorum loss. As an additional benefit, a human operator can leverage the current implementation of snapshot compaction, which internally triggers restoration. Hence, by initiating an on-demand snapshot compaction task, the operator can verify the integrity of etcd cluster backups, particularly in cases of potential backup corruption or re-encryption. The success or failure of this snapshot compaction can offer valuable insights into these scenarios. Task Config This task does not need any config. When creating an instance of EtcdOperatorTask for this scenario, .spec.config is left unset (nil).\nPre-Conditions There should not be an on-demand snapshot compaction task already running for the same etcd cluster. 
Note: on-demand snapshot compaction runs as a separate job in a separate pod, which interacts with the backup bucket and not the etcd cluster itself, hence it doesn’t depend on the health of etcd cluster members.\n Trigger on-demand full/delta snapshot The Etcd custom resource allows setting a FullSnapshotSchedule, which currently defaults to running once every 24 hours. DeltaSnapshotPeriod is also configurable and defines the duration after which a delta snapshot will be taken. If a human operator does not wish to wait for the scheduled full/delta snapshot, they can trigger an on-demand (out-of-schedule) full/delta snapshot on the etcd cluster, which will be taken by the leading-backup-sidecar.\nPossible scenarios An on-demand full snapshot can be triggered if a scheduled snapshot fails for any reason. Gardener Shoot Hibernation: Every etcd cluster incurs an inherent cost of preserving the volumes even when a Gardener shoot control plane is scaled down, i.e. the shoot is in a hibernated state. However, it is possible to save on hyperscaler costs by invoking this task to take a full snapshot before scaling down the etcd cluster, and deleting the etcd data volumes afterwards. Gardener Control Plane Migration: In Gardener, a cluster control plane can be moved from one seed cluster to another. This process currently requires the etcd data to be replicated on the target cluster, so a full snapshot of the etcd cluster in the source seed before the migration would allow for faster restoration of the etcd cluster in the target seed. Task Config // SnapshotType can be full or delta snapshot. type SnapshotType string const ( SnapshotTypeFull SnapshotType = \"full\" SnapshotTypeDelta SnapshotType = \"delta\" ) type OnDemandSnapshotTaskConfig struct { // Type of on-demand snapshot. Type SnapshotType `json:\"type\"` } spec: config: | type: \u003ctype of on-demand snapshot\u003e Pre-Conditions Etcd cluster should have a quorum. 
There should not already be an on-demand snapshot task running with the same SnapshotType for the same etcd cluster. Trigger on-demand maintenance of etcd cluster An operator can trigger on-demand maintenance of an etcd cluster, which includes operations like etcd compaction, etcd defragmentation, etc.\nPossible Scenarios If an etcd cluster is heavily loaded, which is causing performance degradation of the etcd cluster, and the operator does not want to wait for the scheduled maintenance window, then an on-demand maintenance task can be triggered which will invoke etcd-compaction, etcd-defragmentation, etc. on the target etcd cluster. This will make the etcd cluster lean and clean, thus improving cluster performance. Task Config type OnDemandMaintenanceTaskConfig struct { // MaintenanceType defines the maintenance operations that need to be performed on the etcd cluster. MaintenanceType maintenanceOps `json:\"maintenanceType\"` } type maintenanceOps struct { // EtcdCompaction if set to true will trigger an etcd compaction on the target etcd. // +optional EtcdCompaction bool `json:\"etcdCompaction,omitempty\"` // EtcdDefragmentation if set to true will trigger an etcd defragmentation on the target etcd. // +optional EtcdDefragmentation bool `json:\"etcdDefragmentation,omitempty\"` } spec: config: | maintenanceType: etcdCompaction: \u003ctrue/false\u003e etcdDefragmentation: \u003ctrue/false\u003e Pre-Conditions Etcd cluster should have a quorum. There should not already be a duplicate task running with the same maintenanceType. Copy Backups Task Copy the backups (full and delta snapshots) of an etcd cluster from one object store (source) to another object store (target).\nPossible Scenarios In Gardener, the Control Plane Migration process utilizes the copy-backups task. This task is responsible for copying backups from one object store to another, typically located in different regions. Task Config // EtcdCopyBackupsTaskConfig defines the parameters for the copy backups task. 
type EtcdCopyBackupsTaskConfig struct { // SourceStore defines the specification of the source object store provider. SourceStore StoreSpec `json:\"sourceStore\"` // TargetStore defines the specification of the target object store provider for storing backups. TargetStore StoreSpec `json:\"targetStore\"` // MaxBackupAge is the maximum age in days that a backup must have in order to be copied. // By default all backups will be copied. // +optional MaxBackupAge *uint32 `json:\"maxBackupAge,omitempty\"` // MaxBackups is the maximum number of backups that will be copied starting with the most recent ones. // +optional MaxBackups *uint32 `json:\"maxBackups,omitempty\"` } spec: config: |sourceStore: \u003csource object store specification\u003e targetStore: \u003ctarget object store specification\u003e maxBackupAge: \u003cmaximum age in days that a backup must have in order to be copied\u003e maxBackups: \u003cmaximum no. of backups that will be copied\u003e Note: For detailed object store specification please refer here\n Pre-Conditions There should not already be a copy-backups task running. Note: copy-backups-task runs as a separate job, and it operates only on the backup bucket, hence it doesn’t depend on health of etcd cluster members.\n Note: copy-backups-task has already been implemented and it’s currently being used in Control Plane Migration but copy-backups-task will be harmonized with EtcdOperatorTask custom resource.\n Metrics Authors proposed to introduce the following metrics:\n etcddruid_operator_task_duration_seconds : Histogram which captures the runtime for each etcd operator task. Labels:\n Key: type, Value: all supported tasks Key: state, Value: One-Of {failed, succeeded, rejected} Key: etcd, Value: name of the target etcd resource Key: etcd_namespace, Value: namespace of the target etcd resource etcddruid_operator_tasks_total: Counter which counts the number of etcd operator tasks. 
Labels:\n Key: type, Value: all supported tasks Key: state, Value: One-Of {failed, succeeded, rejected} Key: etcd, Value: name of the target etcd resource Key: etcd_namespace, Value: namespace of the target etcd resource ","categories":"","description":"","excerpt":"DEP-05: Operator Out-of-band Tasks Table of Contents DEP-05: Operator …","ref":"/docs/other-components/etcd-druid/proposals/05-etcd-operator-tasks/","tags":"","title":"operator out-of-band tasks"},{"body":"Disclaimer If an application depends on other services deployed separately, do not rely on a certain start sequence of containers. Instead, ensure that the application can cope with unavailability of the services it depends on.\nIntroduction Kubernetes offers a feature called InitContainers to perform some tasks during a pod’s initialization. In this tutorial, we demonstrate how to use InitContainers in order to orchestrate a starting sequence of multiple containers. The tutorial uses the example app url-shortener, which consists of two components:\n postgresql database webapp which depends on the postgresql database and provides two endpoints: create a short url from a given location and redirect from a given short URL to the corresponding target location This app represents the minimal example where an application relies on another service or database. 
In this example, if the application starts before the database is ready, the application will fail as shown below:\n$ kubectl logs webapp-958cf5567-h247n time=\"2018-06-12T11:02:42Z\" level=info msg=\"Connecting to Postgres database using: host=`postgres:5432` dbname=`url_shortener_db` username=`user`\\n\" time=\"2018-06-12T11:02:42Z\" level=fatal msg=\"failed to start: failed to open connection to database: dial tcp: lookup postgres on 100.64.0.10:53: no such host\\n\" $ kubectl get po -w NAME READY STATUS RESTARTS AGE webapp-958cf5567-h247n 0/1 Pending 0 0s webapp-958cf5567-h247n 0/1 Pending 0 0s webapp-958cf5567-h247n 0/1 ContainerCreating 0 0s webapp-958cf5567-h247n 0/1 ContainerCreating 0 1s webapp-958cf5567-h247n 0/1 Error 0 2s webapp-958cf5567-h247n 0/1 Error 1 3s webapp-958cf5567-h247n 0/1 CrashLoopBackOff 1 4s webapp-958cf5567-h247n 0/1 Error 2 18s webapp-958cf5567-h247n 0/1 CrashLoopBackOff 2 29s webapp-958cf5567-h247n 0/1 Error 3 43s webapp-958cf5567-h247n 0/1 CrashLoopBackOff 3 56s If the restartPolicy is set to Always (default) in the yaml file, Kubernetes will keep restarting the failed container with an exponential back-off delay.\nUsing InitContainers To avoid such a situation, InitContainers can be defined, which are executed prior to the application container. 
If one of the InitContainers fails, the application container won’t be triggered.\napiVersion: apps/v1 kind: Deployment metadata: name: webapp spec: selector: matchLabels: app: webapp template: metadata: labels: app: webapp spec: initContainers: # check if DB is ready, and only continue when true - name: check-db-ready image: postgres:9.6.5 command: ['sh', '-c', 'until pg_isready -h postgres -p 5432; do echo waiting for database; sleep 2; done;'] containers: - image: xcoulon/go-url-shortener:0.1.0 name: go-url-shortener env: - name: POSTGRES_HOST value: postgres - name: POSTGRES_PORT value: \"5432\" - name: POSTGRES_DATABASE value: url_shortener_db - name: POSTGRES_USER value: user - name: POSTGRES_PASSWORD value: mysecretpassword ports: - containerPort: 8080 In the above example, the InitContainer uses the docker image postgres:9.6.5, which is different from the image of the application container.\nThis also brings the advantage of not having to include unnecessary tools (e.g., pg_isready) in the application container.\nWith the introduction of InitContainers, if the database is not available yet, the pod startup will look similar to:\n$ kubectl get po -w NAME READY STATUS RESTARTS AGE nginx-deployment-5cc79d6bfd-t9n8h 1/1 Running 0 5d privileged-pod 1/1 Running 0 4d webapp-fdcb49cbc-4gs4n 0/1 Pending 0 0s webapp-fdcb49cbc-4gs4n 0/1 Pending 0 0s webapp-fdcb49cbc-4gs4n 0/1 Init:0/1 0 0s webapp-fdcb49cbc-4gs4n 0/1 Init:0/1 0 1s $ kubectl logs webapp-fdcb49cbc-4gs4n Error from server (BadRequest): container \"go-url-shortener\" in pod \"webapp-fdcb49cbc-4gs4n\" is waiting to start: PodInitializing ","categories":"","description":"How to orchestrate a startup sequence of multiple containers","excerpt":"How to orchestrate a startup sequence of multiple containers","ref":"/docs/guides/applications/container-startup/","tags":"","title":"Orchestration of Container Startup"},{"body":"The Gardener project implements the documentation-as-code paradigm. 
Essentially this means that:\n Documentation resides close to the code it describes - in the corresponding GitHub repositories. Only documentation regarding cross-cutting concerns that cannot be affiliated with a specific component repository is hosted in the general gardener/documentation repository. We use tools to develop, validate and integrate documentation sources. The change management process is largely automated with automatic validation, integration and deployment using docforge and docs-toolbelt. The documentation sources are intended for reuse and not bound to a specific publishing platform. The physical organization in a repository is irrelevant for the tool support. What needs to be maintained is the intended result in a docforge documentation bundle manifest configuration, very much like virtual machine configurations, that docforge can reliably recreate in any case. We use GitHub as a distributed, versioned storage system and docforge to pull sources in their desired state to forge documentation bundles according to a desired specification provided as a manifest. Content Organization Documentation that can be affiliated with a component is hosted and maintained in the component repository.\nA good way to organize your documentation is to place it in a ‘docs’ folder and create separate subfolders per role activity. For example:\nrepositoryX |_ docs |_ usage | |_ images | |_ 01.png | |_ hibernation.md |_ operations |_ deployment Do not use folders just because they are in the template. Stick to the predefined roles and corresponding activities for naming convention. A system makes it easier to maintain and get oriented. 
While recommended, this is not a mandatory way of organizing the documentation.\n User: usage Operator: operations Gardener (service) provider: deployment Gardener Developer: development Gardener Extension Developer: extensions Publishing on gardener.cloud The Gardener website is one of the multiple optional publishing channels where the source material might end up as documentation. We use docforge and an automated integration and publishing process to enable transparent change management.\nTo have documentation published on the website, it is necessary to use the docforge manifests available at gardener/documentation/.docforge and register a reference to your documentation.\nNote This is work in progress and we are transitioning to a more transparent way of integrating component documentation. This guide will be updated as we progress. These manifests describe a particular publishing goal, i.e., using Hugo to publish on the website, and you will find that they contain Hugo-specific front-matter properties. Consult with the documentation maintainers for details. Use the gardener channel in Slack or open a PR.\n","categories":"","description":"","excerpt":"The Gardener project implements the documentation-as-code paradigm. …","ref":"/docs/contribute/documentation/organization/","tags":"","title":"Organization"},{"body":"Overview The kubectl command-line tool uses kubeconfig files to find the information it needs to choose a cluster and communicate with the API server of a cluster.\nProblem If you’ve become aware of a security breach that affects you, you may want to revoke or cycle credentials in case anything was leaked. However, this is not possible with the initial or master kubeconfig from your cluster.\nPitfall Never distribute the kubeconfig, which you can download directly within the Gardener dashboard, for a productive cluster.\nCreate a Custom kubeconfig File for Each User Create a separate kubeconfig for each user. 
One of the big advantages of this approach is that you can revoke them and control the permissions better. A limitation to single namespaces is also possible here.\nThe script creates a new ServiceAccount with read privileges in the whole cluster (Secrets are excluded). To run the script, Deno, a secure TypeScript runtime, must be installed.\n#!/usr/bin/env -S deno run --allow-run /* * This script creates a Kubernetes ServiceAccount and the other required resources and prints the KUBECONFIG to the console. * Depending on your requirements you might want to change the clusterRoleBindingTemplate() function * * In order to execute this script it's required to install Deno https://deno.land/ (TypeScript \u0026 JavaScript runtime). * It's a single executable binary for the major OSs from the original author of Node.js * example: deno run --allow-run kubeconfig-for-custom-user.ts d00001 * example: deno run --allow-run kubeconfig-for-custom-user.ts d00001 --delete * * known issue: the shebang works under Linux but not under the Windows Subsystem for Linux */ const KUBECTL = \"/usr/local/bin/kubectl\" //or // const KUBECTL = \"C:\\\\Program Files\\\\Docker\\\\Docker\\\\resources\\\\bin\\\\kubectl.exe\" const serviceAccName = Deno.args[0] const deleteIt = Deno.args[1] if (serviceAccName == undefined || serviceAccName == \"--delete\" ) { console.log(\"please provide username as an argument, for example: deno run --allow-run kubeconfig-for-custom-user.ts USER_NAME [--delete]\") Deno.exit(1) } if (deleteIt == \"--delete\") { exec([KUBECTL, \"delete\", \"serviceaccount\", serviceAccName]) exec([KUBECTL, \"delete\", \"secret\", `${serviceAccName}-secret`]) exec([KUBECTL, \"delete\", \"clusterrolebinding\", `view-${serviceAccName}-global`]) Deno.exit(0) } await exec([KUBECTL, \"create\", \"serviceaccount\", serviceAccName, \"-o\", \"json\"]) await exec([KUBECTL, \"create\", \"-o\", \"json\", \"-f\", \"-\"], secretYamlTemplate()) let secret = await exec([KUBECTL, \"get\", \"secret\", 
`${serviceAccName}-secret`, \"-o\", \"json\"]) let caCRT = secret.data[\"ca.crt\"]; let userToken = atob(secret.data[\"token\"]); //decode base64 let kubeConfig = await exec([KUBECTL, \"config\", \"view\", \"--minify\", \"-o\", \"json\"]); let clusterApi = kubeConfig.clusters[0].cluster.server let clusterName = kubeConfig.clusters[0].name await exec([KUBECTL, \"create\", \"-o\", \"json\", \"-f\", \"-\"], clusterRoleBindingTemplate()) console.log(kubeConfigTemplate(caCRT, userToken, clusterApi, clusterName, serviceAccName + \"-\" + clusterName)) async function exec(args: string[], stdInput?: string): Promise\u003cObject\u003e { console.log(\"# \"+args.join(\" \")) let opt: Deno.RunOptions = { cmd: args, stdout: \"piped\", stderr: \"piped\", stdin: \"piped\", }; const p = Deno.run(opt); if (stdInput != undefined) { await p.stdin.write(new TextEncoder().encode(stdInput)); await p.stdin.close(); } const status = await p.status() const output = await p.output() const stderrOutput = await p.stderrOutput() if (status.code === 0) { return JSON.parse(new TextDecoder().decode(output)) } else { let error = new TextDecoder().decode(stderrOutput); return \"\" } } function clusterRoleBindingTemplate() { return ` apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: view-${serviceAccName}-global subjects: - kind: ServiceAccount name: ${serviceAccName} namespace: default roleRef: kind: ClusterRole name: view apiGroup: rbac.authorization.k8s.io ` } function secretYamlTemplate() { return ` apiVersion: v1 kind: Secret metadata: name: ${serviceAccName}-secret annotations: kubernetes.io/service-account.name: ${serviceAccName} type: kubernetes.io/service-account-token` } function kubeConfigTemplate(certificateAuthority: string, token: string, clusterApi: string, clusterName: string, username: string) { return ` ## KUBECONFIG generated on ${new Date()} apiVersion: v1 clusters: - cluster: certificate-authority-data: ${certificateAuthority} server: 
${clusterApi} name: ${clusterName} contexts: - context: cluster: ${clusterName} user: ${username} name: ${clusterName} current-context: ${clusterName} kind: Config preferences: {} users: - name: ${username} user: token: ${token}` } If edit or admin rights are to be assigned, the ClusterRoleBinding must be adapted in the roleRef section with the roles listed below.\nFurthermore, you can restrict this to a single namespace by not creating a ClusterRoleBinding but only a RoleBinding within the desired namespace.\n Default ClusterRole Default ClusterRoleBinding Description cluster-admin system:masters group Allows super-user access to perform any action on any resource. When used in a ClusterRoleBinding, it gives full control over every resource in the cluster and in all namespaces. When used in a RoleBinding, it gives full control over every resource in the rolebinding’s namespace, including the namespace itself. admin None Allows admin access, intended to be granted within a namespace using a RoleBinding. If used in a RoleBinding, allows read/write access to most resources in a namespace, including the ability to create roles and rolebindings within the namespace. It does not allow write access to resource quota or to the namespace itself. edit None Allows read/write access to most objects in a namespace. It does not allow viewing or modifying roles or rolebindings. view None Allows read-only access to see most objects in a namespace. It does not allow viewing roles or rolebindings. It does not allow viewing secrets, since those are escalating. 
","categories":"","description":"","excerpt":"Overview The kubectl command-line tool uses kubeconfig files to find …","ref":"/docs/guides/client-tools/working-with-kubeconfig/","tags":"","title":"Organizing Access Using kubeconfig Files"},{"body":"Problem After updating your HTML and JavaScript sources in your web application, the Kubernetes cluster delivers outdated versions - why?\nOverview By default, Kubernetes service pods are not accessible from the external network, but only from other pods within the same Kubernetes cluster.\nThe Gardener cluster has a built-in configuration for HTTP load balancing called Ingress, defining rules for external connectivity to Kubernetes services. Users who want external access to their Kubernetes services create an ingress resource that defines rules, including the URI path, backing service name, and other information. The Ingress controller can then automatically program a frontend load balancer to enable Ingress configuration.\nExample Ingress Configuration apiVersion: networking.k8s.io/v1beta1 kind: Ingress metadata: name: vuejs-ingress spec: rules: - host: test.ingress.\u003cGARDENER-CLUSTER\u003e.\u003cGARDENER-PROJECT\u003e.shoot.canary.k8s-hana.ondemand.com http: paths: - backend: serviceName: vuejs-svc servicePort: 8080 where:\n \u003cGARDENER-CLUSTER\u003e: The cluster name in the Gardener \u003cGARDENER-PROJECT\u003e: You project name in the Gardener Diagnosing the Problem The ingress controller we are using is NGINX. NGINX is a software load balancer, web server, and content cache built on top of open source NGINX.\nNGINX caches the content as specified in the HTTP header. 
If the HTTP header is missing, NGINX assumes that the content can be cached forever and, in the worst case, never updates it.\nSolution In general, you can avoid this pitfall with one of the solutions below:\n Use a cache buster + HTTP-Cache-Control (preferred) Use HTTP-Cache-Control with a lower retention period Disable the caching in the ingress (just for dev purposes) Learning how to set the HTTP header or set up a cache buster is left to you, as an exercise for your web framework (e.g., Express/NodeJS, SpringBoot, …)\nHere is an example of how to disable caching for your ingress, done with an annotation in your ingress YAML (during development).\n--- apiVersion: networking.k8s.io/v1beta1 kind: Ingress metadata: annotations: ingress.kubernetes.io/cache-enable: \"false\" name: vuejs-ingress spec: rules: - host: test.ingress.\u003cGARDENER-CLUSTER\u003e.\u003cGARDENER-PROJECT\u003e.shoot.canary.k8s-hana.ondemand.com http: paths: - backend: serviceName: vuejs-svc servicePort: 8080 ","categories":"","description":"Why is my application always outdated?","excerpt":"Why is my application always outdated?","ref":"/docs/guides/applications/service-cache-control/","tags":"","title":"Out-Dated HTML and JS Files Delivered"},{"body":"Machine Controller Manager CORE – ./machine-controller-manager (provider independent) Out of tree : Machine controller (provider specific) MCM is a set of controllers:\n Machine Deployment Controller\n Machine Set Controller\n Machine Controller\n Machine Safety Controller\n Questions and refactoring Suggestions Refactoring Statement FilePath Status “ConcurrentNodeSyncs” is a bad name - it has nothing to do with node syncs actually. If its value is ’10’ then it will start 10 goroutines (workers) per resource type (machine, machineset, machinedeployment, provider-specific-class, node) - study the different resource types. 
cmd/machine-controller-manager/app/options/options.go pending LeaderElectionConfiguration is very similar to the one present in “client-go/tools/leaderelection/leaderelection.go” - can we simply use the one in client-go instead of defining it again? pkg/options/types.go - MachineControllerManagerConfiguration pending Have all userAgents as constants. Right now there is just one. cmd/app/controllermanager.go pending Shouldn’t the run function be defined on the MCMServer struct itself? cmd/app/controllermanager.go pending clientcmd.BuildConfigFromFlags falls back to inClusterConfig, which will surely not work as that is not the target. Should it not check and exit early? cmd/app/controllermanager.go - run Function pending A more direct way to create an in cluster config is using k8s.io/client-go/rest -\u003e rest.InClusterConfig instead of using clientcmd.BuildConfigFromFlags passing empty arguments and depending upon the implementation to fall back to creating an inClusterConfig. If they change the implementation, then you get affected. cmd/app/controllermanager.go - run Function pending Introduce a method on MCMServer which gets a target KubeConfig and controlKubeConfig or alternatively which creates respective clients. cmd/app/controllermanager.go - run Function pending Why can’t we use Kubernetes.NewConfigOrDie also for kubeClientControl? cmd/app/controllermanager.go - run Function pending I do not see any benefit of client builders actually. All you need to do is pass in a config and then directly use client-go functions to create a client. cmd/app/controllermanager.go - run Function pending Function: getAvailableResources - rename this to getApiServerResources cmd/app/controllermanager.go pending Move the method which waits for the API server to be up and ready to a separate method which returns a discoveryClient when the API server is ready. cmd/app/controllermanager.go - getAvailableResources function pending Many methods in client-go used are now deprecated. 
Switch to the ones that are now recommended to be used instead. cmd/app/controllermanager.go - startControllers pending This method needs a general overhaul cmd/app/controllermanager.go - startControllers pending If the design is influenced/copied from KCM then it’s very different. There are different controller structs defined for deployment, replicaset etc. which makes the code much clearer. You can see “kubernetes/cmd/kube-controller-manager/apps.go” and then follow the trail from there. - agreed, needs to be changed in future (if time permits) pkg/controller/controller.go pending I am not sure why “MachineSetControlInterface”, “RevisionControlInterface”, “MachineControlInterface”, “FakeMachineControl” are defined in this file? pkg/controller/controller_util.go pending IsMachineActive - combine the first 2 conditions into one with OR. pkg/controller/controller_util.go pending Minor change - correct the comment, first word should always be the method name. Currently none of the comments have correct names. pkg/controller/controller_util.go pending There are too many deep copies made. What is the need to make another deep copy in this method? You are not really changing anything here. pkg/controller/deployment.go - updateMachineDeploymentFinalizers pending Why can’t these validations be done as part of a validating webhook? pkg/controller/machineset.go - reconcileClusterMachineSet pending Small change to the following if condition: else if is not required, a simple else is sufficient. Code1 pkg/controller/machineset.go - reconcileClusterMachineSet pending Why call these inactiveMachines? These are live and running and therefore active. pkg/controller/machineset.go - terminateMachines pending Clarification Statement FilePath Status Why are there 2 versions - internal and external versions? 
General pending Safety controller freezes MCM controllers in the following cases: * Num replicas go beyond a threshold (above the defined replicas) * Target API service is not reachable. There seems to be an overlap between DWD and the MCM Safety controller. In the meltdown scenario why is MCM being added to DWD? You could have used the Safety controller for that. General pending All machine resources are v1alpha1 - should we not promote them to beta? V1alpha1 has a different semantic and does not give any confidence to the consumers. cmd/app/controllermanager.go pending Shouldn’t controller manager use context.Context instead of creating a stop channel? - Check if signals (os.Interrupt and SIGTERM) are handled properly. Do not see code where this is handled currently. cmd/app/controllermanager.go pending What is the rationale behind a timeout of 10s? If the API server is not up, should this not just block, as it cannot do anything anyway? Also, if there is an error returned then you exit the MCM, which does not make much sense actually as it will be started again and you will again do the poll for the API server to come back up. Forcing an exit of MCM will not have any impact on the reachability of the API server in any way, so why exit? cmd/app/controllermanager.go - getAvailableResources pending There is a very weird check - availableResources[machineGVR] || availableResources[machineSetGVR] || availableResources[machineDeploymentGVR] Shouldn’t this be a conjunction instead of a disjunction? * What happens if you do not find one or all of these resources? Currently an error log is printed and nothing else is done. MCM can be used outside the gardener context where consumers can directly create MachineClass and Machine and not create MachineSet / MachineDeployment. There is no distinction made between contexts (gardener or outside-gardener). 
cmd/app/controllermanager.go - StartControllers pending Instead of having an empty select {} to block forever, isn’t it better to wait on the stop channel? cmd/app/controllermanager.go - StartControllers pending Do we need provider specific queues and syncs and listers? pkg/controller/controller.go pending Why are resource types prefixed with “Cluster”? - not sure, check PR pkg/controller/controller.go pending When will forgetAfterSuccess be false and why? - as per the current code this is never the case. - Himanshu will check cmd/app/controllermanager.go - createWorker pending What is the use of “ExpectationsInterface” and “UIDTrackingContExpectations”? * All expectations related code should be in its own file “expectations.go” and not in this file. pkg/controller/controller_util.go pending Why do we not use the lister but directly use the controlMachineClient to get the deployment? Is it because you want to avoid any potential delays caused by update of the local cache held by the informer and accessed by the lister? What is the load on the API server due to this? pkg/controller/deployment.go - reconcileClusterMachineDeployment pending Why is this conversion needed? code2 pkg/controller/deployment.go - reconcileClusterMachineDeployment pending A deep copy of machineDeployment is already passed and within the function another deepCopy is made. Any reason for it? pkg/controller/deployment.go - addMachineDeploymentFinalizers pending What is Status.ObservedGeneration? *Read more about generations and observedGeneration at: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata https://alenkacz.medium.com/kubernetes-operator-best-practices-implementing-observedgeneration-250728868792 Ideally the update to the ObservedGeneration should only be made after successful reconciliation and not before. 
I see that this is just copied from deployment_controller.go as is pkg/controller/deployment.go - reconcileClusterMachineDeployment pending Why and when will a MachineDeployment be marked as frozen and when will it be un-frozen? pkg/controller/deployment.go - reconcileClusterMachineDeployment pending Shouldn’t the validation of the machine deployment be done during the creation via a validating webhook instead of allowing it to be stored in etcd and then failing the validation during sync? I saw the checks and these can be done via validation webhook. pkg/controller/deployment.go - reconcileClusterMachineDeployment pending RollbackTo has been marked as deprecated. What is the replacement? code3 pkg/controller/deployment.go - reconcileClusterMachineDeployment pending What is the maximum number of machineSet deletions that you could process in a single run? The reason for asking this question is that for every machineSetDeletion a new goroutine is spawned. * Is the Delete call a synchronous call? Which means it blocks till the machineset deletion is triggered which then also deletes the machines (due to cascade-delete and blockOwnerDeletion= true)? pkg/controller/deployment.go - terminateMachineSets pending If there are validation errors or an error when creating the label selector then a nil is returned. In the worker reconcile loop if the return value is nil then it will remove it from the queue (forget + done). What is the way to see any errors? Typically when we describe a resource the errors are displayed. Will these be displayed when we describe a MachineDeployment? pkg/controller/deployment.go - reconcileClusterMachineSet pending If an error is returned by updateMachineSetStatus and it is an IsNotFound error then returning an error will again queue the MachineSet. Is this desired, as IsNotFound indicates the MachineSet has been deleted and is no longer there? 
pkg/controller/deployment.go - reconcileClusterMachineSet pending Is machineControl.DeleteMachine a synchronous operation which will wait till the machine has been deleted? Also where is the DeletionTimestamp set on the Machine? Will it be automatically done by the API server? pkg/controller/deployment.go - prepareMachineForDeletion pending Bugs/Enhancements Statement + TODO FilePath Status This defines QPS and Burst for its requests to the KAPI. Check if it would make sense to explicitly define a FlowSchema and PriorityLevelConfiguration to ensure that the requests from this controller are given a well-defined preference. What is the rationale behind deciding these values? pkg/options/types.go - MachineControllerManagerConfiguration pending In function “validateMachineSpec” the fldPath func parameter is never used. pkg/apis/machine/validation/machine.go pending If there is an update failure then this method recursively calls itself without any sort of delay, which could lead to a LOT of load on the API server. (opened: https://github.com/gardener/machine-controller-manager/issues/686) pkg/controller/deployment.go - updateMachineDeploymentFinalizers pending We are updating filteredMachines by invoking syncMachinesNodeTemplates, syncMachinesConfig and syncMachinesClassKind but we do not create any deepCopy here. Everywhere else the general principle is when you mutate always make a deepCopy and then mutate the copy instead of the original as a lister is used and that changes the cached copy. Fix: SatisfiedExpectations check has been commented and there is a TODO there to fix it. Is there a PR for this? 
pkg/controller/machineset.go - reconcileClusterMachineSet pending Code references\n1.1 code1 if machineSet.DeletionTimestamp == nil { // manageReplicas is the core machineSet method where scale up/down occurs // It is not called when deletion timestamp is set manageReplicasErr = c.manageReplicas(ctx, filteredMachines, machineSet) ​ } else if machineSet.DeletionTimestamp != nil { //FIX: change this to simple else without the if 1.2 code2 defer dc.enqueueMachineDeploymentAfter(deployment, 10*time.Minute) * `Clarification`: Why is this conversion needed? err = v1alpha1.Convert_v1alpha1_MachineDeployment_To_machine_MachineDeployment(deployment, internalMachineDeployment, nil) 1.3 code3 // rollback is not re-entrant in case the underlying machine sets are updated with a new \t// revision so we should ensure that we won't proceed to update machine sets until we \t// make sure that the deployment has cleaned up its rollback spec in subsequent enqueues. \tif d.Spec.RollbackTo != nil { \treturn dc.rollback(ctx, d, machineSets, machineMap) \t} ","categories":"","description":"","excerpt":"Machine Controller Manager CORE – …","ref":"/docs/other-components/machine-controller-manager/todo/outline/","tags":"","title":"Outline"},{"body":"Extensibility Overview Initially, everything was developed in-tree in the Gardener project. All cloud providers and the configuration for all the supported operating systems were released together with the Gardener core itself. But as the project grew, it got more and more difficult to add new providers and maintain the existing code base. As a consequence and in order to become agile and flexible again, we proposed GEP-1 (Gardener Enhancement Proposal). 
The document describes an out-of-tree extension architecture that keeps the Gardener core logic independent of provider-specific knowledge (similar to what Kubernetes has achieved with out-of-tree cloud providers or with CSI volume plugins).\nBasic Concepts Gardener keeps running in the “garden cluster” and implements the core logic of shoot cluster reconciliation / deletion. Extensions are Kubernetes controllers themselves (like Gardener) and run in the seed clusters. As usual, we try to use Kubernetes wherever applicable. We rely on Kubernetes extension concepts in order to enable extensibility for Gardener. The main ideas of GEP-1 are the following:\n During the shoot reconciliation process, Gardener will write CRDs into the seed cluster that are watched and managed by the extension controllers. They will reconcile (based on the .spec) and report whether everything went well or errors occurred in the CRD’s .status field.\n Gardener keeps deploying the provider-independent control plane components (etcd, kube-apiserver, etc.). However, some of these components might still need little customization by providers, e.g., additional configuration, flags, etc. In this case, the extension controllers register webhooks in order to manipulate the manifests.\n Example 1:\nGardener creates a new AWS shoot cluster and requires the preparation of infrastructure in order to proceed (networks, security groups, etc.). 
It writes the following CRD into the seed cluster:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--core--aws-01 spec: type: aws providerConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 internal: - 10.250.112.0/22 public: - 10.250.96.0/22 workers: - 10.250.0.0/19 zones: - eu-west-1a dns: apiserver: api.aws-01.core.example.com region: eu-west-1 secretRef: name: my-aws-credentials sshPublicKey: | base64(key) Please note that the .spec.providerConfig is a raw blob and not evaluated or known in any way by Gardener. Instead, it was specified by the user (in the Shoot resource) and just “forwarded” to the extension controller. Only the AWS controller understands this configuration and will now start provisioning/reconciling the infrastructure. It reports the result in the .status field:\nstatus: observedGeneration: ... state: ... lastError: .. lastOperation: ... providerStatus: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus vpc: id: vpc-1234 subnets: - id: subnet-acbd1234 name: workers zone: eu-west-1 securityGroups: - id: sg-xyz12345 name: workers iam: nodesRoleARN: \u003csome-arn\u003e instanceProfileName: foo ec2: keyName: bar Gardener waits until the .status.lastOperation / .status.lastError indicates that the operation reached a final state and either continues with the next step, or stops and reports the potential error. The extension-specific output in .status.providerStatus is - similar to .spec.providerConfig - not evaluated, and simply forwarded to CRDs in subsequent steps.\nExample 2:\nGardener deploys the control plane components into the seed cluster, e.g. the kube-controller-manager deployment with the following flags:\napiVersion: apps/v1 kind: Deployment ... 
spec: template: spec: containers: - command: - /usr/local/bin/kube-controller-manager - --allocate-node-cidrs=true - --attach-detach-reconcile-sync-period=1m0s - --controllers=*,bootstrapsigner,tokencleaner - --cluster-cidr=100.96.0.0/11 - --cluster-name=shoot--core--aws-01 - --cluster-signing-cert-file=/srv/kubernetes/ca/ca.crt - --cluster-signing-key-file=/srv/kubernetes/ca/ca.key - --concurrent-deployment-syncs=10 - --concurrent-replicaset-syncs=10 ... The AWS controller requires some additional flags in order to make the cluster functional. It needs to provide a Kubernetes cloud-config and also some cloud-specific flags. Consequently, it registers a MutatingWebhookConfiguration on Deployments and adds these flags to the container:\n - --cloud-provider=external - --external-cloud-volume-plugin=aws - --cloud-config=/etc/kubernetes/cloudprovider/cloudprovider.conf Of course, it would have needed to create a ConfigMap containing the cloud config and to add the proper volume and volumeMounts to the manifest as well.\n(Please note for this special example: The Kubernetes community is also working on making the kube-controller-manager provider-independent. However, there will most probably be still components other than the kube-controller-manager which need to be adapted by extensions.)\nIf you are interested in writing an extension, or generally in digging deeper to find out the nitty-gritty details of the extension concepts, please read GEP-1. We are truly looking forward to your feedback!\nCurrent Status Meanwhile, the out-of-tree extension architecture of Gardener is in place and has been productively validated. 
We are tracking all internal and external extensions of Gardener in the Gardener Extensions Library repo.\n","categories":"","description":"","excerpt":"Extensibility Overview Initially, everything was developed in-tree in …","ref":"/docs/gardener/extensions/overview/","tags":"","title":"Overview"},{"body":"Setting up the usage environment Setting up the usage environment Important ⚠️ Set KUBECONFIG Replace provider credentials and desired VM configurations Deploy required CRDs and Objects Check current cluster state Important ⚠️ All paths are relative to the root location of this project repository.\n Run the Machine Controller Manager either as described in Setting up a local development environment or Deploying the Machine Controller Manager into a Kubernetes cluster.\n Make sure that the following steps are run before managing machines, machine-sets, or machine-deployments.\n Set KUBECONFIG Using the existing Kubeconfig, open another Terminal panel/window with the KUBECONFIG environment variable pointing to this Kubeconfig file as shown below:\n$ export KUBECONFIG=\u003cPATH_TO_REPO\u003e/dev/kubeconfig.yaml Replace provider credentials and desired VM configurations Open kubernetes/machine_classes/aws-machine-class.yaml and replace the required values there with the desired VM configurations.\nSimilarly, open kubernetes/secrets/aws-secret.yaml and replace userData, providerAccessKeyId, and providerSecretAccessKey with the base64 encoded values of the cloudconfig file, AWS access key id, and AWS secret access key respectively. 
Use the following command to get the base64 encoded value of your details (the -n flag keeps a trailing newline out of the encoded value):\n$ echo -n \"sample-cloud-config\" | base64 base64-encoded-cloud-config Do the same for your access key id and secret access key.\nDeploy required CRDs and Objects Create all the required CRDs in the cluster using kubernetes/crds.yaml\n$ kubectl apply -f kubernetes/crds.yaml Create the class template that will be used as a machine template to create VMs using kubernetes/machine_classes/aws-machine-class.yaml\n$ kubectl apply -f kubernetes/machine_classes/aws-machine-class.yaml Create the secret used for the cloud credentials and cloudconfig using kubernetes/secrets/aws-secret.yaml\n$ kubectl apply -f kubernetes/secrets/aws-secret.yaml Check current cluster state Get to know the current cluster state using the following commands:\n Checking aws-machine-class in the cluster $ kubectl get awsmachineclass NAME MACHINE TYPE AMI AGE test-aws t2.large ami-123456 5m Checking kubernetes secrets in the cluster $ kubectl get secret NAME TYPE DATA AGE test-secret Opaque 3 21h Checking kubernetes nodes in the cluster $ kubectl get nodes Lists the default set of nodes attached to your cluster\n Checking Machine Controller Manager machines in the cluster $ kubectl get machine No resources found. Checking Machine Controller Manager machine-sets in the cluster $ kubectl get machineset No resources found. Checking Machine Controller Manager machine-deploys in the cluster $ kubectl get machinedeployment No resources found. ","categories":"","description":"","excerpt":"Setting up the usage environment Setting up the usage environment …","ref":"/docs/other-components/machine-controller-manager/prerequisite/","tags":"","title":"Prerequisite"},{"body":"PriorityClasses in Gardener Clusters Gardener makes use of PriorityClasses to improve the overall robustness of the system. 
In order to benefit from the full potential of PriorityClasses, the gardenlet manages a set of well-known PriorityClasses with fine-granular priority values.\nAll components of the system should use these well-known PriorityClasses instead of creating and using separate ones with arbitrary values, which would compromise the overall goal of using PriorityClasses in the first place. The gardenlet manages the well-known PriorityClasses listed in this document, so that third parties (e.g., Gardener extensions) can rely on them to be present when deploying components to Seed and Shoot clusters.\nThe listed well-known PriorityClasses follow this rough concept:\n Values are close to the maximum that can be declared by the user. This is important to ensure that Shoot system components have higher priority than the workload deployed by end-users. Values have a bit of headroom in between to ensure flexibility when the need for intermediate priority values arises. Values of PriorityClasses created on Seed clusters are lower than the ones on Shoots to ensure that Shoot system components have higher priority than Seed components, if the Seed is backed by a Shoot (ManagedSeed), e.g. coredns should have higher priority than gardenlet. Names simply include the last digits of the value to minimize confusion caused by many (similar) names like critical, importance-high, etc. 
Garden Clusters When using the gardener-operator for managing the garden runtime and virtual cluster, the following PriorityClasses are available:\nPriorityClasses for Garden Control Plane Components Name Priority Associated Components (Examples) gardener-garden-system-critical 999999550 gardener-operator, gardener-resource-manager, istio gardener-garden-system-500 999999500 virtual-garden-etcd-events, virtual-garden-etcd-main, virtual-garden-kube-apiserver, gardener-apiserver gardener-garden-system-400 999999400 virtual-garden-gardener-resource-manager, gardener-admission-controller, Extension Admission Controllers gardener-garden-system-300 999999300 virtual-garden-kube-controller-manager, vpa-admission-controller, etcd-druid, nginx-ingress-controller gardener-garden-system-200 999999200 vpa-recommender, vpa-updater, hvpa-controller, gardener-scheduler, gardener-controller-manager, gardener-dashboard, terminal-controller-manager, gardener-discovery-server, Extension Controllers gardener-garden-system-100 999999100 fluent-operator, fluent-bit, gardener-metrics-exporter, kube-state-metrics, plutono, vali, prometheus-operator, alertmanager-garden, prometheus-garden, blackbox-exporter, prometheus-longterm Seed Clusters PriorityClasses for Seed System Components Name Priority Associated Components (Examples) gardener-system-critical 999998950 gardenlet, gardener-resource-manager, istio-ingressgateway, istiod gardener-system-900 999998900 Extensions, reversed-vpn-auth-server gardener-system-800 999998800 dependency-watchdog-endpoint, dependency-watchdog-probe, etcd-druid, vpa-admission-controller gardener-system-700 999998700 hvpa-controller, vpa-recommender, vpa-updater gardener-system-600 999998600 alertmanager-seed, fluent-operator, fluent-bit, plutono, kube-state-metrics, nginx-ingress-controller, nginx-k8s-backend, prometheus-operator, prometheus-aggregate, prometheus-cache, prometheus-seed, vali gardener-reserve-excess-capacity -5 reserve-excess-capacity (ref) 
PriorityClasses for Shoot Control Plane Components Name Priority Associated Components (Examples) gardener-system-500 999998500 etcd-events, etcd-main, kube-apiserver gardener-system-400 999998400 gardener-resource-manager gardener-system-300 999998300 cloud-controller-manager, cluster-autoscaler, csi-driver-controller, kube-controller-manager, kube-scheduler, machine-controller-manager, terraformer, vpn-seed-server gardener-system-200 999998200 csi-snapshot-controller, csi-snapshot-validation, cert-controller-manager, shoot-dns-service, vpa-admission-controller, vpa-recommender, vpa-updater gardener-system-100 999998100 alertmanager-shoot, plutono, kube-state-metrics, prometheus-shoot, blackbox-exporter, vali, event-logger Shoot Clusters PriorityClasses for Shoot System Components Name Priority Associated Components (Examples) system-node-critical (created by Kubernetes) 2000001000 calico-node, kube-proxy, apiserver-proxy, csi-driver, egress-filter-applier system-cluster-critical (created by Kubernetes) 2000000000 calico-typha, calico-kube-controllers, coredns, vpn-shoot, registry-cache gardener-shoot-system-900 999999900 node-problem-detector gardener-shoot-system-800 999999800 calico-typha-horizontal-autoscaler, calico-typha-vertical-autoscaler gardener-shoot-system-700 999999700 blackbox-exporter, node-exporter gardener-shoot-system-600 999999600 addons-nginx-ingress-controller, addons-nginx-ingress-k8s-backend, kubernetes-dashboard, kubernetes-metrics-scraper ","categories":"","description":"","excerpt":"PriorityClasses in Gardener Clusters Gardener makes use of …","ref":"/docs/gardener/priority-classes/","tags":"","title":"Priority Classes"},{"body":"Prober Overview Prober starts asynchronous and periodic probes for every shoot cluster. The first probe is the api-server probe which checks the reachability of the API Server from the control plane. 
The second probe is the lease probe, which runs after the API server probe has succeeded and checks whether the number of expired node leases is below a certain threshold. If the lease probe fails, the prober scales down the dependent Kubernetes resources. Once the connectivity to kube-apiserver is reestablished and the number of expired node leases is within the accepted threshold, the prober proactively scales up the dependent Kubernetes resources it had scaled down earlier. The failure threshold fraction for the lease probe and the dependent Kubernetes resources are defined in the configuration that is passed to the prober.\nOrigin In a shoot cluster (a.k.a. data plane) each node runs a kubelet which periodically renews its lease. Leases serve as heartbeats informing Kube Controller Manager that the node is alive. The connectivity between the kubelet and the Kube ApiServer can break for different reasons and not recover in time.\nAs an example, consider a large shoot cluster with several hundred nodes. There is an issue with a NAT gateway on the shoot cluster which prevents the kubelet on any node in the shoot cluster from reaching its control plane Kube ApiServer. As a consequence, Kube Controller Manager transitions the nodes of this shoot cluster to the Unknown state.\nMachine Controller Manager, which also runs in the shoot control plane, reacts to any changes to the Node status and then takes action to recover the backing VMs/machine(s). It waits for a grace period and then begins to replace the unhealthy machine(s) with new ones.\nThis replacement of healthy machines due to broken connectivity between the worker nodes and the control plane Kube ApiServer results in undesired downtimes for customer workloads that were running on these otherwise healthy nodes. 
It is therefore required that there be an actor which detects the connectivity loss between the kubelet and the shoot cluster’s Kube ApiServer and proactively scales down components in the shoot control namespace which could further reduce the availability of nodes in the shoot cluster.\nDependency Watchdog Prober in Gardener Prober is a central component which is deployed in the garden namespace in the seed cluster. Control plane components for a shoot are deployed in a dedicated shoot namespace for the shoot within the seed cluster.\n NOTE: If you are not familiar with Gardener concepts such as seed and shoot, please see the appendix for links.\n Prober periodically probes Kube ApiServer via two separate probes:\n API Server Probe: Uses the local cluster DNS name which resolves to the ClusterIP of the Kube ApiServer. Lease Probe: Checks that the number of expired leases is within the specified threshold. The threshold defines the limit after which DWD can say that the kubelets are not able to reach the API server. Behind the scene For all active shoot clusters (which have not been hibernated or deleted or moved to another seed via control-plane-migration), prober will schedule a probe to run periodically. During each run of a probe it will do the following:\n Checks if the Kube ApiServer is reachable via local cluster DNS. This should always succeed and will fail only when the Kube ApiServer has gone down. If the Kube ApiServer is down, then there can be no further damage to the existing shoot cluster (barring new requests to the Kube ApiServer). Only if the probe is able to reach the Kube ApiServer via local cluster DNS will it attempt to check the number of expired node leases in the shoot. The node lease renewal is done by the kubelet, so the lease probe effectively checks whether the kubelet is able to reach the API server. If the number of expired node leases reaches the threshold, then the probe fails. 
If and when a lease probe fails, then it will initiate a scale-down operation for dependent resources as defined in the prober configuration. In subsequent runs it will keep performing the lease probe. If it is successful, then it will start the scale-up operation for dependent resources as defined in the configuration. Prober lifecycle A reconciler is registered to listen to all events for Cluster resource.\nWhen a Reconciler receives a request for a Cluster change, it will query the extension kube-api server to get the Cluster resource.\nIn the following cases it will either remove an existing probe for this cluster or skip creating a new probe:\n Cluster is marked for deletion. Hibernation has been enabled for the cluster. There is an ongoing seed migration for this cluster. If a new cluster is created with no workers. If an update is made to the cluster by removing all workers (in other words making it worker-less). If none of the above conditions are true and there is no existing probe for this cluster then a new probe will be created, registered and started.\nProbe failure identification DWD probe can either be a success or it could return an error. If the API server probe fails, the lease probe is not done and the probes will be retried. If the error is a TooManyRequests error due to requests to the Kube-API-Server being throttled, then the probes are retried after a backOff of backOffDurationForThrottledRequests.\nIf the lease probe fails, then the error could be due to failure in listing the leases. In this case, no scaling operations are performed. If the error in listing the leases is a TooManyRequests error due to requests to the Kube-API-Server being throttled, then the probes are retried after a backOff of backOffDurationForThrottledRequests.\nIf there is no error in listing the leases, then the Lease probe fails if the number of expired leases reaches the threshold fraction specified in the configuration. 
A lease is considered expired in the following scenario:\n\ttime.Now() \u003e= lease.Spec.RenewTime + (p.config.KCMNodeMonitorGraceDuration.Duration * expiryBufferFraction) Here, lease.Spec.RenewTime is the time when the current holder of a lease last updated the lease. config is the probe config generated from the configuration, and KCMNodeMonitorGraceDuration is the amount of time which KCM allows a running Node to be unresponsive before marking it unhealthy (see ref). expiryBufferFraction is a hard-coded value of 0.75. Using this fraction allows the prober to intervene before KCM marks a node as unknown, while still allowing the kubelet sufficient retries to renew the node lease (the kubelet renews the lease every 10s, see ref).\nAppendix Gardener Reverse Cluster VPN ","categories":"","description":"","excerpt":"Prober Overview Prober starts asynchronous and periodic probes for …","ref":"/docs/other-components/dependency-watchdog/concepts/prober/","tags":"","title":"Prober"},{"body":"Hotfixes This document describes how to contribute hotfixes\n Hotfixes Cherry Picks Prerequisites Initiate a Cherry Pick Cherry Picks This section explains how to initiate cherry picks on hotfix branches within the gardener/dashboard repository.\n Prerequisites Initiate a Cherry Pick Prerequisites Before you initiate a cherry pick, make sure that the following prerequisites are met.\n A pull request merged against the master branch. The hotfix branch exists (check in the branches section). Have the gardener/dashboard repository cloned as follows: the origin remote should point to your fork (alternatively this can be overwritten by passing FORK_REMOTE=\u003cfork-remote\u003e). the upstream remote should point to the Gardener GitHub org (alternatively this can be overwritten by passing UPSTREAM_REMOTE=\u003cupstream-remote\u003e). Have hub installed, e.g. brew install hub assuming you have a standard golang development environment. 
A GitHub token which has permissions to create a PR in an upstream branch. Initiate a Cherry Pick Run the cherry pick script.\nThis example applies a master branch PR #1824 to the remote branch upstream/hotfix-1.74:\nGITHUB_USER=\u003cyour-user\u003e hack/cherry-pick-pull.sh upstream/hotfix-1.74 1824 Be aware that the cherry pick script assumes you have a git remote called upstream that points at the Gardener GitHub org.\n You will need to run the cherry pick script separately for each patch release you want to cherry pick to. Cherry picks should be applied to all active hotfix branches where the fix is applicable.\n When asked for your GitHub password, provide the created GitHub token rather than your actual GitHub password. Refer to https://github.com/github/hub/issues/2655#issuecomment-735836048\n ","categories":"","description":"","excerpt":"Hotfixes This document describes how to contribute hotfixes\n Hotfixes …","ref":"/docs/dashboard/process/","tags":"","title":"Process"},{"body":"Releases, Features, Hotfixes This document describes how to contribute features or hotfixes, and how new Gardener releases are usually scheduled, validated, etc.\n Releases, Features, Hotfixes Releases Release Responsible Plan Release Validation Contributing New Features or Fixes TODO Statements Deprecations and Backwards-Compatibility Cherry Picks Prerequisites Initiate a Cherry Pick Releases The @gardener-maintainers are trying to provide a new release roughly every other week (depending on their capacity and the stability/robustness of the master branch).\nHotfixes are usually maintained for the latest three minor releases, though there are no fixed release dates.\nRelease Responsible Plan Version Week No Begin Validation Phase Due Date Release Responsible v1.101 Week 31-32 July 29, 2024 August 11, 2024 @rfranzke v1.102 Week 33-34 August 12, 2024 August 25, 2024 @plkokanov v1.103 Week 35-36 August 26, 2024 September 8, 2024 @oliver-goetz 
v1.104 Week 37-38 September 9, 2024 September 22, 2024 @ialidzhikov v1.105 Week 39-40 September 23, 2024 October 6, 2024 @acumino v1.106 Week 41-42 October 7, 2024 October 20, 2024 @timuthy v1.107 Week 43-44 October 21, 2024 November 3, 2024 @LucaBernstein v1.108 Week 45-46 November 4, 2024 November 17, 2024 @shafeeqes v1.109 Week 47-48 November 18, 2024 December 1, 2024 @ary1992 v1.110 Week 49-50 December 2, 2024 December 15, 2024 @ScheererJ v1.111 Week 01-04 December 30, 2024 January 26, 2025 @oliver-goetz v1.112 Week 05-06 January 27, 2025 February 9, 2025 @tobschli v1.113 Week 07-08 February 10, 2025 February 23, 2025 @plkokanov v1.114 Week 09-10 February 24, 2025 March 9, 2025 @rfranzke v1.115 Week 11-12 March 10, 2025 March 23, 2025 @ialidzhikov Apart from the release of the next version, the release responsible also takes care of potential hotfix releases of the last three minor versions. The release responsible is the main contact person for coordinating new feature PRs for the next minor versions or cherry-pick PRs for the last three minor versions.\n Click to expand the archived release responsible associations! 
Version Week No Begin Validation Phase Due Date Release Responsible v1.17 Week 07-08 February 15, 2021 February 28, 2021 @rfranzke v1.18 Week 09-10 March 1, 2021 March 14, 2021 @danielfoehrKn v1.19 Week 11-12 March 15, 2021 March 28, 2021 @timebertt v1.20 Week 13-14 March 29, 2021 April 11, 2021 @vpnachev v1.21 Week 15-16 April 12, 2021 April 25, 2021 @timuthy v1.22 Week 17-18 April 26, 2021 May 9, 2021 @BeckerMax v1.23 Week 19-20 May 10, 2021 May 23, 2021 @ialidzhikov v1.24 Week 21-22 May 24, 2021 June 5, 2021 @stoyanr v1.25 Week 23-24 June 7, 2021 June 20, 2021 @rfranzke v1.26 Week 25-26 June 21, 2021 July 4, 2021 @danielfoehrKn v1.27 Week 27-28 July 5, 2021 July 18, 2021 @timebertt v1.28 Week 29-30 July 19, 2021 August 1, 2021 @ialidzhikov v1.29 Week 31-32 August 2, 2021 August 15, 2021 @timuthy v1.30 Week 33-34 August 16, 2021 August 29, 2021 @BeckerMax v1.31 Week 35-36 August 30, 2021 September 12, 2021 @stoyanr v1.32 Week 37-38 September 13, 2021 September 26, 2021 @vpnachev v1.33 Week 39-40 September 27, 2021 October 10, 2021 @voelzmo v1.34 Week 41-42 October 11, 2021 October 24, 2021 @plkokanov v1.35 Week 43-44 October 25, 2021 November 7, 2021 @kris94 v1.36 Week 45-46 November 8, 2021 November 21, 2021 @timebertt v1.37 Week 47-48 November 22, 2021 December 5, 2021 @danielfoehrKn v1.38 Week 49-50 December 6, 2021 December 19, 2021 @rfranzke v1.39 Week 01-04 January 3, 2022 January 30, 2022 @ialidzhikov, @timuthy v1.40 Week 05-06 January 31, 2022 February 13, 2022 @BeckerMax v1.41 Week 07-08 February 14, 2022 February 27, 2022 @plkokanov v1.42 Week 09-10 February 28, 2022 March 13, 2022 @kris94 v1.43 Week 11-12 March 14, 2022 March 27, 2022 @rfranzke v1.44 Week 13-14 March 28, 2022 April 10, 2022 @timebertt v1.45 Week 15-16 April 11, 2022 April 24, 2022 @acumino v1.46 Week 17-18 April 25, 2022 May 8, 2022 @ialidzhikov v1.47 Week 19-20 May 9, 2022 May 22, 2022 @shafeeqes v1.48 Week 21-22 May 23, 2022 June 5, 2022 @ary1992 v1.49 Week 23-24 June 6, 2022 June 
19, 2022 @plkokanov v1.50 Week 25-26 June 20, 2022 July 3, 2022 @rfranzke v1.51 Week 27-28 July 4, 2022 July 17, 2022 @timebertt v1.52 Week 29-30 July 18, 2022 July 31, 2022 @acumino v1.53 Week 31-32 August 1, 2022 August 14, 2022 @kris94 v1.54 Week 33-34 August 15, 2022 August 28, 2022 @ialidzhikov v1.55 Week 35-36 August 29, 2022 September 11, 2022 @oliver-goetz v1.56 Week 37-38 September 12, 2022 September 25, 2022 @shafeeqes v1.57 Week 39-40 September 26, 2022 October 9, 2022 @ary1992 v1.58 Week 41-42 October 10, 2022 October 23, 2022 @plkokanov v1.59 Week 43-44 October 24, 2022 November 6, 2022 @rfranzke v1.60 Week 45-46 November 7, 2022 November 20, 2022 @acumino v1.61 Week 47-48 November 21, 2022 December 4, 2022 @ialidzhikov v1.62 Week 49-50 December 5, 2022 December 18, 2022 @oliver-goetz v1.63 Week 01-04 January 2, 2023 January 29, 2023 @shafeeqes v1.64 Week 05-06 January 30, 2023 February 12, 2023 @ary1992 v1.65 Week 07-08 February 13, 2023 February 26, 2023 @timuthy v1.66 Week 09-10 February 27, 2023 March 12, 2023 @plkokanov v1.67 Week 11-12 March 13, 2023 March 26, 2023 @rfranzke v1.68 Week 13-14 March 27, 2023 April 9, 2023 @acumino v1.69 Week 15-16 April 10, 2023 April 23, 2023 @oliver-goetz v1.70 Week 17-18 April 24, 2023 May 7, 2023 @ialidzhikov v1.71 Week 19-20 May 8, 2023 May 21, 2023 @shafeeqes v1.72 Week 21-22 May 22, 2023 June 4, 2023 @ary1992 v1.73 Week 23-24 June 5, 2023 June 18, 2023 @timuthy v1.74 Week 25-26 June 19, 2023 July 2, 2023 @oliver-goetz v1.75 Week 27-28 July 3, 2023 July 16, 2023 @rfranzke v1.76 Week 29-30 July 17, 2023 July 30, 2023 @plkokanov v1.77 Week 31-32 July 31, 2023 August 13, 2023 @ialidzhikov v1.78 Week 33-34 August 14, 2023 August 27, 2023 @acumino v1.79 Week 35-36 August 28, 2023 September 10, 2023 @shafeeqes v1.80 Week 37-38 September 11, 2023 September 24, 2023 @ScheererJ v1.81 Week 39-40 September 25, 2023 October 8, 2023 @ary1992 v1.82 Week 41-42 October 9, 2023 October 22, 2023 @timuthy v1.83 Week 43-44 
October 23, 2023 November 5, 2023 @oliver-goetz v1.84 Week 45-46 November 6, 2023 November 19, 2023 @rfranzke v1.85 Week 47-48 November 20, 2023 December 3, 2023 @plkokanov v1.86 Week 49-50 December 4, 2023 December 17, 2023 @ialidzhikov v1.87 Week 01-04 January 1, 2024 January 28, 2024 @acumino v1.88 Week 05-06 January 29, 2024 February 11, 2024 @timuthy v1.89 Week 07-08 February 12, 2024 February 25, 2024 @ScheererJ v1.90 Week 09-10 February 26, 2024 March 10, 2024 @ary1992 v1.91 Week 11-12 March 11, 2024 March 24, 2024 @shafeeqes v1.92 Week 13-14 March 25, 2024 April 7, 2024 @oliver-goetz v1.93 Week 15-16 April 8, 2024 April 21, 2024 @rfranzke v1.94 Week 17-18 April 22, 2024 May 5, 2024 @plkokanov v1.95 Week 19-20 May 6, 2024 May 19, 2024 @ialidzhikov v1.96 Week 21-22 May 20, 2024 June 2, 2024 @acumino v1.97 Week 23-24 June 3, 2024 June 16, 2024 @timuthy v1.98 Week 25-26 June 17, 2024 June 30, 2024 @ScheererJ v1.99 Week 27-28 July 1, 2024 July 14, 2024 @ary1992 v1.100 Week 29-30 July 15, 2024 July 28, 2024 @shafeeqes Release Validation The release phase for a new minor version lasts two weeks. Typically, the first week is used for the validation of the release. This phase includes the following steps:\n master (or latest release-* branch) is deployed to a development landscape that already hosts some existing seed and shoot clusters. An extended test suite is triggered by the “release responsible” which: executes the Gardener integration tests for different Kubernetes versions, infrastructures, and Shoot settings. executes the Kubernetes conformance tests. executes further tests like Kubernetes/OS patch/minor version upgrades. Additionally, every four hours (or on demand) more tests (e.g., including the Kubernetes e2e test suite) are executed for different infrastructures. The “release responsible” verifies new features or other notable changes (derived from the draft release notes) in this development system. 
Usually, the new release is triggered in the beginning of the second week if all tests are green, all checks were successful, and if all of the planned verifications were performed by the release responsible.\nContributing New Features or Fixes Please refer to the Gardener contributor guide. Besides a lot of general information, it also provides a checklist for newly created pull requests that may help you to prepare your changes for an efficient review process. If you are contributing a fix or major improvement, please take care to open cherry-pick PRs to all affected and still supported versions once the change is approved and merged in the master branch.\n⚠️ Please ensure that your modifications pass the verification checks (linting, formatting, static code checks, tests, etc.) by executing\nmake verify before filing your pull request.\nThe guide applies for both changes to the master and to any release-* branch. All changes must be submitted via a pull request and be reviewed and approved by at least one code owner.\nTODO Statements Sometimes, TODO statements are being introduced when one cannot follow up immediately with certain tasks or when temporary migration code is required. In order to properly follow-up with such TODOs and to prevent them from piling up without getting attention, the following rules should be followed:\n Each TODO statement should have an associated person and state when it can be removed. Example: // TODO(\u003cgithub-username\u003e): Remove this code after v1.75 has been released. When the task depends on a certain implementation, a GitHub issue should be opened and referenced in the statement. Example: // TODO(\u003cgithub-username\u003e): Remove this code after https://github.com/gardener/gardener/issues/\u003cissue-number\u003e has been implemented. 
The associated person should actively drive the implementation of the referenced issue (unless it cannot be done because of third-party dependencies or conditions) so that the TODO statement does not get stale. TODO statements without actionable tasks or those that are unlikely to ever be implemented (maybe because of very low priorities) should not be specified in the first place. If a TODO is specified, the associated person should make sure to actively follow-up. Deprecations and Backwards-Compatibility In case you have to remove functionality relevant to end-users (e.g., a field or default value in the Shoot API), please connect it with a Kubernetes minor version upgrade. This way, end-users are forced to actively adapt their manifests when they perform their Kubernetes upgrades. For example, the .spec.kubernetes.enableStaticTokenKubeconfig field in the Shoot API is no longer allowed to be set for Kubernetes versions \u003e= 1.27.\nIn case you have to remove or change functionality which cannot be directly connected with a Kubernetes version upgrade, please consider introducing a feature gate. This way, landscape operators can announce the planned changes to their users and communicate a timeline when they plan to activate the feature gate. End-users can then prepare for it accordingly. For example, the fact that changes to kubelet.kubeReserved in the Shoot API will lead to a rolling update of the worker nodes (previously, these changes were updated in-place) is controlled via the NewWorkerPoolHash feature gate.\nIn case you have to remove functionality relevant to Gardener extensions, please deprecate it first, and add a TODO statement to remove it only after at least 9 releases. Do not forget to write a proper release note as part of your pull request. This gives extension developers enough time (~18 weeks) to adapt to the changes (and to release a new version of their extension) before Gardener finally removes the functionality. 
Examples are removing a field in the extensions.gardener.cloud/v1alpha1 API group, or removing a controller in the extensions library.\nIn case you have to run migration code (which is mostly internal), please add a TODO statement to remove it only after 3 releases. This way, we can ensure that the Gardener version skew policy is not violated. For example, the migration code for moving the Prometheus instances under management of prometheus-operator was running for three releases.\n [!TIP] Please revisit the version skew policy.\n Cherry Picks This section explains how to initiate cherry picks on release branches within the gardener/gardener repository.\n Prerequisites Initiate a Cherry Pick Prerequisites Before you initiate a cherry pick, make sure that the following prerequisites are accomplished.\n A pull request merged against the master branch. The release branch exists (check in the branches section). Have the gardener/gardener repository cloned as follows: the origin remote should point to your fork (alternatively this can be overwritten by passing FORK_REMOTE=\u003cfork-remote\u003e). the upstream remote should point to the Gardener GitHub org (alternatively this can be overwritten by passing UPSTREAM_REMOTE=\u003cupstream-remote\u003e). Have hub installed, which is most easily installed via go get github.com/github/hub assuming you have a standard golang development environment. A GitHub token which has permissions to create a PR in an upstream branch. Initiate a Cherry Pick Run the [cherry pick script][cherry-pick-script].\nThis example applies a master branch PR #3632 to the remote branch upstream/release-v3.14:\nGITHUB_USER=\u003cyour-user\u003e hack/cherry-pick-pull.sh upstream/release-v3.14 3632 Be aware the cherry pick script assumes you have a git remote called upstream that points at the Gardener GitHub org.\n You will need to run the cherry pick script separately for each patch release you want to cherry pick to. 
Cherry picks should be applied to all active release branches where the fix is applicable.\n When asked for your GitHub password, provide the created GitHub token rather than your actual GitHub password. Refer to https://github.com/github/hub/issues/2655#issuecomment-735836048\n cherry-pick-script\n ","categories":"","description":"","excerpt":"Releases, Features, Hotfixes This document describes how to contribute …","ref":"/docs/gardener/process/","tags":"","title":"Process"},{"body":"Profiling Gardener Components Similar to Kubernetes, Gardener components support profiling using standard Go tools for analyzing CPU and memory usage by different code sections and more. This document shows how to enable and use profiling handlers with Gardener components.\nHow the profiling handlers are enabled and the ports on which they are exposed differ between components. However, once the handlers are enabled, they provide profiles via the same HTTP endpoint paths, from which you can retrieve them via curl/wget or directly using go tool pprof. 
(You might need to use kubectl port-forward in order to access HTTP endpoints of Gardener components running in clusters.)\nFor example (gardener-controller-manager):\n$ curl http://localhost:2718/debug/pprof/heap \u003e /tmp/heap-controller-manager $ go tool pprof /tmp/heap-controller-manager Type: inuse_space Time: Sep 3, 2021 at 10:05am (CEST) Entering interactive mode (type \"help\" for commands, \"o\" for options) (pprof) or\n$ go tool pprof http://localhost:2718/debug/pprof/heap Fetching profile over HTTP from http://localhost:2718/debug/pprof/heap Saved profile in /Users/timebertt/pprof/pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.008.pb.gz Type: inuse_space Time: Sep 3, 2021 at 10:05am (CEST) Entering interactive mode (type \"help\" for commands, \"o\" for options) (pprof) gardener-apiserver gardener-apiserver provides the same flags as kube-apiserver for enabling profiling handlers (enabled by default):\n--contention-profiling Enable lock contention profiling, if profiling is enabled --profiling Enable profiling via web interface host:port/debug/pprof/ (default true) The handlers are served on the same port as the API endpoints (configured via --secure-port). This means that you will also have to authenticate against the API server according to the configured authentication and authorization policy.\ngardener-{admission-controller,controller-manager,scheduler,resource-manager}, gardenlet gardener-controller-manager, gardener-admission-controller, gardener-scheduler, gardener-resource-manager and gardenlet also allow enabling profiling handlers via their respective component configs (currently disabled by default). Here is an example for the gardener-admission-controller’s configuration and how to enable it (it looks similar for the other components):\napiVersion: admissioncontroller.config.gardener.cloud/v1alpha1 kind: AdmissionControllerConfiguration # ... 
server: metrics: port: 2723 debugging: enableProfiling: true enableContentionProfiling: true Note that the handlers are served via HTTP on the port configured in server.metrics.port.\nFor example (gardener-admission-controller):\n$ curl http://localhost:2723/debug/pprof/heap \u003e /tmp/heap $ go tool pprof /tmp/heap ","categories":"","description":"","excerpt":"Profiling Gardener Components Similar to Kubernetes, Gardener …","ref":"/docs/gardener/monitoring/profiling/","tags":"","title":"Profiling"},{"body":"Project Operations This section demonstrates how to use kubectl, the standard Kubernetes command-line tool, for common cluster operations with emphasis on Gardener resources. For more information on kubectl, see kubectl on kubernetes.io.\n Project Operations Prerequisites Using kubeconfig for remote project operations Downloading your kubeconfig List Gardener API resources Check your permissions Working with projects Working with clusters List project clusters Create a new cluster Delete cluster Get kubeconfig for a Shoot Cluster Related Links Prerequisites You’re logged on to the Gardener Dashboard. You’ve created a cluster and its status is operational. It’s recommended that you get acquainted with the resources in the Gardener API.\nUsing kubeconfig for remote project operations The kubeconfig for project operations is different from the one for cluster operations. It has a larger scope and allows a different set of operations that are applicable for a project administrator role, such as lifecycle control on clusters and managing project members.\nDepending on your goal, you can create a service account suitable for automation and use it for your pipelines, or you can get a user-specific kubeconfig and use it to manage your project resources via kubectl.\nDownloading your kubeconfig Kubernetes doesn’t offer its own resource type for human users that access the API server. 
Instead, you either have to manage unique user strings, or use an OpenID-Connect (OIDC) compatible Identity Provider (IDP) to do the job.\nOnce the latter is set up, each Gardener user can use the kubelogin plugin for kubectl to authenticate against the API server:\n Set up kubelogin if you don’t have it yet. More information: kubelogin setup.\n Open the menu at the top right of the screen, then choose MY ACCOUNT.\n On the Access card, choose the arrow to see all options for the personalized command-line interface access.\n The personal bearer token that is also offered here only provides access for a limited amount of time for one time operations, for example, in curl commands. The kubeconfig provided for the personalized access is used by kubelogin to grant access to the Gardener API for the user permanently by using a refresh token.\n Check that the right Project is chosen and keep the settings otherwise. Download the kubeconfig file and add its path to the KUBECONFIG environment variable.\n You can now execute kubectl commands on the garden cluster using the identity of your user.\n Note: You can also manage your Gardener project resources automatically using a Gardener service account. 
For more information, see Automating Project Resource Management.\n List Gardener API resources Using a kubeconfig for project operations, you can list the Gardener API resources using the following command:\nkubectl api-resources | grep garden The response looks like this:\nbackupbuckets bbc core.gardener.cloud false BackupBucket backupentries bec core.gardener.cloud true BackupEntry cloudprofiles cprofile,cpfl core.gardener.cloud false CloudProfile controllerinstallations ctrlinst core.gardener.cloud false ControllerInstallation controllerregistrations ctrlreg core.gardener.cloud false ControllerRegistration plants pl core.gardener.cloud true Plant projects core.gardener.cloud false Project quotas squota core.gardener.cloud true Quota secretbindings sb core.gardener.cloud true SecretBinding seeds core.gardener.cloud false Seed shoots core.gardener.cloud true Shoot shootstates core.gardener.cloud true ShootState terminals dashboard.gardener.cloud true Terminal clusteropenidconnectpresets coidcps settings.gardener.cloud false ClusterOpenIDConnectPreset openidconnectpresets oidcps settings.gardener.cloud true OpenIDConnectPreset Enter the following command to view the Gardener API versions:\nkubectl api-versions | grep garden The response looks like this:\ncore.gardener.cloud/v1alpha1 core.gardener.cloud/v1beta1 dashboard.gardener.cloud/v1alpha1 settings.gardener.cloud/v1alpha1 Check your permissions The operations on project resources are limited by the role of the identity that tries to perform them. 
To get an overview of your permissions, use the following command:\nkubectl auth can-i --list | grep garden The response looks like this:\nplants.core.gardener.cloud [] [] [create delete deletecollection get list patch update watch] quotas.core.gardener.cloud [] [] [create delete deletecollection get list patch update watch] secretbindings.core.gardener.cloud [] [] [create delete deletecollection get list patch update watch] shoots.core.gardener.cloud [] [] [create delete deletecollection get list patch update watch] terminals.dashboard.gardener.cloud [] [] [create delete deletecollection get list patch update watch] openidconnectpresets.settings.gardener.cloud [] [] [create delete deletecollection get list patch update watch] cloudprofiles.core.gardener.cloud [] [] [get list watch] projects.core.gardener.cloud [] [flowering] [get patch update delete] namespaces [] [garden-flowering] [get] Try to execute an operation that you aren’t allowed to perform, for example:\nkubectl get projects You receive an error message like this:\nError from server (Forbidden): projects.core.gardener.cloud is forbidden: User \"system:serviceaccount:garden-flowering:robot\" cannot list resource \"projects\" in API group \"core.gardener.cloud\" at the cluster scope Working with projects You can get the details for a project of which you (or the service account) are a member.\nkubectl get project flowering The response looks like this:\nNAME NAMESPACE STATUS OWNER CREATOR AGE flowering garden-flowering Ready [PROJECT-ADMIN]@domain [PROJECT-ADMIN]@domain system 45m For more information, see Project in the API reference.\n To query the names of the members of a project, use the following command:\nkubectl get project flowering -o jsonpath='{.spec.members[*].name }' The response looks like this:\n[PROJECT-ADMIN]@domain system:serviceaccount:garden-flowering:robot For more information, see members in the API reference.\n Working with clusters The Gardener domain object for a managed cluster is called 
Shoot.\nList project clusters To query the clusters in a project:\nkubectl get shoots The output looks like this:\nNAME CLOUDPROFILE VERSION SEED DOMAIN HIBERNATION OPERATION PROGRESS APISERVER CONTROL NODES SYSTEM AGE geranium aws 1.18.3 aws-eu1 geranium.flowering.shoot.\u003ctruncated\u003e Awake Succeeded 100 True True True True 74m Create a new cluster To create a new cluster using the command line, you need a YAML definition of the Shoot resource.\n To get started, copy the following YAML definition to a new file, for example, daffodil.yaml (or copy file shoot.yaml to daffodil.yaml) and adapt it to your needs.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: daffodil namespace: garden-flowering spec: secretBindingName: trial-secretbinding-gcp cloudProfileName: gcp region: europe-west1 purpose: evaluation provider: type: gcp infrastructureConfig: kind: InfrastructureConfig apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 networks: workers: 10.250.0.0/16 controlPlaneConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 zone: europe-west1-c kind: ControlPlaneConfig workers: - name: cpu-worker maximum: 2 minimum: 1 maxSurge: 1 maxUnavailable: 0 machine: type: n1-standard-2 image: name: coreos version: 2303.3.0 volume: type: pd-standard size: 50Gi zones: - europe-west1-c networking: type: calico pods: 100.96.0.0/11 nodes: 10.250.0.0/16 services: 100.64.0.0/13 maintenance: timeWindow: begin: 220000+0100 end: 230000+0100 autoUpdate: kubernetesVersion: true machineImageVersion: true hibernation: enabled: true schedules: - start: '00 17 * * 1,2,3,4,5' location: Europe/Kiev kubernetes: allowPrivilegedContainers: true kubeControllerManager: nodeCIDRMaskSize: 24 kubeProxy: mode: IPTables version: 1.18.3 addons: nginxIngress: enabled: false kubernetesDashboard: enabled: false In your new YAML definition file, replace the value of field metadata.namespace with your namespace following the convention garden-[YOUR-PROJECTNAME].\n 
Create a cluster using this manifest (with flag --wait=false the command returns immediately, otherwise it doesn’t return until the process is finished):\nkubectl apply -f daffodil.yaml --wait=false The response looks like this:\nshoot.core.gardener.cloud/daffodil created It takes 5–10 minutes until the cluster is created. To watch the progress, get all shoots and use the -w flag.\nkubectl get shoots -w For a more extended example, see Gardener example shoot manifest.\nDelete cluster To delete a shoot cluster, you must first annotate the shoot resource to confirm the operation with confirmation.gardener.cloud/deletion: \"true\":\n Add the annotation to your manifest (daffodil.yaml in the previous example):\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: daffodil namespace: garden-flowering annotations: confirmation.gardener.cloud/deletion: \"true\" spec: addons: ... Apply your changes of daffodil.yaml.\nkubectl apply -f daffodil.yaml The response looks like this:\nshoot.core.gardener.cloud/daffodil configured Trigger the deletion.\nkubectl delete shoot daffodil --wait=false The response looks like this:\nshoot.core.gardener.cloud \"daffodil\" deleted It takes 5–10 minutes to delete the cluster. To watch the progress, get all shoots and use the -w flag.\nkubectl get shoots -w Get kubeconfig for a Shoot Cluster To get the kubeconfig for a shoot cluster in Gardener from the command line, use one of the following methods:\n Using shoots/admin/kubeconfig Subresource:\n You can obtain a temporary admin kubeconfig by using the shoots/admin/kubeconfig subresource. Detailed instructions can be found in the Gardener documentation here. Using gardenctl and gardenlogin: gardenctl simplifies targeting Shoot clusters. It automatically downloads a kubeconfig that uses the gardenlogin kubectl auth plugin. 
This plugin transparently manages Shoot cluster authentication and certificate renewal without embedding any credentials in the kubeconfig file.\n When installing gardenctl via Homebrew or Chocolatey, gardenlogin will be installed as a dependency. Refer to the installation instructions here. Both tools can share the same configuration. To set up the tools, refer to the documentation here. To get the kubeconfig, use either the target or kubeconfig command: Target Command: This command targets the specified Shoot cluster and automatically downloads the kubeconfig.\ngardenctl target --garden landscape-dev --project my-project --shoot my-shoot To set the KUBECONFIG environment variable to point to the downloaded kubeconfig file, use the following command (for bash):\neval $(gardenctl kubectl-env bash) Detailed instructions can be found here.\n Kubeconfig Command: This command directly downloads the kubeconfig for the specified Shoot cluster and outputs it in raw format.\ngardenctl kubeconfig --garden landscape-dev --project my-project --shoot my-shoot --raw Related Links Automating Project Resource Management Authenticating with an Identity Provider. ","categories":"","description":"","excerpt":"Project Operations This section demonstrates how to use the standard …","ref":"/docs/dashboard/project-operations/","tags":"","title":"Project Operations"},{"body":"Extending Project Roles The Project resource allows to specify a list of roles for every member (.spec.members[*].roles). There are a few standard roles defined by Gardener itself. Please consult Projects for further information.\nHowever, extension controllers running in the garden cluster may also create CustomResourceDefinitions that project members might be able to CRUD. 
For this purpose, Gardener also allows to specify extension roles.\nAn extension role is prefixed with extension:, e.g.\napiVersion: core.gardener.cloud/v1beta1 kind: Project metadata: name: dev spec: members: - apiGroup: rbac.authorization.k8s.io kind: User name: alice.doe@example.com role: admin roles: - owner - extension:foo The project controller will, for every extension role, create a ClusterRole with name gardener.cloud:extension:project:\u003cprojectName\u003e:\u003croleName\u003e, i.e., for the above example: gardener.cloud:extension:project:dev:foo. This ClusterRole aggregates other ClusterRoles that are labeled with rbac.gardener.cloud/aggregate-to-extension-role=foo which might be created by extension controllers.\nAn extension that might want to contribute to the core admin or viewer roles can use the labels rbac.gardener.cloud/aggregate-to-project-member=true or rbac.gardener.cloud/aggregate-to-project-viewer=true, respectively.\nPlease note that the names of the extension roles are restricted to 20 characters!\nMoreover, the project controller will also create a corresponding RoleBinding with the same name in the project namespace. It will automatically assign all members that are assigned to this extension role.\n","categories":"","description":"","excerpt":"Extending Project Roles The Project resource allows to specify a list …","ref":"/docs/gardener/extensions/project-roles/","tags":"","title":"Project Roles"},{"body":"Projects The Gardener API server supports a cluster-scoped Project resource which is used for data isolation between individual Gardener consumers. 
For example, each development team has its own project to manage its own shoot clusters.\nEach Project is backed by a Kubernetes Namespace that contains the actual related Kubernetes resources, like Secrets or Shoots.\nExample resource:\napiVersion: core.gardener.cloud/v1beta1 kind: Project metadata: name: dev spec: namespace: garden-dev description: \"This is my first project\" purpose: \"Experimenting with Gardener\" owner: apiGroup: rbac.authorization.k8s.io kind: User name: john.doe@example.com members: - apiGroup: rbac.authorization.k8s.io kind: User name: alice.doe@example.com role: admin # roles: # - viewer # - uam # - serviceaccountmanager # - extension:foo - apiGroup: rbac.authorization.k8s.io kind: User name: bob.doe@example.com role: viewer # tolerations: # defaults: # - key: \u003csome-key\u003e # whitelist: # - key: \u003csome-key\u003e The .spec.namespace field is optional and is initialized if unset. The name of the resulting namespace will be determined based on the Project name and UID, e.g., garden-dev-5aef3. It’s also possible to adopt existing namespaces by labeling them gardener.cloud/role=project and project.gardener.cloud/name=dev beforehand (otherwise, they cannot be adopted).\nWhen deleting a Project resource, the corresponding namespace is also deleted. To keep a namespace after project deletion, an administrator/operator (not Project members!) can annotate the project-namespace with namespace.gardener.cloud/keep-after-project-deletion.\nThe .spec.description and .spec.purpose fields can be used to describe to fellow team members and Gardener operators what this project is used for.\nEach project has one dedicated owner, configured in .spec.owner using the rbac.authorization.k8s.io/v1.Subject type. The owner is the main contact person for Gardener operators. 
Please note that the .spec.owner field is deprecated and will be removed in future API versions in favor of the owner role, see below.\nThe list of members (again a list in .spec.members[] using the rbac.authorization.k8s.io/v1.Subject type) contains all the people that are associated with the project in any way. Each project member must have at least one role (currently described in .spec.members[].role, additional roles can be added to .spec.members[].roles[]). The following roles exist:\n admin: This allows to fully manage resources inside the project (e.g., secrets, shoots, configmaps, and similar). Mind that the admin role has read only access to service accounts. serviceaccountmanager: This allows to fully manage service accounts inside the project namespace and request tokens for them. The permissions of the created service accounts are instead managed by the admin role. Please refer to Service Account Manager. uam: This allows to add/modify/remove human users or groups to/from the project member list. viewer: This allows to read all resources inside the project except secrets. owner: This combines the admin, uam, and serviceaccountmanager roles. Extension roles (prefixed with extension:): Please refer to Extending Project Roles. The project controller inside the Gardener Controller Manager is managing RBAC resources that grant the described privileges to the respective members.\nThere are three central ClusterRoles gardener.cloud:system:project-member, gardener.cloud:system:project-viewer, and gardener.cloud:system:project-serviceaccountmanager that grant the permissions for namespaced resources (e.g., Secrets, Shoots, ServiceAccounts). Via referring RoleBindings created in the respective namespace the project members get bound to these ClusterRoles and, thus, the needed permissions. 
There are also project-specific ClusterRoles granting the permissions for cluster-scoped resources, e.g., the Namespace or Project itself.\nFor each role, the following ClusterRoles, ClusterRoleBindings, and RoleBindings are created:\n Role ClusterRole ClusterRoleBinding RoleBinding admin gardener.cloud:system:project-member:\u003cprojectName\u003e gardener.cloud:system:project-member:\u003cprojectName\u003e gardener.cloud:system:project-member serviceaccountmanager gardener.cloud:system:project-serviceaccountmanager uam gardener.cloud:system:project-uam:\u003cprojectName\u003e gardener.cloud:system:project-uam:\u003cprojectName\u003e viewer gardener.cloud:system:project-viewer:\u003cprojectName\u003e gardener.cloud:system:project-viewer:\u003cprojectName\u003e gardener.cloud:system:project-viewer owner gardener.cloud:system:project:\u003cprojectName\u003e gardener.cloud:system:project:\u003cprojectName\u003e extension:* gardener.cloud:extension:project:\u003cprojectName\u003e:\u003cextensionRoleName\u003e gardener.cloud:extension:project:\u003cprojectName\u003e:\u003cextensionRoleName\u003e User Access Management For Projects created before Gardener v1.8, all admins were allowed to manage other members. Beginning with v1.8, the new uam role is being introduced. It is backed by the manage-members custom RBAC verb which allows to add/modify/remove human users or groups to/from the project member list. Human users are subjects with kind=User and name!=system:serviceaccount:*, and groups are subjects with kind=Group. The management of service account subjects (kind=ServiceAccount or name=system:serviceaccount:*) is not controlled via the uam custom verb but with the standard update/patch verbs for projects.\nAll newly created projects will only bind the owner to the uam role. The owner can still grant the uam role to other members if desired. 
For projects created before Gardener v1.8, the Gardener Controller Manager will migrate all projects to also assign the uam role to all admin members (to not break existing use-cases). The corresponding migration logic is present in Gardener Controller Manager from v1.8 to v1.13. The project owner can gradually remove these roles if desired.\nStale Projects When a project is not actively used for some period of time, it is marked as “stale”. This is done by a controller called “Stale Projects Reconciler”. Once a project is marked as stale, that controller deletes it if it remains unused for a further period of time.\nFour-Eyes-Principle For Resource Deletion In order to delete a Shoot, the deletion must be confirmed upfront with the confirmation.gardener.cloud/deletion=true annotation. Without this annotation being set, gardener-apiserver denies any DELETE request. Still, users sometimes shoot themselves in the foot by accidentally deleting a Shoot despite the confirmation requirement.\nTo prevent that (or make it harder, at least), the Project can be configured to apply the dual approval concept for Shoot deletion. This means that the subject confirming the deletion must not be the same as the subject sending the DELETE request.\nExample:\nspec: dualApprovalForDeletion: - resource: shoots selector: matchLabels: {} includeServiceAccounts: true [!NOTE] As of today, core.gardener.cloud/v1beta1.Shoot is the only resource for which this concept is implemented.\n As usual, .spec.dualApprovalForDeletion[].selector.matchLabels={} matches all resources, .spec.dualApprovalForDeletion[].selector.matchLabels=null matches none at all. 
You can also specify an individual label selector if this concept shall only apply to a subset of the Shoots in the project (e.g., CI/development clusters shall be excluded).\nThe includeServiceAccounts field (default: true) controls whether the concept also applies when the Shoot deletion confirmation and actual deletion are triggered via ServiceAccounts. Setting it to false spares CI jobs from having to follow this concept, which would add complexity/overhead. Alternatively, you could also use two ServiceAccounts, one for confirming the deletion, and another one for actually sending the DELETE request, if desired.\n [!IMPORTANT] Project members can still change the labels of Shoots (or the selector itself) to circumvent the dual approval concept. This concern is intentionally excluded/ignored for now since the principle is not a “security feature” but shall just help prevent accidental deletion.\n ","categories":"","description":"Project operations and roles. Four-Eyes-Principle for resource deletion","excerpt":"Project operations and roles. Four-Eyes-Principle for resource …","ref":"/docs/gardener/projects/","tags":"","title":"Projects"},{"body":"Gardener Extension for Alicloud provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the Alicloud provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.30 1.30.0+ Kubernetes 1.29 1.29.0+ Kubernetes 1.28 1.28.0+ Kubernetes 1.27 1.27.0+ Kubernetes 1.26 1.26.0+ Kubernetes 1.25 1.25.0+ Please take a look here to see which versions are supported by Gardener in general.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the Alibaba cloud provider","excerpt":"Gardener extension controller for the Alibaba cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/","tags":"","title":"Provider Alicloud"},{"body":"Gardener Extension for AWS provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the AWS provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.30 1.30.0+ Kubernetes 1.29 1.29.0+ Kubernetes 1.28 1.28.0+ Kubernetes 1.27 1.27.0+ Kubernetes 1.26 1.26.0+ Kubernetes 1.25 1.25.0+ Please take a look here to see which versions are supported by Gardener in general.\nCompatibility The following lists known compatibility issues of this extension controller with other Gardener components.\n AWS Extension Gardener Action Notes \u003c= v1.15.0 \u003ev1.10.0 Please update the provider version to \u003e v1.15.0 or disable the feature gate MountHostCADirectories in the Gardenlet. Applies if feature flag MountHostCADirectories in the Gardenlet is enabled. Shoots with CSI enabled (Kubernetes version \u003e= 1.18) miss a mount to the directory /etc/ssl in the Shoot API Server. This can lead to not trusting external Root CAs when the API Server makes requests via webhooks or OIDC. How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the AWS cloud provider","excerpt":"Gardener extension controller for the AWS cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/","tags":"","title":"Provider AWS"},{"body":"Gardener Extension for Azure provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the Azure provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.31 1.31.0+ N/A Kubernetes 1.30 1.30.0+ Kubernetes 1.29 1.29.0+ Kubernetes 1.28 1.28.0+ Kubernetes 1.27 1.27.0+ Kubernetes 1.26 1.26.0+ Kubernetes 1.25 1.25.0+ Please take a look here to see which versions are supported by Gardener in general.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the Azure cloud provider","excerpt":"Gardener extension controller for the Azure cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/","tags":"","title":"Provider Azure"},{"body":"Gardener Extension for Equinix Metal provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the Equinix Metal provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.30 untested N/A Kubernetes 1.29 untested N/A Kubernetes 1.28 untested N/A Kubernetes 1.27 untested N/A Kubernetes 1.26 untested N/A Kubernetes 1.25 untested N/A Please take a look here to see which versions are supported by Gardener in general.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the Equinix Metal cloud provider","excerpt":"Gardener extension controller for the Equinix Metal cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-equinix-metal/","tags":"","title":"Provider Equinix Metal"},{"body":"Gardener Extension for GCP provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the GCP provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.30 1.30.0+ Kubernetes 1.29 1.29.0+ Kubernetes 1.28 1.28.0+ Kubernetes 1.27 1.27.0+ Kubernetes 1.26 1.26.0+ Kubernetes 1.25 1.25.0+ Please take a look here to see which versions are supported by Gardener in general.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the GCP cloud provider","excerpt":"Gardener extension controller for the GCP cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/","tags":"","title":"Provider GCP"},{"body":"Packages:\n local.provider.extensions.gardener.cloud/v1alpha1 local.provider.extensions.gardener.cloud/v1alpha1 Package v1alpha1 contains the local provider API resources.\nResource Types: CloudProfileConfig WorkerStatus CloudProfileConfig CloudProfileConfig contains provider-specific configuration that is embedded into Gardener’s CloudProfile resource.\n Field Description apiVersion string local.provider.extensions.gardener.cloud/v1alpha1 kind string CloudProfileConfig machineImages []MachineImages MachineImages is the list of machine images that are understood by the controller. It maps logical names and versions to provider-specific identifiers.\n WorkerStatus WorkerStatus contains information about created worker resources.\n Field Description apiVersion string local.provider.extensions.gardener.cloud/v1alpha1 kind string WorkerStatus machineImages []MachineImage (Optional) MachineImages is a list of machine images that have been used in this worker. Usually, the extension controller gets the mapping from name/version to the provider-specific machine image data from the CloudProfile. However, if a version that is still in use gets removed from this componentconfig, the controller can no longer reconcile existing Worker resources that are still using this version. 
Hence, it stores the used versions in the provider status to ensure reconciliation is possible.\n MachineImage (Appears on: WorkerStatus) MachineImage is a mapping from logical names and versions to provider-specific machine image data.\n Field Description name string Name is the logical name of the machine image.\n version string Version is the logical version of the machine image.\n image string Image is the image for the machine image.\n MachineImageVersion (Appears on: MachineImages) MachineImageVersion contains a version and a provider-specific identifier.\n Field Description version string Version is the version of the image.\n image string Image is the image for the machine image.\n MachineImages (Appears on: CloudProfileConfig) MachineImages is a mapping from logical names and versions to provider-specific identifiers.\n Field Description name string Name is the logical name of the machine image.\n versions []MachineImageVersion Versions contains versions and a provider-specific identifier.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n local.provider.extensions.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/provider-local/","tags":"","title":"Provider Local"},{"body":"Local Provider Extension The “local provider” extension is used to allow the usage of seed and shoot clusters which run entirely locally without any real infrastructure or cloud provider involved. 
It implements Gardener’s extension contract (GEP-1) and thus comprises several controllers and webhooks acting on resources in seed and shoot clusters.\nThe code is maintained in pkg/provider-local.\nMotivation The motivation for maintaining such an extension is the following:\n 🛡 Output Qualification: Run fast and cost-efficient end-to-end tests, locally and in CI systems (increased confidence ⛑ before merging pull requests) ⚙️ Development Experience: Develop Gardener entirely on a local machine without any external resources involved (improved costs 💰 and productivity 🚀) 🤝 Open Source: Quick and easy setup for a first evaluation of Gardener and a good basis for first contributions Current Limitations The following lists the current limitations of the implementation. Please note that none of them are technical limitations/blockers; they are simply advanced scenarios that we haven’t invested in yet.\n No load balancers for Shoot clusters.\nWe have not yet developed a cloud-controller-manager which could reconcile load balancer Services in the shoot cluster.\n In case a seed cluster with multiple availability zones, i.e. multiple entries in .spec.provider.zones, is used in conjunction with a single-zone shoot control plane, i.e. a shoot cluster without .spec.controlPlane.highAvailability or with .spec.controlPlane.highAvailability.failureTolerance.type set to node, the local address of the API server endpoint needs to be determined manually or via the in-cluster coredns.\nAs the different istio ingress gateway loadbalancers have individual external IP addresses, single-zone shoot control planes can end up in a random availability zone. Having the local host use the coredns in the cluster as name resolver would form a name resolution cycle. 
The tests mitigate the issue by adapting the DNS configuration inside the affected test.\n ManagedSeeds It is possible to deploy ManagedSeeds with provider-local by first creating a Shoot in the garden namespace and then creating a referencing ManagedSeed object.\n Please note that this is only supported by the Skaffold-based setup.\n The corresponding e2e test can be run via:\n./hack/test-e2e-local.sh --label-filter \"ManagedSeed\" Implementation Details The images locally built by Skaffold for the Gardener components which are deployed to this shoot cluster are managed by a container registry in the registry namespace in the kind cluster. provider-local configures this registry as mirror for the shoot by mutating the OperatingSystemConfig and using the default contract for extending the containerd configuration.\nIn order to bootstrap a seed cluster, the gardenlet deploys PersistentVolumeClaims and Services of type LoadBalancer. While storage is supported in shoot clusters by using the local-path-provisioner, load balancers are not supported yet. However, provider-local runs a Service controller which specifically reconciles the seed-related Services of type LoadBalancer. This way, they get an IP and gardenlet can finish its bootstrapping process. 
Note that these IPs are not reachable, however for the sake of developing ManagedSeeds this is sufficient for now.\nAlso, please note that the provider-local extension only gets deployed because of the Always deployment policy in its corresponding ControllerRegistration and because the DNS provider type of the seed is set to local.\nImplementation Details This section contains information about how the respective controllers and webhooks in provider-local are implemented and what their purpose is.\nBootstrapping The Helm chart of the provider-local extension defined in its ControllerDeployment contains a special deployment for a CoreDNS instance in a gardener-extension-provider-local-coredns namespace in the seed cluster.\nThis CoreDNS instance is responsible for enabling the components running in the shoot clusters to be able to resolve the DNS names when they communicate with their kube-apiservers.\nIt contains a static configuration to resolve the DNS names based on local.gardener.cloud to istio-ingressgateway.istio-ingress.svc.\nControllers There are controllers for all resources in the extensions.gardener.cloud/v1alpha1 API group except for BackupBucket and BackupEntrys.\nControlPlane This controller is deploying the local-path-provisioner as well as a related StorageClass in order to support PersistentVolumeClaims in the local shoot cluster. Additionally, it creates a few (currently unused) dummy secrets (CA, server and client certificate, basic auth credentials) for the sake of testing the secrets manager integration in the extensions library.\nDNSRecord The controller adapts the cluster internal DNS configuration by extending the coredns configuration for every observed DNSRecord. 
It will add two corresponding entries in the custom DNS configuration per shoot cluster:\ndata: api.local.local.external.local.gardener.cloud.override: | rewrite stop name regex api.local.local.external.local.gardener.cloud istio-ingressgateway.istio-ingress.svc.cluster.local answer auto api.local.local.internal.local.gardener.cloud.override: | rewrite stop name regex api.local.local.internal.local.gardener.cloud istio-ingressgateway.istio-ingress.svc.cluster.local answer auto Infrastructure This controller generates a NetworkPolicy which allows the control plane pods (like kube-apiserver) to communicate with the worker machine pods (see Worker section).\nNetwork This controller is not implemented anymore. In the initial version of provider-local, there was a Network controller deploying kindnetd (see release v1.44.1). However, we decided to drop it because this setup prevented us from using NetworkPolicys (kindnetd does not ship a NetworkPolicy controller). In addition, we had issues with shoot clusters having more than one node (hence, we couldn’t support rolling updates, see PR #5666).\nOperatingSystemConfig This controller renders a simple cloud-init template which can later be executed by the shoot worker nodes.\nThe shoot worker nodes are Pods with a container based on the kindest/node image. This is maintained in the gardener/machine-controller-manager-provider-local repository and has a special run-userdata systemd service which executes the cloud-init generated earlier by the OperatingSystemConfig controller.\nWorker This controller leverages the standard generic Worker actuator in order to deploy the machine-controller-manager as well as the machine-controller-manager-provider-local.\nAdditionally, it generates the MachineClasses and the MachineDeployments based on the specification of the Worker resources.\nIngress The gardenlet creates a wildcard DNS record for the Seed’s ingress domain pointing to the nginx-ingress-controller’s LoadBalancer. 
This domain is commonly used by all Ingress objects created in the Seed for Seed and Shoot components. As provider-local implements the DNSRecord extension API (see the DNSRecord section), this controller reconciles all Ingresss and creates DNSRecords of type local for each host included in spec.rules. This only happens for shoot namespaces (gardener.cloud/role=shoot label) to make Ingress domains resolvable on the machine pods.\nService This controller reconciles Services of type LoadBalancer in the local Seed cluster. Since the local Kubernetes clusters used as Seed clusters typically don’t support such services, this controller sets the .status.loadBalancer.ingress[0].ip to the IP of the host. It makes important LoadBalancer Services (e.g. istio-ingress/istio-ingressgateway and garden/nginx-ingress-controller) available to the host by setting spec.ports[].nodePort to well-known ports that are mapped to hostPorts in the kind cluster configuration.\nistio-ingress/istio-ingressgateway is set to be exposed on nodePort 30433 by this controller.\nIn case the seed has multiple availability zones (.spec.provider.zones) and it uses SNI, the different zone-specific istio-ingressgateway loadbalancers are exposed via different IP addresses. By default, IP addresses 172.18.255.10, 172.18.255.11, and 172.18.255.12 are used for the zones 0, 1, and 2 respectively.\nETCD Backups This controller reconciles the BackupBucket and BackupEntry of the shoot allowing the etcd-backup-restore to create and copy backups using the local provider functionality. The backups are stored on the host file system. This is achieved by mounting that directory to the etcd-backup-restore container.\nExtension Seed This controller reconciles Extensions of type local-ext-seed. It creates a single serviceaccount named local-ext-seed in the shoot’s namespace in the seed. The extension is reconciled before the kube-apiserver. 
More on extension lifecycle strategies can be read in Registering Extension Controllers.\nExtension Shoot This controller reconciles Extensions of type local-ext-shoot. It creates a single serviceaccount named local-ext-shoot in the kube-system namespace of the shoot. The extension is reconciled after the kube-apiserver. More on extension lifecycle strategies can be read in Registering Extension Controllers.\nExtension Shoot After Worker This controller reconciles Extensions of type local-ext-shoot-after-worker. It creates a deployment named local-ext-shoot-after-worker in the kube-system namespace of the shoot. The extension is reconciled after the workers and waits until the deployment is ready. More on extension lifecycle strategies can be read in Registering Extension Controllers.\nHealth Checks The health check controller leverages the health check library in order to:\n check the health of the ManagedResource/extension-controlplane-shoot-webhooks and populate the SystemComponentsHealthy condition in the ControlPlane resource. check the health of the ManagedResource/extension-networking-local and populate the SystemComponentsHealthy condition in the Network resource. check the health of the ManagedResource/extension-worker-mcm-shoot and populate the SystemComponentsHealthy condition in the Worker resource. check the health of the Deployment/machine-controller-manager and populate the ControlPlaneHealthy condition in the Worker resource. check the health of the Nodes and populate the EveryNodeReady condition in the Worker resource. Webhooks Control Plane This webhook reacts on the OperatingSystemConfig containing the configuration of the kubelet and sets the failSwapOn to false (independent of what is configured in the Shoot spec) (ref).\nDNS Config This webhook reacts on events for the dependency-watchdog-probe Deployment, the blackbox-exporter Deployment, as well as on events for Pods created when the machine-controller-manager reconciles Machines. 
All these pods need to be able to resolve the DNS names for shoot clusters. It sets the .spec.dnsPolicy=None and .spec.dnsConfig.nameServers to the cluster IP of the coredns Service created in the gardener-extension-provider-local-coredns namespace so that these pods can resolve the DNS records for shoot clusters (see the Bootstrapping section for more details).\nMachine Controller Manager This webhook mutates the global ClusterRole related to machine-controller-manager and injects permissions for Service resources. The machine-controller-manager-provider-local deploys Pods for each Machine (whereas real infrastructure providers deploy VMs, so no Kubernetes resources directly). It also deploys a Service for these machine pods, and in order to do so, the ClusterRole must grant the needed permissions for Service resources.\nNode This webhook reacts on updates to nodes/status in both seed and shoot clusters and sets the .status.{allocatable,capacity}.cpu=\"100\" and .status.{allocatable,capacity}.memory=\"100Gi\" fields.\nBackground: Typically, the .status.{capacity,allocatable} values are determined by the resources configured for the Docker daemon (see for example the docker Quick Start Guide for Mac). Since many of the Pods deployed by Gardener have quite high .spec.resources.requests, the Nodes easily get filled up and only a few Pods can be scheduled (even if they barely consume any of their reserved resources). In order to improve the user experience, on startup/leader election the provider-local extension submits an empty patch which triggers the “node webhook” (see the below section) for the seed cluster. The webhook will increase the capacity of the Nodes to allow all Pods to be scheduled. 
For the shoot clusters, this empty patch trigger is not needed since the MutatingWebhookConfiguration is reconciled by the ControlPlane controller and exists before the Node object gets registered.\nShoot This webhook reacts on the ConfigMap used by the kube-proxy and sets the maxPerCore field to 0 since other values don’t work well in conjunction with the kindest/node image which is used as base for the shoot worker machine pods (ref).\nDNS Configuration for Multi-Zonal Seeds In case a seed cluster has multiple availability zones as specified in .spec.provider.zones, multiple istio ingress gateways are deployed, one per availability zone in addition to the default deployment. The result is that single-zone shoot control planes, i.e. shoot clusters without .spec.controlPlane.highAvailability or with .spec.controlPlane.highAvailability.failureTolerance.type set to node, may be exposed via any of the zone-specific istio ingress gateways. Previously, the endpoints were statically mapped via /etc/hosts. Unfortunately, this is no longer possible because the endpoint is now selected dynamically.\nFor multi-zonal seed clusters, there is an additional configuration following coredns’s view plugin mapping the external IP addresses of the zone-specific loadbalancers to the corresponding internal istio ingress gateway domain names. This configuration is only in place for requests from outside of the seed cluster. Those requests are currently being identified by the protocol. UDP requests are interpreted as originating from within the seed cluster while TCP requests are assumed to come from outside the cluster via the docker hostport mapping.\nThe corresponding test sets the DNS configuration accordingly so that the name resolution during the test uses coredns in the cluster.\nFuture Work Future work could mostly focus on resolving the above listed limitations, i.e.:\n Implement a cloud-controller-manager and deploy it via the ControlPlane controller. 
Properly implement .spec.machineTypes in the CloudProfiles (i.e., configure .spec.resources properly for the created shoot worker machine pods). ","categories":"","description":"","excerpt":"Local Provider Extension The “local provider” extension is used to …","ref":"/docs/gardener/extensions/provider-local/","tags":"","title":"Provider Local"},{"body":"Gardener Extension for OpenStack provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the OpenStack provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.31 1.31.0+ N/A Kubernetes 1.30 1.30.0+ Kubernetes 1.29 1.29.0+ Kubernetes 1.28 1.28.0+ Kubernetes 1.27 1.27.0+ Kubernetes 1.26 1.26.0+ Kubernetes 1.25 1.25.0+ Please take a look here to see which versions are supported by Gardener in general.\n Compatibility The following lists known compatibility issues of this extension controller with other Gardener components.\n OpenStack Extension Gardener Action Notes \u003c v1.12.0 \u003e v1.10.0 Please update the provider version to \u003e= v1.12.0 or disable the feature gate MountHostCADirectories in the Gardenlet. 
Applies if feature flag MountHostCADirectories in the Gardenlet is enabled. This is to prevent duplicate volume mounts to /usr/share/ca-certificates in the Shoot API Server. How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the OpenStack cloud provider","excerpt":"Gardener extension controller for the OpenStack cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-openstack/","tags":"","title":"Provider Openstack"},{"body":"Overview When opening a pull request, it is best to give all the necessary details in order to help the reviewers understand your changes and why you are proposing them. 
Here is the template that you will need to fill out:\n**What this PR does / why we need it**: \u003c!-- Describe the purpose of this PR and what changes have been proposed in it --\u003e **Which issue(s) this PR fixes**: Fixes # \u003c!-- If you are opening a PR in response to a specific issue, linking it will automatically close the issue once the PR has been merged --\u003e **Special notes for your reviewer**: \u003c!-- Any additional information your reviewer might need to know to better process your PR --\u003e **Release note**: \u003c!-- Write your release note: 1.Enter your release note in the below block. 2.If no release note is required, just write \"NONE\" within the block. Format of block header: \u003ccategory\u003e \u003ctarget_group\u003e Possible values: - category: improvement|noteworthy|action - target_group: user|operator|developer --\u003e ```other operator EXAMPLE \\``` Writing Release Notes Some guidelines and tips for writing release notes include:\n Be as descriptive as needed. Only use lists if you are describing multiple different additions. You can freely use markdown formatting, including links. You can find various examples in the Releases sections of the gardener/documentation and gardener/gardener repositories.\n","categories":"","description":"","excerpt":"Overview When opening a pull request, it is best to give all the …","ref":"/docs/contribute/documentation/pr-description/","tags":"","title":"Pull Request Description"},{"body":"Readiness of Shoot Worker Nodes Background When registering new Nodes, kubelet adds the node.kubernetes.io/not-ready taint to prevent scheduling workload Pods to the Node until the Ready condition gets True. However, the kubelet does not consider the readiness of node-critical Pods. 
Hence, the Ready condition might get True and the node.kubernetes.io/not-ready taint might get removed, for example, before the CNI daemon Pod (e.g., calico-node) has successfully placed the CNI binaries on the machine.\nThis problem has been discussed extensively in kubernetes, e.g., in kubernetes/kubernetes#75890. However, several proposals have been rejected because the problem can be solved by using the --register-with-taints kubelet flag and dedicated controllers (ref).\nImplementation in Gardener Gardener makes sure that workload Pods are only scheduled to Nodes where all node-critical components required for running workload Pods are ready. For this, Gardener follows the proposed solution by the Kubernetes community and registers new Node objects with the node.gardener.cloud/critical-components-not-ready taint (effect NoSchedule). gardener-resource-manager’s Node controller reacts on newly created Node objects that have this taint. The controller removes the taint once all node-critical Pods are ready (determined by checking the Pods’ Ready conditions).\nThe Node controller considers all DaemonSets and Pods with the label node.gardener.cloud/critical-component=true as node-critical. If there are DaemonSets that contain the node.gardener.cloud/critical-component=true label in their metadata and in their Pod template, the Node controller waits for corresponding daemon Pods to be scheduled and to get ready before removing the taint.\nAdditionally, the Node controller checks for the readiness of csi-driver-node components if a respective Pod indicates that it uses such a driver. This is achieved through a well-defined annotation prefix (node.gardener.cloud/wait-for-csi-node-). For example, the csi-driver-node Pod for Openstack Cinder is annotated with node.gardener.cloud/wait-for-csi-node-cinder=cinder.csi.openstack.org. A key prefix is used instead of a “regular” annotation to allow for multiple CSI drivers being registered by one csi-driver-node Pod. 
The annotation key’s suffix can be chosen arbitrarily (in this case cinder) and the annotation value needs to match the actual driver name as specified in the CSINode object. The Node controller will verify that the used driver is properly registered in this object before removing the node.gardener.cloud/critical-components-not-ready taint. Note that the csi-driver-node Pod still needs to be labelled and tolerate the taint as described above to be considered in this additional check.\nMarking Node-Critical Components To make use of this feature, node-critical DaemonSets and Pods need to:\n Tolerate the node.gardener.cloud/critical-components-not-ready NoSchedule taint. Be labelled with node.gardener.cloud/critical-component=true. csi-driver-node Pods additionally need to:\n Be annotated with node.gardener.cloud/wait-for-csi-node-\u003cname\u003e=\u003cfull-driver-name\u003e. It’s required that these Pods fulfill the above criteria (label and toleration) as well. Gardener already marks components like kube-proxy, apiserver-proxy and node-local-dns as node-critical. Provider extensions mark components like csi-driver-node as node-critical and add the wait-for-csi-node annotation. Network extensions mark components responsible for setting up CNI on worker Nodes (e.g., calico-node) as node-critical. If shoot owners manage any additional node-critical components, they can make use of this feature as well.\n","categories":"","description":"Implementation in Gardener for readiness of Shoot worker Nodes. How to mark components as node-critical","excerpt":"Implementation in Gardener for readiness of Shoot worker Nodes. How to …","ref":"/docs/gardener/node-readiness/","tags":"","title":"Readiness of Shoot Worker Nodes"},{"body":"Reconcile Trigger Gardener dictates the time of reconciliation for resources of the API group extensions.gardener.cloud. It does that by annotating the respective resource with gardener.cloud/operation=reconcile. 
Extension controllers shall react to this annotation and start reconciling the resource. They have to remove this annotation as soon as they begin with their reconcile operation and maintain the status of the extension resource accordingly.\nThe reason for this behaviour is that it is possible to configure Gardener to reconcile only in the shoots’ maintenance time windows. To prevent extension controllers from reconciling outside of the shoot’s maintenance time window, we have introduced this contract. This way, extension controllers don’t need to care about when the shoot maintenance time window happens. Gardener keeps control and decides when the shoot shall be reconciled/updated.\nOur extension controller library provides all the required utilities to conveniently implement this behaviour.\n","categories":"","description":"","excerpt":"Reconcile Trigger Gardener dictates the time of reconciliation for …","ref":"/docs/gardener/extensions/reconcile-trigger/","tags":"","title":"Reconcile Trigger"},{"body":"What is impacted during a reconciliation? Infrastructure and DNSRecord reconciliation are only done during usual reconciliation if there were relevant changes. Otherwise, they are only done during maintenance.\nHow do you steer a reconciliation? Reconciliation is bound to the maintenance time window of a cluster. This means that your shoot will be reconciled regularly, without need for input.\nOutside of the maintenance time window your shoot will only reconcile if you change the specification or if you explicitly trigger it. To learn how, see Trigger shoot operations.\n","categories":"","description":"","excerpt":"What is impacted during a reconciliation? 
Infrastructure and DNSRecord …","ref":"/docs/faq/reconciliation-impact/","tags":"","title":"Reconciliation"},{"body":"Recovery from Permanent Quorum Loss in an Etcd Cluster Quorum loss in Etcd Cluster Quorum loss means when the majority of Etcd pods (greater than or equal to n/2 + 1) are down simultaneously for some reason.\nThere are two types of quorum loss that can happen to an Etcd multinode cluster:\n Transient quorum loss - A quorum loss is called transient when the majority of Etcd pods are down simultaneously for some time. The pods may be down due to network unavailability, high resource usages, etc. When the pods come back after some time, they can re-join the cluster and quorum is recovered automatically without any manual intervention. There should not be a permanent failure for the majority of etcd pods due to hardware failure or disk corruption.\n Permanent quorum loss - A quorum loss is called permanent when the majority of Etcd cluster members experience permanent failure, whether due to hardware failure or disk corruption, etc. In that case, the etcd cluster is not going to recover automatically from the quorum loss. A human operator will now need to intervene and execute the following steps to recover the multi-node Etcd cluster.\n If permanent quorum loss occurs to a multinode Etcd cluster, the operator needs to note down the PVCs, configmaps, statefulsets, CRs, etc. related to that Etcd cluster and work on those resources only. The following steps guide a human operator to recover from permanent quorum loss of an etcd cluster. We assume the name of the Etcd CR for the Etcd cluster is etcd-main.\nEtcd cluster in shoot control plane of gardener deployment: There are two Etcd clusters running in the shoot control plane. One is named etcd-events and another is named etcd-main. The operator needs to take care of permanent quorum loss to a specific cluster. 
If permanent quorum loss occurs in the etcd-events cluster, the operator needs to note down the PVCs, configmaps, statefulsets, CRs, etc. related to the etcd-events cluster and work on those resources only.\n⚠️ Note: Manually restoring etcd can result in data loss. This guide is the last resort to bring an Etcd cluster up and running again.\nIf etcd-druid and etcd-backup-restore are being used with gardener, then:\nTarget the control plane of the affected shoot cluster via kubectl. Alternatively, you can use gardenctl to target the control plane of the affected shoot cluster. You can get the details to target the control plane from the Access tile in the shoot cluster details page on the Gardener dashboard. Ensure that you are targeting the correct namespace.\n Add the following annotations to the Etcd resource etcd-main:\n kubectl annotate etcd etcd-main druid.gardener.cloud/suspend-etcd-spec-reconcile=\n kubectl annotate etcd etcd-main druid.gardener.cloud/disable-resource-protection=\n Note down the configmap name that is attached to the etcd-main statefulset. If you describe the statefulset with kubectl describe sts etcd-main, look for lines similar to the following to identify the attached configmap name. 
It will be needed at later stages:\nVolumes: etcd-config-file: Type: ConfigMap (a volume populated by a ConfigMap) Name: etcd-bootstrap-4785b0 Optional: false Alternatively, the related configmap name can be obtained by executing following command as well:\nkubectl get sts etcd-main -o jsonpath='{.spec.template.spec.volumes[?(@.name==\"etcd-config-file\")].configMap.name}'\n Scale down the etcd-main statefulset replicas to 0:\nkubectl scale sts etcd-main --replicas=0\n The PVCs will look like the following on listing them with the command kubectl get pvc:\nmain-etcd-etcd-main-0 Bound pv-shoot--garden--aws-ha-dcb51848-49fa-4501-b2f2-f8d8f1fad111 80Gi RWO gardener.cloud-fast 13d main-etcd-etcd-main-1 Bound pv-shoot--garden--aws-ha-b4751b28-c06e-41b7-b08c-6486e03090dd 80Gi RWO gardener.cloud-fast 13d main-etcd-etcd-main-2 Bound pv-shoot--garden--aws-ha-ff17323b-d62e-4d5e-a742-9de823621490 80Gi RWO gardener.cloud-fast 13d Delete all PVCs that are attached to etcd-main cluster.\nkubectl delete pvc -l instance=etcd-main\n Check the etcd’s member leases. There should be leases starting with etcd-main as many as etcd-main replicas. One of those leases will have holder identity as \u003cetcd-member-id\u003e:Leader and rest of etcd member leases have holder identities as \u003cetcd-member-id\u003e:Member. Please ignore the snapshot leases, i.e., those leases which have the suffix snap.\netcd-main member leases:\n NAME HOLDER AGE etcd-main-0 4c37667312a3912b:Member 1m etcd-main-1 75a9b74cfd3077cc:Member 1m etcd-main-2 c62ee6af755e890d:Leader 1m Delete all etcd-main member leases.\n Edit the etcd-main cluster’s configmap (ex: etcd-bootstrap-4785b0) as follows:\nFind the initial-cluster field in the configmap. 
It should look similar to the following:\n# Initial cluster initial-cluster: etcd-main-0=https://etcd-main-0.etcd-main-peer.default.svc:2380,etcd-main-1=https://etcd-main-1.etcd-main-peer.default.svc:2380,etcd-main-2=https://etcd-main-2.etcd-main-peer.default.svc:2380 Change the initial-cluster field to have only one member (etcd-main-0) in the string. It should now look like this:\n# Initial cluster initial-cluster: etcd-main-0=https://etcd-main-0.etcd-main-peer.default.svc:2380 Scale up the etcd-main statefulset replicas to 1:\nkubectl scale sts etcd-main --replicas=1\n Wait for the single-member etcd cluster to be completely ready.\nkubectl get pods etcd-main-0 will give the following output when ready:\nNAME READY STATUS RESTARTS AGE etcd-main-0 2/2 Running 0 1m Remove the following annotations from the Etcd resource etcd-main:\n kubectl annotate etcd etcd-main druid.gardener.cloud/suspend-etcd-spec-reconcile-\n kubectl annotate etcd etcd-main druid.gardener.cloud/disable-resource-protection-\n Finally, add the following annotation to the Etcd resource etcd-main:\nkubectl annotate etcd etcd-main gardener.cloud/operation='reconcile'\n Verify that the etcd cluster is formed correctly.\nAll the etcd-main pods will have outputs similar to following:\nNAME READY STATUS RESTARTS AGE etcd-main-0 2/2 Running 0 5m etcd-main-1 2/2 Running 0 1m etcd-main-2 2/2 Running 0 1m Additionally, check if the Etcd CR is ready with kubectl get etcd etcd-main:\nNAME READY AGE etcd-main true 13d Additionally, check the leases for 30 seconds at least. There should be leases starting with etcd-main as many as etcd-main replicas. One of those leases will have holder identity as \u003cetcd-member-id\u003e:Leader and rest of those leases have holder identities as \u003cetcd-member-id\u003e:Member. 
The AGE of those leases can also be inspected to identify if those leases were updated in conjunction with the restart of the Etcd cluster: Example:\nNAME HOLDER AGE etcd-main-0 4c37667312a3912b:Member 1m etcd-main-1 75a9b74cfd3077cc:Member 1m etcd-main-2 c62ee6af755e890d:Leader 1m ","categories":"","description":"","excerpt":"Recovery from Permanent Quorum Loss in an Etcd Cluster Quorum loss in …","ref":"/docs/other-components/etcd-druid/recovery-from-permanent-quorum-loss-in-etcd-cluster/","tags":"","title":"Recovery From Permanent Quorum Loss In Etcd Cluster"},{"body":"Topic Title (the topic title can also be placed in the frontmatter)\nContent This section gives the user all the information needed in order to understand the topic.\n Content Type Definition Example Name 1 Definition of Name 1 Relevant link Name 2 Definition of Name 2 Relevant link Related Links Link 1 Link 2 ","categories":"","description":"Describes the contents of a reference topic","excerpt":"Describes the contents of a reference topic","ref":"/docs/contribute/documentation/style-guide/reference_template/","tags":"","title":"Reference Topic Structure"},{"body":"Referenced Resources The Shoot resource can include a list of resources (usually secrets) that can be referenced by name in the extension providerConfig and other Shoot sections, for example:\nkind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: crazy-botany namespace: garden-dev ... spec: ... extensions: - type: foobar providerConfig: apiVersion: foobar.extensions.gardener.cloud/v1alpha1 kind: FooBarConfig foo: bar secretRef: foobar-secret resources: - name: foobar-secret resourceRef: apiVersion: v1 kind: Secret name: my-foobar-secret Gardener expects to find these referenced resources in the project namespace (e.g. 
garden-dev) and will copy them to the Shoot namespace in the Seed cluster when reconciling a Shoot, adding a prefix to their names to avoid naming collisions with Gardener’s own resources.\nExtension controllers can resolve the references to these resources by accessing the Shoot via the Cluster resource. To properly read a referenced resource, extension controllers should use the utility function GetObjectByReference from the extensions/pkg/controller package, for example:\n ... ref = \u0026autoscalingv1.CrossVersionObjectReference{ APIVersion: \"v1\", Kind: \"Secret\", Name: \"foo\", } secret := \u0026corev1.Secret{} if err := controller.GetObjectByReference(ctx, client, ref, \"shoot--test--foo\", secret); err != nil { return err } // Use secret ... ","categories":"","description":"","excerpt":"Referenced Resources The Shoot resource can include a list of …","ref":"/docs/gardener/extensions/referenced-resources/","tags":"","title":"Referenced Resources"},{"body":"Gardener Extension for Registry Cache \nGardener extension controller which deploys pull-through caches for container registries.\nUsage Configuring the Registry Cache Extension - learn what is the use-case for a pull-through cache, how to enable it and configure it How to provide credentials for upstream repository? 
Configuring the Registry Mirror Extension - learn what is the use-case for a registry mirror, how to enable and configure it Local Setup and Development Deploying Registry Cache Extension Locally - learn how to set up a local development environment Deploying Registry Cache Extension in Gardener’s Local Setup with Provider Extensions - learn how to set up a development environment using own Seed clusters on an existing Kubernetes cluster Developer Docs for Gardener Extension Registry Cache - learn about the inner workings ","categories":"","description":"Gardener extension controller which deploys pull-through caches for container registries.","excerpt":"Gardener extension controller which deploys pull-through caches for …","ref":"/docs/extensions/others/gardener-extension-registry-cache/","tags":"","title":"Registry cache"},{"body":"Overview If you commit sensitive data, such as a kubeconfig.yaml or SSH key into a Git repository, you can remove it from the history. To entirely remove unwanted files from a repository’s history you can use the git filter-branch command.\nThe git filter-branch command rewrites your repository’s history, which changes the SHAs for existing commits that you alter and any dependent commits. Changed commit SHAs may affect open pull requests in your repository. Merging or closing all open pull requests before removing files from your repository is recommended.\nWarning If someone has already checked out the repository, then of course they have the secret on their computer. So ALWAYS revoke the OAuthToken/Password or whatever it was immediately. Purging a File from Your Repository’s History Warning If you run git filter-branch after stashing changes, you won’t be able to retrieve your changes with other stash commands. Before running git filter-branch, we recommend unstashing any changes you’ve made. To unstash the last set of changes you’ve stashed, run git stash show -p | git apply -R. 
For more information, see Git Tools - Stashing and Cleaning. To illustrate how git filter-branch works, we’ll show you how to remove your file with sensitive data from the history of your repository and add it to .gitignore to ensure that it is not accidentally re-committed.\n1. Navigate into the repository’s working directory:\ncd YOUR-REPOSITORY 2. Run the following command, replacing PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA with the path to the file you want to remove, not just its filename.\nThese arguments will:\n Force Git to process, but not check out, the entire history of every branch and tag Remove the specified file, as well as any empty commits generated as a result Overwrite your existing tags git filter-branch --force --index-filter \\ 'git rm --cached --ignore-unmatch PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA' \\ --prune-empty --tag-name-filter cat -- --all 3. Add your file with sensitive data to .gitignore to ensure that you don’t accidentally commit it again:\n echo \"YOUR-FILE-WITH-SENSITIVE-DATA\" \u003e\u003e .gitignore Double-check that you’ve removed everything you wanted to from your repository’s history, and that all of your branches are checked out. Once you’re happy with the state of your repository, continue to the next step.\n4. Force-push your local changes to overwrite your GitHub repository, as well as all the branches you’ve pushed up:\ngit push origin --force --all 5. In order to remove the sensitive file from your tagged releases, you’ll also need to force-push against your Git tags:\ngit push origin --force --tags Warning Tell your collaborators to rebase, not merge, any branches they created off of your old (tainted) repository history. One merge commit could reintroduce some or all of the tainted history that you just went to the trouble of purging. 
Related Links Removing Sensitive Data from a Repository ","categories":"","description":"Never ever commit a kubeconfig.yaml into github","excerpt":"Never ever commit a kubeconfig.yaml into github","ref":"/docs/guides/applications/commit-secret-fail/","tags":"","title":"Remove Committed Secrets in Github 💀"},{"body":"Packages:\n resources.gardener.cloud/v1alpha1 resources.gardener.cloud/v1alpha1 Package v1alpha1 contains the configuration of the Gardener Resource Manager.\nResource Types: ManagedResource ManagedResource describes a list of managed resources.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ManagedResourceSpec Spec contains the specification of this managed resource.\n class string (Optional) Class holds the resource class used to control the responsibility for multiple resource manager instances\n secretRefs []Kubernetes core/v1.LocalObjectReference SecretRefs is a list of secret references.\n injectLabels map[string]string (Optional) InjectLabels injects the provided labels into every resource that is part of the referenced secrets.\n forceOverwriteLabels bool (Optional) ForceOverwriteLabels specifies that all existing labels should be overwritten. Defaults to false.\n forceOverwriteAnnotations bool (Optional) ForceOverwriteAnnotations specifies that all existing annotations should be overwritten. Defaults to false.\n keepObjects bool (Optional) KeepObjects specifies whether the objects should be kept although the managed resource has already been deleted. 
Defaults to false.\n equivalences [][]k8s.io/apimachinery/pkg/apis/meta/v1.GroupKind (Optional) Equivalences specifies possible group/kind equivalences for objects.\n deletePersistentVolumeClaims bool (Optional) DeletePersistentVolumeClaims specifies if PersistentVolumeClaims created by StatefulSets, which are managed by this resource, should also be deleted when the corresponding StatefulSet is deleted (defaults to false).\n status ManagedResourceStatus Status contains the status of this managed resource.\n ManagedResourceSpec (Appears on: ManagedResource) ManagedResourceSpec contains the specification of this managed resource.\n Field Description class string (Optional) Class holds the resource class used to control the responsibility for multiple resource manager instances\n secretRefs []Kubernetes core/v1.LocalObjectReference SecretRefs is a list of secret references.\n injectLabels map[string]string (Optional) InjectLabels injects the provided labels into every resource that is part of the referenced secrets.\n forceOverwriteLabels bool (Optional) ForceOverwriteLabels specifies that all existing labels should be overwritten. Defaults to false.\n forceOverwriteAnnotations bool (Optional) ForceOverwriteAnnotations specifies that all existing annotations should be overwritten. Defaults to false.\n keepObjects bool (Optional) KeepObjects specifies whether the objects should be kept although the managed resource has already been deleted. 
Defaults to false.\n equivalences [][]k8s.io/apimachinery/pkg/apis/meta/v1.GroupKind (Optional) Equivalences specifies possible group/kind equivalences for objects.\n deletePersistentVolumeClaims bool (Optional) DeletePersistentVolumeClaims specifies if PersistentVolumeClaims created by StatefulSets, which are managed by this resource, should also be deleted when the corresponding StatefulSet is deleted (defaults to false).\n ManagedResourceStatus (Appears on: ManagedResource) ManagedResourceStatus is the status of a managed resource.\n Field Description conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition observedGeneration int64 ObservedGeneration is the most recent generation observed for this resource.\n resources []ObjectReference (Optional) Resources is a list of objects that have been created.\n secretsDataChecksum string (Optional) SecretsDataChecksum is the checksum of referenced secrets data.\n ObjectReference (Appears on: ManagedResourceStatus) ObjectReference is a reference to another object.\n Field Description ObjectReference Kubernetes core/v1.ObjectReference (Members of ObjectReference are embedded into this type.) labels map[string]string Labels is a map of labels that were used during last update of the resource.\n annotations map[string]string Annotations is a map of annotations that were used during last update of the resource.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n resources.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/resources/","tags":"","title":"Resources"},{"body":"Restoration of a single member in multi-node etcd deployed by etcd-druid Note:\n For a cluster with n members, we are proposing the solution to only single member restoration within a etcd cluster not the quorum loss scenario (when majority of members within a cluster fail). 
In this proposal, we are not targeting the recovery of a single member that got separated from the cluster due to a network partition. Motivation If a single etcd member within a multi-node etcd cluster goes down due to DB corruption, PVC corruption, or an invalid data-dir, then it needs to be brought back. Unlike in the single-node case, a minority member of a multi-node cluster can’t be restored from the snapshots present in the storage container. The old snapshots contain the metadata information of the cluster, so a new member whose DB was restored from them reads stale metadata, which leads to a memberID mismatch that prevents the new member from coming up.\nSolution If a backup-restore sidecar detects that its corresponding etcd is down due to data-dir corruption or an invalid data-dir, then backup-restore will first remove the failing etcd member from the cluster using the MemberRemove API call and clean the data-dir of the failed etcd member. This won’t affect the etcd cluster as quorum is still maintained. After successfully removing the failed etcd member from the cluster, the backup-restore sidecar will try to add a new etcd member to the cluster to get the same cluster size as before. Backup-restore first adds the new member as a learner using the MemberAddAsLearner API call. Once the learner is added to the cluster, gets in sync with the leader, and becomes up-to-date, backup-restore promotes the learner (non-voting member) to a voting member using the MemberPromote API call. So, the failed member first needs to be removed from the cluster and then added as a new member. Example If a 3-member etcd cluster has 1 downed member (due to an invalid data-dir), the cluster can still make forward progress because the quorum is 2. The downed etcd member gets restarted and its corresponding backup-restore sidecar receives an initialization request. Then, the backup-restore sidecar checks for data corruption/invalid data-dir. 
The backup-restore sidecar detects that the data-dir is invalid and that it’s a multi-node etcd cluster. Then, the backup-restore sidecar removes the downed etcd member from the cluster. The number of members in the cluster becomes 2 and the quorum remains at 2, so it won’t affect the etcd cluster. The sidecar then cleans the data-dir and adds a member as a learner (non-voting member). As soon as the learner gets in sync with the leader, the sidecar promotes the learner to a voting member, hence increasing the number of members in the cluster back to 3. ","categories":"","description":"","excerpt":"Restoration of a single member in multi-node etcd deployed by …","ref":"/docs/other-components/etcd-druid/restoring-single-member-in-multi-node-etcd-cluster/","tags":"","title":"Restoring Single Member In Multi Node Etcd Cluster"},{"body":"Reversed VPN Tunnel Setup and Configuration The Reversed VPN Tunnel is enabled by default. A highly available VPN connection is automatically deployed in all shoots that configure an HA control-plane.\nReversed VPN Tunnel In the first VPN solution, connection establishment was initiated by a VPN client in the seed cluster. Due to several issues with this solution, the tunnel establishment direction has been reversed. The client is deployed in the shoot and initiates the connection from there. This way, there is no need to deploy a special-purpose load balancer for the sake of addressing the data-plane. In addition to saving costs, this is considered the more secure alternative. 
For more information on how this is achieved, please have a look at the following GEP.\nConnection establishment with a reversed tunnel:\nAPIServer --\u003e Envoy-Proxy | VPN-Seed-Server \u003c-- Istio/Envoy-Proxy \u003c-- SNI API Server Endpoint \u003c-- LB (one for all clusters of a seed) \u003c--- internet \u003c--- VPN-Shoot-Client --\u003e Pods | Nodes | Services\nHigh Availability for Reversed VPN Tunnel Shoots which define spec.controlPlane.highAvailability.failureTolerance: {node, zone} get an HA control-plane, including a highly available VPN connection by deploying redundant VPN servers and clients.\nPlease note that it is not possible to move an open connection to another VPN tunnel. Especially long-running commands like kubectl exec -it ... or kubectl logs -f ... will still break if the routing path must be switched because either VPN server or client are not reachable anymore. A new request should be possible within seconds.\nHA Architecture for VPN Establishing a connection from the VPN client on the shoot to the server in the control plane works nearly the same way as in the non-HA case. The only difference is that the VPN client targets one of two VPN servers, represented by two services vpn-seed-server-0 and vpn-seed-server-1 with endpoints in pods with the same name. The VPN tunnel is used by a kube-apiserver to reach nodes, services, or pods in the shoot cluster. In the non-HA case, a kube-apiserver uses an HTTP proxy running as a side-car in the VPN server to address the shoot networks via the VPN tunnel and the vpn-shoot acts as a router. In the HA case, the setup is more complicated. Instead of an HTTP proxy in the VPN server, the kube-apiserver has additional side-cars, one side-car for each VPN client to connect to the corresponding VPN server. On the shoot side, there are now two vpn-shoot pods, each with two VPN clients for each VPN server. With this setup, there would be four possible routes, but only one can be used. 
Switching the route kills all open connections. Therefore, another layer is introduced: link aggregation, also named bonding. In Linux, you can create a network link by using several other links as slaves. Bonding here is used with active-backup mode. This means the traffic only goes through the active sublink and is only changed if the active one becomes unavailable. Switching happens in the bonding network driver without changing any routes. So with this layer, vpn-seed-server pods can be rolled without disrupting open connections.\nWith bonding, there are 2 possible routing paths, ensuring that there is at least one routing path intact even if one vpn-seed-server pod and one vpn-shoot pod are unavailable at the same time.\nAs multi-path routing is not available on the worker nodes, one routing path must be configured explicitly. For this purpose, the path-controller app is running in another side-car of the kube-apiserver pod. It pings all shoot-side VPN clients regularly every few seconds. If the active routing path is not responsive anymore, the routing is switched to the other responsive routing path.\nUsing an IPv6 transport network for communication between the bonding devices of the VPN clients, additional tunnel devices are needed on both ends to allow transport of both IPv4 and IPv6 packets. 
For this purpose, ip6tnl type tunnel devices are in place (an IPv4/IPv6 over IPv6 tunnel interface).\nThe connection establishment with a reversed tunnel in the HA case is:\nAPIServer[k] --\u003e ip6tnl-device[j] --\u003e bond-device --\u003e tap-device[i] | VPN-Seed-Server[i] \u003c-- Istio/Envoy-Proxy \u003c-- SNI API Server Endpoint \u003c-- LB (one for all clusters of a seed) \u003c--- internet \u003c--- VPN-Shoot-Client[j] --\u003e tap-device[i] --\u003e bond-device --\u003e ip6tnl-device[k] --\u003e Pods | Nodes | Services\nHere, [k] is the index of the kube-apiserver instance, [j] of the VPN shoot instance, and [i] of the VPN seed server.\nFor each kube-apiserver instance, a dedicated ip6tnl tunnel device is needed on the shoot side. Additionally, the back routes from the VPN shoot to any new kube-apiserver instance must be set dynamically. Both tasks are managed by the tunnel-controller running in each VPN shoot client. It listens for UDP6 packets sent periodically from the path-controller running in the kube-apiserver pods. These UDP6 packets contain the IPv6 address of the bond device. If the tunnel controller detects a new kube-apiserver this way, it creates a new tunnel device and a route to it.\nFor general information about HA control-plane, see GEP-20.\n","categories":"","description":"","excerpt":"Reversed VPN Tunnel Setup and Configuration The Reversed VPN Tunnel is …","ref":"/docs/gardener/reversed-vpn-tunnel/","tags":"","title":"Reversed VPN Tunnel"},{"body":"Scoped API Access for gardenlets and Extensions By default, gardenlets have administrative access in the garden cluster. They are able to execute any API request on any object independent of whether the object is related to the seed cluster the gardenlet is responsible for. 
As RBAC is not powerful enough for fine-grained checks and for the sake of security, Gardener provides two optional but recommended configurations for your environments that scope the API access for gardenlets.\nSimilar to the Node authorization mode in Kubernetes, Gardener features a SeedAuthorizer plugin. It is a special-purpose authorization plugin that specifically authorizes API requests made by the gardenlets.\nLikewise, similar to the NodeRestriction admission plugin in Kubernetes, Gardener features a SeedRestriction plugin. It is a special-purpose admission plugin that specifically limits the Kubernetes objects gardenlets can modify.\n📚 You might be interested to look into the design proposal for scoped Kubelet API access from the Kubernetes community. It can be translated to Gardener and Gardenlets with their Seed and Shoot resources.\nHistorically, gardenlet has been the only component running in the seed cluster that has access to both the seed cluster and the garden cluster. Starting from Gardener v1.74.0, extensions running on seed clusters can also get access to the garden cluster using a token for a dedicated ServiceAccount. Extensions using this mechanism only get permission to read global resources like CloudProfiles (this is granted to all authenticated users) unless the plugins described in this document are enabled.\nGenerally, the plugins handle extension clients exactly like gardenlet clients with some minor exceptions. Extension clients in the sense of the plugins are clients authenticated as a ServiceAccount with the extension- name prefix in a seed- namespace of the garden cluster. Other ServiceAccounts are not considered as seed clients, not handled by the plugins, and only get the described read access to global resources.\nFlow Diagram The following diagram shows how the two plugins are included in the request flow of a gardenlet. 
When they are not enabled, then the kube-apiserver is internally authorizing the request via RBAC before forwarding the request directly to the gardener-apiserver, i.e., the gardener-admission-controller would not be consulted (this is not entirely correct because it also serves other admission webhook handlers, but for simplicity reasons this document focuses on the API access scope only).\nWhen enabling the plugins, there is one additional step for each before the gardener-apiserver responds to the request.\nPlease note that the example shows a request to an object (Shoot) residing in one of the API groups served by gardener-apiserver. However, the gardenlet is also interacting with objects in API groups served by the kube-apiserver (e.g., Secret,ConfigMap). In this case, the consultation of the SeedRestriction admission plugin is performed by the kube-apiserver itself before it forwards the request to the gardener-apiserver.\nImplemented Rules Today, the following rules are implemented:\n Resource Verbs Path(s) Description BackupBucket get, list, watch, create, update, patch, delete BackupBucket -\u003e Seed Allow get, list, watch requests for all BackupBuckets. Allow only create, update, patch, delete requests for BackupBuckets assigned to the gardenlet’s Seed. BackupEntry get, list, watch, create, update, patch BackupEntry -\u003e Seed Allow get, list, watch requests for all BackupEntrys. Allow only create, update, patch requests for BackupEntrys assigned to the gardenlet’s Seed and referencing BackupBuckets assigned to the gardenlet’s Seed. Bastion get, list, watch, create, update, patch Bastion -\u003e Seed Allow get, list, watch requests for all Bastions. Allow only create, update, patch requests for Bastions assigned to the gardenlet’s Seed. CertificateSigningRequest get, create CertificateSigningRequest -\u003e Seed Allow only get, create requests for CertificateSigningRequests related to the gardenlet’s Seed. 
CloudProfile get CloudProfile -\u003e Shoot -\u003e Seed Allow only get requests for CloudProfiles referenced by Shoots that are assigned to the gardenlet’s Seed. ClusterRoleBinding create, get, update, patch, delete ClusterRoleBinding -\u003e ManagedSeed -\u003e Shoot -\u003e Seed Allow create, get, update, patch requests for ManagedSeeds in the bootstrapping phase assigned to the gardenlet’s Seeds. Allow delete requests from gardenlets bootstrapped via ManagedSeeds. ConfigMap get ConfigMap -\u003e Shoot -\u003e Seed Allow only get requests for ConfigMaps referenced by Shoots that are assigned to the gardenlet’s Seed. Allows reading the kube-system/cluster-identity ConfigMap. ControllerRegistration get, list, watch ControllerRegistration -\u003e ControllerInstallation -\u003e Seed Allow get, list, watch requests for all ControllerRegistrations. ControllerDeployment get ControllerDeployment -\u003e ControllerInstallation -\u003e Seed Allow get requests for ControllerDeployments referenced by ControllerInstallations assigned to the gardenlet’s Seed. ControllerInstallation get, list, watch, update, patch ControllerInstallation -\u003e Seed Allow get, list, watch requests for all ControllerInstallations. Allow only update, patch requests for ControllerInstallations assigned to the gardenlet’s Seed. CredentialsBinding get CredentialsBinding -\u003e Shoot -\u003e Seed Allow only get requests for CredentialsBindings referenced by Shoots that are assigned to the gardenlet’s Seed. Event create, patch none Allow create and patch requests for all kinds of Events. ExposureClass get ExposureClass -\u003e Shoot -\u003e Seed Allow get requests for ExposureClasses referenced by Shoots that are assigned to the gardenlet’s Seed. Deny get requests to other ExposureClasses. Gardenlet get, list, watch, update, patch, create Gardenlet -\u003e Seed Allow get, list, watch requests for all Gardenlets. Allow only create, update, and patch requests for Gardenlets belonging to the gardenlet’s Seed. 
Lease create, get, watch, update Lease -\u003e Seed Allow create, get, update, and delete requests for Leases of the gardenlet’s Seed. ManagedSeed get, list, watch, update, patch ManagedSeed -\u003e Shoot -\u003e Seed Allow get, list, watch requests for all ManagedSeeds. Allow only update, patch requests for ManagedSeeds referencing a Shoot assigned to the gardenlet’s Seed. Namespace get Namespace -\u003e Shoot -\u003e Seed Allow get requests for Namespaces of Shoots that are assigned to the gardenlet’s Seed. Always allow get requests for the garden Namespace. NamespacedCloudProfile get NamespacedCloudProfile -\u003e Shoot -\u003e Seed Allow only get requests for NamespacedCloudProfiles referenced by Shoots that are assigned to the gardenlet’s Seed. Project get Project -\u003e Namespace -\u003e Shoot -\u003e Seed Allow get requests for Projects referenced by the Namespace of Shoots that are assigned to the gardenlet’s Seed. SecretBinding get SecretBinding -\u003e Shoot -\u003e Seed Allow only get requests for SecretBindings referenced by Shoots that are assigned to the gardenlet’s Seed. Secret create, get, update, patch, delete(, list, watch) Secret -\u003e Seed, Secret -\u003e Shoot -\u003e Seed, Secret -\u003e SecretBinding -\u003e Shoot -\u003e Seed, Secret -\u003e CredentialsBinding -\u003e Shoot -\u003e Seed, BackupBucket -\u003e Seed Allow get, list, watch requests for all Secrets in the seed-\u003cname\u003e namespace. Allow only create, get, update, patch, delete requests for the Secrets related to resources assigned to the gardenlet’s Seeds. Seed get, list, watch, create, update, patch, delete Seed Allow get, list, watch requests for all Seeds. Allow only create, update, patch, delete requests for the gardenlet’s Seeds. 
[1] ServiceAccount create, get, update, patch, delete ServiceAccount -\u003e ManagedSeed -\u003e Shoot -\u003e Seed, ServiceAccount -\u003e Namespace -\u003e Seed Allow create, get, update, patch requests for ManagedSeeds in the bootstrapping phase assigned to the gardenlet’s Seeds. Allow delete requests from gardenlets bootstrapped via ManagedSeeds. Allow all verbs on ServiceAccounts in the seed-specific namespace. Shoot get, list, watch, update, patch Shoot -\u003e Seed Allow get, list, watch requests for all Shoots. Allow only update, patch requests for Shoots assigned to the gardenlet’s Seed. ShootState get, create, update, patch ShootState -\u003e Shoot -\u003e Seed Allow only get, create, update, patch requests for ShootStates belonging to Shoots that are assigned to the gardenlet’s Seed. WorkloadIdentity get WorkloadIdentity -\u003e CredentialsBinding -\u003e Shoot -\u003e Seed Allow only get requests for WorkloadIdentities referenced by CredentialsBindings referenced by Shoots that are assigned to the gardenlet’s Seed. [1] If you use ManagedSeed resources, then the gardenlet reconciling them (“parent gardenlet”) may be allowed to submit certain requests for the Seed resources resulting from such ManagedSeed reconciliations (even if the “parent gardenlet” is not responsible for them):\n ℹ️ It is allowed to delete the Seed resources if the corresponding ManagedSeed objects already have a deletionTimestamp (this is secure as gardenlets themselves don’t have permissions for deleting ManagedSeeds).\nRule Exceptions for Extension Clients Extension clients are allowed to perform the same operations as gardenlet clients with the following exceptions:\n Extension clients are granted the read-only subset of verbs for CertificateSigningRequests, ClusterRoleBindings, and ServiceAccounts (to prevent privilege escalation). Extension clients are granted full access to Lease objects but only in the seed-specific namespace. 
When the need arises, more exceptions might be added to the access rules for resources that are already handled by the plugins. E.g., if an extension needs to populate additional shoot-specific InternalSecrets, according handling can be introduced. Permissions for resources that are not handled by the plugins can be granted using additional RBAC rules (independent of the plugins).\nSeedAuthorizer Authorization Webhook Enablement The SeedAuthorizer is implemented as a Kubernetes authorization webhook and part of the gardener-admission-controller component running in the garden cluster.\n🎛 In order to activate it, you have to follow these steps:\n Set the following flags for the kube-apiserver of the garden cluster (i.e., the kube-apiserver whose API is extended by Gardener):\n --authorization-mode=RBAC,Node,Webhook (please note that Webhook should appear after RBAC in the list [1]; Node might not be needed if you use a virtual garden cluster) --authorization-webhook-config-file=\u003cpath-to-the-webhook-config-file\u003e --authorization-webhook-cache-authorized-ttl=0 --authorization-webhook-cache-unauthorized-ttl=0 The webhook config file (stored at \u003cpath-to-the-webhook-config-file\u003e) should look as follows:\napiVersion: v1 kind: Config clusters: - name: garden cluster: certificate-authority-data: base64(CA-CERT-OF-GARDENER-ADMISSION-CONTROLLER) server: https://gardener-admission-controller.garden/webhooks/auth/seed users: - name: kube-apiserver user: {} contexts: - name: auth-webhook context: cluster: garden user: kube-apiserver current-context: auth-webhook When deploying the Gardener controlplane Helm chart, set .global.rbac.seedAuthorizer.enabled=true. 
This will ensure that the RBAC resources granting global access for all gardenlets will no longer be deployed.\n Delete the existing RBAC resources granting global access for all gardenlets by running:\nkubectl delete \\ clusterrole.rbac.authorization.k8s.io/gardener.cloud:system:seeds \\ clusterrolebinding.rbac.authorization.k8s.io/gardener.cloud:system:seeds \\ --ignore-not-found Please note that you should activate the SeedRestriction admission handler as well.\n [1] The Webhook authorization plugin should appear after RBAC because the kube-apiserver depends on the gardener-admission-controller (serving the webhook). However, the gardener-admission-controller can only start when gardener-apiserver runs, but gardener-apiserver itself can only start when kube-apiserver runs. If Webhook is before RBAC, then gardener-apiserver might not be able to start, leading to a deadlock.\n Authorizer Decisions As mentioned earlier, it’s the authorizer’s job to evaluate API requests and return one of the following decisions:\n DecisionAllow: The request is allowed, further configured authorizers won’t be consulted. DecisionDeny: The request is denied, further configured authorizers won’t be consulted. DecisionNoOpinion: A decision cannot be made, further configured authorizers will be consulted. For backwards compatibility, no requests are denied at the moment, so that they are still deferred to a subsequent authorizer like RBAC. However, this might change in the future.\nFirst, the SeedAuthorizer extracts the Seed name from the API request. This step considers the following two cases:\n If the authenticated user belongs to the gardener.cloud:system:seeds group, it is considered a gardenlet client. This requires a proper TLS certificate that the gardenlet uses to contact the API server and is automatically given if TLS bootstrapping is used. The authorizer extracts the seed name from the username by stripping the gardener.cloud:system:seed: prefix. 
In cases where this information is missing, e.g., when a custom kubeconfig is used, the authorizer cannot make any decision. Thus, RBAC is still a viable option to restrict the gardenlet’s access permissions if the preconditions explained above are not met. If the authenticated user belongs to the system:serviceaccounts group, it is considered an extension client under the following conditions: The ServiceAccount must be located in a seed- namespace. I.e., the user has to belong to a group with the system:serviceaccounts:seed- prefix. The seed name is extracted from this group by stripping the prefix. The ServiceAccount must have the extension- prefix. I.e., the username must have the system:serviceaccount:seed-\u003cseed-name\u003e:extension- prefix. With the Seed name at hand, the authorizer checks for an existing path from the resource that a request is being made for to the Seed belonging to the gardenlet/extension. Take a look at the Implementation Details section for more information.\nImplementation Details Internally, the SeedAuthorizer uses a directed, acyclic graph data structure in order to efficiently respond to authorization requests for gardenlets/extensions:\n A vertex in this graph represents a Kubernetes resource with its kind, namespace, and name (e.g., Shoot:garden-my-project/my-shoot). An edge from vertex u to vertex v in this graph exists when (1) v is referred to by u and v is a Seed, or when (2) u is referred to by v, or when (3) u is strictly associated with v. For example, a Shoot refers to a Seed, a CloudProfile, a SecretBinding, etc., so it has an outgoing edge to the Seed (1) and incoming edges from the CloudProfile and SecretBinding vertices (2). However, there might also be a ShootState or a BackupEntry resource strictly associated with this Shoot, hence, it has incoming edges from these vertices (3).\nIn the above picture, the resources that are actively watched are shaded. 
Gardener resources are green, while Kubernetes resources are blue. It shows the dependencies between the resources and how the graph is built based on the above rules.\nℹ️ The above picture shows all resources that may be accessed by gardenlets/extensions, except for the Quota resource which is only included for completeness.\nNow, when a gardenlet/extension wants to access certain resources, then the SeedAuthorizer uses a depth-first traversal starting from the vertex representing the resource in question, e.g., from a Project vertex. If there is a path from the Project vertex to the vertex representing the Seed the gardenlet/extension is responsible for, then it allows the request.\nMetrics The SeedAuthorizer registers the following metrics related to the mentioned graph implementation:\n Metric Description gardener_admission_controller_seed_authorizer_graph_update_duration_seconds Histogram of duration of resource dependency graph updates in seed authorizer, i.e., how long does it take to update the graph’s vertices/edges when a resource is created, changed, or deleted. gardener_admission_controller_seed_authorizer_graph_path_check_duration_seconds Histogram of duration of checks whether a path exists in the resource dependency graph in seed authorizer. Debug Handler When the .server.enableDebugHandlers field in the gardener-admission-controller’s component configuration is set to true, then it serves a handler that can be used for debugging the resource dependency graph under /debug/resource-dependency-graph.\n🚨 Only use this setting for development purposes, as it enables unauthenticated users to view all data if they have access to the gardener-admission-controller component.\nThe handler renders an HTML page displaying the current graph with a list of vertices and their associated incoming and outgoing edges to other vertices. 
Depending on the size of the Gardener landscape (and consequently, the size of the graph), it might not be possible to render it in its entirety. If there are more than 2000 vertices, then the default filtering will select kind=Seed to prevent overloading the output.\nExample output:\n------------------------------------------------------------------------------- | | # Seed:my-seed | \u003c- (11) | BackupBucket:73972fe2-3d7e-4f61-a406-b8f9e670e6b7 | BackupEntry:garden-my-project/shoot--dev--my-shoot--4656a460-1a69-4f00-9372-7452cbd38ee3 | ControllerInstallation:dns-external-mxt8m | ControllerInstallation:extension-shoot-cert-service-4qw5j | ControllerInstallation:networking-calico-bgrb2 | ControllerInstallation:os-gardenlinux-qvb5z | ControllerInstallation:provider-gcp-w4mvf | Secret:garden/backup | Shoot:garden-my-project/my-shoot | ------------------------------------------------------------------------------- | | # Shoot:garden-my-project/my-shoot | \u003c- (5) | CloudProfile:gcp | Namespace:garden-my-project | Secret:garden-my-project/my-dns-secret | SecretBinding:garden-my-project/my-credentials | ShootState:garden-my-project/my-shoot | -\u003e (1) | Seed:my-seed | ------------------------------------------------------------------------------- | | # ShootState:garden-my-project/my-shoot | -\u003e (1) | Shoot:garden-my-project/my-shoot | ------------------------------------------------------------------------------- ... (etc., similarly for the other resources) There are anchor links to easily jump from one resource to another, and the page provides means for filtering the results based on the kind, namespace, and/or name.\nPitfalls When there is a relevant update to an existing resource, i.e., when a reference to another resource is changed, then the corresponding vertex (along with all associated edges) is first deleted from the graph before it gets added again with the up-to-date edges. 
However, this only works for vertices belonging to resources that are created in exactly one “watch handler”. For example, the vertex for a SecretBinding can either be created in the SecretBinding handler itself or in the Shoot handler. In such cases, deleting the vertex before (re-)computing the edges might lead to race conditions and potentially render the graph invalid. Consequently, instead of deleting the vertex, only the edges the respective handler is responsible for are deleted. If the vertex ends up with no remaining edges, then it also gets deleted automatically. Afterwards, the vertex can either be added again or the updated edges can be created.\nSeedRestriction Admission Webhook Enablement The SeedRestriction is implemented as a Kubernetes admission webhook and part of the gardener-admission-controller component running in the garden cluster.\n🎛 In order to activate it, you have to set .global.admission.seedRestriction.enabled=true when using the Gardener controlplane Helm chart. This will add an additional webhook in the existing ValidatingWebhookConfiguration of the gardener-admission-controller which contains the configuration for the SeedRestriction handler. Please note that it should only be activated when the SeedAuthorizer is active as well.\nAdmission Decisions The admission’s purpose is to perform extended validation on requests which require the body of the object in question. 
Additionally, it handles CREATE requests of gardenlets/extensions (the above discussed resource dependency graph cannot be used in such cases because there won’t be any vertex/edge for non-existing resources).\nGardenlets/extensions are restricted to only create new resources which are somehow related to the seed clusters they are responsible for.\n","categories":"","description":"","excerpt":"Scoped API Access for gardenlets and Extensions By default, gardenlets …","ref":"/docs/gardener/deployment/gardenlet_api_access/","tags":"","title":"Scoped API Access for gardenlets and Extensions"},{"body":"SecretBinding Provider Controller This page describes the process on how to enable the SecretBinding provider controller.\nOverview With Gardener v1.38.0, the SecretBinding resource now contains a new optional field .provider.type (details about the motivation can be found in https://github.com/gardener/gardener/issues/4888). To automate the process of setting the new field and afterwards to enforce validation on the new field in a backwards-compatible manner, Gardener features the SecretBinding provider controller and a feature gate - SecretBindingProviderValidation.\nProcess A Gardener landscape operator can follow the following steps:\n Enable the SecretBinding provider controller of Gardener Controller Manager.\nThe SecretBinding provider controller is responsible for populating the .provider.type field of a SecretBinding based on its current usage by Shoot resources. For example, if a Shoot crazy-botany with .provider.type=aws is using a SecretBinding my-secret-binding, then the SecretBinding provider controller will take care to set the .provider.type field of the SecretBinding to the same provider type (aws). To enable the SecretBinding provider controller, set the controller.secretBindingProvider.concurrentSyncs field in the ControllerManagerConfiguration (e.g., set it to 5). 
Although it is not recommended, the API allows Shoots from different provider types to reference the same SecretBinding (assuming that the backing Secret contains data for both of the provider types). To preserve the backwards compatibility for such SecretBindings, the provider controller will maintain the multiple provider types in the field (it will join them with the , separator, for example aws,gcp).\n Disable the SecretBinding provider controller and enable the SecretBindingProviderValidation feature gate of Gardener API server.\nThe SecretBindingProviderValidation feature gate of Gardener API server enables a set of validations for the SecretBinding provider field. It forbids creating a Shoot that has a different provider type from that of the referenced SecretBinding. It also enforces immutability on the field. After making sure that the SecretBinding provider controller is enabled and has populated the .provider.type field of a majority of the SecretBindings on a Gardener landscape (the SecretBindings that are unused will have their provider type unset), a Gardener landscape operator has to disable the SecretBinding provider controller and to enable the SecretBindingProviderValidation feature gate of Gardener API server. To disable the SecretBinding provider controller, set the controller.secretBindingProvider.concurrentSyncs field in the ControllerManagerConfiguration to 0.\n Implementation History Gardener v1.38: The SecretBinding resource has a new optional field .provider.type. The SecretBinding provider controller is disabled by default. The SecretBindingProviderValidation feature gate of Gardener API server is disabled by default. Gardener v1.42: The SecretBinding provider controller is enabled by default. Gardener v1.51: The SecretBindingProviderValidation feature gate of Gardener API server is enabled by default and the SecretBinding provider controller is disabled by default. 
Gardener v1.53: The SecretBindingProviderValidation feature gate of Gardener API server is unconditionally enabled (can no longer be disabled). Gardener v1.55: The SecretBindingProviderValidation feature gate of Gardener API server and the SecretBinding provider controller are removed. ","categories":"","description":"","excerpt":"SecretBinding Provider Controller This page describes the process on …","ref":"/docs/gardener/deployment/secret_binding_provider_controller/","tags":"","title":"Secret Binding Provider Controller"},{"body":"Secrets Management for Seed and Shoot Cluster The gardenlet needs to create quite some amount of credentials (certificates, private keys, passwords) for seed and shoot clusters in order to ensure secure deployments. Such credentials typically should be renewed automatically when their validity expires, rotated regularly, and they potentially need to be persisted such that they don’t get lost in case of a control plane migration or a lost seed cluster.\nSecretsManager Introduction These requirements can be covered by using the SecretsManager package maintained in pkg/utils/secrets/manager. It is built on top of the ConfigInterface and DataInterface interfaces part of pkg/utils/secrets and provides the following functions:\n Generate(context.Context, secrets.ConfigInterface, ...GenerateOption) (*corev1.Secret, error)\nThis method either retrieves the current secret for the given configuration or it (re)generates it in case the configuration changed, the signing CA changed (for certificate secrets), or when proactive rotation was triggered. If the configuration describes a certificate authority secret then this method automatically generates a bundle secret containing the current and potentially the old certificate. 
Available GenerateOptions:\n SignedByCA(string, ...SignedByCAOption): This is only valid for certificate secrets and automatically retrieves the correct certificate authority in order to sign the provided server or client certificate. There are two SignedByCAOptions: UseCurrentCA. This option will sign server certificates with the new/current CA in case of a CA rotation. For more information, please refer to the “Certificate Signing” section below. UseOldCA. This option will sign client certificates with the old CA in case of a CA rotation. For more information, please refer to the “Certificate Signing” section below. Persist(): This marks the secret such that it gets persisted in the ShootState resource in the garden cluster. Consequently, it should only be used for secrets related to a shoot cluster. Rotate(rotationStrategy): This specifies the strategy in case this secret is to be rotated or regenerated (either InPlace which immediately forgets about the old secret, or KeepOld which keeps the old secret in the system). IgnoreOldSecrets(): This specifies that old secrets should not be considered and loaded (contrary to the default behavior). It should be used when old secrets are no longer important and can be “forgotten” (e.g. in “phase 2” (t2) of the CA certificate rotation). Such old secrets will be deleted on Cleanup(). IgnoreOldSecretsAfter(time.Duration): This specifies that old secrets should not be considered and loaded once a given duration after rotation has passed. It can be used to clean up old secrets after automatic rotation (e.g. the Seed cluster CA is automatically rotated when its validity will soon end and the old CA will be cleaned up 24 hours after triggering the rotation). Validity(time.Duration): This specifies how long the secret should be valid. For certificate secret configurations, the manager will automatically deduce this information from the generated certificate. 
RenewAfterValidityPercentage(int): This specifies the percentage of validity for renewal. The secret will be renewed based on whichever comes first: The specified percentage of validity or 10 days before end of validity. If not specified, the default percentage is 80. Get(string, ...GetOption) (*corev1.Secret, bool)\nThis method retrieves the current secret for the given name. In case the secret in question is a certificate authority secret then it retrieves the bundle secret by default. It is important that this method only knows about secrets for which there were prior Generate calls. Available GetOptions:\n Bundle (default): This retrieves the bundle secret. Current: This retrieves the current secret. Old: This retrieves the old secret. Cleanup(context.Context) error\nThis method deletes secrets which are no longer required. No longer required secrets are those still existing in the system which weren’t detected by prior Generate calls. Consequently, only call Cleanup after you have executed Generate calls for all desired secrets.\n Some exemplary usages would look as follows:\nsecret, err := k.secretsManager.Generate( ctx, \u0026secrets.CertificateSecretConfig{ Name: \"my-server-secret\", CommonName: \"server-abc\", DNSNames: []string{\"first-name\", \"second-name\"}, CertType: secrets.ServerCert, SkipPublishingCACertificate: true, }, secretsmanager.SignedByCA(\"my-ca\"), secretsmanager.Persist(), secretsmanager.Rotate(secretsmanager.InPlace), ) if err != nil { return err } As explained above, the caller does not need to care about the renewal, rotation or the persistence of this secret - all of these concerns are handled by the secrets manager. 
Automatic renewal of secrets happens when their validity approaches 80% or less than 10d are left until expiration.\nIn case a CA certificate is needed by some component, then it can be retrieved as follows:\ncaSecret, found := k.secretsManager.Get(\"my-ca\") if !found { return fmt.Errorf(\"secret my-ca not found\") } As explained above, this returns the bundle secret for the CA my-ca which might potentially contain both the current and the old CA (in case of rotation/regeneration).\nCertificate Signing Default Behaviour By default, client certificates are signed by the current CA while server certificates are signed by the old CA (if it exists). This is to ensure a smooth exchange of certificates during a CA rotation (which typically has two phases, see GEP-18):\n Client certificates: In phase 1, clients get new certificates as soon as possible to ensure that all clients have been adapted before phase 2. In phase 2, the respective server stops accepting certificates signed by the old CA. Server certificates: In phase 1, servers still use their old/existing certificates to allow clients to update their CA bundle used for verification of the servers’ certificates. In phase 2, the old CA is dropped, hence servers need to get a certificate signed by the new/current CA. At this point in time, clients have already adapted their CA bundles. Alternative: Sign Server Certificates with Current CA In case you control all clients and update them at the same time as the server, it is possible to make the secrets manager generate even server certificates with the new/current CA. This can help to prevent certificate mismatches when the CA bundle is already exchanged while the server still serves with a certificate signed by a CA no longer part of the bundle.\nLet’s consider the two following examples:\n gardenlet deploys a webhook server (gardener-resource-manager) and a corresponding MutatingWebhookConfiguration at the same time. 
In this case, the server certificate should be generated with the new/current CA to avoid above mentioned certificate mismatches during a CA rotation. gardenlet deploys a server (etcd) in one step, and a client (kube-apiserver) in a subsequent step. In this case, the default behaviour should apply (server certificate should be signed by old/existing CA). Alternative: Sign Client Certificate with Old CA In the unusual case where the client is deployed before the server, it might be useful to always use the old CA for signing the client’s certificate. This can help to prevent certificate mismatches when the client already gets a new certificate while the server still only accepts certificates signed by the old CA.\nLet’s consider the two following examples:\n gardenlet deploys the kube-apiserver before the kubelet. However, the kube-apiserver has a client certificate signed by the ca-kubelet in order to communicate with it (e.g., when retrieving logs or forwarding ports). In this case, the client certificate should be generated with the old CA to avoid above mentioned certificate mismatches during a CA rotation. gardenlet deploys a server (etcd) in one step, and a client (kube-apiserver) in a subsequent step. In this case, the default behaviour should apply (client certificate should be signed by new/current CA). Reusing the SecretsManager in Other Components While the SecretsManager is primarily used by gardenlet, it can be reused by other components (e.g. extensions) as well for managing secrets that are specific to the component or extension. For example, provider extensions might use their own SecretsManager instance for managing the serving certificate of cloud-controller-manager.\nExternal components that want to reuse the SecretsManager should consider the following aspects:\n On initialization of a SecretsManager, pass an identity specific to the component, controller and purpose. 
For example, gardenlet’s shoot controller uses gardenlet as the SecretsManager’s identity, the Worker controller in provider-foo should use provider-foo-worker, and the ControlPlane controller should use provider-foo-controlplane-exposure for ControlPlane objects of purpose exposure. The given identity is added as a value for the manager-identity label on managed Secrets. This label is used by the Cleanup function to select only those Secrets that are actually managed by the particular SecretsManager instance. This is done to prevent removing still needed Secrets that are managed by other instances. Generate dedicated CAs for signing certificates instead of depending on CAs managed by gardenlet. Names of Secrets managed by external SecretsManager instances must not conflict with Secret names from other instances (e.g. gardenlet). For CAs that should be rotated in lock-step with the Shoot CAs managed by gardenlet, components need to pass information about the last rotation initiation time and the current rotation phase to the SecretsManager upon initialization. The relevant information can be retrieved from the Cluster resource under .spec.shoot.status.credentials.rotation.certificateAuthorities. Independent of the specific identity, secrets marked with the Persist option are automatically saved in the ShootState resource by the gardenlet and are also restored by the gardenlet on Control Plane Migration to the new Seed. Migrating Existing Secrets To SecretsManager If you already have existing secrets which were not created with SecretsManager, then you can (optionally) migrate them by labeling them with secrets-manager-use-data-for-name=\u003cconfig-name\u003e. For example, if your SecretsManager generates a secret from a CertificateSecretConfig with name foo like this\nsecret, err := k.secretsManager.Generate( ctx, \u0026secrets.CertificateSecretConfig{ Name: \"foo\", // ... 
}, ) and you already have an existing secret in your system whose data should be kept instead of regenerated, then labeling it with secrets-manager-use-data-for-name=foo will instruct SecretsManager accordingly.\n⚠️ Caveat: You have to make sure that the existing data keys match with what SecretsManager uses:\n Secret Type Data Keys Basic Auth username, password, auth CA Certificate ca.crt, ca.key Non-CA Certificate tls.crt, tls.key Control Plane Secret ca.crt, username, password, token, kubeconfig ETCD Encryption Key key, secret Kubeconfig kubeconfig RSA Private Key id_rsa, id_rsa.pub Static Token static_tokens.csv VPN TLS Auth vpn.tlsauth Implementation Details The source of truth for the secrets manager is the list of Secrets in the Kubernetes cluster it acts upon (typically, the seed cluster). The persisted secrets in the ShootState are only used if and only if the shoot is in the Restore phase - in this case all secrets are just synced to the seed cluster so that they can be picked up by the secrets manager.\nIn order to prevent kubelets from unneeded watches (thus, causing some significant traffic against the kube-apiserver), the Secrets are marked as immutable. Consequently, they have a unique, deterministic name which is computed as follows:\n For CA secrets, the name is just exactly the name specified in the configuration (e.g., ca). This is for backwards-compatibility and will be dropped in a future release once all components depending on the static name have been adapted. For all other secrets, the name specified in the configuration is used as prefix followed by an 8-digit hash. This hash is computed out of the checksum of the secret configuration and the checksum of the certificate of the signing CA (only for certificate configurations). 
In all cases, the name of the secrets is suffixed with a 5-digit hash computed out of the time when the rotation for this secret was last started.\n","categories":"","description":"","excerpt":"Secrets Management for Seed and Shoot Cluster The gardenlet needs to …","ref":"/docs/gardener/secrets_management/","tags":"","title":"Secrets Management"},{"body":"Packages:\n security.gardener.cloud/v1alpha1 security.gardener.cloud/v1alpha1 Package v1alpha1 is a version of the API.\nResource Types: CredentialsBinding WorkloadIdentity CredentialsBinding CredentialsBinding represents a binding to credentials in the same or another namespace.\n Field Description apiVersion string security.gardener.cloud/v1alpha1 kind string CredentialsBinding metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. provider CredentialsBindingProvider Provider defines the provider type of the CredentialsBinding. This field is immutable.\n credentialsRef Kubernetes core/v1.ObjectReference CredentialsRef is a reference to a resource holding the credentials. Accepted resources are core/v1.Secret and security.gardener.cloud/v1alpha1.WorkloadIdentity. This field is immutable.\n quotas []Kubernetes core/v1.ObjectReference (Optional) Quotas is a list of references to Quota objects in the same or another namespace. This field is immutable.\n WorkloadIdentity WorkloadIdentity is a resource that allows workloads to be presented before external systems by giving them identities managed by the Gardener API server. The identity of such a workload is represented by a JSON Web Token issued by the Gardener API server. 
Workload identities are designed to be used by components running in the Gardener environment, seed or runtime cluster, that make use of identity federation inspired by the OIDC protocol.\n Field Description apiVersion string security.gardener.cloud/v1alpha1 kind string WorkloadIdentity metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec WorkloadIdentitySpec Spec configures the JSON Web Token issued by the Gardener API server.\n audiences []string Audiences specify the list of recipients that the JWT is intended for. The values of this field will be set in the ‘aud’ claim.\n targetSystem TargetSystem TargetSystem represents specific configurations for the system that will accept the JWTs.\n status WorkloadIdentityStatus Status contains the latest observed status of the WorkloadIdentity.\n ContextObject (Appears on: TokenRequestSpec) ContextObject identifies the object the token is requested for.\n Field Description kind string Kind of the object the token is requested for. 
Valid kinds are ‘Shoot’, ‘Seed’, etc.\n apiVersion string API version of the object the token is requested for.\n name string Name of the object the token is requested for.\n namespace string (Optional) Namespace of the object the token is requested for.\n uid k8s.io/apimachinery/pkg/types.UID UID of the object the token is requested for.\n CredentialsBindingProvider (Appears on: CredentialsBinding) CredentialsBindingProvider defines the provider type of the CredentialsBinding.\n Field Description type string Type is the type of the provider.\n TargetSystem (Appears on: WorkloadIdentitySpec) TargetSystem represents specific configurations for the system that will accept the JWTs.\n Field Description type string Type is the type of the target system.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to extension resource.\n TokenRequest TokenRequest is a resource that is used to request WorkloadIdentity tokens.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec TokenRequestSpec Spec holds configuration settings for the requested token.\n contextObject ContextObject (Optional) ContextObject identifies the object the token is requested for.\n expirationSeconds int64 (Optional) ExpirationSeconds specifies for how long the requested token should be valid.\n status TokenRequestStatus Status bears the issued token with additional information back to the client.\n TokenRequestSpec (Appears on: TokenRequest) TokenRequestSpec holds configuration settings for the requested token.\n Field Description contextObject ContextObject (Optional) ContextObject identifies the object the token is requested for.\n expirationSeconds int64 (Optional) ExpirationSeconds specifies for how long the requested token should be valid.\n TokenRequestStatus (Appears on: TokenRequest) TokenRequestStatus bears the issued token with additional information back to the client.\n Field Description token string Token is the issued token.\n expirationTimestamp Kubernetes meta/v1.Time ExpirationTimestamp is the time of expiration of the returned token.\n WorkloadIdentitySpec (Appears on: WorkloadIdentity) WorkloadIdentitySpec configures the JSON Web Token issued by the Gardener API server.\n Field Description audiences []string Audiences specify the list of recipients that the JWT is intended for. 
The values of this field will be set in the ‘aud’ claim.\n targetSystem TargetSystem TargetSystem represents specific configurations for the system that will accept the JWTs.\n WorkloadIdentityStatus (Appears on: WorkloadIdentity) WorkloadIdentityStatus contains the latest observed status of the WorkloadIdentity.\n Field Description sub string Sub contains the computed value of the subject that is going to be set in the JWT’s ‘sub’ claim.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n security.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/security/","tags":"","title":"Security"},{"body":"Gardener Security Release Process Gardener is a growing community of volunteers and users. The Gardener community has adopted this security disclosure and response policy to ensure we responsibly handle critical issues.\nGardener Security Team Security vulnerabilities should be handled quickly and sometimes privately. The primary goal of this process is to reduce the total time users are vulnerable to publicly known exploits. The Gardener Security Team is responsible for organizing the entire response, including internal communication and external disclosure, but will need help from relevant developers and release managers to successfully run this process. The Gardener Security Team consists of the following volunteers:\n Vasu Chandrasekhara, (@vasu1124) Christian Cwienk, (@ccwienk) Donka Dimitrova, (@donistz) Claudia Hölters, (@hoeltcl) Vedran Lerenc, (@vlerenc) Dirk Marwinski, (@marwinski) Jordan Jordanov, (@jordanjordanov) Frederik Thormaehlen, (@ThormaehlenFred) Disclosures Private Disclosure Process The Gardener community asks that all suspected vulnerabilities be privately and responsibly disclosed. If you’ve found a vulnerability or a potential vulnerability in Gardener, please let us know by writing an e-mail to secure@sap.com. 
We’ll send a confirmation e-mail to acknowledge your report, and we’ll send an additional e-mail when we’ve identified the issue positively or negatively.\nPublic Disclosure Process If you know of a publicly disclosed vulnerability please IMMEDIATELY write an e-mail to secure@sap.com to inform the Gardener Security Team about the vulnerability so they may start the patch, release, and communication process.\nIf possible, the Gardener Security Team will ask the person making the public report if the issue can be handled via a private disclosure process (for example, if the full exploit details have not yet been published). If the reporter denies the request for private disclosure, the Gardener Security Team will move swiftly with the fix and release process. In extreme cases GitHub can be asked to delete the issue but this generally isn’t necessary and is unlikely to make a public disclosure less damaging.\nPatch, Release, and Public Communication For each vulnerability, a member of the Gardener Security Team will volunteer to lead coordination with the “Fix Team” and is responsible for sending disclosure e-mails to the rest of the community. This lead will be referred to as the “Fix Lead.” The role of the Fix Lead should rotate round-robin across the Gardener Security Team. Note that given the current size of the Gardener community it is likely that the Gardener Security Team is the same as the “Fix Team” (i.e., all maintainers).\nThe Gardener Security Team may decide to bring in additional contributors for added expertise depending on the area of the code that contains the vulnerability. All of the timelines below are suggestions and assume a private disclosure. The Fix Lead drives the schedule using their best judgment based on severity and development time.\nIf the Fix Lead is dealing with a public disclosure, all timelines become ASAP (assuming the vulnerability has a CVSS score \u003e= 7; see below). 
If the fix relies on another upstream project’s disclosure timeline, that will adjust the process as well. We will work with the upstream project to fit their timeline and best protect our users.\nFix Team Organization The Fix Lead will work quickly to identify relevant engineers from the affected projects and packages and CC those engineers into the disclosure thread. These selected developers are the Fix Team. The Fix Lead will give the Fix Team access to a private security repository to develop the fix.\nFix Development Process The Fix Lead and the Fix Team will create a CVSS using the CVSS Calculator. The Fix Lead makes the final call on the calculated CVSS; it is better to move quickly than make the CVSS perfect.\nThe Fix Team will notify the Fix Lead that work on the fix branch is complete once there are LGTMs on all commits in the private repository from one or more maintainers.\nIf the CVSS score is under 7.0 (a medium severity score) the Fix Team can decide to slow the release process down in the face of holidays, developer bandwidth, etc. These decisions must be discussed on the private Gardener Security mailing list.\nFix Disclosure Process With the fix development underway, the Fix Lead needs to come up with an overall communication plan for the wider community. This Disclosure process should begin after the Fix Team has developed a Fix or mitigation so that a realistic timeline can be communicated to users. The Fix Lead will inform the Gardener mailing list that a security vulnerability has been disclosed and that a fix will be made available in the future on a certain release date. The Fix Lead will include any mitigating steps users can take until a fix is available. The communication to Gardener users should be actionable. 
They should know when to block time to apply patches, understand exact mitigation steps, etc.\nFix Release Day The Release Managers will ensure all the binaries are built, publicly available, and functional before the Release Date. The Release Managers will create a new patch release branch from the latest patch release tag + the fix from the security branch. As a practical example, if v0.12.0 is the latest patch release in gardener.git, a new branch will be created called v0.12.1 which includes only patches required to fix the issue. The Fix Lead will cherry-pick the patches onto the master branch and all relevant release branches. The Fix Team will LGTM and merge. The Release Managers will merge these PRs as quickly as possible.\nChanges shouldn’t be made to the commits, even for a typo in the CHANGELOG, as this will change the git sha of the already built commits, leading to confusion and potentially conflicts as the fix is cherry-picked around branches. The Fix Lead will request a CVE from the SAP Product Security Response Team via email to cna@sap.com with all the relevant information (description, potential impact, affected version, fixed version, CVSS v3 base score, and supporting documentation for the CVSS score) for every vulnerability. The Fix Lead will inform the Gardener mailing list and announce the new releases, the CVE number (if available), the location of the binaries, and the relevant merged PRs to get wide distribution and user action.\nAs much as possible, this e-mail should be actionable and include links on how to apply the fix to users’ environments; this can include links to external distributor documentation. The recommended target time is 4pm UTC on a non-Friday weekday. This means the announcement will be seen in the morning in Pacific time, in the early evening in Europe, and in the late evening in Asia. The Fix Lead will remove the Fix Team from the private security repository.\nRetrospective These steps should be completed after the Release Date. 
The retrospective process should be blameless.\nThe Fix Lead will send a retrospective of the process to the Gardener mailing list including details on everyone involved, the timeline of the process, links to relevant PRs that introduced the issue, if relevant, and any critiques of the response and release process. The Release Managers and Fix Team are also encouraged to send their own feedback on the process to the Gardener mailing list. Honest critique is the only way we are going to get good at this as a community.\nCommunication Channel The private or public disclosure process should be triggered exclusively by writing an e-mail to secure@sap.com.\nGardener security announcements will be communicated by the Fix Lead sending an e-mail to the Gardener mailing list (reachable via gardener@googlegroups.com), as well as posting a link in the Gardener Slack channel.\nPublic discussions about Gardener security announcements and retrospectives will primarily happen in the Gardener mailing list. Thus Gardener community members who are interested in participating in discussions related to the Gardener Security Release Process are encouraged to join the Gardener mailing list (how to find and join a group).\nThe members of the Gardener Security Team are subscribed to the private Gardener Security mailing list (reachable via gardener-security@googlegroups.com).\n","categories":"","description":"","excerpt":"Gardener Security Release Process Gardener is a growing community of …","ref":"/docs/contribute/code/security-guide/","tags":"","title":"Security Release Process"},{"body":"Seed Bootstrapping Whenever the gardenlet is responsible for a new Seed resource its “seed controller” is being activated. One part of this controller’s reconciliation logic is deploying certain components into the garden namespace of the seed cluster itself. These components are required to spawn and manage control planes for shoot clusters later on. 
This document provides an overview of the actions performed during this bootstrapping phase, and it explains the rationale behind them.\nDependency Watchdog The dependency watchdog (abbreviation: DWD) is a component developed separately in the gardener/dependency-watchdog GitHub repository. Gardener is using it for two purposes:\n Prevention of melt-down situations when the load balancer used to expose the kube-apiserver of shoot clusters goes down while the kube-apiserver itself is still up and running. Fast recovery times for crash-looping pods when the pods they depend on are available again. For the sake of separating these concerns, two instances of the DWD are deployed by the seed controller.\nProber The dependency-watchdog-prober deployment is responsible for the first point mentioned above.\nThe kube-apiserver of shoot clusters is exposed via a load balancer, usually with an attached public IP, which serves as the main entry point when it comes to interaction with the shoot cluster (e.g., via kubectl). While end-users are talking to their clusters via this load balancer, other control plane components like the kube-controller-manager or kube-scheduler run in the same namespace/same cluster, so they can communicate via the in-cluster Service directly instead of using the detour with the load balancer. However, the worker nodes of shoot clusters run in isolated, distinct networks. This means that the kubelets and kube-proxys also have to talk to the control plane via the load balancer.\nThe kube-controller-manager has a special control loop called nodelifecycle which will set the status of Nodes to NotReady in case the kubelet stops regularly renewing its lease/sending its heartbeat. This will trigger other self-healing capabilities of Kubernetes, for example, the eviction of pods from such “unready” nodes to healthy nodes. 
Similarly, the cloud-controller-manager has a control loop that will disconnect load balancers from “unready” nodes, i.e., such workload would no longer be accessible until moved to a healthy node. Furthermore, the machine-controller-manager removes “unready” nodes after health-timeout (default 10min).\nWhile these are awesome Kubernetes features on their own, they have a dangerous drawback when applied in the context of Gardener’s architecture: When the kube-apiserver load balancer fails for whatever reason, then the kubelets can’t talk to the kube-apiserver to renew their lease anymore. After a minute or so the kube-controller-manager will get the impression that all nodes have died and will mark them as NotReady. This will trigger the above-mentioned eviction as well as the detachment of load balancers. As a result, the customer’s workload will go down and become unreachable.\nThis is exactly the situation that the DWD prevents: It regularly tries to talk to the kube-apiservers of the shoot clusters, once by using their load balancer, and once by talking via the in-cluster Service. If it detects that the kube-apiserver is reachable internally but not externally, it scales down machine-controller-manager, cluster-autoscaler (if enabled) and kube-controller-manager to 0. This prevents the kube-controller-manager from marking the shoot worker nodes as “unready”. It also prevents the machine-controller-manager from deleting potentially healthy nodes. As soon as the kube-apiserver is reachable externally again, kube-controller-manager, machine-controller-manager and cluster-autoscaler are restored to the state prior to scale-down.\nWeeder The dependency-watchdog-weeder deployment is responsible for the second point mentioned above.\nKubernetes is restarting failing pods with an exponentially increasing backoff time. 
While this is a great strategy to prevent system overloads, it has the disadvantage that the delay between restarts increases to multiple minutes very fast.\nIn the Gardener context, we are deploying many components that are depending on other components. For example, the kube-apiserver is depending on a running etcd, or the kube-controller-manager and kube-scheduler are depending on a running kube-apiserver. In case such a “higher-level” component fails for whatever reason, the dependent pods will fail and end up in crash-loops. As Kubernetes does not know anything about these hierarchies, it won’t recognize that such pods can be restarted faster as soon as their dependencies are up and running again.\nThis is exactly the situation in which the DWD will become active: If it detects that a certain Service is available again (e.g., after the etcd was temporarily down while being moved to another seed node), then DWD will restart all crash-looping dependent pods. These dependent pods are detected via a pre-configured label selector.\nAs of today, the DWD is configured to restart a crash-looping kube-apiserver after etcd became available again, or any pod depending on the kube-apiserver that has a gardener.cloud/role=controlplane label (e.g., kube-controller-manager, kube-scheduler).\n","categories":"","description":"","excerpt":"Seed Bootstrapping Whenever the gardenlet is responsible for a new …","ref":"/docs/gardener/seed_bootstrapping/","tags":"","title":"Seed Bootstrapping"},{"body":"Settings for Seeds The Seed resource offers a few settings that are used to control the behaviour of certain Gardener components. This document provides an overview over the available settings:\nDependency Watchdog Gardenlet can deploy two instances of the dependency-watchdog into the garden namespace of the seed cluster. 
One instance only activates the weeder while the second instance only activates the prober.\nWeeder The weeder helps to alleviate the delay where control plane components remain unavailable by finding the respective pods in CrashLoopBackoff status and restarting them once their dependencies become ready and available again. For example, if etcd goes down then also kube-apiserver goes down (and into a CrashLoopBackoff state). If etcd comes up again then (without the weeder) it might take some time until kube-apiserver gets restarted as well.\n⚠️ .spec.settings.dependencyWatchdog.endpoint.enabled is deprecated and will be removed in a future version of Gardener. Use .spec.settings.dependencyWatchdog.weeder.enabled instead.\nIt can be enabled/disabled via the .spec.settings.dependencyWatchdog.weeder.enabled field. It defaults to true.\nProber The prober scales down the kube-controller-manager of shoot clusters in case their respective kube-apiserver is not reachable via its external ingress. This is in order to avoid melt-down situations, since the kube-controller-manager uses in-cluster communication when talking to the kube-apiserver, i.e., it wouldn’t be affected if the external access to the kube-apiserver is interrupted for whatever reason. The kubelets on the shoot worker nodes, however, would indeed be affected since they typically run in different networks and use the external ingress when talking to the kube-apiserver. Hence, without scaling down kube-controller-manager, the nodes might be marked as NotReady and eventually replaced (since the kubelets cannot report their status anymore). To prevent such unnecessary turbulence, kube-controller-manager is being scaled down until the external ingress becomes available again. 
In addition, as a precautionary measure, machine-controller-manager is also scaled down, along with cluster-autoscaler which depends on machine-controller-manager.\n⚠️ .spec.settings.dependencyWatchdog.probe.enabled is deprecated and will be removed in a future version of Gardener. Use .spec.settings.dependencyWatchdog.prober.enabled instead.\nIt can be enabled/disabled via the .spec.settings.dependencyWatchdog.prober.enabled field. It defaults to true.\nReserve Excess Capacity If the excess capacity reservation is enabled, then the gardenlet will deploy a special Deployment into the garden namespace of the seed cluster. This Deployment’s pod template has only one container, the pause container, which simply runs in an infinite loop. The priority of the deployment is very low, so any other pod will preempt these pause pods. This is especially useful if new shoot control planes are created in the seed. In case the seed cluster runs at its capacity, then there is no waiting time required during the scale-up. Instead, the low-priority pause pods will be preempted and allow newly created shoot control plane pods to be scheduled fast. In the meantime, the cluster-autoscaler will trigger the scale-up because the preempted pause pods want to run again. However, this delay doesn’t affect the important shoot control plane pods, which will improve the user experience.\nUse .spec.settings.excessCapacityReservation.configs to create excess capacity reservation deployments which allow specifying custom values for resources, nodeSelector and tolerations. Each config creates a deployment with a minimum number of 2 replicas and a maximum equal to the number of zones configured for this seed. It defaults to a config reserving 2 CPUs and 6Gi of memory for each pod with no nodeSelector and no tolerations.\nExcess capacity reservation is enabled when .spec.settings.excessCapacityReservation.enabled is true or not specified while configs are present. 
It can be disabled by setting the field to false.\nScheduling By default, the Gardener Scheduler will consider all seed clusters when a new shoot cluster shall be created. However, administrators/operators might want to exclude some of them from being considered by the scheduler. Therefore, seed clusters can be marked as “invisible”. In this case, the scheduler simply ignores them as if they wouldn’t exist. Shoots can still use the invisible seed but only by explicitly specifying the name in their .spec.seedName field.\nSeed clusters can be marked visible/invisible via the .spec.settings.scheduling.visible field. It defaults to true.\nℹ️ In previous Gardener versions (\u003c 1.5) these settings were controlled via taint keys (seed.gardener.cloud/{disable-capacity-reservation,invisible}). The taint keys are no longer supported and removed in version 1.12. The rationale behind it is the implementation of tolerations similar to Kubernetes tolerations. More information about it can be found in #2193.\nLoad Balancer Services Gardener creates certain Kubernetes Service objects of type LoadBalancer in the seed cluster. Most prominently, they are used for exposing the shoot control planes, namely the kube-apiserver of the shoot clusters. In most cases, the cloud-controller-manager (responsible for managing these load balancers on the respective underlying infrastructure) supports certain customization and settings via annotations. This document provides a good overview and many examples.\nBy setting the .spec.settings.loadBalancerServices.annotations field the Gardener administrator can specify a list of annotations, which will be injected into the Services of type LoadBalancer.\nExternal Traffic Policy Setting the external traffic policy to Local can be beneficial as it preserves the source IP address of client requests. In addition to that, it removes one hop in the data path and hence reduces request latency. 
On some cloud infrastructures, it can furthermore be used in conjunction with Service annotations as described above to prevent cross-zonal traffic from the load balancer to the backend pod.\nThe default external traffic policy is Cluster, meaning that all traffic from the load balancer will be sent to any cluster node, which then itself will redirect the traffic to the actual receiving pod. This approach adds a node to the data path, may cross the zone boundaries twice, and replaces the source IP with one of the cluster nodes.\nUsing external traffic policy Local drops the additional node, i.e., only cluster nodes with corresponding backend pods will be in the list of backends of the load balancer. However, this has multiple implications. The health check port in this scenario is exposed by kube-proxy, i.e., if kube-proxy is not working on a node a corresponding pod on the node will not receive traffic from the load balancer as the load balancer will see a failing health check. (This is quite different from ordinary service routing where kube-proxy is only responsible for setup, but does not need to run for its operation.) Furthermore, load balancing may become imbalanced if multiple pods run on the same node because load balancers will split the load equally among the nodes and not among the pods. This is mitigated by corresponding node anti affinities.\nOperators need to take these implications into account when considering switching external traffic policy to Local.\nProxy Protocol Traditionally, the client IP address can be used for security filtering measures, e.g. IP allow listing. However, for this to have any usefulness, the client IP address needs to be correctly transferred to the filtering entity.\nLoad balancers can either act transparently and simply pass the client IP on, or they terminate one connection and forward data on a new connection. The latter (non-transparent) approach requires a separate way to propagate the client IP address. 
Common approaches are an HTTP header for TLS terminating load balancers or (HA) proxy protocol.\nFor level 3 load balancers, (HA) proxy protocol is the default way to preserve client IP addresses. As it prepends a small proxy protocol header before the actual workload data, the receiving server needs to be aware of it and handle it properly. This means that activating proxy protocol needs to happen on both load balancer and receiving server at/around the same time, as otherwise the receiving server will incorrectly interpret data as workload/proxy protocol header.\nFor disruption-free migration to proxy protocol, set .spec.settings.loadBalancerServices.proxyProtocol.allow to true. The migration path should be to enable the option and shortly thereafter also enable proxy protocol on the load balancer with infrastructure-specific means, e.g. a corresponding load balancer annotation.\nWhen switching back from use of proxy protocol to no use of it, use the inverse order, i.e. disable proxy protocol first on the load balancer before disabling .spec.settings.loadBalancerServices.proxyProtocol.allow.\nZone-Specific Settings In case a seed cluster is configured to use multiple zones via .spec.provider.zones, it may be necessary to configure the load balancers in individual zones in different ways, e.g., by utilizing different annotations. One reason may be to reduce cross-zonal traffic and have zone-specific load balancers in place. Zone-specific load balancers may then be bound to zone-specific subnets or availability zones in the cloud infrastructure.\nBesides the load balancer annotations, it is also possible to set proxy protocol termination and the external traffic policy for each zone-specific load balancer individually.\nVertical Pod Autoscaler Gardener heavily relies on the Kubernetes vertical-pod-autoscaler component. By default, the seed controller deploys the VPA components into the garden namespace of the respective seed clusters. 
In case you want to manage the VPA deployment on your own or have a custom one, then you might want to disable the automatic deployment by Gardener. Otherwise, you might end up with two VPAs, which will cause erratic behaviour. By setting .spec.settings.verticalPodAutoscaler.enabled=false, you can disable the automatic deployment.\n⚠️ In any case, there must be a VPA available for your seed cluster. Using a seed without VPA is not supported.\nVPA Pitfall: Excessive Resource Requests Making Pod Unschedulable VPA is unaware of node capacity, and can increase the resource requests of a pod beyond the capacity of any single node. Such a pod is likely to become permanently unschedulable. That problem can be partly mitigated by using the VerticalPodAutoscaler.Spec.ResourcePolicy.ContainerPolicies[].MaxAllowed field to constrain pod resource requests to the level of nodes’ allocatable resources. The downside is that a pod constrained in such fashion would be using more resources than it has requested, and can starve for resources and/or negatively impact neighbour pods with which it is sharing a node.\nAs an alternative, in scenarios where MaxAllowed is not set, it is important to maintain a worker pool which can accommodate the highest level of resources that VPA would actually request for the pods it controls.\nFinally, the optimal strategy typically is to both ensure large enough worker pools, and, as an insurance, use MaxAllowed aligned with the allocatable resources of the largest worker.\nTopology-Aware Traffic Routing Refer to the Topology-Aware Traffic Routing documentation as this document contains the documentation for the topology-aware routing Seed setting.\n","categories":"","description":"","excerpt":"Settings for Seeds The Seed resource offers a few settings that are …","ref":"/docs/gardener/seed_settings/","tags":"","title":"Seed Settings"},{"body":"Packages:\n seedmanagement.gardener.cloud/v1alpha1 seedmanagement.gardener.cloud/v1alpha1 Package v1alpha1 
is a version of the API.\nResource Types: Gardenlet ManagedSeed ManagedSeedSet Gardenlet Gardenlet represents a Gardenlet configuration for an unmanaged seed.\n Field Description apiVersion string seedmanagement.gardener.cloud/v1alpha1 kind string Gardenlet metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec GardenletSpec (Optional) Specification of the Gardenlet.\n deployment GardenletSelfDeployment Deployment specifies certain gardenlet deployment parameters, such as the number of replicas, the image, etc.\n config k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) Config is the GardenletConfiguration used to configure gardenlet.\n kubeconfigSecretRef Kubernetes core/v1.LocalObjectReference (Optional) KubeconfigSecretRef is a reference to a secret containing a kubeconfig for the cluster to which gardenlet should be deployed. This is only used by gardener-operator for a very first gardenlet deployment. After that, gardenlet will continuously upgrade itself. If this field is empty, gardener-operator deploys it into its own runtime cluster.\n status GardenletStatus (Optional) Most recently observed status of the Gardenlet.\n ManagedSeed ManagedSeed represents a Shoot that is registered as Seed.\n Field Description apiVersion string seedmanagement.gardener.cloud/v1alpha1 kind string ManagedSeed metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ManagedSeedSpec (Optional) Specification of the ManagedSeed.\n shoot Shoot (Optional) Shoot references a Shoot that should be registered as Seed. 
This field is immutable.\n gardenlet GardenletConfig (Optional) Gardenlet specifies that the ManagedSeed controller should deploy a gardenlet into the cluster with the given deployment parameters and GardenletConfiguration.\n status ManagedSeedStatus (Optional) Most recently observed status of the ManagedSeed.\n ManagedSeedSet ManagedSeedSet represents a set of identical ManagedSeeds.\n Field Description apiVersion string seedmanagement.gardener.cloud/v1alpha1 kind string ManagedSeedSet metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ManagedSeedSetSpec (Optional) Spec defines the desired identities of ManagedSeeds and Shoots in this set.\n replicas int32 (Optional) Replicas is the desired number of replicas of the given Template. Defaults to 1.\n selector Kubernetes meta/v1.LabelSelector Selector is a label query over ManagedSeeds and Shoots that should match the replica count. It must match the ManagedSeeds and Shoots template’s labels. This field is immutable.\n template ManagedSeedTemplate Template describes the ManagedSeed that will be created if insufficient replicas are detected. Each ManagedSeed created / updated by the ManagedSeedSet will fulfill this template.\n shootTemplate github.com/gardener/gardener/pkg/apis/core/v1beta1.ShootTemplate ShootTemplate describes the Shoot that will be created if insufficient replicas are detected for hosting the corresponding ManagedSeed. Each Shoot created / updated by the ManagedSeedSet will fulfill this template.\n updateStrategy UpdateStrategy (Optional) UpdateStrategy specifies the UpdateStrategy that will be employed to update ManagedSeeds / Shoots in the ManagedSeedSet when a revision is made to Template / ShootTemplate.\n revisionHistoryLimit int32 (Optional) RevisionHistoryLimit is the maximum number of revisions that will be maintained in the ManagedSeedSet’s revision history. Defaults to 10. 
This field is immutable.\n status ManagedSeedSetStatus (Optional) Status is the current status of ManagedSeeds and Shoots in this ManagedSeedSet.\n Bootstrap (string alias)\n (Appears on: GardenletConfig) Bootstrap describes a mechanism for bootstrapping gardenlet connection to the Garden cluster.\nGardenletConfig (Appears on: ManagedSeedSpec) GardenletConfig specifies gardenlet deployment parameters and the GardenletConfiguration used to configure gardenlet.\n Field Description deployment GardenletDeployment (Optional) Deployment specifies certain gardenlet deployment parameters, such as the number of replicas, the image, etc.\n config k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) Config is the GardenletConfiguration used to configure gardenlet.\n bootstrap Bootstrap (Optional) Bootstrap is the mechanism that should be used for bootstrapping gardenlet connection to the Garden cluster. One of ServiceAccount, BootstrapToken, None. If set to ServiceAccount or BootstrapToken, a service account or a bootstrap token will be created in the garden cluster and used to compute the bootstrap kubeconfig. If set to None, the gardenClientConnection.kubeconfig field will be used to connect to the Garden cluster. Defaults to BootstrapToken. This field is immutable.\n mergeWithParent bool (Optional) MergeWithParent specifies whether the GardenletConfiguration of the parent gardenlet should be merged with the specified GardenletConfiguration. Defaults to true. This field is immutable.\n GardenletDeployment (Appears on: GardenletConfig, GardenletSelfDeployment) GardenletDeployment specifies certain gardenlet deployment parameters, such as the number of replicas, the image, etc.\n Field Description replicaCount int32 (Optional) ReplicaCount is the number of gardenlet replicas. Defaults to 2.\n revisionHistoryLimit int32 (Optional) RevisionHistoryLimit is the number of old gardenlet ReplicaSets to retain to allow rollback. 
Defaults to 2.\n serviceAccountName string (Optional) ServiceAccountName is the name of the ServiceAccount to use to run gardenlet pods.\n image Image (Optional) Image is the gardenlet container image.\n resources Kubernetes core/v1.ResourceRequirements (Optional) Resources are the compute resources required by the gardenlet container.\n podLabels map[string]string (Optional) PodLabels are the labels on gardenlet pods.\n podAnnotations map[string]string (Optional) PodAnnotations are the annotations on gardenlet pods.\n additionalVolumes []Kubernetes core/v1.Volume (Optional) AdditionalVolumes is the list of additional volumes that should be mounted by gardenlet containers.\n additionalVolumeMounts []Kubernetes core/v1.VolumeMount (Optional) AdditionalVolumeMounts is the list of additional pod volumes to mount into the gardenlet container’s filesystem.\n env []Kubernetes core/v1.EnvVar (Optional) Env is the list of environment variables to set in the gardenlet container.\n vpa bool (Optional) VPA specifies whether to enable VPA for gardenlet. Defaults to true.\nDeprecated: This field is deprecated and has no effect anymore. It will be removed in the future. TODO(rfranzke): Remove this field after v1.110 has been released.\n GardenletHelm (Appears on: GardenletSelfDeployment) GardenletHelm is the Helm deployment configuration for gardenlet.\n Field Description ociRepository github.com/gardener/gardener/pkg/apis/core/v1.OCIRepository OCIRepository defines where to pull the chart.\n GardenletSelfDeployment (Appears on: GardenletSpec) GardenletSelfDeployment specifies certain gardenlet deployment parameters, such as the number of replicas, the image, etc.\n Field Description GardenletDeployment GardenletDeployment (Members of GardenletDeployment are embedded into this type.) 
(Optional) GardenletDeployment specifies common gardenlet deployment parameters.\n helm GardenletHelm Helm is the Helm deployment configuration.\n imageVectorOverwrite string (Optional) ImageVectorOverwrite is the image vector overwrite for the components deployed by this gardenlet.\n componentImageVectorOverwrite string (Optional) ComponentImageVectorOverwrite is the component image vector overwrite for the components deployed by this gardenlet.\n GardenletSpec (Appears on: Gardenlet) GardenletSpec specifies gardenlet deployment parameters and the configuration used to configure gardenlet.\n Field Description deployment GardenletSelfDeployment Deployment specifies certain gardenlet deployment parameters, such as the number of replicas, the image, etc.\n config k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) Config is the GardenletConfiguration used to configure gardenlet.\n kubeconfigSecretRef Kubernetes core/v1.LocalObjectReference (Optional) KubeconfigSecretRef is a reference to a secret containing a kubeconfig for the cluster to which gardenlet should be deployed. This is only used by gardener-operator for a very first gardenlet deployment. After that, gardenlet will continuously upgrade itself. If this field is empty, gardener-operator deploys it into its own runtime cluster.\n GardenletStatus (Appears on: Gardenlet) GardenletStatus is the status of a Gardenlet.\n Field Description conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of a Gardenlet’s current state.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this Gardenlet. 
It corresponds to the Gardenlet’s generation, which is updated on mutation by the API Server.\n Image (Appears on: GardenletDeployment) Image specifies container image parameters.\n Field Description repository string (Optional) Repository is the image repository.\n tag string (Optional) Tag is the image tag.\n pullPolicy Kubernetes core/v1.PullPolicy (Optional) PullPolicy is the image pull policy. One of Always, Never, IfNotPresent. Defaults to Always if latest tag is specified, or IfNotPresent otherwise.\n ManagedSeedSetSpec (Appears on: ManagedSeedSet) ManagedSeedSetSpec is the specification of a ManagedSeedSet.\n Field Description replicas int32 (Optional) Replicas is the desired number of replicas of the given Template. Defaults to 1.\n selector Kubernetes meta/v1.LabelSelector Selector is a label query over ManagedSeeds and Shoots that should match the replica count. It must match the ManagedSeeds and Shoots template’s labels. This field is immutable.\n template ManagedSeedTemplate Template describes the ManagedSeed that will be created if insufficient replicas are detected. Each ManagedSeed created / updated by the ManagedSeedSet will fulfill this template.\n shootTemplate github.com/gardener/gardener/pkg/apis/core/v1beta1.ShootTemplate ShootTemplate describes the Shoot that will be created if insufficient replicas are detected for hosting the corresponding ManagedSeed. Each Shoot created / updated by the ManagedSeedSet will fulfill this template.\n updateStrategy UpdateStrategy (Optional) UpdateStrategy specifies the UpdateStrategy that will be employed to update ManagedSeeds / Shoots in the ManagedSeedSet when a revision is made to Template / ShootTemplate.\n revisionHistoryLimit int32 (Optional) RevisionHistoryLimit is the maximum number of revisions that will be maintained in the ManagedSeedSet’s revision history. Defaults to 10. 
This field is immutable.\n ManagedSeedSetStatus (Appears on: ManagedSeedSet) ManagedSeedSetStatus represents the current state of a ManagedSeedSet.\n Field Description observedGeneration int64 ObservedGeneration is the most recent generation observed for this ManagedSeedSet. It corresponds to the ManagedSeedSet’s generation, which is updated on mutation by the API Server.\n replicas int32 Replicas is the number of replicas (ManagedSeeds and their corresponding Shoots) created by the ManagedSeedSet controller.\n readyReplicas int32 ReadyReplicas is the number of ManagedSeeds created by the ManagedSeedSet controller that have a Ready Condition.\n nextReplicaNumber int32 NextReplicaNumber is the ordinal number that will be assigned to the next replica of the ManagedSeedSet.\n currentReplicas int32 CurrentReplicas is the number of ManagedSeeds created by the ManagedSeedSet controller from the ManagedSeedSet version indicated by CurrentRevision.\n updatedReplicas int32 UpdatedReplicas is the number of ManagedSeeds created by the ManagedSeedSet controller from the ManagedSeedSet version indicated by UpdateRevision.\n currentRevision string CurrentRevision, if not empty, indicates the version of the ManagedSeedSet used to generate ManagedSeeds with smaller ordinal numbers during updates.\n updateRevision string UpdateRevision, if not empty, indicates the version of the ManagedSeedSet used to generate ManagedSeeds with larger ordinal numbers during updates\n collisionCount int32 (Optional) CollisionCount is the count of hash collisions for the ManagedSeedSet. 
The ManagedSeedSet controller uses this field as a collision avoidance mechanism when it needs to create the name for the newest ControllerRevision.\n conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of a ManagedSeedSet’s current state.\n pendingReplica PendingReplica (Optional) PendingReplica, if not empty, indicates the replica that is currently pending creation, update, or deletion. This replica is in a state that requires the controller to wait for it to change before advancing to the next replica.\n ManagedSeedSpec (Appears on: ManagedSeed, ManagedSeedTemplate) ManagedSeedSpec is the specification of a ManagedSeed.\n Field Description shoot Shoot (Optional) Shoot references a Shoot that should be registered as Seed. This field is immutable.\n gardenlet GardenletConfig (Optional) Gardenlet specifies that the ManagedSeed controller should deploy a gardenlet into the cluster with the given deployment parameters and GardenletConfiguration.\n ManagedSeedStatus (Appears on: ManagedSeed) ManagedSeedStatus is the status of a ManagedSeed.\n Field Description conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of a ManagedSeed’s current state.\n observedGeneration int64 ObservedGeneration is the most recent generation observed for this ManagedSeed. It corresponds to the ManagedSeed’s generation, which is updated on mutation by the API Server.\n ManagedSeedTemplate (Appears on: ManagedSeedSetSpec) ManagedSeedTemplate is a template for creating a ManagedSeed object.\n Field Description metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec ManagedSeedSpec (Optional) Specification of the desired behavior of the ManagedSeed.\n shoot Shoot (Optional) Shoot references a Shoot that should be registered as Seed. This field is immutable.\n gardenlet GardenletConfig (Optional) Gardenlet specifies that the ManagedSeed controller should deploy a gardenlet into the cluster with the given deployment parameters and GardenletConfiguration.\n PendingReplica (Appears on: ManagedSeedSetStatus) PendingReplica contains information about a replica that is currently pending creation, update, or deletion.\n Field Description name string Name is the replica name.\n reason PendingReplicaReason Reason is the reason for the replica to be pending.\n since Kubernetes meta/v1.Time Since is the moment in time since the replica is pending with the specified reason.\n retries int32 (Optional) Retries is the number of times the shoot operation (reconcile or delete) has been retried after having failed. Only applicable if Reason is ShootReconciling or ShootDeleting.\n PendingReplicaReason (string alias)\n (Appears on: PendingReplica) PendingReplicaReason is a string enumeration type that enumerates all possible reasons for a replica to be pending.\nRollingUpdateStrategy (Appears on: UpdateStrategy) RollingUpdateStrategy is used to communicate parameters for RollingUpdateStrategyType.\n Field Description partition int32 (Optional) Partition indicates the ordinal at which the ManagedSeedSet should be partitioned. Defaults to 0.\n Shoot (Appears on: ManagedSeedSpec) Shoot identifies the Shoot that should be registered as Seed.\n Field Description name string Name is the name of the Shoot that will be registered as Seed.\n UpdateStrategy (Appears on: ManagedSeedSetSpec) UpdateStrategy specifies the strategy that the ManagedSeedSet controller will use to perform updates. 
It includes any additional parameters necessary to perform the update for the indicated strategy.\n Field Description type UpdateStrategyType (Optional) Type indicates the type of the UpdateStrategy. Defaults to RollingUpdate.\n rollingUpdate RollingUpdateStrategy (Optional) RollingUpdate is used to communicate parameters when Type is RollingUpdateStrategyType.\n UpdateStrategyType (string alias)\n (Appears on: UpdateStrategy) UpdateStrategyType is a string enumeration type that enumerates all possible update strategies for the ManagedSeedSet controller.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n seedmanagement.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/seedmanagement/","tags":"","title":"Seedmanagement"},{"body":"Service Account Manager Overview With Gardener v1.47, a new role called serviceaccountmanager was introduced. This role allows a user to fully manage ServiceAccounts in the project namespace and to request tokens for them. 
This is the preferred way of managing the access to a project namespace, as it aims to replace the usage of the default ServiceAccount secrets that will no longer be generated automatically.\nActions Once assigned the serviceaccountmanager role, a user can create/update/delete ServiceAccounts in the project namespace.\nCreate a Service Account In order to create a ServiceAccount named “robot-user”, run the following kubectl command:\nkubectl -n project-abc create sa robot-user Request a Token for a Service Account A token for the “robot-user” ServiceAccount can be requested via the TokenRequest API in several ways:\nkubectl -n project-abc create token robot-user --duration=3600s directly calling the Kubernetes HTTP API curl -X POST https://api.gardener/api/v1/namespaces/project-abc/serviceaccounts/robot-user/token \\ -H \"Authorization: Bearer \u003cauth-token\u003e\" \\ -H \"Content-Type: application/json\" \\ -d '{ \"apiVersion\": \"authentication.k8s.io/v1\", \"kind\": \"TokenRequest\", \"spec\": { \"expirationSeconds\": 3600 } }' Mind that the returned token is not stored within the Kubernetes cluster, will be valid for 3600 seconds, and will be invalidated if the “robot-user” ServiceAccount is deleted. Although expirationSeconds can be modified depending on the needs, the returned token’s validity will not exceed the configured service-account-max-token-expiration duration for the garden cluster. It is advised that the actual expirationTimestamp is verified so that expectations are met. 
This can be done by asserting the expirationTimestamp in the TokenRequestStatus or the exp claim in the token itself.\nDelete a Service Account In order to delete the ServiceAccount named “robot-user”, run the following kubectl command:\nkubectl -n project-abc delete sa robot-user This will invalidate all existing tokens for the “robot-user” ServiceAccount.\n","categories":"","description":"The role that allows a user to manage ServiceAccounts in the project namespace","excerpt":"The role that allows a user to manage ServiceAccounts in the project …","ref":"/docs/gardener/service-account-manager/","tags":"","title":"Service Account Manager"},{"body":"Packages:\n settings.gardener.cloud/v1alpha1 settings.gardener.cloud/v1alpha1 Package v1alpha1 is a version of the API.\nResource Types: ClusterOpenIDConnectPreset OpenIDConnectPreset ClusterOpenIDConnectPreset ClusterOpenIDConnectPreset is an OpenID Connect configuration that is applied to Shoot objects cluster-wide.\n Field Description apiVersion string settings.gardener.cloud/v1alpha1 kind string ClusterOpenIDConnectPreset metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ClusterOpenIDConnectPresetSpec Spec is the specification of this OpenIDConnect preset.\n OpenIDConnectPresetSpec OpenIDConnectPresetSpec (Members of OpenIDConnectPresetSpec are embedded into this type.) projectSelector Kubernetes meta/v1.LabelSelector (Optional) Project decides whether to apply the configuration if the Shoot is in a specific Project matching the label selector. Use the selector only if the OIDC Preset is opt-in, because end users may skip the admission by setting the labels. 
Defaults to the empty LabelSelector, which matches everything.\n OpenIDConnectPreset OpenIDConnectPreset is an OpenID Connect configuration that is applied to a Shoot in a namespace.\n Field Description apiVersion string settings.gardener.cloud/v1alpha1 kind string OpenIDConnectPreset metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec OpenIDConnectPresetSpec Spec is the specification of this OpenIDConnect preset.\n server KubeAPIServerOpenIDConnect Server contains the kube-apiserver’s OpenID Connect configuration. This configuration does not overwrite any existing OpenID Connect configuration already set on the Shoot object.\n client OpenIDConnectClientAuthentication (Optional) Client contains the configuration used for client OIDC authentication of Shoot clusters. This configuration does not overwrite any existing OpenID Connect client authentication already set on the Shoot object.\nDeprecated: The OpenID Connect configuration this field specifies is not used and will be forbidden starting from Kubernetes 1.31. Its use was planned for generating OIDC kubeconfig https://github.com/gardener/gardener/issues/1433 TODO(AleksandarSavchev): Drop this field after support for Kubernetes 1.30 is dropped.\n shootSelector Kubernetes meta/v1.LabelSelector (Optional) ShootSelector decides whether to apply the configuration if the Shoot has matching labels. Use the selector only if the OIDC Preset is opt-in, because end users may skip the admission by setting the labels. Defaults to the empty LabelSelector, which matches everything.\n weight int32 Weight associated with matching the corresponding preset, in the range 1-100. 
Required.\n ClusterOpenIDConnectPresetSpec (Appears on: ClusterOpenIDConnectPreset) ClusterOpenIDConnectPresetSpec contains the OpenIDConnect specification and project selector matching Shoots in Projects.\n Field Description OpenIDConnectPresetSpec OpenIDConnectPresetSpec (Members of OpenIDConnectPresetSpec are embedded into this type.) projectSelector Kubernetes meta/v1.LabelSelector (Optional) Project decides whether to apply the configuration if the Shoot is in a specific Project matching the label selector. Use the selector only if the OIDC Preset is opt-in, because end users may skip the admission by setting the labels. Defaults to the empty LabelSelector, which matches everything.\n KubeAPIServerOpenIDConnect (Appears on: OpenIDConnectPresetSpec) KubeAPIServerOpenIDConnect contains configuration settings for the OIDC provider. Note: Descriptions were taken from the Kubernetes documentation.\n Field Description caBundle string (Optional) If set, the OpenID server’s certificate will be verified by one of the authorities in the oidc-ca-file, otherwise the host’s root CA set will be used.\n clientID string The client ID for the OpenID Connect client. Required.\n groupsClaim string (Optional) If provided, the name of a custom OpenID Connect claim for specifying user groups. The claim value is expected to be a string or array of strings. This field is experimental, please see the authentication documentation for further details.\n groupsPrefix string (Optional) If provided, all groups will be prefixed with this value to prevent conflicts with other authentication strategies.\n issuerURL string The URL of the OpenID issuer, only HTTPS scheme will be accepted. If set, it will be used to verify the OIDC JSON Web Token (JWT). Required.\n requiredClaims map[string]string (Optional) key=value pairs that describe a required claim in the ID Token. 
If set, the claim is verified to be present in the ID Token with a matching value.\n signingAlgs []string (Optional) List of allowed JOSE asymmetric signing algorithms. JWTs with an ‘alg’ header value not in this list will be rejected. Values are defined by RFC 7518 https://tools.ietf.org/html/rfc7518#section-3.1 Defaults to [RS256]\n usernameClaim string (Optional) The OpenID claim to use as the user name. Note that claims other than the default (‘sub’) are not guaranteed to be unique and immutable. This field is experimental, please see the authentication documentation for further details. Defaults to “sub”.\n usernamePrefix string (Optional) If provided, all usernames will be prefixed with this value. If not provided, username claims other than ‘email’ are prefixed by the issuer URL to avoid clashes. To skip any prefixing, provide the value ‘-’.\n OpenIDConnectClientAuthentication (Appears on: OpenIDConnectPresetSpec) OpenIDConnectClientAuthentication contains configuration for OIDC clients.\n Field Description secret string (Optional) The client Secret for the OpenID Connect client.\n extraConfig map[string]string (Optional) Extra configuration added to kubeconfig’s auth-provider. Must not be any of idp-issuer-url, client-id, client-secret, idp-certificate-authority, idp-certificate-authority-data, id-token or refresh-token\n OpenIDConnectPresetSpec (Appears on: OpenIDConnectPreset, ClusterOpenIDConnectPresetSpec) OpenIDConnectPresetSpec contains the Shoot selector for which a specific OpenID Connect configuration is applied.\n Field Description server KubeAPIServerOpenIDConnect Server contains the kube-apiserver’s OpenID Connect configuration. This configuration does not overwrite any existing OpenID Connect configuration already set on the Shoot object.\n client OpenIDConnectClientAuthentication (Optional) Client contains the configuration used for client OIDC authentication of Shoot clusters. 
This configuration does not overwrite any existing OpenID Connect client authentication already set on the Shoot object.\nDeprecated: The OpenID Connect configuration this field specifies is not used and will be forbidden starting from Kubernetes 1.31. Its use was planned for generating OIDC kubeconfig https://github.com/gardener/gardener/issues/1433 TODO(AleksandarSavchev): Drop this field after support for Kubernetes 1.30 is dropped.\n shootSelector Kubernetes meta/v1.LabelSelector (Optional) ShootSelector decides whether to apply the configuration if the Shoot has matching labels. Use the selector only if the OIDC Preset is opt-in, because end users may skip the admission by setting the labels. Defaults to the empty LabelSelector, which matches everything.\n weight int32 Weight associated with matching the corresponding preset, in the range 1-100. Required.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n settings.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/settings/","tags":"","title":"Settings"},{"body":"Deploying Gardener into a Kubernetes Cluster Similar to Kubernetes, Gardener consists of control plane components (Gardener API server, Gardener controller manager, Gardener scheduler), and an agent component (gardenlet). The control plane is deployed in the so-called garden cluster, while the agent is installed into every seed cluster. Please note that it is possible to use the garden cluster as a seed cluster by simply deploying the gardenlet into it.\nWe are providing Helm charts in order to manage the various resources of the components. Please always make sure that you use the Helm chart version that matches the Gardener version you want to deploy.\nDeploying the Gardener Control Plane (API Server, Admission Controller, Controller Manager, Scheduler) The configuration values depict the various options to configure the different components. 
Please consult Gardener Configuration and Usage for component specific configurations and Authentication of Gardener Control Plane Components Against the Garden Cluster for authentication related specifics.\nAlso, note that all resources and deployments need to be created in the garden namespace (not overrideable). If you enable the Gardener admission controller as part of you setup, please make sure the garden namespace is labelled with app: gardener. Otherwise, the backing service account for the admission controller Pod might not be created successfully. No action is necessary if you deploy the garden namespace with the Gardener control plane Helm chart.\nAfter preparing your values in a separate controlplane-values.yaml file (values.yaml can be used as starting point), you can run the following command against your garden cluster:\nhelm install charts/gardener/controlplane \\ --namespace garden \\ --name gardener-controlplane \\ -f controlplane-values.yaml \\ --wait Deploying Gardener Extensions Gardener is an extensible system that does not contain the logic for provider-specific things like DNS management, cloud infrastructures, network plugins, operating system configs, and many more.\nYou have to install extension controllers for these parts. Please consult the documentation regarding extensions to get more information.\nDeploying the Gardener Agent (gardenlet) Please refer to Deploying Gardenlets on how to deploy a gardenlet.\n","categories":"","description":"","excerpt":"Deploying Gardener into a Kubernetes Cluster Similar to Kubernetes, …","ref":"/docs/gardener/deployment/setup_gardener/","tags":"","title":"Setup Gardener"},{"body":"Auto-Scaling in Shoot Clusters There are three auto-scaling scenarios of relevance in Kubernetes clusters in general and Gardener shoot clusters in particular:\n Horizontal node auto-scaling, i.e., dynamically adding and removing worker nodes. Horizontal pod auto-scaling, i.e., dynamically adding and removing pod replicas. 
Vertical pod auto-scaling, i.e., dynamically raising or shrinking the resource requests/limits of pods. This document provides an overview of these scenarios and how the respective auto-scaling components can be enabled and configured. For more details, please see our pod auto-scaling best practices.\nHorizontal Node Auto-Scaling Every shoot cluster that has at least one worker pool with minimum \u003c maximum nodes configuration will get a cluster-autoscaler deployment. Gardener is leveraging the upstream community Kubernetes cluster-autoscaler component. We have forked it to gardener/autoscaler so that it supports the way how Gardener manages the worker nodes (leveraging gardener/machine-controller-manager). However, we have not touched the logic how it performs auto-scaling decisions. Consequently, please refer to the official documentation for this component.\nThe Shoot API allows to configure a few flags of the cluster-autoscaler:\nThere are general options for cluster-autoscaler, and these values will be used for all worker groups except for those overwriting them. Additionally, there are some cluster-autoscaler flags to be set per worker pool. They override any general value such as those specified in the general flags above.\n Only some cluster-autoscaler flags can be configured per worker pool, and is limited by NodeGroupAutoscalingOptions of the upstream community Kubernetes repository. This list can be found here.\n Horizontal Pod Auto-Scaling This functionality (HPA) is a standard functionality of any Kubernetes cluster (implemented as part of the kube-controller-manager that all Kubernetes clusters have). 
It is always enabled.\nThe Shoot API allows to configure most of the flags of the horizontal-pod-autoscaler.\nVertical Pod Auto-Scaling This form of auto-scaling (VPA) is enabled by default, but it can be switched off in the Shoot by setting .spec.kubernetes.verticalPodAutoscaler.enabled=false in case you deploy your own VPA into your cluster (having more than one VPA on the same set of pods will lead to issues, eventually).\nGardener is leveraging the upstream community Kubernetes vertical-pod-autoscaler. If enabled, Gardener will deploy it as part of the control plane into the seed cluster. It will also be used for the vertical autoscaling of Gardener’s system components deployed into the kube-system namespace of shoot clusters, for example, kube-proxy or metrics-server.\nYou might want to refer to the official documentation for this component to get more information how to use it.\nThe Shoot API allows to configure a few flags of the vertical-pod-autoscaler.\n⚠️ Please note that if you disable VPA, the related CustomResourceDefinitions (ours and yours) will remain in your shoot cluster (whether someone acts on them or not). 
You can delete these CustomResourceDefinitions yourself using kubectl delete crd if you want to get rid of them (in case you statically size all resources, which we do not recommend).\nPod Auto-Scaling Best Practices Please continue reading our pod auto-scaling best practices for more details and recommendations.\n","categories":"","description":"The basics of horizontal Node and vertical Pod auto-scaling","excerpt":"The basics of horizontal Node and vertical Pod auto-scaling","ref":"/docs/gardener/shoot_autoscaling/","tags":"","title":"Shoot Autoscaling"},{"body":"Overview Day two operations for shoot clusters are related to:\n The Kubernetes version of the control plane and the worker nodes The operating system version of the worker nodes Note When referring to an update of the “operating system version” in this document, the update of the machine image of the shoot cluster’s worker nodes is meant. For example, Amazon Machine Images (AMI) for AWS. The following table summarizes what options Gardener offers to maintain these versions:\n Auto-Update Forceful Updates Manual Updates Kubernetes version Patches only Patches and consecutive minor updates only yes Operating system version yes yes yes Allowed Target Versions in the CloudProfile Administrators maintain the allowed target versions that you can update to in the CloudProfile for each IaaS-Provider. Users with access to a Gardener project can check supported target versions with:\nkubectl get cloudprofile [IAAS-SPECIFIC-PROFILE] -o yaml Path Description More Information spec.kubernetes.versions The supported Kubernetes version major.minor.patch. 
Patch releases spec.machineImages The supported operating system versions for worker nodes Both the Kubernetes version and the operating system version follow semantic versioning that allows Gardener to handle updates automatically.\nFor more information, see Semantic Versioning.\nImpact of Version Classifications on Updates Gardener allows to classify versions in the CloudProfile as preview, supported, deprecated, or expired. During maintenance operations, preview versions are excluded from updates, because they’re often recently released versions that haven’t yet undergone thorough testing and may contain bugs or security issues.\nFor more information, see Version Classifications.\nLet Gardener Manage Your Updates The Maintenance Window Gardener can manage updates for you automatically. It offers users to specify a maintenance window during which updates are scheduled:\n The time interval of the maintenance window can’t be less than 30 minutes or more than 6 hours. If there’s no maintenance window specified during the creation of a shoot cluster, Gardener chooses a maintenance window randomly to spread the load. You can either specify the maintenance window in the shoot cluster specification (.spec.maintenance.timeWindow) or the start time of the maintenance window using the Gardener dashboard (CLUSTERS \u003e [YOUR-CLUSTER] \u003e OVERVIEW \u003e Lifecycle \u003e Maintenance).\nAuto-Update and Forceful Updates To trigger updates during the maintenance window automatically, Gardener offers the following methods:\n Auto-update: Gardener starts an update during the next maintenance window whenever there’s a version available in the CloudProfile that is higher than the one of your shoot cluster specification, and that isn’t classified as preview version. 
For Kubernetes versions, auto-update only updates to higher patch levels.\nYou can either activate auto-update on the Gardener dashboard (CLUSTERS \u003e [YOUR-CLUSTER] \u003e OVERVIEW \u003e Lifecycle \u003e Maintenance) or in the shoot cluster specification:\n .spec.maintenance.autoUpdate.kubernetesVersion: true .spec.maintenance.autoUpdate.machineImageVersion: true Forceful updates: In the maintenance window, Gardener compares the current version given in the shoot cluster specification with the version list in the CloudProfile. If the version has an expiration date and if the date is before the start of the maintenance window, Gardener starts an update to the highest version available in the CloudProfile that isn’t classified as a preview version. The highest version in the CloudProfile can’t have an expiration date. For Kubernetes versions, Gardener only updates to higher patch levels or consecutive minor versions.\n If you don’t want to wait for the next maintenance window, you can annotate the shoot cluster specification with shoot.gardener.cloud/operation: maintain. Gardener then checks immediately if there’s an auto-update or a forceful update needed.\nNote Forceful version updates are executed even if the auto-update for the Kubernetes version (or the auto-update for the machine image version) is deactivated (set to false). With expiration dates, administrators can give shoot cluster owners more time for testing before the actual version update happens, which allows for smoother transitions to new versions.\nKubernetes Update Paths The bigger the delta between the Kubernetes source version and the Kubernetes target version, the more carefully the update must be planned and executed by operators. 
Gardener only provides automatic support for updates that can be applied safely to the cluster workload:\n Update Type Example Update Method Patches 1.10.12 to 1.10.13 auto-update or Forceful update Update to consecutive minor version 1.10.12 to 1.11.10 Forceful update Other 1.10.12 to 1.12.0 Manual update Gardener doesn’t support automatic updates of nonconsecutive minor versions, because Kubernetes doesn’t guarantee updateability in this case. However, multiple minor version updates are possible if not only the minor source version is expired, but also the minor target version is expired. Gardener then updates the Kubernetes version first to the expired target version, and waits for the next maintenance window to update this version to the next minor target version.\nWarning The administrator who maintains the CloudProfile has to ensure that the list of Kubernetes versions consists of consecutive minor versions, for example, from 1.10.x to 1.11.y. If the minor version increases in bigger steps, for example, from 1.10.x to 1.12.y, then the shoot cluster updates will fail during the maintenance window. Manual Updates To update the Kubernetes version or the node operating system manually, change the .spec.kubernetes.version field or the .spec.provider.workers.machine.image.version field correspondingly.\nManual updates are required if you would like to do a minor update of the Kubernetes version. 
Gardener doesn’t do such updates automatically, as they can have breaking changes that could impact the cluster workload.\nManual updates are either executed immediately (default) or can be confined to the maintenance time window.\nChoosing the latter option causes changes to the cluster (for example, node pool rolling-updates) and the subsequent reconciliation to only predictably happen during a defined time window (available since Gardener version 1.4).\nFor more information, see Confine Specification Changes/Update Roll Out.\nWarning Before applying such an update on minor or major releases, operators should check for all the breaking changes introduced in the target Kubernetes release changelog. Examples In the examples for the CloudProfile and the shoot cluster specification, only the fields relevant for the example are shown.\nAuto-Update of Kubernetes Version Let’s assume that the Kubernetes versions 1.10.5 and 1.11.0 were added in the following CloudProfile:\nspec: kubernetes: versions: - version: 1.11.0 - version: 1.10.5 - version: 1.10.0 Before this change, the shoot cluster specification looked like this:\nspec: kubernetes: version: 1.10.0 maintenance: timeWindow: begin: 220000+0000 end: 230000+0000 autoUpdate: kubernetesVersion: true As a consequence, the shoot cluster is updated to Kubernetes version 1.10.5 between 22:00-23:00 UTC. 
Your shoot cluster isn’t updated automatically to 1.11.0, even though it’s the highest Kubernetes version in the CloudProfile, because Gardener only does automatic updates of the Kubernetes patch level.\nForceful Update Due to Expired Kubernetes Version Let’s assume the following CloudProfile exists on the cluster:\nspec: kubernetes: versions: - version: 1.12.8 - version: 1.11.10 - version: 1.10.13 - version: 1.10.12 expirationDate: \"2019-04-13T08:00:00Z\" Let’s assume the shoot cluster has the following specification:\nspec: kubernetes: version: 1.10.12 maintenance: timeWindow: begin: 220000+0100 end: 230000+0100 autoUpdate: kubernetesVersion: false The shoot cluster specification refers to a Kubernetes version that has an expirationDate. In the maintenance window on 2019-04-12, the Kubernetes version stays the same as it’s still not expired. But in the maintenance window on 2019-04-14, the Kubernetes version of the shoot cluster is updated to 1.10.13 (independently of the value of .spec.maintenance.autoUpdate.kubernetesVersion).\nForceful Update to New Minor Kubernetes Version Let’s assume the following CloudProfile exists on the cluster:\nspec: kubernetes: versions: - version: 1.12.8 - version: 1.11.10 - version: 1.11.09 - version: 1.10.12 expirationDate: \"2019-04-13T08:00:00Z\" Let’s assume the shoot cluster has the following specification:\nspec: kubernetes: version: 1.10.12 maintenance: timeWindow: begin: 220000+0100 end: 230000+0100 autoUpdate: kubernetesVersion: false The shoot cluster specification refers to a Kubernetes version that has an expirationDate. 
In the maintenance window on 2019-04-14, the Kubernetes version of the shoot cluster is updated to 1.11.10, which is the highest patch version of minor target version 1.11 that follows the source version 1.10.\nAutomatic Update from Expired Machine Image Version Let’s assume the following CloudProfile exists on the cluster:\nspec: machineImages: - name: coreos versions: - version: 2191.5.0 - version: 2191.4.1 - version: 2135.6.0 expirationDate: \"2019-04-13T08:00:00Z\" Let’s assume the shoot cluster has the following specification:\nspec: provider: type: aws workers: - name: name maximum: 1 minimum: 1 maxSurge: 1 maxUnavailable: 0 image: name: coreos version: 2135.6.0 type: m5.large volume: type: gp2 size: 20Gi maintenance: timeWindow: begin: 220000+0100 end: 230000+0100 autoUpdate: machineImageVersion: false The shoot cluster specification refers to a machine image version that has an expirationDate. In the maintenance window on 2019-04-12, the machine image version stays the same as it’s still not expired. But in the maintenance window on 2019-04-14, the machine image version of the shoot cluster is updated to 2191.5.0 (independently of the value of .spec.maintenance.autoUpdate.machineImageVersion) as version 2135.6.0 is expired.\n","categories":"","description":"Understanding and configuring Gardener's Day-2 operations for Shoot clusters.","excerpt":"Understanding and configuring Gardener's Day-2 operations for Shoot …","ref":"/docs/guides/administer-shoots/maintain-shoot/","tags":"","title":"Shoot Cluster Maintenance"},{"body":"Shoot Cluster Purpose The Shoot resource contains a .spec.purpose field indicating how the shoot is used, whose allowed values are as follows:\n evaluation (default): Indicates that the shoot cluster is for evaluation scenarios. development: Indicates that the shoot cluster is for development scenarios. testing: Indicates that the shoot cluster is for testing scenarios. 
production: Indicates that the shoot cluster is for production scenarios. infrastructure: Indicates that the shoot cluster is for infrastructure scenarios (only allowed for shoots in the garden namespace). Behavioral Differences The following lists the differences in the way the shoot clusters are set up based on the selected purpose:\n testing shoot clusters do not get a monitoring or a logging stack as part of their control planes. For production and infrastructure shoot clusters, auto-scaling scale-down of the main ETCD is disabled. There are also differences with respect to how testing shoots are scheduled after creation; please consult the Scheduler documentation.\nFuture Steps We might introduce more behavioral differences depending on the shoot purpose in the future. As of today, there are no plans yet.\n","categories":"","description":"Available Shoot cluster purposes and the behavioral differences between them","excerpt":"Available Shoot cluster purposes and the behavioral differences …","ref":"/docs/gardener/shoot_purposes/","tags":"","title":"Shoot Cluster Purposes"},{"body":"Credentials Rotation for Shoot Clusters There are a lot of different credentials for Shoots to make sure that the various components can communicate with each other and to make sure it is usable and operable.\nThis page explains how the varieties of credentials can be rotated so that the cluster can be considered secure.\nUser-Provided Credentials Cloud Provider Keys End-users must provide credentials such that Gardener and Kubernetes controllers can communicate with the respective cloud provider APIs in order to perform infrastructure operations. For example, Gardener uses them to set up and maintain the networks, security groups, subnets, etc., while the cloud-controller-manager uses them to reconcile load balancers and routes, and the CSI controller uses them to reconcile volumes and disks.\nDepending on the cloud provider, the required data keys of the Secret differ. 
Please consult the documentation of the respective provider extension to learn the concrete data keys (e.g., this document for AWS).\nIt is the responsibility of the end-user to regularly rotate those credentials. The following steps are required to perform the rotation:\n Update the data in the Secret with new credentials. ⚠️ Wait until all Shoots using the Secret are reconciled before you disable the old credentials in your cloud provider account! Otherwise, the Shoots will no longer work as expected. Check out this document to learn how to trigger a reconciliation of your Shoots. After all Shoots using the Secret were reconciled, you can go ahead and deactivate the old credentials in your provider account. Gardener-Provided Credentials The below credentials are generated by Gardener when shoot clusters are being created. Those include:\n kubeconfig (if enabled) certificate authorities (and related server and client certificates) observability passwords for Plutono SSH key pair for worker nodes ETCD encryption key ServiceAccount token signing key … 🚨 There is no auto-rotation of those credentials, and it is the responsibility of the end-user to regularly rotate them.\nWhile it is possible to rotate them one by one, there is also a convenient method to combine the rotation of all of those credentials. The rotation happens in two phases since it might be required to update some API clients (e.g., when CAs are rotated). In order to start the rotation (first phase), you have to annotate the shoot with the rotate-credentials-start operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-credentials-start Note: You can check the .status.credentials.rotation field in the Shoot to see when the rotation was last initiated and last completed.\n Kindly consider the detailed descriptions below to learn how the rotation is performed and what your responsibilities are. 
Please note that all respective individual actions apply for this combined rotation as well (e.g., worker nodes are rolled out in the first phase).\nYou can complete the rotation (second phase) by annotating the shoot with the rotate-credentials-complete operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-credentials-complete Kubeconfig If the .spec.kubernetes.enableStaticTokenKubeconfig field is set to true (default), then Gardener generates a kubeconfig with cluster-admin privileges for the Shoots containing credentials for communication with the kube-apiserver (see this document for more information).\nThis Secret is stored with the name \u003cshoot-name\u003e.kubeconfig in the project namespace in the garden cluster and has multiple data keys:\n kubeconfig: the completed kubeconfig ca.crt: the CA bundle for establishing trust to the API server (same as in the Cluster CA bundle secret) Shoots created with Gardener \u003c= 0.28 used to have a kubeconfig based on a client certificate instead of a static token. With the first kubeconfig rotation, such clusters will get a static token as well.\n⚠️ This does not invalidate the old client certificate. In order to do this, you should perform a rotation of the CAs (see section below).\n It is the responsibility of the end-user to regularly rotate those credentials (or disable this kubeconfig entirely). In order to rotate the token in this kubeconfig, annotate the Shoot with gardener.cloud/operation=rotate-kubeconfig-credentials. This operation is not allowed for Shoots that are already marked for deletion. Please note that only the token (and basic auth password, if enabled) are exchanged. 
The CA certificate remains the same (see section below for information about the rotation).\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-kubeconfig-credentials You can check the .status.credentials.rotation.kubeconfig field in the Shoot to see when the rotation was last initiated and last completed.\n Certificate Authorities Gardener generates several certificate authorities (CAs) to ensure secured communication between the various components and actors. Most of those CAs are used for internal communication (e.g., kube-apiserver talks to etcd, vpn-shoot talks to the vpn-seed-server, kubelet talks to kube-apiserver). However, there is also the “cluster CA” which is part of all kubeconfigs and used to sign the server certificate exposed by the kube-apiserver.\nGardener populates a ConfigMap with the name \u003cshoot-name\u003e.ca-cluster in the project namespace in the garden cluster which contains the following data keys:\n ca.crt: the CA bundle of the cluster This bundle contains one or multiple CAs which are used for signing serving certificates of the Shoot’s API server. Hence, the certificates contained in this ConfigMap can be used to verify the API server’s identity when communicating with its public endpoint (e.g., as certificate-authority-data in a kubeconfig). This is the same certificate that is also contained in the kubeconfig’s certificate-authority-data field.\n Shoots created with Gardener \u003e= v1.45 have a dedicated client CA which verifies the legitimacy of client certificates. For older Shoots, the client CA is equal to the cluster CA. With the first CA rotation, such clusters will get a dedicated client CA as well.\n All of the certificates are valid for 10 years. 
Since it requires adaptation for the consumers of the Shoot, there is no automatic rotation and it is the responsibility of the end-user to regularly rotate the CA certificates.\nThe rotation happens in three stages (see also GEP-18 for the full details):\n In stage one, new CAs are created and added to the bundle (together with the old CAs). Client certificates are re-issued immediately. In stage two, end-users update all cluster API clients that communicate with the control plane. In stage three, the old CAs are dropped from the bundle and server certificates are re-issued. Technically, the Preparing phase indicates stage one. Once it is completed, the Prepared phase indicates readiness for stage two. The Completing phase indicates stage three, and the Completed phase states that the rotation process has finished.\n You can check the .status.credentials.rotation.certificateAuthorities field in the Shoot to see when the rotation was last initiated, last completed, and in which phase it currently is.\n In order to start the rotation (stage one), you have to annotate the shoot with the rotate-ca-start operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-ca-start This will trigger a Shoot reconciliation and perform stage one. After it is completed, the .status.credentials.rotation.certificateAuthorities.phase is set to Prepared.\nNow you must update all API clients outside the cluster (such as the kubeconfigs on developer machines) to use the newly issued CA bundle in the \u003cshoot-name\u003e.ca-cluster ConfigMap. Please also note that client certificates must be re-issued now.\nAfter updating all API clients, you can complete the rotation by annotating the shoot with the rotate-ca-complete operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-ca-complete This will trigger another Shoot reconciliation and perform stage three. 
After it is completed, the .status.credentials.rotation.certificateAuthorities.phase is set to Completed. You could update your API clients again and drop the old CA from their bundle.\n Note that the CA rotation also rotates all internal CAs and signed certificates. Hence, most of the components need to be restarted (including etcd and kube-apiserver).\n⚠️ In stage one, all worker nodes of the Shoot will be rolled out to ensure that the Pods as well as the kubelets get the updated credentials as well.\n Observability Password(s) For Plutono and Prometheus For Shoots with .spec.purpose!=testing, Gardener deploys an observability stack with Prometheus for monitoring, Alertmanager for alerting (optional), Vali for logging, and Plutono for visualization. The Plutono instance is exposed via Ingress and accessible for end-users via basic authentication credentials generated and managed by Gardener.\nThose credentials are stored in a Secret with the name \u003cshoot-name\u003e.monitoring in the project namespace in the garden cluster and has multiple data keys:\n username: the user name password: the password auth: the user name with SHA-1 representation of the password It is the responsibility of the end-user to regularly rotate those credentials. In order to rotate the password, annotate the Shoot with gardener.cloud/operation=rotate-observability-credentials. This operation is not allowed for Shoots that are already marked for deletion.\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-observability-credentials You can check the .status.credentials.rotation.observability field in the Shoot to see when the rotation was last initiated and last completed.\n SSH Key Pair for Worker Nodes Gardener generates an SSH key pair whose public key is propagated to all worker nodes of the Shoot. The private key can be used to establish an SSH connection to the workers for troubleshooting purposes. 
It is recommended to use gardenctl-v2 and its gardenctl ssh command since it is required to first open up the security groups and create a bastion VM (no direct SSH access to the worker nodes is possible).\nThe private key is stored in a Secret with the name \u003cshoot-name\u003e.ssh-keypair in the project namespace in the garden cluster and has multiple data keys:\n id_rsa: the private key id_rsa.pub: the public key for SSH In order to rotate the keys, annotate the Shoot with gardener.cloud/operation=rotate-ssh-keypair. This will propagate a new key to all worker nodes while keeping the old key active and valid as well (it will only be invalidated/removed with the next rotation).\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-ssh-keypair You can check the .status.credentials.rotation.sshKeypair field in the Shoot to see when the rotation was last initiated or last completed.\n The old key is stored in a Secret with the name \u003cshoot-name\u003e.ssh-keypair.old in the project namespace in the garden cluster and has the same data keys as the regular Secret.\nETCD Encryption Key This key is used to encrypt the data of Secret resources inside etcd (see upstream Kubernetes documentation).\nThe encryption key has no expiration date. There is no automatic rotation and it is the responsibility of the end-user to regularly rotate the encryption key.\nThe rotation happens in three stages:\n In stage one, a new encryption key is created and added to the bundle (together with the old encryption key). In stage two, all Secrets in the cluster and resources configured in the spec.kubernetes.kubeAPIServer.encryptionConfig of the Shoot (see ETCD Encryption Config) are rewritten by the kube-apiserver so that they become encrypted with the new encryption key. In stage three, the old encryption is dropped from the bundle. Technically, the Preparing phase indicates the stages one and two. 
Once it is completed, the Prepared phase indicates readiness for stage three. The Completing phase indicates stage three, and the Completed phase states that the rotation process has finished.\n You can check the .status.credentials.rotation.etcdEncryptionKey field in the Shoot to see when the rotation was last initiated, last completed, and in which phase it currently is.\n In order to start the rotation (stage one), you have to annotate the shoot with the rotate-etcd-encryption-key-start operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-etcd-encryption-key-start This will trigger a Shoot reconciliation and performs the stages one and two. After it is completed, the .status.credentials.rotation.etcdEncryptionKey.phase is set to Prepared. Now you can complete the rotation by annotating the shoot with the rotate-etcd-encryption-key-complete operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-etcd-encryption-key-complete This will trigger another Shoot reconciliation and performs stage three. After it is completed, the .status.credentials.rotation.etcdEncryptionKey.phase is set to Completed.\nServiceAccount Token Signing Key Gardener generates a key which is used to sign the tokens for ServiceAccounts. Those tokens are typically used by workload Pods running inside the cluster in order to authenticate themselves with the kube-apiserver. This also includes system components running in the kube-system namespace.\nThe token signing key has no expiration date. Since it might require adaptation for the consumers of the Shoot, there is no automatic rotation and it is the responsibility of the end-user to regularly rotate the signing key.\nThe rotation happens in three stages, similar to how the CA certificates are rotated:\n In stage one, a new signing key is created and added to the bundle (together with the old signing key). 
In stage two, end-users update all out-of-cluster API clients that communicate with the control plane via ServiceAccount tokens. In stage three, the old signing key is dropped from the bundle. Technically, the Preparing phase indicates stage one. Once it is completed, the Prepared phase indicates readiness for stage two. The Completing phase indicates stage three, and the Completed phase states that the rotation process has finished.\n You can check the .status.credentials.rotation.serviceAccountKey field in the Shoot to see when the rotation was last initiated, last completed, and in which phase it currently is.\n In order to start the rotation (stage one), you have to annotate the shoot with the rotate-serviceaccount-key-start operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-serviceaccount-key-start This will trigger a Shoot reconciliation and performs stage one. After it is completed, the .status.credentials.rotation.serviceAccountKey.phase is set to Prepared.\nNow you must update all API clients outside the cluster using a ServiceAccount token (such as the kubeconfigs on developer machines) to use a token issued by the new signing key. Gardener already generates new secrets for those ServiceAccounts in the cluster, whose static token was automatically created by Kubernetes (typically before v1.22 - ref) However, if you need to create it manually, you can check out this document for instructions.\nAfter updating all API clients, you can complete the rotation by annotating the shoot with the rotate-serviceaccount-key-complete operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-serviceaccount-key-complete This will trigger another Shoot reconciliation and performs stage three. 
After it is completed, the .status.credentials.rotation.serviceAccountKey.phase is set to Completed.\n ⚠️ In stage one, all worker nodes of the Shoot will be rolled out to ensure that the Pods use a new token.\n OpenVPN TLS Auth Keys This key is used to ensure encrypted communication for the VPN connection between the control plane in the seed cluster and the shoot cluster. It is currently not rotated automatically and there is no way to trigger it manually.\n","categories":"","description":"","excerpt":"Credentials Rotation for Shoot Clusters There are a lot of different …","ref":"/docs/gardener/shoot_credentials_rotation/","tags":"","title":"Shoot Credentials Rotation"},{"body":"Introduction This extension implements cosign image verification. It is strictly limited to the Kubernetes system components deployed by Gardener and other Gardener Extensions in the kube-system namespace of a shoot cluster.\nShoot Feature Gate In most of the Gardener setups the shoot-lakom-service extension is enabled globally and thus can be configured per shoot cluster. Please adapt the shoot specification by the configuration shown below to disable the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-lakom-service disabled: true providerConfig: apiVersion: lakom.extensions.gardener.cloud/v1alpha1 kind: LakomConfig scope: KubeSystem ... The scope field instructs lakom which pods to validate. The possible values are:\n KubeSystem Lakom will validate all pods in the kube-system namespace. KubeSystemManagedByGardener Lakom will validate all pods in the kube-system namespace that are annotated with “managed-by/gardener”. Cluster Lakom will validate all pods in all namespaces. ","categories":"","description":"","excerpt":"Introduction This extension implements cosign image verification. 
It …","ref":"/docs/extensions/others/gardener-extension-shoot-lakom-service/shoot-extension/","tags":"","title":"Shoot Extension"},{"body":"Contributing to Shoot Health Status Conditions Gardener checks regularly (every minute by default) the health status of all shoot clusters. It categorizes its checks into five different types:\n APIServerAvailable: This type indicates whether the shoot’s kube-apiserver is available or not. ControlPlaneHealthy: This type indicates whether the core components of the Shoot controlplane (ETCD, KAPI, KCM..) are healthy. EveryNodeReady: This type indicates whether all Nodes and all Machine objects report healthiness. ObservabilityComponentsHealthy: This type indicates whether the observability components of the Shoot control plane (Prometheus, Vali, Plutono..) are healthy. SystemComponentsHealthy: This type indicates whether all system components deployed to the kube-system namespace in the shoot do exist and are running fine. In case of workerless Shoot, EveryNodeReady condition is not present in the Shoot’s conditions since there are no nodes in the cluster.\nEvery Shoot resource has a status.conditions[] list that contains the mentioned types, together with a status (True/False) and a descriptive message/explanation of the status.\nMost extension controllers are deploying components and resources as part of their reconciliation flows into the seed or shoot cluster. A prominent example for this is the ControlPlane controller that usually deploys a cloud-controller-manager or CSI controllers as part of the shoot control plane. Now that the extensions deploy resources into the cluster, especially resources that are essential for the functionality of the cluster, they might want to contribute to Gardener’s checks mentioned above.\nWhat can extensions do to contribute to Gardener’s health checks? Every extension resource in Gardener’s extensions.gardener.cloud/v1alpha1 API group also has a status.conditions[] list (like the Shoot). 
Extension controllers can write conditions to the resource they are acting on and use a type that also exists in the shoot’s conditions. One exception is that APIServerAvailable can’t be used, as Gardener clearly can identify the status of this condition and it doesn’t make sense for extensions to try to contribute/modify it.\nAs an example for the ControlPlane controller, let’s take a look at the following resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: ControlPlane metadata: name: control-plane namespace: shoot--foo--bar spec: ... status: conditions: - type: ControlPlaneHealthy status: \"False\" reason: DeploymentUnhealthy message: 'Deployment cloud-controller-manager is unhealthy: condition \"Available\" has invalid status False (expected True) due to MinimumReplicasUnavailable: Deployment does not have minimum availability.' lastUpdateTime: \"2014-05-25T12:44:27Z\" - type: ConfigComputedSuccessfully status: \"True\" reason: ConfigCreated message: The cloud-provider-config has been successfully computed. lastUpdateTime: \"2014-05-25T12:43:27Z\" The extension controller has declared in its extension resource that one of the deployments it is responsible for is unhealthy. Also, it has written a second condition using a type that is unknown by Gardener.\nGardener will pick the list of conditions and recognize that there is one with a type ControlPlaneHealthy. It will merge it with its own ControlPlaneHealthy condition and report it back to the Shoot’s status:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: labels: shoot.gardener.cloud/status: unhealthy name: some-shoot namespace: garden-core spec: status: conditions: - type: APIServerAvailable status: \"True\" reason: HealthzRequestSucceeded message: API server /healthz endpoint responded with success status code. 
[response_time:31ms] lastUpdateTime: \"2014-05-23T08:26:52Z\" lastTransitionTime: \"2014-05-25T12:45:13Z\" - type: ControlPlaneHealthy status: \"False\" reason: ControlPlaneUnhealthyReport message: 'Deployment cloud-controller-manager is unhealthy: condition \"Available\" has invalid status False (expected True) due to MinimumReplicasUnavailable: Deployment does not have minimum availability.' lastUpdateTime: \"2014-05-25T12:45:13Z\" lastTransitionTime: \"2014-05-25T12:45:13Z\" ... Hence, the only duty extensions have is to maintain the health status of their components in the extension resource they are managing. This can be accomplished using the health check library for extensions.\nError Codes The Gardener API includes some well-defined error codes, e.g., ERR_INFRA_UNAUTHORIZED, ERR_INFRA_DEPENDENCIES, etc. Extensions may set these error codes in the .status.conditions[].codes[] list where it makes sense. Gardener will pick them up and will similarly merge them into the .status.conditions[].codes[] list in the Shoot:\nstatus: conditions: - type: ControlPlaneHealthy status: \"False\" reason: DeploymentUnhealthy message: 'Deployment cloud-controller-manager is unhealthy: condition \"Available\" has invalid status False (expected True) due to MinimumReplicasUnavailable: Deployment does not have minimum availability.' lastUpdateTime: \"2014-05-25T12:44:27Z\" codes: - ERR_INFRA_UNAUTHORIZED ","categories":"","description":"","excerpt":"Contributing to Shoot Health Status Conditions Gardener checks …","ref":"/docs/gardener/extensions/shoot-health-status-conditions/","tags":"","title":"Shoot Health Status Conditions"},{"body":"Shoot Hibernation Clusters are only needed 24 hours a day if they run productive workload. So whenever you do development in a cluster, or just use it for tests or demo purposes, you can save a lot of money if you scale down your Kubernetes resources whenever you don’t need them. 
However, scaling them down manually can become time-consuming the more resources you have.\nGardener offers a clever way to automatically scale down all resources to zero: cluster hibernation. You can either hibernate a cluster by pushing a button, or by defining a hibernation schedule.\n To save costs, it’s recommended to define a hibernation schedule before the creation of a cluster. You can hibernate your cluster or wake up your cluster manually even if there’s a schedule for its hibernation.\n Hibernate a Cluster What Is Hibernation? What Isn’t Affected by the Hibernation? Hibernate Your Cluster Manually Wake Up Your Cluster Manually Create a Schedule to Hibernate Your Cluster What Is Hibernation? When a cluster is hibernated, Gardener scales down the worker nodes and the cluster’s control plane to free resources at the IaaS provider. This affects:\n Your workload, for example, pods, deployments, custom resources. The virtual machines running your workload. The resources of the control plane of your cluster. What Isn’t Affected by the Hibernation? To scale everything back up to where it was before hibernation, Gardener doesn’t delete state-related information, that is, information stored in persistent volumes. The cluster state as persisted in etcd is also preserved.\nHibernate Your Cluster Manually The .spec.hibernation.enabled field specifies whether the cluster needs to be hibernated or not. If the field is set to true, the cluster’s desired state is to be hibernated. 
If it is set to false or not specified at all, the cluster’s desired state is to be awakened.\nTo hibernate your cluster, you can run the following kubectl command:\n$ kubectl patch shoot -n $NAMESPACE $SHOOT_NAME -p '{\"spec\":{\"hibernation\":{\"enabled\": true}}}' Wake Up Your Cluster Manually To wake up your cluster, you can run the following kubectl command:\n$ kubectl patch shoot -n $NAMESPACE $SHOOT_NAME -p '{\"spec\":{\"hibernation\":{\"enabled\": false}}}' Create a Schedule to Hibernate Your Cluster You can specify a hibernation schedule to automatically hibernate/wake up a cluster.\nLet’s have a look into the following example:\n hibernation: enabled: false schedules: - start: \"0 20 * * *\" # Start hibernation every day at 8PM end: \"0 6 * * *\" # Stop hibernation every day at 6AM location: \"America/Los_Angeles\" # Specify a location for the cron to run in The above section configures a hibernation schedule that hibernates the cluster every day at 08:00 PM and wakes it up at 06:00 AM. The start or end fields can be omitted, though at least one of them has to be specified. Hence, it is possible to configure a hibernation schedule that only hibernates or wakes up a cluster. The location field is the time location used to evaluate the cron expressions.\n","categories":"","description":"What is hibernation? Manual hibernation/wake up and specifying a hibernation schedule","excerpt":"What is hibernation? 
Manual hibernation/wake up and specifying a …","ref":"/docs/gardener/shoot_hibernate/","tags":"","title":"Shoot Hibernation"},{"body":"Highly Available Shoot Control Plane The Shoot resource offers a way to request a highly available control plane.\nFailure Tolerance Types A highly available shoot control plane can be set up with either a failure tolerance of zone or node.\nNode Failure Tolerance The failure tolerance of a node will have the following characteristics:\n Control plane components will be spread across different nodes within a single availability zone. There will not be more than one replica per node for each control plane component which has more than one replica. Worker pool should have a minimum of 3 nodes. A multi-node etcd (quorum size of 3) will be provisioned, offering zero-downtime capabilities with each member in a different node within a single availability zone. Zone Failure Tolerance The failure tolerance of a zone will have the following characteristics:\n Control plane components will be spread across different availability zones. There will be at least one replica per zone for each control plane component which has more than one replica. Gardener scheduler will automatically select a seed which has a minimum of 3 zones to host the shoot control plane. A multi-node etcd (quorum size of 3) will be provisioned, offering zero-downtime capabilities with each member in a different zone. Shoot Spec To request a highly available shoot control plane, Gardener provides the following configuration in the shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: \u003cnode | zone\u003e Allowed Transitions\nIf you already have a shoot cluster with non-HA control plane, then the following upgrades are possible:\n Upgrade of non-HA shoot control plane to HA shoot control plane with node failure tolerance. 
Upgrade of non-HA shoot control plane to HA shoot control plane with zone failure tolerance. However, it is essential that the seed which is currently hosting the shoot control plane is multi-zonal. If it is not, then the request to upgrade will be rejected. Note: There will be a small downtime during the upgrade, especially for etcd, which will transition from a single node etcd cluster to a multi-node etcd cluster.\n Disallowed Transitions\nIf you already have a shoot cluster with HA control plane, then the following transitions are not possible:\n Upgrade of HA shoot control plane from node failure tolerance to zone failure tolerance is currently not supported, mainly because already existing volumes are bound to the zone they were created in originally. Downgrade of HA shoot control plane with zone failure tolerance to node failure tolerance is currently not supported, mainly because of the same reason as above, that already existing volumes are bound to the respective zones they were created in originally. Downgrade of HA shoot control plane with either node or zone failure tolerance, to a non-HA shoot control plane is currently not supported, mainly because etcd-druid does not currently support scaling down of a multi-node etcd cluster to a single-node etcd cluster. Zone Outage Situation Implementing highly available software that can tolerate even a zone outage unscathed is no trivial task. You may find our HA Best Practices helpful to get closer to that goal. In this document, we collected many options and settings that Gardener also uses internally to provide a highly available service.\nDuring a zone outage, you may be forced to change your cluster setup on short notice in order to compensate for failures and shortages resulting from the outage. For instance, if the shoot cluster has worker nodes across three zones where one zone goes down, the computing power from these nodes is also gone during that time. 
Changing the worker pool (shoot.spec.provider.workers[]) and infrastructure (shoot.spec.provider.infrastructureConfig) configuration can eliminate this imbalance, having enough machines in healthy availability zones that can cope with the requests of your applications.\nGardener relies on a sophisticated reconciliation flow with several dependencies for which various flow steps wait for the readiness of prior ones. During a zone outage, this can block the entire flow, e.g., because all three etcd replicas can never be ready when a zone is down, and required changes mentioned above can never be accomplished. For this, a special one-off annotation shoot.gardener.cloud/skip-readiness helps to skip any readiness checks in the flow.\n The shoot.gardener.cloud/skip-readiness annotation serves as a last resort if reconciliation is stuck because of important changes during an AZ outage. Use it with caution, only in exceptional cases and after a case-by-case evaluation with your Gardener landscape administrator. If used together with other operations like Kubernetes version upgrades or credential rotation, the annotation may lead to a severe outage of your shoot control plane.\n ","categories":"","description":"Failure tolerance types `node` and `zone`. Possible mitigations for zone or node outages","excerpt":"Failure tolerance types `node` and `zone`. Possible mitigations for …","ref":"/docs/gardener/shoot_high_availability/","tags":"","title":"Shoot High Availability"},{"body":"Shoot Info ConfigMap Overview The gardenlet maintains a ConfigMap inside the Shoot cluster that contains information about the cluster itself. 
The ConfigMap is named shoot-info and located in the kube-system namespace.\nFields The following fields are provided:\napiVersion: v1 kind: ConfigMap metadata: name: shoot-info namespace: kube-system data: domain: crazy-botany.core.my-custom-domain.com # .spec.dns.domain field from the Shoot resource extensions: foobar,foobaz # List of extensions that are enabled kubernetesVersion: 1.25.4 # .spec.kubernetes.version field from the Shoot resource maintenanceBegin: 220000+0100 # .spec.maintenance.timeWindow.begin field from the Shoot resource maintenanceEnd: 230000+0100 # .spec.maintenance.timeWindow.end field from the Shoot resource nodeNetwork: 10.250.0.0/16 # .spec.networking.nodes field from the Shoot resource podNetwork: 100.96.0.0/11 # .spec.networking.pods field from the Shoot resource projectName: dev # .metadata.name of the Project provider: \u003csome-provider-name\u003e # .spec.provider.type field from the Shoot resource region: europe-central-1 # .spec.region field from the Shoot resource serviceNetwork: 100.64.0.0/13 # .spec.networking.services field from the Shoot resource shootName: crazy-botany # .metadata.name from the Shoot resource ","categories":"","description":"","excerpt":"Shoot Info ConfigMap Overview The gardenlet maintains a ConfigMap …","ref":"/docs/gardener/shoot_info_configmap/","tags":"","title":"Shoot Info Configmap"},{"body":"Shoot Kubernetes and Operating System Versioning in Gardener Motivation On the one hand, Gardener is responsible for managing the Kubernetes and the Operating System (OS) versions of its Shoot clusters. On the other hand, Gardener needs to be configured and updated based on the availability and support of the Kubernetes and Operating System version it provides. For instance, the Kubernetes community releases minor versions roughly every three months and usually maintains three minor versions (the current and the last two) with bug fixes and security updates. 
Patch releases are done more frequently.\nWhen using the term Machine image in the following, we refer to the OS version that comes with the machine image of the node/worker pool of a Gardener Shoot cluster. As such, we are not referring to the CloudProvider specific machine image like the AMI for AWS. For more information on how Gardener maps machine image versions to CloudProvider specific machine images, take a look at the individual gardener extension providers, such as the provider for AWS.\nGardener should be configured accordingly to reflect the “logical state” of a version. It should be possible to define the Kubernetes or Machine image versions that still receive bug fixes and security patches, and also vice-versa to define the versions that are out-of-maintenance and are potentially vulnerable. Moreover, this allows Gardener to “understand” the current state of a version and act upon it (more information in the following sections).\nOverview As a Gardener operator:\n I can classify a version based on its logical state (preview, supported, deprecated, and expired; see Version Classification). I can define which Machine image and Kubernetes versions are eligible for the auto update of clusters during the maintenance time. I can define a moment in time when Shoot clusters are forcefully migrated off a certain version (through an expirationDate). I can define an update path for machine images for auto and force updates (see Update path for machine image versions). I can disallow the creation of clusters having a certain version (think of severe security issues). As an end-user/Shoot owner of Gardener:\n I can get information about which Kubernetes and Machine image versions exist and their classification. I can determine the time when my Shoot cluster’s Machine image and Kubernetes version will be forcefully updated to the next patch or minor version (in case the cluster is running a deprecated version with an expiration date). 
I can get this information via API from the CloudProfile. Version Classifications Administrators can classify versions into four distinct “logical states”: preview, supported, deprecated, and expired. The version classification serves as a “point-of-reference” for end-users and also has implications during shoot creation and the maintenance time.\nIf a version is unclassified, Gardener cannot make those decisions based on the “logical state”. Nevertheless, Gardener can operate without version classifications, and they can be added at any time to the Kubernetes and machine image versions in the CloudProfile.\nAs a best practice, versions usually start with the classification preview, then are promoted to supported, eventually deprecated and finally expired. This information is programmatically available in the CloudProfiles of the Garden cluster.\n preview: A preview version is a new version that has not yet undergone thorough testing, possibly a new release, and needs time to be validated. Due to its young age, there is a higher probability of undiscovered issues, and it is therefore not yet recommended for production usage. A Shoot does not update (neither auto-update nor force-update) to a preview version during the maintenance time. Also, preview versions are not considered for the defaulting to the highest available version when deliberately omitting the patch version during Shoot creation. Typically, after a fresh release of a new Kubernetes (e.g., v1.25.0) or Machine image version (e.g., suse-chost 15.4.20220818), the operator tags it as preview until they have gained sufficient experience and regard this version as reliable. After the operator has gained sufficient trust, the version can be manually promoted to supported.\n supported: A supported version is the recommended version for new and existing Shoot clusters. This is the version that new Shoot clusters should use and existing clusters should update to. 
Typically for Kubernetes versions, the latest Kubernetes patch versions of the current (if not still in preview) and the last 3 minor Kubernetes versions are maintained by the community. An operator could define these versions as being supported (e.g., v1.27.6, v1.26.10, and v1.25.12).\n deprecated: A deprecated version is a version that approaches the end of its lifecycle and can contain issues which are probably resolved in a supported version. New Shoots should not use this version anymore. Existing Shoots will be updated to a newer version if auto-update is enabled (.spec.maintenance.autoUpdate.kubernetesVersion for Kubernetes version auto-update, or .spec.maintenance.autoUpdate.machineImageVersion for machine image version auto-update). Using automatic upgrades, however, does not guarantee that a Shoot runs a non-deprecated version, as the latest version (overall or of the minor version) can be deprecated as well. Deprecated versions should have an expiration date set for eventual expiration.\n expired: An expired version has an expiration date (based on the Golang time package) in the past. 
New clusters with that version cannot be created and existing clusters are forcefully migrated to a higher version during the maintenance time.\n Below is an example of how the relevant section of the CloudProfile might look:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: alicloud spec: kubernetes: versions: - classification: preview version: 1.27.0 - classification: preview version: 1.26.3 - classification: supported version: 1.26.2 - classification: preview version: 1.25.5 - classification: supported version: 1.25.4 - classification: supported version: 1.24.6 - classification: deprecated expirationDate: \"2022-11-30T23:59:59Z\" version: 1.24.5 Automatic Version Upgrades There are two ways in which the Kubernetes version of the control plane, as well as the Kubernetes and machine image version of a worker pool, can be upgraded: auto update and forceful update. See Automatic Version Updates for how to enable auto updates for Kubernetes or machine image versions on the Shoot cluster.\nIf a Shoot is running a version after its expiration date has passed, it will be forcefully updated during its maintenance time. This happens even if the owner has opted out of automatic cluster updates!\nWhen is an auto update triggered?\n The Shoot has auto-update enabled and the version is not the latest eligible version for the auto-update. Please note that this latest version that qualifies for an auto-update is not necessarily the overall latest version in the CloudProfile: For Kubernetes version, the latest eligible version for auto-updates is the latest patch version of the current minor. For machine image version, the latest eligible version for auto-updates is controlled by the updateStrategy field of the machine image in the CloudProfile. The Shoot has auto-update disabled and the version is either expired or does not exist. The auto update can fail if the version is already on the latest eligible version for the auto-update. 
A failed auto update triggers a force update. The force and auto update path for Kubernetes and machine image versions differ slightly and are described in more detail below.\nUpdate rules for both Kubernetes and machine image versions\n Both auto and force update first try to update to the latest patch version of the same minor. An auto update prefers supported versions over deprecated versions. If there is a lower supported version and a higher deprecated version, auto update will pick the supported version. If all qualifying versions are deprecated, update to the latest deprecated version. An auto update never updates to an expired version. A force update prefers to update to not-expired versions. If all qualifying versions are expired, update to the latest expired version. Please note that therefore multiple consecutive version upgrades are possible. In this case, the version is again upgraded in the next maintenance time. Update path for machine image versions Administrators can define three different update strategies (field updateStrategy) for machine images in the CloudProfile: patch, minor, major (default). This is to accommodate the different version schemes of Operating Systems (e.g. Gardenlinux only updates major and minor versions with occasional patches).\n patch: update to the latest patch version of the current minor version. When using an expired version: force update to the latest patch of the current minor. If already on the latest patch version, then force update to the next higher (not necessarily +1) minor version. minor: update to the latest minor and patch version. When using an expired version: force update to the latest minor and patch of the current major. If already on the latest minor and patch of the current major, then update to the next higher (not necessarily +1) major version. major: always update to the overall latest version. This is the legacy behavior for automatic machine image version upgrades. 
Force updates are not possible and will fail if the latest version in the CloudProfile for that image is expired (EOL scenario). Example configuration in the CloudProfile:\nmachineImages: - name: gardenlinux updateStrategy: minor versions: - version: 1096.1.0 - version: 934.8.0 - version: 934.7.0 - name: suse-chost updateStrategy: patch versions: - version: 15.3.20220818 - version: 15.3.20221118 Please note that force updates for machine images can skip minor versions (strategy: patch) or major versions (strategy: minor) if the next minor/major version has no qualifying versions (only preview versions).\nUpdate path for Kubernetes versions For Kubernetes versions, the auto update picks the latest non-preview patch version of the current minor version.\nIf the cluster is already on the latest patch version and the latest patch version is also expired, it will continue with the latest patch version of the next consecutive minor (minor +1) Kubernetes version, so it will result in an update of a minor Kubernetes version!\nKubernetes “minor version jumps” are not allowed - meaning to skip the update to the consecutive minor version and directly update to any version after that. For instance, the version 1.24.x can only update to a version 1.25.x, not to 1.26.x or any other version. This is because Kubernetes does not guarantee upgradability in this case, leading to possibly broken Shoot clusters. The administrator has to set up the CloudProfile in such a way that consecutive Kubernetes minor versions are available. Otherwise, Shoot clusters will fail to upgrade during the maintenance time.\nConsider the CloudProfile below with a Shoot using the Kubernetes version 1.24.12. 
Even though the version is expired, due to missing 1.25.x versions, the Gardener Controller Manager cannot upgrade the Shoot’s Kubernetes version.\nspec: kubernetes: versions: - version: 1.26.10 - version: 1.26.9 - version: 1.24.12 expirationDate: \"\u003cexpiration date in the past\u003e\" The CloudProfile must specify versions 1.25.x of the consecutive minor version. Configuring the CloudProfile in such a way, the Shoot’s Kubernetes version will be upgraded to version 1.25.10 in the next maintenance time.\nspec: kubernetes: versions: - version: 1.26.9 - version: 1.25.10 - version: 1.25.9 - version: 1.24.12 expirationDate: \"\u003cexpiration date in the past\u003e\" Version Requirements (Kubernetes and Machine Image) The Gardener API server enforces the following requirements for versions:\n A version that is in use by a Shoot cannot be deleted from the CloudProfile. Creating a new version with expiration date in the past is not allowed. There can be only one supported version per minor version. The latest Kubernetes version cannot have an expiration date. NOTE: The latest version for a machine image can have an expiration date. [*] [*] Useful for cases in which support for a given machine image needs to be deprecated and removed (for example, the machine image reaches end of life).\nRelated Documentation You might want to read about the Shoot Updates and Upgrades procedures to get to know the effects of such operations.\n","categories":"","description":"","excerpt":"Shoot Kubernetes and Operating System Versioning in Gardener …","ref":"/docs/gardener/shoot_versions/","tags":"","title":"Shoot Kubernetes and Operating System Versioning in Gardener"},{"body":"Shoot Maintenance There is a general document about shoot maintenance that you might want to read. 
Here, we describe how you can influence certain operations that happen during a shoot maintenance.\nRestart Control Plane Controllers As outlined in the above linked document, Gardener offers to restart certain control plane controllers running in the seed during a shoot maintenance.\nExtension controllers can increase the number of pods affected by these restarts. If your Gardener extension manages pods of a shoot’s control plane (shoot namespace in seed) and it could potentially profit from a regular restart, please consider labeling it with maintenance.gardener.cloud/restart=true.\n","categories":"","description":"","excerpt":"Shoot Maintenance There is a general document about shoot maintenance …","ref":"/docs/gardener/extensions/shoot-maintenance/","tags":"","title":"Shoot Maintenance"},{"body":"Shoot Maintenance Shoots configure a maintenance time window in which Gardener performs certain operations that may restart the control plane, roll out the nodes, result in higher network traffic, etc. A summary of what was changed in the last maintenance time window in shoot specification is kept in the shoot status .status.lastMaintenance field.\nThis document outlines what happens during a shoot maintenance.\nTime Window Via the .spec.maintenance.timeWindow field in the shoot specification, end-users can configure the time window in which maintenance operations are executed. Gardener runs one maintenance operation per day in this time window:\nspec: maintenance: timeWindow: begin: 220000+0100 end: 230000+0100 The offset (+0100) is considered with respect to UTC time. The minimum time window is 30m and the maximum is 6h.\n⚠️ Please note that there is no guarantee that a maintenance operation that, e.g., starts a node roll-out will finish within the time window. 
Especially for large clusters, it may take several hours until a graceful rolling update of the worker nodes succeeds (also depending on the workload and the configured pod disruption budgets/termination grace periods).\nInternally, Gardener is subtracting 15m from the end of the time window to (best-effort) try to finish the maintenance until the end is reached, however, this might not work in all cases.\nIf you don’t specify a time window, then Gardener will randomly compute it. You can change it later, of course.\nAutomatic Version Updates The .spec.maintenance.autoUpdate field in the shoot specification allows you to control how/whether automatic updates of Kubernetes patch and machine image versions are performed. Machine image versions are updated per worker pool.\nspec: maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true During the daily maintenance, the Gardener Controller Manager updates the Shoot’s Kubernetes and machine image version if any of the following criteria applies:\n There is a higher version available and the Shoot opted-in for automatic version updates. The currently used version is expired. The target version for machine image upgrades is controlled by the updateStrategy field for the machine image in the CloudProfile. Allowed update strategies are patch, minor and major.\nGardener (gardener-controller-manager) populates the lastMaintenance field in the Shoot status with the maintenance results.\nLast Maintenance: Description: \"All maintenance operations successful. Control Plane: Updated Kubernetes version from 1.26.4 to 1.27.1. 
Reason: Kubernetes version expired - force update required\" State: Succeeded Triggered Time: 2023-07-28T09:07:27Z Additionally, Gardener creates events with the type MachineImageVersionMaintenance or KubernetesVersionMaintenance on the Shoot describing the action performed during maintenance, including the reason why an update has been triggered.\nLAST SEEN TYPE REASON OBJECT MESSAGE 30m Normal MachineImageVersionMaintenance shoot/local Worker pool \"local\": Updated image from 'gardenlinux' version 'xy' to version 'abc'. Reason: Automatic update of the machine image version is configured (image update strategy: major). 30m Normal KubernetesVersionMaintenance shoot/local Control Plane: Updated Kubernetes version from \"1.26.4\" to \"1.27.1\". Reason: Kubernetes version expired - force update required. 15m Normal KubernetesVersionMaintenance shoot/local Worker pool \"local\": Updated Kubernetes version '1.26.3' to version '1.27.1'. Reason: Kubernetes version expired - force update required. If at least one maintenance operation fails, the lastMaintenance field in the Shoot status is set to Failed:\nLast Maintenance: Description: \"(1/2) maintenance operations successful: Control Plane: Updated Kubernetes version from 1.26.4 to 1.27.1. Reason: Kubernetes version expired - force update required, Worker pool x: 'gardenlinux' machine image version maintenance failed. 
Reason for update: machine image version expired\" FailureReason: \"Worker pool x: either the machine image 'gardenlinux' is reaching end of life and migration to another machine image is required or there is a misconfiguration in the CloudProfile.\" State: Failed Triggered Time: 2023-07-28T09:07:27Z Please refer to the Shoot Kubernetes and Operating System Versioning in Gardener topic for more information about Kubernetes and machine image versions in Gardener.\nCluster Reconciliation Gardener administrators/operators can configure the gardenlet in a way that it only reconciles shoot clusters during their maintenance time windows. This behaviour is not controllable by end-users but might make sense for large Gardener installations. Concretely, your shoot will be reconciled regularly during its maintenance time window. Outside of the maintenance time window it will only reconcile if you change the specification or if you explicitly trigger it, see also Trigger Shoot Operations.\nConfine Specification Changes/Updates Roll Out Via the .spec.maintenance.confineSpecUpdateRollout field you can control whether you want to make Gardener roll out changes/updates to your shoot specification only during the maintenance time window. It is false by default, i.e., any change to your shoot specification triggers a reconciliation (even outside of the maintenance time window). This is helpful if you want to update your shoot but don’t want the changes to be applied immediately. One example use-case would be a Kubernetes version upgrade that you want to roll out during the maintenance time window. Any update to the specification will not increase the .metadata.generation of the Shoot, which is something you should be aware of. Also, even if Gardener administrators/operators have not enabled the “reconciliation in maintenance time window only” configuration (as mentioned above), then your shoot will only reconcile in the maintenance time window. 
The reason is that Gardener cannot differentiate between create/update/reconcile operations.\n⚠️ If confineSpecUpdateRollout=true, please note that if you change the maintenance time window itself, then it will only be effective after the upcoming maintenance.\n⚠️ As exceptions to the above rules, manually triggered reconciliations and changes to the .spec.hibernation.enabled field trigger immediate rollouts. I.e., if you hibernate or wake-up your shoot, or you explicitly tell Gardener to reconcile your shoot, then Gardener gets active right away.\nShoot Operations In case you would like to perform a shoot credential rotation or a reconcile operation during your maintenance time window, you can annotate the Shoot with\nmaintenance.gardener.cloud/operation=\u003coperation\u003e This will execute the specified \u003coperation\u003e during the next maintenance reconciliation. Note that Gardener will remove this annotation after it has been performed in the maintenance reconciliation.\n ⚠️ This is skipped when the Shoot’s .status.lastOperation.state=Failed. Make sure to retry your shoot reconciliation beforehand.\n Special Operations During Maintenance The shoot maintenance controller triggers special operations that are performed as part of the shoot reconciliation.\nInfrastructure and DNSRecord Reconciliation The reconciliation of the Infrastructure and DNSRecord extension resources is only demanded during the shoot’s maintenance time window. The rationale behind it is to prevent sending too many requests against the cloud provider APIs, especially on large landscapes or if a user has many shoot clusters in the same cloud provider account.\nRestart Control Plane Controllers Gardener operators can make Gardener restart/delete certain control plane pods during a shoot maintenance. 
This feature helps to automatically solve service denials of controllers due to stale caches, dead-locks or starving routines.\nPlease note that these are exceptional cases but they are observed from time to time. Gardener, for example, takes this precautionary measure for kube-controller-manager pods.\nSee Shoot Maintenance to see how extension developers can extend this behaviour.\nRestart Some Core Addons Gardener operators can make Gardener restart some core addons (at the moment only CoreDNS) during a shoot maintenance.\nCoreDNS benefits from this feature as it automatically solves problems with clients stuck to a single replica of the deployment and thus overloading it. Please note that these are exceptional cases but they are observed from time to time.\n","categories":"","description":"Defining the maintenance time window, configuring automatic version updates, confining reconciliations to only happen during maintenance, adding an additional maintenance operation, etc.","excerpt":"Defining the maintenance time window, configuring automatic version …","ref":"/docs/gardener/shoot_maintenance/","tags":"","title":"Shoot Maintenance"},{"body":"Shoot Networking Configurations This document contains network related information for Shoot clusters.\nPod Network A Pod network is imperative for any kind of cluster communication with Pods not started within the Node’s host network. More information about the Kubernetes network model can be found in the Cluster Networking topic.\nGardener allows users to configure the Pod network’s CIDR during Shoot creation:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: networking: type: \u003csome-network-extension-name\u003e # {calico,cilium} pods: 100.96.0.0/16 nodes: ... services: ... ⚠️ The networking.pods IP configuration is immutable and cannot be changed afterwards. 
Please consider the following paragraph to choose a configuration which will meet your demands.\n One of the network plugin’s (CNI) tasks is to assign IP addresses to Pods started in the Pod network. Different network plugins come with different IP address management (IPAM) features, so we can’t give any definite advice on how IP ranges should be configured. Nevertheless, we want to outline the standard configuration.\nInformation in .spec.networking.pods matches the --cluster-cidr flag of the Kube-Controller-Manager of your Shoot cluster. This IP range is divided into smaller subnets, also called podCIDRs (default mask /24), which are assigned to the .spec.podCIDR field of Node objects. Pods get their IP address from this smaller node subnet in a default IPAM setup. Thus, it must be guaranteed that enough of these subnets can be created for the maximum number of nodes you expect in the cluster.\nExample 1\nPod network: 100.96.0.0/16 nodeCIDRMaskSize: /24 ------------------------- Number of podCIDRs: 256 --\u003e max. Node count Number of IPs per podCIDRs: 256 With the configuration above a Shoot cluster can at most have 256 nodes which are ready to run workload in the Pod network.\nExample 2\nPod network: 100.96.0.0/20 nodeCIDRMaskSize: /24 ------------------------- Number of podCIDRs: 16 --\u003e max. Node count Number of IPs per podCIDRs: 256 With the configuration above a Shoot cluster can at most have 16 nodes which are ready to run workload in the Pod network.\nBesides the configuration in .spec.networking.pods, users can tune the nodeCIDRMaskSize used by Kube-Controller-Manager on shoot creation. 
A smaller IP range per node means more podCIDRs and thus the ability to provision more nodes in the cluster, but fewer available IPs for Pods running on each of the nodes.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubeControllerManager: nodeCIDRMaskSize: 24 (default) ⚠️ The nodeCIDRMaskSize configuration is immutable and cannot be changed afterwards.\n Example 3\nPod network: 100.96.0.0/20 nodeCIDRMaskSize: /25 ------------------------- Number of podCIDRs: 32 --\u003e max. Node count Number of IPs per podCIDRs: 128 With the configuration above, a Shoot cluster can at most have 32 nodes which are ready to run workload in the Pod network.\n","categories":"","description":"Configuring Pod network. Maximum number of Nodes and Pods per Node","excerpt":"Configuring Pod network. Maximum number of Nodes and Pods per Node","ref":"/docs/gardener/shoot_networking/","tags":"","title":"Shoot Networking Configurations"},{"body":"Register Shoot Networking Filter Extension in Shoot Clusters Introduction Within a shoot cluster, it is possible to enable the networking filter. It is necessary that the Gardener installation your shoot cluster runs in is equipped with a shoot-networking-filter extension. Please ask your Gardener operator if the extension is available in your environment.\nShoot Feature Gate In most of the Gardener setups the shoot-networking-filter extension is not enabled globally and thus must be configured per shoot cluster. Please adapt the shoot specification by the configuration shown below to activate the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-networking-filter ... Opt-out If the shoot networking filter is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-filter disabled: true ... 
Ingress Filtering By default, the networking filter only filters egress traffic. However, if you enable blackholing, incoming traffic will also be blocked. You can enable blackholing on a per-shoot basis.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-filter providerConfig: egressFilter: blackholingEnabled: true ... Ingress traffic can only be blocked by blackhole routing if the source IP address is preserved. On Azure, GCP and AliCloud this works by default. The default on AWS is a classic load balancer that replaces the source IP with its own IP address. Here, a network load balancer has to be configured by adding the annotation service.beta.kubernetes.io/aws-load-balancer-type: \"nlb\" to the service. On OpenStack, load balancers don’t preserve the source address.\nPlease note that if you disable blackholing in an existing shoot, the associated blackhole routes will not be removed automatically. To remove these routes, you can either replace the affected nodes or delete the routes manually.\nCustom IP It is possible to add custom IP addresses to the network filter. This can be useful for testing purposes.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-filter providerConfig: egressFilter: staticFilterList: - network: 1.2.3.4/31 policy: BLOCK_ACCESS - network: 5.6.7.8/32 policy: BLOCK_ACCESS - network: ::2/128 policy: BLOCK_ACCESS ... ","categories":"","description":"","excerpt":"Register Shoot Networking Filter Extension in Shoot Clusters …","ref":"/docs/extensions/others/gardener-extension-shoot-networking-filter/shoot-networking-filter/","tags":"","title":"Shoot Networking Filter"},{"body":"Register Shoot Networking Filter Extension in Shoot Clusters Introduction Within a shoot cluster, it is possible to enable the network problem detector. 
It is necessary that the Gardener installation your shoot cluster runs in is equipped with a shoot-networking-problemdetector extension. Please ask your Gardener operator if the extension is available in your environment.\nShoot Feature Gate In most of the Gardener setups the shoot-networking-problemdetector extension is not enabled globally and thus must be configured per shoot cluster. Please adapt the shoot specification by the configuration shown below to activate the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-networking-problemdetector ... Opt-out If the shoot network problem detector is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-problemdetector disabled: true ... ","categories":"","description":"","excerpt":"Register Shoot Networking Filter Extension in Shoot Clusters …","ref":"/docs/extensions/others/gardener-extension-shoot-networking-problemdetector/shoot-networking-problemdetector/","tags":"","title":"Shoot Networking Problemdetector"},{"body":"Enable / disable overlay network for shoots with Calico Gardener can be used with or without the overlay network.\nStarting versions:\n provider-gcp@v1.25.0 provider-alicloud@v1.43.0 provider-aws@v1.38.2 provider-openstack@v1.30.0 The default configuration of shoot clusters is without overlay network.\nUnderstanding overlay network Overlay networking permits the routing of packets between multiple pods located on multiple nodes, even if the pod and the node network are not the same.\nThis is done through the encapsulation of pod packets in the node network so that the routing can be done as usual. We use ipip encapsulation with calico if the overlay network is enabled. 
This (simply put) sends an IP packet as payload in another IP packet.\nIn order to simplify the troubleshooting of problems and reduce the latency of packets traveling between nodes, the overlay network is disabled by default as stated above for all new clusters.\nThis means that the routing is done directly through the VPC routing table. Basically, when a new node is created, it is assigned a slice (usually a /24) of the pod network. All future pods in that node are going to be in this slice. Then, the cloud-controller-manager updates the cloud provider router to add the new route (all packets within the network slice as destination should go to that node).\nThis has the following advantages:\n The node does less work, as encapsulation takes some CPU cycles. The maximum transmission unit (MTU) is slightly bigger, resulting in slightly better performance, i.e. potentially more payload bytes per packet. The setup is more direct and simpler, which makes problems much easier to troubleshoot. In the case where multiple shoots are in the same VPC and the overlay network is disabled, if the pod networks are not configured properly, there is a very strong chance that some pod IP addresses might overlap, which is going to cause all sorts of hard-to-diagnose problems. To avoid that, make sure that the podCIDRs of the shoots do not overlap with each other.\nEnabling the overlay network In certain cases, the overlay network might be preferable if, for example, the customer wants to create multiple clusters in the same VPC without ensuring there’s no overlap between the pod networks.\nTo enable the overlay network, add the following to the shoot’s YAML:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: ... spec: ... networking: type: calico providerConfig: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig overlay: enabled: true ... 
Disabling the overlay network Inversely, here is how to disable the overlay network:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: ... spec: ... networking: type: calico providerConfig: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig overlay: enabled: false ... How to know if a cluster is using overlay or not? You can look at any of the old nodes. If there are tunl0 devices, the overlay network was used at least at some point in time. Another way is to look into the Network object in the shoot’s control plane namespace on the seed (see example above).\nDo we have some documentation somewhere on how to do the migration? No, not yet. The migration from no overlay to overlay is fairly simple: just set the configuration as specified above. The other way is more complicated as the Network configuration needs to be changed AND the local routes need to be cleaned. Unfortunately, the change will be rolled out slowly (one calico-node at a time). Hence, it implies some network outages during the migration.\nAWS implementation On AWS, it is not possible to use the cloud-controller-manager for managing the routes as it does not support multiple route tables, which Gardener creates. Therefore, a custom controller is created to manage the routes.\n","categories":"","description":"","excerpt":"Enable / disable overlay network for shoots with Calico Gardener can …","ref":"/docs/extensions/network-extensions/gardener-extension-networking-calico/shoot_overlay_network/","tags":"","title":"Shoot Overlay Network"},{"body":"Introduction There are two types of pod autoscaling in Kubernetes: Horizontal Pod Autoscaling (HPA) and Vertical Pod Autoscaling (VPA). HPA (implemented as part of the kube-controller-manager) scales the number of pod replicas, while VPA (implemented as an independent community project) adjusts the CPU and memory requests for the pods. 
Both types of autoscaling aim to optimize resource usage/costs and maintain the performance and (high) availability of applications running on Kubernetes.\nHorizontal Pod Autoscaling (HPA) Horizontal Pod Autoscaling involves increasing or decreasing the number of pod replicas in a deployment, replica set, stateful set, or anything really with a scale subresource that manages pods. HPA adjusts the number of replicas based on specified metrics, such as CPU or memory average utilization (usage divided by requests; most common) or average value (usage; less common). When the demand on your application increases, HPA automatically scales out the number of pods to meet the demand. Conversely, when the demand decreases, it scales in the number of pods to reduce resource usage.\nHPA targets (mostly stateless) applications where adding more instances of the application can linearly increase the ability to handle additional load. It is very useful for applications that experience variable traffic patterns, as it allows for real-time scaling without the need for manual intervention.\n [!NOTE] HPA continuously monitors the metrics of the targeted pods and adjusts the number of replicas based on the observed metrics. It operates solely on the current metrics when it calculates the averages across all pods, meaning it reacts to the immediate resource usage without considering past trends or patterns. Also, all pods are treated equally based on the average metrics. This could potentially lead to situations where some pods are under high load while others are underutilized. Therefore, particular care must be applied to (fair) load-balancing (connection vs. request vs. 
actual resource load balancing are crucial).\n A Few Words on the Cluster-Proportional (Horizontal) Autoscaler (CPA) and the Cluster-Proportional Vertical Autoscaler (CPVA) Besides HPA and VPA, CPA and CPVA are further options for scaling horizontally or vertically (neither is deployed by Gardener and must be deployed by the user). Unlike HPA and VPA, CPA and CPVA do not monitor the actual pod metrics, but scale solely on the number of nodes or CPU cores in the cluster. While this approach may be helpful and sufficient in a few rare cases, it is often a risky and crude scaling scheme that we do not recommend. More often than not, cluster-proportional scaling results in either under- or over-reserving your resources.\nVertical Pod Autoscaling (VPA) Vertical Pod Autoscaling, on the other hand, focuses on adjusting the CPU and memory resources allocated to the pods themselves. Instead of changing the number of replicas, VPA tweaks the resource requests (and limits, but only proportionally, if configured) for the pods in a deployment, replica set, stateful set, daemon set, or anything really with a scale subresource that manages pods. This means that each pod can be given more, or fewer resources as needed.\nVPA is very useful for optimizing the resource requests of pods that have dynamic resource needs over time. It does so by mutating pod requests (unfortunately, not in-place). Therefore, in order to apply new recommendations, pods that are “out of bounds” (i.e. below a configured/computed lower or above a configured/computed upper recommendation percentile) will be evicted proactively, but also pods that are “within bounds” may be evicted after a grace period. 
The corresponding higher-level replication controller will then recreate a new pod that VPA will then mutate to set the currently recommended requests (and proportional limits, if configured).\n [!NOTE] VPA continuously monitors all targeted pods and calculates recommendations based on their usage (one recommendation for the entire target). This calculation is influenced by configurable percentiles, with a greater emphasis on recent usage data and a gradual decrease (=decay) in the relevance of older data. However, this means, that VPA doesn’t take into account individual needs of single pods - eventually, all pods will receive the same recommendation, which may lead to considerable resource waste. Ideally, VPA would update pods in-place depending on their individual needs, but that’s (individual recommendations) not in its design, even if in-place updates get implemented, which may be years away for VPA based on current activity on the component.\n Selecting the Appropriate Autoscaler Before deciding on an autoscaling strategy, it’s important to understand the characteristics of your application:\n Interruptibility: Most importantly, if the clients of your workload are too sensitive to disruptions/cannot cope well with terminating pods, then maybe neither HPA nor VPA is an option (both, HPA and VPA cause pods and connections to be terminated, though VPA even more frequently). Clients must retry on disruptions, which is a reasonable ask in a highly dynamic (and self-healing) environment such as Kubernetes, but this is often not respected (or expected) by your clients (they may not know or care you run the workload in a Kubernetes cluster and have different expectations to the stability of the workload unless you communicated those through SLIs/SLOs/SLAs). Statelessness: Is your application stateless or stateful? 
Stateless applications are typically better candidates for HPA as they can be easily scaled out by adding more replicas without worrying about maintaining state. Traffic Patterns: Does your application experience variable traffic? If so, HPA can help manage these fluctuations by adjusting the number of replicas to handle the load. Resource Usage: Does your application’s resource usage change over time? VPA can adjust the CPU and memory reservations dynamically, which is beneficial for applications with non-uniform resource requirements. Scalability: Can your application handle increased load by scaling vertically (more resources per pod) or does it require horizontal scaling (more pod instances)? HPA is the right choice if:\n Your application is stateless and can handle increased load by adding more instances. You experience short-term fluctuations in traffic that require quick scaling responses. You want to maintain a specific performance metric, such as requests per second per pod. VPA is the right choice if:\n Your application’s resource requirements change over time, and you want to optimize resource usage without manual intervention. You want to avoid the complexity of managing resource requests for each pod, especially when they run code where it’s impossible for you to suggest static requests. In essence:\n For applications that can handle increased load by simply adding more replicas, HPA should be used to handle short-term fluctuations in load by scaling the number of replicas. For applications that require more resources per pod to handle additional work, VPA should be used to adjust the resource allocation for longer-term trends in resource usage. Consequently, if both cases apply (VPA often applies), HPA and VPA can also be combined. However, combining both, especially on the same metrics (CPU and memory), requires understanding and care to avoid conflicts and ensure that the autoscaling actions do not interfere with and rather complement each other. 
For more details, see Combining HPA and VPA.\nHorizontal Pod Autoscaler (HPA) HPA operates by monitoring resource metrics for all pods in a target. It computes the desired number of replicas from the current average metrics and the desired user-defined metrics as follows:\ndesiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]\nHPA checks the metrics at regular intervals, which can be configured by the user. Several types of metrics are supported (classical resource metrics like CPU and memory, but also custom and external metrics like requests per second or queue length can be configured, if available). If a scaling event is necessary, HPA adjusts the replica count for the targeted resource.\nDefining an HPA Resource To configure HPA, you need to create an HPA resource in your cluster. This resource specifies the target to scale, the metrics to be used for scaling decisions, and the desired thresholds. Here’s an example of an HPA configuration:\napiVersion: autoscaling/v2 kind: HorizontalPodAutoscaler metadata: name: foo-hpa spec: scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: foo-deployment minReplicas: 1 maxReplicas: 10 metrics: - type: Resource resource: name: cpu target: type: AverageValue averageValue: 2 - type: Resource resource: name: memory target: type: AverageValue averageValue: 8G behavior: scaleUp: stabilizationWindowSeconds: 30 policies: - type: Percent value: 100 periodSeconds: 60 scaleDown: stabilizationWindowSeconds: 1800 policies: - type: Pods value: 1 periodSeconds: 300 In this example, HPA is configured to scale foo-deployment based on pod average CPU and memory usage. It will maintain an average CPU and memory usage (not utilization, which is usage divided by requests!) across all replicas of 2 CPUs and 8G or lower with as few replicas as possible. 
The number of replicas will be scaled between a minimum of 1 and a maximum of 10 based on this target.\nFor some time now, you can also configure the autoscaling based on the resource usage of individual containers, not only on the resource usage of the entire pod. All you need to do is to switch the type from Resource to ContainerResource and specify the container name.\nIn the official documentation ([1] and [2]) you will find examples with average utilization (averageUtilization), not average usage (averageValue), but this is not particularly helpful, especially if you plan to combine HPA together with VPA on the same metrics (generally discouraged in the documentation). If you want to safely combine both on the same metrics, you should scale on average usage (averageValue) as shown above. For more details, see Combining HPA and VPA.\nFinally, the behavior section influences how fast you scale up and down. Most of the time (depending on your workload), you will want to scale out faster than you scale in. In this example, the configuration will trigger a scale-out only after observing the need to scale out for 30s (stabilizationWindowSeconds) and will then only scale out at most 100% (value + type) of the current number of replicas every 60s (periodSeconds). The configuration will trigger a scale-in only after observing the need to scale in for 1800s (stabilizationWindowSeconds) and will then only scale in at most 1 pod (value + type) every 300s (periodSeconds). As you can see, scale-out happens quicker than scale-in in this example.\nHPA (actually KCM) Options HPA is a function of the kube-controller-manager (KCM).\nYou can read up on the full KCM options online and set most of them conveniently in your Gardener shoot cluster spec:\n downscaleStabilization (default 5m): HPA will scale out whenever the formula (in accordance with the behavior section, if present in the HPA resource) yields a higher replica count, but it won’t scale in just as eagerly. 
This option lets you define a trailing time window that HPA must check and only if the recommended replica count is consistently lower throughout the entire time window, HPA will scale in (in accordance with the behavior section, if present in the HPA resource). If at any point in time in that trailing time window the recommended replica count isn’t lower, scale-in won’t happen. This setting is just a default, if nothing is defined in the behavior section of an HPA resource. The default for the upscale stabilization is 0s and it cannot be set via a KCM option (downscale stabilization was historically more important than upscale stabilization and when later the behavior sections were added to the HPA resources, upscale stabilization remained missing from the KCM options). tolerance (default +/-10%): HPA will not scale out or in if the desired replica count is (mathematically as a float) near the actual replica count (see source code for details), which is a form of hysteresis to avoid replica flapping around a threshold. There are a few more configurable options of lesser interest:\n syncPeriod (default 15s): How often HPA retrieves the pods and metrics respectively how often it recomputes and sets the desired replica count.\n cpuInitializationPeriod (default 30s) and initialReadinessDelay (default 5m): Both settings only affect whether or not CPU metrics are considered for scaling decisions. They can be easily misinterpreted as the official docs are somewhat hard to read (see source code for details, which is more readable, if you ignore the comments). Normally, you have little reason to modify them, but here is what they do:\n cpuInitializationPeriod: Defines a grace period after a pod starts during which HPA won’t consider CPU metrics of the pod for scaling if the pod is either not ready or it is ready, but a given CPU metric is older than the last state transition (to ready). 
This is to ignore CPU metrics that predate the current readiness while still in initialization to not make scaling decisions based on potentially misleading data. If the pod is ready and a CPU metric was collected after it became ready, it is considered also within this grace period. initialReadinessDelay: Defines another grace period after a pod starts during which HPA won’t consider CPU metrics of the pod for scaling if the pod is not ready and it became not ready within this grace period (the docs/comments want to check whether the pod was ever ready, but the code only checks whether the pod condition last transition time to not ready happened within that grace period which it could have from being ready or simply unknown before). This is to ignore not (ever have been) ready pods while still in initialization to not make scaling decisions based on potentially misleading data. If the pod is ready, it is considered also within this grace period. So, regardless of the values of these settings, if a pod is reporting ready and it has a CPU metric from the time after it became ready, that pod and its metric will be considered. This holds true even if the pod becomes ready very early into its initialization. These settings cannot be used to “black-out” pods for a certain duration before being considered for scaling decisions. Instead, if it is your goal to ignore a potentially resource-intensive initialization phase that could wrongly lead to further scale-out, you would need to configure your pods to not report as ready until that resource-intensive initialization phase is over.\n Considerations When Using HPA Selection of metrics: Besides CPU and memory, HPA can also target custom or external metrics. Pick those (in addition or exclusively), if you guarantee certain SLOs in your SLAs. Targeting usage or utilization: HPA supports usage (absolute) and utilization (relative). Utilization is often preferred in simple examples, but usage is more precise and versatile. 
Compatibility with VPA: Care must be taken when using HPA in conjunction with VPA, as they can potentially interfere with each other’s scaling decisions. Vertical Pod Autoscaler (VPA) VPA operates by monitoring resource metrics for all pods in a target. It computes a resource requests recommendation from the historic and current resource metrics. VPA checks the metrics at regular intervals, which can be configured by the user. Only CPU and memory are supported. If VPA detects that a pod’s resource allocation is too high or too low, it may evict pods (if within the permitted disruption budget), which will trigger the creation of a new pod by the corresponding higher-level replication controller, which will then be mutated by VPA to match resource requests recommendation. This happens in three different components that work together:\n VPA Recommender: The Recommender observes the historic and current resource metrics of pods and generates recommendations based on this data. VPA Updater: The Updater component checks the recommendations from the Recommender and decides whether any pod’s resource requests need to be updated. If an update is needed, the Updater will evict the pod. VPA Admission Controller: When a pod is (re-)created, the Admission Controller modifies the pod’s resource requests based on the recommendations from the Recommender. This ensures that the pod starts with the optimal amount of resources. Since VPA doesn’t support in-place updates, pods will be evicted. You will want to control voluntary evictions by means of Pod Disruption Budgets (PDBs). Please make yourself familiar with those and use them.\n [!NOTE] PDBs will not always work as expected and can also get in your way, e.g. if the PDB is violated or would be violated, it may possibly block evictions that would actually help your workload, e.g. 
to get a pod out of an OOMKilled CrashLoopBackOff (if the PDB is or would be violated, not even unhealthy pods would be evicted as they could theoretically become healthy again, which VPA doesn’t know). In order to overcome this issue, it is now possible (alpha since Kubernetes v1.26 in combination with the feature gate PDBUnhealthyPodEvictionPolicy on the API server, beta and enabled by default since Kubernetes v1.27) to configure the so-called unhealthy pod eviction policy. The default is still IfHealthyBudget as a change in default would have changed the behavior (as described above), but you can now also set AlwaysAllow at the PDB (spec.unhealthyPodEvictionPolicy). For more information, please check out this discussion, the PR and this document and balance the pros and cons for yourself. In short, the new AlwaysAllow option is probably the better choice in most cases while IfHealthyBudget is useful only if you have frequent temporary transitions or for special cases where you have already implemented controllers that depend on the old behavior.\n Defining a VPA Resource To configure VPA, you need to create a VPA resource in your cluster. This resource specifies the target to scale, the metrics to be used for scaling decisions, and the policies for resource updates. Here’s an example of a VPA configuration:\napiVersion: autoscaling.k8s.io/v1 kind: VerticalPodAutoscaler metadata: name: foo-vpa spec: targetRef: apiVersion: \"apps/v1\" kind: Deployment name: foo-deployment updatePolicy: updateMode: \"Auto\" resourcePolicy: containerPolicies: - containerName: foo-container controlledValues: RequestsOnly minAllowed: cpu: 50m memory: 200M maxAllowed: cpu: 4 memory: 16G In this example, VPA is configured to scale foo-deployment requests (RequestsOnly) from 50m cores (minAllowed) up to 4 cores (maxAllowed) and 200M memory (minAllowed) up to 16G memory (maxAllowed) automatically (updateMode). 
VPA doesn’t support in-place updates, so in updateMode Auto it will evict pods under certain conditions and then mutate the requests (and possibly limits if you omit controlledValues or set it to RequestsAndLimits, which is the default) of upcoming new pods.\nMultiple update modes exist. They influence eviction and mutation. The most important ones are:\n Off: In this mode, recommendations are computed, but never applied. This mode is useful, if you want to learn more about your workload or if you have a custom controller that depends on VPA’s recommendations but shall act instead of VPA. Initial: In this mode, recommendations are computed and applied, but pods are never proactively evicted to enforce new recommendations over time. This mode is useful, if you want to control pod evictions yourself (similar to the StatefulSet updateStrategy OnDelete) or your workload is sensitive to evictions, e.g. some brownfield singleton application or a daemon set pod that is critical for the node. Auto (default): In this mode, recommendations are computed, applied, and pods are even proactively evicted to enforce new recommendations over time. This applies recommendations continuously without you having to worry too much. As mentioned, controlledValues influences whether only requests or requests and limits are scaled:\n RequestsOnly: Updates only requests and doesn’t change limits. Useful if you have defined absolute limits (unrelated to the requests). RequestsAndLimits (default): Updates requests and proportionally scales limits along with the requests. Useful if you have defined relative limits (related to the requests). In this case, the gap between requests and limits should be either zero for QoS Guaranteed or small for QoS Burstable to avoid useless (way beyond the threshold of unhealthy behavior) or absurd (larger than node capacity) values. 
VPA doesn’t offer many more settings that can be tuned per VPA resource than you see above (unlike HPA’s behavior section). However, there is one more that isn’t shown above, which allows you to scale only up or only down (evictionRequirements[].changeRequirement), in case you need that, e.g. to provide resources when needed, but avoid disruptions otherwise.\nVPA Options VPA is an independent community project that consists of a recommender (computing target recommendations and bounds), an updater (evicting pods that are out of recommendation bounds), and an admission controller (mutating webhook applying the target recommendation to newly created pods). As such, they have independent options.\nVPA Recommender Options You can read up on the full VPA recommender options online and set some of them conveniently in your Gardener shoot cluster spec:\n recommendationMarginFraction (default 15%): Safety margin that will be added to the recommended requests. targetCPUPercentile (default 90%): CPU usage percentile that will be targeted with the CPU recommendation (i.e. recommendation will “fit” e.g. 90% of the observed CPU usages). This setting is relevant for balancing your requests reservations vs. your costs. If you want to reduce costs, you can reduce this value (higher risk because of potential under-reservation, but lower costs), because CPU is compressible, but then VPA may lack the necessary signals for scale-up as throttling on an otherwise fully utilized node will go unnoticed by VPA. If you want to err on the safe side, you can increase this value, but you will then increasingly target a worst-case scenario, quickly (maybe even exponentially) increasing the costs. targetMemoryPercentile (default 90%): Memory usage percentile that will be targeted with the memory recommendation (i.e. recommendation will “fit” e.g. 90% of the observed memory usages). This setting is relevant for balancing your requests reservations vs. your costs. 
If you want to reduce costs, you can reduce this value (higher risk because of potential under-reservation, but lower costs), because OOMs will trigger bump-ups, but those will disrupt the workload. If you want to err on the safe side, you can increase this value, but you will then increasingly target a worst-case scenario, quickly (maybe even exponentially) increasing the costs. There are a few more configurable options of lesser interest:\n recommenderInterval (default 1m): How often VPA retrieves the pods and metrics and how often it recomputes the recommendations and bounds. There are many more options that you can only configure if you deploy your own VPA and which we will not discuss here, but you can check them out here.\n [!NOTE] Due to an implementation detail (smallest bucket size), VPA cannot create recommendations below 10m cores and 10M memory even if minAllowed is lower.\n VPA Updater Options You can read up on the full VPA updater options online and set some of them conveniently in your Gardener shoot cluster spec:\n evictAfterOOMThreshold (default 10m): Pods where at least one container OOMs within this time period since its start will be actively evicted, which will implicitly apply the new target recommendation that will have been bumped up after OOMKill. Please note, the kubelet may evict pods even before an OOM, but only if kube-reserved is underrun, i.e. node-level resources are running low. In these cases, eviction will happen first by pod priority and second by how much the usage overruns the requests. evictionTolerance (default 50%): Defines a threshold below which no further eligible pod will be evicted, i.e. limits how many eligible pods may be in eviction in parallel (but at least 1). The threshold is computed as follows: running - evicted \u003e replicas - tolerance. 
Example: 10 replicas, 9 running, 8 eligible for eviction, 20% tolerance with 10 replicas which amounts to 2 pods, and no pod evicted in this round yet, then 9 - 0 \u003e 10 - 2 is true and a pod would be evicted, but the next one would be in violation as 9 - 1 = 10 - 2 and no further pod would be evicted anymore in this round. evictionRateBurst (default 1): Defines how many eligible pods may be evicted in one go. evictionRateLimit (default disabled): Defines how many eligible pods may be evicted per second (a value of 0 or -1 disables the rate limiting). In general, avoid modifying these eviction settings unless you have good reasons and try to rely on Pod Disruption Budgets (PDBs) instead. However, PDBs are not available for daemon sets.\nThere are a few more configurable options of lesser interest:\n updaterInterval (default 1m): How often VPA evicts the pods. There are many more options that you can only configure if you deploy your own VPA and which we will not discuss here, but you can check them out here.\nConsiderations When Using VPA Initial Resource Estimates: VPA requires historical resource usage data to base its recommendations on. Until they kick in, your initial resource requests apply and should be sensible. Pod Disruption: When VPA adjusts the resources for a pod, it may need to “recreate” the pod, which can cause temporary disruptions. This should be taken into account. Compatibility with HPA: Care must be taken when using VPA in conjunction with HPA, as they can potentially interfere with each other’s scaling decisions. Combining HPA and VPA HPA and VPA serve different purposes and operate on different axes of scaling. HPA increases or decreases the number of pod replicas based on metrics like CPU or memory usage, effectively scaling the application out or in. 
VPA, on the other hand, adjusts the CPU and memory reservations of individual pods, scaling the application up or down.\nWhen used together, these autoscalers can provide both horizontal and vertical scaling. However, they can also conflict with each other if used on the same metrics (e.g. both on CPU or both on memory). In particular, if VPA adjusts the requests, the utilization, i.e. the ratio between usage and requests, will approach 100% (for various reasons not exactly right, but for this consideration, close enough), which may trigger HPA to scale out, if it’s configured to scale on utilization below 100% (often seen in simple examples), which will spread the load across more pods, which may trigger VPA again to adjust the requests to match the new pod usages.\nThis is a feedback loop and it stems from HPA’s method of calculating the desired number of replicas, which is:\ndesiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]\nIf desiredMetricValue is utilization and VPA adjusts the requests, which changes the utilization, this may inadvertently trigger HPA and create said feedback loop. On the other hand, if desiredMetricValue is usage and VPA adjusts the requests now, this will have no impact on HPA anymore (HPA will always influence VPA, but we can control whether VPA influences HPA).\nTherefore, to safely combine HPA and VPA, consider the following strategies:\n Configure HPA and VPA on different metrics: One way to avoid conflicts is to use HPA and VPA based on different metrics. For instance, you could configure HPA to scale based on requests per seconds (or another representative custom/external metric) and VPA to adjust CPU and memory requests. This way, each autoscaler operates independently based on its specific metric(s). 
Configure HPA to scale on usage, not utilization, when used with VPA: Another way to avoid conflicts is to use HPA not on average utilization (averageUtilization), but instead on average usage (averageValue) as replicas driver, which is an absolute metric (requests don’t affect usage). This way, you can combine both autoscalers even on the same metrics. Pod Autoscaling and Cluster Autoscaler Autoscaling within Kubernetes can be implemented at different levels: pod autoscaling (HPA and VPA) and cluster autoscaling (CA). While pod autoscaling adjusts the number of pod replicas or their resource reservations, cluster autoscaling focuses on the number of nodes in the cluster, so that your pods can be hosted. If your workload isn’t static and especially if you make use of pod autoscaling, it only works if you have sufficient node capacity available. The most effective way to do that, without running a worst-case number of nodes, is to configure burstable worker pools in your shoot spec, i.e. define a true minimum node count and a worst-case maximum node count and leave the node autoscaling to Gardener that internally uses the Cluster Autoscaler to provision and deprovision nodes as needed.\nCluster Autoscaler automatically adjusts the number of nodes by adding or removing nodes based on the demands of the workloads and the available resources. It interacts with the cloud provider’s APIs to provision or deprovision nodes as needed. Cluster Autoscaler monitors the utilization of nodes and the scheduling of pods. If it detects that pods cannot be scheduled due to a lack of resources, it will trigger the addition of new nodes to the cluster. Conversely, if nodes are underutilized for some time and their pods can be placed on other nodes, it will remove those nodes to reduce costs and improve resource efficiency.\nBest Practices:\n Resource Buffering: Maintain a buffer of resources to accommodate temporary spikes in demand without waiting for node provisioning. 
This can be done by deploying pods with low priority that can be preempted when real workloads require resources. This helps in faster pod scheduling and avoids delays in scaling out or up. Pod Disruption Budgets (PDBs): Use PDBs to ensure that during scale-down events, the availability of applications is maintained as the Cluster Autoscaler will not voluntarily evict a pod if a PDB would be violated. Interesting CA Options CA can be configured in your Gardener shoot cluster spec globally and also in parts per worker pool:\n Can only be configured globally: expander (default least-waste): Defines the “expander” algorithm to use during scale-up, see FAQ. scaleDownDelayAfterAdd (default 1h): Defines how long after scaling up a node, a node may be scaled down. scaleDownDelayAfterFailure (default 3m): Defines how long after scaling down a node failed, scaling down will be resumed. scaleDownDelayAfterDelete (default 0s): Defines how long after scaling down a node, another node may be scaled down. Can be configured globally and also overwritten individually per worker pool: scaleDownUtilizationThreshold (default 50%): Defines the threshold below which a node becomes eligible for scaling down. scaleDownUnneededTime (default 30m): Defines the trailing time window the node must be consistently below a certain utilization threshold before it can finally be scaled down. There are many more options that you can only configure if you deploy your own CA and which we will not discuss here, but you can check them out here.\nImportance of Monitoring Monitoring is a critical component of autoscaling for several reasons:\n Performance Insights: It provides insights into how well your autoscaling strategy is meeting the performance requirements of your applications. Resource Utilization: It helps you understand resource utilization patterns, enabling you to optimize resource allocation and reduce waste. 
Cost Management: It allows you to track the cost implications of scaling actions, helping you to maintain control over your cloud spending. Troubleshooting: It enables you to quickly identify and address issues with autoscaling, such as unexpected scaling behavior or resource bottlenecks. To effectively monitor autoscaling, you should leverage the following tools and metrics:\n Kubernetes Metrics Server: Collects resource metrics from kubelets and provides them to HPA and VPA for autoscaling decisions (automatically provided by Gardener). Prometheus: An open-source monitoring system that can collect and store custom metrics, providing a rich dataset for autoscaling decisions. Grafana/Plutono: A visualization tool that integrates with Prometheus to create dashboards for monitoring autoscaling metrics and events. Cloud Provider Tools: Most cloud providers offer native monitoring solutions that can be used to track the performance and costs associated with autoscaling. Key metrics to monitor include:\n CPU and Memory Utilization: Track the resource utilization of your pods and nodes to understand how they correlate with scaling events. Pod Count: Monitor the number of pod replicas over time to see how HPA is responding to changes in load. Scaling Events: Keep an eye on scaling events triggered by HPA and VPA to ensure they align with expected behavior. Application Performance Metrics: Track application-specific metrics such as response times, error rates, and throughput. Based on the insights gained from monitoring, you may need to adjust your autoscaling configurations:\n Refine Thresholds: If you notice frequent scaling actions or periods of underutilization or overutilization, adjust the thresholds used by HPA and VPA to better match the workload patterns. Update Policies: Modify VPA update policies if you observe that the current settings are causing too much or too little pod disruption. 
Custom Metrics: If using custom metrics, ensure they accurately reflect the load on your application and adjust them if they do not. Scaling Limits: Review and adjust the minimum and maximum scaling limits to prevent over-scaling or under-scaling based on the capacity of your cluster and the criticality of your applications. Quality of Service (QoS) A few words on the quality of service for pods. Basically, there are 3 classes of QoS and they influence the eviction of pods when kube-reserved is underrun, i.e. node-level resources are running low:\n BestEffort, i.e. pods where no container has CPU or memory requests or limits: Avoid them unless you have really good reasons. The kube-scheduler will place them just anywhere according to its policy, e.g. balanced or bin-packing, but whatever resources these pods consume, may bring other pods into trouble or even the kubelet and the container runtime itself, if it happens very suddenly. Burstable, i.e. pods where at least one container has CPU or memory requests and at least one has no limits or limits that don’t match the requests: Prefer them unless you have really good reasons for the other QoS classes. Always specify proper requests or use VPA to recommend those. This helps the kube-scheduler to make the right scheduling decisions. Not having limits will additionally provide upward resource flexibility, if the node is not under pressure. Guaranteed, i.e. pods where all containers have CPU and memory requests and equal limits: Avoid them unless you really know the limits or throttling/killing is intended. While “Guaranteed” sounds like something “positive” in the English language, this class comes with the downside, that pods will be actively CPU-throttled and will actively go OOM, even if the node is not under pressure and has excess capacity left. Worse, if containers in the pod are under VPA, their CPU requests/limits will often not be scaled up as CPU throttling will go unnoticed by VPA. 
Summary As a rule of thumb, always set CPU and memory requests (or let VPA do that) and always avoid CPU and memory limits. CPU limits aren’t helpful on an under-utilized node (=may result in needless outages) and even suppress the signals for VPA to act. On a nearly or fully utilized node, CPU limits are practically irrelevant as only the requests matter, which are translated into CPU shares that provide a fair use of the CPU anyway (see CFS). Therefore, if you do not know the healthy range, do not set CPU limits. If you as author of the source code know its healthy range, set them to the upper threshold of that healthy range (everything above, from your knowledge of that code, is definitely an unbound busy loop or similar, which is the main reason for CPU limits, besides batch jobs where throttling is acceptable or even desired). Memory limits may be more useful, but suffer a similar, though not as negative downside. As with CPU limits, memory limits aren’t helpful on an under-utilized node (=may result in needless outages), but different than CPU limits, they result in an OOM, which triggers VPA to provide more memory suddenly (modifies the currently computed recommendations by a configurable factor, defaulting to +20%, see docs). Therefore, if you do not know the healthy range, do not set memory limits. If you as author of the source code know its healthy range, set them to the upper threshold of that healthy range (everything above, from your knowledge of that code, is definitely an unbound memory leak or similar, which is the main reason for memory limits) Horizontal Pod Autoscaling (HPA): Use for pods that support horizontal scaling. Prefer scaling on usage, not utilization, as this is more predictable (not dependent on a second variable, namely the current requests) and conflict-free with vertical pod autoscaling (VPA). As a rule of thumb, set the initial replicas to the 5th percentile of the actually observed replica count in production. 
Since HPA reacts fast, this is not as critical, but may help reduce initial load on the control plane early after deployment. However, be cautious when you update the higher-level resource not to inadvertently reset the current HPA-controlled replica count (a very easy mistake to make, which can lead to a catastrophic loss of pods). HPA modifies the replica count directly in the spec and you do not want to overwrite that. Even if it reacts fast, it is not instant (not via a mutating webhook as VPA operates) and the damage may already be done. As for minimum and maximum, let your high availability requirements determine the minimum and your theoretical maximum load determine the maximum, flanked with alerts to detect erroneous run-away out-scaling or the actual nearing of your practical maximum load, so that you can intervene. Vertical Pod Autoscaling (VPA): Use for containers that have a significant usage (e.g. any container above 50m CPU or 100M memory) and a significant usage spread over time (by more than 2x), i.e. ignore small (e.g. side-cars) or static (e.g. Java statically allocated heap) containers, but otherwise use it to provide the resources needed on the one hand and keep the costs in check on the other hand. As a rule of thumb, set the initial requests to the 5th percentile of the actually observed CPU or memory usage in production. Since VPA may need some time at first to respond and evict pods, this is especially critical early after deployment. The lower bound, below which pods will be immediately evicted, converges much faster than the upper bound, above which pods will be immediately evicted, but it isn’t instant, e.g. after 5 minutes the lower bound is just at 60% of the computed lower bound; after 12 hours the upper bound is still at 300% of the computed upper bound (see code). Unlike with HPA, you don’t need to be as cautious when updating the higher-level resource in the case of VPA. 
As long as VPA’s mutating webhook (VPA Admission Controller) is operational (which the VPA Updater also checks before evicting pods), it’s generally safe to update the higher-level resource. However, if it’s not up and running, any new pods that are spawned (e.g. as a consequence of a rolling update of the higher-level resource or for any other reason) will not be mutated. Instead, they will receive whatever requests are currently configured at the higher-level resource, which can lead to catastrophic resource under-reservation. Gardener deploys the VPA Admission Controller in HA - if unhealthy, it is reported under the ControlPlaneHealthy shoot status condition. If you have defined absolute limits (unrelated to the requests), configure VPA to only scale the requests or else it will proportionally scale the limits as well, which can easily become useless (way beyond the threshold of unhealthy behavior) or absurd (larger than node capacity): spec: resourcePolicy: containerPolicies: - controlledValues: RequestsOnly ... If you have defined relative limits (related to the requests), the default policy to scale the limits proportionally with the requests is fine, but the gap between requests and limits must be zero for QoS Guaranteed and should ideally be small for QoS Burstable to avoid useless or absurd limits, e.g. prefer limits being 5 to at most 20% larger than requests as opposed to being 100% larger or more. As a rule of thumb, set minAllowed to the highest observed VPA recommendation (usually during the initialization phase or during any periodical activity) for an otherwise practically idle container, so that you avoid needless thrashing (e.g. 
resource usage calms down over time and recommendations drop consecutively until eviction, which will then lead again to initialization or later periodical activity and higher recommendations and new evictions).⚠️ You may want to provide higher minAllowed values, if you observe that up-scaling takes too long for CPU or memory for a too large percentile of your workload. This will get you out of the danger zone of too few resources for too many pods at the expense of providing too many resources for a few pods. Memory may react faster than CPU, because CPU throttling is not visible and memory gets aided by OOM bump-up incidents, but still, if you observe that up-scaling takes too long, you may want to increase minAllowed accordingly. As a rule of thumb, set maxAllowed to your theoretical maximum load, flanked with alerts to detect erroneous run-away usage or the actual nearing of your practical maximum load, so that you can intervene. However, VPA can easily recommend requests larger than what is allocatable on a node, so you must either ensure large enough nodes (Gardener can scale up from zero, in case you like to define a low-priority worker pool with more resources for very large pods) and/or cap VPA’s target recommendations using maxAllowed at the node allocatable remainder (after daemon set pods) of the largest eligible machine type (may result in under-provisioning resources for a pod). Use your monitoring and check maximum pod usage to decide about the maximum machine type. Recommendations in a Box Container When to use Value Requests - Set them (recommended) unless:- Do not set requests for QoS BestEffort; useful only if pod can be evicted as often as needed and pod can pick up where it left off without any penalty Set requests to 95th percentile (w/o VPA) of the actually observed CPU resp. memory usage in production resp. 
5th percentile (w/ VPA) (see below) Limits - Avoid them (recommended) unless:- Set limits for QoS Guaranteed; useful only if pod has strictly static resource requirements- Set CPU limits if you want to throttle CPU usage for containers that can be throttled w/o any other disadvantage than processing time (never do that when time-critical operations like leases are involved)- Set limits if you know the healthy range and want to shield against unbound busy loops, unbound memory leaks, or similar If you really can (otherwise not), set limits to healthy theoretical max load Scaler When to use Initial Minimum Maximum HPA Use for pods that support horizontal scaling Set initial replicas to 5th percentile of the actually observed replica count in production (prefer scaling on usage, not utilization) and make sure to never overwrite it later when controlled by HPA Set minReplicas to 0 (requires feature gate and custom/external metrics), to 1 (regular HPA minimum), or whatever the high availability requirements of the workload demand Set maxReplicas to healthy theoretical max load VPA Use for containers that have a significant usage (\u003e50m/100M) and a significant usage spread over time (\u003e2x) Set initial requests to 5th percentile of the actually observed CPU resp. 
memory usage in production Set minAllowed to highest observed VPA recommendation (includes start-up phase) for an otherwise practically idle container (avoids pod thrashing when pod gets evicted after idling) Set maxAllowed to fresh node allocatable remainder after daemonset pods (avoids pending pods when requests exceed fresh node allocatable remainder) or, if you really can (otherwise not), to healthy theoretical max load (less disruptive than limits as no throttling or OOM happens on under-utilized nodes) CA Use for dynamic workloads, definitely if you use HPA and/or VPA N/A Set minimum to 0 or number of nodes required right after cluster creation or wake-up Set maximum to healthy theoretical max load [!NOTE] Theoretical max load may be very difficult to ascertain, especially with modern software that consists of building blocks you do not own or know in detail. If you have comprehensive monitoring in place, you may be tempted to pick the observed maximum and add a safety margin or even factor on top (2x, 4x, or any other number), but this is not to be confused with “theoretical max load” (solely depending on the code, not observations from the outside). At any point in time, your numbers may change, e.g. because you updated a software component or your usage increased. If you decide to use numbers that are set based only on observations, make sure to flank those numbers with monitoring alerts, so that you have sufficient time to investigate, revise, and readjust if necessary.\n Conclusion Pod autoscaling is a dynamic and complex aspect of Kubernetes, but it is also one of the most powerful tools at your disposal for maintaining efficient, reliable, and cost-effective applications. 
By carefully selecting the appropriate autoscaler, setting well-considered thresholds, and continuously monitoring and adjusting your strategies, you can ensure that your Kubernetes deployments are well-equipped to handle your resource demands while not over-paying for the provided resources at the same time.\nAs Kubernetes continues to evolve (e.g. in-place updates) and as new patterns and practices emerge, the approaches to autoscaling may also change. However, the principles discussed above will remain foundational to creating scalable and resilient Kubernetes workloads. Whether you’re a developer or operations engineer, a solid understanding of pod autoscaling will be instrumental in the successful deployment and management of containerized applications.\n","categories":"","description":"","excerpt":"Introduction There are two types of pod autoscaling in Kubernetes: …","ref":"/docs/guides/applications/shoot-pod-autoscaling-best-practices/","tags":"","title":"Shoot Pod Autoscaling Best Practices"},{"body":"Developer Docs for Gardener Shoot Rsyslog Relp Extension This document outlines how Shoot reconciliation and deletion works for a Shoot with the shoot-rsyslog-relp extension enabled.\nShoot Reconciliation This section outlines how the reconciliation works for a Shoot with the shoot-rsyslog-relp extension enabled.\nExtension Enablement / Reconciliation This section outlines how the extension enablement/reconciliation works, e.g., the extension has been added to the Shoot spec.\n As part of the Shoot reconciliation flow, the gardenlet deploys the Extension resource. The shoot-rsyslog-relp extension reconciles the Extension resource. pkg/controller/lifecycle/actuator.go contains the implementation of the extension.Actuator interface. 
The reconciliation of an Extension of type shoot-rsyslog-relp only deploys the necessary monitoring configuration - the shoot-rsyslog-relp-dashboards ConfigMap which contains the definitions for: Plutono dashboard for the Rsyslog component, and the shoot-shoot-rsyslog-relp ServiceMonitor and PrometheusRule resources which contain the definitions for: scraping metrics by prometheus, alerting rules. As part of the Shoot reconciliation flow, the gardenlet deploys the OperatingSystemConfig resource. The shoot-rsyslog-relp extension serves a webhook that mutates the OperatingSystemConfig resource for Shoots having the shoot-rsyslog-relp extension enabled (the corresponding namespace gets labeled by the gardenlet with extensions.gardener.cloud/shoot-rsyslog-relp=true). pkg/webhook/operatingsystemconfig/ensurer.go contains the implementation of the genericmutator.Ensurer interface. The webhook renders the 60-audit.conf.tpl template script and appends it to the OperatingSystemConfig files. When rendering the template, the configuration of the shoot-rsyslog-relp extension is used to fill in the required template values. The file is installed as /var/lib/rsyslog-relp-configurator/rsyslog.d/60-audit.conf on the host OS. The webhook appends the audit rules to the OperatingSystemConfig. The files are installed under /var/lib/rsyslog-relp-configurator/rules.d on the host OS. If the user has specified alternative audit rules in a config map reference, the webhook fetches the referenced ConfigMap from the Shoot’s control plane namespace and decodes the value of its auditd data key into an object of type Auditd. It then takes the auditRules defined in the object and places those under the /var/lib/rsyslog-relp-configurator/rules.d directory in a single file. The webhook renders the configure-rsyslog.tpl.sh script and appends it to the OperatingSystemConfig files. This script is installed as /var/lib/rsyslog-relp-configurator/configure-rsyslog.sh on the host OS. 
It keeps the configuration of the rsyslog systemd service up-to-date by copying /var/lib/rsyslog-relp-configurator/rsyslog.d/60-audit.conf to /etc/rsyslog.d/60-audit.conf, if /etc/rsyslog.d/60-audit.conf does not exist or the files differ. The script also takes care of syncing the audit rules in /etc/audit/rules.d with the ones installed in /var/lib/rsyslog-relp-configurator/rules.d and restarts the auditd systemd service if necessary. The webhook renders the process-rsyslog-pstats.tpl.sh and appends it to the OperatingSystemConfig files. This script receives metrics from the rsyslog process, transforms them, and writes them to /var/lib/node-exporter/textfile-collector/rsyslog_pstats.prom so that they can be collected by the node-exporter. As part of the Shoot reconciliation, before the shoot-rsyslog-relp extension is deployed, the gardenlet copies all Secret and ConfigMap resources referenced in .spec.resources[] to the Shoot’s control plane namespace on the Seed. When the .tls.enabled field is true in the shoot-rsyslog-relp extension configuration, a value for .tls.secretReferenceName must also be specified so that it references a named resource reference in the Shoot’s .spec.resources[] array. The webhook appends the data of the referenced Secret in the Shoot’s control plane namespace to the OperatingSystemConfig files. The webhook appends the rsyslog-configurator.service unit to the OperatingSystemConfig units. The unit invokes the configure-rsyslog.sh script every 15 seconds. Extension Disablement This section outlines how the extension disablement works, i.e., the extension has to be removed from the Shoot spec.\n As part of the Shoot reconciliation flow, the gardenlet destroys the Extension resource because it is no longer needed. As part of the deletion flow, the shoot-rsyslog-relp extension deploys the rsyslog-relp-configuration-cleaner DaemonSet to the Shoot cluster to clean up the existing rsyslog configuration and revert the audit rules. 
Shoot Deletion This section outlines how the deletion works for a Shoot with the shoot-rsyslog-relp extension enabled.\n As part of the Shoot deletion flow, the gardenlet destroys the Extension resource. In the Shoot deletion flow, the Extension resource is deleted after the Worker resource. Hence, there is no need to deploy the rsyslog-relp-configuration-cleaner DaemonSet to the Shoot cluster to clean up the existing rsyslog configuration and revert the audit rules. ","categories":"","description":"","excerpt":"Developer Docs for Gardener Shoot Rsyslog Relp Extension This document …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/shoot-rsyslog-relp/","tags":"","title":"Shoot Rsyslog Relp"},{"body":"Shoot Scheduling Profiles This guide describes the available scheduling profiles and how they can be configured in the Shoot cluster. It also clarifies how a custom scheduling profile can be configured.\nScheduling Profiles The scheduling process in the kube-scheduler happens in a series of stages. A scheduling profile allows configuring the different stages of the scheduling.\nAs of today, Gardener supports two predefined scheduling profiles:\n balanced (default)\nOverview\nThe balanced profile attempts to spread Pods evenly across Nodes to obtain a more balanced resource usage. This profile provides the default kube-scheduler behavior.\nHow it works?\nThe kube-scheduler is started without any profiles. In such case, by default, one profile with the scheduler name default-scheduler is created. This profile includes the default plugins. If a Pod doesn’t specify the .spec.schedulerName field, kube-apiserver sets it to default-scheduler. Then, the Pod gets scheduled by the default-scheduler accordingly.\n bin-packing\nOverview\nThe bin-packing profile scores Nodes based on the allocation of resources. It prioritizes Nodes with the most allocated resources. 
By favoring the Nodes with the most allocation, some of the other Nodes become under-utilized over time (because new Pods keep being scheduled to the most allocated Nodes). Then, the cluster-autoscaler identifies such under-utilized Nodes and removes them from the cluster. In this way, this profile provides a greater overall resource utilization (compared to the balanced profile).\n Note: The decision of when to remove a Node is a trade-off between optimizing for utilization or the availability of resources. Removing under-utilized Nodes improves cluster utilization, but new workloads might have to wait for resources to be provisioned again before they can run.\n How it works?\nThe kube-scheduler is configured with the following bin packing profile:\napiVersion: kubescheduler.config.k8s.io/v1beta3 kind: KubeSchedulerConfiguration profiles: - schedulerName: bin-packing-scheduler pluginConfig: - name: NodeResourcesFit args: scoringStrategy: type: MostAllocated plugins: score: disabled: - name: NodeResourcesBalancedAllocation To impose the new profile, a MutatingWebhookConfiguration is deployed in the Shoot cluster. The MutatingWebhookConfiguration intercepts CREATE operations for Pods and sets the .spec.schedulerName field to bin-packing-scheduler. Then, the Pod gets scheduled by the bin-packing-scheduler accordingly. Pods that specify a custom scheduler (i.e., having .spec.schedulerName different from default-scheduler and bin-packing-scheduler) are not affected.\n Configuring the Scheduling Profile The scheduling profile can be configured via the .spec.kubernetes.kubeScheduler.profile field in the Shoot:\nspec: # ... kubernetes: kubeScheduler: profile: \"balanced\" # or \"bin-packing\" Custom Scheduling Profiles The kube-scheduler’s component config allows configuring custom scheduling profiles to match the cluster needs. As of today, Gardener supports only two predefined scheduling profiles. 
The profile configuration in the component config is quite expressive and it is not possible to easily define profiles that would match the needs of every cluster. For these reasons, there are no plans to add support for new predefined scheduling profiles. If a cluster owner wants to use a custom scheduling profile, then they have to deploy (and maintain) a dedicated kube-scheduler deployment in the cluster itself.\n","categories":"","description":"Introducing `balanced` and `bin-packing` scheduling profiles","excerpt":"Introducing `balanced` and `bin-packing` scheduling profiles","ref":"/docs/gardener/shoot_scheduling_profiles/","tags":"","title":"Shoot Scheduling Profiles"},{"body":"ServiceAccount Configurations for Shoot Clusters The Shoot specification allows configuring some of the settings for the handling of ServiceAccounts:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: kubeAPIServer: serviceAccountConfig: issuer: foo acceptedIssuers: - foo1 - foo2 extendTokenExpiration: true maxTokenExpiration: 45d ... Issuer and Accepted Issuers The .spec.kubernetes.kubeAPIServer.serviceAccountConfig.{issuer,acceptedIssuers} fields are translated to the --service-account-issuer flag for the kube-apiserver. The issuer will assert its identifier in the iss claim of the issued tokens. According to the upstream specification, values need to meet the following requirements:\n This value is a string or URI. If this option is not a valid URI per the OpenID Discovery 1.0 spec, the ServiceAccountIssuerDiscovery feature will remain disabled, even if the feature gate is set to true. It is highly recommended that this value comply with the OpenID spec: https://openid.net/specs/openid-connect-discovery-1_0.html. In practice, this means that service-account-issuer must be an https URL. 
It is also highly recommended that this URL be capable of serving OpenID discovery documents at {service-account-issuer}/.well-known/openid-configuration.\n By default, Gardener uses the internal cluster domain as issuer (e.g., https://api.foo.bar.example.com). If you specify the issuer, then this default issuer will always be part of the list of accepted issuers (you don’t need to specify it yourself).\n [!CAUTION] If you change from the default issuer to a custom issuer, all previously issued tokens will still be valid/accepted. However, if you change from a custom issuer A to another issuer B (custom or default), then you have to add A to the acceptedIssuers so that previously issued tokens are not invalidated. Otherwise, the control plane components as well as system components and your workload pods might fail. You can remove A from the acceptedIssuers when all currently active tokens have been issued solely by B. This can be ensured by using projected token volumes with a short validity, or by rolling out all pods. Additionally, all ServiceAccount token secrets should be recreated. Apart from this, you should wait for at least 12h to make sure the control plane and system components have received a new token from Gardener.\n Token Expirations The .spec.kubernetes.kubeAPIServer.serviceAccountConfig.extendTokenExpiration configures the --service-account-extend-token-expiration flag of the kube-apiserver. It is enabled by default and has the following specification:\n Turns on projected service account expiration extension during token generation, which helps safe transition from legacy token to bound service account token feature. 
If this flag is enabled, admission injected tokens would be extended up to 1 year to prevent unexpected failure during transition, ignoring value of service-account-max-token-expiration.\n The .spec.kubernetes.kubeAPIServer.serviceAccountConfig.maxTokenExpiration configures the --service-account-max-token-expiration flag of the kube-apiserver. It has the following specification:\n The maximum validity duration of a token created by the service account token issuer. If an otherwise valid TokenRequest with a validity duration larger than this value is requested, a token will be issued with a validity duration of this value.\n [!NOTE] The value for this field must be in the [30d,90d] range. The background for this limitation is that all Gardener components rely on the TokenRequest API and the Kubernetes service account token projection feature with short-lived, auto-rotating tokens. Any values lower than 30d risk impacting the SLO for shoot clusters, and any values above 90d violate security best practices with respect to maximum validity of credentials before they must be rotated. Given that the field just specifies the upper bound, end-users can still use lower values for their individual workload by specifying the .spec.volumes[].projected.sources[].serviceAccountToken.expirationSeconds in the PodSpecs.\n Managed Service Account Issuer Gardener also provides a way to manage the service account issuer of a shoot cluster as well as serving its OIDC discovery documents from a centrally managed server called Gardener Discovery Server. This ability removes the need for changing the .spec.kubernetes.kubeAPIServer.serviceAccountConfig.issuer and exposing it separately.\nPrerequisites [!NOTE] The following prerequisites are the responsibility of the Gardener Administrators and are not something that end users can configure by themselves. 
If uncertain that these requirements are met, please contact your Gardener Administrator.\n Prerequisites:\n The Garden Cluster should have the Gardener Discovery Server deployed and configured. The easiest way to handle this is by using the gardener-operator. The ShootManagedIssuer feature gate should be enabled. Enablement If the prerequisites are met then the feature can be enabled for a shoot cluster by annotating it with authentication.gardener.cloud/issuer=managed. Mind that once enabled, this feature cannot be disabled. After the shoot is reconciled, you can retrieve the new shoot service account issuer value from the shoot’s status. A sample query that will retrieve the managed issuer looks like this:\nkubectl -n my-project get shoot my-shoot -o jsonpath='{.status.advertisedAddresses[?(@.name==\"service-account-issuer\")].url}' Once retrieved, the shoot’s OIDC discovery documents can be explored by querying the /.well-known/openid-configuration endpoint of the issuer.\nMind that this annotation is incompatible with the .spec.kubernetes.kubeAPIServer.serviceAccountConfig.issuer field, so if you want to enable it then the issuer field should not be set in the shoot specification.\n [!CAUTION] If you change from the default issuer to a managed issuer, all previously issued tokens will still be valid/accepted. However, if you change from a custom issuer A to a managed issuer, then you have to add A to the .spec.kubernetes.kubeAPIServer.serviceAccountConfig.acceptedIssuers so that previously issued tokens are not invalidated. Otherwise, the control plane components as well as system components and your workload pods might fail. You can remove A from the acceptedIssuers when all currently active tokens have been issued solely by the managed issuer. This can be ensured by using projected token volumes with a short validity, or by rolling out all pods. Additionally, all ServiceAccount token secrets should be recreated. 
Apart from this, you should wait for at least 12h to make sure the control plane and system components have received a new token from Gardener.\n ","categories":"","description":"","excerpt":"ServiceAccount Configurations for Shoot Clusters The Shoot …","ref":"/docs/gardener/shoot_serviceaccounts/","tags":"","title":"Shoot Serviceaccounts"},{"body":"Shoot Status This document provides an overview of the ShootStatus.\nConditions The Shoot status consists of a set of conditions. A Condition has the following fields:\n Field name Description type Name of the condition. status Indicates whether the condition is applicable, with possible values True, False, Unknown or Progressing. lastTransitionTime Timestamp for when the condition last transitioned from one status to another. lastUpdateTime Timestamp for when the condition was updated. Usually changes when reason or message in condition is updated. reason Machine-readable, UpperCamelCase text indicating the reason for the condition’s last transition. message Human-readable message indicating details about the last status transition. codes Well-defined error codes in case the condition reports a problem. Currently, the available Shoot condition types are:\n APIServerAvailable ControlPlaneHealthy EveryNodeReady ObservabilityComponentsHealthy SystemComponentsHealthy The Shoot conditions are maintained by the shoot care reconciler of the gardenlet. Find more information in the gardenlet documentation.\nSync Period The condition checks are executed periodically at an interval which is configurable in the GardenletConfiguration (.controllers.shootCare.syncPeriod, defaults to 1m).\nCondition Thresholds The GardenletConfiguration also allows configuring condition thresholds (controllers.shootCare.conditionThresholds). A condition threshold is the amount of time to consider a condition as Progressing on condition status changes.\nLet’s check the following example to get a better understanding. 
Let’s say that the APIServerAvailable condition of our Shoot has status True. If the next condition check fails (for example kube-apiserver becomes unreachable), then the condition first goes to Progressing state. Only if this state persists for the condition threshold amount of time is the condition finally updated to False.\nConstraints Constraints represent conditions of a Shoot’s current state that constrain some operations on it. The current constraints are:\nHibernationPossible:\nThis constraint indicates whether a Shoot is allowed to be hibernated. The rationale behind this constraint is that a Shoot can have ValidatingWebhookConfigurations or MutatingWebhookConfigurations acting on resources that are critical for waking up a cluster. For example, if a webhook has rules for CREATE/UPDATE Pods or Nodes and failurePolicy=Fail, the webhook will block joining Nodes and creating critical system component Pods and thus block the entire wakeup operation, because the server backing the webhook is not running.\nEven if the failurePolicy is set to Ignore, high timeouts (\u003e15s) can lead to blocking requests of control plane components. That’s because most control-plane API calls are made with a client-side timeout of 30s, so if a webhook has timeoutSeconds=30 the overall request might still fail as there is overhead in communication with the API server and potential other webhooks.\nGenerally, it’s best practice to specify low timeouts in WebhookConfigs.\nAs an effort to correct this common problem, the webhook remediator has been created. This is enabled by setting .controllers.shootCare.webhookRemediatorEnabled=true in the gardenlet’s configuration. This feature simply checks whether webhook configurations in shoot clusters match a set of rules described here. If at least one of the rules matches, it will set status=False for the .status.constraints of type HibernationPossible and MaintenancePreconditionsSatisfied in the Shoot resource. 
In addition, the failurePolicy in the affected webhook configurations will be set from Fail to Ignore. Gardenlet will also add an annotation to make it visible to end-users that their webhook configurations were mutated and should be fixed/adapted according to the rules and best practices.\nIn most cases, you can avoid this by simply excluding the kube-system namespace from your webhook via the namespaceSelector:\napiVersion: admissionregistration.k8s.io/v1 kind: MutatingWebhookConfiguration webhooks: - name: my-webhook.example.com namespaceSelector: matchExpressions: - key: gardener.cloud/purpose operator: NotIn values: - kube-system rules: - operations: [\"*\"] apiGroups: [\"\"] apiVersions: [\"v1\"] resources: [\"pods\"] scope: \"Namespaced\" However, some other resources (some of them cluster-scoped) might still trigger the remediator, namely:\n endpoints nodes clusterroles clusterrolebindings customresourcedefinitions apiservices certificatesigningrequests priorityclasses If one of the above resources triggers the remediator, the preferred solution is to remove that particular resource from your webhook’s rules. You can also use the objectSelector to reduce the scope of webhook’s rules. However, in special cases where a webhook is absolutely needed for the workload, it is possible to add the remediation.webhook.shoot.gardener.cloud/exclude=true label to your webhook so that the remediator ignores it. This label should not be used to silence an alert, but rather to confirm that a webhook won’t cause problems. Note that none of this is a perfect solution; it is done on a best-effort basis, and only the owner of the webhook can know whether it indeed is problematic and configured correctly.\nIn a special case, if a webhook has a rule for CREATE/UPDATE lease resources in kube-system namespace, its timeoutSeconds is updated to 3 seconds. 
This is required to ensure the proper functioning of the leader election of essential control plane controllers.\nYou can also find more help in the Kubernetes documentation\nMaintenancePreconditionsSatisfied:\nThis constraint indicates whether all preconditions for a safe maintenance operation are satisfied (see Shoot Maintenance for more information about what happens during a shoot maintenance). As of today, the same checks as in the HibernationPossible constraint are being performed (user-deployed webhooks that might interfere with potential rolling updates of shoot worker nodes). There is no further action being performed on this constraint’s status (maintenance is still being performed). It is meant to make the user aware of potential problems that might occur due to their configuration.\nCACertificateValiditiesAcceptable:\nThis constraint indicates that there is at least one CA certificate which expires in less than 1y. It will not be added to the .status.constraints if there is no such CA certificate. However, if it’s visible, then a credentials rotation operation should be considered.\nCRDsWithProblematicConversionWebhooks:\nThis constraint indicates that there is at least one CustomResourceDefinition in the cluster which has multiple stored versions and a conversion webhook configured. This could break the reconciliation flow of a Shoot cluster in some cases. See https://github.com/gardener/gardener/issues/7471 for more details. It will not be added to the .status.constraints if there is no such CRD. However, if it’s visible, then you should consider upgrading the existing objects to the current stored version. See Upgrade existing objects to a new stored version for detailed steps.\nLast Operation The Shoot status holds information about the last operation that is performed on the Shoot. The last operation field reflects overall progress and the tasks that are currently being executed. 
Allowed operation types are Create, Reconcile, Delete, Migrate, and Restore. Allowed operation states are Processing, Succeeded, Error, Failed, Pending, and Aborted. An operation in Error state is an operation that will be retried for a configurable amount of time (controllers.shoot.retryDuration field in GardenletConfiguration, defaults to 12h). If the operation cannot complete successfully for the configured retry duration, it will be marked as Failed. An operation in Failed state is an operation that won’t be retried automatically (to retry such an operation, see Retry failed operation).\nLast Errors The Shoot status also contains information about the last occurred error(s) (if any) during an operation. A LastError consists of the identifier of the task that returned the error, a human-readable message of the error, and error codes (if any) associated with the error.\nError Codes Known error codes and their classification are:\n Error code User error Description ERR_INFRA_UNAUTHENTICATED true Indicates that the last error occurred due to the client request not being completed because it lacks valid authentication credentials for the requested resource. It is classified as a non-retryable error code. ERR_INFRA_UNAUTHORIZED true Indicates that the last error occurred due to the server understanding the request but refusing to authorize it. It is classified as a non-retryable error code. ERR_INFRA_QUOTA_EXCEEDED true Indicates that the last error occurred due to infrastructure quota limits. It is classified as a non-retryable error code. ERR_INFRA_RATE_LIMITS_EXCEEDED false Indicates that the last error occurred due to exceeded infrastructure request rate limits. ERR_INFRA_DEPENDENCIES true Indicates that the last error occurred due to dependent objects on the infrastructure level. It is classified as a non-retryable error code. 
ERR_RETRYABLE_INFRA_DEPENDENCIES false Indicates that the last error occurred due to dependent objects on the infrastructure level, but the operation should be retried. ERR_INFRA_RESOURCES_DEPLETED true Indicates that the last error occurred due to depleted resource in the infrastructure. ERR_CLEANUP_CLUSTER_RESOURCES true Indicates that the last error occurred due to resources in the cluster that are stuck in deletion. ERR_CONFIGURATION_PROBLEM true Indicates that the last error occurred due to a configuration problem. It is classified as a non-retryable error code. ERR_RETRYABLE_CONFIGURATION_PROBLEM true Indicates that the last error occurred due to a retryable configuration problem. “Retryable” means that the occurred error is likely to be resolved in an ungraceful manner after given period of time. ERR_PROBLEMATIC_WEBHOOK true Indicates that the last error occurred due to a webhook not following the Kubernetes best practices. Please note: Errors classified as User error: true do not require a Gardener operator to resolve but can be remediated by the user (e.g. by refreshing expired infrastructure credentials). Even though ERR_INFRA_RATE_LIMITS_EXCEEDED and ERR_RETRYABLE_INFRA_DEPENDENCIES are classified as User error: false, an operator can’t provide any resolution because they are related to cloud provider issues.\nStatus Label Shoots will be automatically labeled with the shoot.gardener.cloud/status label. Its value might either be healthy, progressing, unhealthy or unknown depending on the .status.conditions, .status.lastOperation, and status.lastErrors of the Shoot. 
This can be used as an easy filter method to find shoots based on their “health” status.\n","categories":"","description":"Shoot conditions, constraints, and error codes","excerpt":"Shoot conditions, constraints, and error codes","ref":"/docs/gardener/shoot_status/","tags":"","title":"Shoot Status"},{"body":"Supported CPU Architectures for Shoot Worker Nodes Users can create shoot clusters with worker groups having virtual machines of different architectures. CPU architecture of each worker pool can be specified in the Shoot specification as follows:\nExample Usage in a Shoot spec: provider: workers: - name: cpu-worker machine: architecture: \u003csome-cpu-architecture\u003e # optional If no value is specified for the architecture field, it defaults to amd64. For a valid shoot object, a machine type should be present in the respective CloudProfile with the same CPU architecture as specified in the Shoot yaml. Also, a valid machine image should be present in the CloudProfile that supports the required architecture specified in the Shoot worker pool.\nExample Usage in a CloudProfile spec: machineImages: - name: test-image versions: - architectures: # optional - \u003carchitecture-1\u003e - \u003carchitecture-2\u003e version: 1.2.3 machineTypes: - architecture: \u003csome-cpu-architecture\u003e cpu: \"2\" gpu: \"0\" memory: 8Gi name: test-machine Currently, Gardener supports two of the most widely used CPU architectures:\n amd64 arm64 ","categories":"","description":"","excerpt":"Supported CPU Architectures for Shoot Worker Nodes Users can create …","ref":"/docs/gardener/shoot_supported_architectures/","tags":"","title":"Shoot Supported Architectures"},{"body":"Shoot Updates and Upgrades This document describes what happens during shoot updates (changes incorporated in a newly deployed Gardener version) and during shoot upgrades (changes for version controllable by end-users).\nUpdates Updates to all aspects of the shoot cluster happen when the gardenlet reconciles 
the Shoot resource.\nWhen are Reconciliations Triggered Generally, when you change the specification of your Shoot the reconciliation will start immediately, potentially updating your cluster. Please note that you can also confine the reconciliation triggered due to your specification updates to the cluster’s maintenance time window. Please find more information in Confine Specification Changes/Updates Roll Out.\nYou can also annotate your shoot with special operation annotations (for more information, see Trigger Shoot Operations), which will cause the reconciliation to start due to your actions.\nThere is also an automatic reconciliation by Gardener. The period, i.e., how often it is performed, depends on the configuration of the Gardener administrators/operators. In some Gardener installations the operators might enable “reconciliation in maintenance time window only” (for more information, see Cluster Reconciliation), which will result in at least one reconciliation during the time configured in the Shoot’s .spec.maintenance.timeWindow field.\nWhich Updates are Applied As end-users can only control the Shoot resource’s specification but not the used Gardener version, they don’t have any influence on which of the updates are rolled out (other than those settings configurable in the Shoot). A Gardener operator can deploy a new Gardener version at any point in time. Any subsequent reconciliation of Shoots will update them by rolling out the changes incorporated in this new Gardener version.\nSome examples for such shoot updates are:\n Add a new/remove an old component to/from the shoot’s control plane running in the seed, or to/from the shoot’s system components running on the worker nodes. Change the configuration of an existing control plane/system component. 
Restart of existing control plane/system components (this might result in a short unavailability of the Kubernetes API server, e.g., when etcd or a kube-apiserver itself is being restarted) Behavioural Changes Generally, some of such updates (e.g., configuration changes) could theoretically result in different behaviour of controllers. If such changes would be backwards-incompatible, then we usually follow one of those approaches (depends on the concrete change):\n Only apply the change for new clusters. Expose a new field in the Shoot resource that lets users control this changed behaviour to enable it at a convenient point in time. Put the change behind an alpha feature gate (disabled by default) in the gardenlet (only controllable by Gardener operators), which will be promoted to beta (enabled by default) in subsequent releases (in this case, end-users have no influence on when the behaviour changes - Gardener operators should inform their end-users and provide clear timelines when they will enable the feature gate). Upgrades We consider shoot upgrades to change either the:\n Kubernetes version (.spec.kubernetes.version) Kubernetes version of the worker pool if specified (.spec.provider.workers[].kubernetes.version) Machine image version of at least one worker pool (.spec.provider.workers[].machine.image.version) Generally, an upgrade is also performed through a reconciliation of the Shoot resource, i.e., the same concepts as for shoot updates apply. If an end-user triggers an upgrade (e.g., by changing the Kubernetes version) after a new Gardener version was deployed but before the shoot was reconciled again, then this upgrade might incorporate the changes delivered with this new Gardener version.\nIn-Place vs. Rolling Updates If the Kubernetes patch version is changed, then the upgrade happens in-place. This means that the shoot worker nodes remain untouched and only the kubelet process restarts with the new Kubernetes version binary. 
The same applies to configuration changes of the kubelet.\nIf the Kubernetes minor version is changed, then the upgrade is done in a “rolling update” fashion, similar to how pods in Kubernetes are updated (when backed by a Deployment). The worker nodes will be terminated one after another and replaced by new machines. The existing workload is gracefully drained and evicted from the old worker nodes to new worker nodes, respecting the configured PodDisruptionBudgets (see Specifying a Disruption Budget for your Application).\nCustomize Rolling Update Behaviour of Shoot Worker Nodes The .spec.provider.workers[] list exposes two fields that you might configure based on your workload’s needs: maxSurge and maxUnavailable. The same concepts as in Kubernetes apply. Additionally, you might customize how the machine-controller-manager (abbrev.: MCM; the component instrumenting this rolling update) behaves. You can configure the following fields in .spec.provider.workers[].machineControllerManager:\n machineDrainTimeout: Timeout (in duration) used while draining a machine before deletion, beyond which MCM forcefully deletes the machine (default: 2h). machineHealthTimeout: Timeout (in duration) used while re-joining (in case of temporary health issues) of a machine before it is declared as failed (default: 10m). machineCreationTimeout: Timeout (in duration) used while joining (during creation) of a machine before it is declared as failed (default: 10m). maxEvictRetries: Maximum number of times an eviction would be attempted on a pod before it is forcibly deleted during the draining of a machine (default: 10). nodeConditions: List of case-sensitive node-conditions which will change a machine to a Failed state after the machineHealthTimeout duration. It may further be replaced with a new machine if the machine is backed by a machine-set object (defaults: KernelDeadlock, ReadonlyFilesystem, DiskPressure). 
Rolling Update Triggers Apart from the above-mentioned triggers, a rolling update of the shoot worker nodes is also triggered for some changes to your worker pool specification (.spec.provider.workers[], even if you don’t change the Kubernetes or machine image version). The complete list of fields that trigger a rolling update:\n .spec.kubernetes.version (except for patch version changes) .spec.provider.workers[].machine.image.name .spec.provider.workers[].machine.image.version .spec.provider.workers[].machine.type .spec.provider.workers[].volume.type .spec.provider.workers[].volume.size .spec.provider.workers[].providerConfig (except if feature gate NewWorkerPoolHash is enabled) .spec.provider.workers[].cri.name .spec.provider.workers[].kubernetes.version (except for patch version changes) .spec.systemComponents.nodeLocalDNS.enabled .status.credentials.rotation.certificateAuthorities.lastInitiationTime (changed by Gardener when a shoot CA rotation is initiated) .status.credentials.rotation.serviceAccountKey.lastInitiationTime (changed by Gardener when a shoot service account signing key rotation is initiated) If feature gate NewWorkerPoolHash is enabled:\n .spec.kubernetes.kubelet.kubeReserved (unless a worker pool-specific value is set) .spec.kubernetes.kubelet.systemReserved (unless a worker pool-specific value is set) .spec.kubernetes.kubelet.evictionHard (unless a worker pool-specific value is set) .spec.kubernetes.kubelet.cpuManagerPolicy (unless a worker pool-specific value is set) .spec.provider.workers[].kubernetes.kubelet.kubeReserved .spec.provider.workers[].kubernetes.kubelet.systemReserved .spec.provider.workers[].kubernetes.kubelet.evictionHard .spec.provider.workers[].kubernetes.kubelet.cpuManagerPolicy Changes to kubeReserved or systemReserved do not trigger a node roll if their sum does not change.\nGenerally, the provider extension controllers might have additional constraints for changes leading to rolling updates, so please consult the respective 
documentation as well. In particular, if the feature gate NewWorkerPoolHash is enabled and a worker pool uses the new hash, then the providerConfig as a whole is not included. Instead only fields selected by the provider extension are considered.\nRelated Documentation Shoot Operations Shoot Maintenance Confine Specification Changes/Updates Roll Out To Maintenance Time Window. ","categories":"","description":"","excerpt":"Shoot Updates and Upgrades This document describes what happens during …","ref":"/docs/gardener/shoot_updates/","tags":"","title":"Shoot Updates and Upgrades"},{"body":"Shoot Resource Customization Webhooks Gardener deploys several components/resources into the shoot cluster. Some of these resources are essential (like the kube-proxy), others are optional addons (like the kubernetes-dashboard or the nginx-ingress-controller). In either case, some provider extensions might need to mutate these resources and inject provider-specific bits into it.\nWhat’s the approach to implement such mutations? Similar to how control plane components in the seed are modified, we are using MutatingWebhookConfigurations to achieve the same for resources in the shoot. Both the provider extension and the kube-apiserver of the shoot cluster are running in the same seed. Consequently, the kube-apiserver can talk cluster-internally to the provider extension webhook, which makes such operations even faster.\nHow is the MutatingWebhookConfiguration object created in the shoot? The preferred approach is to use a ManagedResource (see also Deploy Resources to the Shoot Cluster) in the seed cluster. This way the gardener-resource-manager ensures that end-users cannot delete/modify the webhook configuration. The provider extension doesn’t need to care about the same.\nWhat else is needed? The shoot’s kube-apiserver must be allowed to talk to the provider extension. 
To achieve this, you need to make sure that the relevant NetworkPolicy gets created to allow the network traffic. Please refer to this guide for more information.\n","categories":"","description":"","excerpt":"Shoot Resource Customization Webhooks Gardener deploys several …","ref":"/docs/gardener/extensions/shoot-webhooks/","tags":"","title":"Shoot Webhooks"},{"body":"Shoot Worker Nodes Settings Users can configure settings affecting all worker nodes via .spec.provider.workersSettings in the Shoot resource.\nSSH Access SSHAccess indicates whether the sshd.service should be running on the worker nodes. This is ensured by a systemd service called sshd-ensurer.service which runs every 15 seconds on each worker node. When set to true, the systemd service ensures that the sshd.service is unmasked, enabled and running. If it is set to false, the systemd service ensures that sshd.service is disabled, masked and stopped. This also terminates all established SSH connections on the host. In addition, when this value is set to false, existing Bastion resources are deleted during Shoot reconciliation and new ones are prevented from being created, SSH keypairs are not created/rotated, SSH keypair secrets are deleted from the Garden cluster, and the gardener-user.service is not deployed to the worker nodes.\nsshAccess.enabled is set to true by default.\nExample Usage in a Shoot spec: provider: workersSettings: sshAccess: enabled: false ","categories":"","description":"Configuring SSH Access through `.spec.provider.workersSettings`","excerpt":"Configuring SSH Access through `.spec.provider.workersSettings`","ref":"/docs/gardener/shoot_workers_settings/","tags":"","title":"Shoot Worker Nodes Settings"},{"body":"Shortcodes are the Hugo way to extend the limitations of Markdown before resorting to HTML. There are a number of built-in shortcodes available from Hugo. This list is extended with Gardener website shortcodes designed specifically for its content. 
Find a complete reference to the Hugo built-in shortcodes on its website.\nBelow is a reference to the shortcodes developed for the Gardener website.\nalert {{% alert color=\"info\" title=\"Notice\" %}} text {{% /alert %}} produces Notice A notice disclaimer All the color options are info|warning|primary\nYou can also omit the title section from an alert, useful when creating notes.\nIt is important to note that the text that the “alerts” shortcode wraps will not be processed during site building. Do not use shortcodes in it.\nYou should also avoid mixing HTML and markdown formatting in shortcodes, since it won’t render correctly when the site is built.\nAlert Examples Info color Warning color Primary color mermaid The GitHub mermaid fenced code block syntax is used. You can find additional documentation at mermaid’s official website.\n```mermaid graph LR; A[Hard edge] --\u003e|Link text| B(Round edge) B --\u003e C{Decision} C --\u003e|One| D[Result one] C --\u003e|Two| E[Result two] ``` produces:\ngraph LR; A[Hard edge] --\u003e|Link text| B(Round edge) B --\u003e C{Decision} C --\u003e|One| D[Result one] C --\u003e|Two| E[Result two] Default settings can be overridden using the %%init%% header at the start of the diagram definition. 
See the mermaid theming documentation.\n```mermaid %%{init: {'theme': 'neutral', 'themeVariables': { 'mainBkg': '#eee'}}}%% graph LR; A[Hard edge] --\u003e|Link text| B(Round edge) B --\u003e C{Decision} C --\u003e|One| D[Result one] C --\u003e|Two| E[Result two] ``` produces:\n%%{init: {'theme': 'neutral', 'themeVariables': { 'mainBkg': '#eee'}}}%% graph LR; A[Hard edge] --\u003e|Link text| B(Round edge) B --\u003e C{Decision} C --\u003e|One| D[Result one] C --\u003e|Two| E[Result two] ","categories":"","description":"","excerpt":"Shortcodes are the Hugo way to extend the limitations of Markdown …","ref":"/docs/contribute/documentation/shortcodes/","tags":"","title":"Shortcodes"},{"body":"This page gives writing style guidelines for the Gardener documentation. For formatting guidelines, see the Formatting Guide.\nThese are guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a Pull Request.\n Structure Language and Grammar Related Links Structure Documentation Types Overview The following table summarizes the types of documentation and their mapping to the SAP UA taxonomy. Every topic you create will fall into one of these categories.\n Gardener Content Type Definition Example Content Comparable UA Content Type Concept Introduce a functionality or concept; covers background information. Services Overview, Relevant headings Concept Reference Provide a reference, for example, list all command line options of gardenctl and what they are used for. Overview of kubectl Relevant headings Reference Task A step-by-step description that allows users to complete a specific task. Upgrading kubeadm clusters Overview, Prerequisites, Steps, Result Complex Task Trail Collection of all other content types to cover a big topic. Custom Networking None Maps Tutorial A combination of many tasks that allows users to complete an example task with the goal to learn the details of a given feature. 
Deploying Cassandra with a StatefulSet Overview, Prerequisites, Tasks, Result Tutorial See the Contributors Guide for more details on how to produce and contribute documentation.\nTopic Structure When creating a topic, you will need to follow a certain structure. A topic generally consists of, in order:\n Metadata (Specific for .md files in Gardener) - Additional information about the topic\n Title - A short, descriptive name for the topic\n Content - The main part of the topic. It contains all the information relevant to the user\n Concept content: Overview, Relevant headings Task content: Overview, Prerequisites, Steps, Result Reference content: Relevant headings Related Links (Optional) - A part after the main content that contains links that are not a part of the topic, but are still connected to it\n You can use the provided content description files as a template for your own topics.\nFront Matter Front matter is metadata applied at the head of each content Markdown file. It is used to instruct the static site generator build process. The format is YAML and it must be enclosed in leading and trailing comment dashes (---).\nSample codeblock:\n--- title: Getting Started description: Guides to get you accustomed with Gardener weight: 10 --- There are a number of predefined front matter properties, but not all of them are considered by the layouts developed for the website. The most essential ones to consider are:\n title the content title that will be used as page title and in navigation structures. description describes the content. For some content types such as documentation guides, it may be rendered in the UI. weight a positive integer number that controls the ordering of the content in navigation structures. url if specified, it will override the default url constructed from the file path to the content. Make sure the url you specify is consistent and meaningful. Prefer short paths. Do not provide redundant URLs! 
persona specifies the type of user the topic is aimed towards. Use only a single persona per topic. persona: Users / Operators / Developers While this section will be automatically generated if your topic has a title header, adding more detailed information helps other users, developers, and technical writers better sort, classify and understand the topic.\nBy using a metadata section you can also skip adding a title header or overwrite it in the navigation section.\nAlerts If you want to add a note, tip or a warning to your topic, use the templates provided in the Shortcodes documentation.\nImages If you want to add an image to your topic, it is recommended to follow the guidelines outlined in the Images documentation.\nGeneral Tips Try to create a succinct title and an informative description for your topics If a topic feels too long, it might be better to split it into a few different ones Avoid having more than ten steps in one task topic When writing a tutorial, link the tasks used in it instead of copying their content Language and Grammar Language Gardener documentation uses US English Keep it simple and use words that non-native English speakers are also familiar with Use the Merriam-Webster Dictionary when checking the spelling of words Writing Style Write in a conversational manner and use simple present tense Be friendly and refer to the person reading your content as “you”, instead of standard terms such as “user” Use an active voice - make it clear who is performing the action Creating Titles and Headers Use title case when creating titles or headers Avoid adding additional formatting to the title or header Concept and reference topic titles should be simple and succinct Task and tutorial topic titles begin with a verb Related Links Formatting Guide Contributors Guide Shortcodes Images SAPterm ","categories":"","description":"","excerpt":"This page gives writing style guidelines for the Gardener 
…","ref":"/docs/contribute/documentation/style-guide/","tags":"","title":"Style Guide"},{"body":"Supported Kubernetes Versions We strongly recommend using etcd-druid with the supported kubernetes versions, published in this document. The following is a list of kubernetes versions supported by the respective etcd-druid versions.\n Etcd-druid version Kubernetes version \u003e=0.20 \u003e=1.21 \u003e=0.14 \u0026\u0026 \u003c0.20 All versions supported \u003c0.14 \u003c 1.25 ","categories":"","description":"","excerpt":"Supported Kubernetes Versions We strongly recommend using etcd-druid …","ref":"/docs/other-components/etcd-druid/supported_k8s_versions/","tags":"","title":"Supported K8s Versions"},{"body":"Supported Kubernetes Versions Currently, Gardener supports the following Kubernetes versions:\nGarden Clusters The minimum version of a garden cluster that can be used to run Gardener is 1.25.x.\nSeed Clusters The minimum version of a seed cluster that can be connected to Gardener is 1.25.x.\nShoot Clusters Gardener itself is capable of spinning up clusters with Kubernetes versions 1.25 up to 1.30. However, the concrete versions that can be used for shoot clusters depend on the installed provider extension. Consequently, please consult the documentation of your provider extension to see which Kubernetes versions are supported for shoot clusters.\n 👨🏼‍💻 Developers note: The Adding Support For a New Kubernetes Version topic explains what needs to be done in order to add support for a new Kubernetes version.\n ","categories":"","description":"","excerpt":"Supported Kubernetes Versions Currently, Gardener supports the …","ref":"/docs/gardener/supported_k8s_versions/","tags":"","title":"Supported Kubernetes Versions"},{"body":"Gardener Extension for SUSE CHost \nThis controller operates on the OperatingSystemConfig resource in the extensions.gardener.cloud/v1alpha1 API group. It manages those objects that are requesting SUSE Container Host configuration, i.e. 
suse-chost type:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: suse-chost units: ... files: ... Please find a concrete example in the example folder.\nIt is also capable of supporting the vSMP MemoryOne operating system with the memoryone-chost type. Please find more information here.\nAfter reconciliation the resulting data will be stored in a secret within the same namespace (as the config itself might contain confidential data). The name of the secret will be written into the resource’s .status field:\n... status: ... cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default command: /usr/bin/env bash \u003cpath\u003e units: - docker-monitor.service - kubelet-monitor.service - kubelet.service The secret has one data key cloud_config that stores the generation.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for the SUSE Container Host operating system (CHost)","excerpt":"Gardener extension controller for the SUSE Container Host operating …","ref":"/docs/extensions/os-extensions/gardener-extension-os-suse-chost/","tags":"","title":"SUSE CHost OS"},{"body":"Problem One thing that always bothered me was that I couldn’t get logs of several pods at once with kubectl. A simple tail -f \u003cpath-to-logfile\u003e isn’t possible at all. Certainly, you can use kubectl logs -f \u003cpod-id\u003e, but it doesn’t help if you want to monitor more than one pod at a time.\nThis is something you need often, at least if you run several instances of a pod behind a deployment. This is even more so if you don’t have a Kibana or a similar setup.\nSolution Luckily, there are smart developers out there who always come up with solutions. The finding of the week is a small bash script that allows you to aggregate log files of several pods at the same time in a simple way. 
The script is called kubetail and is available on GitHub.\n","categories":"","description":"Aggregate log files from different pods","excerpt":"Aggregate log files from different pods","ref":"/docs/guides/monitoring-and-troubleshooting/tail-logfile/","tags":"","title":"tail -f /var/log/my-application.log"},{"body":"Access the Kubernetes apiserver from your tailnet Overview If you would like to strengthen the security of your Kubernetes cluster even further, this guide explains how this can be achieved.\nThe most common way to secure a Kubernetes cluster which was created with Gardener is to apply the ACLs described in the Gardener ACL Extension repository or to use ExposureClass, which exposes the Kubernetes apiserver in a corporate network not exposed to the public internet.\nHowever, those solutions are not without their drawbacks. Managing the ACL extension becomes fairly difficult with a growing number of participants, especially in dynamic environments and work-from-home scenarios, and using ExposureClass requires you to first have a corporate network suitable for this purpose.\nBut there is a solution which bridges the gap between these two approaches by the use of a mesh VPN based on WireGuard.\nTailscale Tailscale is a mesh VPN network which uses WireGuard under the hood, but automates the key exchange procedure. Please consult the official tailscale documentation for a detailed explanation.\nTarget Architecture Installation In order to be able to access the Kubernetes apiserver only from a tailscale VPN, you need these steps:\n Create a tailscale account and ensure MagicDNS is enabled. Create an OAuth ClientID and Secret. Don’t forget to create the required tags. Install the tailscale operator. If all went well after the operator installation, you should be able to see the tailscale operator by running tailscale status:\n# tailscale status ... 100.83.240.121 tailscale-operator tagged-devices linux - ... 
Expose the Kubernetes apiserver Now you are ready to expose the Kubernetes apiserver in the tailnet by annotating the service which was created by Gardener:\nkubectl annotate -n default service kubernetes tailscale.com/expose=true tailscale.com/hostname=kubernetes It is required to use kubernetes as the hostname, because this is part of the certificate common name of the Kubernetes apiserver.\nAfter annotating the service, it will be exposed in the tailnet and can be shown by running tailscale status:\n# tailscale status ... 100.83.240.121 tailscale-operator tagged-devices linux - 100.96.191.87 kubernetes tagged-devices linux idle, tx 19548 rx 71656 ... Modify the kubeconfig In order to access the cluster via the VPN, you must modify the kubeconfig to point to the Kubernetes service exposed in the tailnet, by changing the server entry to https://kubernetes.\n--- apiVersion: v1 clusters: - cluster: certificate-authority-data: \u003cbase64 encoded secret\u003e server: https://kubernetes name: my-cluster ... Enable ACLs to Block All IPs Now you are ready to use your cluster from every device which is part of your tailnet. Therefore you can now block all access to the Kubernetes apiserver with the ACL extension.\nCaveats Multiple Kubernetes Clusters You currently cannot join multiple Kubernetes clusters to your tailnet because the kubernetes service in every cluster would overlap.\nHeadscale It is possible to host a tailscale coordination server on your own if you do not want to rely on the service tailscale.com offers. The headscale project is an open source implementation of this.\nThis works for basic tailscale VPN setups, but not for the tailscale operator described here, because headscale does not implement all required API endpoints for the tailscale operator. 
The details can be found in this GitHub issue.\n","categories":"","description":"","excerpt":"Access the Kubernetes apiserver from your tailnet Overview If you …","ref":"/docs/guides/administer-shoots/tailscale/","tags":"","title":"Tailscale"},{"body":"Taints and Tolerations for Seeds and Shoots Similar to taints and tolerations for Nodes and Pods in Kubernetes, the Seed resource supports specifying taints (.spec.taints, see this example) while the Shoot resource supports specifying tolerations (.spec.tolerations, see this example). The feature is used to control scheduling to seeds as well as decisions whether a shoot can use a certain seed.\nCompared to Kubernetes, Gardener’s taints and tolerations are very much stripped down right now and have some behavioral differences. Please read the following explanations carefully if you plan to use them.\nScheduling When scheduling a new shoot, the gardener-scheduler will filter all seed candidates whose taints are not tolerated by the shoot. As Gardener’s taints/tolerations don’t support effects yet, you can compare this behaviour with using a NoSchedule effect taint in Kubernetes.\nBe reminded that taints/tolerations are not a means to define any affinity or selection for seeds - please use .spec.seedSelector in the Shoot to state such desires.\n⚠️ Please note that - unlike how it’s implemented in Kubernetes - a certain seed cluster may only be used when the shoot tolerates all the seed’s taints. This means that specifying .spec.seedName for a seed whose taints are not tolerated will make the gardener-apiserver reject the request.\nConsequently, the taints/tolerations feature can be used as a means to restrict usage of certain seeds.\nToleration Defaults and Whitelist The Project resource features a .spec.tolerations object that may carry defaults and a whitelist (see this example). The corresponding ShootTolerationRestriction admission plugin (cf. 
Kubernetes’ PodTolerationRestriction admission plugin) is responsible for evaluating these settings during creation/update of Shoots.\nWhitelist If a shoot gets created or updated with tolerations, then it is validated that only those tolerations may be used that were added to either a) the Project’s .spec.tolerations.whitelist, or b) to the global whitelist in the ShootTolerationRestriction’s admission config (see this example).\n⚠️ Please note that the tolerations whitelist of Projects can only be changed if the user trying to change it is bound to the modify-spec-tolerations-whitelist custom RBAC role, e.g., via the following ClusterRole:\napiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: full-project-modification-access rules: - apiGroups: - core.gardener.cloud resources: - projects verbs: - create - patch - update - modify-spec-tolerations-whitelist - delete Defaults If a shoot gets created, then the default tolerations specified in both the Project’s .spec.tolerations.defaults and the global default list in the ShootTolerationRestriction admission plugin’s configuration will be added to the .spec.tolerations of the Shoot (unless it already specifies a certain key).\n","categories":"","description":"","excerpt":"Taints and Tolerations for Seeds and Shoots Similar to taints and …","ref":"/docs/gardener/tolerations/","tags":"","title":"Taints and Tolerations for Seeds and Shoots"},{"body":"Task Title (the topic title can also be placed in the frontmatter)\nOverview This section provides an overview of the topic and the information provided in it.\nPrerequisites Prerequisite 1 Prerequisite 2 Steps Avoid nesting headings directly on top of each other with no text in between.\n Describe step 1 here Describe step 2 here You can use smaller sections within sections for related tasks Avoid nesting headings directly on top of each other with no text in between.\n Describe step 1 here Describe step 2 here Result Screenshot of the final status once 
all the steps have been completed.\nRelated Links Provide links to other relevant topics, if applicable. Once someone has completed these steps, what might they want to do next?\n Link 1 Link 2 ","categories":"","description":"Describes the contents of a task topic","excerpt":"Describes the contents of a task topic","ref":"/docs/contribute/documentation/style-guide/task_template/","tags":"","title":"Task Topic Structure"},{"body":"Terminal Shortcuts As a user and/or Gardener administrator, you can configure terminal shortcuts, which are preconfigured terminals for frequently used views.\nYou can launch the terminal shortcuts directly on the shoot details screen. You can view the definition of a terminal shortcut by clicking on the eye icon. Another improvement is that when creating a new terminal, you can directly alter the configuration. With expanded configuration On the Create Terminal Session dialog you can choose one or multiple terminal shortcuts. Project specific terminal shortcuts created (by a member of the project) have a project icon badge and are listed as Unverified. A warning message is displayed before a project specific terminal shortcut is run, informing the user about the risks. How to create a project specific terminal shortcut\nDisclaimer: “Project specific terminal shortcuts” is an experimental feature and may change in future releases (we plan to introduce a dedicated custom resource).\nYou need to create a secret with the name terminal.shortcuts within your project namespace, containing your terminal shortcut configurations. Under data.shortcuts you add a list of terminal shortcuts (base64 encoded). 
Example terminal.shortcuts secret:\nkind: Secret type: Opaque metadata: name: terminal.shortcuts namespace: garden-myproject apiVersion: v1 data: shortcuts: LS0tCi0gdGl0bGU6IE5ldHdvcmtEZWxheVRlc3RzCiAgZGVzY3JpcHRpb246IFNob3cgbmV0d29ya21hY2hpbmVyeS5pbydzIE5ldHdvcmtEZWxheVRlc3RzCiAgdGFyZ2V0OiBzaG9vdAogIGNvbnRhaW5lcjoKICAgIGltYWdlOiBxdWF5LmlvL2RlcmFpbGVkL2s5czpsYXRlc3QKICAgIGFyZ3M6CiAgICAtIC0taGVhZGxlc3MKICAgIC0gLS1jb21tYW5kPW5ldHdvcmtkZWxheXRlc3QKICBzaG9vdFNlbGVjdG9yOgogICAgbWF0Y2hMYWJlbHM6CiAgICAgIGZvbzogYmFyCi0gdGl0bGU6IFNjYW4gQ2x1c3RlcgogIGRlc2NyaXB0aW9uOiBTY2FucyBsaXZlIEt1YmVybmV0ZXMgY2x1c3RlciBhbmQgcmVwb3J0cyBwb3RlbnRpYWwgaXNzdWVzIHdpdGggZGVwbG95ZWQgcmVzb3VyY2VzIGFuZCBjb25maWd1cmF0aW9ucwogIHRhcmdldDogc2hvb3QKICBjb250YWluZXI6CiAgICBpbWFnZTogcXVheS5pby9kZXJhaWxlZC9rOXM6bGF0ZXN0CiAgICBhcmdzOgogICAgLSAtLWhlYWRsZXNzCiAgICAtIC0tY29tbWFuZD1wb3BleWU= How to configure the dashboard with terminal shortcuts Example values.yaml:\nfrontend: features: terminalEnabled: true projectTerminalShortcutsEnabled: true # members can create a `terminal.shortcuts` secret containing the project specific terminal shortcuts terminal: shortcuts: - title: \"Control Plane Pods\" description: Using K9s to view the pods of the control plane for this cluster target: cp container: image: quay.io/derailed/k9s:latest args: - \"--headless\" - \"--command=pods\" - title: \"Cluster Overview\" description: This gives a quick overview about the status of your cluster using K9s pulse feature target: shoot container: image: quay.io/derailed/k9s:latest args: - \"--headless\" - \"--command=pulses\" - title: \"Nodes\" description: View the nodes for this cluster target: shoot container: image: quay.io/derailed/k9s:latest command: - bin/sh args: - -c - sleep 1 \u0026\u0026 while true; do k9s --headless --command=nodes; done # shootSelector: # matchLabels: # foo: bar [...] 
terminal: # is generally required for the terminal feature container: image: europe-docker.pkg.dev/gardener-project/releases/gardener/ops-toolbelt:0.26.0 containerImageDescriptions: - image: /.*/ops-toolbelt:.*/ description: Run `ghelp` to get information about installed tools and packages gardenTerminalHost: seedRef: my-soil garden: operatorCredentials: serviceAccountRef: name: dashboard-terminal-admin namespace: garden ","categories":"","description":"","excerpt":"Terminal Shortcuts As user and/or gardener administrator you can …","ref":"/docs/dashboard/terminal-shortcuts/","tags":"","title":"Terminal Shortcuts"},{"body":"Testing Jest We use the Jest JavaScript Testing Framework\n Jest can collect code coverage information. Jest supports snapshot testing out of the box. All-in-one solution: replaces Mocha, Chai, Sinon and Istanbul. It works with Vue.js and Node.js projects. To execute all tests, simply run\nyarn workspaces foreach --all run test or to include test coverage generation\nyarn workspaces foreach --all run test-coverage You can also run tests for frontend, backend and charts directly inside the respective folder via\nyarn test Lint We use ESLint for static code analysis.\nTo execute, run\nyarn workspaces foreach --all run lint ","categories":"","description":"","excerpt":"Testing Jest We use Jest JavaScript Testing Framework\n Jest can …","ref":"/docs/dashboard/testing/","tags":"","title":"Testing"},{"body":"Testing Strategy and Developer Guideline This document walks you through:\n What kind of tests we have in Gardener How to run each of them What purpose each kind of test serves How to best write tests that are correct, stable, fast and maintainable How to debug tests that are not working as expected The document is aimed towards developers that want to contribute code and need to write tests, as well as maintainers and reviewers that review test code. 
It serves as a common guide that we commit to follow in our project to ensure consistency in our tests, good coverage for high confidence, and good maintainability.\nThe guidelines are not meant to be absolute rules. Always apply common sense and adapt the guideline if it doesn’t make much sense for some cases. If in doubt, don’t hesitate to ask questions during a PR review (as an author, but also as a reviewer). Add new learnings as soon as we make them!\nGenerally speaking, tests are a strict requirement for contributing new code. If you touch code that is currently untested, you need to add tests for the new cases that you introduce as a minimum. Ideally though, you would add the missing test cases for the current code as well (boy scout rule – “always leave the campground cleaner than you found it”).\nWriting Tests (Relevant for All Kinds) We follow BDD (behavior-driven development) testing principles and use Ginkgo, along with Gomega. Make sure to check out their extensive guides for more information and how to best leverage all of their features Use By to structure test cases with multiple steps, so that steps are easy to follow in the logs: example test Call defer GinkgoRecover() if making assertions in goroutines: doc, example test Use DeferCleanup instead of cleaning up manually (or use custom coding from the test framework): example test, example test DeferCleanup makes sure to run the cleanup code in the right point in time, e.g., a DeferCleanup added in BeforeEach is executed with AfterEach. Test results should point to locations that cause the failures, so that the CI output isn’t too difficult to debug/fix. Consider using ExpectWithOffset if the test uses assertions made in a helper function, among other assertions defined directly in the test (e.g. expectSomethingWasCreated): example test Make sure to add additional descriptions to Gomega matchers if necessary (e.g. 
in a loop): example test Introduce helper functions for assertions to make test more readable where applicable: example test Introduce custom matchers to make tests more readable where applicable: example matcher Don’t rely on accurate timing of time.Sleep and friends. If doing so, CPU throttling in CI will make tests flaky, example flake Use fake clocks instead, example PR Use the same client schemes that are also used by production code to avoid subtle bugs/regressions: example PR, production schemes, usage in test Make sure that your test is actually asserting the right thing and it doesn’t pass if the exact bug is introduced that you want to prevent. Use specific error matchers instead of asserting any error has happened, make sure that the corresponding branch in the code is tested, e.g., prefer Expect(err).To(MatchError(\"foo\")) over Expect(err).To(HaveOccurred()) If you’re unsure about your test’s behavior, attaching the debugger can sometimes be helpful to make sure your test is correct. About overwriting global variables: This is a common pattern (or hack?) in go for faking calls to external functions. However, this can lead to races, when the global variable is used from a goroutine (e.g., the function is called). Alternatively, set fields on structs (passed via parameter or set directly): this is not racy, as struct values are typically (and should be) only used for a single test case. An alternative to dealing with function variables and fields: Add an interface which your code depends on Write a fake and a real implementation (similar to clock.Clock.Sleep) The real implementation calls the actual function (clock.RealClock.Sleep calls time.Sleep) The fake implementation does whatever you want it to do for your test (clock.FakeClock.Sleep waits until the test code advanced the time) Use constants in test code with care. Typically, you should not use constants from the same package as the tested code, instead use literals. 
If the constant value is changed, tests using the constant will still pass, although the “specification” is not fulfilled anymore. There are cases where it’s fine to use constants, but keep this caveat in mind when doing so. Creating sample data for tests can be a high effort. If valuable, add a package for generating common sample data, e.g. Shoot/Cluster objects. Make use of the testdata directory for storing arbitrary sample data needed by tests (helm charts, YAML manifests, etc.), example PR From https://pkg.go.dev/cmd/go/internal/test: The go tool will ignore a directory named “testdata”, making it available to hold ancillary data needed by the tests.\n Unit Tests Running Unit Tests Run all unit tests:\nmake test Run all unit tests with test coverage:\nmake test-cov open test.coverage.html make test-cov-clean Run unit tests of specific packages:\n# run with same settings like in CI (race detector, timeout, ...) ./hack/test.sh ./pkg/resourcemanager/controller/... ./pkg/utils/secrets/... # freestyle go test ./pkg/resourcemanager/controller/... ./pkg/utils/secrets/... ginkgo run ./pkg/resourcemanager/controller/... ./pkg/utils/secrets/... Debugging Unit Tests Use ginkgo to focus on (a set of) test specs via code or via CLI flags. Remember to unfocus specs before contributing code, otherwise your PR tests will fail.\n$ ginkgo run --focus \"should delete the unused resources\" ./pkg/resourcemanager/controller/garbagecollector ... Will run 1 of 3 specs SS• Ran 1 of 3 Specs in 0.003 seconds SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 2 Skipped PASS Use ginkgo to run tests until they fail:\n$ ginkgo run --until-it-fails ./pkg/resourcemanager/controller/garbagecollector ... Ran 3 of 3 Specs in 0.004 seconds SUCCESS! -- 3 Passed | 0 Failed | 0 Pending | 0 Skipped PASS All tests passed... Will keep running them until they fail. This was attempt #58 No, seriously... you can probably stop now. 
Use the stress tool for deflaking tests that fail sporadically in CI, e.g., due to resource contention (CPU throttling):\n# get the stress tool go install golang.org/x/tools/cmd/stress@latest # build a test binary ginkgo build ./pkg/resourcemanager/controller/garbagecollector # alternatively go test -c ./pkg/resourcemanager/controller/garbagecollector # run the test in parallel and report any failures stress -p 16 ./pkg/resourcemanager/controller/garbagecollector/garbagecollector.test -ginkgo.focus \"should delete the unused resources\" 5s: 1077 runs so far, 0 failures 10s: 2160 runs so far, 0 failures stress will output a path to a file containing the full failure message when a test run fails.\nPurpose of Unit Tests Unit tests prove the correctness of a single unit according to the specification of its interface. Think: Is the unit that I introduced doing what it is supposed to do for all cases? Unit tests protect against regressions caused by adding new functionality to or refactoring of a single unit. Think: Is the unit that was introduced earlier (by someone else) and that I changed still doing what it was supposed to do for all cases? Example units: functions (conversion, defaulting, validation, helpers), structs (helpers, basic building blocks like the Secrets Manager), predicates, event handlers. For these purposes, unit tests need to cover all important cases of input for a single unit and cover edge cases / negative paths as well (e.g., errors). Because of the possible high dimensionality of test input, unit tests need to be fast to execute: individual test cases should not take more than a few seconds, test suites not more than 2 minutes. Fuzzing can be used as a technique in addition to usual test cases for covering edge cases. Test coverage can be used as a tool during test development for covering all cases of a unit. However, test coverage data can be a false safety net. Full line coverage doesn’t mean you have covered all cases of valid input. 
We don’t have strict requirements for test coverage, as it doesn’t necessarily yield the desired outcome. Unit tests should not test too large components, e.g. entire controller Reconcile functions. If a function/component does many steps, it’s probably better to split it up into multiple functions/components that can be unit tested individually There might be special cases for very small Reconcile functions. If there are a lot of edge cases, extract dedicated functions that cover them and use unit tests to test them. Usual-sized controllers should rather be tested in integration tests. Individual parts (e.g. helper functions) should still be tested in unit tests for covering all cases, though. Unit tests are especially easy to run with a debugger and can help in understanding concrete behavior of components. Writing Unit Tests For the sake of execution speed, fake expensive calls/operations, e.g. secret generation: example test Generally, prefer fakes over mocks, e.g., use controller-runtime fake client over mock clients. Mocks decrease maintainability because they expect the tested component to follow a certain way to reach the desired goal (e.g., call specific functions with particular arguments), example consequence Generally, fakes should be used in “result-oriented” test code (e.g., that a certain object was labelled, but the test doesn’t care if it was via patch or update as both are valid ways to reach the desired goal). Although rare, there are valid use cases for mocks, e.g. if the following aspects are important for correctness: Asserting that an exact function is called Asserting that functions are called in a specific order Asserting that exact parameters/values/… are passed Asserting that a certain function was not called Many of these can also be verified with fakes, although mocks might be simpler Only use mocks if the tested code directly calls the mock; never if the tested code only calls the mock indirectly (e.g., through a helper package/function). 
Keep in mind the maintenance implications of using mocks: Can you make a valid non-behavioral change in the code without breaking the test or dependent tests? It’s valid to mix fakes and mocks in the same test or between test cases. Generally, use the go test package, i.e., declare package \u003cproduction_package\u003e_test: Helps in avoiding cyclic dependencies between production, test and helper packages Also forces you to distinguish between the public (exported) API surface of your code and internal state that might not be of interest to tests It might be valid to use the same package as the tested code if you want to test unexported functions. Alternatively, an internal package can be used to host “internal” helpers: example package Helpers can also be exported if no one is supposed to import the containing package (e.g. controller package). Integration Tests (envtests) Integration tests in Gardener use the sigs.k8s.io/controller-runtime/pkg/envtest package. It sets up a temporary control plane (etcd + kube-apiserver) and runs the test against it. The test suites start their individual envtest environment before running the tested controller/webhook and executing test cases. Before exiting, the test suites tear down the temporary test environment.\nPackage github.com/gardener/gardener/test/envtest augments the controller-runtime’s envtest package by starting and registering gardener-apiserver. This is used to test controllers that act on resources in the Gardener APIs (aggregated APIs).\nHistorically, test machinery tests have also been called “integration tests”. However, test machinery does not perform integration testing but rather executes a form of end-to-end tests against a real landscape. 
Hence, we tried to sharpen the terminology that we use to distinguish between “real” integration tests and test machinery tests but you might still find “integration tests” referring to test machinery tests in old issues or outdated documents.\nRunning Integration Tests The test-integration make rule prepares the environment automatically by downloading the respective binaries (if not yet present) and setting the necessary environment variables.\nmake test-integration If you want to run a specific set of integration tests, you can also execute them using ./hack/test-integration.sh directly instead of using the test-integration rule. Prior to execution, the PATH environment variable needs to be set to also include the tools binary directory. For example:\nexport PATH=\"$PWD/hack/tools/bin/$(go env GOOS)-$(go env GOARCH):$PATH\" source ./hack/test-integration.env ./hack/test-integration.sh ./test/integration/resourcemanager/tokenrequestor The script takes care of preparing the environment for you. If you want to execute the test suites directly via go test or ginkgo, you have to point the KUBEBUILDER_ASSETS environment variable to the path that contains the etcd and kube-apiserver binaries. Alternatively, you can install the binaries to /usr/local/kubebuilder/bin. Additionally, the environment variables from hack/test-integration.env should be sourced.\nDebugging Integration Tests You can configure envtest to use an existing cluster or control plane instead of starting a temporary control plane that is torn down immediately after executing the test. This can be helpful for debugging integration tests because you can easily inspect what is going on in your test environment with kubectl.\nWhile you can use an existing cluster (e.g., kind), some test suites expect that no controllers and no nodes are running in the test environment (as is the case in envtest test environments). 
Hence, using a full-blown cluster with controllers and nodes might sometimes be impractical, as you would need to stop cluster components for the tests to work.\nYou can use make start-envtest to start an envtest test environment that is managed separately from individual test suites. This allows you to keep the test environment running for as long as you want, and to debug integration tests by executing multiple test runs in parallel or inspecting test runs using kubectl. When you are finished, just hit CTRL-C for tearing down the test environment. The kubeconfig for the test environment is placed in dev/envtest-kubeconfig.yaml.\nmake start-envtest brings up an envtest environment using the default configuration. If your test suite requires a different control plane configuration (e.g., disabled admission plugins or enabled feature gates), feel free to locally modify the configuration in test/start-envtest while debugging.\nRun an envtest suite (not using gardener-apiserver) against an existing test environment:\nmake start-envtest # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml export USE_EXISTING_CLUSTER=true # run test with verbose output ./hack/test-integration.sh -v ./test/integration/resourcemanager/health -ginkgo.v # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml # watch test objects k get managedresource -A -w Run a gardenerenvtest suite (using gardener-apiserver) against an existing test environment:\n# modify GardenerTestEnvironment{} in test/start-envtest to disable admission plugins and enable feature gates like in test suite... 
make start-envtest ENVTEST_TYPE=gardener # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml export USE_EXISTING_GARDENER=true # run test with verbose output ./hack/test-integration.sh -v ./test/integration/controllermanager/bastion -ginkgo.v # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml # watch test objects k get bastion -A -w Similar to debugging unit tests, the stress tool can help with hunting flakes in integration tests. However, you might need to run fewer tests in parallel (specified via -p) and have a bit more patience. Generally, reproducing flakes in integration tests is easier when stress-testing against an existing test environment instead of starting temporary individual control planes per test run.\nStress-test an envtest suite (not using gardener-apiserver):\n# build a test binary ginkgo build ./test/integration/resourcemanager/health # prepare a test environment to run the test against make start-envtest # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml export USE_EXISTING_CLUSTER=true # use same timeout settings like in CI source ./hack/test-integration.env # switch to test package directory like `go test` cd ./test/integration/resourcemanager/health # run the test in parallel and report any failures stress -ignore \"unable to grab random port\" -p 16 ./health.test ... Stress-test a gardenerenvtest suite (using gardener-apiserver):\n# modify test/start-envtest to disable admission plugins and enable feature gates like in test suite... 
# build a test binary ginkgo build ./test/integration/controllermanager/bastion # prepare a test environment including gardener-apiserver to run the test against make start-envtest ENVTEST_TYPE=gardener # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml export USE_EXISTING_GARDENER=true # use same timeout settings like in CI source ./hack/test-integration.env # switch to test package directory like `go test` cd ./test/integration/controllermanager/bastion # run the test in parallel and report any failures stress -ignore \"unable to grab random port\" -p 16 ./bastion.test ... Purpose of Integration Tests Integration tests prove that multiple units are correctly integrated into a fully-functional component of the system. Example components with multiple units: A controller with its reconciler, watches, predicates, event handlers, queues, etc. A webhook with its server, handler, decoder, and webhook configuration. Integration tests set up a full component (including used libraries) and run it against a test environment close to the actual setup. e.g., start controllers against a real Kubernetes control plane to catch bugs that can only happen when talking to a real API server. Integration tests are generally more expensive to run (e.g., in terms of execution time). Integration tests should not cover each and every detailed case. Rather than that, cover a good portion of the “usual” cases that components will face during normal operation (positive and negative test cases). Also, there is no need to cover all failure cases or all cases of predicates -\u003e they should be covered in unit tests already. Generally, not supposed to “generate test coverage” but to provide confidence that components work well. As integration tests typically test only one component (or a cohesive set of components) isolated from others, they cannot catch bugs that occur when multiple controllers interact (could be discovered by e2e tests, though). 
Rule of thumb: a new integration test should be added for each new controller (an integration test doesn’t replace unit tests though). Writing Integration Tests Make sure to have a clean test environment on both test suite and test case level: Set up dedicated test environments (envtest instances) per test suite. Use dedicated namespaces per test suite: Use GenerateName with a test-specific prefix: example test Restrict the controller-runtime manager to the test namespace by setting manager.Options.Namespace: example test Alternatively, use a test-specific prefix with a random suffix determined upfront: example test This can be used to restrict webhooks to a dedicated test namespace: example test This allows running a test in parallel against the same existing cluster for deflaking and stress testing: example PR If the controller works on cluster-scoped resources: Label the resources with a label specific to the test run, e.g. the test namespace’s name: example test Restrict the manager’s cache for these objects with a corresponding label selector: example test Alternatively, use a checksum of a random UUID using uuid.NewUUID() function: example test This allows running a test in parallel against the same existing cluster for deflaking and stress testing, even if it works with cluster-scoped resources that are visible to all parallel test runs: example PR Use dedicated test resources for each test case: Use GenerateName: example test Alternatively, use a checksum of a random UUID using uuid.NewUUID() function: example test Logging the created object names is generally a good idea to support debugging failing or flaky tests: example test Always delete all resources after the test case (e.g., via DeferCleanup) that were created for the test case This avoids conflicts between test cases and cascading failures which distract from the actual root failures Don’t tolerate already existing resources (~dirty test environment), code smell: ignoring already-exists errors 
Don’t use a cached client in test code (e.g., the one from a controller-runtime manager), always construct a dedicated test client (uncached): example test Use asynchronous assertions: Eventually and Consistently. Never Expect anything to happen synchronously (immediately). Don’t use retry or wait until functions -\u003e use Eventually, Consistently instead: example test This allows overriding the interval/timeout values from outside instead of hard-coding this in the test (see hack/test-integration.sh): example PR Beware of the default Eventually / Consistently timeouts / poll intervals: docs Don’t set custom (high) timeouts and intervals in test code: example PR Instead, shorten sync period of controllers, overwrite intervals of the tested code, or use fake clocks: example test Pass g Gomega to Eventually/Consistently and use g.Expect in it: docs, example test, example PR Don’t forget to call {Eventually,Consistently}.Should(), otherwise the assertion always silently succeeds without errors: onsi/gomega#561 When using Gardener’s envtest (envtest.GardenerTestEnvironment): Disable gardener-apiserver’s admission plugins that are not relevant to the integration test itself by passing --disable-admission-plugins: example test This makes setup / teardown code simpler and ensures that only code relevant to the tested component itself is tested (but not the entire set of admission plugins) e.g., you can disable the ShootValidator plugin to create Shoots that reference non-existing SecretBindings or disable the DeletionConfirmation plugin to delete Gardener resources without adding a deletion confirmation first. Use a custom rate limiter for controllers in integration tests: example test This can be used for limiting exponential backoff to shorten wait times. Otherwise, if using the default rate limiter, exponential backoff might exceed the timeout of Eventually calls and cause flakes. 
End-to-End (e2e) Tests (Using provider-local) We run a suite of e2e tests on every pull request and periodically on the master branch. It uses a KinD cluster and skaffold to bootstrap a full installation of Gardener based on the current revision, including provider-local. This allows us to run e2e tests in an isolated test environment and fully locally without any infrastructure interaction. The tests perform a set of operations on Shoot clusters, e.g. creating, deleting, hibernating and waking up.\nThese tests are executed in our prow instance at prow.gardener.cloud, see job definition and job history.\nRunning e2e Tests You can also run these tests on your development machine, using the following commands:\nmake kind-up export KUBECONFIG=$PWD/example/gardener-local/kind/local/kubeconfig make gardener-up make test-e2e-local # alternatively: make test-e2e-local-simple If you want to run a specific set of e2e test cases, you can also execute them using ./hack/test-e2e-local.sh directly in combination with ginkgo label filters. For example:\n./hack/test-e2e-local.sh --label-filter \"Shoot \u0026\u0026 credentials-rotation\" ./test/e2e/gardener/... If you want to use an existing shoot instead of creating a new one for the test case and deleting it afterwards, you can specify the existing shoot via the following flags. This can be useful to speed up the development of e2e tests.\n./hack/test-e2e-local.sh --label-filter \"Shoot \u0026\u0026 credentials-rotation\" ./test/e2e/gardener/... -- --project-namespace=garden-local --existing-shoot-name=local For more information, see Developing Gardener Locally and Deploying Gardener Locally.\nDebugging e2e Tests When debugging e2e test failures in CI, logs of the cluster components can be very helpful. Our e2e test jobs export logs of all containers running in the kind cluster to prow’s artifacts storage. You can find them by clicking the Artifacts link in the top bar in prow’s job view and navigating to artifacts. 
This directory will contain all cluster component logs grouped by node.\nPull all artifacts using gsutil for searching and filtering the logs locally (use the path displayed in the artifacts view):\ngsutil cp -r gs://gardener-prow/pr-logs/pull/gardener_gardener/6136/pull-gardener-e2e-kind/1542030416616099840/artifacts/gardener-local-control-plane /tmp Purpose of e2e Tests e2e tests provide a high level of confidence that our code runs as expected by users when deployed to production. They are supposed to catch bugs resulting from interaction between multiple components. Test cases should be as close as possible to real usage by end users: You should test “from the perspective of the user” (or operator). Example: I create a Shoot and expect to be able to connect to it via the provided kubeconfig. Accordingly, don’t assert details of the system. e.g., the user also wouldn’t expect that there is a kube-apiserver deployment in the seed, they rather expect that they can talk to it no matter how it is deployed Only assert details of the system if the tested feature is not fully visible to the end-user and there is no other way of ensuring that the feature works reliably e.g., the Shoot CA rotation is not fully visible to the user but is assertable by looking at the secrets in the Seed. Pro: can be executed by developers and users without any real infrastructure (provider-local). Con: they currently cannot be executed with real infrastructure (e.g., provider-aws), we will work on this as part of #6016. Keep in mind that the tested scenario is still artificial in a sense of using default configuration, only a few objects, only a few config/settings combinations are covered. We will never be able to cover the full “test matrix” and this should not be our goal. Bugs will still be released and will still happen in production; we can’t avoid it. 
Instead, we should add test cases for preventing bugs in features or settings that were frequently regressed: example PR Usually e2e tests cover the “straight-forward cases”. However, negative test cases can also be included, especially if they are important from the user’s perspective. Writing e2e Tests Always wrap API calls and similar things in Eventually blocks: example test At this point, we are pretty much working with a distributed system and failures can happen anytime. Wrapping calls in Eventually makes tests more stable and more realistic (usually, you wouldn’t call the system broken if a single API call fails because of a short connectivity issue). Most of the points from writing integration tests are relevant for e2e tests as well (especially the points about asynchronous assertions). In contrast to integration tests, in e2e tests, it might make sense to specify higher timeouts for Eventually calls, e.g., when waiting for a Shoot to be reconciled. Generally, try to use the default settings for Eventually specified via the environment variables. Only set higher timeouts if waiting for long-running reconciliations to be finished. Gardener Upgrade Tests (Using provider-local) Gardener upgrade tests set up a kind cluster and deploy Gardener version vX.X.X before upgrading it to a given version vY.Y.Y.\nThis allows verifying whether the current (unreleased) revision/branch (or a specific release) is compatible with the latest (or a specific other) release. The GARDENER_PREVIOUS_RELEASE and GARDENER_NEXT_RELEASE environment variables are used to specify the respective versions.\nThis helps in understanding what happens or how the system reacts when Gardener upgrades from versions vX.X.X to vY.Y.Y for existing shoots in different states (creation/hibernation/wakeup/deletion). 
Gardener upgrade tests also help qualifying releases for all flavors (non-HA or HA with failure tolerance node/zone).\nJust like E2E tests, upgrade tests also use a KinD cluster and skaffold for bootstrapping a full Gardener installation based on the current revision/branch, including provider-local. This allows running e2e tests in an isolated test environment, fully locally without any infrastructure interaction. The tests perform a set of operations on Shoot clusters, e.g. create, delete, hibernate and wake up.\nBelow is a sequence describing how the tests are performed.\n Create a kind cluster. Install Gardener version vX.X.X. Run gardener pre-upgrade tests which are labeled with pre-upgrade. Upgrade Gardener version from vX.X.X to vY.Y.Y. Run gardener post-upgrade tests which are labeled with post-upgrade Tear down seed and kind cluster. How to Run Upgrade Tests Between Two Gardener Releases Sometimes, we need to verify/qualify two Gardener releases when we upgrade from one version to another. This can be performed by fetching the two Gardener versions from the GitHub Gardener release page and setting the appropriate env variables GARDENER_PREVIOUS_RELEASE, GARDENER_NEXT_RELEASE.\n GARDENER_PREVIOUS_RELEASE – This env variable refers to a source revision/branch (or a specific release) which has to be installed first and then upgraded to version GARDENER_NEXT_RELEASE. By default, it fetches the latest release version from the GitHub Gardener release page.\n GARDENER_NEXT_RELEASE – This env variable refers to the target revision/branch (or a specific release) to be upgraded to after successful installation of GARDENER_PREVIOUS_RELEASE. 
By default, it considers the local HEAD revision, builds code, and installs Gardener from the current revision from which the Gardener upgrade tests are triggered.\n make ci-e2e-kind-upgrade GARDENER_PREVIOUS_RELEASE=v1.60.0 GARDENER_NEXT_RELEASE=v1.61.0 make ci-e2e-kind-ha-single-zone-upgrade GARDENER_PREVIOUS_RELEASE=v1.60.0 GARDENER_NEXT_RELEASE=v1.61.0 make ci-e2e-kind-ha-multi-zone-upgrade GARDENER_PREVIOUS_RELEASE=v1.60.0 GARDENER_NEXT_RELEASE=v1.61.0 Purpose of Upgrade Tests Tests will ensure that shoot clusters reconciled with the previous version of Gardener work as expected even with the next Gardener version. This will reproduce or catch actual issues faced by end users. One of the test cases ensures no downtime is faced by the end-users for shoots while upgrading Gardener if the shoot’s control-plane is configured as HA. Writing Upgrade Tests Tests are divided into two parts and labeled with pre-upgrade and post-upgrade labels. An example test case which ensures a shoot which was hibernated in a previous Gardener release should wake up as expected in the next release: Creating and hibernating a shoot is a pre-upgrade test case, which should be labeled with the pre-upgrade label. Waking up and deleting the shoot is a post-upgrade test case, which should be labeled with the post-upgrade label. Test Machinery Tests Please see Test Machinery Tests.\nPurpose of Test Machinery Tests Test machinery tests have to be executed against full-blown Gardener installations. They can provide a very high level of confidence that an installation is functional in its current state, this includes: all Gardener components, Extensions, the used Cloud Infrastructure, all relevant settings/configuration. This brings the following benefits: They test more realistic scenarios than e2e tests (real configuration, real infrastructure, etc.). Tests run “where the users are”. However, this also brings significant drawbacks: Tests are difficult to develop and maintain. 
Tests require a full Gardener installation and cannot be executed in CI (on PR-level or against master). Tests require real infrastructure (think cloud provider credentials, cost). Using TestDefinitions under .test-defs requires a full test machinery installation. Accordingly, tests are heavyweight and expensive to run. Testing against real infrastructure can cause flakes sometimes (e.g., in outage situations). Failures are hard to debug, because clusters are deleted after the test (for obvious cost reasons). Bugs can only be caught once it’s “too late”, i.e., when code is merged and deployed. Today, test machinery tests cover a bigger “test matrix” (e.g., Shoot creation across infrastructures, kubernetes versions, machine image versions). Test machinery also runs Kubernetes conformance tests. However, because of the listed drawbacks, we should rather focus on augmenting our e2e tests, as we can run them locally and in CI in order to catch bugs before they get merged. It’s still a good idea to add test machinery tests if a feature that depends on some installation-specific configuration needs to be tested. Writing Test Machinery Tests Generally speaking, most points from writing integration tests and writing e2e tests apply here as well. However, test machinery tests contain a lot of technical debt and existing code doesn’t follow these best practices. As test machinery tests are out of our general focus, we don’t intend on reworking the tests soon or providing more guidance on how to write new ones. Manual Tests Manual tests can be useful when the cost of trying to automatically test certain functionality is too high. Useful for PR verification, if a reviewer wants to verify that all cases are properly tested by automated tests. Currently, it’s the simplest option for testing upgrade scenarios. e.g. 
migration coding is probably best tested manually, as it’s a high effort to write an automated test for little benefit Obviously, the need for manual tests should be kept at a bare minimum. Instead, we should add e2e tests wherever sensible/valuable. We want to implement some form of general upgrade tests as part of #6016. ","categories":"","description":"","excerpt":"Testing Strategy and Developer Guideline This document walks you …","ref":"/docs/gardener/testing/","tags":"","title":"Testing"},{"body":"Testing Strategy and Developer Guideline Intent of this document is to introduce you (the developer) to the following:\n Category of tests that exists. Libraries that are used to write tests. Best practices to write tests that are correct, stable, fast and maintainable. How to run each category of tests. For any new contributions tests are a strict requirement. Boy Scouts Rule is followed: If you touch a code for which either no tests exist or coverage is insufficient then it is expected that you will add relevant tests.\nTools Used for Writing Tests These are the following tools that were used to write all the tests (unit + envtest + vanilla kind cluster tests), it is preferred not to introduce any additional tools / test frameworks for writing tests:\nGomega We use gomega as our matcher or assertion library. Refer to Gomega’s official documentation for details regarding its installation and application in tests.\nTesting Package from Standard Library We use the Testing package provided by the standard library in golang for writing all our tests. Refer to its official documentation to learn how to write tests using Testing package. You can also refer to this example.\nWriting Tests Common for All Kinds For naming the individual tests (TestXxx and testXxx methods) and helper methods, make sure that the name describes the implementation of the method. 
For eg: testScalingWhenMandatoryResourceNotFound tests the behaviour of the scaler when a mandatory resource (KCM deployment) is not present. Maintain proper logging in tests. Use t.log() method to add appropriate messages wherever necessary to describe the flow of the test. See this for examples. Make use of the testdata directory for storing arbitrary sample data needed by tests (YAML manifests, etc.). See this package for examples. From https://pkg.go.dev/cmd/go/internal/test: The go tool will ignore a directory named “testdata”, making it available to hold ancillary data needed by the tests.\n Table-driven tests We need a tabular structure in two cases:\n When we have multiple tests which require the same kind of setup:- In this case we have a TestXxxSuite method which will do the setup and run all the tests. We have a slice of test struct which holds all the tests (typically a title and run method). We use a for loop to run all the tests one by one. See this for examples. When we have the same code path and multiple possible values to check:- In this case we have the arguments and expectations in a struct. We iterate through the slice of all such structs, passing the arguments to appropriate methods and checking if the expectation is met. See this for examples. Env Tests Env tests in Dependency Watchdog use the sigs.k8s.io/controller-runtime/pkg/envtest package. It sets up a temporary control plane (etcd + kube-apiserver) and runs the test against it. The code to set up and teardown the environment can be checked out here.\nThese are the points to be followed while writing tests that use envtest setup:\n All tests should be divided into two top level partitions:\n tests with common environment (testXxxCommonEnvTests) tests which need a dedicated environment for each one. (testXxxDedicatedEnvTests) They should be contained within the TestXxxSuite method. See this for examples. 
If all tests are of one kind then this is not needed.\n Create a method named setUpXxxTest for performing setup tasks before all/each test. It should either return a method or have a separate method to perform teardown tasks. See this for examples.\n The tests run by the suite can be table-driven as well.\n Use the envtest setup when there is a need of an environment close to an actual setup. Eg: start controllers against a real Kubernetes control plane to catch bugs that can only happen when talking to a real API server.\n NOTE: It is currently not possible to bring up more than one envtest environments. See issue#1363. We enforce running serial execution of test suites each of which uses a different envtest environments. See hack/test.sh.\n Vanilla Kind Cluster Tests There are some tests where we need a vanilla kind cluster setup, for eg:- The scaler.go code in the prober package uses the scale subresource to scale the deployments mentioned in the prober config. But the envtest setup does not support the scale subresource as of now. So we need this setup to test if the deployments are scaled as per the config or not. You can check out the code for this setup here. You can add utility methods for different kubernetes and custom resources in there.\nThese are the points to be followed while writing tests that use Vanilla Kind Cluster setup:\n Use this setup only if there is a need of an actual Kubernetes cluster(api server + control plane + etcd) to write the tests. (Because this is slower than your normal envTest setup) Create setUpXxxTest similar to the one in envTest. Follow the same structural pattern used in envTest for writing these tests. See this for examples. 
Run Tests To run unit tests, use the following Makefile target\nmake test To run KIND cluster based tests, use the following Makefile target\nmake kind-tests # these tests will be slower as it brings up a vanilla KIND cluster To view coverage after running the tests, run :\ngo tool cover -html=cover.out Flaky tests If you see that a test is flaky then you can use make stress target which internally uses stress tool\nmake stress test-package=\u003ctest-package\u003e test-func=\u003ctest-func\u003e tool-params=\"\u003ctool-params\u003e\" An example invocation:\nmake stress test-package=./internal/util test-func=TestRetryUntilPredicateWithBackgroundContext tool-params=\"-p 10\" The make target will do the following:\n It will create a test binary for the package specified via test-package at /tmp/pkg-stress.test directory. It will run stress tool passing the tool-params and targets the function test-func. ","categories":"","description":"","excerpt":"Testing Strategy and Developer Guideline Intent of this document is to …","ref":"/docs/other-components/dependency-watchdog/testing/","tags":"","title":"Testing"},{"body":"Testing Strategy and Developer Guideline Intent of this document is to introduce you (the developer) to the following:\n Libraries that are used to write tests. Best practices to write tests that are correct, stable, fast and maintainable. How to run tests. The guidelines are not meant to be absolute rules. Always apply common sense and adapt the guideline if it doesn’t make much sense for some cases. If in doubt, don’t hesitate to ask questions during a PR review (as an author, but also as a reviewer). Add new learnings as soon as we make them!\nFor any new contributions tests are a strict requirement. 
Boy Scouts Rule is followed: If you touch a code for which either no tests exist or coverage is insufficient then it is expected that you will add relevant tests.\nCommon guidelines for writing tests We use the Testing package provided by the standard library in golang for writing all our tests. Refer to its official documentation to learn how to write tests using Testing package. You can also refer to this example.\n We use gomega as our matcher or assertion library. Refer to Gomega’s official documentation for details regarding its installation and application in tests.\n For naming the individual test/helper functions, ensure that the name describes what the function tests/helps-with. Naming is important for code readability even when writing tests - example-testcase-naming.\n Introduce helper functions for assertions to make test more readable where applicable - example-assertion-function.\n Introduce custom matchers to make tests more readable where applicable - example-custom-matcher.\n Do not use time.Sleep and friends as it renders the tests flaky.\n If a function returns a specific error then ensure that the test correctly asserts the expected error instead of just asserting that an error occurred. To help make this assertion consider using DruidError where possible. example-test-utility \u0026 usage.\n Creating sample data for tests can be a high effort. Consider writing test utilities to generate sample data instead. example-test-object-builder.\n If tests require any arbitrary sample data then ensure that you create a testdata directory within the package and keep the sample data as files in it. From https://pkg.go.dev/cmd/go/internal/test\n The go tool will ignore a directory named “testdata”, making it available to hold ancillary data needed by the tests.\n Avoid defining shared variable/state across tests. This can lead to race conditions causing non-deterministic state. 
Additionally it limits the capability to run tests concurrently via t.Parallel().\n Do not assume or try and establish an order amongst different tests. This leads to brittle tests as the codebase evolves.\n If you need to have logs produced by test runs (especially helpful in failing tests), then consider using t.Log or t.Logf.\n Unit Tests If you need a kubernetes client.Client, prefer using fake client instead of mocking the client. You can inject errors when building the client which enables you test error handling code paths. Mocks decrease maintainability because they expect the tested component to follow a certain way to reach the desired goal (e.g., call specific functions with particular arguments). All unit tests should be run quickly. Do not use envtest and do not set up a Kind cluster in unit tests. If you have common setup for variations of a function, consider using table-driven tests. See this as an example. An individual test should only test one and only one thing. Do not try and test multiple variants in a single test. Either use table-driven tests or write individual tests for each variation. If a function/component has multiple steps, its probably better to split/refactor it into multiple functions/components that can be unit tested individually. If there are a lot of edge cases, extract dedicated functions that cover them and use unit tests to test them. Running Unit Tests NOTE: For unit tests we are currently transitioning away from ginkgo to using golang native tests. The make test-unit target runs both ginkgo and golang native tests. Once the transition is complete this target will be simplified.\n Run all unit tests\n\u003e make test-unit Run unit tests of specific packages:\n# if you have not already installed gotestfmt tool then install it once. # make test-unit target automatically installs this in ./hack/tools/bin. 
You can alternatively point the GOBIN to this directory and then directly invoke test-go.sh \u003e go install github.com/gotesttools/gotestfmt/v2/cmd/gotestfmt@v2.5.0 \u003e ./hack/test-go.sh \u003cpackage-1\u003e \u003cpackage-2\u003e De-flaking Unit Tests If tests have sporadic failures, then try running ./hack/stress-test.sh which internally uses the stress tool.\n# install the stress tool \u003e go install golang.org/x/tools/cmd/stress@latest # invoke the helper script to execute the stress test \u003e ./hack/stress-test.sh test-package=\u003ctest-package\u003e test-func=\u003ctest-function\u003e tool-params=\"\u003ctool-params\u003e\" An example invocation:\n\u003e ./hack/stress-test.sh test-package=./internal/utils test-func=TestRunConcurrentlyWithAllSuccessfulTasks tool-params=\"-p 10\" 5s: 877 runs so far, 0 failures 10s: 1906 runs so far, 0 failures 15s: 2885 runs so far, 0 failures ... The stress tool will output a path to a file containing the full failure message when a test run fails.\nIntegration Tests (envtests) Integration tests in etcd-druid use envtest. It sets up a minimal temporary control plane (etcd + kube-apiserver) and runs the test against it. Test suites (group of tests) start their individual envtest environment before running the tests for the respective controller/webhook. Before exiting, the temporary test environment is shut down.\n NOTE: For integration-tests we are currently transitioning away from ginkgo to using golang native tests. All ginkgo integration tests can be found here and golang native integration tests can be found here.\n Integration tests in etcd-druid only target a single controller. It is therefore advised that code (other than common utility functions) should not be shared between any two controllers. If you are sharing a common envtest environment across tests then it is recommended that an individual test is run in a dedicated namespace. Since envtest is used to set up a minimal environment where no controller (e.g. 
KCM, Scheduler) other than etcd and kube-apiserver runs, status updates to resources controlled/reconciled by not-deployed-controllers will not happen. Tests should refrain from asserting changes to status. In case status needs to be set as part of a test setup then it must be done explicitly. If you have common setup and teardown, then consider using TestMain -example. If you have to wait for resources to be provisioned or reach a specific state, then it is recommended that you create smaller assertion functions and use Gomega’s AsyncAssertion functions - example. Beware of the default Eventually / Consistently timeouts / poll intervals: docs. Don’t forget to call {Eventually,Consistently}.Should(), otherwise the assertion always silently succeeds without errors: onsi/gomega#561 Running Integration Tests \u003e make test-integration Debugging Integration Tests There are two ways in which you can debug Integration Tests:\nUsing IDE All commonly used IDEs provide built-in or easy integration with the delve debugger. For debugging integration tests the only additional requirement is to set the KUBEBUILDER_ASSETS environment variable. You can get the value of this environment variable by executing the following command:\n# ENVTEST_K8S_VERSION is the k8s version that you wish to use for testing. \u003e setup-envtest --os $(go env GOOS) --arch $(go env GOARCH) use $ENVTEST_K8S_VERSION -p path NOTE: All integration tests usually have a timeout. If you wish to debug a failing integration-test then increase the timeouts.\n Use standalone envtest We also provide a capability to set up a stand-alone envtest and leverage the cluster to run individual integration tests. 
This allows you more control over when this k8s control plane is destroyed and allows you to inspect the resources at the end of the integration-test run using kubectl.\n NOTE: While you can use an existing cluster (e.g., kind), some test suites expect that no controllers and no nodes are running in the test environment (as is the case in envtest test environments). Hence, using a full-blown cluster with controllers and nodes might sometimes be impractical, as you would need to stop cluster components for the tests to work.\n To set up a standalone envtest and run an integration test against it, do the following:\n# In a terminal session use the following make target to set up a standalone envtest \u003e make start-envtest # As part of the output, the path to the kubeconfig will also be printed on the console. # In another terminal session set up a watch on the resource(s): \u003e kubectl get po -A -w # alternatively you can also use the `watch -d \u003ccommand\u003e` utility. # In another terminal session: \u003e export KUBECONFIG=\u003cenvtest-kubeconfig-path\u003e \u003e export USE_EXISTING_K8S_CLUSTER=true # run the test \u003e go test -run=\"\u003cregex-for-test\u003e\" \u003cpackage\u003e # example: go test -run=\"^TestEtcdDeletion/test deletion of all*\" ./test/it/controller/etcd Once you are done testing, you can press Ctrl+C in the terminal session where you started envtest. This will shut down the kubernetes control plane.\nEnd-To-End (e2e) Tests End-To-End tests are run using a Kind cluster and Skaffold. 
These tests provide a high level of confidence that the code runs as expected by users when deployed to production.\n The purpose of running these tests is to be able to catch bugs which result from interaction amongst different components within etcd-druid.\n In CI pipelines e2e tests are run with S3 compatible LocalStack (in cases where backup functionality has been enabled for an etcd cluster).\n In future we will only be using a file-system based local provider to reduce the run times for the e2e tests when run in a CI pipeline.\n e2e tests can be triggered either with other cloud provider object-store emulators or they can also be run against actual/remote cloud provider object-store services.\n In contrast to integration tests, in e2e tests, it might make sense to specify higher timeouts for Gomega’s AsyncAssertion calls.\n Running e2e tests locally Detailed instructions on how to run e2e tests can be found here.\n","categories":"","description":"","excerpt":"Testing Strategy and Developer Guideline Intent of this document is to …","ref":"/docs/other-components/etcd-druid/testing/","tags":"","title":"Testing"},{"body":"Dependency management We use golang modules to manage golang dependencies. In order to add a new package dependency to the project, you can run go get \u003cPACKAGE\u003e@\u003cVERSION\u003e or edit the go.mod file and append the package along with the version you want to use.\nUpdating dependencies The Makefile contains a rule called tidy which performs go mod tidy.\ngo mod tidy makes sure go.mod matches the source code in the module. 
It adds any missing modules necessary to build the current module’s packages and dependencies, and it removes unused modules that don’t provide any relevant packages.\n$ make tidy The dependencies are installed into the go mod cache folder.\n⚠️ Make sure you test the code after you have updated the dependencies!\n","categories":"","description":"","excerpt":"Dependency management We use golang modules to manage golang …","ref":"/docs/other-components/machine-controller-manager/testing_and_dependencies/","tags":"","title":"Testing And Dependencies"},{"body":"Test Machinery Tests In order to automatically qualify Gardener releases, we execute a set of end-to-end tests using Test Machinery. This requires a full Gardener installation including infrastructure extensions, as well as a setup of Test Machinery itself. These tests operate on Shoot clusters across different Cloud Providers, using different supported Kubernetes versions and various configuration options (huge test matrix).\nThis manual gives an overview about test machinery tests in Gardener.\n Structure Add a new test Test Labels Framework Container Images Structure Gardener test machinery tests are split into two test suites that can be found under test/testmachinery/suites:\n The Gardener Test Suite contains all tests that only require a running gardener instance. The Shoot Test Suite contains all tests that require a predefined running shoot cluster. The corresponding tests of a test suite are defined in the import statement of the suite definition (see shoot/run_suite_test.go) and their source code can be found under test/testmachinery.\nThe test directory is structured as follows:\ntest ├── e2e # end-to-end tests (using provider-local) │ ├── gardener │ │ ├── seed │ │ ├── shoot | | └── ... 
| └──operator ├── framework # helper code shared across integration, e2e and testmachinery tests ├── integration # integration tests (envtests) │ ├── controllermanager │ ├── envtest │ ├── resourcemanager │ ├── scheduler │ └── ... └── testmachinery # test machinery tests ├── gardener # actual test cases imported by suites/gardener │ └── security ├── shoots # actual test cases imported by suites/shoot │ ├── applications │ ├── care │ ├── logging │ ├── operatingsystem │ ├── operations │ └── vpntunnel ├── suites # suites that run against a running garden or shoot cluster │ ├── gardener │ └── shoot └── system # suites that are used for building a full test flow ├── complete_reconcile ├── managed_seed_creation ├── managed_seed_deletion ├── shoot_cp_migration ├── shoot_creation ├── shoot_deletion ├── shoot_hibernation ├── shoot_hibernation_wakeup └── shoot_update A suite can be executed by running the suite definition with ginkgo’s focus and skip flags to control the execution of specific labeled tests. See the example below:\ngo test -timeout=0 ./test/testmachinery/suites/shoot \\ --v -ginkgo.v -ginkgo.show-node-events -ginkgo.no-color \\ --report-file=/tmp/report.json \\ # write elasticsearch formatted output to a file --disable-dump=false \\ # disables dumping of the current state if a test fails -kubecfg=/path/to/gardener/kubeconfig \\ -shoot-name=\u003cshoot-name\u003e \\ # Name of the shoot to test -project-namespace=\u003cgardener project namespace\u003e \\ # Name of the gardener project the test shoot resides in -ginkgo.focus=\"\\[RELEASE\\]\" \\ # Run all tests that are tagged as release -ginkgo.skip=\"\\[SERIAL\\]|\\[DISRUPTIVE\\]\" # Exclude all tests that are tagged SERIAL or DISRUPTIVE Add a New Test To add a new test the framework requires the following steps (step 1. and 2. can be skipped if the test is added to an existing package):\n Create a new test file e.g. 
test/testmachinery/shoot/security/my-sec-test.go Import the test into the appropriate test suite (gardener or shoot): import _ \"github.com/gardener/gardener/test/testmachinery/shoot/security\" Define your test with the testframework. The framework will automatically add its initialization, cleanup and dump functions. var _ = ginkgo.Describe(\"my suite\", func(){ f := framework.NewShootFramework(nil) f.Beta().CIt(\"my first test\", func(ctx context.Context) { f.ShootClient.Get(xx) // testing ... }) }) The newly created test can be tested by focusing the test with the default ginkgo focus f.Beta().FCIt(\"my first test\", func(ctx context.Context) and running the shoot test suite with:\ngo test -timeout=0 ./test/testmachinery/suites/shoot \\ --v -ginkgo.v -ginkgo.show-node-events -ginkgo.no-color \\ --report-file=/tmp/report.json \\ # write elasticsearch formatted output to a file --disable-dump=false \\ # disables dumping of the current state if a test fails -kubecfg=/path/to/gardener/kubeconfig \\ -shoot-name=\u003cshoot-name\u003e \\ # Name of the shoot to test -project-namespace=\u003cgardener project namespace\u003e \\ -fenced=\u003ctrue|false\u003e # Tested shoot is running in a fenced environment and cannot be reached by gardener or for the gardener suite with:\ngo test -timeout=0 ./test/testmachinery/suites/gardener \\ --v -ginkgo.v -ginkgo.show-node-events -ginkgo.no-color \\ --report-file=/tmp/report.json \\ # write elasticsearch formatted output to a file --disable-dump=false \\ # disables dumping of the current state if a test fails -kubecfg=/path/to/gardener/kubeconfig \\ -project-namespace=\u003cgardener project namespace\u003e ⚠️ Make sure that you do not commit any focused specs as this feature is only intended for local development! 
Ginkgo will fail the test suite if there are any focused specs.\nAlternatively, a test can be triggered by specifying a ginkgo focus regex with the name of the test e.g.\ngo test -timeout=0 ./test/testmachinery/suites/gardener \\ --v -ginkgo.v -ginkgo.show-node-events -ginkgo.no-color \\ --report-file=/tmp/report.json \\ # write elasticsearch formatted output to a file -kubecfg=/path/to/gardener/kubeconfig \\ -project-namespace=\u003cgardener project namespace\u003e \\ -ginkgo.focus=\"my first test\" # regex to match test cases Test Labels Every test should be labeled by using the predefined labels available with every framework to have consistent labeling across all test machinery tests.\nThe labels are applied to every new It()/CIt() definition by:\nf := framework.NewCommonFramework() f.Default().Serial().It(\"my test\") =\u003e \"[DEFAULT] [SERIAL] my test\" f := framework.NewShootFramework() f.Default().Serial().It(\"my test\") =\u003e \"[DEFAULT] [SERIAL] [SHOOT] my test\" f := framework.NewGardenerFramework() f.Default().Serial().It(\"my test\") =\u003e \"[DEFAULT] [GARDENER] [SERIAL] my test\" Labels:\n Beta: Newly created tests with no experience on stableness should be first labeled as beta tests. They should be watched (and probably improved) until stable enough to be promoted to Default. Default: Tests that were Beta before and proved to be stable are promoted to Default eventually. Default tests run more often, produce alerts and are considered during the release decision although they don’t necessarily block a release. Release: Test are release relevant. A failing Release test blocks the release pipeline. Therefore, these tests need to be stable. Only tests proven to be stable will eventually be promoted to Release. Behavior Labels:\n Serial: The test should always be executed in serial with no other tests running, as it may impact other tests. Destructive: The test is destructive. 
This means that it runs with no other tests and may break Gardener or the shoot. Only create such tests if really necessary, as the execution will be expensive (neither Gardener nor the shoot can be reused in this case for other tests). Framework The framework directory contains all the necessary functions / utilities for running test machinery tests. For example, there are methods for creation/deletion of shoots, waiting for shoot deletion/creation, downloading/installing/deploying helm charts, logging, etc.\nThe framework itself consists of 3 different frameworks that expect different prerequisites and offer context specific functionality.\n CommonFramework: The common framework is the base framework that handles logging and setup of commonly needed resources like helm. It also contains common functions for interacting with Kubernetes clusters like Waiting for resources to be ready or Exec into a running pod. GardenerFramework: Contains all functions of the common framework and expects a running Gardener instance with the provided Gardener kubeconfig and a project namespace. It also contains functions to interact with gardener like Waiting for a shoot to be reconciled or Patch a shoot or Get a seed. ShootFramework: Contains all functions of the common and the gardener framework. It expects a running shoot cluster defined by the shoot’s name and namespace (project namespace). This framework contains functions to directly interact with the specific shoot. The whole framework also includes commonly used checks, ginkgo wrapper, etc., as well as commonly used tests. These common application tests (like the guestbook test) can be used within multiple tests to have a default application (with ingress, deployment, stateful backend) to test external factors.\nConfig\nEvery framework commandline flag can also be defined by a configuration file (the value of the configuration file is only used if a flag is not specified by commandline). 
The test suite searches for a configuration file (yaml is preferred) if the command line flag --config=/path/to/config/file is provided. A flag can be defined in the configuration file by using its name as a root key, e.g.\nverbose: debug kubecfg: /kubeconfig/path project-namespace: garden-it Report\nThe framework automatically writes the ginkgo default report to stdout and a specifically structured elasticsearch bulk report file to a specified location. The elasticsearch bulk report contains one JSON document per test case and includes the metadata of the whole test suite. An example document for one test case looks like the following:\n{ \"suite\": { \"name\": \"Shoot Test Suite\", \"phase\": \"Succeeded\", \"tests\": 3, \"failures\": 1, \"errors\": 0, \"time\": 87.427 }, \"name\": \"Shoot application testing [DEFAULT] [RELEASE] [SHOOT] should download shoot kubeconfig successfully\", \"shortName\": \"should download shoot kubeconfig successfully\", \"labels\": [ \"DEFAULT\", \"RELEASE\", \"SHOOT\" ], \"phase\": \"Succeeded\", \"time\": 0.724512057 } Resources\nThe resources directory contains templates used by the tests.\nresources └── templates ├── guestbook-app.yaml.tpl └── logger-app.yaml.tpl System Tests This directory contains the system tests that have a special meaning for the testmachinery with their own Test Definition. 
Currently, these system tests consist of:\n Shoot creation Shoot deletion Shoot Kubernetes update Gardener Full reconcile check Shoot Creation Test The Create Shoot test is meant to test shoot creation.\nExample Run\ngo test -timeout=0 ./test/testmachinery/system/shoot_creation \\ --v -ginkgo.v -ginkgo.show-node-events \\ -kubecfg=$HOME/.kube/config \\ -shoot-name=$SHOOT_NAME \\ -cloud-profile-name=$CLOUDPROFILE \\ -seed=$SEED \\ -secret-binding=$SECRET_BINDING \\ -provider-type=$PROVIDER_TYPE \\ -region=$REGION \\ -k8s-version=$K8S_VERSION \\ -project-namespace=$PROJECT_NAMESPACE \\ -annotations=$SHOOT_ANNOTATIONS \\ -infrastructure-provider-config-filepath=$INFRASTRUCTURE_PROVIDER_CONFIG_FILEPATH \\ -controlplane-provider-config-filepath=$CONTROLPLANE_PROVIDER_CONFIG_FILEPATH \\ -workers-config-filepath=$WORKERS_CONFIG_FILEPATH \\ -worker-zone=$ZONE \\ -networking-pods=$NETWORKING_PODS \\ -networking-services=$NETWORKING_SERVICES \\ -networking-nodes=$NETWORKING_NODES \\ -start-hibernated=$START_HIBERNATED Shoot Deletion Test The Delete Shoot test is meant to test the deletion of a shoot.\nExample Run\ngo test -timeout=0 -ginkgo.v -ginkgo.show-node-events \\ ./test/testmachinery/system/shoot_deletion \\ -kubecfg=$HOME/.kube/config \\ -shoot-name=$SHOOT_NAME \\ -project-namespace=$PROJECT_NAMESPACE Shoot Update Test The Update Shoot test is meant to test the Kubernetes version update of an existing shoot. If no specific version is provided, the next patch version is automatically selected. 
If there is no available newer version, this test is a noop.\nExample Run\ngo test -timeout=0 ./test/testmachinery/system/shoot_update \\ --v -ginkgo.v -ginkgo.show-node-events \\ -kubecfg=$HOME/.kube/config \\ -shoot-name=$SHOOT_NAME \\ -project-namespace=$PROJECT_NAMESPACE \\ -version=$K8S_VERSION Gardener Full Reconcile Test The Gardener Full Reconcile test is meant to test if all shoots of a Gardener instance are successfully reconciled.\nExample Run\ngo test -timeout=0 ./test/testmachinery/system/complete_reconcile \\ --v -ginkgo.v -ginkgo.show-node-events \\ -kubecfg=$HOME/.kube/config \\ -project-namespace=$PROJECT_NAMESPACE \\ -gardenerVersion=$GARDENER_VERSION # needed to validate the last acted gardener version of a shoot Container Images Test machinery tests usually deploy a workload to the Shoot cluster as part of the test execution. When introducing a new container image, consider the following:\n Make sure the container image is multi-arch. Tests are executed against amd64 and arm64 based worker Nodes. Do not use container images from Docker Hub. Docker Hub has rate limiting (see Download rate limit). For anonymous users, the rate limit is set to 100 pulls per 6 hours per IP address. In some fenced environments the network setup can be such that all egress connections are issued from single IP (or set of IPs). In such scenarios the allowed rate limit can be exhausted too fast. See https://github.com/gardener/gardener/issues/4160. Docker Hub registry doesn’t support pulling images over IPv6 (see Beta IPv6 Support on Docker Hub Registry). Avoid manually copying Docker Hub images to Gardener GCR (europe-docker.pkg.dev/gardener-project/releases/3rd/). Use the existing prow job for this (see Copy Images). If possible, use a Kubernetes e2e image (registry.k8s.io/e2e-test-images/\u003cimage-name\u003e). In some cases, there is already a Kubernetes e2e image alternative of the Docker Hub image. 
For example, use registry.k8s.io/e2e-test-images/busybox instead of europe-docker.pkg.dev/gardener-project/releases/3rd/busybox or docker.io/busybox. Kubernetes has multiple test images - see https://github.com/kubernetes/kubernetes/tree/v1.27.0/test/images. agnhost is the most widely used image in Kubernetes e2e tests. It contains multiple testing related binaries inside such as pause, logs-generator, serve-hostname, webhook and others. See all of them in the agnhost’s README.md. The list of available Kubernetes e2e images and tags can be checked in this page. ","categories":"","description":"","excerpt":"Test Machinery Tests In order to automatically qualify Gardener …","ref":"/docs/gardener/testmachinery_tests/","tags":"","title":"Testmachinery Tests"},{"body":"Topology-Aware Traffic Routing Motivation The enablement of highly available shoot control-planes requires multi-zone seed clusters. A garden runtime cluster can also be a multi-zone cluster. The topology-aware routing is introduced to reduce costs and to improve network performance by avoiding the cross availability zone traffic, if possible. The cross availability zone traffic is charged by the cloud providers and it comes with higher latency compared to the traffic within the same zone. The topology-aware routing feature enables topology-aware routing for Services deployed in a seed or garden runtime cluster. For the clients consuming these topology-aware services, kube-proxy favors the endpoints which are located in the same zone where the traffic originated from. In this way, the cross availability zone traffic is avoided.\nHow it works The topology-aware routing feature relies on the Kubernetes feature TopologyAwareHints.\nEndpointSlice Hints Mutating Webhook The component that is responsible for providing hints in the EndpointSlices resources is the kube-controller-manager, in particular this is the EndpointSlice controller. 
However, there are several drawbacks with the TopologyAwareHints feature that don’t allow us to use it in its native way:\n The algorithm in the EndpointSlice controller is based on a CPU-balance heuristic. From the TopologyAwareHints documentation:\n The controller allocates a proportional amount of endpoints to each zone. This proportion is based on the allocatable CPU cores for nodes running in that zone. For example, if one zone had 2 CPU cores and another zone only had 1 CPU core, the controller would allocate twice as many endpoints to the zone with 2 CPU cores.\n In case it is not possible to achieve a balanced distribution of the endpoints, as a safeguard mechanism the controller removes hints from the EndpointSlice resource. In our setup, the clients and the servers are well-known and usually the traffic a component receives does not depend on the zone’s allocatable CPU. Many components deployed by Gardener are scaled automatically by VPA. In case of an overload of a replica, the VPA should provide and apply enhanced CPU and memory resources. Additionally, Gardener uses the cluster-autoscaler to upscale/downscale Nodes dynamically. Hence, it is not possible to ensure a balanced allocatable CPU across the zones.\n The TopologyAwareHints feature does not work at low-endpoint counts. It falls apart for a Service with less than 10 Endpoints.\n Hints provided by the EndpointSlice controller are not deterministic. With cluster-autoscaler running and load increasing, hints can be removed in the next moment. There is no option to enforce the zone-level topology.\n For more details, see the following issue kubernetes/kubernetes#113731.\nTo circumvent these issues with the EndpointSlice controller, a mutating webhook in the gardener-resource-manager assigns hints to EndpointSlice resources. For each endpoint in the EndpointSlice, it sets the endpoint’s hints to the endpoint’s zone. 
The webhook overwrites the hints provided by the EndpointSlice controller in kube-controller-manager. For more details, see the webhook’s documentation.\nkube-proxy By default, with kube-proxy running in iptables mode, traffic is distributed randomly across all endpoints, regardless of where it originates from. In a cluster with 3 zones, traffic is more likely to go to another zone than to stay in the current zone. With the topology-aware routing feature, kube-proxy filters the endpoints it routes to based on the hints in the EndpointSlice resource. In most of the cases, kube-proxy will prefer the endpoint(s) in the same zone. For more details, see the Kubernetes documentation.\nHow to make a Service topology-aware? To make a Service topology-aware, the following annotation and label have to be added to the Service:\napiVersion: v1 kind: Service metadata: annotations: service.kubernetes.io/topology-aware-hints: \"auto\" labels: endpoint-slice-hints.resources.gardener.cloud/consider: \"true\" Note: In Kubernetes 1.27 the service.kubernetes.io/topology-aware-hints=auto annotation is deprecated in favor of the newly introduced service.kubernetes.io/topology-mode=auto. When the runtime cluster’s K8s version is \u003e= 1.27, use the service.kubernetes.io/topology-mode=auto annotation. For more details, see the corresponding upstream PR.\n The service.kubernetes.io/topology-aware-hints=auto annotation is needed for kube-proxy. One of the prerequisites on kube-proxy side for using topology-aware routing is the corresponding Service to be annotated with the service.kubernetes.io/topology-aware-hints=auto. For more details, see the following kube-proxy function. 
The endpoint-slice-hints.resources.gardener.cloud/consider=true label is needed so that the EndpointSlice hints mutating webhook in gardener-resource-manager selects only the EndpointSlice resources that carry the consider label, rather than all EndpointSlice resources.\nThe Gardener extensions can use this approach to make a Service they deploy topology-aware.\nPrerequisites for making a Service topology-aware:\n The Pods backing the Service should be spread on most of the available zones. This constraint should be ensured with appropriate scheduling constraints (topology spread constraints, (anti-)affinity). Enabling the feature for a Service with a single backing Pod or Pods all located in the same zone does not lead to a benefit. The component should be scaled up by VerticalPodAutoscaler. In case of an overload (a large portion of the traffic is originating from a given zone), the VerticalPodAutoscaler should provide better resource recommendations for the overloaded backing Pods. Consider the TopologyAwareHints constraints. Note: The topology-aware routing feature is considered an alpha feature. Use it only for evaluation purposes.\n Topology-aware Services in the Seed cluster etcd-main-client and etcd-events-client The etcd-main-client and etcd-events-client Services are topology-aware. They are consumed by the kube-apiserver.\nkube-apiserver The kube-apiserver Service is topology-aware. It is consumed by the controllers running in the Shoot control plane.\n Note: The istio-ingressgateway component routes traffic in topology-aware manner - if possible, it routes traffic to the target kube-apiserver Pods in the same zone. If there is no healthy kube-apiserver Pod available in the same zone, the traffic is routed to any of the healthy Pods in the other zones. This behaviour is unconditionally enabled.\n gardener-resource-manager The gardener-resource-manager Service that is part of the Shoot control plane is topology-aware. 
The resource-manager serves webhooks and the Service is consumed by the kube-apiserver for the webhook communication.\nvpa-webhook The vpa-webhook Service that is part of the Shoot control plane is topology-aware. It is consumed by the kube-apiserver for the webhook communication.\nTopology-aware Services in the garden runtime cluster virtual-garden-etcd-main-client and virtual-garden-etcd-events-client The virtual-garden-etcd-main-client and virtual-garden-etcd-events-client Services are topology-aware. virtual-garden-etcd-main-client is consumed by virtual-garden-kube-apiserver and gardener-apiserver, virtual-garden-etcd-events-client is consumed by virtual-garden-kube-apiserver.\nvirtual-garden-kube-apiserver The virtual-garden-kube-apiserver Service is topology-aware. It is consumed by virtual-garden-kube-controller-manager, gardener-controller-manager, gardener-scheduler, gardener-admission-controller, extension admission components, gardener-dashboard and other components.\n Note: Unlike the other Services, the virtual-garden-kube-apiserver Service is of type LoadBalancer. In-cluster components consuming the virtual-garden-kube-apiserver Service by its Service name will have benefit from the topology-aware routing. However, the TopologyAwareHints feature cannot help with external traffic routed to load balancer’s address - such traffic won’t be routed in a topology-aware manner and will be routed according to the cloud-provider specific implementation.\n gardener-apiserver The gardener-apiserver Service is topology-aware. It is consumed by virtual-garden-kube-apiserver. The aggregation layer in virtual-garden-kube-apiserver proxies requests sent for the Gardener API types to the gardener-apiserver.\ngardener-admission-controller The gardener-admission-controller Service is topology-aware. It is consumed by virtual-garden-kube-apiserver and gardener-apiserver for the webhook communication.\nHow to enable the topology-aware routing for a Seed cluster? 
For a Seed cluster the topology-aware routing functionality can be enabled in the Seed specification:\napiVersion: core.gardener.cloud/v1beta1 kind: Seed # ... spec: settings: topologyAwareRouting: enabled: true The topology-aware routing setting can only be enabled for a Seed cluster with more than one zone. gardenlet enables topology-aware Services only for Shoot control planes with failure tolerance type zone (.spec.controlPlane.highAvailability.failureTolerance.type=zone). Control plane Pods of non-HA Shoots and HA Shoots with failure tolerance type node are pinned to a single zone. For more details, see High Availability Of Deployed Components.\nHow to enable the topology-aware routing for a garden runtime cluster? For a garden runtime cluster the topology-aware routing functionality can be enabled in the Garden resource specification:\napiVersion: operator.gardener.cloud/v1alpha1 kind: Garden # ... spec: runtimeCluster: settings: topologyAwareRouting: enabled: true The topology-aware routing setting can only be enabled for a garden runtime cluster with more than one zone.\n","categories":"","description":"","excerpt":"Topology-Aware Traffic Routing Motivation The enablement of highly …","ref":"/docs/gardener/topology_aware_routing/","tags":"","title":"Topology Aware Routing"},{"body":"Trigger Shoot Operations Through Annotations You can trigger a few explicit operations by annotating the Shoot with an operation annotation. This might allow you to induce certain behavior without the need to change the Shoot specification. Some of the operations cannot be triggered by changing the shoot specification because they can’t properly be reflected there. 
Note that once the triggered operation is considered by the controllers, the annotation will be automatically removed and you have to add it each time you want to trigger the operation.\nPlease note: If .spec.maintenance.confineSpecUpdateRollout=true, then the only way to trigger a shoot reconciliation is by setting the reconcile operation, see below.\nImmediate Reconciliation Annotate the shoot with gardener.cloud/operation=reconcile to make the gardenlet start a reconciliation operation without changing the shoot spec and possibly without being in its maintenance time window:\nkubectl -n garden-\u003cproject-name\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=reconcile Immediate Maintenance Annotate the shoot with gardener.cloud/operation=maintain to make the gardener-controller-manager start maintaining your shoot immediately (possibly without being in its maintenance time window). If no reconciliation starts, then nothing needs to be maintained:\nkubectl -n garden-\u003cproject-name\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=maintain Retry Failed Reconciliation Annotate the shoot with gardener.cloud/operation=retry to make the gardenlet start a new reconciliation loop on a failed shoot. Failed shoots are only reconciled again if a new Gardener version is deployed, the shoot specification is changed or this annotation is set:\nkubectl -n garden-\u003cproject-name\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=retry Credentials Rotation Operations Please consult Credentials Rotation for Shoot Clusters for more information.\nRestart systemd Services on Particular Worker Nodes It is possible to make Gardener restart particular systemd services on your shoot worker nodes if needed. The annotation is not set on the Shoot resource but directly on the Node object you want to target. 
For example, the following will restart both the kubelet and the containerd services:\nkubectl annotate node \u003cnode-name\u003e worker.gardener.cloud/restart-systemd-services=kubelet,containerd It may take up to a minute until the service is restarted. The annotation will be removed from the Node object after all specified systemd services have been restarted. It will also be removed even if the restart of one or more services failed.\n ℹ️ In the example mentioned above, you could additionally verify when/whether the kubelet restarted by using kubectl describe node \u003cnode-name\u003e and looking for such a Starting kubelet event.\n Force Deletion When the ShootForceDeletion feature gate in the gardener-apiserver is enabled, users will be able to force-delete the Shoot. This is only possible if the Shoot fails to be deleted normally. For forceful deletion, the following conditions must be met:\n Shoot has a deletion timestamp. Shoot status contains at least one of the following ErrorCodes: ERR_CLEANUP_CLUSTER_RESOURCES ERR_CONFIGURATION_PROBLEM ERR_INFRA_DEPENDENCIES ERR_INFRA_UNAUTHENTICATED ERR_INFRA_UNAUTHORIZED If the above conditions are satisfied, you can annotate the Shoot with confirmation.gardener.cloud/force-deletion=true, and Gardener will cleanup the Shoot controlplane and the Shoot metadata.\n ⚠️ You MUST ensure that all the resources created in the IaaS account are cleaned up to prevent orphaned resources. Gardener will NOT delete any resources in the underlying infrastructure account. 
Hence, use this annotation at your own risk and only if you are fully aware of these consequences.\n ","categories":"","description":"","excerpt":"Trigger Shoot Operations Through Annotations You can trigger a few …","ref":"/docs/gardener/shoot_operations/","tags":"","title":"Trigger Shoot Operations Through Annotations"},{"body":"Trusted TLS Certificate for Shoot Control Planes Shoot clusters are composed of several control plane components deployed by Gardener and its corresponding extensions.\nSome components are exposed via Ingress resources, which make them addressable under the HTTPS protocol.\nExamples:\n Alertmanager Plutono Prometheus Gardener generates the backing TLS certificates, which are signed by the shoot cluster’s CA by default (self-signed).\nUnlike with a self-contained Kubeconfig file, common internet browsers or operating systems don’t trust a shoot’s cluster CA and adding it as a trusted root is often undesired in enterprise environments.\nTherefore, Gardener operators can predefine trusted wildcard certificates under which the mentioned endpoints will be served instead.\nRegister a trusted wildcard certificate Since control plane components are published under the ingress domain (core.gardener.cloud/v1beta1.Seed.spec.ingress.domain) a wildcard certificate is required.\nFor example:\n Seed ingress domain: dev.my-seed.example.com CN or SAN for a certificate: *.dev.my-seed.example.com A wildcard certificate matches exactly one seed. 
It must be deployed as part of your landscape setup as a Kubernetes Secret inside the garden namespace of the corresponding seed cluster.\nPlease ensure that the secret has the gardener.cloud/role label shown below:\napiVersion: v1 data: ca.crt: base64-encoded-ca.crt tls.crt: base64-encoded-tls.crt tls.key: base64-encoded-tls.key kind: Secret metadata: labels: gardener.cloud/role: controlplane-cert name: seed-ingress-certificate namespace: garden type: Opaque Gardener copies the secret during the reconciliation of shoot clusters to the shoot namespace in the seed. Afterwards, the Ingress resources in that namespace for the mentioned components will refer to the wildcard certificate.\nBest Practice While it is possible to create the wildcard certificates manually and deploy them to seed clusters, it is recommended to let certificate management components do this job. Often, a seed cluster is also a shoot cluster at the same time (ManagedSeed) and might already provide a certificate service extension. 
Otherwise, a Gardener operator may use solutions like Cert-Management or Cert-Manager.\n","categories":"","description":"","excerpt":"Trusted TLS Certificate for Shoot Control Planes Shoot clusters are …","ref":"/docs/gardener/trusted-tls-for-control-planes/","tags":"","title":"Trusted Tls For Control Planes"},{"body":"Trusted TLS Certificate for Garden Runtime Cluster In Garden Runtime Cluster components are exposed via Ingress resources, which make them addressable under the HTTPS protocol.\nExamples:\n Plutono Gardener generates the backing TLS certificates, which are signed by the garden runtime cluster’s CA by default (self-signed).\nUnlike with a self-contained Kubeconfig file, common internet browsers or operating systems don’t trust a garden runtime’s cluster CA and adding it as a trusted root is often undesired in enterprise environments.\nTherefore, Gardener operators can predefine a trusted wildcard certificate under which the mentioned endpoints will be served instead.\nRegister a trusted wildcard certificate Since Garden Runtime Cluster components are published under the ingress domain (operator.gardener.cloud/v1alpha1.Garden.spec.runtimeCluster.ingress.domain) a wildcard certificate is required.\nFor example:\n Garden Runtime cluster ingress domain: dev.my-garden.example.com CN or SAN for a certificate: *.dev.my-garden.example.com It must be deployed as part of your landscape setup as a Kubernetes Secret inside the garden namespace of the garden runtime cluster.\nPlease ensure that the secret has the gardener.cloud/role label shown below:\napiVersion: v1 data: ca.crt: base64-encoded-ca.crt tls.crt: base64-encoded-tls.crt tls.key: base64-encoded-tls.key kind: Secret metadata: labels: gardener.cloud/role: controlplane-cert name: garden-ingress-certificate namespace: garden type: Opaque Best Practice While it is possible to create the wildcard certificate manually and deploy it to the cluster, it is recommended to let certificate management components 
(e.g. gardener/cert-management) do this job.\n","categories":"","description":"","excerpt":"Trusted TLS Certificate for Garden Runtime Cluster In Garden Runtime …","ref":"/docs/gardener/trusted-tls-for-garden-runtime/","tags":"","title":"Trusted Tls For Garden Runtime"},{"body":"Gardener Extension for Ubuntu OS \nThis controller operates on the OperatingSystemConfig resource in the extensions.gardener.cloud/v1alpha1 API group. It manages those objects that are requesting Ubuntu OS configuration (.spec.type=ubuntu). An experimental support for Ubuntu Pro is added (.spec.type=ubuntu-pro):\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: ubuntu units: ... files: ... Please find a concrete example in the example folder.\nAfter reconciliation the resulting data will be stored in a secret within the same namespace (as the config itself might contain confidential data). The name of the secret will be written into the resource’s .status field:\n... status: ... cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default command: /usr/bin/env bash \u003cpath\u003e units: - docker-monitor.service - kubelet-monitor.service - kubelet.service The secret has one data key cloud_config that stores the generation.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file. Static code checks and tests can be executed by running make verify. 
We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for the Ubuntu operating system","excerpt":"Gardener extension controller for the Ubuntu operating system","ref":"/docs/extensions/os-extensions/gardener-extension-os-ubuntu/","tags":"","title":"Ubuntu OS"},{"body":"Using the Alicloud provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.\nThis document describes the configurable options for Alicloud and provides an example Shoot manifest with minimal configuration that can be used to create an Alicloud cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nAlicloud Provider Credentials In order for Gardener to create a Kubernetes cluster using Alicloud infrastructure components, a Shoot has to provide credentials with sufficient permissions to the desired Alicloud project. 
Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of the Alicloud project.\nThis Secret must look as follows:\napiVersion: v1 kind: Secret metadata: name: core-alicloud namespace: garden-dev type: Opaque data: accessKeyID: base64(access-key-id) accessKeySecret: base64(access-key-secret) The SecretBinding/CredentialsBinding is configurable in the Shoot cluster with the field secretBindingName/credentialsBindingName.\nThe required credentials for the Alicloud project are an AccessKey Pair associated with a Resource Access Management (RAM) User. A RAM user is a special account that can be used by services and applications to interact with Alicloud Cloud Platform APIs. Applications can use AccessKey pair to authorize themselves to a set of APIs and perform actions within the permissions granted to the RAM user.\nMake sure to create a Resource Access Management User, and create an AccessKey Pair that shall be used for the Shoot cluster.\nPermissions Please make sure the provided credentials have the correct privileges. You can use the following Alicloud RAM policy document and attach it to the RAM user backed by the credentials you provided.\n Click to expand the Alicloud RAM policy document! 
{ \"Statement\": [ { \"Action\": [ \"vpc:*\" ], \"Effect\": \"Allow\", \"Resource\": [ \"*\" ] }, { \"Action\": [ \"ecs:*\" ], \"Effect\": \"Allow\", \"Resource\": [ \"*\" ] }, { \"Action\": [ \"slb:*\" ], \"Effect\": \"Allow\", \"Resource\": [ \"*\" ] }, { \"Action\": [ \"ram:GetRole\", \"ram:CreateRole\", \"ram:CreateServiceLinkedRole\" ], \"Effect\": \"Allow\", \"Resource\": [ \"*\" ] }, { \"Action\": [ \"ros:*\" ], \"Effect\": \"Allow\", \"Resource\": [ \"*\" ] } ], \"Version\": \"1\" } InfrastructureConfig The infrastructure configuration mainly describes how the network layout looks like in order to create the shoot worker nodes in a later step, thus, prepares everything relevant to create VMs, load balancers, volumes, etc.\nAn example InfrastructureConfig for the Alicloud extension looks as follows:\napiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: # specify either 'id' or 'cidr' # id: my-vpc cidr: 10.250.0.0/16 # gardenerManagedNATGateway: true zones: - name: eu-central-1a workers: 10.250.1.0/24 # natGateway: # eipAllocationID: eip-ufxsdg122elmszcg The networks.vpc section describes whether you want to create the shoot cluster in an already existing VPC or whether to create a new one:\n If networks.vpc.id is given then you have to specify the VPC ID of the existing VPC that was created by other means (manually, other tooling, …). If networks.vpc.cidr is given then you have to specify the VPC CIDR of a new VPC that will be created during shoot creation. You can freely choose a private CIDR range. Either networks.vpc.id or networks.vpc.cidr must be present, but not both at the same time. When networks.vpc.id is present, in addition, you can also choose to set networks.vpc.gardenerManagedNATGateway. It is by default false. When it is set to true, Gardener will create an Enhanced NATGateway in the VPC and associate it with a VSwitch created in the first zone in the networks.zones. 
Please note that when networks.vpc.id is present, and networks.vpc.gardenerManagedNATGateway is false or not set, you have to manually create an Enhanced NATGateway and associate it with a VSwitch that you manually created. In this case, make sure the worker CIDRs in networks.zones do not overlap with the one you created. If a NATGateway is created manually and a shoot is created in the same VPC with networks.vpc.gardenerManagedNATGateway set to true, you need to manually adjust the route rule accordingly. You may refer to here. The networks.zones section describes which subnets you want to create in availability zones. For every zone, the Alicloud extension creates one subnet:\n The workers subnet is used for all shoot worker nodes, i.e., VMs which later run your applications. For every subnet, you have to specify a CIDR range contained in the VPC CIDR specified above, or the VPC CIDR of your already existing VPC. You can freely choose these CIDRs and it is your responsibility to properly design the network layout to suit your needs.\nIf you want to use multiple availability zones then add a second, third, … entry to the networks.zones[] list and properly specify the AZ name in networks.zones[].name.\nApart from the VPC and the subnets the Alicloud extension will also create a NAT gateway (only if a new VPC is created), a key pair, elastic IPs, VSwitches, a SNAT table entry, and security groups.\nBy default, the Alicloud extension will create a corresponding Elastic IP that it attaches to this NAT gateway and which is used for egress traffic. The networks.zones[].natGateway.eipAllocationID field allows you to specify the Elastic IP Allocation ID of an existing Elastic IP allocation in case you want to bring your own. 
If provided, no new Elastic IP will be created and, instead, the Elastic IP specified by you will be used.\n⚠️ If you change this field for an already existing infrastructure then it will disrupt egress traffic while Alicloud applies this change, because the NAT gateway must be recreated with the new Elastic IP association. Also, please note that the existing Elastic IP will be permanently deleted if it was earlier created by the Alicloud extension.\nControlPlaneConfig The control plane configuration mainly contains values for the Alicloud-specific control plane components. Today, the Alicloud extension deploys the cloud-controller-manager and the CSI controllers.\nAn example ControlPlaneConfig for the Alicloud extension looks as follows:\napiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig csi: enableADController: true # cloudControllerManager: # featureGates: # SomeKubernetesFeature: true The csi.enableADController is used as the value of environment DISK_AD_CONTROLLER, which is used for AliCloud csi-disk-plugin. This field is optional. When a new shoot is created, this field is automatically set to true. For an existing shoot created in previous versions, it remains unchanged. If there are persistent volumes created before year 2021, please be cautious about setting this field to true because they may fail to mount to nodes.\nThe cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage it’s not recommended to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager simply omit the key in the YAML specification.\nWorkerConfig The Alicloud extension does not support a specific WorkerConfig. However, it supports additional data volumes (plus encryption) per machine. By default (if not stated otherwise), all the disks are unencrypted. 
For each data volume, you have to specify a name. It also supports encrypted system disks. However, currently only a Customized image can be used as the base image for an encrypted system disk. Please note that changing the system disk encryption flag triggers a reconciliation of the shoot and results in a rolling update of the nodes in the worker group.\nThe following YAML is a snippet of a Shoot resource:\nspec: provider: workers: - name: cpu-worker ... volume: type: cloud_efficiency size: 20Gi encrypted: true dataVolumes: - name: kubelet-dir type: cloud_efficiency size: 25Gi encrypted: true Example Shoot manifest (one availability zone) Please find below an example Shoot manifest for one availability zone:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-alicloud namespace: garden-dev spec: cloudProfileName: alicloud region: eu-central-1 secretBindingName: core-alicloud provider: type: alicloud infrastructureConfig: apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 zones: - name: eu-central-1a workers: 10.250.0.0/19 controlPlaneConfig: apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: ecs.sn2ne.large minimum: 2 maximum: 2 volume: size: 50Gi type: cloud_efficiency zones: - eu-central-1a networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Example Shoot manifest (two availability zones) Please find below an example Shoot manifest for two availability zones:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-alicloud namespace: garden-dev spec: cloudProfileName: alicloud region: eu-central-1 secretBindingName: core-alicloud provider: type: alicloud infrastructureConfig: 
apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 zones: - name: eu-central-1a workers: 10.250.0.0/26 - name: eu-central-1b workers: 10.250.0.64/26 controlPlaneConfig: apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: ecs.sn2ne.large minimum: 2 maximum: 4 volume: size: 50Gi type: cloud_efficiency # NOTE: Below comment is for the case when encrypted field of an existing shoot is updated from false to true. # It will cause affected nodes to be rolling updated. Users must trigger a MAINTAIN operation of the shoot. # Otherwise, the shoot will fail to reconcile. # You could do it either via Dashboard or annotating the shoot with gardener.cloud/operation=maintain encrypted: true zones: - eu-central-1a - eu-central-1b networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Kubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-alicloud@v1.33.\nShoot CA Certificate and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation feature gate since gardener-extension-provider-alicloud@v1.36 and ShootSARotation feature gate since gardener-extension-provider-alicloud@v1.37.\n","categories":"","description":"","excerpt":"Using the Alicloud provider extension with Gardener as end-user The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/usage/","tags":"","title":"Usage"},{"body":"Using the AWS provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to 
contain provider-specific configuration.\nThis document describes what this configuration looks like for AWS and provides an example Shoot manifest with minimal configuration that you can use to create an AWS cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nProvider Secret Data Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of your AWS account. This Secret must look as follows:\napiVersion: v1 kind: Secret metadata: name: core-aws namespace: garden-dev type: Opaque data: accessKeyID: base64(access-key-id) secretAccessKey: base64(secret-access-key) The AWS documentation explains the necessary steps to enable programmatic access, i.e., to create an access key ID and a secret access key, for the user of your choice.\n⚠️ For security reasons, we recommend creating a dedicated user with programmatic access only. Please avoid reusing an IAM user which has access to the AWS console (human user).\n⚠️ Depending on your AWS API usage, it can be problematic to reuse the same AWS account for different Shoot clusters in the same region due to rate limits. Please consider spreading your Shoots over multiple AWS accounts if you are hitting those limits.\nPermissions Please make sure that the provided credentials have the correct privileges. You can use the following AWS IAM policy document and attach it to the IAM user backed by the credentials you provided (please check the official AWS documentation as well):\n Click to expand the AWS IAM policy document! 
{ \"Version\": \"2012-10-17\", \"Statement\": [ { \"Effect\": \"Allow\", \"Action\": \"autoscaling:*\", \"Resource\": \"*\" }, { \"Effect\": \"Allow\", \"Action\": \"ec2:*\", \"Resource\": \"*\" }, { \"Effect\": \"Allow\", \"Action\": \"elasticloadbalancing:*\", \"Resource\": \"*\" }, { \"Action\": [ \"iam:GetInstanceProfile\", \"iam:GetPolicy\", \"iam:GetPolicyVersion\", \"iam:GetRole\", \"iam:GetRolePolicy\", \"iam:ListPolicyVersions\", \"iam:ListRolePolicies\", \"iam:ListAttachedRolePolicies\", \"iam:ListInstanceProfilesForRole\", \"iam:CreateInstanceProfile\", \"iam:CreatePolicy\", \"iam:CreatePolicyVersion\", \"iam:CreateRole\", \"iam:CreateServiceLinkedRole\", \"iam:AddRoleToInstanceProfile\", \"iam:AttachRolePolicy\", \"iam:DetachRolePolicy\", \"iam:RemoveRoleFromInstanceProfile\", \"iam:DeletePolicy\", \"iam:DeletePolicyVersion\", \"iam:DeleteRole\", \"iam:DeleteRolePolicy\", \"iam:DeleteInstanceProfile\", \"iam:PutRolePolicy\", \"iam:PassRole\", \"iam:UpdateAssumeRolePolicy\" ], \"Effect\": \"Allow\", \"Resource\": \"*\" }, // The following permission set is only needed if the AWS Load Balancer Controller is enabled (see ControlPlaneConfig) { \"Effect\": \"Allow\", \"Action\": [ \"cognito-idp:DescribeUserPoolClient\", \"acm:ListCertificates\", \"acm:DescribeCertificate\", \"iam:ListServerCertificates\", \"iam:GetServerCertificate\", \"waf-regional:GetWebACL\", \"waf-regional:GetWebACLForResource\", \"waf-regional:AssociateWebACL\", \"waf-regional:DisassociateWebACL\", \"wafv2:GetWebACL\", \"wafv2:GetWebACLForResource\", \"wafv2:AssociateWebACL\", \"wafv2:DisassociateWebACL\", \"shield:GetSubscriptionState\", \"shield:DescribeProtection\", \"shield:CreateProtection\", \"shield:DeleteProtection\" ], \"Resource\": \"*\" } ] } InfrastructureConfig The infrastructure configuration mainly describes what the network layout looks like in order to create the shoot worker nodes in a later step and thus prepares everything relevant to create VMs, load balancers, volumes, 
etc.\nAn example InfrastructureConfig for the AWS extension looks as follows:\napiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig enableECRAccess: true dualStack: enabled: false networks: vpc: # specify either 'id' or 'cidr' # id: vpc-123456 cidr: 10.250.0.0/16 # gatewayEndpoints: # - s3 zones: - name: eu-west-1a internal: 10.250.112.0/22 public: 10.250.96.0/22 workers: 10.250.0.0/19 # elasticIPAllocationID: eipalloc-123456 ignoreTags: keys: # individual ignored tag keys - SomeCustomKey - AnotherCustomKey keyPrefixes: # ignored tag key prefixes - user.specific/prefix/ The enableECRAccess flag specifies whether the AWS IAM role policy attached to all worker nodes of the cluster shall contain permissions to access the Elastic Container Registry of the respective AWS account. If the flag is not provided it is defaulted to true. Please note that if the iamInstanceProfile is set for a worker pool in the WorkerConfig (see below) then enableECRAccess does not have any effect. It only applies for those worker pools whose iamInstanceProfile is not set.\n Click to expand the default AWS IAM policy document used for the instance profiles! { \"Version\": \"2012-10-17\", \"Statement\": [ { \"Effect\": \"Allow\", \"Action\": [ \"ec2:DescribeInstances\" ], \"Resource\": [ \"*\" ] }, // Only if `.enableECRAccess` is `true`. { \"Effect\": \"Allow\", \"Action\": [ \"ecr:GetAuthorizationToken\", \"ecr:BatchCheckLayerAvailability\", \"ecr:GetDownloadUrlForLayer\", \"ecr:GetRepositoryPolicy\", \"ecr:DescribeRepositories\", \"ecr:ListImages\", \"ecr:BatchGetImage\" ], \"Resource\": [ \"*\" ] } ] } The dualStack.enabled flag specifies whether dual-stack or IPv4-only should be supported by the infrastructure. When the flag is set to true an Amazon provided IPv6 CIDR block will be attached to the VPC. 
All subnets will receive a /64 block from it and a route entry is added to the main route table to route all IPv6 traffic over the IGW.\nThe networks.vpc section describes whether you want to create the shoot cluster in an already existing VPC or whether to create a new one:\n If networks.vpc.id is given then you have to specify the VPC ID of the existing VPC that was created by other means (manually, other tooling, …). Please make sure that the VPC has an internet gateway attached - the AWS controller won’t create one automatically for existing VPCs. To make sure the nodes are able to join and operate in your cluster properly, please make sure that your VPC has DNS support enabled; specifically, the attributes enableDnsHostnames and enableDnsSupport must be set to true. If networks.vpc.cidr is given then you have to specify the VPC CIDR of a new VPC that will be created during shoot creation. You can freely choose a private CIDR range. Either networks.vpc.id or networks.vpc.cidr must be present, but not both at the same time. networks.vpc.gatewayEndpoints is optional. If specified then each item is used as service name in a corresponding Gateway VPC Endpoint. The networks.zones section contains configuration for resources you want to create or use in availability zones. For every zone, the AWS extension creates three subnets:\n The internal subnet is used for internal AWS load balancers. The public subnet is used for public AWS load balancers. The workers subnet is used for all shoot worker nodes, i.e., VMs which later run your applications. For every subnet, you have to specify a CIDR range contained in the VPC CIDR specified above, or the VPC CIDR of your already existing VPC. You can freely choose these CIDRs and it is your responsibility to properly design the network layout to suit your needs.\nAlso, the AWS extension creates a dedicated NAT gateway for each zone. 
By default, it also creates a corresponding Elastic IP that it attaches to this NAT gateway and which is used for egress traffic. The elasticIPAllocationID field allows you to specify the ID of an existing Elastic IP allocation in case you want to bring your own. If provided, no new Elastic IP will be created and, instead, the Elastic IP specified by you will be used.\n⚠️ If you change this field for an already existing infrastructure then it will disrupt egress traffic while AWS applies this change. The reason is that the NAT gateway must be recreated with the new Elastic IP association. Also, please note that the existing Elastic IP will be permanently deleted if it was earlier created by the AWS extension.\nYou can configure Gateway VPC Endpoints by adding items in the optional list networks.vpc.gatewayEndpoints. Each item in the list is used as a service name and a corresponding endpoint is created for it. All created endpoints point to the service within the cluster’s region. For example, consider this (partial) shoot config:\nspec: region: eu-central-1 provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: gatewayEndpoints: - s3 The service name of the S3 Gateway VPC Endpoint in this example is com.amazonaws.eu-central-1.s3.\nIf you want to use multiple availability zones then add a second, third, … entry to the networks.zones[] list and properly specify the AZ name in networks.zones[].name.\nApart from the VPC and the subnets, the AWS extension will also create DHCP options and an internet gateway (only if a new VPC is created), routing tables, security groups, elastic IPs, NAT gateways, EC2 key pairs, IAM roles, and IAM instance profiles.\nThe ignoreTags section allows you to configure which resource tags on AWS resources managed by Gardener should be ignored during infrastructure reconciliation. 
By default, all tags that are added outside of Gardener’s reconciliation will be removed during the next reconciliation. This field allows users and automation to add custom tags on AWS resources created and managed by Gardener without losing them on the next reconciliation. Tags can be ignored either by specifying exact key values (ignoreTags.keys) or key prefixes (ignoreTags.keyPrefixes). In both cases it is forbidden to ignore the Name tag or any tag starting with kubernetes.io or gardener.cloud. Please note, though, that the tags are only ignored on resources created on behalf of the Infrastructure CR (i.e. VPC, subnets, security groups, keypair, etc.), while tags on machines, volumes, etc. are not in the scope of this controller.\nControlPlaneConfig The control plane configuration mainly contains values for the AWS-specific control plane components. Today, the only component deployed by the AWS extension is the cloud-controller-manager.\nAn example ControlPlaneConfig for the AWS extension looks as follows:\napiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig cloudControllerManager: # featureGates: # SomeKubernetesFeature: true useCustomRouteController: true # loadBalancerController: # enabled: true # ingressClassName: alb storage: managedDefaultClass: false The cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage it’s not recommended to use this field at all, as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager, simply omit the key in the YAML specification.\nThe cloudControllerManager.useCustomRouteController controls if the custom routes controller should be enabled. 
If enabled, it will add routes to the pod CIDRs for all nodes in the route tables for all zones.\nThe storage.managedDefaultClass controls if the default storage / volume snapshot classes are marked as default by Gardener. Set it to false to mark another storage / volume snapshot class as default without Gardener overwriting this change. If unset, this field defaults to true.\nIf the AWS Load Balancer Controller should be deployed, set loadBalancerController.enabled to true. In this case, it is assumed that an IngressClass named alb is created by the user. You can overwrite the name by setting loadBalancerController.ingressClassName.\nPlease note that currently only the “instance” mode is supported.\nExamples for Ingress and Service managed by the AWS Load Balancer Controller: Prerequisites Make sure you have created an IngressClass. For more details about parameters, please see AWS Load Balancer Controller - IngressClass\napiVersion: networking.k8s.io/v1 kind: IngressClass metadata: name: alb # default name if not specified by `loadBalancerController.ingressClassName` spec: controller: ingress.k8s.aws/alb Ingress apiVersion: networking.k8s.io/v1 kind: Ingress metadata: namespace: default name: echoserver annotations: # complete set of annotations: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/ingress/annotations/ alb.ingress.kubernetes.io/scheme: internet-facing alb.ingress.kubernetes.io/target-type: instance # target-type \"ip\" NOT supported in Gardener spec: ingressClassName: alb rules: - http: paths: - path: / pathType: Prefix backend: service: name: echoserver port: number: 80 For more details see AWS Load Balancer Documentation - Ingress Specification\nService of Type LoadBalancer This can be used to create a Network Load Balancer (NLB).\napiVersion: v1 kind: Service metadata: annotations: # complete set of annotations: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/annotations/ 
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance # target-type \"ip\" NOT supported in Gardener service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing name: ingress-nginx-controller namespace: ingress-nginx ... spec: ... type: LoadBalancer loadBalancerClass: service.k8s.aws/nlb # mandatory to be managed by AWS Load Balancer Controller (otherwise the Cloud Controller Manager will act on it) For more details see AWS Load Balancer Documentation - Network Load Balancer\nWorkerConfig The AWS extension supports encryption for volumes plus support for additional data volumes per machine. For each data volume, you have to specify a name. By default (if not stated otherwise), all the disks (root \u0026 data volumes) are encrypted. Please make sure that your instance type supports encryption. If your instance type doesn’t support encryption, you will have to disable encryption (which is enabled by default) by setting volume.encrypted to false (see the YAML snippet below).\nThe following YAML is a snippet of a Shoot resource:\nspec: provider: workers: - name: cpu-worker ... volume: type: gp2 size: 20Gi encrypted: false dataVolumes: - name: kubelet-dir type: gp2 size: 25Gi encrypted: true Note: The AWS extension does not support EBS volume (root \u0026 data volumes) encryption with customer managed CMK. Support for customer managed CMK is out of scope for now. Only AWS managed CMK is supported.\n Additionally, it is possible to provide further AWS-specific values for configuring the worker pools. The additional configuration must be specified in the providerConfig field of the respective worker.\nspec: provider: workers: - name: cpu-worker ... providerConfig: # AWS worker config The configuration will be evaluated when provider-aws reconciles the worker pools for the respective shoot.\nAn example WorkerConfig for the AWS extension looks as follows:\nspec: provider: workers: - name: cpu-worker ... 
providerConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: iops: 10000 throughput: 200 dataVolumes: - name: kubelet-dir iops: 12345 throughput: 150 snapshotID: snap-1234 iamInstanceProfile: # (specify either ARN or name) name: my-profile instanceMetadataOptions: httpTokens: required httpPutResponseHopLimit: 2 # arn: my-instance-profile-arn nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime) capacity: cpu: 2 gpu: 0 memory: 50Gi The .volume.iops is the number of I/O operations per second (IOPS) that the volume supports. For the io1 and gp3 volume types, this represents the number of IOPS that are provisioned for the volume. For the gp2 volume type, this represents the baseline performance of the volume and the rate at which the volume accumulates I/O credits for bursting. For more information about General Purpose SSD baseline performance, I/O credits, IOPS range and bursting, see Amazon EBS Volume Types (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html) in the Amazon Elastic Compute Cloud User Guide.\nConstraint: IOPS should be a positive value. Validation of IOPS (i.e. whether it is allowed and is in the specified range for a particular volume type) is done on the AWS side.\nThe volume.throughput is the throughput that the volume supports, in MiB/s. As of 16th Aug 2022, this parameter is valid only for gp3 volume types and will return an error from the provider side if specified for other volume types. Its current range of throughput is from 125 MiB/s to 1000 MiB/s. To know more about throughput and its range, see the official AWS documentation here.\nThe .dataVolumes can optionally contain configurations for the data volumes stated in the Shoot specification in the .spec.provider.workers[].dataVolumes list. The .name must match the name of the data volume in the shoot. It is also possible to provide a snapshot ID. 
This allows restoring the data volume from an existing snapshot.\nThe iamInstanceProfile section allows you to specify either the IAM instance profile name or its ARN (but not both) that should be used for this worker pool. If not specified, a dedicated IAM instance profile created by the infrastructure controller is used (see above).\nThe instanceMetadataOptions controls access to the instance metadata service (IMDS) for members of the worker. You can do the following operations:\n access IMDSv1 (default) access IMDSv2 - httpPutResponseHopLimit \u003e= 2 access IMDSv2 only (restrict access to IMDSv1) - httpPutResponseHopLimit \u003e= 2, httpTokens = \"required\" disable access to IMDS - httpTokens = \"required\" Note: The accessibility of IMDS discussed in the previous point is described from the point of view of containers NOT running in the host network. By default on host network IMDSv2 is already enabled (but not accessible from inside the pods). It is currently not possible to create a VM with complete restriction to the IMDS service. It is however possible to restrict access from inside the pods by setting httpTokens to required and not setting httpPutResponseHopLimit (or setting it to 1).\n You can find more information regarding the options in the AWS documentation.\ncpuOptions grants more fine-grained control over the worker’s CPU configuration. It has two attributes:\n coreCount: Specify a custom number of cores the instance should be configured with. threadsPerCore: How many threads should there be on each core. Set to 1 to disable multi-threading. Note that if you decide to configure cpuOptions, both of these values need to be provided. 
For a list of valid combinations of these values refer to the AWS documentation.\nExample Shoot manifest (one availability zone) Please find below an example Shoot manifest for one availability zone:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-aws namespace: garden-dev spec: cloudProfileName: aws region: eu-central-1 secretBindingName: core-aws provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 zones: - name: eu-central-1a internal: 10.250.112.0/22 public: 10.250.96.0/22 workers: 10.250.0.0/19 controlPlaneConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: m5.large minimum: 2 maximum: 2 volume: size: 50Gi type: gp2 # The following provider config is valid if the volume type is `io1`. # providerConfig: # apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 # kind: WorkerConfig # volume: # iops: 10000 zones: - eu-central-1a networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Example Shoot manifest (three availability zones) Please find below an example Shoot manifest for three availability zones:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-aws namespace: garden-dev spec: cloudProfileName: aws region: eu-central-1 secretBindingName: core-aws provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 zones: - name: eu-central-1a workers: 10.250.0.0/26 public: 10.250.96.0/26 internal: 10.250.112.0/26 - name: eu-central-1b workers: 10.250.0.64/26 public: 10.250.96.64/26 internal: 10.250.112.64/26 - name: eu-central-1c workers: 
10.250.0.128/26 public: 10.250.96.128/26 internal: 10.250.112.128/26 controlPlaneConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: m5.large minimum: 3 maximum: 9 volume: size: 50Gi type: gp2 zones: - eu-central-1a - eu-central-1b - eu-central-1c networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true CSI volume provisioners Every AWS shoot cluster will be deployed with the AWS EBS CSI driver. It is compatible with the legacy in-tree volume provisioner that was deprecated by the Kubernetes community and will be removed in future versions of Kubernetes. End-users might want to update their custom StorageClasses to the new ebs.csi.aws.com provisioner.\nNode-specific Volume Limits The Kubernetes scheduler allows a configurable limit for the number of volumes that can be attached to a node. See https://k8s.io/docs/concepts/storage/storage-limits/#custom-limits.\nCSI drivers usually have a different procedure for configuring this custom limit. By default, the EBS CSI driver parses the machine type name and then decides the volume limit. However, this is only a rough approximation and not good enough in most cases. Specifying the volume attach limit via command line flag (--volume-attach-limit) is currently the alternative until a more sophisticated solution presents itself (dynamically discovering the maximum number of attachable volumes per EC2 machine type, see also https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/347). The AWS extension allows the --volume-attach-limit flag of the EBS CSI driver to be configured via the aws.provider.extensions.gardener.cloud/volume-attach-limit annotation on the Shoot resource. 
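As a rough sketch only, and assuming a purely illustrative limit of 25 (a hypothetical value, not a recommendation), the annotation could be set on a Shoot as follows:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-aws namespace: garden-dev annotations: # hypothetical value, pick the limit that fits your EC2 machine type aws.provider.extensions.gardener.cloud/volume-attach-limit: \"25\" 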
If the annotation is added to an existing Shoot, then reconciliation needs to be triggered manually (see Immediate reconciliation), as adding an annotation to a resource is generally not a change that leads to an increase of .metadata.generation.\nKubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-aws@v1.34.\nShoot CA Certificate and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation and ShootSARotation feature gates since gardener-extension-provider-aws@v1.36.\nFlow Infrastructure Reconciler The extension offers two different reconciler implementations for the infrastructure resource:\n terraform-based native Go SDK based (dubbed the “flow”-based implementation) The default implementation currently is the terraform reconciler, which uses https://github.com/gardener/terraformer as the backend for managing the shoot’s infrastructure.\nThe “flow” implementation is a newer implementation that is trying to solve issues we faced with managing terraform infrastructure on Kubernetes. The goal is to have more control over the reconciliation process and be able to perform fine-grained tuning over it. The implementation is completely backwards-compatible and offers a migration route from the legacy terraformer implementation.\nFor most users there will be no noticeable difference. However, for certain use cases, users may notice a slight deviation from the previous behavior. For example, with flow-based infrastructure users may be able to perform certain modifications to infrastructure resources without having them reconciled back by terraform. 
Operations that would degrade the shoot infrastructure are still expected to be reverted.\nFor the time being, to take advantage of the flow reconciler users have to “opt-in” by annotating the shoot manifest with: aws.provider.extensions.gardener.cloud/use-flow=\"true\". For existing shoots with this annotation, the migration will take place on the next infrastructure reconciliation (on maintenance window or if other infrastructure changes are requested). The migration is not revertible.\n","categories":"","description":"","excerpt":"Using the AWS provider extension with Gardener as end-user The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/usage/","tags":"","title":"Usage"},{"body":"Using the Azure provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.\nThis document describes the configurable options for Azure and provides an example Shoot manifest with minimal configuration that can be used to create an Azure cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nAzure Provider Credentials In order for Gardener to create a Kubernetes cluster using Azure infrastructure components, a Shoot has to provide credentials with sufficient permissions to the desired Azure subscription. Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of the Azure subscription. The SecretBinding/CredentialsBinding is configurable in the Shoot cluster with the field secretBindingName/credentialsBindingName.\nCreate an Azure Application and Service Principal and obtain its credentials.\nPlease ensure that the Azure application (spn) has the IAM actions defined here assigned. 
If no fine-grained permissions/actions are required, then simply assign the Contributor role.\nThe example below demonstrates how the secret containing the client credentials of the Azure Application has to look:\napiVersion: v1 kind: Secret metadata: name: core-azure namespace: garden-dev type: Opaque data: clientID: base64(client-id) clientSecret: base64(client-secret) subscriptionID: base64(subscription-id) tenantID: base64(tenant-id) ⚠️ Depending on your API usage it can be problematic to reuse the same Service Principal for different Shoot clusters due to rate limits. Please consider spreading your Shoots over Service Principals from different Azure subscriptions if you are hitting those limits.\nManaged Service Principals The operators of the Gardener Azure extension can provide managed service principals. This eliminates the need for users to provide their own service principal for a Shoot.\nTo make use of a managed service principal, the Azure secret of a Shoot cluster must contain only a subscriptionID and a tenantID field, but no clientID and clientSecret. Removing those fields from the secret of an existing Shoot will also let it adopt the managed service principal.\nBased on the tenantID field, the Gardener extension will try to assign the managed service principal to the Shoot. 
If no managed service principal can be assigned then the next operation on the Shoot will fail.\n⚠️ The managed service principal needs to be assigned to the user’s Azure subscription with proper permissions before using it.\nInfrastructureConfig The infrastructure configuration mainly describes what the network layout looks like in order to create the shoot worker nodes in a later step and thus prepares everything relevant to create VMs, load balancers, volumes, etc.\nAn example InfrastructureConfig for the Azure extension looks as follows:\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: # specify either 'name' and 'resourceGroup' or 'cidr' # name: my-vnet # resourceGroup: my-vnet-resource-group cidr: 10.250.0.0/16 # ddosProtectionPlanID: /subscriptions/test/resourceGroups/test/providers/Microsoft.Network/ddosProtectionPlans/test-ddos-protection-plan workers: 10.250.0.0/19 # natGateway: # enabled: false # idleConnectionTimeoutMinutes: 4 # zone: 1 # ipAddresses: # - name: my-public-ip-name # resourceGroup: my-public-ip-resource-group # zone: 1 # serviceEndpoints: # - Microsoft.Test # zones: # - name: 1 # cidr: \"10.250.0.0/24\" # - name: 2 # cidr: \"10.250.1.0/24\" # natGateway: # enabled: false zoned: false # resourceGroup: # name: mygroup #identity: # name: my-identity-name # resourceGroup: my-identity-resource-group # acrAccess: true Currently, it’s not yet possible to deploy into existing resource groups. The .resourceGroup.name field will allow specifying the name of an already existing resource group that the shoot cluster and all infrastructure resources will be deployed to.\nVia the .zoned boolean you can tell whether you want to use Azure availability zones or not. If you don’t use zones then an availability set will be created and only basic load balancers will be used. 
Zoned clusters use standard load balancers.\nThe networks.vnet section describes whether you want to create the shoot cluster in an already existing VNet or whether to create a new one:\n If networks.vnet.name and networks.vnet.resourceGroup are given then you have to specify the VNet name and VNet resource group name of the existing VNet that was created by other means (manually, other tooling, …). If networks.vnet.cidr is given then you have to specify the VNet CIDR of a new VNet that will be created during shoot creation. You can freely choose a private CIDR range. Either networks.vnet.name and networks.vnet.resourceGroup or networks.vnet.cidr must be present, but not both at the same time. The networks.vnet.ddosProtectionPlanID field can be used to specify the ID of a DDoS protection plan which should be assigned to the VNet. This will only work for a VNet managed by Gardener. For externally managed VNets the DDoS protection plan must be assigned by other means. If a VNet name is given and cilium shoot clusters are created without a network overlay within one VNet, make sure that the pod CIDR specified in shoot.spec.networking.pods is not overlapping with any other pod CIDR used in that VNet. Overlapping pod CIDRs will lead to dysfunctional shoot clusters. It’s possible to place multiple shoot clusters into the same VNet. The networks.workers section describes the CIDR for a subnet that is used for all shoot worker nodes, i.e., VMs which later run your applications. The specified CIDR range must be contained in the VNet CIDR specified above, or the VNet CIDR of your already existing VNet. You can freely choose this CIDR and it is your responsibility to properly design the network layout to suit your needs.\nIn the networks.serviceEndpoints[] list you can specify the list of Azure service endpoints which shall be associated with the worker subnet. 
All available service endpoints and their technical names can be found in the Azure Service Endpoint documentation (https://docs.microsoft.com/en-us/azure/virtual-network/virtual-network-service-endpoints-overview).\nThe networks.natGateway section contains configuration for the Azure NatGateway which can be attached to the worker subnet of a Shoot cluster. Here is some key information about the usage of the NatGateway for a Shoot cluster:\n NatGateway usage is optional and can be enabled or disabled via .networks.natGateway.enabled. If the NatGateway is not used then the egress connections initiated within the Shoot cluster will be NATed via the LoadBalancer of the cluster (default Azure behaviour, see here). NatGateway is only available for zonal clusters (.zoned=true). The NatGateway is currently not deployed zone-redundantly. That means the NatGateway of a Shoot cluster will always be in just one zone. This zone can be optionally selected via .networks.natGateway.zone. Caution: Modifying the .networks.natGateway.zone setting requires a recreation of the NatGateway and the managed public IP (automatically used if no own public IP is specified, see below). That means you will most likely get a different public IP for egress connections. It is possible to bring your own zonal public IP(s) via networks.natGateway.ipAddresses. Those public IP(s) need to be in the same zone as the NatGateway (see networks.natGateway.zone) and be of SKU standard. For each public IP the name, the resourceGroup and the zone need to be specified. The field networks.natGateway.idleConnectionTimeoutMinutes allows the configuration of NAT Gateway’s idle connection timeout property. The idle timeout value can be adjusted from 4 minutes, up to 120 minutes. Omitting this property will set the idle timeout to its default value according to NAT Gateway’s documentation. 
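For illustration only, a NatGateway that uses a user-provided public IP might be configured roughly as follows. The public IP name and resource group are placeholders, and the zone and timeout values are assumptions chosen to satisfy the constraints listed above (zoned cluster, public IP in the same zone as the NatGateway, timeout between 4 and 120 minutes):\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 natGateway: enabled: true idleConnectionTimeoutMinutes: 4 zone: 1 ipAddresses: - name: my-public-ip-name # placeholder resourceGroup: my-public-ip-resource-group # placeholder zone: 1 # must match natGateway.zone zoned: true\n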
In the identity section you can specify an Azure user-assigned managed identity which should be attached to all cluster worker machines. With identity.name you can specify the name of the identity and with identity.resourceGroup you can specify the resource group which contains the identity resource on Azure. The identity needs to be created by the user upfront (manually, other tooling, …). Gardener/Azure Extension will only use the referenced one and won’t create an identity. Furthermore, the identity has to be in the same subscription as the Shoot cluster. Via identity.acrAccess you can configure the worker machines to use the passed identity for pulling from an Azure Container Registry (ACR). Caution: Adding, exchanging or removing the identity will require a rolling update of all worker machines in the Shoot cluster.\nApart from the VNet and the worker subnet the Azure extension will also create a dedicated resource group, route tables, security groups, and an availability set (if not using zoned clusters).\nInfrastructureConfig with dedicated subnets per zone Another deployment option, for zonal clusters only, is to create and configure a separate subnet per availability zone. This network layout is recommended to users that require fine-grained control over their network setup. One prevalent use case is to create a zone-redundant NAT Gateway deployment by taking advantage of the ability to deploy separate NAT Gateways for each subnet.\nTo use this configuration the following requirements must be met:\n the zoned field must be set to true. the networks.vnet section must not be empty and must contain a valid configuration. For existing clusters that were not using the networks.vnet section, it is enough if the networks.vnet.cidr field is set to the current networks.workers value. For each of the target zones a subnet CIDR range must be specified. 
The specified CIDR range must be contained in the VNet CIDR specified above, or the VNet CIDR of your already existing VNet. In addition, the CIDR ranges must not overlap with the ranges of the other subnets.\nServiceEndpoints and NatGateways can be configured per subnet. Accordingly, when networks.zones is specified, the fields networks.workers, networks.serviceEndpoints and networks.natGateway cannot be set. All the configuration for the subnets must be done inside the respective zone’s configuration.\nExample:\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: zoned: true vnet: # specify either 'name' and 'resourceGroup' or 'cidr' cidr: 10.250.0.0/16 zones: - name: 1 cidr: \"10.250.0.0/24\" - name: 2 cidr: \"10.250.1.0/24\" natGateway: enabled: false Migrating to zonal shoots with dedicated subnets per zone For existing zonal clusters it is possible to migrate to a network layout with dedicated subnets per zone. The migration works by creating additional network resources as specified in the configuration and progressively rolling part of your existing nodes to use the new resources. To achieve the controlled rollout of your nodes, parts of the existing infrastructure must be preserved, which is why the following constraint is imposed:\nOne of your specified zones must have the exact same CIDR range as the current networks.workers field. 
Here is an example of such a migration:\ninfrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 zoned: true to\ninfrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 zones: - name: 3 cidr: 10.250.0.0/19 # note the preservation of the 'workers' CIDR # optionally add other zones # - name: 2 # cidr: 10.250.32.0/19 # natGateway: # enabled: true zoned: true Another more advanced example with user-provided public IP addresses for the NAT Gateway and how it can be migrated:\ninfrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 natGateway: enabled: true zone: 1 ipAddresses: - name: pip1 resourceGroup: group zone: 1 - name: pip2 resourceGroup: group zone: 1 zoned: true to\ninfrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig zoned: true networks: vnet: cidr: 10.250.0.0/16 zones: - name: 1 cidr: 10.250.0.0/19 # note the preservation of the 'workers' CIDR natGateway: enabled: true ipAddresses: - name: pip1 resourceGroup: group zone: 1 - name: pip2 resourceGroup: group zone: 1 # optionally add other zones # - name: 2 # cidr: 10.250.32.0/19 # natGateway: # enabled: true # ipAddresses: # - name: pip3 # resourceGroup: group You can apply such a change to your shoot by issuing a kubectl patch command to replace your current .spec.provider.infrastructureConfig section:\n$ cat new-infra.json [ { \"op\": \"replace\", \"path\": \"/spec/provider/infrastructureConfig\", \"value\": { \"apiVersion\": \"azure.provider.extensions.gardener.cloud/v1alpha1\", \"kind\": \"InfrastructureConfig\", \"networks\": { \"vnet\": { \"cidr\": \"\u003cyour-vnet-cidr\u003e\" }, \"zones\": [ { \"name\": 1, \"cidr\": \"10.250.0.0/24\", 
\"natGateway\": { \"enabled\": true } }, { \"name\": 2, \"cidr\": \"10.250.1.0/24\", \"natGateway\": { \"enabled\": true } } ] }, \"zoned\": true } } ] kubectl patch --type=\"json\" --patch-file new-infra.json shoot \u003cmy-shoot\u003e ⚠️ The migration to shoots with dedicated subnets per zone is a one-way process. Reverting the shoot to the previous configuration is not supported.\n⚠️ During the migration a subset of the nodes will be rolled to the new subnets.\nControlPlaneConfig The control plane configuration mainly contains values for the Azure-specific control plane components. Today, the only component deployed by the Azure extension is the cloud-controller-manager.\nAn example ControlPlaneConfig for the Azure extension looks as follows:\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig cloudControllerManager: # featureGates: # SomeKubernetesFeature: true The cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage it’s not recommended to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager simply omit the key in the YAML specification.\nstorage contains options for storage-related control plane components. 
storage.managedDefaultStorageClass is enabled by default and will deploy a storageClass and mark it as a default (via the storageclass.kubernetes.io/is-default-class annotation). storage.managedDefaultVolumeSnapshotClass is enabled by default and will deploy a volumeSnapshotClass and mark it as a default (via the snapshot.storage.kubernetes.io/is-default-class annotation). In case you want to manage your own default storageClass or volumeSnapshotClass you need to disable the respective options above, otherwise reconciliation of the controlplane may fail.\nWorkerConfig The Azure extension supports encryption for volumes as well as additional data volumes per machine. Please note that you cannot specify the encrypted flag for Azure disks as they are encrypted by default/out-of-the-box. For each data volume, you have to specify a name. The following YAML is a snippet of a Shoot resource:\nspec: provider: workers: - name: cpu-worker ... volume: type: Standard_LRS size: 20Gi dataVolumes: - name: kubelet-dir type: Standard_LRS size: 25Gi Additionally, it supports other Azure-specific values, which can be configured under .spec.provider.workers[].providerConfig\nAn example WorkerConfig for the Azure extension looks like:\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime) capacity: cpu: 2 gpu: 1 memory: 50Gi diagnosticsProfile: enabled: true # storageURI: https://\u003cstorage-account-name\u003e.blob.core.windows.net/ dataVolumes: - name: test-image imageRef: communityGalleryImageID: /CommunityGalleries/gardenlinux-13e998fe-534d-4b0a-8a27-f16a73aef620/Images/gardenlinux/Versions/1443.10.0 # sharedGalleryImageID: /SharedGalleries/82fc46df-cc38-4306-9880-504e872cee18-VSMP_MEMORYONE_GALLERY/Images/vSMP_MemoryONE/Versions/1062800168.0.0 # id: 
/Subscriptions/2ebd38b6-270b-48a2-8e0b-2077106dc615/Providers/Microsoft.Compute/Locations/westeurope/Publishers/sap/ArtifactTypes/VMImage/Offers/gardenlinux/Skus/greatest/Versions/1443.10.0 # urn: sap:gardenlinux:greatest:1443.10.0 The .nodeTemplate is used to specify resource information of the machine during runtime. This then helps in Scale-from-Zero. Some points to note for this field:\n Currently only cpu, gpu and memory are configurable. A change in the value leads to a rolling update of the machines in the worker pool. All the resources need to be specified. The .diagnosticsProfile is used to enable machine boot diagnostics (disabled by default). A storage account is used for storing the VM’s boot console output and screenshots. If .diagnosticsProfile.storageURI is not specified, Azure managed storage will be used (recommended way).\nThe .dataVolumes field is used to add provider specific configurations for dataVolumes. .dataVolumes[].name must match one of the names in workers.dataVolumes[].name. To specify an image source for the dataVolume either use communityGalleryImageID, sharedGalleryImageID, id or urn as imageRef. However, users have to make sure that the image really exists; there is no check in place yet. 
If the image does not exist the machine will get stuck in creation.\nExample Shoot manifest (non-zoned) Please find below an example Shoot manifest for a non-zoned cluster:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-azure namespace: garden-dev spec: cloudProfile: name: azure region: westeurope secretBindingName: core-azure provider: type: azure infrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 zoned: false controlPlaneConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: Standard_D4_v3 minimum: 2 maximum: 2 volume: size: 50Gi type: Standard_LRS # providerConfig: # apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 # kind: WorkerConfig # nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime) # capacity: # cpu: 2 # gpu: 1 # memory: 50Gi networking: type: calico pods: 100.96.0.0/11 nodes: 10.250.0.0/16 services: 100.64.0.0/13 kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Example Shoot manifest (zoned) Please find below an example Shoot manifest for a zoned cluster:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-azure namespace: garden-dev spec: cloudProfile: name: azure region: westeurope secretBindingName: core-azure provider: type: azure infrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 zoned: true controlPlaneConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: Standard_D4_v3 minimum: 2 maximum: 2 
volume: size: 50Gi type: Standard_LRS zones: - \"1\" - \"2\" networking: type: calico pods: 100.96.0.0/11 nodes: 10.250.0.0/16 services: 100.64.0.0/13 kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Example Shoot manifest (zoned with NAT Gateways per zone) Please find below an example Shoot manifest for a zoned cluster using NAT Gateways per zone:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-azure namespace: garden-dev spec: cloudProfile: name: azure region: westeurope secretBindingName: core-azure provider: type: azure infrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 zones: - name: 1 cidr: 10.250.0.0/24 serviceEndpoints: - Microsoft.Storage - Microsoft.Sql natGateway: enabled: true idleConnectionTimeoutMinutes: 4 - name: 2 cidr: 10.250.1.0/24 serviceEndpoints: - Microsoft.Storage - Microsoft.Sql natGateway: enabled: true zoned: true controlPlaneConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: Standard_D4_v3 minimum: 2 maximum: 2 volume: size: 50Gi type: Standard_LRS zones: - \"1\" - \"2\" networking: type: calico pods: 100.96.0.0/11 nodes: 10.250.0.0/16 services: 100.64.0.0/13 kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true CSI volume provisioners Every Azure shoot cluster will be deployed with the Azure Disk CSI driver and the Azure File CSI driver.\nKubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-azure@v1.25.\nShoot CA Certificate 
and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation and ShootSARotation feature gates since gardener-extension-provider-azure@v1.28.\nMiscellaneous Azure Accelerated Networking All worker machines of the cluster will be automatically configured to use Azure Accelerated Networking if the prerequisites are fulfilled. The prerequisites are that the cluster must be zoned, and the used machine type and operating system image version are compatible with Accelerated Networking. Availability Set based shoot clusters will not be enabled for accelerated networking even if the machine type and operating system support it. This is necessary because all machines from the availability set must be scheduled on special hardware; more details can be found here. Supported machine types are listed in the CloudProfile in .spec.providerConfig.machineTypes[].acceleratedNetworking and the supported operating system image versions are defined in .spec.providerConfig.machineImages[].versions[].acceleratedNetworking.\nPreview: Shoot clusters with VMSS Flexible Orchestration (VMSS Flex/VMO) The machines of an Azure cluster can be created while being attached to an Azure Virtual Machine ScaleSet with flexible orchestration. The Virtual Machine ScaleSet with flexible orchestration feature is currently in preview and not yet generally available on Azure. 
Subscriptions need to join the preview to make use of the feature.\nAzure VMSS Flex is intended to replace Azure AvailabilitySet for non-zoned Azure Shoot clusters in the mid-term (once the feature goes GA) as VMSS Flex comes with fewer disadvantages, such as no blocking machine operations and compatibility with Standard SKU load balancers.\nTo configure an Azure Shoot cluster which makes use of VMSS Flex you need to do the following:\n The InfrastructureConfig of the Shoot configuration needs to contain .zoned=false. The Shoot resource needs to have the following annotation assigned: alpha.azure.provider.extensions.gardener.cloud/vmo=true Some key facts about VMSS Flex based clusters:\n Unlike regular non-zonal Azure Shoot clusters, which have a primary AvailabilitySet which is shared between all machines in all worker pools of a Shoot cluster, a VMSS Flex based cluster has its own VMSS for each worker pool. In case the configuration of the VMSS changes (e.g. the number of fault domains in a region changes; configured in the CloudProfile), all machines of the worker pool need to be rolled. It is not possible to migrate an existing primary AvailabilitySet based Shoot cluster to a VMSS Flex based Shoot cluster and vice versa. VMSS Flex based clusters use Standard SKU LoadBalancers instead of the Basic SKU LoadBalancers used for AvailabilitySet based Shoot clusters ","categories":"","description":"","excerpt":"Using the Azure provider extension with Gardener as end-user The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/usage/","tags":"","title":"Usage"},{"body":"Using the Equinix Metal provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.\nThis document describes what this configuration looks like for Equinix Metal and provides an example Shoot manifest with minimal configuration that you can use to create an Equinix Metal 
cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nProvider secret data Every shoot cluster references a SecretBinding which itself references a Secret, and this Secret contains the provider credentials of your Equinix Metal project. This Secret must look as follows:\napiVersion: v1 kind: Secret metadata: name: my-secret namespace: garden-dev type: Opaque data: apiToken: base64(api-token) projectID: base64(project-id) Please look up https://metal.equinix.com/developers/api/ as well.\nWith the Secret created, create a SecretBinding resource referencing it. It may look like this:\napiVersion: core.gardener.cloud/v1beta1 kind: SecretBinding metadata: name: my-secret namespace: garden-dev secretRef: name: my-secret quotas: [] InfrastructureConfig Currently, there is no infrastructure configuration possible for the Equinix Metal environment.\nAn example InfrastructureConfig for the Equinix Metal extension looks as follows:\napiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig The Equinix Metal extension will only create a key pair.\nControlPlaneConfig The control plane configuration mainly contains values for the Equinix Metal-specific control plane components. 
Today, the Equinix Metal extension deploys the cloud-controller-manager and the CSI controllers; however, it doesn’t offer any configuration options at the moment.\nAn example ControlPlaneConfig for the Equinix Metal extension looks as follows:\napiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig WorkerConfig The Equinix Metal extension supports specifying IDs for reserved devices that should be used for the machines of a specific worker pool.\nAn example WorkerConfig for the Equinix Metal extension looks as follows:\napiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig reservationIDs: - my-reserved-device-1 - my-reserved-device-2 reservedDevicesOnly: false The .reservationIDs[] list contains the IDs of the reserved devices. The .reservedDevicesOnly field indicates whether only reserved devices from the provided list of reservation IDs should be used when new machines are created. The extension will always attempt to create a device from one of the reservation IDs first. 
If none is available, the behaviour depends on the setting:\n true: return an error false: request a regular on-demand device The default value is false.\nExample Shoot manifest Please find below an example Shoot manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: my-shoot namespace: garden-dev spec: cloudProfileName: equinix-metal region: ny # Corresponds to a metro secretBindingName: my-secret provider: type: equinixmetal infrastructureConfig: apiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig controlPlaneConfig: apiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-pool1 machine: type: t1.small minimum: 2 maximum: 2 volume: size: 50Gi type: storage_1 zones: # Optional list of facilities, all of which MUST be in the metro; if not provided, then random facilities within the metro will be chosen for each machine. - ewr1 - ny5 - name: reserved-pool machine: type: t1.small minimum: 1 maximum: 2 providerConfig: apiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig reservationIDs: - reserved-device1 - reserved-device2 reservedDevicesOnly: true volume: size: 50Gi type: storage_1 networking: type: calico kubernetes: version: 1.27.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true ⚠️ Note that if you specify multiple facilities in the .spec.provider.workers[].zones[] list then new machines are randomly created in one of the provided facilities. 
Particularly, it is not ensured that all facilities are used or that all machines are equally or unequally distributed.\nKubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-equinix-metal@v2.2.\nShoot CA Certificate and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation feature gate since gardener-extension-provider-equinix-metal@v2.3 and ShootSARotation feature gate since gardener-extension-provider-equinix-metal@v2.4.\n","categories":"","description":"","excerpt":"Using the Equinix Metal provider extension with Gardener as end-user …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-equinix-metal/usage/","tags":"","title":"Usage"},{"body":"Using the GCP provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.\nThis document describes the configurable options for GCP and provides an example Shoot manifest with minimal configuration that can be used to create a GCP cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nGCP Provider Credentials In order for Gardener to create a Kubernetes cluster using GCP infrastructure components, a Shoot has to provide credentials with sufficient permissions to the desired GCP project. Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of the GCP project. The SecretBinding/CredentialsBinding is configurable in the Shoot cluster with the field secretBindingName/credentialsBindingName.\nThe required credentials for the GCP project are a Service Account Key to authenticate as a GCP Service Account. 
A service account is a special account that can be used by services and applications to interact with Google Cloud Platform APIs. Applications can use service account credentials to authorize themselves to a set of APIs and perform actions within the permissions granted to the service account.\nMake sure to enable the Google Identity and Access Management (IAM) API. Create a Service Account that shall be used for the Shoot cluster. Grant at least the following IAM roles to the Service Account.\n Service Account Admin Service Account Token Creator Service Account User Compute Admin Create a JSON Service Account key for the Service Account. Provide it in the Secret (base64 encoded for field serviceaccount.json), that is being referenced by the SecretBinding in the Shoot cluster configuration.\nThis Secret must look as follows:\napiVersion: v1 kind: Secret metadata: name: core-gcp namespace: garden-dev type: Opaque data: serviceaccount.json: base64(serviceaccount-json) ⚠️ Depending on your API usage it can be problematic to reuse the same Service Account Key for different Shoot clusters due to rate limits. 
Please consider spreading your Shoots over multiple Service Accounts on different GCP projects if you are hitting those limits, see https://cloud.google.com/compute/docs/api-rate-limits.\nInfrastructureConfig The infrastructure configuration mainly describes what the network layout looks like in order to create the shoot worker nodes in a later step, and thus prepares everything relevant to create VMs, load balancers, volumes, etc.\nAn example InfrastructureConfig for the GCP extension looks as follows:\napiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: # vpc: # name: my-vpc # cloudRouter: # name: my-cloudrouter workers: 10.250.0.0/16 # internal: 10.251.0.0/16 # cloudNAT: # minPortsPerVM: 2048 # maxPortsPerVM: 65536 # endpointIndependentMapping: # enabled: false # enableDynamicPortAllocation: false # natIPNames: # - name: manualnat1 # - name: manualnat2 # udpIdleTimeoutSec: 30 # icmpIdleTimeoutSec: 30 # tcpEstablishedIdleTimeoutSec: 1200 # tcpTransitoryIdleTimeoutSec: 30 # tcpTimeWaitTimeoutSec: 120 # flowLogs: # aggregationInterval: INTERVAL_5_SEC # flowSampling: 0.2 # metadata: INCLUDE_ALL_METADATA The networks.vpc section describes whether you want to create the shoot cluster in an already existing VPC or whether to create a new one:\n If networks.vpc.name is given then you have to specify the VPC name of the existing VPC that was created by other means (manually, other tooling, …). 
If you want to get a fresh VPC for the shoot then just omit the networks.vpc field.\n If a VPC name is not given then we will create the cloud router + NAT gateway to ensure that worker nodes don’t get external IPs.\n If a VPC name is given then a cloud router name must also be given; failure to do so would result in validation errors and possibly clusters without egress connectivity.\n If a VPC name is given and calico shoot clusters are created without a network overlay within one VPC, make sure that the pod CIDR specified in shoot.spec.networking.pods does not overlap with any other pod CIDR used in that VPC. Overlapping pod CIDRs will lead to dysfunctional shoot clusters.\n The networks.workers section describes the CIDR for a subnet that is used for all shoot worker nodes, i.e., VMs which later run your applications.\nThe networks.internal section is optional and can describe a CIDR for a subnet that is used for internal load balancers.\nThe networks.cloudNAT.minPortsPerVM is optional and is used to define the minimum number of ports allocated to a VM for the CloudNAT.\nThe networks.cloudNAT.natIPNames is optional and is used to specify the names of the manual IP addresses which should be used by the NAT gateway.\nThe networks.cloudNAT.endpointIndependentMapping is optional and is used to define the endpoint mapping behavior. You can enable it or disable it at any point by toggling networks.cloudNAT.endpointIndependentMapping.enabled. By default, it is disabled.\nnetworks.cloudNAT.enableDynamicPortAllocation is optional (default: false) and allows one to enable dynamic port allocation (https://cloud.google.com/nat/docs/ports-and-addresses#dynamic-port). Note that enabling this puts additional restrictions on the permitted values for networks.cloudNAT.minPortsPerVM and networks.cloudNAT.maxPortsPerVM, namely that they now both are required to be powers of two. 
Also, maxPortsPerVM may not be given if dynamic port allocation is disabled.\nnetworks.cloudNAT.udpIdleTimeoutSec, networks.cloudNAT.icmpIdleTimeoutSec, networks.cloudNAT.tcpEstablishedIdleTimeoutSec, networks.cloudNAT.tcpTransitoryIdleTimeoutSec, and networks.cloudNAT.tcpTimeWaitTimeoutSec give more fine-granular control over various timeout-values. For more details see https://cloud.google.com/nat/docs/public-nat#specs-timeouts.\nThe specified CIDR ranges must be contained in the VPC CIDR specified above, or the VPC CIDR of your already existing VPC. You can freely choose these CIDRs and it is your responsibility to properly design the network layout to suit your needs.\nThe networks.flowLogs section describes the configuration for the VPC flow logs. In order to enable the VPC flow logs at least one of the following parameters needs to be specified in the flow log section:\n networks.flowLogs.aggregationInterval an optional parameter describing the aggregation interval for collecting flow logs. For more details, see aggregation_interval reference.\n networks.flowLogs.flowSampling an optional parameter describing the sampling rate of VPC flow logs within the subnetwork where 1.0 means all collected logs are reported and 0.0 means no logs are reported. For more details, see flow_sampling reference.\n networks.flowLogs.metadata an optional parameter describing whether metadata fields should be added to the reported VPC flow logs. For more details, see metadata reference.\n Apart from the VPC and the subnets the GCP extension will also create a dedicated service account for this shoot, and firewall rules.\nControlPlaneConfig The control plane configuration mainly contains values for the GCP-specific control plane components. 
Today, the only component deployed by the GCP extension is the cloud-controller-manager.\nAn example ControlPlaneConfig for the GCP extension looks as follows:\napiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig zone: europe-west1-b cloudControllerManager: # featureGates: # SomeKubernetesFeature: true storage: managedDefaultStorageClass: true managedDefaultVolumeSnapshotClass: true The zone field tells the cloud-controller-manager in which zone it should mainly operate. You can still create clusters in multiple availability zones, however, the cloud-controller-manager requires one “main” zone. ⚠️ You always have to specify this field!\nThe cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage it’s not recommended to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager simply omit the key in the YAML specification.\nThe members of storage allow you to configure the provided storage classes further. If storage.managedDefaultStorageClass is enabled (the default), the default StorageClass deployed will be marked as default (via storageclass.kubernetes.io/is-default-class annotation). Similarly, if storage.managedDefaultVolumeSnapshotClass is enabled (the default), the default VolumeSnapshotClass deployed will be marked as default. 
In case you want to set a different StorageClass or VolumeSnapshotClass as default you need to set the corresponding option to false, as at most one class should be marked as default in each case and the ResourceManager will prevent any changes to the Gardener managed classes from taking effect.\nWorkerConfig The worker configuration contains:\n Local SSD interface for the additional volumes attached to GCP worker machines.\nIf you attach the disk with SCRATCH type, either an NVMe interface or a SCSI interface must be specified. It is only meaningful to provide this volume interface if only SCRATCH data volumes are used.\n Volume Encryption config that specifies values for kmsKeyName and kmsKeyServiceAccount.\n The kmsKeyName is the key name of the cloud kms disk encryption key and must be specified if CMEK disk encryption is needed. The kmsKeyServiceAccount is the service account granted the roles/cloudkms.cryptoKeyEncrypterDecrypter on the kmsKeyName. If empty, then the role should be given to the Compute Engine Service Agent Account. This CESA account usually has the name: service-PROJECT_NUMBER@compute-system.iam.gserviceaccount.com. See: https://cloud.google.com/iam/docs/service-agents#compute-engine-service-agent Prior to use, the operator should add IAM policy binding using the gcloud CLI: gcloud projects add-iam-policy-binding projectId --member serviceAccount:name@projectId.iam.gserviceaccount.com --role roles/cloudkms.cryptoKeyEncrypterDecrypter Setting a volume image with dataVolumes.sourceImage. However, this parameter should only be used with particular caution. For example Gardenlinux works with filesystem LABELs only and creating another disk from the very same image causes the LABELs to be duplicated. See: https://github.com/gardener/gardener-extension-provider-gcp/issues/323\n Some hyperdisks allow adjustment of their default values for provisionedIops and provisionedThroughput. 
Keep in mind though that Hyperdisk Extreme and Hyperdisk Throughput volumes can’t be used as boot disks.\n Service Account with its specified scopes, authorized for this worker.\nService accounts created in advance that generate access tokens that can be accessed through the metadata server and used to authenticate applications on the instance.\nNote: If you do not provide service accounts for your workers, the Compute Engine default service account will be used. For more details on the default account, see https://cloud.google.com/compute/docs/access/service-accounts#default_service_account. If the DisableGardenerServiceAccountCreation feature gate is disabled, Gardener will create a shared service account to use for all instances. This feature gate is currently in beta and it will no longer be possible to re-enable the service account creation via feature gate flag.\n GPU with its type and count per node. This will attach that GPU to all the machines in the worker group.\nNote:\n A rolling upgrade of the worker group would be triggered in case the acceleratorType or count is updated.\n Some machineTypes like a2 family come with already attached gpu of a100 type and pre-defined count. If your workerPool consists of such machineTypes, please specify exact GPU configuration for the machine type as specified in Google cloud documentation. The acceleratorType values to use for families with attached gpu are stated below:\n a2 family -\u003e nvidia-tesla-a100 g2 family -\u003e nvidia-l4 Sufficient quota of gpu is needed in the GCP project. This includes quota to support autoscaling if enabled.\n GPU-attached machines can’t be live migrated during host maintenance events. Find out how to handle that in your application here\n GPU count specified here is considered for forming node template during scale-from-zero in Cluster Autoscaler\n The .nodeTemplate is used to specify resource information of the machine during runtime. This then helps in Scale-from-Zero. 
Some points to note for this field:\n Currently only cpu, gpu and memory are configurable. A change in the value leads to a rolling update of the machines in the worker pool. All the resources need to be specified. An example WorkerConfig for the GCP looks as follows:\n apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: interface: NVME encryption: kmsKeyName: \"projects/projectId/locations/\u003czoneName\u003e/keyRings/\u003ckeyRingName\u003e/cryptoKeys/alpha\" kmsKeyServiceAccount: \"user@projectId.iam.gserviceaccount.com\" dataVolumes: - name: test sourceImage: projects/sap-se-gcp-gardenlinux/global/images/gardenlinux-gcp-gardener-prod-amd64-1443-3-c261f887 provisionedIops: 3000 provisionedThroughput: 140 serviceAccount: email: foo@bar.com scopes: - https://www.googleapis.com/auth/cloud-platform gpu: acceleratorType: nvidia-tesla-t4 count: 1 nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime) capacity: cpu: 2 gpu: 1 memory: 50Gi Example Shoot manifest Please find below an example Shoot manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-gcp namespace: garden-dev spec: cloudProfileName: gcp region: europe-west1 secretBindingName: core-gcp provider: type: gcp infrastructureConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: workers: 10.250.0.0/16 controlPlaneConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig zone: europe-west1-b workers: - name: worker-xoluy machine: type: n1-standard-4 minimum: 2 maximum: 2 volume: size: 50Gi type: pd-standard zones: - europe-west1-b networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true CSI volume provisioners Every GCP shoot cluster will be 
deployed with the GCP PD CSI driver. It is compatible with the legacy in-tree volume provisioner that was deprecated by the Kubernetes community and will be removed in future versions of Kubernetes. End-users might want to update their custom StorageClasses to the new pd.csi.storage.gke.io provisioner.\nKubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-gcp@v1.21.\nShoot CA Certificate and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation and ShootSARotation feature gates since gardener-extension-provider-gcp@v1.23.\n","categories":"","description":"","excerpt":"Using the GCP provider extension with Gardener as end-user The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/usage/","tags":"","title":"Usage"},{"body":"Using the OpenStack provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.\nIn this document we are describing how this configuration looks like for OpenStack and provide an example Shoot manifest with minimal configuration that you can use to create an OpenStack cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nProvider Secret Data Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of your OpenStack tenant. 
This Secret must look as follows:\napiVersion: v1 kind: Secret metadata: name: core-openstack namespace: garden-dev type: Opaque data: domainName: base64(domain-name) tenantName: base64(tenant-name) # either use username/password username: base64(user-name) password: base64(password) # or application credentials #applicationCredentialID: base64(app-credential-id) #applicationCredentialName: base64(app-credential-name) # optional #applicationCredentialSecret: base64(app-credential-secret) Please look up https://docs.openstack.org/keystone/pike/admin/identity-concepts.html as well.\nFor authentication with username/password see Keystone username/password\nAlternatively, for authentication with application credentials see Keystone Application Credentials.\n⚠️ Depending on your API usage it can be problematic to reuse the same provider credentials for different Shoot clusters due to rate limits. Please consider spreading your Shoots over multiple credentials from different tenants if you are hitting those limits.\nInfrastructureConfig The infrastructure configuration mainly describes how the network layout looks like in order to create the shoot worker nodes in a later step, thus, prepares everything relevant to create VMs, load balancers, volumes, etc.\nAn example InfrastructureConfig for the OpenStack extension looks as follows:\napiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig floatingPoolName: MY-FLOATING-POOL # floatingPoolSubnetName: my-floating-pool-subnet-name networks: # id: 12345678-abcd-efef-08af-0123456789ab # router: # id: 1234 workers: 10.250.0.0/19 # shareNetwork: # enabled: true The floatingPoolName is the name of the floating pool you want to use for your shoot. 
If you don’t know which floating pools are available look it up in the respective CloudProfile.\nWith floatingPoolSubnetName you can explicitly define to which subnet in the floating pool network (defined via floatingPoolName) the router should be attached.\nnetworks.id is an optional field. If it is given, you can specify the uuid of an existing private Neutron network (created manually, by other tooling, …) that should be reused. A new subnet for the Shoot will be created in it.\nIf a networks.id is given and calico shoot clusters are created without a network overlay within one network make sure that the pod CIDR specified in shoot.spec.networking.pods is not overlapping with any other pod CIDR used in that network. Overlapping pod CIDRs will lead to dysfunctional shoot clusters.\nThe networks.router section describes whether you want to create the shoot cluster in an already existing router or whether to create a new one:\n If networks.router.id is given then you have to specify the router id of the existing router that was created by other means (manually, other tooling, …). If you want to get a fresh router for the shoot then just omit the networks.router field.\n In any case, the shoot cluster will be created in a new subnet.\n The networks.workers section describes the CIDR for a subnet that is used for all shoot worker nodes, i.e., VMs which later run your applications.\nYou can freely choose these CIDRs and it is your responsibility to properly design the network layout to suit your needs.\nApart from the router and the worker subnet the OpenStack extension will also create a network, router interfaces, security groups, and a key pair.\nThe optional networks.shareNetwork.enabled field controls the creation of a share network. This is only needed if shared file system storage (like NFS) should be used. 
Note, that in this case, the ControlPlaneConfig needs additional configuration, too.\nControlPlaneConfig The control plane configuration mainly contains values for the OpenStack-specific control plane components. Today, the only component deployed by the OpenStack extension is the cloud-controller-manager.\nAn example ControlPlaneConfig for the OpenStack extension looks as follows:\napiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerProvider: haproxy loadBalancerClasses: - name: lbclass-1 purpose: default floatingNetworkID: fips-1-id floatingSubnetName: internet-* - name: lbclass-2 floatingNetworkID: fips-1-id floatingSubnetTags: internal,private - name: lbclass-3 purpose: private subnetID: internal-id # cloudControllerManager: # featureGates: # SomeKubernetesFeature: true # storage: # csiManila: # enabled: true The loadBalancerProvider is the provider name you want to use for load balancers in your shoot. If you don’t know which types are available look it up in the respective CloudProfile.\nThe loadBalancerClasses field contains an optional list of load balancer classes which will be available in the cluster. 
Each entry can have the following fields:\n name to select the load balancer class via the kubernetes service annotations loadbalancer.openstack.org/class=name purpose with values default or private The configuration of the default load balancer class will be used as default for all other kubernetes loadbalancer services without a class annotation The configuration of the private load balancer class will be also set to the global loadbalancer configuration of the cluster, but will be overridden by the default purpose floatingNetworkID can be specified to receive an ip from a floating/external network, additionally the subnet in this network can be selected via floatingSubnetName can be either a full subnet name or a regex/glob to match subnet name floatingSubnetTags a comma separated list of subnet tags floatingSubnetID the id of a specific subnet subnetID can be specified to receive an ip from an internal subnet (will not have an effect in combination with floating/external network configuration) The cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage it’s not recommended to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager simply omit the key in the YAML specification.\nThe optional storage.csiManila.enabled field is used to enable the deployment of the CSI Manila driver to support NFS persistent volumes. In this case, please ensure to set networks.shareNetwork.enabled=true in the InfrastructureConfig, too. Additionally, if CSI Manila driver is enabled, for each availability zone a NFS StorageClass will be created on the shoot named like csi-manila-nfs-\u003czone\u003e.\nWorkerConfig Each worker group in a shoot may contain provider-specific configurations and options. 
These are contained in the providerConfig section of a worker group and can be configured using a WorkerConfig object. An example of a WorkerConfig looks as follows:\napiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig serverGroup: policy: soft-anti-affinity # nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime) # capacity: # cpu: 2 # gpu: 0 # memory: 50Gi # machineLabels: # - name: my-label # value: foo # - name: my-rolling-label # value: bar # triggerRollingOnUpdate: true # means any change of the machine label value will trigger rolling of all machines of the worker pool ServerGroups When you specify the serverGroup section in your worker group configuration, a new server group will be created with the configured policy for each worker group that enabled this setting and all machines managed by this worker group will be assigned as members of the created server group.\nFor users to have access to the server group feature, it must be enabled on the CloudProfile by your operator. Existing clusters can take advantage of this feature by updating the server group configuration of their respective worker groups. Worker groups that are already configured with server groups can update their setting to change the policy used, or remove it altogether at any time.\nUsers must be aware that any change to the server group settings will result in a rolling deployment of new nodes for the affected worker group.\nPlease note the following restrictions when deploying workers with server groups:\n The serverGroup section is optional, but if it is included in the worker configuration, it must contain a valid policy value. The available policy values that can be used, are defined in the provider specific section of CloudProfile by your operator. Certain policy values may induce further constraints. Using the affinity policy is only allowed when the worker group utilizes a single zone. 
MachineLabels The machineLabels section in the worker group configuration allows you to specify additional machine labels. These labels are added to the machine instances only, but not to the node object. Additionally, they have an optional triggerRollingOnUpdate field. If it is set to true, changing the label value will trigger a rolling of all machines of this worker pool.\nNode Templates Node templates allow users to override the capacity of the nodes as defined by the server flavor specified in the CloudProfile’s machineTypes. This is useful for certain dynamic scenarios as it allows users to customize cluster-autoscaler’s behavior for these worker groups with their provided values.\nExample Shoot manifest (one availability zone) Please find below an example Shoot manifest for one availability zone:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-openstack namespace: garden-dev spec: cloudProfile: name: openstack region: europe-1 secretBindingName: core-openstack provider: type: openstack infrastructureConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig floatingPoolName: MY-FLOATING-POOL networks: workers: 10.250.0.0/19 controlPlaneConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerProvider: haproxy workers: - name: worker-xoluy machine: type: medium_4_8 minimum: 2 maximum: 2 zones: - europe-1a networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true CSI volume provisioners Every OpenStack shoot cluster will be deployed with the OpenStack Cinder CSI driver. It is compatible with the legacy in-tree volume provisioner that was deprecated by the Kubernetes community and will be removed in future versions of Kubernetes. 
End-users might want to update their custom StorageClasses to the new cinder.csi.openstack.org provisioner.\nKubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-openstack@v1.23.\nShoot CA Certificate and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation and ShootSARotation feature gates since gardener-extension-provider-openstack@v1.26.\n","categories":"","description":"","excerpt":"Using the OpenStack provider extension with Gardener as end-user The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-openstack/usage/","tags":"","title":"Usage"},{"body":"Using the Networking Calico extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a networking field that is meant to contain network-specific configuration.\nIn this document we are describing what this configuration looks like for Calico and provide an example Shoot manifest with minimal configuration that you can use to create a cluster.\nCalico Typha Calico Typha is an optional component of Project Calico designed to offload the Kubernetes API server. The Typha daemon sits between the datastore (such as the Kubernetes API server which is the one used by Gardener managed Kubernetes) and many instances of Felix. Typha’s main purpose is to increase scale by reducing each node’s impact on the datastore. You can opt out of Typha via .spec.networking.providerConfig.typha.enabled=false of your Shoot manifest. By default, Typha is enabled.\nEBPF Dataplane Calico can be run in ebpf dataplane mode. This has several benefits: calico scales to higher throughput, uses less cpu per GBit and has native support for kubernetes services (without needing kube-proxy). To switch to a pure ebpf dataplane it is recommended to run without an overlay network. 
The following configuration can be used to run without an overlay and without kube-proxy.\nAn example ebpf dataplane NetworkingConfig manifest:\napiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig ebpfDataplane: enabled: true overlay: enabled: false To disable kube-proxy set the enabled field to false in the shoot manifest.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: ebpf-shoot namespace: garden-dev spec: kubernetes: kubeProxy: enabled: false Known limitations of the EBPF Dataplane Please note that the default settings for calico’s ebpf dataplane may interfere with accelerated networking in Azure, rendering nodes with accelerated networking unusable in the network. The reason for this is that calico does not ignore the accelerated networking interface enP... as it should, but applies its ebpf programs to it. A simple mitigation for this is to adapt the FelixConfiguration default and ensure that the bpfDataIfacePattern does not include enP.... Per default bpfDataIfacePattern is not set. The default value for this option can be found here. For example, you could apply the following change:\n$ kubectl edit felixconfiguration default ... apiVersion: crd.projectcalico.org/v1 kind: FelixConfiguration metadata: ... name: default ... spec: bpfDataIfacePattern: ^((en|wl|ww|sl|ib)[opsx].*|(eth|wlan|wwan).*|tunl0$|vxlan.calico$|wireguard.cali$|wg-v6.cali$) ... AutoScaling Autoscaling defines how the calico components are automatically scaled. It allows you to use either static resource assignment, vertical pod or cluster-proportional autoscaler (default: cluster-proportional).\nThe cluster-proportional autoscaling mode is preferable when conditions require minimal disturbances and vpa mode for improved cluster resource utilization. 
Static resource assignment causes no disruptions due to autoscaling, but has no dynamics to handle changing demands.\nPlease note VPA must be enabled on the shoot as a pre-requisite to enabling vpa mode.\nAn example NetworkingConfig manifest for vertical pod autoscaling:\napiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig autoScaling: mode: \"vpa\" An example NetworkingConfig manifest for static resource assignment:\napiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig autoScaling: mode: \"static\" resources: node: cpu: 100m memory: 100Mi typha: cpu: 100m memory: 100Mi ℹ️ Please note that in static mode, you have the option to configure the resource requests for calico-node and calico-typha. If not specified, default settings will be used. If the resource requests are chosen too low, it might impact the stability/performance of the cluster. Specifying the resource requests for any other autoscaling mode has no effect.\n Example NetworkingConfig manifest An example NetworkingConfig for the Calico extension looks as follows:\napiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig ipam: type: host-local cidr: usePodCIDR vethMTU: 1440 typha: enabled: true overlay: enabled: true autoScaling: mode: \"vpa\" Example Shoot manifest Please find below an example Shoot manifest with calico networking configurations:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-azure namespace: garden-dev spec: cloudProfileName: azure region: westeurope secretBindingName: core-azure provider: type: azure infrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 zoned: true controlPlaneConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: Standard_D4_v3 minimum: 2 maximum: 
2 volume: size: 50Gi type: Standard_LRS zones: - \"1\" - \"2\" networking: type: calico nodes: 10.250.0.0/16 providerConfig: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig ipam: type: host-local vethMTU: 1440 overlay: enabled: true typha: enabled: false kubernetes: version: 1.28.3 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Known Limitations in conjunction with NodeLocalDNS If NodeLocalDNS is active in a shoot cluster, which uses calico as CNI without overlay network, it may be impossible to block DNS traffic to the cluster DNS server via network policy. This is due to FELIX_CHAININSERTMODE being set to APPEND instead of INSERT in case SNAT is being applied to requests to the infrastructure DNS server. In this scenario the iptables rules of NodeLocalDNS already accept the traffic before the network policies are checked.\nThis only applies to traffic directed to NodeLocalDNS. If blocking of all DNS traffic is desired via network policy the pod dnsPolicy should be changed to Default so that the cluster DNS is not used. Alternatives are usage of overlay network or disabling of NodeLocalDNS.\n","categories":"","description":"","excerpt":"Using the Networking Calico extension with Gardener as end-user The …","ref":"/docs/extensions/network-extensions/gardener-extension-networking-calico/usage/","tags":"","title":"Usage"},{"body":"Using the Networking Cilium extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a networking field that is meant to contain network-specific configuration.\nIn this document we are describing what this configuration looks like for Cilium and provide an example Shoot manifest with minimal configuration that you can use to create a cluster.\nCilium Hubble Hubble is a fully distributed networking and security observability platform built on top of Cilium and BPF. 
It is optional and is deployed to the cluster when enabled in the NetworkConfig. If the dashboard is not externally exposed\nkubectl port-forward -n kube-system deployment/hubble-ui 8081 can be used to access it locally.\nExample NetworkingConfig manifest An example NetworkingConfig for the Cilium extension looks as follows:\napiVersion: cilium.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig hubble: enabled: true #debug: false #tunnel: vxlan #store: kubernetes NetworkingConfig options The hubble.enabled field describes whether hubble should be deployed into the cluster or not (default).\nThe debug field describes whether you want to run cilium in debug mode or not (default), change this value to true to use debug mode.\nThe tunnel field describes the encapsulation mode for communication between nodes. Possible values are vxlan (default), geneve or disabled.\nThe bpfSocketLBHostnsOnly.enabled field describes whether socket LB will be skipped for services when inside a pod namespace (default), in favor of service LB at the pod interface. Socket LB is still used when in the host namespace. This feature is required when using cilium with a service mesh like istio or linkerd.\nSetting the field cni.exclusive to false might be useful when additional plugins, such as Istio or Linkerd, wish to chain after Cilium. This action disables the default behavior of Cilium, which is to overwrite changes to the CNI configuration file.\nThe egressGateway.enabled field describes whether egress gateways are enabled or not (default). To use this feature kube-proxy must be disabled. This can be done with the following configuration in the Shoot:\nspec: kubernetes: kubeProxy: enabled: false The egress gateway feature is only supported in gardener with an overlay network (shoot.spec.networking.providerConfig.overlay.enabled: true) at the moment. This is because bpf masquerading is required for the egress gateway feature. 
Once the overlay network is enabled bpf.masquerade is set to true in the cilium configmap.\nThe snatToUpstreamDNS.enabled field describes whether the traffic to the upstream dns server should be masqueraded or not (default). This is needed on some infrastructures where traffic to the dns server with the pod CIDR range is blocked.\nExample Shoot manifest Please find below an example Shoot manifest with cilium networking configuration:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: aws-cilium namespace: garden-dev spec: networking: type: cilium providerConfig: apiVersion: cilium.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig hubble: enabled: true pods: 100.96.0.0/11 nodes: 10.250.0.0/16 services: 100.64.0.0/13 ... If you would like to see a provider specific shoot example, please check out the documentation of the well-known extensions. A list of them can be found here.\n","categories":"","description":"","excerpt":"Using the Networking Cilium extension with Gardener as end-user The …","ref":"/docs/extensions/network-extensions/gardener-extension-networking-cilium/usage/","tags":"","title":"Usage"},{"body":"Using the CoreOS extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that must be considered when this OS extension is used.\nIn this document we describe what this configuration looks like and under which circumstances your attention may be required.\nAWS VPC settings for CoreOS workers Gardener allows you to create CoreOS based worker nodes by:\n Using a Gardener managed VPC Reusing a VPC that already exists (VPC id specified in InfrastructureConfig) If the second option applies to your use-case please make sure that your VPC has enabled DNS Support. 
Otherwise CoreOS based nodes aren’t able to join or operate in your cluster properly.\nDNS settings (required):\n enableDnsHostnames: true (necessary for collecting node metrics) enableDnsSupport: true ","categories":"","description":"","excerpt":"Using the CoreOS extension with Gardener as end-user The …","ref":"/docs/extensions/os-extensions/gardener-extension-os-coreos/usage/","tags":"","title":"Usage"},{"body":"Using the SuSE CHost extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that must be considered when this OS extension is used.\nIn this document we describe what this configuration looks like and under which circumstances your attention may be required.\nAWS VPC settings for SuSE CHost workers Gardener allows you to create SuSE CHost based worker nodes by:\n Using a Gardener managed VPC Reusing a VPC that already exists (VPC id specified in InfrastructureConfig) If the second option applies to your use-case please make sure that your VPC has enabled DNS Support. Otherwise SuSE CHost based nodes aren’t able to join or operate in your cluster properly.\nDNS settings (required):\n enableDnsHostnames: true enableDnsSupport: true Support for vSMP MemoryOne This extension controller is also capable of generating user-data for the vSMP MemoryOne operating system in conjunction with SuSE CHost. It reacts to the memoryone-chost extension type. Additionally, it allows certain customizations with the following configuration:\napiVersion: memoryone-chost.os.extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfiguration memoryTopology: \"3\" systemMemory: \"7x\" The memoryTopology field controls the mem_topology setting. If it’s not provided then it will default to 2. The systemMemory field controls the system_memory setting. If it’s not provided then it defaults to 6x. Please note that it was only e2e-tested on AWS. 
Additionally, you need a snapshot ID of a SuSE CHost/CHost volume (see below how to create it).\nAn exemplary worker pool configuration inside a Shoot resource for the vSMP MemoryOne operating system would look as follows:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: vsmp-memoryone namespace: garden-foo spec: ... workers: - name: cpu-worker3 minimum: 1 maximum: 1 maxSurge: 1 maxUnavailable: 0 machine: image: name: memoryone-chost version: 9.5.195 providerConfig: apiVersion: memoryone-chost.os.extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfiguration memoryTopology: \"2\" systemMemory: \"6x\" type: c5d.metal volume: size: 20Gi type: gp2 dataVolumes: - name: chost size: 50Gi type: gp2 providerConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig dataVolumes: - name: chost snapshotID: snap-123456 zones: - eu-central-1b Please note that vSMP MemoryOne only works for EC2 bare-metal instance types such as M5d, R5, C5, C5d, etc. - please consult the EC2 instance types overview page and the documentation of vSMP MemoryOne to find out whether the instance type in question is eligible.\nGenerating an AWS snapshot ID for the CHost/CHost operating system The following script will help to generate the snapshot ID on AWS. It runs in the region that is selected in your $HOME/.aws/config file. Consequently, if you want to generate the snapshot in multiple regions, you have to run it multiple times after configuring the respective region using aws configure.\nami=\"ami-1234\" #Replace the ami with the intended one. name=`aws ec2 describe-images --image-ids $ami --query=\"Images[].Name\" --output=text` cur=`aws ec2 describe-snapshots --filter=\"Name=description,Values=snap-$name\" --query=\"Snapshots[].Description\" --output=text` if [ -n \"$cur\" ]; then echo \"AMI $name exists as snapshot $cur\" exit 0 fi echo \"AMI $name... 
creating private snapshot\" inst=`aws ec2 run-instances --instance-type t3.nano --image-id $ami --query 'Instances[0].InstanceId' --output=text --subnet-id subnet-1234 --tag-specifications 'ResourceType=instance,Tags=[{Key=scalemp-test,Value=scalemp-test}]'` #Replace the subnet-id with the intended one. aws ec2 wait instance-running --instance-ids $inst vol=`aws ec2 describe-instances --instance-ids $inst --query \"Reservations[].Instances[].BlockDeviceMappings[0].Ebs.VolumeId\" --output=text` snap=`aws ec2 create-snapshot --description \"snap-$name\" --volume-id $vol --query='SnapshotId' --tag-specifications \"ResourceType=snapshot,Tags=[{Key=Name,Value=\\\"$name\\\"}]\" --output=text` aws ec2 wait snapshot-completed --snapshot-ids $snap aws ec2 terminate-instances --instance-id $inst \u003e /dev/null echo $snap ","categories":"","description":"","excerpt":"Using the SuSE CHost extension with Gardener as end-user The …","ref":"/docs/extensions/os-extensions/gardener-extension-os-suse-chost/usage/","tags":"","title":"Usage"},{"body":"Using the Ubuntu extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that must be considered when this OS extension is used.\nIn this document we describe what this configuration looks like and under which circumstances your attention may be required.\nAWS VPC settings for Ubuntu workers Gardener allows you to create Ubuntu based worker nodes by:\n Using a Gardener managed VPC Reusing a VPC that already exists (VPC id specified in InfrastructureConfig) If the second option applies to your use-case please make sure that your VPC has enabled DNS Support. 
Otherwise Ubuntu based nodes aren’t able to join or operate in your cluster properly.\nDNS settings (required):\n enableDnsHostnames: true enableDnsSupport: true ","categories":"","description":"","excerpt":"Using the Ubuntu extension with Gardener as end-user The …","ref":"/docs/extensions/os-extensions/gardener-extension-os-ubuntu/usage/","tags":"","title":"Usage"},{"body":"Disclaimer This post is meant to give a basic end-to-end description for deploying and using Prometheus and Grafana. Both applications offer a wide range of flexibility, which needs to be considered in case you have specific requirements. Such advanced details are not in the scope of this topic.\nIntroduction Prometheus is an open-source systems monitoring and alerting toolkit for recording numeric time series. It fits both machine-centric monitoring as well as monitoring of highly dynamic service-oriented architectures. In a world of microservices, its support for multi-dimensional data collection and querying is a particular strength.\nPrometheus is the second hosted project to graduate within CNCF.\nThe following characteristics make Prometheus a good match for monitoring Kubernetes clusters:\n Pull-based Monitoring Prometheus is a pull-based monitoring system, which means that the Prometheus server dynamically discovers and pulls metrics from your services running in Kubernetes.\n Labels Prometheus and Kubernetes share the same label (key-value) concept that can be used to select objects in the system.\nLabels are used to identify time series and sets of label matchers can be used in the query language (PromQL) to select the time series to be aggregated.\n Exporters\nThere are many exporters available, which enable integration of databases or even other monitoring systems not already providing a way to export metrics to Prometheus. 
One prominent exporter is the so-called node-exporter, which lets you monitor hardware and OS related metrics of Unix systems.\n Powerful Query Language The Prometheus query language PromQL lets the user select and aggregate time series data in real time. Results can either be shown as a graph, viewed as tabular data in the Prometheus expression browser, or consumed by external systems via the HTTP API.\n Find query examples on Prometheus Query Examples.\nOne very popular open-source visualization tool, not only for Prometheus, is Grafana. Grafana is a metric analytics and visualization suite. It is popular for visualizing time series data for infrastructure and application analytics but many use it in other domains including industrial sensors, home automation, weather, and process control. For more information, see the Grafana Documentation.\nGrafana accesses data via Data Sources. The continuously growing list of supported backends includes Prometheus.\nDashboards are created by combining panels, e.g., Graph and Dashlist.\nIn this example, we describe an End-To-End scenario including the deployment of Prometheus and a basic monitoring configuration such as the one provided for Kubernetes clusters created by Gardener.\nIf you miss elements on the Prometheus web page when accessing it via its service URL https://\u003cyour K8s FQN\u003e/api/v1/namespaces/\u003cyour-prometheus-namespace\u003e/services/prometheus-prometheus-server:80/proxy, this is probably caused by a Prometheus issue - #1583. To work around this issue, set up a port forward kubectl port-forward -n \u003cyour-prometheus-namespace\u003e \u003cprometheus-pod\u003e 9090:9090 on your client and access the Prometheus UI from there with your locally installed web browser. 
This issue is not relevant in case you use the service type LoadBalancer.\nPreparation The deployment of Prometheus and Grafana is based on Helm charts.\nMake sure to implement the Helm settings before deploying the Helm charts.\nThe Kubernetes clusters provided by Gardener use role based access control (RBAC). To authorize the Prometheus node-exporter to access hardware and OS relevant metrics of your cluster’s worker nodes, specific artifacts need to be deployed.\nBind the Prometheus service account to the garden.sapcloud.io:monitoring:prometheus cluster role by running the command kubectl apply -f crbinding.yaml.\nContent of crbinding.yaml\napiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: \u003cyour-prometheus-name\u003e-server roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: garden.sapcloud.io:monitoring:prometheus subjects: - kind: ServiceAccount name: \u003cyour-prometheus-name\u003e-server namespace: \u003cyour-prometheus-namespace\u003e Deployment of Prometheus and Grafana Only minor changes are needed to deploy Prometheus and Grafana based on Helm charts.\nCopy the following configuration into a file called values.yaml and deploy Prometheus: helm install \u003cyour-prometheus-name\u003e --namespace \u003cyour-prometheus-namespace\u003e stable/prometheus -f values.yaml\nTypically, Prometheus and Grafana are deployed into the same namespace. 
There is no technical reason behind this, so feel free to choose different namespaces.\nContent of values.yaml for Prometheus:\nrbac: create: false # Already created in Preparation step nodeExporter: enabled: false # The node-exporter is already deployed by default server: global: scrape_interval: 30s scrape_timeout: 30s serverFiles: prometheus.yml: rule_files: - /etc/config/rules - /etc/config/alerts scrape_configs: - job_name: 'kube-kubelet' honor_labels: false scheme: https tls_config: # This is needed because the kubelets' certificates are not generated # for a specific pod IP insecure_skip_verify: true bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token kubernetes_sd_configs: - role: node relabel_configs: - target_label: __metrics_path__ replacement: /metrics - source_labels: [__meta_kubernetes_node_address_InternalIP] target_label: instance - action: labelmap regex: __meta_kubernetes_node_label_(.+) - job_name: 'kube-kubelet-cadvisor' honor_labels: false scheme: https tls_config: # This is needed because the kubelets' certificates are not generated # for a specific pod IP insecure_skip_verify: true bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token kubernetes_sd_configs: - role: node relabel_configs: - target_label: __metrics_path__ replacement: /metrics/cadvisor - source_labels: [__meta_kubernetes_node_address_InternalIP] target_label: instance - action: labelmap regex: __meta_kubernetes_node_label_(.+) # Example scrape config for probing services via the Blackbox Exporter. 
# # Relabelling allows to configure the actual service scrape endpoint using the following annotations: # # * `prometheus.io/probe`: Only probe services that have a value of `true` - job_name: 'kubernetes-services' metrics_path: /probe params: module: [http_2xx] kubernetes_sd_configs: - role: service relabel_configs: - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe] action: keep regex: true - source_labels: [__address__] target_label: __param_target - target_label: __address__ replacement: blackbox - source_labels: [__param_target] target_label: instance - action: labelmap regex: __meta_kubernetes_service_label_(.+) - source_labels: [__meta_kubernetes_namespace] target_label: kubernetes_namespace - source_labels: [__meta_kubernetes_service_name] target_label: kubernetes_name # Example scrape config for pods # # Relabelling allows to configure the actual service scrape endpoint using the following annotations: # # * `prometheus.io/scrape`: Only scrape pods that have a value of `true` # * `prometheus.io/path`: If the metrics path is not `/metrics` override this. # * `prometheus.io/port`: Scrape the pod on the indicated port instead of the default of `9102`. - job_name: 'kubernetes-pods' kubernetes_sd_configs: - role: pod relabel_configs: - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape] action: keep regex: true - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path] action: replace target_label: __metrics_path__ regex: (.+) - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port] action: replace regex: (.+):(?:\\d+);(\\d+) replacement: ${1}:${2} target_label: __address__ - action: labelmap regex: __meta_kubernetes_pod_label_(.+) - source_labels: [__meta_kubernetes_namespace] action: replace target_label: kubernetes_namespace - source_labels: [__meta_kubernetes_pod_name] action: replace target_label: kubernetes_pod_name # Scrape config for service endpoints. 
# # The relabeling allows the actual service scrape endpoint to be configured # via the following annotations: # # * `prometheus.io/scrape`: Only scrape services that have a value of `true` # * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need # to set this to `https` \u0026 most likely set the `tls_config` of the scrape config. # * `prometheus.io/path`: If the metrics path is not `/metrics` override this. # * `prometheus.io/port`: If the metrics are exposed on a different port to the # service then set this appropriately. - job_name: 'kubernetes-service-endpoints' kubernetes_sd_configs: - role: endpoints relabel_configs: - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape] action: keep regex: true - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme] action: replace target_label: __scheme__ regex: (https?) - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path] action: replace target_label: __metrics_path__ regex: (.+) - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port] action: replace target_label: __address__ regex: (.+)(?::\\d+);(\\d+) replacement: $1:$2 - action: labelmap regex: __meta_kubernetes_service_label_(.+) - source_labels: [__meta_kubernetes_namespace] action: replace target_label: kubernetes_namespace - source_labels: [__meta_kubernetes_service_name] action: replace target_label: kubernetes_name # Add your additional configuration here... Next, deploy Grafana. Since the deployment in this post is based on the Helm default values, the settings below are set explicitly in case the default changed.\nDeploy Grafana via helm install grafana --namespace \u003cyour-prometheus-namespace\u003e stable/grafana -f values.yaml. 
Here, the same namespace is chosen for Prometheus and for Grafana.\nContent of values.yaml for Grafana:\nserver: ingress: enabled: false service: type: ClusterIP Check the running state of the pods on the Kubernetes Dashboard or by running kubectl get pods -n \u003cyour-prometheus-namespace\u003e. In case of errors, check the log files of the pod(s) in question.\nThe text output of Helm after the deployment of Prometheus and Grafana contains very useful information, e.g., the user and password of the Grafana Admin user. The credentials are stored as secrets in the namespace \u003cyour-prometheus-namespace\u003e and can be decoded via kubectl get secret --namespace \u003cyour-prometheus-namespace\u003e grafana -o jsonpath=\"{.data.admin-password}\" | base64 --decode ; echo.\nBasic Functional Tests To access the web UI of both applications, use port forwarding of port 9090.\nSet up port forwarding for port 9090:\nkubectl port-forward -n \u003cyour-prometheus-namespace\u003e \u003cyour-prometheus-server-pod\u003e 9090:9090 Open http://localhost:9090 in your web browser. Select Graph from the top tab and enter the following expression to show the overall CPU usage for a server (see Prometheus Query Examples):\n100 * (1 - avg by(instance)(irate(node_cpu{mode='idle'}[5m]))) This should show some data in a graph.\nTo show the same data in Grafana, set up port forwarding for port 3000 for the Grafana pod and open the Grafana Web UI by opening http://localhost:3000 in a browser. Enter the credentials of the admin user.\nNext, you need to enter the server name of your Prometheus deployment. This name is shown directly after the installation via helm.\nRun\nhelm status \u003cyour-prometheus-name\u003e to find this name. 
Below, this server name is referenced by \u003cyour-prometheus-server-name\u003e.\nFirst, you need to add your Prometheus server as data source:\n Navigate to Dashboards → Data Sources Choose Add data source Enter: Name: \u003cyour-prometheus-datasource-name\u003e\nType: Prometheus\nURL: http://\u003cyour-prometheus-server-name\u003e\nAccess: proxy Choose Save \u0026 Test In case of failure, check the Prometheus URL in the Kubernetes Dashboard.\nTo add a Graph follow these steps:\n In the left corner, select Dashboards → New to create a new dashboard Select Graph to create a new graph Next, select the Panel Title → Edit Select your Prometheus Data Source in the drop down list Enter the expression 100 * (1 - avg by(instance)(irate(node_cpu{mode='idle'}[5m]))) in the entry field A Select the floppy disk symbol (Save) on top Now you should have a very basic Prometheus and Grafana setup for your Kubernetes cluster.\nAs a next step you can implement monitoring for your applications by implementing the Prometheus client API.\nRelated Links Prometheus Prometheus Helm Chart Grafana Grafana Helm Chart ","categories":"","description":"How to deploy and configure Prometheus and Grafana to collect and monitor kubelet container metrics","excerpt":"How to deploy and configure Prometheus and Grafana to collect and …","ref":"/docs/guides/applications/prometheus/","tags":"","title":"Using Prometheus and Grafana to Monitor K8s"},{"body":"Using the Dashboard Terminal The dashboard features an integrated web-based terminal to your clusters. It allows you to use kubectl without the need to supply kubeconfig. There are several ways to access it and they’re described on this page.\nPrerequisites You are logged on to the Gardener Dashboard. You have created a cluster and its status is operational. 
The landscape administrator has enabled the terminal feature The cluster you want to connect to is reachable from the dashboard On this page:\n Open from cluster list Open from cluster details page Terminal Open from cluster list Choose your project from the menu on the left and choose CLUSTERS.\n Locate a cluster for which you want to open a Terminal and choose the key icon.\n In the dialog, choose the icon on the right of the Terminal label.\n Open from cluster details page Choose your project from the menu on the left and choose CLUSTERS.\n Locate a cluster for which you want to open a Terminal and choose to display its details.\n In the Access section, choose the icon on the right of the Terminal label.\n Terminal Opening up the terminal in either of the ways discussed here results in the following screen:\nIt provides a bash environment and range of useful tools and an installed and configured kubectl (with alias k) to use right away with your cluster.\nTry to list the namespaces in the cluster.\n$ k get ns You get a result like this: ","categories":"","description":"","excerpt":"Using the Dashboard Terminal The dashboard features an integrated …","ref":"/docs/dashboard/using-terminal/","tags":"","title":"Using Terminal"},{"body":"Version Skew Policy This document describes the maximum version skew supported between various Gardener components.\nSupported Gardener Versions Gardener versions are expressed as x.y.z, where x is the major version, y is the minor version, and z is the patch version, following Semantic Versioning terminology.\nThe Gardener project maintains release branches for the three most recent minor releases.\nApplicable fixes, including security fixes, may be backported to those three release branches, depending on severity and feasibility. 
Patch releases are cut from those branches at a regular cadence, plus additional urgent releases when required.\nFor more information, see the Releases document.\nSupported Version Skew Technically, we follow the same policy as the Kubernetes project. However, given that our release cadence is much more frequent compared to Kubernetes (every 14d vs. every 120d), in many cases it might be possible to skip versions, though we do not test these upgrade paths. Consequently, in general it might not work, and to be on the safe side, it is highly recommended to follow the described policy.\n🚨 Note that downgrading Gardener versions is generally not tested during development and should be considered unsupported.\ngardener-apiserver In multi-instance setups of Gardener, the newest and oldest gardener-apiserver instances must be within one minor version.\nExample:\n newest gardener-apiserver is at 1.37 other gardener-apiserver instances are supported at 1.37 and 1.36 gardener-controller-manager, gardener-scheduler, gardener-admission-controller gardener-controller-manager, gardener-scheduler, and gardener-admission-controller must not be newer than the gardener-apiserver instances they communicate with. 
They are expected to match the gardener-apiserver minor version, but may be up to one minor version older (to allow live upgrades).\nExample:\n gardener-apiserver is at 1.37 gardener-controller-manager, gardener-scheduler, and gardener-admission-controller are supported at 1.37 and 1.36 gardenlet gardenlet must not be newer than gardener-apiserver gardenlet may be up to two minor versions older than gardener-apiserver Example:\n gardener-apiserver is at 1.37 gardenlet is supported at 1.37, 1.36, and 1.35 gardener-operator Since gardener-operator manages the Gardener control plane components (gardener-apiserver, gardener-controller-manager, gardener-scheduler, gardener-admission-controller), it follows the same policy as for gardener-apiserver.\nIt implements additional start-up checks to ensure adherence to this policy. Concretely, gardener-operator will crash when\n it gets downgraded. its version gets upgraded and skips at least one minor version. Supported Component Upgrade Order The supported version skew between components has implications on the order in which components must be upgraded. This section describes the order in which components must be upgraded to transition an existing Gardener installation from version 1.37 to version 1.38.\ngardener-apiserver Prerequisites:\n In a single-instance setup, the existing gardener-apiserver instance is 1.37. In a multi-instance setup, all gardener-apiserver instances are at 1.37 or 1.38 (this ensures maximum skew of 1 minor version between the oldest and newest gardener-apiserver instance). The gardener-controller-manager, gardener-scheduler, and gardener-admission-controller instances that communicate with this gardener-apiserver are at version 1.37 (this ensures they are not newer than the existing API server version and are within 1 minor version of the new API server version). 
gardenlet instances on all seeds are at version 1.37 or 1.36 (this ensures they are not newer than the existing API server version and are within 2 minor versions of the new API server version). Actions:\n Upgrade gardener-apiserver to 1.38. gardener-controller-manager, gardener-scheduler, gardener-admission-controller Prerequisites:\n The gardener-apiserver instances these components communicate with are at 1.38 (in multi-instance setups in which these components can communicate with any gardener-apiserver instance in the cluster, all gardener-apiserver instances must be upgraded before upgrading these components). Actions:\n Upgrade gardener-controller-manager, gardener-scheduler, and gardener-admission-controller to 1.38 gardenlet Prerequisites:\n The gardener-apiserver instances the gardenlet communicates with are at 1.38. Actions:\n Optionally upgrade gardenlet instances to 1.38 (or they can be left at 1.37 or 1.36). [!WARNING] Running a landscape with gardenlet instances that are persistently two minor versions behind gardener-apiserver means they must be upgraded before the Gardener control plane can be upgraded.\n gardener-operator Prerequisites:\n All gardener-operator instances are at 1.37. Actions:\n Upgrade gardener-operator to 1.38. Supported Gardener Extension Versions Extensions are maintained and released separately and independently of the gardener/gardener repository. Consequently, providing version constraints is not possible in this document. Sometimes, the documentation of extensions contains compatibility information (e.g., “this extension version is only compatible with Gardener versions higher than 1.80”, see this example).\nHowever, since all extensions typically make use of the extensions library (example), a general constraint is that no extension must depend on a version of the extensions library higher than the version of gardenlet.\nExample 1:\n gardener-apiserver and other Gardener control plane components are at 1.37. 
All gardenlets are at 1.37. Only extensions are supported which depend on 1.37 or lower of the extensions library. Example 2:\n gardener-apiserver and other Gardener control plane components are at 1.37. Some gardenlets are at 1.37, others are at 1.36. Only extensions are supported which depend on 1.36 or lower of the extensions library. Supported Kubernetes Versions Please refer to Supported Kubernetes Versions.\n","categories":"","description":"","excerpt":"Version Skew Policy This document describes the maximum version skew …","ref":"/docs/gardener/deployment/version_skew_policy/","tags":"","title":"Version Skew Policy"},{"body":"Webhooks The etcd-druid controller-manager registers certain admission webhooks that allow for validation or mutation of requests on resources in the cluster, in order to prevent misconfiguration and restrict access to the etcd cluster resources.\nAll webhooks that are a part of etcd-druid reside in package internal/webhook, as sub-packages.\nPackage Structure The typical package structure for the webhooks that are part of etcd-druid is shown with the EtcdComponents Webhook:\ninternal/webhook/etcdcomponents ├── config.go ├── handler.go └── register.go config.go: contains all the logic for the configuration of the webhook, including feature gate activations, CLI flag parsing and validations. register.go: contains the logic for registering the webhook with the etcd-druid controller manager. handler.go: contains the webhook admission handler logic. Each webhook package may also contain auxiliary files which are relevant to that specific webhook.\nEtcd Components Webhook Druid controller-manager registers and runs the etcd controller, which creates and manages various components/resources such as Leases, ConfigMaps, and the Statefulset for the etcd cluster. 
It is essential for all these resources to contain correct configuration for the proper functioning of the etcd cluster.\nUnintended changes to any of these managed resources can lead to misconfiguration of the etcd cluster, leading to unwanted downtime for etcd traffic. To prevent such unintended changes, a validating webhook called EtcdComponents Webhook guards these managed resources, ensuring that only authorized entities can perform operations on these managed resources.\nEtcdComponents webhook prevents UPDATE and DELETE operations on all resources managed by etcd controller, unless such an operation is performed by druid itself, and during reconciliation of the Etcd resource. Operations are also allowed if performed by one of the authorized entities specified by CLI flag --etcd-components-webhook-exempt-service-accounts, but only if the Etcd resource is not being reconciled by etcd-druid at that time.\nThere may be specific cases where a human operator may need to make changes to the managed resources, possibly to test or fix an etcd cluster. An example of this is recovery from permanent quorum loss, where a human operator will need to suspend reconciliation of the Etcd resource, make changes to the underlying managed resources such as StatefulSet and ConfigMap, and then resume reconciliation for the Etcd resource. Such manual interventions will require out-of-band changes to the managed resources. Protection of managed resources for such Etcd resources can be turned off by adding an annotation druid.gardener.cloud/disable-etcd-component-protection on the Etcd resource. 
This will effectively disable EtcdComponents Webhook protection for all managed resources for the specific Etcd.\nNote: UPDATE operations for Leases by etcd members are always allowed, since these are regularly updated by the etcd-backup-restore sidecar.\nThe Etcd Components Webhook is disabled by default, and can be enabled via the CLI flag --enable-etcd-components-webhook.\n","categories":"","description":"","excerpt":"Webhooks The etcd-druid controller-manager registers certain admission …","ref":"/docs/other-components/etcd-druid/concepts/webhooks/","tags":"","title":"Webhooks"},{"body":"Webterminals Architecture Overview Motivation We want to give garden operators and “regular” users of the Gardener dashboard an easy way to have a preconfigured shell directly in the browser.\nThis has several advantages:\n no need to set up any tools locally no need to download / store kubeconfigs locally Each terminal session will have its own “access” service account created. This makes it easier to see “who” did “what” when using the web terminals. The “access” service account is deleted when the terminal session expires Easy “privileged” access to a node (privileged container, hostPID, and hostNetwork enabled, mounted host root fs) for troubleshooting a node, if allowed by PSP. How it’s done - TL;DR On the host cluster, we schedule a pod to which the dashboard frontend client attaches (similar to kubectl attach). Usually the ops-toolbelt image is used, containing all relevant tools like kubectl. 
The Pod has a kubeconfig secret mounted with the necessary privileges for the target cluster - usually cluster-admin.\nTarget types There are currently three targets to which a user can open a terminal session:\n The (virtual) garden cluster - Currently operator only The shoot cluster The control plane of the shoot cluster - operator only Host Several factors determine where the host cluster (and namespace) is chosen by the dashboard:\n Depending on the selected target and the role of the user (operator or “regular” user), the host is chosen. For performance / low latency reasons, we want to place the “terminal” pods as near as possible to the target kube-apiserver. For example, the user wants to have a terminal for a shoot cluster. The kube-apiserver of the shoot is running in the seed-shoot-ns on the seed.\n If the user is an operator, we place the “terminal” pod directly in the seed-shoot-ns on the seed. However, if the user is a “regular” user, we don’t want to have “untrusted” workload scheduled on the seeds. That’s why the “terminal” pod is scheduled on the shoot itself, in a temporary namespace that is deleted afterwards. Lifecycle of a Web Terminal Session 1. Browser / Dashboard Frontend - Open Terminal The user chooses the target and clicks the Open terminal button in the browser. A POST request is made to the dashboard backend to request a new terminal session.\n2. Dashboard Backend - Create Terminal Resource According to the privileges of the user (operator / enduser) and the selected target, the dashboard backend creates a terminal resource on behalf of the user in the (virtual) garden and responds with a handle to the terminal session.\n3. Browser / Dashboard Frontend The frontend makes another POST request to the dashboard backend to fetch the terminal session. The Backend waits until the terminal resource is in a “ready” state (timeout 10s) before sending a response to the frontend. More on that later.\n4. 
Terminal Resource The terminal resource, among other things, holds the information of the desired host and target cluster. The credentials to these clusters are declared as references (secretRef / serviceAccountRef). The terminal resource itself doesn’t contain sensitive information.\n5. Admission A validating webhook is in place to ensure that the user, that created the terminal resource, has the permission to read the referenced credentials. There is also a mutating webhook in place. Both admission configurations have failurePolicy: Fail.\n6. Terminal-Controller-Manager - Apply Resources on Host \u0026 Target Cluster Sidenote: The terminal-controller-manager has no knowledge about the gardener, its shoots, and seeds. In that sense it can be considered as independent from the gardener.\nThe terminal-controller-manager watches terminal resources and ensures the desired state on the host and target cluster. The terminal-controller-manager needs the permission to read all secrets / service accounts in the virtual garden. As additional safety net, the terminal-controller-manager ensures that the terminal resource was not created before the admission configurations were created.\nThe terminal-controller-manager then creates the necessary resources in the host and target cluster.\n Target Cluster: “Access” service account + (cluster)rolebinding usually to cluster-admin cluster role used from within the “terminal” pod Host Cluster: “Attach” service Account + rolebinding to “attach” cluster role (privilege to attach and get pod) will be used by the browser to attach to the pod Kubeconfig secret, containing the “access” token from the target cluster The “terminal” pod itself, having the kubeconfig secret mounted 7. Dashboard Backend - Responds to Frontend As mentioned in step 3, the dashboard backend waits until the terminal resource is “ready”. It then reads the “attach” token from the host cluster on behalf of the user. 
It responds with:\n attach token hostname of the host cluster’s api server name of the pod and namespace 8. Browser / Dashboard Frontend - Attach to Pod Dashboard frontend attaches to the pod located on the host cluster by opening a WebSocket connection using the provided parameter and credentials. As long as the terminal window is open, the dashboard regularly annotates the terminal resource (heartbeat) to keep it alive.\n9. Terminal-Controller-Manager - Cleanup When there is no heartbeat on the terminal resource for a certain amount of time (default is 5m) the created resources in the host and target cluster are cleaned up again and the terminal resource will be deleted.\nBrowser Trusted Certificates for Kube-Apiservers When the dashboard frontend opens a secure WebSocket connection to the kube-apiserver, the certificate presented by the kube-apiserver must be browser trusted. Otherwise, the connection can’t be established due to browser policy. Most kube-apiservers have self-signed certificates from a custom Root CA.\nThe Gardener project now handles the responsibility of exposing the kube-apiservers with browser trusted certificates for Seeds (gardener/gardener#7764) and Shoots (gardener/gardener#7712). For this to work, a Secret must exist in the garden namespace of the Seed cluster. This Secret should have a label gardener.cloud/role=controlplane-cert. The Secret is expected to contain the wildcard certificate for Seeds ingress domain.\nAllowlist for Hosts Motivation When a user starts a terminal session, the dashboard frontend establishes a secure WebSocket connection to the corresponding kube-apiserver. This connection is controlled by the connectSrc directive of the content security policy, which governs the hosts that the browser can connect to.\nBy default, the connectSrc directive only permits connections to the same host. However, to enable the webterminal feature to function properly, connections to additional trusted hosts are required. 
This is where the allowedHostSourceList configuration becomes relevant. It directly impacts the connectSrc directive by specifying the hostnames that the browser is allowed to connect to during a terminal session. By defining this list, you can extend the range of terminal connections to include the necessary trusted hosts, while still preventing any unauthorized or potentially harmful connections.\nConfiguration The allowedHostSourceList can be configured within the global.terminal section of the gardener-dashboard Helm values.yaml file. The list should consist of permitted hostnames (without the scheme) for terminal connections.\nIt is important to consider that the usage of wildcards follows the rules defined by the content security policy.\nHere is an example of how to configure the allowedHostSourceList:\nglobal: terminal: allowedHostSourceList: - \"*.seed.example.com\" In this example, any host under the seed.example.com domain is allowed for terminal connections.\n","categories":"","description":"","excerpt":"Webterminals Architecture Overview Motivation We want to give garden …","ref":"/docs/dashboard/webterminals/","tags":"","title":"Webterminals"},{"body":"Weeder Overview Weeder watches for an update to service endpoints and on receiving such an event it will create a time-bound watch for all configured dependent pods that need to be actively recovered in case they have not yet recovered from CrashLoopBackoff state. In a nutshell it accelerates recovery of pods when an upstream service recovers.\nAn interference in automatic recovery for dependent pods is required because kubernetes pod restarts a container with an exponential backoff when the pod is in CrashLoopBackOff state. This backoff could become quite large if the service stays down for long. 
The presence of weeder prevents that, as it’ll restart the pod.\nPrerequisites Before we understand how Weeder works, we need to be familiar with kubernetes services \u0026 endpoints.\n NOTE: If a kubernetes service is created with selectors then kubernetes will create a corresponding endpoint resource which will have the same name as that of the service. In the weeder implementation, the service and endpoint names are used interchangeably.\n Config Weeder can be configured via command line arguments and a weeder configuration. See configure weeder.\nInternals Weeder keeps a watch on the events for the specified endpoints in the config. For every endpoints resource, a list of podSelectors can be specified. It creates a weeder object per endpoints resource when it receives a satisfactory Create or Update event. Then for every podSelector it creates a goroutine. This goroutine keeps a watch on the pods with labels as per the podSelector and kills any pod that enters CrashLoopBackOff. Each weeder lives for the watchDuration interval, which has a default value of 5 mins if not explicitly set.\nTo understand the actions taken by the weeder, let’s use the following diagram as a reference. Let us also assume the following configuration for the weeder:\nwatchDuration: 2m0s servicesAndDependantSelectors: etcd-main-client: # name of the service/endpoint for etcd statefulset that weeder will receive events for. podSelectors: # all pods matching the label selector are direct dependencies for etcd service - matchExpressions: - key: gardener.cloud/role operator: In values: - controlplane - key: role operator: In values: - apiserver kube-apiserver: # name of the service/endpoint for kube-api-server pods that weeder will receive events for. 
podSelectors: # all pods matching the label selector are direct dependencies for kube-api-server service - matchExpressions: - key: gardener.cloud/role operator: In values: - controlplane - key: role operator: NotIn values: - main - apiserver Only for the sake of demonstration, let’s pick the first service -\u003e dependent pods tuple (etcd-main-client as the service endpoint).\n Assume that there are 3 replicas for etcd statefulset. Time here is just for showing the series of events t=0 -\u003e all etcd pods go down t=10 -\u003e kube-api-server pods transition to CrashLoopBackOff t=100 -\u003e all etcd pods recover together t=101 -\u003e Weeder sees Update event for etcd-main-client endpoints resource t=102 -\u003e goroutine created to keep watch on kube-api-server pods t=103 -\u003e Since kube-api-server pods are still in CrashLoopBackOff, weeder deletes the pods to accelerate the recovery. t=104 -\u003e new kube-api-server pod created by replica-set controller in kube-controller-manager Points to Note Weeder only responds to Update events where a notReady endpoints resource turns Ready. That’s why there was no weeder action at time t=10 in the example above. notReady -\u003e no backing pod is Ready Ready -\u003e at least one backing pod is Ready Weeder doesn’t respond to Delete events Weeder will always wait for the entire watchDuration. If the dependent pods transition to CrashLoopBackOff only after the watch duration has elapsed, or if they do not recover even after repeated deletion, weeder will exit. Quality of service offered via a weeder is only Best-Effort. ","categories":"","description":"","excerpt":"Weeder Overview Weeder watches for an update to service endpoints and …","ref":"/docs/other-components/dependency-watchdog/concepts/weeder/","tags":"","title":"Weeder"},{"body":"Can you adapt a DNS configuration to be used by the workload on the cluster (CoreDNS configuration)? Yes, you can. 
Information on that can be found in Custom DNS Configuration.\nHow to use custom domain names using a DNS provider? Creating custom domain names for the Gardener infrastructure DNS records using DNSRecords resources With DNSRecords, the internal and external domain names of the kube-apiserver are set, as well as the deprecated ingress domain name and an “owner” DNS record for the owning seed.\nFor this purpose, you need either a provider extension supporting the needed resource kind DNSRecord/\u003cprovider-type\u003e or a special extension.\nAll main providers support their respective IaaS-specific DNS servers:\n AWS =\u003e DNSRecord/aws-route53 GCP =\u003e DNSRecord/google-clouddns Azure =\u003e DNSRecord/azure-dns Openstack =\u003e DNSRecord/openstack-designate AliCloud =\u003e DNSRecord/alicloud-dns For Cloudflare, a community extension exists.\nFor other providers like Netlify and Infoblox, there is currently no known supporting extension; however, they are supported by shoot-dns-service.\nCreating domain names for cluster resources like ingresses or services of type LoadBalancer, and for TLS certificates For this purpose, the shoot-dns-service extension is used (DNSProvider and DNSEntry resources).\nYou can read more on it in these documents:\n Deployment of the Shoot DNS Service Extension Request DNS Names in Shoot Clusters DNS Providers Gardener DNS Management for Shoots Request X.509 Certificates Gardener Certificate Management ","categories":"","description":"","excerpt":"Can you adapt a DNS configuration to be used by the workload on the …","ref":"/docs/faq/dns-config/","tags":"","title":"What are the meanings of different DNS configuration options?"},{"body":"Contract: Worker Resource While the control plane of a shoot cluster is living in the seed and deployed as native Kubernetes workload, the worker nodes of the shoot clusters are normal virtual machines (VMs) in the end-user’s infrastructure account. 
The Gardener project features a sub-project called machine-controller-manager. This controller is extending the Kubernetes API using custom resource definitions to represent actual VMs as Machine objects inside a Kubernetes system. This approach unlocks the possibility to manage virtual machines in the Kubernetes style and benefit from all its design principles.\nWhat is the machine-controller-manager doing exactly? Generally, there are provider-specific MachineClass objects (AWSMachineClass, AzureMachineClass, etc.; similar to StorageClass), and MachineDeployment, MachineSet, and Machine objects (similar to Deployment, ReplicaSet, and Pod). A machine class describes where and how to create virtual machines (in which networks, region, availability zone, SSH key, user-data for bootstrapping, etc.), while a Machine results in an actual virtual machine. You can read up more information in the machine-controller-manager’s repository.\nThe gardenlet deploys the machine-controller-manager, hence, provider extensions only have to inject their specific out-of-tree machine-controller-manager sidecar container into the Deployment.\nWhat needs to be implemented to support a new worker provider? 
As part of the shoot flow Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Worker metadata: name: bar namespace: shoot--foo--bar spec: type: azure region: eu-west-1 secretRef: name: cloudprovider namespace: shoot--foo--bar infrastructureProviderStatus: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus ec2: keyName: shoot--foo--bar-ssh-publickey iam: instanceProfiles: - name: shoot--foo--bar-nodes purpose: nodes roles: - arn: arn:aws:iam::0123456789:role/shoot--foo--bar-nodes purpose: nodes vpc: id: vpc-0123456789 securityGroups: - id: sg-1234567890 purpose: nodes subnets: - id: subnet-01234 purpose: nodes zone: eu-west-1b - id: subnet-56789 purpose: public zone: eu-west-1b - id: subnet-0123a purpose: nodes zone: eu-west-1c - id: subnet-5678a purpose: public zone: eu-west-1c pools: - name: cpu-worker minimum: 3 maximum: 5 maxSurge: 1 maxUnavailable: 0 machineType: m4.large machineImage: name: coreos version: 1967.5.0 nodeAgentSecretName: gardener-node-agent-local-ee46034b8269353b nodeTemplate: capacity: cpu: 2 gpu: 0 memory: 8Gi labels: node.kubernetes.io/role: node worker.gardener.cloud/cri-name: containerd worker.gardener.cloud/pool: cpu-worker worker.gardener.cloud/system-components: \"true\" userDataSecretRef: name: user-data-secret key: cloud_config volume: size: 20Gi type: gp2 zones: - eu-west-1b - eu-west-1c machineControllerManager: drainTimeout: 10m healthTimeout: 10m creationTimeout: 10m maxEvictRetries: 30 nodeConditions: - ReadonlyFilesystem - DiskPressure - KernelDeadlock clusterAutoscaler: scaleDownUtilizationThreshold: 0.5 scaleDownGpuUtilizationThreshold: 0.5 scaleDownUnneededTime: 30m scaleDownUnreadyTime: 1h maxNodeProvisionTime: 15m The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used to create the needed virtual machines. 
Also, as you can see, Gardener copies the output of the infrastructure creation (.spec.infrastructureProviderStatus, see Infrastructure resource), into the .spec.\nIn the .spec.pools[] field, the desired worker pools are listed. In the above example, one pool with machine type m4.large and min=3, max=5 machines shall be spread over two availability zones (eu-west-1b, eu-west-1c). This information together with the infrastructure status must be used to determine the proper configuration for the machine classes.\nThe spec.pools[].labels map contains all labels that should be added to all nodes of the corresponding worker pool. Gardener configures kubelet’s --node-labels flag to contain all labels that are mentioned here and allowed by the NodeRestriction admission plugin. This makes sure that kubelet adds all user-specified and gardener-managed labels to the new Node object when registering a new machine with the API server. Nevertheless, this is only effective when bootstrapping new nodes. The provider extension (respectively, machine-controller-manager) is still responsible for updating the labels of existing Nodes when the worker specification changes.\nThe spec.pools[].nodeTemplate.capacity field contains the resource information of the machine like cpu, gpu, and memory. This info is used by Cluster Autoscaler to generate nodeTemplate during scaling the nodeGroup from zero.\nThe spec.pools[].machineControllerManager field allows to configure the settings for machine-controller-manager component. Providers must populate these settings on worker-pool to the related fields in MachineDeployment.\nThe spec.pools[].clusterAutoscaler field contains cluster-autoscaler settings that are to be applied only to specific worker group. cluster-autoscaler expects to find these settings as annotations on the MachineDeployment, and so providers must pass these values to the corresponding MachineDeployment via annotations. 
The keys for these annotations can be found here and the values for the corresponding annotations should be the same as what is passed into the field. Providers can use the helper function extensionsv1alpha1helper.GetMachineDeploymentClusterAutoscalerAnnotations that returns the annotation map to be used.\nThe controller must only inject its provider-specific sidecar container into the machine-controller-manager Deployment managed by gardenlet.\nAfter that, it must compute the desired machine classes and the desired machine deployments. Typically, one class maps to one deployment, and one class/deployment is created per availability zone. Following this convention, the created resource would look like this:\napiVersion: v1 kind: Secret metadata: name: shoot--foo--bar-cpu-worker-z1-3db65 namespace: shoot--foo--bar labels: gardener.cloud/purpose: machineclass type: Opaque data: providerAccessKeyId: eW91ci1hd3MtYWNjZXNzLWtleS1pZAo= providerSecretAccessKey: eW91ci1hd3Mtc2VjcmV0LWFjY2Vzcy1rZXkK userData: c29tZSBkYXRhIHRvIGJvb3RzdHJhcCB0aGUgVk0K --- apiVersion: machine.sapcloud.io/v1alpha1 kind: AWSMachineClass metadata: name: shoot--foo--bar-cpu-worker-z1-3db65 namespace: shoot--foo--bar spec: ami: ami-0123456789 # Your controller must map the stated version to the provider specific machine image information, in the AWS case the AMI. 
blockDevices: - ebs: volumeSize: 20 volumeType: gp2 iam: name: shoot--foo--bar-nodes keyName: shoot--foo--bar-ssh-publickey machineType: m4.large networkInterfaces: - securityGroupIDs: - sg-1234567890 subnetID: subnet-01234 region: eu-west-1 secretRef: name: shoot--foo--bar-cpu-worker-z1-3db65 namespace: shoot--foo--bar tags: kubernetes.io/cluster/shoot--foo--bar: \"1\" kubernetes.io/role/node: \"1\" --- apiVersion: machine.sapcloud.io/v1alpha1 kind: MachineDeployment metadata: name: shoot--foo--bar-cpu-worker-z1 namespace: shoot--foo--bar spec: replicas: 2 selector: matchLabels: name: shoot--foo--bar-cpu-worker-z1 strategy: type: RollingUpdate rollingUpdate: maxSurge: 1 maxUnavailable: 0 template: metadata: labels: name: shoot--foo--bar-cpu-worker-z1 spec: class: kind: AWSMachineClass name: shoot--foo--bar-cpu-worker-z1-3db65 for the first availability zone eu-west-1b, and\napiVersion: v1 kind: Secret metadata: name: shoot--foo--bar-cpu-worker-z2-5z6as namespace: shoot--foo--bar labels: gardener.cloud/purpose: machineclass type: Opaque data: providerAccessKeyId: eW91ci1hd3MtYWNjZXNzLWtleS1pZAo= providerSecretAccessKey: eW91ci1hd3Mtc2VjcmV0LWFjY2Vzcy1rZXkK userData: c29tZSBkYXRhIHRvIGJvb3RzdHJhcCB0aGUgVk0K --- apiVersion: machine.sapcloud.io/v1alpha1 kind: AWSMachineClass metadata: name: shoot--foo--bar-cpu-worker-z2-5z6as namespace: shoot--foo--bar spec: ami: ami-0123456789 # Your controller must map the stated version to the provider specific machine image information, in the AWS case the AMI. 
blockDevices: - ebs: volumeSize: 20 volumeType: gp2 iam: name: shoot--foo--bar-nodes keyName: shoot--foo--bar-ssh-publickey machineType: m4.large networkInterfaces: - securityGroupIDs: - sg-1234567890 subnetID: subnet-0123a region: eu-west-1 secretRef: name: shoot--foo--bar-cpu-worker-z2-5z6as namespace: shoot--foo--bar tags: kubernetes.io/cluster/shoot--foo--bar: \"1\" kubernetes.io/role/node: \"1\" --- apiVersion: machine.sapcloud.io/v1alpha1 kind: MachineDeployment metadata: name: shoot--foo--bar-cpu-worker-z2 namespace: shoot--foo--bar spec: replicas: 1 selector: matchLabels: name: shoot--foo--bar-cpu-worker-z2 strategy: type: RollingUpdate rollingUpdate: maxSurge: 1 maxUnavailable: 0 template: metadata: labels: name: shoot--foo--bar-cpu-worker-z2 spec: class: kind: AWSMachineClass name: shoot--foo--bar-cpu-worker-z2-5z6as for the second availability zone eu-west-1c.\nAnother convention is the 5-letter hash at the end of the machine class names. Most controllers compute a checksum out of the specification of the machine class. Any change to the value of the nodeAgentSecretName field must result in a change of the machine class name. The checksum in the machine class name helps to trigger a rolling update of the worker nodes if, for example, the machine image version changes. In this case, a new checksum will be generated which results in the creation of a new machine class. The MachineDeployment’s machine class reference (.spec.template.spec.class.name) is updated, which triggers the rolling update process in the machine-controller-manager. However, all of this is only a convention that eases writing the controller, but you can do it completely differently if you desire - as long as you make sure that the described behaviours are implemented correctly.\nAfter the machine classes and machine deployments have been created, the machine-controller-manager will start talking to the provider’s IaaS API and create the virtual machines. 
Gardener makes sure that the content of the Secret referenced in the userDataSecretRef field that is used to bootstrap the machines contains the required configuration for installation of the kubelet and registering the VM as worker node in the shoot cluster. The Worker extension controller shall wait until all the created MachineDeployments indicate healthiness/readiness before it ends the control loop.\nDoes Gardener need some information that must be returned back? Another important benefit of the machine-controller-manager’s design principles (extending the Kubernetes API using CRDs) is that the cluster-autoscaler can be used without any provider-specific implementation. We have forked the upstream Kubernetes community’s cluster-autoscaler and extended it so that it understands the machine API. Definitely, we will merge it back into the community’s versions once it has been adapted properly.\nOur cluster-autoscaler only needs to know the minimum and maximum number of replicas per MachineDeployment and is ready to act. It does not need to talk to the provider APIs; it just modifies the .spec.replicas field in the MachineDeployment object. Gardener deploys this autoscaler if there is at least one worker pool that specifies max\u003emin. In order to know how it needs to configure it, the provider-specific Worker extension controller must expose which MachineDeployments it has created and what the min/max numbers should be.\nConsequently, your controller should write this information into the Worker resource’s .status.machineDeployments field. 
It should also update the .status.machineDeploymentsLastUpdateTime field along with .status.machineDeployments, so that gardener is able to deploy Cluster-Autoscaler right after the status is updated with the latest MachineDeployments and does not wait for the reconciliation to be completed:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Worker metadata: name: worker namespace: shoot--foo--bar spec: ... status: lastOperation: ... machineDeployments: - name: shoot--foo--bar-cpu-worker-z1 minimum: 2 maximum: 3 - name: shoot--foo--bar-cpu-worker-z2 minimum: 1 maximum: 2 machineDeploymentsLastUpdateTime: \"2023-05-01T12:44:27Z\" In order to support a new worker provider, you need to write a controller that watches all Workers with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the AWS provider.\nThat sounds like a lot that needs to be done, can you help me? All of the described behaviour is mostly the same for every provider. The only difference is maybe the version/configuration of the provider-specific machine-controller-manager sidecar container, and the machine class specification itself. You can take a look at our extension library, especially the worker controller part where you will find a lot of utilities that you can use. Note that there are also utility functions for getting the default sidecar container specification or corresponding VPA container policy in the machinecontrollermanager package called ProviderSidecarContainer and ProviderSidecarVPAContainerPolicy. Also, using the library you only need to implement your provider specifics - all the things that can be handled generically can be taken for free and do not need to be re-implemented. Take a look at the AWS worker controller for finding an example.\nNon-provider specific information required for worker creation All the providers require further information that is not provider specific but already part of the shoot resource. 
One example of such information is whether the shoot is hibernated or not. In this case, all the virtual machines should be deleted/terminated, and after that the machine-controller-manager should be scaled down. You can take a look at the AWS worker controller to see how it reads this information and how it is used. As Gardener cannot know which information is required by providers, it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information that is not part of the Worker resource itself.\nReferences and Additional Resources Worker API (Golang Specification) Extension Controller Library Generic Worker Controller Exemplary Implementation for the AWS Provider ","categories":"","description":"","excerpt":"Contract: Worker Resource While the control plane of a shoot cluster …","ref":"/docs/gardener/extensions/worker/","tags":"","title":"Worker"},{"body":"Workerless Shoots Starting from v1.71, users can create a Shoot without any workers, known as a “workerless Shoot”. Previously, worker nodes had to always be included even if users only needed the Kubernetes control plane. 
With workerless Shoots, Gardener will not create any worker nodes or anything related to them.\nHere’s an example manifest for a local workerless Shoot:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: local namespace: garden-local spec: cloudProfile: name: local region: local provider: type: local kubernetes: version: 1.26.0 ⚠️ It’s important to note that a workerless Shoot cannot be converted to a Shoot with workers or vice versa.\n As part of the control plane, the following components are deployed in the seed cluster for a workerless Shoot:\n etcds kube-apiserver kube-controller-manager gardener-resource-manager logging and monitoring components extension components (if they support workerless Shoots, see here) ","categories":"","description":"What is a Workerless Shoot and how to create one","excerpt":"What is a Workerless Shoot and how to create one","ref":"/docs/gardener/shoot_workerless/","tags":"","title":"Workerless `Shoot`s"},{"body":"Working with Projects Overview Projects are used to group clusters, onboard IaaS resources utilized by them, and organize access control. To work with clusters, first you need to create a project that they will belong to.\nCreating Your First Project Prerequisites You have access to the Gardener Dashboard and have permissions to create projects Steps Log on to the Gardener Dashboard and choose CREATE YOUR FIRST PROJECT.\n Provide a project Name, and optionally a Description and a Purpose, and choose CREATE.\n ⚠️ You will not be able to change the project name later. The rest of the details will be editable.\n Result After completing the steps above, you will arrive at a similar screen: Creating More Projects If you need to create more projects, expand the Projects list dropdown on the left. 
When expanded, it reveals a CREATE PROJECT button that brings up the same dialog as above.\nRotating Your Project’s Secrets After rotating your Gardener credentials and updating the corresponding secret in Gardener, you also need to reconcile all the shoots so that they can start using the updated secret. Updating the secret on its own won’t trigger shoot reconciliation and the shoot will use the old credentials until reconciliation, which is why you need to either trigger reconciliation or wait until it is performed in the next maintenance time window.\nFor more information, see Credentials Rotation for Shoot Clusters.\nDeleting Your Project When you need to delete your project, go to ADMINISTRATION, choose the trash bin icon, and confirm the operation.\n","categories":"","description":"","excerpt":"Working with Projects Overview Projects are used to group clusters, …","ref":"/docs/dashboard/working-with-projects/","tags":"","title":"Working With Projects"}] \ No newline at end of file +[{"body":"Gardener API Reference authentication.gardener.cloud API Group core.gardener.cloud API Group extensions.gardener.cloud API Group operations.gardener.cloud API Group resources.gardener.cloud API Group security.gardener.cloud API Group seedmanagement.gardener.cloud API Group settings.gardener.cloud API Group ","categories":"","description":"","excerpt":"Gardener API Reference authentication.gardener.cloud API Group …","ref":"/docs/gardener/api-reference/","tags":"","title":"API Reference"},{"body":"Packages:\n druid.gardener.cloud/v1alpha1 druid.gardener.cloud/v1alpha1 Package v1alpha1 is the v1alpha1 version of the etcd-druid API.\nResource Types: BackupSpec (Appears on: EtcdSpec) BackupSpec defines parameters associated with the full and delta snapshots of etcd.\n Field Description port int32 (Optional) Port define the port on which etcd-backup-restore server will be exposed.\n tls TLSConfig (Optional) image string (Optional) Image defines the etcd container image and 
tag\n store StoreSpec (Optional) Store defines the specification of object store provider for storing backups.\n resources Kubernetes core/v1.ResourceRequirements (Optional) Resources defines compute Resources required by backup-restore container. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/\n compactionResources Kubernetes core/v1.ResourceRequirements (Optional) CompactionResources defines compute Resources required by compaction job. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/\n fullSnapshotSchedule string (Optional) FullSnapshotSchedule defines the cron standard schedule for full snapshots.\n garbageCollectionPolicy GarbageCollectionPolicy (Optional) GarbageCollectionPolicy defines the policy for garbage collecting old backups\n garbageCollectionPeriod Kubernetes meta/v1.Duration (Optional) GarbageCollectionPeriod defines the period for garbage collecting old backups\n deltaSnapshotPeriod Kubernetes meta/v1.Duration (Optional) DeltaSnapshotPeriod defines the period after which delta snapshots will be taken\n deltaSnapshotMemoryLimit k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) DeltaSnapshotMemoryLimit defines the memory limit after which delta snapshots will be taken\n compression CompressionSpec (Optional) SnapshotCompression defines the specification for compression of Snapshots.\n enableProfiling bool (Optional) EnableProfiling defines if profiling should be enabled for the etcd-backup-restore-sidecar\n etcdSnapshotTimeout Kubernetes meta/v1.Duration (Optional) EtcdSnapshotTimeout defines the timeout duration for etcd FullSnapshot operation\n leaderElection LeaderElectionSpec (Optional) LeaderElection defines parameters related to the LeaderElection configuration.\n ClientService (Appears on: EtcdConfig) ClientService defines the parameters of the client service that a user can specify\n Field Description annotations map[string]string 
(Optional) Annotations specify the annotations that should be added to the client service\n labels map[string]string (Optional) Labels specify the labels that should be added to the client service\n CompactionMode (string alias)\n (Appears on: SharedConfig) CompactionMode defines the auto-compaction-mode: ‘periodic’ or ‘revision’. ‘periodic’ for duration based retention and ‘revision’ for revision number based retention.\nCompressionPolicy (string alias)\n (Appears on: CompressionSpec) CompressionPolicy defines the type of policy for compression of snapshots.\nCompressionSpec (Appears on: BackupSpec) CompressionSpec defines parameters related to compression of Snapshots(full as well as delta).\n Field Description enabled bool (Optional) policy CompressionPolicy (Optional) Condition (Appears on: EtcdCopyBackupsTaskStatus, EtcdStatus) Condition holds the information about the state of a resource.\n Field Description type ConditionType Type of the Etcd condition.\n status ConditionStatus Status of the condition, one of True, False, Unknown.\n lastTransitionTime Kubernetes meta/v1.Time Last time the condition transitioned from one status to another.\n lastUpdateTime Kubernetes meta/v1.Time Last time the condition was updated.\n reason string The reason for the condition’s last transition.\n message string A human-readable message indicating details about the transition.\n ConditionStatus (string alias)\n (Appears on: Condition) ConditionStatus is the status of a condition.\nConditionType (string alias)\n (Appears on: Condition) ConditionType is the type of condition.\nCrossVersionObjectReference (Appears on: EtcdStatus) CrossVersionObjectReference contains enough information to let you identify the referred resource.\n Field Description kind string Kind of the referent\n name string Name of the referent\n apiVersion string (Optional) API version of the referent\n Etcd Etcd is the Schema for the etcds API\n Field Description metadata Kubernetes meta/v1.ObjectMeta Refer 
to the Kubernetes API documentation for the fields of the metadata field. spec EtcdSpec selector Kubernetes meta/v1.LabelSelector selector is a label query over pods that should match the replica count. It must match the pod template’s labels. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors\n labels map[string]string annotations map[string]string (Optional) etcd EtcdConfig backup BackupSpec sharedConfig SharedConfig (Optional) schedulingConstraints SchedulingConstraints (Optional) replicas int32 priorityClassName string (Optional) PriorityClassName is the name of a priority class that shall be used for the etcd pods.\n storageClass string (Optional) StorageClass defines the name of the StorageClass required by the claim. More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#class-1\n storageCapacity k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) StorageCapacity defines the size of persistent volume.\n volumeClaimTemplate string (Optional) VolumeClaimTemplate defines the volume claim template to be created\n status EtcdStatus EtcdConfig (Appears on: EtcdSpec) EtcdConfig defines parameters associated etcd deployed\n Field Description quota k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) Quota defines the etcd DB quota.\n defragmentationSchedule string (Optional) DefragmentationSchedule defines the cron standard schedule for defragmentation of etcd.\n serverPort int32 (Optional) clientPort int32 (Optional) image string (Optional) Image defines the etcd container image and tag\n authSecretRef Kubernetes core/v1.SecretReference (Optional) metrics MetricsLevel (Optional) Metrics defines the level of detail for exported metrics of etcd, specify ‘extensive’ to include histogram metrics.\n resources Kubernetes core/v1.ResourceRequirements (Optional) Resources defines the compute Resources required by etcd container. 
More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/\n clientUrlTls TLSConfig (Optional) ClientUrlTLS contains the ca, server TLS and client TLS secrets for client communication to ETCD cluster\n peerUrlTls TLSConfig (Optional) PeerUrlTLS contains the ca and server TLS secrets for peer communication within ETCD cluster Currently, PeerUrlTLS does not require client TLS secrets for gardener implementation of ETCD cluster.\n etcdDefragTimeout Kubernetes meta/v1.Duration (Optional) EtcdDefragTimeout defines the timeout duration for etcd defrag call\n heartbeatDuration Kubernetes meta/v1.Duration (Optional) HeartbeatDuration defines the duration for members to send heartbeats. The default value is 10s.\n clientService ClientService (Optional) ClientService defines the parameters of the client service that a user can specify\n EtcdCopyBackupsTask EtcdCopyBackupsTask is a task for copying etcd backups from a source to a target store.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Refer to the Kubernetes API documentation for the fields of the metadata field. spec EtcdCopyBackupsTaskSpec sourceStore StoreSpec SourceStore defines the specification of the source object store provider for storing backups.\n targetStore StoreSpec TargetStore defines the specification of the target object store provider for storing backups.\n maxBackupAge uint32 (Optional) MaxBackupAge is the maximum age in days that a backup must have in order to be copied. 
By default all backups will be copied.\n maxBackups uint32 (Optional) MaxBackups is the maximum number of backups that will be copied starting with the most recent ones.\n waitForFinalSnapshot WaitForFinalSnapshotSpec (Optional) WaitForFinalSnapshot defines the parameters for waiting for a final full snapshot before copying backups.\n status EtcdCopyBackupsTaskStatus EtcdCopyBackupsTaskSpec (Appears on: EtcdCopyBackupsTask) EtcdCopyBackupsTaskSpec defines the parameters for the copy backups task.\n Field Description sourceStore StoreSpec SourceStore defines the specification of the source object store provider for storing backups.\n targetStore StoreSpec TargetStore defines the specification of the target object store provider for storing backups.\n maxBackupAge uint32 (Optional) MaxBackupAge is the maximum age in days that a backup must have in order to be copied. By default all backups will be copied.\n maxBackups uint32 (Optional) MaxBackups is the maximum number of backups that will be copied starting with the most recent ones.\n waitForFinalSnapshot WaitForFinalSnapshotSpec (Optional) WaitForFinalSnapshot defines the parameters for waiting for a final full snapshot before copying backups.\n EtcdCopyBackupsTaskStatus (Appears on: EtcdCopyBackupsTask) EtcdCopyBackupsTaskStatus defines the observed state of the copy backups task.\n Field Description conditions []Condition (Optional) Conditions represents the latest available observations of an object’s current state.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this resource.\n lastError string (Optional) LastError represents the last occurred error.\n EtcdMemberConditionStatus (string alias)\n (Appears on: EtcdMemberStatus) EtcdMemberConditionStatus is the status of an etcd cluster member.\nEtcdMemberStatus (Appears on: EtcdStatus) EtcdMemberStatus holds information about a etcd cluster membership.\n Field Description name string Name is the name of the etcd 
member. It is the name of the backing Pod.\n id string (Optional) ID is the ID of the etcd member.\n role EtcdRole (Optional) Role is the role in the etcd cluster, either Leader or Member.\n status EtcdMemberConditionStatus Status of the condition, one of True, False, Unknown.\n reason string The reason for the condition’s last transition.\n lastTransitionTime Kubernetes meta/v1.Time LastTransitionTime is the last time the condition’s status changed.\n EtcdRole (string alias)\n (Appears on: EtcdMemberStatus) EtcdRole is the role of an etcd cluster member.\nEtcdSpec (Appears on: Etcd) EtcdSpec defines the desired state of Etcd\n Field Description selector Kubernetes meta/v1.LabelSelector selector is a label query over pods that should match the replica count. It must match the pod template’s labels. More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors\n labels map[string]string annotations map[string]string (Optional) etcd EtcdConfig backup BackupSpec sharedConfig SharedConfig (Optional) schedulingConstraints SchedulingConstraints (Optional) replicas int32 priorityClassName string (Optional) PriorityClassName is the name of a priority class that shall be used for the etcd pods.\n storageClass string (Optional) StorageClass defines the name of the StorageClass required by the claim. 
More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#class-1\n storageCapacity k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) StorageCapacity defines the size of persistent volume.\n volumeClaimTemplate string (Optional) VolumeClaimTemplate defines the volume claim template to be created\n EtcdStatus (Appears on: Etcd) EtcdStatus defines the observed state of Etcd.\n Field Description observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this resource.\n etcd CrossVersionObjectReference (Optional) conditions []Condition (Optional) Conditions represents the latest available observations of an etcd’s current state.\n serviceName string (Optional) ServiceName is the name of the etcd service.\n lastError string (Optional) LastError represents the last occurred error.\n clusterSize int32 (Optional) Cluster size is the size of the etcd cluster.\n currentReplicas int32 (Optional) CurrentReplicas is the current replica count for the etcd cluster.\n replicas int32 (Optional) Replicas is the replica count of the etcd resource.\n readyReplicas int32 (Optional) ReadyReplicas is the count of replicas being ready in the etcd cluster.\n ready bool (Optional) Ready is true if all etcd replicas are ready.\n updatedReplicas int32 (Optional) UpdatedReplicas is the count of updated replicas in the etcd cluster.\n labelSelector Kubernetes meta/v1.LabelSelector (Optional) LabelSelector is a label query over pods that should match the replica count. 
It must match the pod template’s labels.\n members []EtcdMemberStatus (Optional) Members represents the members of the etcd cluster\n peerUrlTLSEnabled bool (Optional) PeerUrlTLSEnabled captures the state of peer url TLS being enabled for the etcd member(s)\n GarbageCollectionPolicy (string alias)\n (Appears on: BackupSpec) GarbageCollectionPolicy defines the type of policy for snapshot garbage collection.\nLeaderElectionSpec (Appears on: BackupSpec) LeaderElectionSpec defines parameters related to the LeaderElection configuration.\n Field Description reelectionPeriod Kubernetes meta/v1.Duration (Optional) ReelectionPeriod defines the Period after which leadership status of corresponding etcd is checked.\n etcdConnectionTimeout Kubernetes meta/v1.Duration (Optional) EtcdConnectionTimeout defines the timeout duration for etcd client connection during leader election.\n MetricsLevel (string alias)\n (Appears on: EtcdConfig) MetricsLevel defines the level ‘basic’ or ‘extensive’.\nSchedulingConstraints (Appears on: EtcdSpec) SchedulingConstraints defines the different scheduling constraints that must be applied to the pod spec in the etcd statefulset. Currently supported constraints are Affinity and TopologySpreadConstraints.\n Field Description affinity Kubernetes core/v1.Affinity (Optional) Affinity defines the various affinity and anti-affinity rules for a pod that are honoured by the kube-scheduler.\n topologySpreadConstraints []Kubernetes core/v1.TopologySpreadConstraint (Optional) TopologySpreadConstraints describes how a group of pods ought to spread across topology domains, that are honoured by the kube-scheduler.\n SecretReference (Appears on: TLSConfig) SecretReference defines a reference to a secret.\n Field Description SecretReference Kubernetes core/v1.SecretReference (Members of SecretReference are embedded into this type.) 
dataKey string (Optional) DataKey is the name of the key in the data map containing the credentials.\n SharedConfig (Appears on: EtcdSpec) SharedConfig defines parameters shared and used by Etcd as well as backup-restore sidecar.\n Field Description autoCompactionMode CompactionMode (Optional) AutoCompactionMode defines the auto-compaction-mode:‘periodic’ mode or ‘revision’ mode for etcd and embedded-Etcd of backup-restore sidecar.\n autoCompactionRetention string (Optional) AutoCompactionRetention defines the auto-compaction-retention length for etcd as well as for embedded-Etcd of backup-restore sidecar.\n StorageProvider (string alias)\n (Appears on: StoreSpec) StorageProvider defines the type of object store provider for storing backups.\nStoreSpec (Appears on: BackupSpec, EtcdCopyBackupsTaskSpec) StoreSpec defines parameters related to ObjectStore persisting backups\n Field Description container string (Optional) Container is the name of the container the backup is stored at.\n prefix string Prefix is the prefix used for the store.\n provider StorageProvider (Optional) Provider is the name of the backup provider.\n secretRef Kubernetes core/v1.SecretReference (Optional) SecretRef is the reference to the secret which used to connect to the backup store.\n TLSConfig (Appears on: BackupSpec, EtcdConfig) TLSConfig hold the TLS configuration details.\n Field Description tlsCASecretRef SecretReference serverTLSSecretRef Kubernetes core/v1.SecretReference clientTLSSecretRef Kubernetes core/v1.SecretReference (Optional) WaitForFinalSnapshotSpec (Appears on: EtcdCopyBackupsTaskSpec) WaitForFinalSnapshotSpec defines the parameters for waiting for a final full snapshot before copying backups.\n Field Description enabled bool Enabled specifies whether to wait for a final full snapshot before copying backups.\n timeout Kubernetes meta/v1.Duration (Optional) Timeout is the timeout for waiting for a final full snapshot. 
When this timeout expires, the copying of backups will be performed anyway. No timeout or 0 means wait forever.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n druid.gardener.cloud/v1alpha1 …","ref":"/docs/other-components/etcd-druid/api-reference/","tags":"","title":"API Reference"},{"body":"Dashboard Architecture Overview Overview The dashboard frontend is a Single Page Application (SPA) built with Vue.js. The dashboard backend is a web server built with Express and Node.js. The backend serves the bundled frontend as static content. The dashboard uses Socket.IO to enable real-time, bidirectional and event-based communication between the frontend and the backend. For the communication from the backend to different kube-apiservers the http/2 network protocol is used. Authentication at the apiserver of the garden cluster is done via JWT tokens. These can either be an ID Token issued by an OpenID Connect Provider or the token of a Kubernetes Service Account.\nFrontend The dashboard frontend consists of many Vue.js single file components that manage their state via a centralized store. The store defines mutations to modify the state synchronously. If several mutations have to be combined or the state in the backend has to be modified at the same time, the store provides asynchronous actions to do this job. 
The synchronization of the data with the backend is done by plugins that also use actions.\nBackend The backend is currently a monolithic Node.js application, but it performs several tasks that are actually independent.\n Static web server for the frontend single page application Forward real time events of the apiserver to the frontend Provide an HTTP API Initiate and manage the end user login flow in order to obtain an ID Token Bidirectional integration with the GitHub issue management It is planned to split the backend into several independent containers to increase stability and performance.\nAuthentication The following diagram shows the authorization code flow in the Gardener dashboard. When the user clicks the login button, they are redirected to the authorization endpoint of the OpenID Connect provider. In the case of Dex IDP, authentication is delegated to the connected IDP. After a successful login, the OIDC provider redirects back to the dashboard backend with a one-time authorization code. With this code, the dashboard backend can now request an ID token for the logged-in user. The ID token is encrypted and stored as a secure httpOnly session cookie.\n","categories":"","description":"","excerpt":"Dashboard Architecture Overview Overview The dashboard frontend is a …","ref":"/docs/dashboard/architecture/","tags":"","title":"Architecture"},{"body":"Core Components The core Observability components which Gardener offers out-of-the-box are:\n Prometheus - for Metrics and Alerting Vali - a Loki fork for Logging Plutono - a Grafana fork for Dashboard visualization Both forks are based on the last version released under an Apache license.\nControl Plane Components on the Seed Prometheus, Plutono, and Vali are all located in the seed cluster. They run next to the control plane of your cluster.\nThe next sections will explore those components in detail.\nNote Gardener only provides monitoring for Gardener-deployed components. 
If you need logging or monitoring for your workload, then you need to deploy your own monitoring stack into your shoot cluster. Note Gardener only provides a monitoring stack if the cluster is not of purpose: testing. For more information, see Shoot Cluster Purpose. Logging into Plutono Let us start by giving some visual hints on how to access Plutono. Plutono allows us to query logs and metrics and visualise them in the form of dashboards. Plutono is shipped ready-to-use with a Gardener shoot cluster.\nIn order to access the Gardener-provided dashboards, open the Plutono link provided in the Gardener dashboard and use the username and password provided next to it.\nThe password you can use to log in can be retrieved as shown below:\nAccessing the Dashboards After logging in, you will be greeted with a Plutono welcome screen. Navigate to General/Home, as depicted with the red arrow in the next picture:\nThen you will be able to select the dashboards. Some interesting ones to look at are:\n The Kubernetes Control Plane Status dashboard allows you to check control plane availability during a certain time frame. The API Server dashboard gives you an overview of which requests are made to your apiserver and how long they take. With the Node Details dashboard you can analyze CPU/Network pressure or memory usage for nodes. The Network Problem Detector dashboard illustrates the results of periodic networking checks between nodes and to the APIServer. Here is a picture with the Kubernetes Control Plane Status dashboard.\nPrometheus Prometheus is a monitoring system and a time series database. It can be queried using PromQL, the Prometheus Query Language.\nThis example query describes the current uptime status of the kube apiserver.\nPrometheus and Plutono Time series data from Prometheus can be made visible with Plutono. 
Here we see how the query above which describes the uptime of a Kubernetes cluster is visualized with a Plutono dashboard.\nVali Logs via Plutono Vali is our logging solution. In order to access the logs provided by Vali, you need to:\n Log into Plutono.\n Choose Explore, which is depicted as the little compass symbol:\n Select Vali at the top left, as shown here: There you can browse logs or events of the control plane components.\nHere are some examples of helpful queries:\n {container_name=\"cluster-autoscaler\" } to get cluster-autoscaler logs and see why certain node groups were scaled up. {container_name=\"kube-apiserver\"} |~ \"error\" to get the logs of the kube-apiserver container and filter for errors. {unit=\"kubelet.service\", nodename=\"ip-123\"} to get the kubelet logs of a specific node. {unit=\"containerd.service\", nodename=\"ip-123\"} to retrieve the containerd logs for a specific node. Choose Help \u003e in order to see what options exist to filter the results.\nFor more information on how to retrieve K8s events from the past, see How to Access Logs.\nDetailed View Data Flow Our monitoring and logging solutions Vali and Prometheus both run next to the control plane of the shoot cluster.\nData Flow - Logging The following diagram allows a more detailed look at Vali and the data flow.\nOn the very left, we see Plutono as it displays the logs. Vali is aggregating the logs from different sources.\nValitail and Fluentbit send the logs to Vali, which in turn stores them.\nValitail\nValitail is a systemd service that runs on each node. It scrapes kubelet, containerd, kernel logs, and the logs of the pods in the kube-system namespace.\nFluentbit\nFluentbit runs as a daemonset on each seed node. It scrapes logs of the kubernetes control plane components, like apiserver or etcd.\nIt also scrapes logs of the Gardener deployed components which run next to the control plane of the cluster, like the machine-controller-manager or the cluster autoscaler. 
Debugging those components, for example, would be helpful when finding out why certain worker groups got scaled up or why nodes were replaced.\nData Flow - Monitoring Next to each shoot’s control plane, we deploy an instance of Prometheus in the seed.\nGardener uses Prometheus for storing and accessing shoot-related metrics and alerting.\nThe diagram below shows the data flow of metrics. Plutono uses PromQL queries to query data from Prometheus. It then visualises those metrics in dashboards. Prometheus itself scrapes various targets for metrics, as seen in the diagram below by the arrows pointing to the Prometheus instance.\nLet us have a look at which metrics we scrape for debugging purposes:\nContainer performance metrics\ncAdvisor is an open-source agent integrated into the kubelet binary that monitors resource usage and analyzes the performance of containers. It collects statistics about the CPU, memory, file, and network usage for all containers running on a given node. We use it to scrape data for all pods running in the kube-system namespace in the shoot cluster.\nHardware and kernel-related metrics\nThe Prometheus Node Exporter runs as a daemonset in the kube-system namespace of your shoot cluster. It exposes a wide variety of hardware and kernel-related metrics. Some of the metrics we scrape are, for example, the current usage of the filesystem (node_filesystem_free_bytes) or current CPU usage (node_cpu_seconds_total). Both can help you identify if nodes are running out of hardware resources, which could lead to your workload experiencing downtime.\nControl plane component specific metrics\nThe different control plane pods (for example, etcd, API server, and kube-controller-manager) emit metrics over the /metrics endpoint. 
This includes metrics like how long webhooks take, the request count of the apiserver and storage information, like how many and what kind of objects are stored in etcd.\nMetrics about the state of Kubernetes objects\nkube-state-metrics is a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects. It is not concerned with metrics about the Kubernetes components, but rather it exposes metrics calculated from the status of Kubernetes objects (for example, resource requests or health of pods).\nIn the following image a few example metrics, which are exposed by the various components, are listed: We only store metrics for Gardener-deployed components. Those include the Kubernetes control plane, Gardener-managed system components (e.g., pods) in the kube-system namespace of the shoot cluster or systemd units on the nodes. We do not gather metrics for workload deployed in the shoot cluster. This is also shown in the picture below.\nThis means that for any workload you deploy into your shoot cluster, you need to deploy monitoring and logging yourself.\nLogs and metrics are kept for up to 14 days or until a configured space limit is reached.\n","categories":"","description":"","excerpt":"Core Components The core Observability components which Gardener …","ref":"/docs/getting-started/observability/components/","tags":"","title":"Components"},{"body":"Dependency Watchdog \nOverview A watchdog which actively looks out for disruption and recovery of critical services. If there is a disruption then it will prevent cascading failure by conservatively scaling down dependent configured resources and if a critical service has just recovered then it will expedite the recovery of dependent services/pods.\nAvoiding cascading failure is handled by the Prober component and expediting recovery of dependent services/pods is handled by the Weeder component. 
These are separately deployed as individual pods.\nCurrent Limitation \u0026 Future Scope Although in the current offering the Prober is tailored to handle one such use case, kube-apiserver connectivity, it can be extended to solve similar needs in other scenarios where the components involved might be different.\nStart using or developing the Dependency Watchdog See our documentation in the /docs repository; the index can be found here.\nFeedback and Support We always look forward to active community engagement.\nPlease report bugs or suggestions on how we can enhance dependency-watchdog to address additional recovery scenarios via GitHub issues.\n","categories":"","description":"A watchdog which actively looks out for disruption and recovery of critical services","excerpt":"A watchdog which actively looks out for disruption and recovery of …","ref":"/docs/other-components/dependency-watchdog/","tags":"","title":"Dependency Watchdog"},{"body":"Welcome to the Gardener Getting Started section! Here you will be able to get accustomed to the way Gardener functions and learn how its components work together in order to seamlessly run Kubernetes clusters on various hyperscalers.\nThe following topics aim to be useful to both complete beginners and those already somewhat familiar with Gardener. While the content is structured, with Introduction serving as the starting point, if you’re feeling confident in your knowledge, feel free to skip to a topic you’re more interested in.\n","categories":"","description":"Gardener onboarding materials","excerpt":"Gardener onboarding materials","ref":"/docs/getting-started/","tags":"","title":"Getting Started"},{"body":"Hibernation Some clusters need to be up all the time - typically, they would be hosting some kind of production workload. Others might be used for development purposes or testing during business hours only. Keeping them up and running all the time is a waste of money. 
Gardener can help you here with its “hibernation” feature. Essentially, hibernation means to shut down all components of a cluster.\nHow Hibernation Works The hibernation flow for a shoot attempts to reduce the resources consumed as much as possible. Hence everything not state-related is being decommissioned.\nData Plane All nodes will be drained and the VMs will be deleted. As a result, all pods will be “stuck” in a Pending state since no new nodes are added. Of course, PVC / PV holding data is not deleted.\nServices of type LoadBalancer will keep their external IP addresses.\nControl Plane All components will be scaled down and no pods will remain running. ETCD data is kept safe on the disk.\nThe DNS records routing traffic for the API server are also destroyed. Trying to connect to a hibernated cluster via kubectl will result in a DNS lookup failure / no-such-host message.\nWhen waking up a cluster, all control plane components will be scaled up again and the DNS records will be re-created. Nodes will be created again and pods scheduled to run on them.\nHow to Configure / Trigger Hibernation The easiest way to configure hibernation schedules is via the dashboard. Of course, this is reflected in the shoot’s spec and can also be maintained there. Before a cluster is hibernated, constraints in the shoot’s status will be evaluated. There might be conditions (mostly revolving around mutating / validating webhooks) that would block a successful wake-up. 
In such a case, the constraint will block hibernation in the first place.\nTo wake up or hibernate a shoot immediately, the dashboard can be used or a patch to the shoot’s spec can be applied directly.\n","categories":"","description":"","excerpt":"Hibernation Some clusters need to be up all the time - typically, they …","ref":"/docs/getting-started/features/hibernation/","tags":"","title":"Hibernation"},{"body":"","categories":"","description":"Gardener extension controllers for the different infrastructures","excerpt":"Gardener extension controllers for the different infrastructures","ref":"/docs/extensions/infrastructure-extensions/","tags":"","title":"Infrastructure Extensions"},{"body":"Problem Space Let’s discuss the problem space first. Why does anyone need something like Gardener?\nRunning Software The starting point is this rather simple question: Why would you want to run some software?\nTypically, software is run with a purpose and not just for the sake of running it. Whether it is a digital ledger, a company’s inventory or a blog - software provides a service to its user.\nThis brings us to the way this software is being consumed. Traditionally, software has been shipped on physical / digital media to the customer or end user. There, someone had to install, configure, and operate it. In recent times, the pattern has shifted. More and more solutions are operated by the vendor or a hosting partner and sold as a service ready to be used.\nBut still, someone needs to install, configure, and maintain it - regardless of where it is installed. And of course, it is expected to run forever once started and to be generally resilient to any kind of failure.\nFor smaller installations things like maintenance, scaling, debugging or configuration can be done in a semi-automatic way. 
It’s probably no fun and most importantly, only a limited amount of instances can be taken care of - similar to how one would take care of a pet.\nBut when hosting services at scale, there is no way someone can do all this manually at acceptable costs. So we need some vehicle to easily spin up new instances, do lifecycle operations, get some basic failure resilience, and more. How can we achieve that?\nSolution Space 1 - Kubernetes Let’s start solving some of the problems described earlier with Container technology and Kubernetes.\nContainers Container technology is at the core of the solution space. A container forms a vehicle that is shippable, can easily run in any supported environment and generally adds a powerful abstraction layer to the infrastructure.\nHowever, plain containers do not help with resilience or scaling. Therefore, we need another system for orchestration.\nOrchestration “Classical” orchestration that just follows the “notes” and moves from state A to state B doesn’t solve all of our problems. We need something else.\nKubernetes operates on the principle of “desired state”. With it, you write a construction plan, then have controllers cycle through “observe -\u003e analyze -\u003e act” and transition the actual to the desired state. Those reconciliations ensure that whatever breaks there is a path back to a healthy state.\nSummary Containers (famously brought to the mainstream as “Docker”) and Kubernetes are the ingredients of a fundamental shift in IT. Similar to how the Operating System layer enabled the decoupling of software and hardware, container-related technologies provide an abstract interface to any kind of infrastructure platform for the next-generation of applications.\nSolution Space 2 - Gardener So, Kubernetes solves a lot of problems. 
But how do you get a Kubernetes cluster?\nEither:\n Buy a cluster as a service from an external vendor Run a Gardener instance and host a cluster yourself with its help Essentially, it was a “make or buy” decision that led to the founding of Gardener.\nThe Reason Why We Chose to “Make It” Gardener allows you to run Kubernetes clusters on various hyperscalers. It offers the same set of basic configuration options independent of the chosen infrastructure. This kind of harmonization supports any multi-vendor strategy while reducing adoption costs for the individual teams. Just imagine having to deal with multiple vendors all offering vastly different Kubernetes clusters.\nOf course, there are plenty more reasons - from acquiring operational knowledge to having influence on the developed features - that made the pendulum swing towards “make it”.\nWhat exactly is Gardener? Gardener is a system to manage Kubernetes clusters. It is driven by the same “desired state” pattern as Kubernetes itself. In fact, it is using Kubernetes to run Kubernetes.\nA user may “desire” clusters with specific configuration on infrastructures such as GCP, AWS, Azure, Alicloud, OpenStack, vSphere, … and Gardener will make sure to create such a cluster and keep it running.\nIf you take this rather simplistic principle of reconciliation and add the feature-richness of Gardener to it, you end up with universal Kubernetes at scale.\nWhether you need fleet management at minimal TCO or are looking for a highly customizable control plane - we have it all.\nOn top of that, Gardener-managed Kubernetes clusters fulfill the conformance standard set out by the CNCF and we submit our test results for certification.\nHave a look at the CNCF map for more information or dive into the testgrid directly.\nGardener itself is open-source. 
Under the umbrella of github.com/gardener we develop the core functionalities as well as the extensions and you are welcome to contribute (by opening issues, filing feature requests, or submitting code).\nLast time we counted, there were already 131 projects. That’s actually more projects than members of the organization.\nAs of today, Gardener is mainly developed by SAP employees and SAP is an “adopter” as well, alongside STACKIT, Telekom, Finanz Informatik Technologie Services GmbH and others. For a full list of adopters, see the Adopters page.\n","categories":"","description":"","excerpt":"Problem Space Let’s discuss the problem space first. Why does anyone …","ref":"/docs/getting-started/introduction/","tags":"","title":"Introduction to Gardener"},{"body":"machine-controller-manager \nNote One can add support for a new cloud provider by following Adding support for new provider.\nOverview Machine Controller Manager aka MCM is a group of cooperative controllers that manage the lifecycle of the worker machines. It is inspired by the design of Kube Controller Manager in which various sub controllers manage their respective Kubernetes Clients. MCM gives you the following benefits:\n seamlessly manage machines/nodes with a declarative API (of course, across different cloud providers) integrate generically with the cluster autoscaler plugin with tools such as the node-problem-detector transport the immutability design principle to machine/nodes implement e.g. rolling upgrades of machines/nodes MCM supports the following providers. The provider code is maintained externally (out-of-tree), and the respective repositories are linked below:\n Alicloud AWS Azure Equinix Metal GCP KubeVirt Metal Stack Openstack V Sphere Yandex It can easily be extended to support other cloud providers as well.\nExample of managing a machine:\nkubectl create/get/delete machine vm1 Key terminologies Nodes/Machines/VMs are different terms used to represent similar things. 
We use these terms in the following way:\n VM: A virtual machine running on any cloud provider. It could also refer to a physical machine (PM) in case of a bare metal setup. Node: Native Kubernetes node objects. The objects you get to see when you do a “kubectl get nodes”. Although nodes can be either physical or virtual machines, for the purposes of our discussion the term refers to a VM. Machine: A VM that is provisioned/managed by the Machine Controller Manager. Design of Machine Controller Manager The design of the Machine Controller Manager is influenced by the Kube Controller Manager, wherein multiple sub-controllers are used to manage the Kubernetes clients.\nDesign Principles It’s designed to run in the master plane of a Kubernetes cluster. It follows the best principles and practices of writing controllers, including, but not limited to:\n Reusing code from kube-controller-manager leader election to allow HA deployments of the controller workqueues and multiple thread-workers SharedInformers that keep network calls and de-serialization to a minimum and provide helpful create/update/delete events for resources rate-limiting to allow back-off in case of network outages and general instability of other cluster components sending events to respective resources for easy debugging and overview Prometheus metrics, health and (optional) profiling endpoints Objects of Machine Controller Manager Machine Controller Manager reconciles a set of Custom Resources, namely MachineDeployment, MachineSet and Machines, which are managed \u0026 monitored by their controllers MachineDeployment Controller, MachineSet Controller, Machine Controller respectively, along with another cooperative controller called the Safety Controller.\nMachine Controller Manager makes use of 4 CRD objects and 1 Kubernetes secret object to manage machines. 
They are as follows:\n Custom Resource Object Description MachineClass A MachineClass represents a template that contains cloud provider specific details used to create machines. Machine A Machine represents a VM which is backed by the cloud provider. MachineSet A MachineSet ensures that the specified number of Machine replicas are running at any given point in time. MachineDeployment A MachineDeployment provides a declarative update for MachineSet and Machines. Secret A Secret here is a Kubernetes secret that stores cloudconfig (initialization scripts used to create VMs) and cloud specific credentials. See here for CRD API Documentation\nComponents of Machine Controller Manager Controller Description MachineDeployment controller Machine Deployment controller reconciles the MachineDeployment objects and manages the lifecycle of MachineSet objects. MachineDeployment consumes provider specific MachineClass in its spec.template.spec which is the template of the VM spec that would be spawned on the cloud by MCM. MachineSet controller MachineSet controller reconciles the MachineSet objects and manages the lifecycle of Machine objects. Safety controller There is a Safety Controller responsible for handling the unidentified or unknown behaviours from the cloud providers. Safety Controller: freezes the MachineDeployment controller and MachineSet controller if the number of Machine objects goes beyond a certain threshold on top of Spec.replicas. It can be configured by the flags --safety-up, --safety-down, and --machine-safety-overshooting-period. freezes the functionality of the MCM if either the target-apiserver or the control-apiserver is not reachable. unfreezes the MCM automatically once the situation has returned to normal. A freeze label is applied on MachineDeployment/MachineSet to enforce the freeze condition. 
Along with the above Custom Controllers and Resources, MCM requires the MachineClass to use a K8s Secret that stores cloudconfig (initialization scripts used to create VMs) and cloud specific credentials. All these controllers work in a co-operative manner. They form a parent-child relationship with MachineDeployment Controller being the grandparent, MachineSet Controller being the parent, and Machine Controller being the child.\nDevelopment To start using or developing the Machine Controller Manager, see the documentation in the /docs repository.\nFAQ An FAQ is available here.\ncluster-api Implementation The cluster-api branch of machine-controller-manager implements the machine-api aspect of the cluster-api project. Link: https://github.com/gardener/machine-controller-manager/tree/cluster-api Once the cluster-api project becomes stable, we may make the master branch of MCM cluster-api compliant as well, with well-defined migration notes. ","categories":"","description":"Declarative way of managing machines for Kubernetes cluster","excerpt":"Declarative way of managing machines for Kubernetes cluster","ref":"/docs/other-components/machine-controller-manager/","tags":"","title":"Machine Controller Manager"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/tutorials/","tags":"","title":"Tutorials"},{"body":"Overview Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on AWS.\nPrerequisites You have created an AWS account. You have access to the Gardener dashboard and have permissions to create projects. 
Steps Go to the Gardener dashboard and create a Project.\n Choose Secrets, then the plus icon and select AWS.\n To copy the policy for AWS from the Gardener dashboard, click on the help icon for AWS secrets, and choose copy.\n Create a new policy in AWS:\n Choose Create policy.\n Paste the policy that you copied from the Gardener dashboard to this custom policy.\n Choose Next until you reach the Review section.\n Fill in the name and description, then choose Create policy.\n Create a new technical user in AWS:\n Type in a username and select the access key credential type.\n Choose Attach an existing policy.\n Select GardenerAccess from the policy list.\n Choose Next until you reach the Review section.\n Note: After the user is created, Access key ID and Secret access key are generated and displayed. Remember to save them. The Access key ID is used later to create secrets for Gardener. On the Gardener dashboard, choose Secrets and then the plus sign. Select AWS from the drop-down menu to add a new AWS secret.\n Create your secret.\n Type the name of your secret. Copy and paste the Access Key ID and Secret Access Key you saved when you created the technical user on AWS. Choose Add secret. After completing these steps, you should see your newly created secret in the Infrastructure Secrets section.\n To create a new cluster, choose Clusters and then the plus sign in the upper right corner.\n In the Create Cluster section:\n Select AWS in the Infrastructure tab. Type the name of your cluster in the Cluster Details tab. Choose the secret you created before in the Infrastructure Details tab. Choose Create. 
Wait for your cluster to get created.\n Result After completing the steps in this tutorial, you will be able to see and download the kubeconfig of your cluster.\n","categories":"","description":"","excerpt":"Overview Gardener allows you to create a Kubernetes cluster on …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/tutorials/kubernetes-cluster-on-aws-with-gardener/kubernetes-cluster-on-aws-with-gardener/","tags":"","title":"Tutorials"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/tutorials/","tags":"","title":"Tutorials"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/tutorials/","tags":"","title":"Tutorials"},{"body":"","categories":"","description":"Walkthroughs of common activities","excerpt":"Walkthroughs of common activities","ref":"/docs/guides/","tags":"","title":"Guides"},{"body":"Overview In this overview, we want to present two ways to receive alerts for control plane and Gardener managed system-components:\n Predefined Gardener alerts Custom alerts Predefined Control Plane Alerts In the shoot spec it is possible to configure emailReceivers. On this email address you will automatically receive email notifications for predefined alerts of your control plane. Such alerts are deployed in the shoot Prometheus and have visibility owner or all. 
For more alert details, shoot owners can use this visibility to find these alerts in their shoot Prometheus UI.\nspec: monitoring: alerting: emailReceivers: - john.doe@example.com For more information, see Alerting.\nCustom Alerts - Federation If you need more customization for alerts for control plane metrics, you have the option to deploy your own Prometheus into your shoot control plane.\nThen you can use federation, which is a Prometheus feature, to forward the metrics from the Gardener managed Prometheus to your custom deployed Prometheus. Since as a shoot owner you do not have access to the control plane pods, this is the only way to get those metrics.\nThe credentials and endpoint for the Gardener managed Prometheus are exposed over the Gardener dashboard or programmatically in the garden project as a secret (\u003cshoot-name\u003e.monitoring).\n","categories":"","description":"","excerpt":"Overview In this overview, we want to present two ways to receive …","ref":"/docs/getting-started/observability/alerts/","tags":"","title":"Alerts"},{"body":"Kubeception Kubeception - Kubernetes in Kubernetes in Kubernetes\nIn the classic setup, there is a dedicated host / VM to host the master components / control plane of a Kubernetes cluster. However, these are just normal programs that can easily be put into containers. Once in containers, Kubernetes Deployments and StatefulSets (for the etcd) can be made to watch over them. And by putting all that into a separate, dedicated Kubernetes cluster you get Kubernetes on Kubernetes, aka Kubeception (named after the famous movie Inception with Leonardo DiCaprio).\nBut what are the advantages of running Kubernetes on Kubernetes? For one, it makes use of resources more reasonably. 
Instead of providing a dedicated computer or virtual machine for the control plane of a Kubernetes cluster - which will probably never be the right size but either too small or too big - you can dynamically scale the individual control plane components based on demand and maximize resource usage by combining the control planes of multiple Kubernetes clusters.\nIn addition to that, it helps introduce a first layer of high availability. What happens if the API server suddenly stops responding to requests? In a traditional setup, someone would have to find out and manually restart the API server. In the Kubeception model, the API server is a Kubernetes Deployment and of course, it has sophisticated liveness- and readiness-probes. Should the API server fail, its liveness-probe will fail too and the pod in question simply gets restarted automatically - sometimes even before anybody would have noticed that the API server was unresponsive.\nIn Gardener’s terminology, the cluster hosting the control plane components is called a seed cluster. The cluster that end users actually use (and whose control plane is hosted in the seed) is called a shoot cluster.\nThe worker nodes of a shoot cluster are plain, simple virtual machines in a hyperscaler (EC2 instances in AWS, GCE instances in GCP or ECS instances in Alibaba Cloud). They run an operating system, a container runtime (e.g., containerd), and the kubelet that gets configured during node bootstrap to connect to the shoot’s API server. The API server in turn runs in the seed cluster and is exposed through an ingress. This connection happens over the public internet and is - of course - TLS encrypted.\nIn other terms: you use Kubernetes to run Kubernetes.\nCluster Hierarchy in Gardener Gardener uses many Kubernetes clusters to eventually provide you with your very own shoot cluster.\nAt the heart of Gardener’s cluster hierarchy is the garden cluster. 
Since Gardener is 100% Kubernetes native, a Kubernetes cluster is needed to store all Gardener related resources. The garden cluster is actually nodeless - it only consists of a control plane, an API server (actually two), an etcd, and a bunch of controllers. The garden cluster is the central brain of a Gardener landscape and the one you connect to in order to create, modify or delete shoot clusters - either with kubectl and a dedicated kubeconfig or through the Gardener dashboard.\nThe seed clusters are next in the hierarchy - they are the clusters which will host the “kubeceptioned” control planes of the shoot clusters. For every hyperscaler supported in a Gardener landscape, there would be at least one seed cluster. However, to reduce latencies as well as for scaling, Gardener landscapes have several different seeds in different regions across the globe to keep the distance between control planes and actual worker nodes small.\nFinally, there are the shoot clusters - what Gardener is all about. Shoot clusters are the clusters which you create through Gardener and which your workload gets deployed to.\nGardener Components Overview From a very high level point of view, the important components of Gardener are:\nThe Gardener API Endpoint You can connect to the Gardener API Endpoint (i.e., the API server in the garden cluster) either through the dashboard or with kubectl, given that you have a proper kubeconfig for it.\nThe Seeds Running the Shoot Cluster Control Planes Inside each seed is one of the most important controllers in Gardener - the gardenlet. 
It spawns many other controllers, which will eventually create all resources for a shoot cluster, including all resources on the cloud providers such as virtual networks, security groups, and virtual machines.\nGardener’s API Endpoint Kubernetes’ API can be extended - either by CRDs or by API aggregation.\nAPI aggregation involves setting up a so-called extension-API-server and registering it with the main Kubernetes API server. The extension API server will then serve resources of custom-defined API groups on its own. While the main Kubernetes API server is still used to handle RBAC, authorization, namespacing, quotas, limits, etc., all custom resources will be delegated to the extension-API-server. This is done through an APIService resource in the main API server - it specifies that, e.g., the API group core.gardener.cloud is served by a dedicated extension-API-server and all requests concerning this API group should be forwarded to the specified IP address or Kubernetes service name. Extension API servers can persist their resources in their very own etcd but they do not have to - instead, they can use the main API server’s etcd as well.\nGardener uses its very own extension API server for its resources like Shoot, Seed, CloudProfile, SecretBinding, etc. However, Gardener does not set up a dedicated etcd for its own extension API server - instead, it reuses the existing etcd of the main Kubernetes API server. This is absolutely possible since the resources of Gardener’s API are part of the API group gardener.cloud and thus will not interfere with any resources of the main Kubernetes API in etcd.\nIn case you are interested, you can read more on:\n API Extension API Aggregation APIService Resource Gardener API Resources Since Gardener’s API endpoint is a regular Kubernetes cluster, it would theoretically serve all resources from the Kubernetes core API, including Pods, Deployments, etc. 
However, Gardener implements RBAC rules and disables certain controllers that make these resources inaccessible. Objects like Secrets, Namespaces, and ResourceQuotas are still available, though, as they play a vital role in Gardener.\nIn addition, through Gardener’s extension API server, the API endpoint also serves Gardener’s custom resources like Projects, Shoots, CloudProfiles, Seeds, SecretBindings (those are relevant for users), ControllerRegistrations, ControllerDeployments, BackupBuckets, BackupEntries (those are relevant to an operator), etc.\n","categories":"","description":"","excerpt":"Kubeception Kubeception - Kubernetes in Kubernetes in Kubernetes\nIn …","ref":"/docs/getting-started/architecture/","tags":"","title":"Architecture"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/gardener/concepts/","tags":"","title":"Concepts"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/other-components/dependency-watchdog/concepts/","tags":"","title":"Concepts"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/other-components/machine-controller-manager/documents/","tags":"","title":"Documents"},{"body":"","categories":"","description":"Gardener extension controllers for the supported operating systems","excerpt":"Gardener extension controllers for the supported operating systems","ref":"/docs/extensions/os-extensions/","tags":"","title":"Operating System Extensions"},{"body":"Controlplane as a Service Sometimes, there may be use cases for Kubernetes clusters that don’t require pods but only features of the control plane. Gardener can create the so-called “workerless” shoots, which are exactly that. 
A Kubernetes cluster without nodes (and without any controller related to them).\nIn a scenario where you already have multiple clusters, you can use it for orchestration (leases) or factor out components that require many CRDs.\nAs part of the control plane, the following components are deployed in the seed cluster for a workerless shoot:\n etcds kube-apiserver kube-controller-manager gardener-resource-manager Logging and monitoring components Extension components (to find out if they support workerless shoots, see the Extensions documentation) ","categories":"","description":"","excerpt":"Controlplane as a Service Sometimes, there may be use cases for …","ref":"/docs/getting-started/features/workerless-shoots/","tags":"","title":"Workerless Shoots"},{"body":"","categories":"","description":"Make sure that your clusters are compliant and secure","excerpt":"Make sure that your clusters are compliant and secure","ref":"/docs/security-and-compliance/","tags":"","title":"Security and Compliance"},{"body":"Keys There are plenty of keys in Gardener. The ETCD needs one to store resources like secrets encrypted at rest. Gardener generates certificate authorities (CAs) to ensure secured communication between the various components and actors, and service account tokens are signed with a dedicated key. There is also an SSH key pair to allow debugging of nodes, and the observability stack has its own passwords too.\nAll of these keys share a common property: they are managed by Gardener. Rotating them, however, is potentially very disruptive. Hence, Gardener does not do it automatically, but offers you means to perform these tasks easily. For a single cluster, you may conveniently use the dashboard. 
Of course, it is also possible to do the same by annotating the shoot resource accordingly:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-credentials-start kubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-credentials-complete Where possible, the rotation happens in two phases - Preparing and Completing. The Preparing phase introduces new keys while the old ones are still valid. Users can safely exchange keys / CA bundles wherever they are used. Afterwards, the Completing phase will invalidate the old keys / CA bundles.\nRotation Phases At the beginning, only the old set of credentials exists. By triggering the rotation, new credentials are created in the Preparing phase and both sets are valid. Now, all clients have to update and start using the new credentials. Only afterwards is it safe to trigger the Completing phase, which invalidates the old credentials.\nThe shoot’s status will always show the current status / phase of the rotation.\nFor more information, see Credentials Rotation for Shoot Clusters.\nUser-Provided Credentials You grant Gardener permissions to create resources by handing over cloud provider keys. These keys are stored in a secret and referenced by a shoot via a SecretBinding. Gardener uses the keys to create the network for the cluster resources, routes, VMs, disks, and IP addresses.\nWhen you rotate credentials, the new keys have to be stored in the same secret and the shoot needs to reconcile successfully to ensure the replication to every controller. Afterwards, the old keys can be deleted safely from Gardener’s perspective.\nWhile the reconciliation can be triggered manually, there is no need for it (if you’re not in a hurry). 
Each shoot reconciles once within 24h and the new keys will be picked up during the next maintenance window.\nNote It is not possible to move a shoot to a different infrastructure account (at all!). ","categories":"","description":"","excerpt":"Keys There are plenty of keys in Gardener. The ETCD needs one to store …","ref":"/docs/getting-started/features/credential-rotation/","tags":"","title":"Credential Rotation"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/other-components/dependency-watchdog/deployment/","tags":"","title":"Deployment"},{"body":"Extensibility Overview Initially, everything was developed in-tree in the Gardener project. All cloud providers and the configuration for all the supported operating systems were released together with the Gardener core itself. But as the project grew, it got more and more difficult to add new providers and maintain the existing code base. As a consequence and in order to become agile and flexible again, we proposed GEP-1 (Gardener Enhancement Proposal). The document describes an out-of-tree extension architecture that keeps the Gardener core logic independent of provider-specific knowledge (similar to what Kubernetes has achieved with out-of-tree cloud providers or with CSI volume plugins).\nBasic Concepts Gardener keeps running in the “garden cluster” and implements the core logic of shoot cluster reconciliation / deletion. Extensions are Kubernetes controllers themselves (like Gardener) and run in the seed clusters. As usual, we try to use Kubernetes wherever applicable. We rely on Kubernetes extension concepts in order to enable extensibility for Gardener. The main ideas of GEP-1 are the following:\n During the shoot reconciliation process, Gardener will write CRDs into the seed cluster that are watched and managed by the extension controllers. 
They will reconcile (based on the .spec) and report whether everything went well or errors occurred in the CRD’s .status field.\n Gardener keeps deploying the provider-independent control plane components (etcd, kube-apiserver, etc.). However, some of these components might still need little customization by providers, e.g., additional configuration, flags, etc. In this case, the extension controllers register webhooks in order to manipulate the manifests.\n Example 1:\nGardener creates a new AWS shoot cluster and requires the preparation of infrastructure in order to proceed (networks, security groups, etc.). It writes the following CRD into the seed cluster:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--core--aws-01 spec: type: aws providerConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 internal: - 10.250.112.0/22 public: - 10.250.96.0/22 workers: - 10.250.0.0/19 zones: - eu-west-1a dns: apiserver: api.aws-01.core.example.com region: eu-west-1 secretRef: name: my-aws-credentials sshPublicKey: | base64(key) Please note that the .spec.providerConfig is a raw blob and not evaluated or known in any way by Gardener. Instead, it was specified by the user (in the Shoot resource) and just “forwarded” to the extension controller. Only the AWS controller understands this configuration and will now start provisioning/reconciling the infrastructure. It reports in the .status field the result:\nstatus: observedGeneration: ... state: ... lastError: .. lastOperation: ... 
providerStatus: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus vpc: id: vpc-1234 subnets: - id: subnet-acbd1234 name: workers zone: eu-west-1 securityGroups: - id: sg-xyz12345 name: workers iam: nodesRoleARN: \u003csome-arn\u003e instanceProfileName: foo ec2: keyName: bar Gardener waits until the .status.lastOperation / .status.lastError indicates that the operation reached a final state and either continues with the next step, or stops and reports the potential error. The extension-specific output in .status.providerStatus is - similar to .spec.providerConfig - not evaluated, and simply forwarded to CRDs in subsequent steps.\nExample 2:\nGardener deploys the control plane components into the seed cluster, e.g. the kube-controller-manager deployment with the following flags:\napiVersion: apps/v1 kind: Deployment ... spec: template: spec: containers: - command: - /usr/local/bin/kube-controller-manager - --allocate-node-cidrs=true - --attach-detach-reconcile-sync-period=1m0s - --controllers=*,bootstrapsigner,tokencleaner - --cluster-cidr=100.96.0.0/11 - --cluster-name=shoot--core--aws-01 - --cluster-signing-cert-file=/srv/kubernetes/ca/ca.crt - --cluster-signing-key-file=/srv/kubernetes/ca/ca.key - --concurrent-deployment-syncs=10 - --concurrent-replicaset-syncs=10 ... The AWS controller requires some additional flags in order to make the cluster functional. It needs to provide a Kubernetes cloud-config and also some cloud-specific flags. 
Consequently, it registers a MutatingWebhookConfiguration on Deployments and adds these flags to the container:\n - --cloud-provider=external - --external-cloud-volume-plugin=aws - --cloud-config=/etc/kubernetes/cloudprovider/cloudprovider.conf Of course, it would have needed to create a ConfigMap containing the cloud config and to add the proper volume and volumeMounts to the manifest as well.\n(Please note for this special example: The Kubernetes community is also working on making the kube-controller-manager provider-independent. However, there will most probably still be components other than the kube-controller-manager which need to be adapted by extensions.)\nIf you are interested in writing an extension, or generally in digging deeper to find out the nitty-gritty details of the extension concepts, please read GEP-1. We are truly looking forward to your feedback!\nCurrent Status Meanwhile, the out-of-tree extension architecture of Gardener is in place and has been productively validated. 
We are tracking all internal and external extensions of Gardener in the Gardener Extensions Library repo.\n","categories":"","description":"","excerpt":"Extensibility Overview Initially, everything was developed in-tree in …","ref":"/docs/gardener/extensions/","tags":"","title":"Extensions"},{"body":"Documentation Index Overview General Architecture Gardener landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Concepts Components Gardener API server In-Tree admission plugins Gardener Controller Manager Gardener Scheduler Gardener Admission Controller Gardener Resource Manager Gardener Operator Gardener Node Agent Gardenlet Backup Restore etcd Relation between Gardener API and Cluster API Usage Audit a Kubernetes cluster Cleanup of Shoot clusters in deletion containerd Registry Configuration Custom containerd configuration Custom CoreDNS configuration (Custom) CSI components Default Seccomp Profile DNS Autoscaling DNS Search Path Optimization Endpoints and Ports of a Shoot Control-Plane ETCD Encryption Config ExposureClasses Hibernate a Cluster IPv6 in Gardener Clusters Logging NodeLocalDNS feature OpenIDConnect presets Projects Service Account Manager Readiness of Shoot Worker Nodes Reversed Cluster VPN Shoot Cluster Purposes Shoot Scheduling Profiles Shoot Credentials Rotation Shoot Kubernetes and Operating System Versioning Shoot KUBERNETES_SERVICE_HOST Environment Variable Injection Shoot Networking Shoot Maintenance Shoot ServiceAccount Configurations Shoot Status Shoot Info ConfigMap Shoot Updates and Upgrades Shoot Auto-Scaling Configuration Shoot Pod Auto-Scaling Best Practices Shoot High-Availability Control Plane Shoot High-Availability Best Practices Shoot Workers Settings Accessing Shoot Clusters Supported Kubernetes versions Tolerations Trigger shoot operations Trusted TLS certificate for shoot control planes Trusted TLS certificate for garden runtime cluster Controlling 
the Kubernetes versions for specific worker pools Admission Configuration for the PodSecurity Admission Plugin Supported CPU Architectures for Shoot Worker Nodes Workerless Shoots API Reference authentication.gardener.cloud API Group core.gardener.cloud API Group extensions.gardener.cloud API Group operations.gardener.cloud API Group resources.gardener.cloud API Group security.gardener.cloud API Group seedmanagement.gardener.cloud API Group settings.gardener.cloud API Group Proposals GEP: Gardener Enhancement Proposal Description GEP: Template GEP-1: Gardener extensibility and extraction of cloud-specific/OS-specific knowledge GEP-2: BackupInfrastructure CRD and Controller Redesign GEP-3: Network extensibility GEP-4: New core.gardener.cloud/v1beta1 APIs required to extract cloud-specific/OS-specific knowledge out of Gardener core GEP-5: Gardener Versioning Policy GEP-6: Integrating etcd-druid with Gardener GEP-7: Shoot Control Plane Migration GEP-8: SNI Passthrough proxy for kube-apiservers GEP-9: Gardener integration test framework GEP-10: Support additional container runtimes GEP-11: Utilize API Server Network Proxy to Invert Seed-to-Shoot Connectivity GEP-12: OIDC Webhook Authenticator GEP-13: Automated Seed Management GEP-14: Reversed Cluster VPN GEP-15: Manage Bastions and SSH Key Pair Rotation GEP-16: Dynamic kubeconfig generation for Shoot clusters GEP-17: Shoot Control Plane Migration “Bad Case” Scenario GEP-18: Automated Shoot CA Rotation GEP-19: Observability Stack - Migrating to the prometheus-operator and fluent-bit operator GEP-20: Highly Available Shoot Control Planes GEP-21: IPv6 Single-Stack Support in Local Gardener GEP-22: Improved Usage of the ShootState API GEP-23: Autoscaling Shoot kube-apiserver via Independently Driven HPA and VPA GEP-24: Shoot OIDC Issuer GEP-25: Namespaced Cloud Profiles GEP-26: Workload Identity - Trust Based Authentication GEP-27: Add Optional Bastion Section To CloudProfile Development Getting started locally (using the 
local provider) Setting up a development environment (using a cloud provider) Testing (Unit, Integration, E2E Tests) Test Machinery Tests Dependency Management Kubernetes Clients in Gardener Logging in Gardener Components Changing the API Secrets Management for Seed and Shoot Clusters Releases, Features, Hotfixes Adding New Cloud Providers Adding Support For A New Kubernetes Version Extending the Monitoring Stack How to create log parser for container into fluent-bit PriorityClasses in Gardener Clusters High Availability Of Deployed Components Checklist For Adding New Components Defaulting Strategy and Developer Guideline Extensions Extensibility overview Extension controller registration Cluster resource Extension points General conventions Trigger for reconcile operations Deploy resources into the shoot cluster Shoot resource customization webhooks Logging and monitoring for extensions Contributing to shoot health status conditions Health Check Library CA Rotation in Extensions Blob storage providers BackupBucket resource BackupEntry resource DNS providers DNSRecord resources IaaS/Cloud providers Control plane customization webhooks Bastion resource ControlPlane resource ControlPlane exposure resource Infrastructure resource Worker resource Network plugin providers Network resource Operating systems OperatingSystemConfig resource Container runtimes ContainerRuntime resource Generic (non-essential) extensions Extension resource Extension Admission Heartbeat controller Provider Local machine-controller-manager-provider-local Access to the Garden Cluster Control plane migration Force Deletion Extending project roles Referenced resources Deployment Getting started locally Getting started locally with extensions Setup Gardener on a Kubernetes cluster Version Skew Policy Deploying Gardenlets Automatic Deployment of Gardenlets Deploy a Gardenlet Manually Scoped API Access for Gardenlets Overwrite image vector Migration from Gardener v0 to v1 Feature Gates in Gardener 
Configuring the Logging stack SecretBinding Provider Controller Operations Gardener configuration and usage Control Plane Migration Istio ManagedSeeds: Register Shoot as Seed NetworkPolicys In Garden, Seed, Shoot Clusters Seed Bootstrapping Seed Settings Topology-Aware Traffic Routing Monitoring Alerting Connectivity Profiling Gardener Components ","categories":"","description":"The core component providing the extension API server of your Kubernetes cluster","excerpt":"The core component providing the extension API server of your …","ref":"/docs/gardener/","tags":"","title":"Gardener"},{"body":"Overview Gardener is all about Kubernetes clusters, which we call shoots. However, Gardener also does user management, delicate permission management and offers technical accounts to integrate its services into other infrastructures. It allows you to create several quotas and it needs credentials to connect to cloud providers. All of these are arranged in multiple fully contained projects, each of which belongs to a dedicated user and / or group.\nProjects on YAML Level Projects are a Kubernetes resource which can be expressed by YAML. The resource specification can be found in the API reference documentation.\nA project’s specification defines a name, a description (which is a free-text field), a purpose (again, a free-text field), an owner, and members. In Gardener, user management is done on a project level. Therefore, projects can have different members with certain roles.\nIn Gardener, a user can have one of five different roles: owner, admin, viewer, UAM, and service account manager. A member with the viewer role can see and list all clusters but cannot create, delete or modify them. For that, a member would need the admin role. Another important role would be the uam role - members with that role are allowed to manage members and technical users for a project. 
The owner of a project is allowed to do all of that, regardless of what other roles might be assigned to them.\nProjects are reconciled by Gardener’s project-controller, a component of Gardener’s controller manager. The status of the last reconciliation, along with any potential failures, will be recorded in the project’s status field.\nFor more information, see Projects.\nIn case you are interested, you can also view the source code for:\n The structure of a project API object Reconciling a project Gardener Projects and Kubernetes Namespaces Note Each Gardener project corresponds to a Kubernetes namespace and all project specific resources are placed into it. Even though projects are a dedicated Kubernetes resource, every project also corresponds to a dedicated namespace in the garden cluster. All project resources - including shoots - are placed into this namespace.\nYou can ask Gardener to use a specific namespace name in the project manifest but usually, this field should be left empty. The namespace then gets created automatically by Gardener’s project-controller, with its name getting generated from the project’s name, prefixed by “garden-”.\nResourceQuotas - if any - will be enforced on the project namespace.\nQuotas Since all Gardener resources are custom Kubernetes resources, the usual and well established concept of ResourceQuotas in Kubernetes can also be applied to Gardener resources. With a ResourceQuota that sets a hard limit on, e.g., count/shoots.core.gardener.cloud, you can restrict the number of shoot clusters that can be created in a project. Infrastructure Secrets For Gardener to create all relevant infrastructure that a shoot cluster needs inside a cloud provider, it needs to know how to authenticate to the cloud provider’s API. 
This is done through regular secrets.\nThrough the Gardener dashboard, secrets can be created for each supported cloud provider (using the dashboard is the preferred way, as it provides interactive help on what information needs to be placed into the secret and how the corresponding user account on the cloud provider should be configured). All of that is stored in a standard, opaque Kubernetes secret.\nInside of a shoot manifest, a reference to that secret is given so that Gardener knows which secret to use for a given shoot. Consequently, different shoots, even though they are in the same project, can be created on multiple different cloud provider accounts. However, instead of referring to the secret directly, Gardener introduces another layer of indirection called a SecretBinding.\nIn the shoot manifest, we refer to a SecretBinding and the SecretBinding in turn refers to the actual secret.\nSecretBindings With SecretBindings, it is possible to reference the same infrastructure secret in different projects across namespaces. This has the following advantages:\n Infrastructure secrets can be kept in one project (and thus namespace) with limited access. Through SecretBindings, the secrets can be used in other projects (and thus namespaces) without those projects being able to read their contents. Infrastructure secrets can be kept at one central place (a dedicated project) and be used by many other projects. This way, if a credential rotation is required, the credentials only need to be changed in the secrets at that central place and not in all projects that reference them. Service Accounts Since Gardener is 100% Kubernetes, it can be easily used in a programmatic way - by just sending the resource manifest of a Gardener resource to its API server. To do so, a kubeconfig file and a (technical) user that the kubeconfig maps to are required.\nNext to project members, a project can have several service accounts - simple Kubernetes service accounts that are created in a project’s namespace. 
Consequently, every service account will also have its own, dedicated kubeconfig and they can be granted different roles through RoleBindings.\nTo integrate Gardener with other infrastructure or CI/CD platforms, one can create a service account, obtain its kubeconfig and then automatically send shoot manifests to the Gardener API server. With that, Kubernetes clusters can be created, modified or deleted on the fly whenever they are needed.\n","categories":"","description":"","excerpt":"Overview Gardener is all about Kubernetes clusters, which we call …","ref":"/docs/getting-started/project/","tags":"","title":"Gardener Projects"},{"body":"","categories":"","description":"Gardener extension controllers for the supported container network interfaces","excerpt":"Gardener extension controllers for the supported container network …","ref":"/docs/extensions/network-extensions/","tags":"","title":"Network Extensions"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/other-components/machine-controller-manager/proposals/","tags":"","title":"Proposals"},{"body":"Overview In this topic you can see various shoot statuses and how you can use them to monitor your shoot cluster.\nShoot Status - Conditions You can retrieve the shoot status by using kubectl get shoot -oyaml\nIt contains conditions, which give you information about the healthiness of your cluster. Those conditions are also forwarded to the Gardener dashboard and show your cluster as healthy or unhealthy.\nShoot Status - Constraints The shoot status also contains constraints. If these constraints are met, your cluster operations are impaired and the cluster is likely to fail at some point. Please watch them and act accordingly.\nShoot Status - Last Operation The lastOperation, lastErrors, and lastMaintenance give you information on what was last happening in your clusters. 
This is especially useful when you are facing an error.\nIn this example, nodes are being recreated and not all machines have reached the desired state yet.\nShoot Status - Credentials Rotation You can also see the status of the last credentials rotation. Here you can also programmatically derive when the last rotation was done in order to trigger the next rotation.\n","categories":"","description":"","excerpt":"Overview In this topic you can see various shoot statuses and how you …","ref":"/docs/getting-started/observability/shoot-status/","tags":"","title":"Shoot Status"},{"body":"","categories":"","description":"The infrastructure, networking, OS and other extension components for Gardener","excerpt":"The infrastructure, networking, OS and other extension components for …","ref":"/docs/extensions/","tags":"","title":"List of Extensions"},{"body":"","categories":"","description":"Gardener extensions for the supported container runtime interfaces","excerpt":"Gardener extensions for the supported container runtime interfaces","ref":"/docs/extensions/container-runtime-extensions/","tags":"","title":"Container Runtime Extensions"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/gardener/deployment/","tags":"","title":"Deployment"},{"body":"External DNS Management When you deploy to Kubernetes, there is no native management of external DNS. Instead, the cloud-controller-manager requests (mostly IPv4) addresses for every service of type LoadBalancer. Of course, the Ingress resource helps here, but how is the external DNS entry for the ingress controller managed?\nEssentially, some sort of automation for DNS management is missing.\nAutomating DNS Management From a user’s perspective, it is desirable to work with already known resources and concepts. 
Hence, the DNS management offered by Gardener plugs seamlessly into Kubernetes resources and you do not need to “leave” the context of the shoot cluster.\nTo request a DNS record creation / update, a Service or Ingress resource is annotated accordingly. The shoot-dns-service extension (if configured) will pick up the request, create a DNSEntry resource, and reconcile it so that an actual DNS record is created at a configured DNS provider. Gardener supports the following providers:\n aws-route53 azure-dns azure-private-dns google-clouddns openstack-designate alicloud-dns cloudflare-dns For more information, see DNS Names.\nDNS Provider For the above to work, we need some ingredients. Primarily, this is implemented via a so-called DNSProvider. Every shoot has a default provider that is used to set up the API server’s public DNS record. It can be used to request sub-domains as well.\nIn addition, a shoot can reference credentials to a DNS provider. Those can be used to manage custom domains.\nPlease have a look at the documentation for further details.\n","categories":"","description":"","excerpt":"External DNS Management When you deploy to Kubernetes, there is no …","ref":"/docs/getting-started/features/dns-management/","tags":"","title":"External DNS Management"},{"body":"Overview A Kubernetes cluster consists of a control plane and a data plane. The data plane runs the actual containers on worker nodes (which translate to physical or virtual machines). For the control and data plane to work together properly, lots of components need matching configuration.\nSome configurations are standardized but some are also very specific to the needs of a cluster’s user / workload. Ideally, you want a properly configured cluster with the possibility to fine-tune some settings.\nConcept of a “Shoot” In Gardener, Kubernetes clusters (with their control plane and their data plane) are called shoot clusters or simply shoots. 
For Gardener, a shoot is just another Kubernetes resource. Gardener components watch it and act upon changes (e.g., creation). It comes with reasonable default settings but also allows fine-tuned configuration. And on top of it, you get a status providing health information, information about ongoing operations, and so on.\nLuckily there is a dashboard to get started.\nBasic Configuration Options Every cluster needs a name - after all, it is a Kubernetes resource and therefore unique within a namespace.\nThe Kubernetes version will be used as a starting point. Once a newer version is available, you can always update your existing clusters (but not downgrade, as this is not supported by Kubernetes in general).\nThe “purpose” affects some configuration (like automatic deployment of a monitoring stack or setting up certain alerting rules) and generally indicates the importance of a cluster.\nStart by selecting the infrastructure you want to use. The choice will be mapped to a cloud profile that contains provider specific information like the available (actual) OS images, zones and regions or machine types.\nEach data plane runs in an infrastructure account owned by the end user. By selecting the infrastructure secret containing the accounts credentials, you are granting Gardener access to the respective account to create / manage resources.\nNote Changing the account after the creation of a cluster is not possible. The credentials can be updated with a new key or even user but have to stay within the same account.\nCurrently, there is no way to move a single cluster to a different account. You would rather have to re-create a cluster and migrate workloads by different means.\n As part of the infrastructure you chose, the region for data plane has to be chosen as well. The Gardener scheduler will try to place the control plane on a seed cluster based on a minimal distance strategy. 
See Gardener Scheduler for more details.\nUp next, the networking provider (CNI) for the cluster has to be selected. At the point of writing, it is possible to choose between Calico and Cilium. If not specified in the shoot’s manifest, default CIDR ranges for nodes, services, and pods will be used.\nIn order to run any workloads in your cluster, you need nodes. The worker section lets you specify the most important configuration options. For beginners, the machine type is probably the most relevant field, together with the machine image (operating system).\nThe machine type is provider-specific and configured in the cloud profile. Check your respective cloud profile if you’re missing a machine type. Maybe it is available in general but unavailable in your selected region.\nThe operating system your machines will run is the next thing to choose. Debian-based GardenLinux is the best choice for most use cases.\nOther specifications for the workers include the volume type and size. These settings affect the root disk of each node. Therefore we would always recommend to use an SSD-based type to avoid i/o issues.\nCaveat Some machine types (e.g., bare-metal machine types on OpenStack) require you to omit the volume type and volume size settings. The autoscaler parameter defines the initial elasticity / scalability of your cluster. The cluster-autoscaler will add more nodes up to the maximum defined here when your workload grows and remove nodes in case your workload shrinks. The minimum number of nodes should be equal to or higher than the number of zones. You can distribute the nodes of a worker pool among all zones available to your cluster. This is the first step in running HA workloads.\nOnce per day, all clusters reconcile. This means all controllers will check if there are any updates they have to apply (e.g., new image version for ETCD). The maintenance window defines when this daily operation will be triggered. 
It is important to understand that there is no opt-out for reconciliation.\nIt is also possible to confine updates to the shoot spec to be applied only during this time. This can come in handy when you want to bundle changes or prevent changes to be applied outside a well-known time window.\nYou can allow Gardener to automatically update your cluster’s Kubernetes patch version and/or OS version (of the nodes). Take this decision consciously! Whenever a new Kubernetes patch version or OS version is set to supported in the respective cloud profile, auto update will upgrade your cluster during the next maintenance window. If you fail to (manually) upgrade the Kubernetes or OS version before they expire, force-upgrades will take place during the maintenance window.\nResult The result of your provided inputs and a set of conscious default values is a shoot resource that, once applied, will be acted upon by various Gardener components. The status section represents the intermediate steps / results of these operations. A typical shoot creation flow would look like this:\n Assign control plane to a seed. Create infrastructure resources in the data plane account (e.g., VPC, gateways, …) Deploy control plane incl. DNS records. Create nodes (VMs) and bootstrap kubelets. Deploy kube-system components to nodes. How to Access a Shoot Static credentials for shoots were discontinued in Gardener with Kubernetes v1.27. Short lived credentials need to be used instead. You can create/request tokens directly via Gardener or delegate authentication to an identity provider.\nA short-lived admin kubeconfig can be requested by using kubectl. If this is something you do frequently, consider switching to gardenlogin, which helps you with it.\nAn alternative is to use an identity provider and issue OIDC tokens.\nWhat can you configure? With the basic configuration options having been introduced, it is time to discuss more possibilities. 
Gardener offers a variety of options to tweak the control plane’s behavior - like defining an event TTL (default 1h), adding an OIDC configuration or activating some feature gates. You could alter the scheduling profile and define an audit logging policy. In addition, the control plane can be configured to run in HA mode (applied on a node or zone level), but keep in mind that once you enable HA, you cannot go back.\nIn case you have specific requirements for the cluster internal DNS, Gardener offers a plugin mechanism for custom core DNS rules or optimization with node-local DNS. For more information, see Custom DNS Configuration and NodeLocalDNS Configuration.\nAnother category of configuration options is dedicated to the nodes and the infrastructure they are running on. Every provider has their own perks and some of them are exposed. Check the detailed documentation of the relevant extension for your infrastructure provider.\nYou can fine-tune the cluster-autoscaler or help the kubelet to cope better with your workload.\nWorker Pools There are a couple of ways to configure a worker pool. One of them is to set everything in the Gardener dashboard. However, only a subset of options is presented there.\nA slightly more complex way is to set the configuration through the yaml file itself.\nThis allows you to configure much more properties of a worker pool, like the timeout after which an unhealthy machine is getting replaced. For more options, see the Worker API reference.\nHow to Change Things Since a shoot is just another Kubernetes resource, changes can be applied via kubectl. For convenience, the basic settings are configurable via the dashboard’s UI. It also has a “yaml” tab where you can alter all of the shoot’s specification in your browser. 
Once applied, the cluster will reconcile eventually and your changes become active (or cause an error).\nImmutability in a Shoot While Gardener allows you to modify existing shoot clusters, it is important to remember that not all properties of a shoot can be changed after it is created.\nFor example, it is not possible to move a shoot to a different infrastructure account. This is mainly rooted in the fact that disks and network resources are bound to your account.\nAnother set of options that become immutable are most of the network aspects of a cluster. On an infrastructure level the VPC cannot be changed and on a cluster level things like the pod / service CIDR ranges, together with the nodeCIDRmask, are set for the lifetime of the cluster.\nSome other things can be changed, but not reverted. While it is possible to add more zones to a cluster on an infrastructure level (assuming that an appropriate CIDR range is available), removing zones is not supported. Similarly, upgrading Kubernetes versions is comparable to a one-way ticket. As of now, Kubernetes does not support downgrading. Lastly, the HA setting of the control plane is immutable once specified.\nCrazy Botany Since remembering all these options can be quite challenging, here is a very helpful resource - an example shoot with all the latest options 🎉\n","categories":"","description":"","excerpt":"Overview A Kubernetes cluster consists of a control plane and a data …","ref":"/docs/getting-started/shoots/","tags":"","title":"Gardener Shoots"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/other-components/machine-controller-manager/todo/","tags":"","title":"ToDo"},{"body":"","categories":"","description":"Other Gardener extensions","excerpt":"Other Gardener extensions","ref":"/docs/extensions/others/","tags":"","title":"Others"},{"body":"Certificate Management For proper consumption, any service should present a TLS certificate to its consumers. 
However, self-signed certificates are not fit for this purpose - the certificate should be signed by a CA trusted by an application’s userbase. Luckily, issuers like Let’s Encrypt and others help here by offering a signing service that issues certificates based on the ACME challenge (Automatic Certificate Management Environment).\nThere are plenty of tools you can use to perform the challenge. For Kubernetes, cert-manager certainly is the most common; however, its configuration is rather cumbersome and error-prone. So let’s see how a Gardener extension can help here.\nManage Certificates with Gardener You may annotate a Service or Ingress resource to trigger the cert-manager to request a certificate from any configured issuer (e.g., Let’s Encrypt) and perform the challenge. A Gardener operator can add a default issuer for convenience. With the DNS extension discussed previously, setting up the DNS TXT record for the ACME challenge is fairly easy. The requested certificate can be customized by means of several other annotations known to the controller. Most notably, it is possible to specify SANs via cert.gardener.cloud/dnsnames to accommodate domain names that have more than 64 characters (the limit for the CN field).\nThe user’s request for a certificate manifests as a certificate resource. The status, issuer, and other properties can be checked there.\nOnce successful, the resulting certificate will be stored in a secret and is ready for usage.\nWith additional configuration, it is also possible to define custom issuers of certificates.\nFor more information, see the Manage certificates with Gardener for public domain topic and the cert-management repository.\n","categories":"","description":"","excerpt":"Certificate Management For proper consumption, any service should …","ref":"/docs/getting-started/features/certificate-management/","tags":"","title":"Certificate Management"},{"body":"Overview A cluster has a data plane and a control plane. 
The data plane is like a space station. It has certain components which keep everyone / everything alive and can operate autonomously to a certain extent. However, without mission control (and the occasional delivery of supplies) it cannot share information or receive new instructions.\nSo let’s see what the mission control (control plane) of a Kubernetes cluster looks like.\nKubeception Kubeception - Kubernetes in Kubernetes in Kubernetes\nIn the classic setup, there is a dedicated host / VM to host the master components / control plane of a Kubernetes cluster. However, these are just normal programs that can easily be put into containers. Once in containers, we can make Kubernetes Deployments and StatefulSets (for the etcd) watch over them. And now we put all that into a separate, dedicated Kubernetes cluster - et voilà, we have Kubernetes in Kubernetes, aka Kubeception (named after the famous movie Inception with Leonardo DiCaprio).\nIn Gardener’s terminology, the cluster hosting the control plane components is called a seed cluster. The cluster that end users actually use (and whose control plane is hosted in the seed) is called a shoot cluster.\nControl Plane Components on the Seed All control-plane components of a shoot cluster run in a dedicated namespace on the seed.\nA control plane has lots of components:\n Everything needed to run vanilla Kubernetes etcd main \u0026 events (split for performance reasons) Kube-.*-manager CSI driver Additionally, we deploy components needed to manage the cluster:\n Gardener Resource Manager (GRM) Machine Controller Manager (MCM) DNS Management VPN There is also a set of components making our life easier (logging, monitoring) or adding additional features (cert manager).\nCore Components Let’s take a close look at the API server as well as etcd.\nSecrets are encrypted at rest. When asking etcd for the data, the reply is still encrypted. 
Decryption is done by the API server which knows the necessary key.\nFor non-HA clusters etcd has only 1 replica, while for HA clusters there are 3 replicas.\nOne special remark is needed for Gardener’s deployment of etcd. The pods coming from the etcd-main StatefulSet contain two containers - one runs etcd, the other runs a program that periodically backs up etcd’s contents to an object store that is set up per seed cluster to make sure no data is lost. After all, etcd is the Achilles heel of all Kubernetes clusters. The backup container is also capable of performing a restore from the object store as well as defragment and compact the etcd datastore. For performance reasons, Gardener stores Kubernetes events in a separate etcd instance. By default, events are retained for 1h but can be kept longer if defined in the shoot.spec.\nThe kube API server (often called “kapi”) scales both horizontally and vertically.\nThe kube API server is not directly exposed / reachable via its public hostname. Instead, Gardener runs a single LoadBalancer service backed by an istio gateway / envoy, which uses SNI to forward traffic.\nThe kube-controller-manager (aka KCM) is the component that contains all the controllers for the core Kubernetes objects such as Deployments, Services, PVCs, etc.\nThe Kubernetes scheduler will assign pods to nodes.\nThe Cloud Controller Manager (aka CCM) is the component that contains all functionality to talk to Cloud environments (e.g., create LoadBalancer services).\nThe CSI driver is the storage subsystem of Kubernetes. It provisions and manages anything related to persistence.\nWithout the cluster autoscaler, nodes could not be added or removed based on current pressure on the cluster resources. 
Without the VPA, pods would have fixed resource limits that could not change on demand.\nGardener-Specific Components Shoot DNS service: External DNS management for resources within the cluster.\nMachine Controller Manager: Responsible for managing VMs which will become nodes in the cluster.\nVirtual Private Network deployments (aka VPN): Almost every communication between Kubernetes controllers and the API server is unidirectional - the controllers are given a kubeconfig and will establish a connection to the API server, which is exposed to all nodes of the cluster through a LoadBalancer. However, there are a few operations that require the API server to connect to the kubelet instead (e.g., for every webhook, when using kubectl exec or kubectl logs). Since every good Kubernetes cluster will have its worker nodes shielded behind firewalls to reduce the attack surface, Gardener establishes a VPN connection from the shoot’s internal network to the API server in the seed. For that, every shoot, as well as every control plane namespace in the seed, have openVPN pods in them that connect to each other (with the connection being established from the shoot to the seed).\nGardener Resource Manager: Tooling to deploy and manage Kubernetes resources required for cluster functionality.\nMachines Machine Controller Manager (aka MCM):\nThe machine controller manager, which lives on the seed in a shoot’s control plane namespace, is the key component responsible for provisioning and removing worker nodes for a Kubernetes cluster. It acts on MachineClass, MachineDeployment, and MachineSet resources in the seed (think of them as the equivalent of Deployments and ReplicaSets) and controls the lifecycle of machine objects. 
Through a system of plugins, the MCM is the component that phones to the cloud provider’s API and bootstraps virtual machines.\nFor more information, see MCM and Cluster-autoscaler.\nManagedResources Gardener Resource Manager (aka GRM):\nGardener not only deploys components into the control plane namespace of the seed but also to the shoot (e.g., the counterpart of the VPN). Together with the components in the seed, Gardener needs to have a way to reconcile them.\nEnter the GRM - it reconciles on ManagedResources objects, which are descriptions of Kubernetes resources which are deployed into the seed or shoot by GRM. If any of these resources are modified or deleted by accident, the usual observe-analyze-act cycle will revert these potentially malicious changes back to the values that Gardener envisioned. In fact, all the components found in a shoot’s kube-system namespace are ManagedResources governed by the GRM. The actual resource definition is contained in secrets (as they may contain “secret” data), while the ManagedResources contain a reference to the secret containing the actual resource to be deployed and reconciled.\nDNS Records - “Internal” and “External” The internal domain name is used by all Gardener components to talk to the API server. Even though it is called “internal”, it is still publicly routable.\nBut most importantly, it is pre-defined and not configurable by the end user.\nTherefore, the “external” domain name exists. It is either a user owned domain or can be pre-defined for a Gardener landscape. It is used by any end user accessing the cluster’s API server.\nFor more information, see Contract: DNSRecord Resources.\nFeatures and Observability Gardener runs various health checks to ensure that the cluster works properly. 
The Network Problem Detector gives information about connectivity within the cluster and to the API server.\nCertificate Management: allows to request certificates via the ACME protocol (e.g., issued by Let’s Encrypt) from within the cluster. For detailed information, have a look at the cert-manager project.\nObservability stack: Gardener deploys observability components and gathers logs and metrics for the control-plane \u0026 kube-system namespace. Also provided out-of-the-box is a UI based on Plutono (fork of Grafana) with pre-defined dashboards to access and query the monitoring data. For more information, see Observability.\nHA Control Plane As the title indicates, the HA control plane feature is only about the control plane. Setting up the data plane to span multiple zones is part of the worker spec of a shoot.\nHA control planes can be configured as part of the shoot’s spec. The available types are:\n Node Zone Both work similarly and just differ in the failure domain the concepts are applied to.\nFor detailed guidance and more information, see the High Availability Guides.\nZonal HA Control Planes Zonal HA is the most likely setup for shoots with purpose: production.\nThe starting point is a regular (non-HA) control plane. etcd and most controllers are singletons and the kube-apiserver might have been scaled up to several replicas.\nTo get to an HA setup we need:\n A minimum of 3 replicas of the API server 3 replicas for etcd (both main and events) A second instance for each controller (e.g., controller manager, csi-driver, scheduler, etc.) that can take over in case of failure (active / passive). To distribute those pods across zones, well-known concepts like PodTopologySpreadConstraints or Affinities are applied.\nkube-system Namespace For a fully functional cluster, a few components need to run on the data plane side of the diagram. They all exist in the kube-system namespace. 
Let’s have a closer look at them.\nNetworking On each node we need a CNI (container network interface) plugin. Gardener offers Calico or Cilium as network provider for a shoot. When using Calico, a kube-proxy is deployed. Cilium does not need a kube-proxy, as it takes care of its tasks as well.\nThe CNI plugin ensures pod-to-pod communication within the cluster. As part of it, it assigns cluster-internal IP addresses to the pods and manages the network devices associated with them. When an overlay network is enabled, calico will also manage the routing of pod traffic between different nodes.\nOn the other hand, kube-proxy implements the actual service routing (cilium can do this as well and no kube-proxy is needed). Whenever packets go to a service’s IP address, they are re-routed based on IPtables rules maintained by kube-proxy to reach the actual pods backing the service. kube-proxy operates on endpoint-slices and manages IPtables on EVERY node. In addition, kube-proxy provides a health check endpoint for services with externalTrafficPolicy=local, where traffic only gets to nodes that run a pod matching the selector of the service.\nThe egress filter implements basic filtering of outgoing traffic to be compliant with SAP’s policies.\nAnd what happens if the pods crashloop, are missing or otherwise broken?\nWell, in case kube-proxy is broken, service traffic will degrade over time (depending on the pod churn rate and how many kube-proxy pods are broken).\nWhen calico is failing on a node, no new pods can start there as they don’t get any IP address assigned. It might also fail to add routes to newly added nodes. Depending on the error, deleting the pod might help.\nDNS System For a normal service in Kubernetes, a cluster-internal DNS record that resolves to the service’s ClusterIP address is being created. In Gardener (similar to most other Kubernetes offerings) CoreDNS takes care of this aspect. 
To reduce the load when it comes to upstream DNS queries, Gardener deploys a DNS cache to each node by default. It will also forward queries outside the cluster’s search domain directly to the upstream DNS server. For more information, see NodeLocalDNS Configuration and DNS autoscaling.\nIn addition to this optimization, Gardener allows custom DNS configuration to be added to CoreDNS via a dedicated ConfigMap.\nIn case this customization is related to non-Kubernetes entities, you may configure the shoot’s NodeLocalDNS to forward to CoreDNS instead of upstream (disableForwardToUpstreamDNS: true).\nA broken DNS system on any level will cause disruption / service degradation for applications within the cluster.\nHealth Checks and Metrics Gardener deploys probes checking the health of individual nodes. In a similar fashion, a network health check probes connectivity within the cluster (node to node, pod to pod, pod to api-server, …).\nThey provide the data foundation for Gardener’s monitoring stack together with the metrics collecting / exporting components.\nConnectivity Components From the perspective of the data plane, the shoot’s API server is reachable via the cluster-internal service kubernetes.default.svc.cluster.local. The apiserver-proxy intercepts connections to this destination and changes it so that the traffic is forwarded to the kube-apiserver service in the seed cluster. For more information, see kube-apiserver via apiserver-proxy.\nThe second component here is the VPN shoot. It initiates a VPN connection to its counterpart in the seed. This way, there is no open port / Loadbalancer needed on the data plane. The VPN connection is used for any traffic flowing from the control plane to the data plane. If the VPN connection is broken, port-forwarding or log querying with kubectl will not work. In addition, webhooks will stop functioning properly.\ncsi-driver The last component to mention here is the csi-driver that is deployed as a Daemonset to all nodes. 
It registers with the kubelet and takes care of the mounting of volume types it is responsible for.\n","categories":"","description":"","excerpt":"Overview A cluster has a data plane and a control plane. The data …","ref":"/docs/getting-started/ca-components/","tags":"","title":"Control Plane Components"},{"body":"Frequently Asked Questions The answers in this FAQ apply to the newest (HEAD) version of Machine Controller Manager. If you’re using an older version of MCM, please refer to the corresponding version of this document. A few of the answers assume that MCM is being used in conjunction with cluster-autoscaler.\nTable of Contents: Basics\n What is Machine Controller Manager? Why is my machine deleted? What are the different sub-controllers in MCM? What is Safety Controller in MCM? How to?\n How to install MCM in a Kubernetes cluster? How to better control the rollout process of the worker nodes? How to scale down MachineDeployment by selective deletion of machines? How to force delete a machine? How to pause the ongoing rolling-update of the machinedeployment? How to delete machine object immediately if I don’t have access to it? How to avoid garbage collection of your node? How to trigger rolling update of a machinedeployment? Internals\n What is the high level design of MCM? What are the different configuration options in MCM? What are the different timeouts/configurations in a machine’s lifecycle? How is the drain of a machine implemented? How are the stateful applications drained during machine deletion? How does maxEvictRetries configuration work with drainTimeout configuration? What are the different phases of a machine? What health checks are performed on a machine? How does rate limiting replacement of machine work in MCM? How is it related to meltdown protection? How MCM responds when scale-out/scale-in is done during rolling update of a machinedeployment? How some unhealthy machines are drained quickly? 
How does MCM prioritize the machines for deletion on scale-down of machinedeployment? Troubleshooting\n My machine is stuck in deletion for 1 hr, why? My machine is not joining the cluster, why? Developer\n How should I test my code before submitting a PR? I need to change the APIs, what are the recommended steps? How can I update the dependencies of MCM? In the context of Gardener\n How can I configure MCM using Shoot resource? How is my worker-pool spread across zones? Basics What is Machine Controller Manager? Machine Controller Manager (MCM) is a set of controllers used for the lifecycle management of worker machines. It reconciles a set of CRDs such as Machine, MachineSet, and MachineDeployment, which mirror the functionality of Pod, ReplicaSet, and Deployment in core Kubernetes, respectively. Read more about it in the README.\n Gardener uses MCM to manage the Kubernetes nodes of the shoot cluster. However, by design, MCM can be used independently of Gardener. Why is my machine deleted? A machine is generally deleted by MCM for one of two reasons:\n Machine is unhealthy for at least MachineHealthTimeout period. The default MachineHealthTimeout is 10 minutes.\n By default, a machine is considered unhealthy if any of the following node conditions - DiskPressure, KernelDeadlock, FileSystem, Readonly is set to true, or KubeletReady is set to false. However, this is something that is configurable using the following flag. Machine is scaled down by the MachineDeployment resource.\n This is very common when an external controller such as cluster-autoscaler (aka CA) is used with MCM. CA deletes the under-utilized machines by scaling down the MachineDeployment. Read more about cluster-autoscaler’s scale down behavior here. What are the different sub-controllers in MCM? MCM mainly contains the following sub-controllers:\n MachineDeployment Controller: Responsible for reconciling the MachineDeployment objects. It manages the lifecycle of the MachineSet objects. 
MachineSet Controller: Responsible for reconciling the MachineSet objects. It manages the lifecycle of the Machine objects. Machine Controller: Responsible for reconciling the Machine objects. It manages the lifecycle of the actual VMs/machines created in cloud/on-prem. This controller has been moved out of tree. Please refer to the AWS machine controller for more info - link. Safety Controller: Responsible for handling the unidentified/unknown behaviors from the cloud providers. Please read more about its functionality below. What is Safety Controller in MCM? Safety Controller contains the following functions:\n Orphan VM handler: It lists all the VMs in the cloud matching the tag of the given cluster name and maps the VMs to the machine objects using the ProviderID field. VMs without any backing machine objects are logged and deleted after confirmation. This handler runs every 30 minutes and is configurable via the machine-safety-orphan-vms-period flag. Freeze mechanism: Safety Controller freezes the MachineDeployment and MachineSet controllers if the number of machine objects goes beyond a certain threshold on top of Spec.Replicas. It can be configured by the flags --safety-up or --safety-down and also machine-safety-overshooting-period. Safety Controller freezes the functionality of MCM if either the target-apiserver or the control-apiserver is not reachable. Safety Controller unfreezes MCM automatically once the situation has returned to normal. A freeze label is applied on MachineDeployment/MachineSet to enforce the freeze condition. How to? How to install MCM in a Kubernetes cluster? MCM can be installed in a cluster with the following steps:\n Apply all the CRDs from here\n Apply all the deployment, role-related objects from here.\n Control cluster is the one where the machine-* objects are stored. Target cluster is where all the node objects are registered. How to better control the rollout process of the worker nodes? 
MCM allows configuring the rollout of the worker machines using the maxSurge and maxUnavailable fields. These fields are applicable only during the rollout process and have no effect in general scale up/down scenarios. The overall process is very similar to how the Deployment Controller manages pods during RollingUpdate.\n maxSurge refers to the number of additional machines that can be added on top of the Spec.Replicas of MachineDeployment during rollout process. maxUnavailable refers to the number of machines that can be deleted from Spec.Replicas field of the MachineDeployment during rollout process. How to scale down MachineDeployment by selective deletion of machines? During scale down, triggered via MachineDeployment/MachineSet, MCM prefers to delete the machine/s which have the least priority set. Each machine object has an annotation machinepriority.machine.sapcloud.io set to 3 by default. Admin can reduce the priority of the given machines by changing the annotation value to 1. The next scale down by MachineDeployment shall delete the machines with the least priority first.\nHow to force delete a machine? A machine can be force deleted by adding the label force-deletion: \"True\" on the machine object before executing the actual delete command. During force deletion, MCM skips the drain function and simply triggers the deletion of the machine. This label should be used with caution as it can violate the PDBs for pods running on the machine.\nHow to pause the ongoing rolling-update of the machinedeployment? An ongoing rolling-update of the machine-deployment can be paused by using the spec.paused field. See the example below:\napiVersion: machine.sapcloud.io/v1alpha1 kind: MachineDeployment metadata: name: test-machine-deployment spec: paused: true It can be unpaused again by removing the paused field from the machine-deployment.\nHow to delete machine object immediately if I don’t have access to it? 
If the user doesn’t have access to the machine objects (as is the case in Gardener clusters) and they would like to replace a node immediately, they can place the annotation node.machine.sapcloud.io/trigger-deletion-by-mcm: \"true\" on their node. This will start the replacement of the machine with a new node.\nOn the other hand, if the user deletes the node object directly, the replacement will start only after MachineHealthTimeout.\nThis annotation can also be used if the user wants to expedite the replacement of unhealthy nodes.\nNOTE:\n The node.machine.sapcloud.io/trigger-deletion-by-mcm: \"false\" annotation is NOT acted upon by MCM, nor does it mean that MCM will not replace this machine. This annotation deletes the desired machine, but another machine is created to maintain the desired replicas specified for the machineDeployment/machineSet. Currently, if the user doesn’t have access to machineDeployment/machineSet, they cannot remove a machine without replacement. How to avoid garbage collection of your node? MCM provides an in-built safety mechanism to garbage collect VMs which have no corresponding machine object. This is done to save costs and is one of the key features of MCM. However, sometimes users might like to add nodes directly to the cluster without the help of MCM and would prefer MCM to not garbage collect such VMs. To do so they should remove/not-use tags on their VMs containing the following strings:\n kubernetes.io/cluster/ kubernetes.io/role/ kubernetes-io-cluster- kubernetes-io-role- How to trigger rolling update of a machinedeployment? Rolling update can be triggered for a machineDeployment by updating one of the following:\n .spec.template.annotations .spec.template.spec.class.name Internals What is the high level design of MCM? Please refer to the following document.\nWhat are the different configuration options in MCM? MCM allows configuring many knobs to fine-tune its behavior according to the user’s need. 
Please refer to the link to check the exact configuration options.\nWhat are the different timeouts/configurations in a machine’s lifecycle? A machine’s lifecycle is mainly governed by the following timeouts, which can be configured here.\n MachineDrainTimeout: Amount of time after which drain times out and the machine is force deleted. Default ~2 hours. MachineHealthTimeout: Amount of time after which an unhealthy machine is declared Failed and the machine is replaced by MachineSet controller. MachineCreationTimeout: Amount of time after which a machine creation is declared Failed and the machine is replaced by the MachineSet controller. NodeConditions: List of node conditions which if set to true for MachineHealthTimeout period, the machine is declared Failed and replaced by MachineSet controller. MaxEvictRetries: An integer number depicting the number of times a failed eviction should be retried on a pod during drain process. A pod is deleted after max-retries. How is the drain of a machine implemented? MCM imports the functionality from the upstream Kubernetes-drain library, although a few parts have been modified to make it work best in the context of MCM. Drain is executed before machine deletion for graceful migration of the applications. Drain internally uses the EvictionAPI to evict the pods and triggers the Deletion of pods after MachineDrainTimeout. Please note:\n Stateless pods are evicted in parallel. Stateful applications (with PVCs) are serially evicted. Please find more info in this answer below. How are the stateful applications drained during machine deletion? Drain function serially evicts the stateful-pods. It is observed that serial eviction of stateful pods yields better overall availability of pods as the underlying cloud in most cases detaches and reattaches disks serially anyway. It is implemented in the following manner:\n Drain lists all the pods with attached volumes. 
It evicts the very first stateful-pod and waits for its related entry in Node object’s .status.volumesAttached to be removed by KCM. It does the same for all the stateful-pods. It waits for PvDetachTimeout (default 2 minutes) for a given pod’s PVC to be removed, else moves forward. How does maxEvictRetries configuration work with drainTimeout configuration? It is recommended to only set MachineDrainTimeout. It satisfies the related requirements. MaxEvictRetries is auto-calculated based on MachineDrainTimeout, if maxEvictRetries is not provided. The overall behavior of both configurations together is as follows:\n If maxEvictRetries isn’t set and only drainTimeout is set: MCM auto calculates the maxEvictRetries based on the drainTimeout. If drainTimeout isn’t set and only maxEvictRetries is set: Default drainTimeout and user provided maxEvictRetries for each pod is considered. If both maxEvictRetries and drainTimeout are set: Then both will be respected. If none are set: Defaults are respected. What are the different phases of a machine? A phase of a machine can be identified with Machine.Status.CurrentStatus.Phase. Following are the possible phases of a machine object:\n Pending: Machine creation call has succeeded. MCM is waiting for machine to join the cluster.\n CrashLoopBackOff: Machine creation call has failed. MCM will retry the operation after a minor delay.\n Running: Machine creation call has succeeded. Machine has joined the cluster successfully and corresponding node doesn’t have node.gardener.cloud/critical-components-not-ready taint.\n Unknown: Machine health checks are failing, e.g. kubelet has stopped posting the status.\n Failed: Machine health checks have failed for a prolonged time. Hence it is declared failed by Machine controller in a rate limited fashion. Failed machines get replaced immediately.\n Terminating: Machine is being terminated. Terminating state is set immediately when the deletion is triggered for the machine object. 
It also includes the time when it’s being drained.\n NOTE: No phase means the machine is being created on the cloud-provider.\nBelow is a simple phase transition diagram: What health checks are performed on a machine? Health checks performed on a machine are:\n Existence of corresponding node obj Status of certain user-configurable node conditions. These conditions can be specified using the flag --node-conditions for OOT MCM provider or can be specified per machine object. The default user configurable node conditions can be found here True status of NodeReady condition. This condition shows kubelet’s status If any of the above checks fails, the machine turns to Unknown phase.\nHow does rate limiting replacement of machine work in MCM? How is it related to meltdown protection? Currently MCM replaces only 1 Unknown machine at a time per machinedeployment. This means until the particular Unknown machine gets terminated and its replacement joins, no other Unknown machine would be removed.\nThe above is achieved by enabling Machine controller to turn machine from Unknown -\u003e Failed only if the above condition is met. MachineSet controller on the other hand marks Failed machine as Terminating immediately.\nOne reason for this rate limited replacement was to ensure that in case of network failures, where node’s kubelet can’t reach out to kube-apiserver, all nodes are not removed together, i.e. meltdown protection. In gardener context however, DWD is deployed to deal with this scenario, but to stay protected from corner cases, this mechanism has been introduced in MCM.\nNOTE: Rate limiting replacement is not yet configurable\nHow MCM responds when scale-out/scale-in is done during rolling update of a machinedeployment? Machinedeployment controller executes the logic of scaling BEFORE logic of rollout. It identifies scaling by comparing the deployment.kubernetes.io/desired-replicas of each machineset under the machinedeployment with machinedeployment’s .spec.replicas. 
If a difference is found for any machineSet, a scaling event is detected.\nCase scale-out -\u003e ONLY New machineSet is scaled out Case scale-in -\u003e ALL machineSets (new or old) are scaled in, in proportion to their replica count; any leftover is adjusted in the largest machineSet.\nDuring update for scaling event, a machineSet is updated if any of the below is true for it:\n .spec.Replicas needs update deployment.kubernetes.io/desired-replicas needs update Once scaling is achieved, rollout continues.\nHow does MCM prioritize the machines for deletion on scale-down of machinedeployment? There could be many machines under a machinedeployment with different phases, creationTimestamp. When a scale down is triggered, MCM decides to remove the machine using the following logic:\n Machine with least value of machinepriority.machine.sapcloud.io annotation is picked up. If all machines have equal priorities, then following precedence is followed: Terminating \u003e Failed \u003e CrashloopBackoff \u003e Unknown \u003e Pending \u003e Available \u003e Running If still there is no match, the machine with oldest creation time (i.e. creationTimestamp) is picked up. How some unhealthy machines are drained quickly? If a node is unhealthy for more than the machine-health-timeout specified for the machine-controller, the controller health-check moves the machine phase to Failed. By default, the machine-health-timeout is 10 minutes.\nFailed machines have their deletion timestamp set and the machine then moves to the Terminating phase. The node drain process is initiated. The drain process is invoked either gracefully or forcefully.\nThe usual drain process is graceful. Pods are evicted from the node and the drain process waits until any existing attached volumes are mounted on new node. However, if the node Ready is False or the ReadonlyFilesystem is True for greater than 5 minutes (non-configurable), then a forceful drain is initiated. 
In a forceful drain, pods are deleted and VolumeAttachment objects associated with the old node are also marked for deletion. This is followed by the deletion of the cloud provider VM associated with the Machine and then finally ending with the Node object deletion.\nDuring the deletion of the VM we only delete the local data disks and boot disks associated with the VM. The disks associated with persistent volumes are left untouched as their attach/detach, mount/unmount processes are handled by k8s attach-detach controller in conjunction with the CSI driver.\nTroubleshooting My machine is stuck in deletion for 1 hr, why? In most cases, the Machine.Status.LastOperation provides information about why a machine can’t be deleted. Possible reasons include, but are not limited to:\n Pod/s with misconfigured PDBs block the drain operation. PDBs with maxUnavailable set to 0 don’t allow the eviction of the pods. Hence, drain/eviction is retried till MachineDrainTimeout. Default MachineDrainTimeout could be as large as ~2 hours, which blocks the machine deletion. Short term: User can manually delete the pod in question, with caution. Long term: Please set more appropriate PDBs which allow disruption of at least one pod. Expired cloud credentials can block the deletion of the machine from infrastructure. Cloud provider can’t delete the machine due to internal errors. Such situations are best debugged by using cloud provider specific CLI or cloud console. My machine is not joining the cluster, why? In most cases, the Machine.Status.LastOperation provides information about why a machine can’t be created. It could possibly be debugged with the following steps:\n Firstly make sure all the relevant controllers like kube-controller-manager, cloud-controller-manager are running. Verify if the machine is actually created in the cloud. User can use the Machine.Spec.ProviderId to query the machine in cloud. 
A Kubernetes node is generally bootstrapped with the cloud-config. Please verify that the MachineDeployment is pointing to the correct MachineClass, and the MachineClass is pointing to the correct Secret. The secret object contains the actual cloud-config in base64 format which will be used to boot the machine. User must also check the logs of the MCM pod to understand any broken logical flow of reconciliation. My rolling update is stuck, why? The following can be the reason:\n Insufficient capacity for the new instance type the machineClass mentions. Old machines are stuck in deletion If you are using Gardener for setting up kubernetes cluster, then machine object won’t turn to Running state until node-critical-components are ready. Refer to this for more details. Developer How should I test my code before submitting a PR? Developer can locally set up the MCM using the following guide Developer must also enhance the unit tests related to the incoming changes. Developer can locally run the unit test by executing: make test-unit Developer can locally run integration tests to ensure basic functionality of MCM is not altered. I need to change the APIs, what are the recommended steps? Developer should add/update the API fields at both of the following places:\n https://github.com/gardener/machine-controller-manager/blob/master/pkg/apis/machine/types.go https://github.com/gardener/machine-controller-manager/tree/master/pkg/apis/machine/v1alpha1 Once API changes are done, auto-generate the code using the following command:\nmake generate Please ignore the API-violation errors for now.\nHow can I update the dependencies of MCM? MCM uses gomod for dependency management. Developer should add/update the dependency in the go.mod file. Please run the following command to automatically tidy the dependencies.\nmake tidy In the context of Gardener How can I configure MCM using Shoot resource? 
All of the knobs of MCM can be configured by the workers section of the shoot resource.\n Gardener creates a MachineDeployment per zone for each worker-pool under workers section. workers.dataVolumes allows attaching multiple disks to a machine during creation. Refer to the link. workers.machineControllerManager allows configuration of multiple knobs of the MachineDeployment from the shoot resource. How is my worker-pool spread across zones? Shoot resource allows the worker-pool to spread across multiple zones using the field workers.zones. Refer to the link.\n Gardener creates one MachineDeployment per zone. Each MachineDeployment is initiated with the following replica: MachineDeployment.Spec.Replicas = (Workers.Minimum)/(Number of availability zones) ","categories":"","description":"Frequently Asked Questions","excerpt":"Frequently Asked Questions","ref":"/docs/other-components/machine-controller-manager/faq/","tags":"","title":"FAQ"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/gardener/monitoring/","tags":"","title":"Monitoring"},{"body":"","categories":"","description":"Other components included in the Gardener project","excerpt":"Other components included in the Gardener project","ref":"/docs/other-components/","tags":"","title":"Other Components"},{"body":"Gardener Dashboard \n \nDemo Documentation Gardener Dashboard Documentation\nLicense Apache License 2.0\nCopyright 2020 The Gardener Authors\n","categories":"","description":"The web UI for managing your projects and clusters","excerpt":"The web UI for managing your projects and clusters","ref":"/docs/dashboard/","tags":"","title":"Dashboard"},{"body":"Reconciliation in Kubernetes and Gardener The starting point of all reconciliation cycles is the constant observation of both the desired and actual state. A component would analyze any differences between the two states and try to converge the actual towards the desired state using appropriate actions. 
Typically, a component is responsible for a single resource type but it also watches others that have an implication on it.\nAs an example, the Kubernetes controller for ReplicaSets will watch pods belonging to it in order to ensure that the specified replica count is fulfilled. If one pod gets deleted, the controller will create a new pod to enforce the desired over the actual state.\nThis is all standard behaviour, as Gardener is following the native Kubernetes approach. All elements of a shoot cluster have a representation in Kubernetes resources and controllers are watching / acting upon them.\nIf we pick up the example of the ReplicaSet - a user typically creates a deployment resource and the ReplicaSet is implicitly generated on the way to create the pods. Similarly, Gardener takes the user’s intent (shoot) and creates lots of domain specific resources on the way. They all reconcile and make sure their actual and desired states match.\nUpdating the Desired State of a Shoot Based on the shoot’s specifications, Gardener will create network resources on a hyperscaler, backup resources for the ETCD, credentials, and other resources, but also representations of the worker pools. Eventually, this process will result in a fully functional Kubernetes cluster.\nIf you change the desired state, Gardener will reconcile the shoot and run through the same cycle to ensure the actual state matches the desired state.\nFor example, the (infrastructure-specific) machine type can be changed within the shoot resource. The following reconciliation will pick up the change and initiate the creation of new nodes with a different machine type and the removal of the old nodes.\nMaintenance Window and Daily Reconciliation EVERY shoot cluster reconciles once per day during the so-called “maintenance window”. You can confine the rollout of spec changes to this window.\nAdditionally, the daily reconciliation will help pick up all kinds of version changes. 
When a new Gardener version is rolled out to the landscape, shoot clusters will pick up any changes during their next reconciliation. For example, if a new Calico version is introduced to fix some bug, it will automatically reach all shoots.\nImpact of a Change It is important to be aware of the impacts that a change can have on a cluster and the workloads within it.\nAn operator pushing a new Gardener version with a new calico image to a landscape will cause all calico pods to be re-created. Another example would be the rollout of a new etcd backup-restore image. This would cause etcd pods to be re-created, rendering a non-HA control plane unavailable until etcd is up and running again.\nWhen you change the shoot spec, it can also have significant impact on the cluster. Imagine that you have changed the machine type of a worker pool. This will cause new machines to be created and old machines to be deleted. Or in other words: all nodes will be drained, the pods will be evicted and then re-created on newly created nodes.\nKubernetes Version Update (Minor + Patch) Some operations are rather common and have to be performed on a regular basis. Updating the Kubernetes version is one of them. Patch updates cause relatively little disruption, as only the control-plane pods will be re-created with new images and the kubelets on all nodes will restart.\nA minor version update is more impactful - it will cause all nodes to be recreated and will roll components of the control plane.\nOS Version Update The OS version is defined for each worker pool and can be changed per worker pool. You can freely switch back and forth. However, as there is no in-place update, each change will cause the entire worker pool to roll and nodes will be replaced. For OS versions different update strategies can be configured. 
Please check the documentation for details.\nAvailable Versions Gardener has a dedicated resource to maintain a list of available versions – the so-called cloudProfile.\nA cloudProfile provides information about supported:\n Kubernetes versions OS versions (and where to find those images) Regions (and their zones) Machine types Each shoot references a cloudProfile in order to obtain information about available / possible versions and configurations.\nVersion Classifications Gardener has the following classifications for Kubernetes and OS image versions:\n preview: still in testing phase (several versions can be in preview at the same time)\n supported: recommended version\n deprecated: a new version has been set to “supported”, updating is recommended (might have an expiration date)\n expired: cannot be used anymore, clusters using this version will be force-upgraded\n Version information is maintained in the relevant cloud profile resource. There might be circumstances where a version will never become supported but instead move to deprecated directly. Similarly, a version might be directly introduced as supported.\nAutoUpdate / Forced Updates AutoUpdate for a machine image version will update all node pools to the latest supported version based on the defined update strategy. Whenever a new version is set to supported, the cluster will pick it up during its next maintenance window.\nFor Kubernetes versions the mechanism is the same, but only applied to patch version. This means that the cluster will be kept on the latest supported patch version of a specific minor version.\nIn case a version used in a cluster expires, there is a force update during the next maintenance window. In a worst case scenario, 2 minor versions expire simultaneously. 
Then there will be two consecutive minor updates enforced.\nFor more information, see Shoot Kubernetes and Operating System Versioning in Gardener.\nApplying Changes to a Seed It is important to keep in mind that a seed is just another Kubernetes cluster. As such, it has its own lifecycle (daily reconciliation, maintenance, etc.) and is also subject to change.\nFrom time to time changes need to be applied to the seed as well. Some (like updating the OS version) cause the node pool to roll. In turn, this will cause the eviction of ALL pods running on the affected node. If your etcd is evicted and you don’t have a highly available control plane, it will cause downtime for your cluster. Your workloads will continue to run, of course, but your cluster’s API server will not function until the etcd is up and running again.\n","categories":"","description":"","excerpt":"Reconciliation in Kubernetes and Gardener The starting point of all …","ref":"/docs/getting-started/lifecycle/","tags":"","title":"Shoot Lifecycle"},{"body":"Vertical Pod Autoscaler When a pod’s CPU or memory consumption grows, it will hit a limit eventually. Either the pod has resource limits specified or the node will run short of resources. In both cases, the workload might be throttled or even terminated. When this happens, it is often desirable to increase the request or limits. To do this autonomously within certain boundaries is the goal of the Vertical Pod Autoscaler project.\nSince it is not part of the standard Kubernetes API, you have to install the CRDs and controller manually. 
With Gardener, you can simply flip the switch in the shoot’s spec and start creating your VPA objects.\nPlease be aware that VPA and HPA operate in similar domains and might interfere.\nA controller \u0026 CRDs for vertical pod auto-scaling can be activated via the shoot’s spec.\n","categories":"","description":"","excerpt":"Vertical Pod Autoscaler When a pod’s resource CPU or memory grows, it …","ref":"/docs/getting-started/features/vpa/","tags":"","title":"Vertical Pod Autoscaler"},{"body":"Obtaining Additional Nodes The scheduler will assign pods to nodes, as long as they have capacity (CPU, memory, Pod limit, # attachable disks, …). But what happens when all nodes are fully utilized and the scheduler does not find any suitable target?\nOption 1: Evict other pods based on priority. However, this has the downside that other workloads with lower priority might become unschedulable.\nOption 2: Add more nodes. There is an upstream Cluster Autoscaler project that does exactly this. It simulates the scheduling and reacts to events indicating that pods are unschedulable. Gardener has forked it to make it work with the machine-controller-manager abstraction of how node (groups) are defined in Gardener. The cluster autoscaler respects the limits (min / max) of any worker pool in a shoot’s spec. It can also scale down nodes based on utilization thresholds. For more details, see the autoscaler documentation.\nScaling by Priority For clusters with more than one node pool, the cluster autoscaler has to decide which group to scale up. By default, it randomly picks one of the available / applicable groups. 
However, this behavior is customizable by the use of so-called expanders.\nThis section will focus on the priority-based expander.\nEach worker pool gets a priority and the cluster autoscaler will scale up the one with the highest priority until it reaches its limit.\nTo get more information on the current status of the autoscaler, you can check a “status” configmap in the kube-system namespace with the following command:\nkubectl get cm -n kube-system cluster-autoscaler-status -oyaml\nTo obtain information about the decision making, you can check the logs of the cluster-autoscaler pod by using the shoot’s monitoring stack.\nFor more information, see the cluster-autoscaler FAQ and the Priority based expander for cluster-autoscaler topic.\n","categories":"","description":"","excerpt":"Obtaining Additional Nodes The scheduler will assign pods to nodes, as …","ref":"/docs/getting-started/features/cluster-autoscaler/","tags":"","title":"Cluster Autoscaler"},{"body":"gardenctl-v2 \nWhat is gardenctl? gardenctl is a command-line client for Gardener. It facilitates the administration of one or many garden, seed and shoot clusters. Use this tool to configure access to clusters and configure cloud provider CLI tools. It also provides support for accessing cluster nodes via SSH.\nInstallation Install the latest release from Homebrew, Chocolatey or GitHub Releases.\nInstall using Package Managers # Homebrew (macOS and Linux) brew install gardener/tap/gardenctl-v2 # Chocolatey (Windows) # default location C:\\ProgramData\\chocolatey\\bin\\gardenctl-v2.exe choco install gardenctl-v2 Attention brew users: gardenctl-v2 uses the same binary name as the legacy gardenctl (gardener/gardenctl) CLI. If you have an existing installation you should remove it with brew uninstall gardenctl before attempting to install gardenctl-v2. Alternatively, you can choose to link the binary using a different name. 
If you try to install without removing or relinking the old installation, brew will run into an error and provide instructions on how to resolve it.\nInstall from GitHub Release If you install via GitHub releases, you need to\n put the gardenctl binary on your path and install gardenlogin. The other install methods do this for you.\n# Example for macOS # set operating system and architecture os=darwin # choose between darwin, linux, windows arch=amd64 # choose between amd64, arm64 # Get latest version. Alternatively set your desired version version=$(curl -s https://raw.githubusercontent.com/gardener/gardenctl-v2/master/LATEST) # Download gardenctl curl -LO \"https://github.com/gardener/gardenctl-v2/releases/download/${version}/gardenctl_v2_${os}_${arch}\" # Make the gardenctl binary executable chmod +x \"./gardenctl_v2_${os}_${arch}\" # Move the binary into your PATH sudo mv \"./gardenctl_v2_${os}_${arch}\" /usr/local/bin/gardenctl Configuration gardenctl requires a configuration file. The default location is ~/.garden/gardenctl-v2.yaml.\nYou can modify this file directly using the gardenctl config command. It allows adding, modifying and deleting gardens.\nExample config command:\n# Adapt the path to your kubeconfig file for the garden cluster (not to be confused with your shoot cluster) export KUBECONFIG=~/relative/path/to/kubeconfig.yaml # Fetch cluster-identity of garden cluster from the configmap cluster_identity=$(kubectl -n kube-system get configmap cluster-identity -ojsonpath={.data.cluster-identity}) # Configure garden cluster gardenctl config set-garden $cluster_identity --kubeconfig $KUBECONFIG This command will create or update a garden with the provided identity and kubeconfig path of your garden cluster.\nExample Config gardens: - identity: landscape-dev # Unique identity of the garden cluster. 
See cluster-identity ConfigMap in kube-system namespace of the garden cluster kubeconfig: ~/relative/path/to/kubeconfig.yaml # name: my-name # An alternative, unique garden name for targeting # context: different-context # Overrides the current-context of the garden cluster kubeconfig # patterns: ~ # List of regex patterns for pattern targeting Note: You need to have gardenlogin installed as a kubectl plugin in order to use the kubeconfigs for Shoot clusters provided by gardenctl.\nConfig Path Overwrite The gardenctl config path can be overwritten with the environment variable GCTL_HOME. The gardenctl config name can be overwritten with the environment variable GCTL_CONFIG_NAME. export GCTL_HOME=/alternate/garden/config/dir export GCTL_CONFIG_NAME=myconfig # without extension! # config is expected to be under /alternate/garden/config/dir/myconfig.yaml Shell Session The state of gardenctl is bound to a shell session and is not shared across windows, tabs or panes. A shell session is defined by the environment variable GCTL_SESSION_ID. If this is not defined, the value of the TERM_SESSION_ID environment variable is used instead. If neither is defined, this leads to an error and gardenctl cannot be executed. The target.yaml and temporary kubeconfig.*.yaml files are stored in the following directory: ${TMPDIR}/garden/${GCTL_SESSION_ID}.\nYou can make sure that GCTL_SESSION_ID or TERM_SESSION_ID is always present by adding the following code to your terminal profile ~/.profile, ~/.bashrc or comparable file.\nbash and zsh: [ -n \"$GCTL_SESSION_ID\" ] || [ -n \"$TERM_SESSION_ID\" ] || export GCTL_SESSION_ID=$(uuidgen) fish: [ -n \"$GCTL_SESSION_ID\" ] || [ -n \"$TERM_SESSION_ID\" ] || set -gx GCTL_SESSION_ID (uuidgen) powershell: if ( !(Test-Path Env:GCTL_SESSION_ID) -and !(Test-Path Env:TERM_SESSION_ID) ) { $Env:GCTL_SESSION_ID = [guid]::NewGuid().ToString() } Completion Gardenctl supports completion that will help you work with the CLI and save you typing effort. 
It will also help you find clusters by providing suggestions for Gardener resources such as shoots or projects. Completion is supported for bash, zsh, fish and powershell. You will find more information on how to configure your shell completion for gardenctl by executing the help for your shell completion command. Example:\ngardenctl completion bash --help Usage Targeting You can set a target to use it in subsequent commands. You can also overwrite the target for each command individually.\nNote that this will not affect your KUBECONFIG env variable. To update the KUBECONFIG env for your current target, see the Configure KUBECONFIG section.\nExample:\n# target control plane gardenctl target --garden landscape-dev --project my-project --shoot my-shoot --control-plane Find more information in the documentation.\nConfigure KUBECONFIG for Shoot Clusters Generate a script that points KUBECONFIG to the targeted cluster for the specified shell. Use together with eval to configure your shell. Example for bash:\neval $(gardenctl kubectl-env bash) Configure Cloud Provider CLIs Generate the cloud provider CLI configuration script for the specified shell. Use together with eval to configure your shell. Example for bash:\neval $(gardenctl provider-env bash) SSH Establish an SSH connection to a Shoot cluster’s node.\ngardenctl ssh my-node ","categories":"","description":"The command line interface to control your clusters","excerpt":"The command line interface to control your clusters","ref":"/docs/gardenctl-v2/","tags":"","title":"Gardenctl V2"},{"body":"Overview Gardener offers out-of-the-box observability for the control plane, Gardener managed system-components, and the nodes of a shoot cluster.\nHaving your workload survive on day 2 can be a challenge. The goal of this topic is to give you the tools with which to observe, analyze, and alert when the control plane or system components of your cluster become unhealthy. 
This will let you guide your containers through the storm of operating in a production environment.\n","categories":"","description":"","excerpt":"Overview Gardener offers out-of-the-box observability for the control …","ref":"/docs/getting-started/observability/","tags":"","title":"Observability"},{"body":"","categories":"","description":"Commonly asked questions about Gardener","excerpt":"Commonly asked questions about Gardener","ref":"/docs/faq/","tags":"","title":"FAQ"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/getting-started/features/","tags":"","title":"Features"},{"body":"Architecture Containers will NOT fix a broken architecture! Running a highly distributed system has advantages, but of course, those come at a cost. In order to succeed, one would need:\n Logging Tracing No singleton Tolerance to failure of individual instances Automated config / change management Kubernetes knowledge Scalability Most scalability dimensions are interconnected with others. If a cluster grows beyond reasonable defaults, it can still function very well. But tuning it comes at the cost of time and can influence stability negatively.\nTake the number of nodes and pods, for example. Both are connected and you cannot grow both towards their individual limits, as you would face issues way before reaching any theoretical limits.\nReading the Scalability of Gardener Managed Kubernetes Clusters guide is strongly recommended in order to understand the topic of scalability within Kubernetes and Gardener.\nA Small Sample of Things That Can Grow Beyond Reasonable Limits When scaling a cluster, there are plenty of resources that can be exhausted or reach a limit:\n The API server will be scaled horizontally and vertically by Gardener. However, it can still consume too much resources to fit onto a single node on the seed. In this case, you can only reduce the load on the API server. This should not happen with regular usage patterns though. 
ETCD disk space: 8GB is the limit. If you have too many resources or a high churn rate, a cluster can run out of ETCD capacity. In such a scenario it will stop working until defragmented, compacted, and cleaned up. The number of nodes is limited by the network configuration (pod cidr range \u0026 node cidr mask). Also, there is a reasonable number of nodes (300) that most workloads should not exceed. It is possible to go beyond but doing so requires careful tuning and consideration of connected scaling dimensions (like the number of pods per node). The availability of your cluster is directly impacted by the way you use it.\nInfrastructure Capacity and Quotas Sometimes requests cannot be fulfilled due to shortages on the infrastructure side. For example, a certain instance type might not be available and new Kubernetes nodes of this type cannot be added. It is a good practice to use the cluster-autoscaler’s priority expander and have a secondary node pool.\nSometimes, it is not the physical capacity but exhausted quotas within an infrastructure account that result in limits. Obviously, there should be sufficient quota to create as many VMs as needed. But there are also other resources that are created in the infrastructure that need proper quotas:\n Load balancers VPC Disks Routes (often forgotten, but very important for clusters without overlay network; typically defaults to around 50 routes, meaning that 50 nodes is the maximum a cluster can have) … NodeCIDRMaskSize Upon cluster creation, there are several settings that are network related. For example, the address space for Pods has to be defined. In this example, it is a /16 subnet that includes a total of 65,536 addresses. However, that does not imply that you can easily use all addresses at the same point in time.\nAs part of the Kubernetes network setup, the /16 network is divided into smaller subnets and each node gets a distinct subnet. The size of this subnet defaults to /24. 
It can also be specified (but not changed later).\nNow, as you create more nodes, you have a total of 256 subnets that can be assigned to nodes, thus limiting the total number of nodes of this cluster to 256.\nFor more information, see Shoot Networking.\nOverlapping VPCs Avoid Overlapping CIDR Ranges in VPCs Gardener can create shoot cluster resources in an existing / user-created VPC. However, you have to make sure that the CIDR ranges used by the shoot’s nodes or subnets for zones do not overlap with other shoots deployed to the same VPC.\nIn case of an overlap, there might be strange routing effects, and packets ending up at a wrong location.\nExpired Credentials Credentials expire or get revoked. When this happens to the actively used infrastructure credentials of a shoot, the cluster will stop working after a while. New nodes cannot be added, LoadBalancers cannot be created, and so on.\nYou can update the credentials stored in the project namespace and reconcile the cluster to replicate the new keys to all relevant controllers. Similarly, when doing a planned rotation one should wait until the shoot reconciled successfully before invalidating the old credentials.\nAutoUpdate Breaking Clusters Gardener can automatically update a shoot’s Kubernetes patch version, when a new patch version is labeled as “supported”. Automatic updates of the OS images work in a similar way. Both are triggered by the “supported” classification in the respective cloud profile and can be enabled / disabled as part of a shoot’s spec.\nAdditionally, when a minor Kubernetes / OS version expires, Gardener will force-update the shoot to the next supported version.\nTurning on AutoUpdate for a shoot may be convenient but comes at the risk of potentially unwanted changes. 
While it is possible to switch to another OS version, updates to the Kubernetes version are a one-way operation and cannot be reverted.\nRecommendation Control the version lifecycle separately for any cluster that hosts important workloads. Node Draining Node Draining and Pod Disruption Budget Typically, nodes are drained when:\n There is an update of the OS / Kubernetes minor version An operator cordons \u0026 drains a node The cluster-autoscaler wants to scale down Without a PodDisruptionBudget, pods will be terminated as fast as possible. If an application has 2 out of 2 replicas running on the drained node, this will probably cause availability issues.\nNode Draining with PDB PodDisruptionBudgets can help to manage a graceful node drain. However, if no disruptions are allowed there, the node drain will be blocked until it reaches a timeout. Only then will the nodes be terminated but without respecting PDB thresholds.\nRecommendation Configure PDBs and allow disruptions. Pod Resource Requests and Limits Resource Consumption Pods consume resources and, of course, there are only so many resources available on a single node. Setting requests will make the scheduling much better, as the scheduler has more information available.\nSpecifying limits can help, but can also limit an application in unintended ways. A recommendation to start with:\n Do not set CPU limits (CPU is compressible and throttling is really hard to detect) Set memory limits and monitor OOM kills / restarts of workload (typically detectable by container status exit code 137 and corresponding events). This will decrease the likelihood of OOM situations on the node itself. However, for critical workloads it might be better to have uncapped growth and rather risk a node going OOM. Next, consider if assigning the workload to quality of service class guaranteed is needed. Again - this can help or be counterproductive. It is important to be aware of its implications. 
For more information, see Pod Quality of Service Classes.\nTune shoot.spec.Kubernetes.kubeReserved to protect the node (kubelet) in case of a workload pod consuming too many resources. It is very helpful to ensure a high level of stability.\nIf the usage profile changes over time, the VPA can help a lot to adapt the resource requests / limits automatically.\nWebhooks User-Deployed Webhooks in Kubernetes By default, any request to the API server will go through a chain of checks. Let’s take the example of creating a pod.\nWhen the resource is submitted to the API server, it will be checked against the following validations:\n Is the user authorized to perform this action? Is the pod definition actually valid? Are the specified values allowed? Additionally, there is the defaulting - like the injection of the default service account’s name, if nothing else is specified.\nThis chain of admission control and mutation can be enhanced by the user. Read about dynamic admission control for more details.\nValidatingWebhookConfiguration: allow or deny requests based on custom rules\nMutatingWebhookConfiguration: change a resource before it is actually stored in etcd (that is, before any other controller acts upon it)\nBoth ValidatingWebhookConfiguration as well as MutatingWebhookConfiguration resources:\n specify for which resources and operations these checks should be executed. specify how to reach the webhook server (typically a service running on the data plane of a cluster) rely on a webhook server performing a review and replying to the admissionReview request What could possibly go wrong? Due to the separation of control plane and data plane in Gardener’s architecture, webhooks have the potential to break a cluster. If the webhook server is not responding in time with a valid answer, the request should time out and the failure policy is invoked. Depending on the scope of the webhook, frequent failures may cause downtime for applications. 
Common causes for failure are:\n The call to the webhook is made through the VPN tunnel. VPN / connection issues can happen both on the side of the seed as well as the shoot and would render the webhook unavailable from the perspective of the control plane. The traffic cannot reach the pod (network issue, pod not available) The pod is processing too slowly (e.g., because there are too many requests) Timeout Webhooks are a very helpful feature of Kubernetes. However, they can easily be configured to break a shoot cluster. Take the timeout, for example. High timeouts (\u003e15s) can lead to blocking requests of control plane components. That’s because most control-plane API calls are made with a client-side timeout of 30s, so if a webhook has timeoutSeconds=30, the overall request might still fail as there is overhead in communication with the API server and other potential webhooks.\nRecommendation Webhooks (esp. mutating) may be called sequentially and thus add up their individual timeouts. Even with a failurePolicy=ignore the timeout will stop the request. Recommendations Problematic webhooks are reported as part of a shoot’s status. In addition to timeouts, it is crucial to exclude the kube-system namespace and (potentially non-namespaced) resources that are necessary for the cluster to function properly. Those should not be subject to a user-defined webhook.\nIn particular, a webhook should not operate on:\n the kube-system namespace Endpoints or EndpointSlices Nodes PodSecurityPolicies ClusterRoles ClusterRoleBindings CustomResourceDefinitions ApiServices CertificateSigningRequests PriorityClasses Example:\nA webhook checks node objects upon creation and has a failurePolicy: fail. If the webhook does not answer in time (either due to latency or because there is no pod serving it), new nodes cannot join the cluster.\nFor more information, see Shoot Status.\nConversion Webhooks Who installs a conversion webhook? 
If you have written your own CustomResourceDefinition (CRD) and made a version upgrade, you will also have consciously written \u0026 deployed the conversion webhook.\nHowever, sometimes, you simply use helm or kustomize to install a (third-party) dependency that contains CRDs. Of course, those can contain conversion webhooks as well. As a user of a cluster, please make sure to be aware of what you deploy.\nCRD with a Conversion Webhook Conversion webhooks are tricky. Similarly to regular webhooks, they should have a low timeout. However, they cannot be remediated automatically and can cause errors in the control plane. For example, if a webhook is invoked but not available, it can block the garbage collection run by the kube-controller-manager.\nIn turn, when deleting something like a deployment, dependent resources like pods will not be deleted automatically.\nRecommendation Try to avoid conversion webhooks. They are valid and can be used, but should not stay in place forever. Complete the upgrade to a new version of the CRD as soon as possible. For more information, see the Webhook Conversion, Upgrade Existing Objects to a New Stored Version, and Version Priority topics in the Kubernetes documentation.\n","categories":"","description":"","excerpt":"Architecture Containers will NOT fix a broken architecture! Running a …","ref":"/docs/getting-started/common-pitfalls/","tags":"","title":"Common Pitfalls"},{"body":"Purpose Synonyms and inconsistent writing style make it hard for beginners to get into a new topic. This glossary aims to help users to get a better understanding of Gardener and authors to use the right terminology.\nContributions are most welcome!\nIf you would like to contribute please check first if your new term is already part of the Standardized Kubernetes Glossary, and if so refrain from adding it here. 
Whenever you see the need to explain Kubernetes terminology or to refer to Kubernetes concepts it is recommended that you link to the official Kubernetes documentation in your section.\nGardener Glossary If you add anything to the list please keep it in alphabetical order.\n Term Definition Related Term cloud provider secret A resource storing confidential data used to authenticate Gardener and Kubernetes components for infrastructure operations. When a new cluster is created in a Gardener project, the project admin who creates the cluster specification must select the infrastructure secret that will be used to manage IaaS resources required for the new cluster. secret Gardener API server An API server designed to run inside a Kubernetes cluster whose API it wants to extend. After registration, it is used to expose resources native to Gardener such as cloud profiles, shoots, seeds and secret bindings. kube-apiserver garden cluster control plane A control plane that manages the overall creation, modification, and deletion of clusters. control plane Gardener controller manager A component that runs next to the Gardener API server which runs several control loops that do not require talking to any seed or shoot cluster. kube-controller-manager Gardener project A consolidation of project members, clusters, and secrets of the underlying IaaS provider used to organize teams and clusters in a meaningful way. none Gardener scheduler A controller that watches newly created shoots and assigns a seed cluster to them. kube-scheduler gardenlet An agent that manages seed clusters decentrally; reads the desired state from the Gardener API Server and updates the current state. The gardenlet has a role similar to that of the kubelet in Kubernetes, which manages the workload of a node decentrally; gardenlet manages the shoot clusters (workload) of a seed cluster instead. More information: gardenlet. 
kubelet garden cluster A dedicated Kubernetes cluster that the Gardener control plane runs in. cluster project “Gardener” An open source project that focuses on operating, monitoring, and managing Kubernetes clusters. none physical garden cluster A physical cluster of the IaaS provider in which Gardener is installed. none secretBinding A resource that makes it possible for shoot clusters to connect to the cloud provider secret. none seed cluster A cluster that hosts shoot cluster control planes as pods in order to manage shoot clusters. node shoot cluster A Kubernetes runtime for the actual applications or services consisting of a shoot control plane running on the seed cluster and worker nodes hosting the actual workload. pod shoot cluster control plane A Kubernetes control plane used to run the actual end-user workload. It is hosted in the form of pods on a seed cluster. control plane soil cluster A cluster that is created manually and is used as host for other seeds. Sometimes it is technically impossible for Gardener to install shoot clusters on an infrastructure, for example, because the infrastructure is not supported or protected by a firewall. In such cases you can create a soil cluster on that infrastructure manually as a host for seed clusters. From inside the firewall, seed clusters can reach the garden cluster outside the firewall. This is possible since Gardener delegated cluster management to the gardenlet. none virtual garden cluster A cluster without any nodes that runs the Kubernetes API server, etcd, and stores Gardener metadata like projects, shoot resources, seed resources, secrets, and others. The virtual garden cluster is installed on the physical garden cluster (base cluster of IaaS provider) during the installation of Gardener. Thanks to the virtual garden cluster, Gardener has full control over all Gardener metadata. 
This full control simplifies the support for the backup, restore, recovery, migration, relocation, or recreation of this data, because it can be implemented independently from the underlying physical garden cluster. none ","categories":"","description":"Commonly used terms in Gardener","excerpt":"Commonly used terms in Gardener","ref":"/docs/glossary/","tags":"","title":"Glossary"},{"body":"Overview The Gardener team takes security seriously, which is why we mandate the Security Technical Implementation Guide (STIG) for Kubernetes as published by the Defense Information Systems Agency (DISA) here. We offer Gardener adopters the opportunity to show compliance with DISA Kubernetes STIG via the compliance checker tool diki. The latest release in machine readable format can be found in the STIGs Document Library by searching for Kubernetes.\nKubernetes Clusters Security Requirements DISA Kubernetes STIG version 1 release 11 contains 91 rules overall. Only the following rules, however, apply to you. Some of them are secure-by-default, so your responsibility is to make sure that they are not changed. 
For your convenience, the requirements are grouped logically and per role:\nRules Relevant for Cluster Admins Control Plane Configuration ID Description Secure By Default Comments 242390 Kubernetes API server must have anonymous authentication disabled ✅ Disabled unless you enable it via enableAnonymousAuthentication 245543 Kubernetes API Server must disable token authentication to protect information in transit ✅ Disabled unless you enable it via enableStaticTokenKubeconfig 242400 Kubernetes API server must have Alpha APIs disabled ✅ Disabled unless you enable it via featureGates 242436 Kubernetes API server must have the ValidatingAdmissionWebhook enabled ✅ Enabled unless you disable it explicitly via admissionPlugins 242393 Kubernetes Worker Nodes must not have sshd service running ❌ Active to allow debugging of network issues, but it is possible to deactivate via the sshAccess setting 242394 Kubernetes Worker Nodes must not have the sshd service enabled ❌ Enabled to allow debugging of network issues, but it is possible to deactivate via the sshAccess setting 242434 Kubernetes Kubelet must enable kernel protection ✅ Enabled for Kubernetes v1.26 or later unless disabled explicitly via protectKernelDefaults 245541 Kubernetes Kubelet must not disable timeouts ✅ Enabled for Kubernetes v1.26 or later unless disabled explicitly via streamingConnectionIdleTimeout Audit Configuration ID Description Secure By Default Comments 242402 The Kubernetes API Server must have an audit log path set ❌ It is the user’s responsibility to configure an audit extension that meets the requirements of their organization. Depending on the audit extension implementation the audit logs do not always need to be written on the filesystem, i.e. when --audit-webhook-config-file is set and logs are sent to an audit backend. 
242403 Kubernetes API Server must generate audit records that identify what type of event has occurred, identify the source of the event, contain the event results, identify any users, and identify any containers associated with the event ❌ Users should set an audit policy that meets the requirements of their organization. Please consult the Shoot Audit Policy documentation. 242461 Kubernetes API Server audit logs must be enabled ❌ Users should set an audit policy that meets the requirements of their organization. Please consult the Shoot Audit Policy documentation. 242462 The Kubernetes API Server must be set to audit log max size ❌ It is the user’s responsibility to configure an audit extension that meets the requirements of their organization. Depending on the audit extension implementation the audit logs do not always need to be written on the filesystem, i.e. when --audit-webhook-config-file is set and logs are sent to an audit backend. 242463 The Kubernetes API Server must be set to audit log maximum backup ❌ It is the user’s responsibility to configure an audit extension that meets the requirements of their organization. Depending on the audit extension implementation the audit logs do not always need to be written on the filesystem, i.e. when --audit-webhook-config-file is set and logs are sent to an audit backend. 242464 The Kubernetes API Server audit log retention must be set ❌ It is the user’s responsibility to configure an audit extension that meets the requirements of their organization. Depending on the audit extension implementation the audit logs do not always need to be written on the filesystem, i.e. when --audit-webhook-config-file is set and logs are sent to an audit backend. 242465 The Kubernetes API Server audit log path must be set ❌ It is the user’s responsibility to configure an audit extension that meets the requirements of their organization. 
Depending on the audit extension implementation the audit logs do not always need to be written on the filesystem, i.e. when --audit-webhook-config-file is set and logs are sent to an audit backend. End User Workload ID Description Secure By Default Comments 242395 Kubernetes dashboard must not be enabled ✅ Not installed unless you install it via kubernetesDashboard. 242414 Kubernetes cluster must use non-privileged host ports for user pods ❌ Do not use any ports below 1024 for your own workload. 242415 Secrets in Kubernetes must not be stored as environment variables ❌ Always mount secrets as volumes and never as environment variables. 242383 User-managed resources must be created in dedicated namespaces ❌ Create and use your own/dedicated namespaces and never place anything into the default, kube-system, kube-public, or kube-node-lease namespace. The default namespace is never to be used while the other above listed namespaces are only to be used by the Kubernetes provider (here Gardener). 242417 Kubernetes must separate user functionality ❌ While 242383 is about all resources, this rule is specifically about pods. Create and use your own/dedicated namespaces and never place pods into the default, kube-system, kube-public, or kube-node-lease namespace. The default namespace is never to be used while the other above listed namespaces are only to be used by the Kubernetes provider (here Gardener). 242437 Kubernetes must have a pod security policy set ✅ Set, but Gardener can only set default pod security policies (PSP) and does so only until v1.24 as with v1.25 PSPs were removed (deprecated since v1.21) and replaced with Pod Security Standards (see this blog for more information). Whatever the technology, you are responsible for configuring and using appropriate, custom-tailored PSPs or PSSs, depending on your own workload and security needs (only you know what a pod should be allowed to do). 
242442 Kubernetes must remove old components after updated versions have been installed ❌ While Gardener manages all its components in its system namespaces (automated), you are naturally responsible for your own workload. 254800 Kubernetes must have a Pod Security Admission control file configured ❌ Gardener ensures that the pod security configuration allows system components to be deployed in the kube-system namespace but does not set configurations that can affect user namespaces. It is recommended that users enforce a minimum of baseline pod security level for their workload via PodSecurity admission plugin. Rules Relevant for Service Providers ID Description 242376 The Kubernetes Controller Manager must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination. 242377 The Kubernetes Scheduler must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination. 242378 The Kubernetes API Server must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination. 242379 The Kubernetes etcd must use TLS to protect the confidentiality of sensitive data during electronic dissemination. 242380 The Kubernetes etcd must use TLS to protect the confidentiality of sensitive data during electronic dissemination. 242381 The Kubernetes Controller Manager must create unique service accounts for each work payload. 242382 The Kubernetes API Server must enable Node,RBAC as the authorization mode. 242384 The Kubernetes Scheduler must have secure binding. 242385 The Kubernetes Controller Manager must have secure binding. 242386 The Kubernetes API server must have the insecure port flag disabled. 242387 The Kubernetes Kubelet must have the “readOnlyPort” flag disabled. 242388 The Kubernetes API server must have the insecure bind address not set. 242389 The Kubernetes API server must have the secure port set. 
242391 The Kubernetes Kubelet must have anonymous authentication disabled. 242392 The Kubernetes kubelet must enable explicit authorization. 242396 Kubernetes Kubectl cp command must give expected access and results. 242397 The Kubernetes kubelet staticPodPath must not enable static pods. 242398 Kubernetes DynamicAuditing must not be enabled. 242399 Kubernetes DynamicKubeletConfig must not be enabled. 242404 Kubernetes Kubelet must deny hostname override. 242405 The Kubernetes manifests must be owned by root. 242406 The Kubernetes KubeletConfiguration file must be owned by root. 242407 The Kubernetes KubeletConfiguration files must have file permissions set to 644 or more restrictive. 242408 The Kubernetes manifest files must have least privileges. 242409 Kubernetes Controller Manager must disable profiling. 242410 The Kubernetes API Server must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL). 242411 The Kubernetes Scheduler must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL). 242412 The Kubernetes Controllers must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL). 242413 The Kubernetes etcd must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL). 242418 The Kubernetes API server must use approved cipher suites. 242419 Kubernetes API Server must have the SSL Certificate Authority set. 242420 Kubernetes Kubelet must have the SSL Certificate Authority set. 242421 Kubernetes Controller Manager must have the SSL Certificate Authority set. 242422 Kubernetes API Server must have a certificate for communication. 242423 Kubernetes etcd must enable client authentication to secure service. 
242424 Kubernetes Kubelet must enable tlsPrivateKeyFile for client authentication to secure service. 242425 Kubernetes Kubelet must enable tlsCertFile for client authentication to secure service. 242426 Kubernetes etcd must enable client authentication to secure service. 242427 Kubernetes etcd must have a key file for secure communication. 242428 Kubernetes etcd must have a certificate for communication. 242429 Kubernetes etcd must have the SSL Certificate Authority set. 242430 Kubernetes etcd must have a certificate for communication. 242431 Kubernetes etcd must have a key file for secure communication. 242432 Kubernetes etcd must have peer-cert-file set for secure communication. 242433 Kubernetes etcd must have a peer-key-file set for secure communication. 242438 Kubernetes API Server must configure timeouts to limit attack surface. 242443 Kubernetes must contain the latest updates as authorized by IAVMs, CTOs, DTMs, and STIGs. 242444 The Kubernetes component manifests must be owned by root. 242445 The Kubernetes component etcd must be owned by etcd. 242446 The Kubernetes conf files must be owned by root. 242447 The Kubernetes Kube Proxy must have file permissions set to 644 or more restrictive. 242448 The Kubernetes Kube Proxy must be owned by root. 242449 The Kubernetes Kubelet certificate authority file must have file permissions set to 644 or more restrictive. 242450 The Kubernetes Kubelet certificate authority must be owned by root. 242451 The Kubernetes component PKI must be owned by root. 242452 The Kubernetes kubelet KubeConfig must have file permissions set to 644 or more restrictive. 242453 The Kubernetes kubelet KubeConfig file must be owned by root. 242454 The Kubernetes kubeadm.conf must be owned by root. 242455 The Kubernetes kubeadm.conf must have file permissions set to 644 or more restrictive. 242456 The Kubernetes kubelet config must have file permissions set to 644 or more restrictive. 242457 The Kubernetes kubelet config must be owned by root. 
242459 The Kubernetes etcd must have file permissions set to 644 or more restrictive. 242460 The Kubernetes admin.conf must have file permissions set to 644 or more restrictive. 242466 The Kubernetes PKI CRT must have file permissions set to 644 or more restrictive. 242467 The Kubernetes PKI keys must have file permissions set to 600 or more restrictive. 245542 Kubernetes API Server must disable basic authentication to protect information in transit. 245544 Kubernetes endpoints must use approved organizational certificate and key pair to protect information in transit. 254801 Kubernetes must enable PodSecurity admission controller on static pods and Kubelets. ","categories":"","description":"Compliant user management of your Gardener projects","excerpt":"Compliant user management of your Gardener projects","ref":"/docs/security-and-compliance/kubernetes-hardening/","tags":["task"],"title":"Kubernetes Cluster Hardening Procedure"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/client-tools/","tags":"","title":"Set Up Client Tools"},{"body":"A curated list of awesome Kubernetes sources. 
Inspired by @sindresorhus’ awesome\nSetup Install Docker for Mac Install Docker for Windows Run a Kubernetes Cluster on your local machine A Place That Marks the Beginning of a Journey Read the kubernetes.io documentation Take an online Udemy course Kubernetes Community Overview and Contributions Guide by Ihor Dvoretskyi Kubernetes: The Future of Cloud Hosting by Meteorhacks Kubernetes by Google by Gaston Pantana Application Containers: Kubernetes and Docker from Scratch by Keith Tenzer Learn the Kubernetes Key Concepts in 10 Minutes by Omer Dawelbeit The Children’s Illustrated Guide to Kubernetes by Deis :-) Docker Kubernetes Lab Handbook by Peng Xiao Interactive Learning Environments Learn Kubernetes using an interactive environment without requiring downloads or configuration\n Interactive Kubernetes Tutorials Kubernetes: From Basics to Guru Kubernetes Bootcamp Massive Open Online Courses / Tutorials List of available free online courses(MOOC) and tutorials\n DevOps with Kubernetes Introduction to Kubernetes Courses Scalable Microservices with Kubernetes at Udacity Introduction to Kubernetes at edX Tutorials Kubernetes Tutorials by Kubernetes Team Kubernetes By Example by OpenShift Team Kubernetes Tutorial by Tutorialspoint Package Managers Helm KPM RPC gRPC RBAC Kubernetes RBAC: Role-Based Access Control Secret Generation and Management Vault auth plugin backend: Kubernetes Vault controller kube-lego k8sec kubernetes-vault kubesec - Secure Secret management Machine Learning TensorFlow k8s mxnet-operator - Tools for ML/MXNet on Kubernetes. kubeflow - Machine Learning Toolkit for Kubernetes. seldon-core - Open source framework for deploying machine learning models on Kubernetes Raspberry Pi Some of the awesome findings and experiments on using Kubernetes with Raspberry Pi.\n Kubecloud Setting up a Kubernetes on ARM cluster Setup Kubernetes on a Raspberry Pi Cluster easily the official way! 
by Mathias Renner and Lucas Käldström How to Build a Kubernetes Cluster with ARM Raspberry Pi then run .NET Core on OpenFaas by Scott Hanselman Contributing Contributions are most welcome!\nThis list is just getting started, please contribute to make it super awesome.\n","categories":"","description":"","excerpt":"A curated list of awesome Kubernetes sources. Inspired by …","ref":"/curated-links/","tags":"","title":"Curated Links"},{"body":" ","categories":"","description":"Gardener - Kubernetes automation including day 2 operations","excerpt":"Gardener - Kubernetes automation including day 2 operations","ref":"/docs/resources/videos/gardener-teaser/","tags":"","title":"Gardener Teaser"},{"body":"","categories":"","description":"Interesting and useful content on Kubernetes","excerpt":"Interesting and useful content on Kubernetes","ref":"/docs/resources/","tags":"","title":"Resources"},{"body":"Contributing to Gardener Welcome Welcome to the Contributor section of Gardener. Here you can learn how to contribute your ideas and expertise to the project and help it grow even more.\nPrerequisites Before you begin contributing to Gardener, there are a couple of things you should become familiar with and complete first.\nCode of Conduct All members of the Gardener community must abide by the Contributor Covenant. Only by respecting each other can we develop a productive, collaborative community. Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting gardener.opensource@sap.com and/or a Gardener project maintainer.\nDeveloper Certificate of Origin For legal reasons, contributors will be asked to accept a Developer Certificate of Origin (DCO) before they submit their first pull request to this project. This happens in an automated fashion during the submission process. 
We use the standard DCO text of the Linux Foundation.\nLicense Your contributions to Gardener must be licensed properly:\n Code contributions must be licensed under the Apache 2.0 License Documentation contributions must be licensed under the Creative Commons Attribution 4.0 International License Contributing Gardener uses GitHub to manage reviews of pull requests.\n If you are a new contributor see: Steps to Contribute\n If you have a trivial fix or improvement, go ahead and create a pull request.\n If you plan to do something more involved, first discuss your ideas on our mailing list. This will avoid unnecessary work and surely give you and us a good deal of inspiration.\n Relevant coding style guidelines are the Go Code Review Comments and the Formatting and style section of Peter Bourgon’s Go: Best Practices for Production Environments.\n Steps to Contribute Should you wish to work on an issue, please claim it first by commenting on the GitHub issue that you want to work on it. This is to prevent duplicated efforts from contributors on the same issue.\nIf you have questions about one of the issues, with or without the tag, please comment on them and one of the maintainers will clarify it.\nWe kindly ask you to follow the Pull Request Checklist to ensure reviews can happen accordingly.\nPull Request Checklist Branch from the master branch and, if needed, rebase to the current master branch before submitting your pull request. If it doesn’t merge cleanly with master you may be asked to rebase your changes.\n Commits should be as small as possible, while ensuring that each commit is correct independently (i.e., each commit should compile and pass tests).\n Test your changes as thoroughly as possible before you commit them. Preferably, automate your testing with unit / integration tests. 
If tested manually, provide information about the test scope in the PR description (e.g., “Test passed: Upgrade K8s version from 1.14.5 to 1.15.2 on AWS, Azure, GCP, Alicloud, Openstack.”).\n When creating the PR, make your Pull Request description as detailed as possible to help out the reviewers.\n Create Work In Progress [WIP] pull requests only if you need a clarification or an explicit review before you can continue your work item.\n If your patch is not getting reviewed or you need a specific person to review it, you can @-reply a reviewer asking for a review in the pull request or a comment, or you can ask for a review on our mailing list.\n If you add new features, make sure that they are documented in the Gardener documentation.\n If your changes are relevant for operators, consider updating the ops toolbelt image.\n Post review:\n If a review requires you to change your commit(s), please test the changes again. Amend the affected commit(s) and force push onto your branch. Set respective comments in your GitHub review to resolved. Create a general PR comment to notify the reviewers that your amendments are ready for another round of review. Contributing Bigger Changes If you want to contribute bigger changes to Gardener, such as when introducing new API resources and their corresponding controllers, or implementing an approved Gardener Enhancement Proposal, follow the guidelines outlined in Contributing Bigger Changes.\nAdding Already Existing Documentation If you want to add documentation that already exists on GitHub to the website, you should update the central manifest instead of duplicating the content. To find out how to do that, see Adding Already Existing Documentation.\nIssues and Planning We use GitHub issues to track bugs and enhancement requests. Please provide as much context as possible when you open an issue. The information you provide must be comprehensive enough to reproduce that issue for the assignee. 
Therefore, contributors may use, but aren’t restricted to, the issue template provided by the Gardener maintainers.\nZenHub is used for planning:\n Install the ZenHub Chrome plugin Log in to ZenHub Open the Gardener ZenHub workspace Security Release Process See Security Release Process.\nCommunity Slack Channel #gardener, sign up here.\nMailing List gardener@googlegroups.com\nThe mailing list is hosted through Google Groups. To receive the list’s emails, join the group as you would any other Google Group.\nOther For additional channels where you can reach us, as well as links to our bi-weekly meetings, visit the Community page.\n","categories":"","description":"Contributors guides for code and documentation","excerpt":"Contributors guides for code and documentation","ref":"/docs/contribute/","tags":"","title":"Contribute"},{"body":"Using images on the website has to contribute to the aesthetics and comprehensibility of the materials, with uncompromised experience when loading and browsing pages. That concerns crisp clear images, their consistent layout and color scheme, dimensions and aspect ratios, flicker-free and fast loading or the feeling of it, even on unreliable mobile networks and devices.\nImage Production Guidelines A good, detailed reference for optimal use of images for the web can be found at web.dev’s Fast Load Times topic. The following summarizes some key points plus suggestions for tool support.\nYou are strongly encouraged to use vector images (SVG) as much as possible. They scale seamlessly without compromising the quality and are easier to maintain.\nIf you are just now starting with SVG authoring, here are some tool suggestions: Figma (online/Win/Mac), Sketch (Mac only).\nFor raster images (JPG, PNG, GIF), consider the following requirements and choose a tool that enables you to conform to them:\n Be mindful about image size, the total page size and loading times. Larger images (\u003e10K) need to support progressive rendering. 
Consult your authoring tool’s documentation to find out if and how it supports that. The site delivers the optimal media content format and size depending on the device screen size. You need to provide several variants (large screen, laptop, tablet, phone). Your authoring tool should be able to resize and resample images. Always save the largest size first and then downscale from it to avoid image quality loss. If you are looking for a tool that conforms to those guidelines, IrfanView is a very good option.\nScreenshots can be taken with whatever tool you have available. Pressing Alt+PrtSc (Win) and pasting into an image processing tool to save it does the job. If you need to add emphasized steps (1,2,3) when you describe a process on a screenshot, you can use Snagit. Use red color and numbers. Mind the requirements for raster images laid out above.\nDiagrams can be exported as PNG/JPG from a diagramming tool such as Visio or even PowerPoint. Pick whichever you are comfortable with to design the diagram and make sure you comply with the requirements for raster image production above. Diagrams produced as SVG are welcome too if your authoring tool supports exporting in that format. In any case, ensure that your diagrams “blend” with the content on the site - use the same color scheme and geometry style. Do not complicate diagrams too much. The site also supports Mermaid diagrams produced with markdown and rendered as SVG. You don’t need special tools for them, but for more complex ones you might want to prototype your diagram with Mermaid’s online live editor before encoding it in your markdown. More tips on using Mermaid can be found in the Shortcodes documentation.\nUsing Images in Markdown The standard for adding images to a topic is to use markdown’s ![caption](image-path). 
If the image is not showing properly, or if you wish to serve images close to their natural size and avoid scaling, then you can use HTML5’s \u003cpicture\u003e tag.\nExample:\n\u003cpicture\u003e \u003c!-- default, laptop-width-L max 1200px --\u003e \u003csource srcset=\"https://github.tools.sap/kubernetes/documentation/tree/master/website/documentation/015-tutorials/my-guide/images/overview-XL.png\" media=\"(min-width: 1000px)\"\u003e \u003c!-- default, laptop-width max 1000px --\u003e \u003csource srcset=\"https://github.tools.sap/kubernetes/documentation/tree/master/website/documentation/015-tutorials/my-guide/images/overview-L.png\" media=\"(min-width: 1400px)\"\u003e \u003c!-- default, tablets-width max 750px --\u003e \u003csource srcset=\"https://github.tools.sap/kubernetes/documentation/tree/master/website/documentation/015-tutorials/my-guide/images/overview-M.png\" media=\"(min-width: 750px)\"\u003e \u003c!-- default, phones-width max 450px --\u003e \u003cimg src=\"https://github.tools.sap/kubernetes/documentation/tree/master/website/documentation/015-tutorials/my-guide/images/overview.png\" /\u003e \u003c/picture\u003e When deciding on image sizes, consider the breakpoints in the example above as maximum widths for each image variant you provide. Note that the site is designed for maximum width 1200px. 
There is no point in creating images larger than that, since they will be scaled down.\nFor a nice overview on making the best use of responsive images with HTML5, please refer to the Responsive Images guide.\n","categories":"","description":"","excerpt":"Using images on the website has to contribute to the aesthetics and …","ref":"/docs/contribute/documentation/images/","tags":"","title":"Working with Images"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/high-availability/","tags":"","title":"High Availability"},{"body":"Run Partial DISA K8s STIGs Ruleset Against a Gardener Shoot Cluster Introduction This part shows how to run the DISA K8s STIGs ruleset against a Gardener shoot cluster. The guide features the managedk8s provider, which does not implement all of the DISA K8s STIG rules, since it assumes that the user running the ruleset does not have access to the environment (the seed in this particular case) in which the control plane components reside.\nPrerequisites Make sure you have diki installed and have a running Gardener shoot cluster.\nConfiguration We will be using the sample Partial DISA K8s STIG for Shoots configuration file for this run. You will need to set the provider.args.kubeconfigPath field to point to a shoot admin kubeconfig.\nIn case you need instructions on how to generate such a kubeconfig, please read Accessing Shoot Clusters.\nAdditional metadata such as the shoot’s name can also be included in the provider.metadata section. The metadata section can be used to add additional context to different diki runs.\nThe provided configuration contains the recommended rule options for running the managedk8s provider ruleset against a shoot cluster, but you can modify the rule option parameters according to your requirements. 
All available options can be found in the managedk8s example configuration.\nRunning the DISA K8s STIGs Ruleset To run diki against a Gardener shoot cluster, run the following command:\ndiki run \\ --config=./example/guides/partial-disa-k8s-stig-shoot.yaml \\ --provider=managedk8s \\ --ruleset-id=disa-kubernetes-stig \\ --ruleset-version=v2r1 \\ --output=disa-k8s-stigs-report.json Generating a Report We can use the file generated in the previous step to create an HTML report by using the following command:\ndiki report generate \\ --output=disa-k8s-stigs-report.html \\ disa-k8s-stigs-report.json ","categories":"","description":"How can I check whether my shoot cluster fulfills the DISA STIGs security requirements?","excerpt":"How can I check whether my shoot cluster fulfills the DISA STIGs …","ref":"/docs/security-and-compliance/partial-disa-k8s-stig-shoot/","tags":"","title":"Run DISA K8s STIGs Ruleset"},{"body":" ","categories":"","description":"The Illustrated Children's Guide to Kubernetes. Written and performed by Matt Butcher Illustrated by Bailey Beougher","excerpt":"The Illustrated Children's Guide to Kubernetes. Written and performed …","ref":"/docs/resources/videos/fairy-tail/","tags":"","title":"The Illustrated Guide to Kubernetes"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/resources/videos/","tags":"","title":"Videos"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/administer-shoots/","tags":"","title":"Administer Client (Shoot) Clusters"},{"body":"Overview Gardener aims to comply with public security standards and guidelines, such as the Security Technical Implementation Guide (STIG) for Kubernetes from the Defense Information Systems Agency (DISA). The DISA Kubernetes STIG is a set of rules that provide recommendations for secure deployment and operation of Kubernetes. 
It covers various aspects of Kubernetes security, including the configurations of the Kubernetes API server and other components, cluster management, certificate management, and the handling of updates and patches.\nWhile Gardener aims to follow this guideline, we also recognize that not all of the rules may be directly applicable or optimal for Gardener’s specific environment. Therefore, some of the requirements are adjusted. Rules that are not applicable to Gardener are skipped with an appropriate justification.\nFor every release, we check that Gardener is capable of creating security hardened shoot clusters, reconfirming that the configurations which are not secure by default (as per Gardener Kubernetes Cluster Hardening Procedure) are still possible and work as expected.\nIn order to automate and ease this process, Gardener uses a tool called diki.\nSecurity Hardened Shoot Configurations The following security hardened shoot configurations were used in order to generate the compliance report.\n AWS kind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: aws spec: cloudProfileName: aws kubernetes: kubeAPIServer: admissionPlugins: - name: PodSecurity config: apiVersion: pod-security.admission.config.k8s.io/v1beta1 kind: PodSecurityConfiguration defaults: enforce: baseline audit: baseline warn: baseline disabled: false auditConfig: auditPolicy: configMapRef: name: audit-policy version: \"1.28\" enableStaticTokenKubeconfig: false networking: type: calico pods: 100.64.0.0/12 nodes: 10.180.0.0/16 services: 100.104.0.0/13 ipFamilies: - IPv4 provider: type: aws controlPlaneConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.180.0.0/16 zones: - internal: 10.180.48.0/20 name: eu-west-1c public: 10.180.32.0/20 workers: 10.180.0.0/19 workers: - cri: name: containerd name: worker-kkfk1 machine: type: 
m5.large image: name: gardenlinux architecture: amd64 maximum: 2 minimum: 2 maxSurge: 1 maxUnavailable: 0 volume: type: gp3 size: 50Gi zones: - eu-west-1c workersSettings: sshAccess: enabled: false purpose: evaluation region: eu-west-1 secretBindingName: secretBindingName Azure kind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: azure spec: cloudProfileName: az kubernetes: kubeAPIServer: admissionPlugins: - name: PodSecurity config: apiVersion: pod-security.admission.config.k8s.io/v1beta1 kind: PodSecurityConfiguration defaults: enforce: baseline audit: baseline warn: baseline disabled: false auditConfig: auditPolicy: configMapRef: name: audit-policy version: \"1.28\" enableStaticTokenKubeconfig: false networking: type: calico pods: 100.64.0.0/12 nodes: 10.180.0.0/16 services: 100.104.0.0/13 ipFamilies: - IPv4 provider: type: azure controlPlaneConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig infrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.180.0.0/16 workers: 10.180.0.0/16 zoned: true workers: - cri: name: containerd name: worker-g7p4p machine: type: Standard_A4_v2 image: name: gardenlinux architecture: amd64 maximum: 2 minimum: 2 maxSurge: 1 maxUnavailable: 0 volume: type: StandardSSD_LRS size: 50Gi zones: - '3' workersSettings: sshAccess: enabled: false purpose: evaluation region: westeurope secretBindingName: secretBindingName GCP kind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: gcp spec: cloudProfileName: gcp kubernetes: kubeAPIServer: admissionPlugins: - name: PodSecurity config: apiVersion: pod-security.admission.config.k8s.io/v1beta1 kind: PodSecurityConfiguration defaults: enforce: baseline audit: baseline warn: baseline disabled: false auditConfig: auditPolicy: configMapRef: name: audit-policy version: \"1.28\" enableStaticTokenKubeconfig: false networking: type: calico pods: 100.64.0.0/12 
nodes: 10.180.0.0/16 services: 100.104.0.0/13 ipFamilies: - IPv4 provider: type: gcp controlPlaneConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig zone: europe-west1-b infrastructureConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: workers: 10.180.0.0/16 workers: - cri: name: containerd name: worker-bex82 machine: type: n1-standard-2 image: name: gardenlinux architecture: amd64 maximum: 2 minimum: 2 maxSurge: 1 maxUnavailable: 0 volume: type: pd-balanced size: 50Gi zones: - europe-west1-b workersSettings: sshAccess: enabled: false purpose: evaluation region: europe-west1 secretBindingName: secretBindingName OpenStack kind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: openstack spec: cloudProfileName: converged-cloud-cp kubernetes: kubeAPIServer: admissionPlugins: - name: PodSecurity config: apiVersion: pod-security.admission.config.k8s.io/v1beta1 kind: PodSecurityConfiguration defaults: enforce: baseline audit: baseline warn: baseline disabled: false auditConfig: auditPolicy: configMapRef: name: audit-policy version: \"1.28\" enableStaticTokenKubeconfig: false networking: type: calico pods: 100.64.0.0/12 nodes: 10.180.0.0/16 services: 100.104.0.0/13 ipFamilies: - IPv4 provider: type: openstack controlPlaneConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerProvider: f5 infrastructureConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: workers: 10.180.0.0/16 floatingPoolName: FloatingIP-external-cp workers: - cri: name: containerd name: worker-dqty2 machine: type: g_c2_m4 image: name: gardenlinux architecture: amd64 maximum: 2 minimum: 2 maxSurge: 1 maxUnavailable: 0 zones: - eu-de-1b workersSettings: sshAccess: enabled: false purpose: evaluation region: eu-de-1 secretBindingName: secretBindingName Diki Configuration The following diki 
configuration was used in order to test each of the shoot clusters described above. Mind that the rules regarding audit logging are skipped because organizations have different requirements and Gardener can integrate with different audit logging solutions.\n Configuration metadata: ... providers: - id: gardener name: Gardener metadata: ... args: ... rulesets: - id: disa-kubernetes-stig name: DISA Kubernetes Security Technical Implementation Guide version: v1r11 args: maxRetries: 5 ruleOptions: - ruleID: \"242402\" skip: enabled: true justification: \"Gardener can integrate with different audit logging solutions\" - ruleID: \"242403\" skip: enabled: true justification: \"Gardener can integrate with different audit logging solutions\" - ruleID: \"242414\" args: acceptedPods: - podMatchLabels: k8s-app: node-local-dns namespaceMatchLabels: kubernetes.io/metadata.name: kube-system justification: \"node local dns requires port 53 in order to operate properly\" ports: - 53 - ruleID: \"242445\" args: expectedFileOwner: users: [\"0\", \"65532\"] groups: [\"0\", \"65532\"] - ruleID: \"242446\" args: expectedFileOwner: users: [\"0\", \"65532\"] groups: [\"0\", \"65532\"] - ruleID: \"242451\" args: expectedFileOwner: users: [\"0\", \"65532\"] groups: [\"0\", \"65532\"] - ruleID: \"242462\" skip: enabled: true justification: \"Gardener can integrate with different audit logging solutions\" - ruleID: \"242463\" skip: enabled: true justification: \"Gardener can integrate with different audit logging solutions\" - ruleID: \"242464\" skip: enabled: true justification: \"Gardener can integrate with different audit logging solutions\" - ruleID: \"245543\" args: acceptedTokens: - user: \"health-check\" uid: \"health-check\" - ruleID: \"254800\" args: minPodSecurityLevel: \"baseline\" output: minStatus: Passed Security Compliance Report for Hardened Shoot Clusters The report can be reviewed directly or downloaded by clicking here.\n Compliance Run (07-25-2024) Diki Version: v0.10.0\nGlossary 🟢 Passed: Rule check has been fulfilled. 🔵 Skipped: Rule check has been considered irrelevant for the specific scenario and will not be run. 🔵 Accepted: Rule check may or may not have been run, but it was decided by the user that the check is not a finding. 
🟠 Warning: Rule check encountered an ambiguous condition or configuration, so it could not be determined whether the check is fulfilled. 🔴 Failed: Rule check was not fulfilled and can be considered a finding. 🔴 Errored: Rule check failed with an error at runtime, so it cannot be determined whether the check is fulfilled. 🟠 Not Implemented: Rule check has not been implemented yet. Provider Gardener\n Evaluated targets aws (gardenVirtualCloudProvider: gcp, gardenerVersion: v1.99.2, projectName: diki-comp, seedCloudProvider: aws, seedKubernetesVersion: v1.29.4, shootCloudProvider: aws, shootKubernetesVersion: v1.28.10, time: 07-25-2024 13:20:33) azure (gardenVirtualCloudProvider: gcp, gardenerVersion: v1.99.2, projectName: diki-comp, seedCloudProvider: azure, seedKubernetesVersion: v1.29.4, shootCloudProvider: azure, shootKubernetesVersion: v1.28.10, time: 07-25-2024 13:21:30) gcp (gardenVirtualCloudProvider: gcp, gardenerVersion: v1.99.2, projectName: diki-comp, seedCloudProvider: gcp, seedKubernetesVersion: v1.29.4, shootCloudProvider: gcp, shootKubernetesVersion: v1.28.10, time: 07-25-2024 13:22:14) openstack (gardenVirtualCloudProvider: gcp, gardenerVersion: v1.99.2, projectName: diki-comp, seedCloudProvider: openstack, seedKubernetesVersion: v1.29.4, shootCloudProvider: openstack, shootKubernetesVersion: v1.28.10, time: 07-25-2024 13:24:21) v1r11 DISA Kubernetes Security Technical Implementation Guide (61x Passed 🟢, 24x Skipped 🔵, 7x Accepted 🔵, 7x Warning 🟠, 3x Failed 🔴) 🟢 Passed The Kubernetes Controller Manager must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination (MEDIUM 242376) Option tls-min-version has not been set. 
aws kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--aws azure kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--openstack Kubernetes Scheduler must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination (MEDIUM 242377) Option tls-min-version has not been set. aws cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--aws azure cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--azure gcp cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--gcp openstack cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--openstack The Kubernetes API Server must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination (MEDIUM 242378) Option tls-min-version has not been set. aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack The Kubernetes etcd must use TLS to protect the confidentiality of sensitive data during electronic dissemination (MEDIUM 242379) Option client-transport-security.auto-tls set to allowed value. 
aws kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws azure kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure gcp kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp openstack kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack The Kubernetes Controller Manager must create unique service accounts for each work payload (HIGH 242381) Option use-service-account-credentials set to allowed value. aws kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--aws azure kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--openstack The Kubernetes API Server must enable Node,RBAC as the authorization mode (MEDIUM 242382) Option authorization-mode set to expected value. aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack Kubernetes must separate user functionality (MEDIUM 242383) System resource in system namespaces. aws kind: Service name: kubernetes namespace: default azure kind: Service name: kubernetes namespace: default gcp kind: Service name: kubernetes namespace: default openstack kind: Service name: kubernetes namespace: default The Kubernetes API server must have the insecure port flag disabled (HIGH 242386) Option insecure-port not set. 
aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack The Kubernetes Kubelet must have the \"readOnlyPort\" flag disabled (HIGH 242387) Option readOnlyPort not set. aws kind: node name: ip-IP-Address.eu-west-1.compute.internal kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb The Kubernetes API server must have the insecure bind address not set (HIGH 242388) Option insecure-bind-address not set. aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack The Kubernetes API server must have the secure port set (MEDIUM 242389) Option secure-port set to allowed value. aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack The Kubernetes API server must have anonymous authentication disabled (HIGH 242390) Option anonymous-auth set to allowed value. 
aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack The Kubernetes Kubelet must have anonymous authentication disabled (HIGH 242391) Option authentication.anonymous.enabled set to allowed value. aws kind: node name: ip-IP-Address.eu-west-1.compute.internal kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb The Kubernetes kubelet must enable explicit authorization (HIGH 242392) Option authorization.mode set to allowed value. 
aws kind: node name: ip-IP-Address.eu-west-1.compute.internal kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb Kubernetes Worker Nodes must not have sshd service running (MEDIUM 242393) SSH daemon service not installed aws kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs Kubernetes Worker Nodes must not have the sshd service enabled (MEDIUM 242394) SSH daemon disabled (or could not be probed) aws kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs Kubernetes dashboard must not be enabled (MEDIUM 242395) Kubernetes dashboard not installed aws azure gcp openstack The Kubernetes kubelet staticPodPath must not enable static pods (HIGH 242397) Option staticPodPath not set. 
aws kind: node name: ip-IP-Address.eu-west-1.compute.internal kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb The Kubernetes API server must have Alpha APIs disabled (MEDIUM 242400) Option featureGates.AllAlpha not set. aws cluster: seed kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws cluster: seed kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--aws cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--aws cluster: shoot kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system azure cluster: seed kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure cluster: seed kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--azure cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--azure cluster: shoot kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system gcp cluster: seed kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp cluster: seed kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--gcp cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--gcp cluster: shoot 
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system openstack cluster: seed kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack cluster: seed kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--openstack cluster: seed kind: deployment name: kube-scheduler namespace: shoot--diki-comp--openstack cluster: shoot kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system Kubernetes Kubelet must deny hostname override (MEDIUM 242404) Flag hostname-override not set. aws kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs The Kubernetes kubelet configuration file must be owned by root (MEDIUM 242406) File has expected owners aws details: fileName: /etc/systemd/system/kubelet.service, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal azure details: fileName: /etc/systemd/system/kubelet.service, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp details: fileName: /etc/systemd/system/kubelet.service, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack details: fileName: /etc/systemd/system/kubelet.service, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs The Kubernetes kubelet configuration files must have file permissions set to 
644 or more restrictive (MEDIUM 242407) File has expected permissions aws details: fileName: /etc/systemd/system/kubelet.service, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal azure details: fileName: /etc/systemd/system/kubelet.service, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp details: fileName: /etc/systemd/system/kubelet.service, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack details: fileName: /etc/systemd/system/kubelet.service, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs Kubernetes Controller Manager must disable profiling (MEDIUM 242409) Option profiling set to allowed value. aws kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--aws azure kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--openstack The Kubernetes cluster must use non-privileged host ports for user pods (MEDIUM 242414) Container does not use hostPort \u003c 1024. 
aws cluster: seed kind: pod name: aws-custom-route-controller-7856476fd4-hsq29 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: blackbox-exporter-5d75c47dcd-2v7cs namespace: shoot--diki-comp--aws cluster: seed kind: pod name: blackbox-exporter-5d75c47dcd-d7bpd namespace: shoot--diki-comp--aws cluster: seed kind: pod name: cert-controller-manager-755dbd646b-hgxzx namespace: shoot--diki-comp--aws cluster: seed kind: pod name: cloud-controller-manager-769c9b45dd-c5vxq namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-controller-7669f6bfc4-nscqb namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-validation-654f9b49d7-xfjxn namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-validation-654f9b49d7-xs2pt namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: event-logger-7cdddb58d8-65h7q namespace: shoot--diki-comp--aws cluster: seed kind: pod name: 
extension-shoot-lakom-service-6df659477c-28tts namespace: shoot--diki-comp--aws cluster: seed kind: pod name: extension-shoot-lakom-service-6df659477c-5q5st namespace: shoot--diki-comp--aws cluster: seed kind: pod name: gardener-resource-manager-6d957ff4b4-56mqn namespace: shoot--diki-comp--aws cluster: seed kind: pod name: gardener-resource-manager-6d957ff4b4-b2lbj namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-apiserver-76d9c64f5b-7gwf4 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-state-metrics-68dfcd5d48-5mdnv namespace: shoot--diki-comp--aws cluster: seed kind: pod name: machine-controller-manager-7454c6df68-z77xw namespace: shoot--diki-comp--aws cluster: seed kind: pod name: machine-controller-manager-7454c6df68-z77xw namespace: shoot--diki-comp--aws cluster: seed kind: pod name: network-problem-detector-controller-5f458c7579-82tns namespace: shoot--diki-comp--aws cluster: seed kind: pod name: plutono-567d7c946b-7xgjl namespace: shoot--diki-comp--aws cluster: seed kind: pod name: plutono-567d7c946b-7xgjl namespace: shoot--diki-comp--aws cluster: seed kind: pod name: plutono-567d7c946b-7xgjl namespace: shoot--diki-comp--aws cluster: seed kind: pod name: plutono-567d7c946b-7xgjl namespace: shoot--diki-comp--aws cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: shoot-dns-service-645f556cf4-7xc4r namespace: shoot--diki-comp--aws 
cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-admission-controller-59bc4d9d8f-hxrh7 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-admission-controller-59bc4d9d8f-vf58j namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-recommender-6f499cfd88-lnbrx namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-updater-746fb98848-8zzf8 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpn-seed-server-547576865c-x6fr2 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpn-seed-server-547576865c-x6fr2 namespace: shoot--diki-comp--aws cluster: shoot kind: pod name: apiserver-proxy-kx2mw namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-kx2mw namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-wtlv2 namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-wtlv2 namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-82dwq namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-wh7rj namespace: kube-system cluster: shoot kind: pod name: calico-node-9nlzv namespace: kube-system cluster: shoot kind: pod name: calico-node-9nlzv namespace: kube-system cluster: shoot kind: pod name: calico-node-l94hn namespace: kube-system cluster: shoot kind: pod name: calico-node-l94hn namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-x9rl9 namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-6rlcn namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-g7k2t namespace: kube-system cluster: shoot kind: pod name: 
calico-typha-horizontal-autoscaler-586ff75c6b-vtvrw namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-7gf59 namespace: kube-system cluster: shoot kind: pod name: coredns-5cc8785ccd-x8bs2 namespace: kube-system cluster: shoot kind: pod name: coredns-5cc8785ccd-xwwgh namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-mrv64 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-mrv64 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-mrv64 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-s74n2 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-s74n2 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-s74n2 namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-nd86n namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-vjfwc namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-4lhcz namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-4lhcz namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot kind: pod name: metrics-server-5776b47bc7-g7qjf namespace: kube-system cluster: shoot kind: pod name: metrics-server-5776b47bc7-rfmd5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-s5286 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-x5rm5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-5kv4k namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-s4wlg namespace: kube-system cluster: shoot kind: pod name: node-exporter-fkdwq namespace: kube-system cluster: shoot kind: pod name: node-exporter-xhh5n namespace: 
kube-system cluster: shoot kind: pod name: node-problem-detector-7nhkg namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-vngln namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-664f9946cc-cgkvj namespace: kube-system azure cluster: seed kind: pod name: blackbox-exporter-86c7645696-lpf4t namespace: shoot--diki-comp--azure cluster: seed kind: pod name: blackbox-exporter-86c7645696-wk9l5 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: cert-controller-manager-7bd977469b-gj7zt namespace: shoot--diki-comp--azure cluster: seed kind: pod name: cloud-controller-manager-678c6d74d6-9n8dm namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed 
kind: pod name: csi-snapshot-controller-54b4bcd846-mlxgq namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-snapshot-validation-797f668744-685cb namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-snapshot-validation-797f668744-t64t4 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: event-logger-5d8496f566-jbqv7 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: extension-shoot-lakom-service-c79868bf8-mkrs9 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: extension-shoot-lakom-service-c79868bf8-tddc6 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: gardener-resource-manager-78754877d5-k6cl8 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: gardener-resource-manager-78754877d5-ml2z8 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-apiserver-86b5d6dbc4-fqmls namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-apiserver-86b5d6dbc4-thd52 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-controller-manager-86f5fc4fc7-fx4b5 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-scheduler-9df464f49-fswpk namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-state-metrics-85b5bf77b4-mxf42 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: machine-controller-manager-68b74c776d-msnzv namespace: shoot--diki-comp--azure cluster: seed kind: pod name: machine-controller-manager-68b74c776d-msnzv namespace: shoot--diki-comp--azure cluster: seed kind: pod name: network-problem-detector-controller-66989c7547-j6rgc namespace: shoot--diki-comp--azure cluster: seed 
kind: pod name: plutono-6fc5d56577-9h64s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: plutono-6fc5d56577-9h64s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: plutono-6fc5d56577-9h64s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: plutono-6fc5d56577-9h64s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: remedy-controller-azure-57f7db994-gv467 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: shoot-dns-service-55f4885d86-85jgc namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-admission-controller-6ccd6fc589-fxmch namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-admission-controller-6ccd6fc589-s822t namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-recommender-56bbfc87c8-lbv2s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-updater-6f4b5fb546-xb778 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpn-seed-server-576f5cc-rttdc namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpn-seed-server-576f5cc-rttdc namespace: shoot--diki-comp--azure cluster: shoot kind: pod name: apiserver-proxy-kbgdp namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-kbgdp namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-ptvb8 namespace: kube-system 
cluster: shoot kind: pod name: apiserver-proxy-ptvb8 namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-gx79p namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-qhbs2 namespace: kube-system cluster: shoot kind: pod name: calico-node-4wmbt namespace: kube-system cluster: shoot kind: pod name: calico-node-8wlvp namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hf2jw namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-98jwl namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-j82pt namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-gq6ml namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-jg9nf namespace: kube-system cluster: shoot kind: pod name: cloud-node-manager-rzc7h namespace: kube-system cluster: shoot kind: pod name: cloud-node-manager-svm6w namespace: kube-system cluster: shoot kind: pod name: coredns-58fd58b4f6-kbbdp namespace: kube-system cluster: shoot kind: pod name: coredns-58fd58b4f6-pvvrz namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-hjxlx namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-hjxlx namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-hjxlx namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-nsmlq namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-nsmlq namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-nsmlq namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-5ln94 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-5ln94 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-5ln94 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-qv8rp 
namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-qv8rp namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-qv8rp namespace: kube-system cluster: shoot kind: pod name: diki-242449-m2wpk64dps namespace: kube-system cluster: shoot kind: pod name: diki-242451-0r3a1mudxn namespace: kube-system cluster: shoot kind: pod name: diki-242466-syzgrb0nhu namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-bbbbr namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-qb8t6 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-kpksf namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-kpksf namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7655f847b-4kzt2 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7655f847b-8v894 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-6b9mc namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-kbzqs namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-k22pr namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-kx6jn namespace: kube-system cluster: shoot kind: pod name: node-exporter-nbkkr namespace: kube-system cluster: shoot kind: pod name: node-exporter-ph9sx namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-8mw8p namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-p9jp4 namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-56dcf9cf9d-99tfc namespace: kube-system gcp cluster: seed kind: pod name: blackbox-exporter-c7cc77fbf-db9kq namespace: shoot--diki-comp--gcp cluster: seed kind: pod 
name: blackbox-exporter-c7cc77fbf-t667q namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: cert-controller-manager-6946674f78-9dsg6 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: cloud-controller-manager-6f67b6df64-9svgn namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-controller-fd9587fdf-2mvdf namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-validation-79df8f8c66-6kzb7 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-validation-79df8f8c66-qggvf namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: event-logger-69576b5c95-hjbwj namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: extension-shoot-lakom-service-86596f55f8-qlhnp namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: extension-shoot-lakom-service-86596f55f8-z7rjv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: gardener-resource-manager-ff5bf7fb4-4r2tv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: gardener-resource-manager-ff5bf7fb4-szjgd 
namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-apiserver-6f5746f87-5mfhz namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-apiserver-6f5746f87-mjzj9 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-controller-manager-856b7c9889-dzsbv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-scheduler-5d4c7456bd-mvv6x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-state-metrics-64d5994f8-rfzmh namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: machine-controller-manager-67b97665c9-m54jw namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: machine-controller-manager-67b97665c9-m54jw namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: network-problem-detector-controller-66cc54677c-kvq75 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: plutono-69866c8cdb-n2c8x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: plutono-69866c8cdb-n2c8x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: plutono-69866c8cdb-n2c8x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: plutono-69866c8cdb-n2c8x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: shoot-dns-service-575bcd459-79s4m namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-admission-controller-9cffc8f78-jl676 namespace: 
shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-admission-controller-9cffc8f78-s8flk namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-recommender-56645d8bdb-2lcmb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-updater-f79b6fc6b-4rlg5 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpn-seed-server-67c8474dc7-blfcl namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpn-seed-server-67c8474dc7-blfcl namespace: shoot--diki-comp--gcp cluster: shoot kind: pod name: apiserver-proxy-rmcnj namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-rmcnj namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-v88dp namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-v88dp namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-gmfnj namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-jjtfq namespace: kube-system cluster: shoot kind: pod name: calico-node-5bzc2 namespace: kube-system cluster: shoot kind: pod name: calico-node-5bzc2 namespace: kube-system cluster: shoot kind: pod name: calico-node-cnwrp namespace: kube-system cluster: shoot kind: pod name: calico-node-cnwrp namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hjg6k namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-frk7j namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-rlc2z namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-5cbl7 namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-scbqx namespace: kube-system cluster: shoot kind: pod name: coredns-679b67f9f7-m46pm namespace: kube-system cluster: shoot kind: pod name: coredns-679b67f9f7-t8f7n namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-z298z namespace: 
kube-system cluster: shoot kind: pod name: csi-driver-node-z298z namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-z298z namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-zgp8f namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-zgp8f namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-zgp8f namespace: kube-system cluster: shoot kind: pod name: diki-242404-z1nu9wom0m namespace: kube-system cluster: shoot kind: pod name: diki-242449-8z89s24f3f namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-2blsk namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-mwnd5 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-bb9x9 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-bb9x9 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot kind: pod name: metrics-server-7db8b88958-dz2h9 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7db8b88958-rwnwc namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-x6g88 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-zl466 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-n8k2n namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-nnqtf namespace: kube-system cluster: shoot kind: pod name: node-exporter-8frqb namespace: kube-system cluster: shoot kind: pod name: node-exporter-xq6cg namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-mhj4m namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-rn6hv namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-59f4dbd8cd-bwf8w namespace: 
kube-system openstack cluster: seed kind: pod name: blackbox-exporter-6b8d699d98-46wrb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: blackbox-exporter-6b8d699d98-v88mn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: cert-controller-manager-5df68f6f5d-sgc7d namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: cloud-controller-manager-b4857486b-2h6jb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-controller-5d4fc5c479-dmrwv namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-validation-5fc8f5bb4b-66245 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-validation-5fc8f5bb4b-c924q namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: event-logger-6469658865-tbjft namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: extension-shoot-lakom-service-844c5dcfd6-j9wdx namespace: shoot--diki-comp--openstack cluster: seed kind: pod 
name: extension-shoot-lakom-service-844c5dcfd6-wrpcb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: gardener-resource-manager-7b4747c958-pg654 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: gardener-resource-manager-7b4747c958-rfqn2 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-apiserver-7fb7b9b4cd-m7mmg namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-state-metrics-7f54fbdbdb-jpq78 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: machine-controller-manager-85cbdc979-mptqt namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: machine-controller-manager-85cbdc979-mptqt namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: network-problem-detector-controller-78bbfd4757-tf8f2 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: plutono-694bff49d4-px76r namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: plutono-694bff49d4-px76r namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: plutono-694bff49d4-px76r namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: plutono-694bff49d4-px76r namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: shoot-dns-service-867b566fc5-ct8wj namespace: 
shoot--diki-comp--openstack cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-admission-controller-b99c554c8-7j9lc namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-admission-controller-b99c554c8-rhbmx namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-recommender-5df469cbf4-kngl8 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-updater-5dfd58d478-ph8mz namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpn-seed-server-69d5794bb7-s7vkf namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpn-seed-server-69d5794bb7-s7vkf namespace: shoot--diki-comp--openstack cluster: shoot kind: pod name: apiserver-proxy-qw9pr namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-qw9pr namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-qzdcp namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-qzdcp namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-2nt8f namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-6tqbq namespace: kube-system cluster: shoot kind: pod name: calico-kube-controllers-7fbfb84c54-2lsh5 namespace: kube-system cluster: shoot kind: pod name: calico-node-7xv9t namespace: kube-system cluster: shoot kind: pod name: calico-node-k2pc6 namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-przgw namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-bwkdh namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-hkdc5 namespace: kube-system cluster: shoot kind: pod name: 
calico-typha-horizontal-autoscaler-586ff75c6b-htlcp namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-9zp9f namespace: kube-system cluster: shoot kind: pod name: coredns-56d45984c9-f6xtf namespace: kube-system cluster: shoot kind: pod name: coredns-56d45984c9-zgq2w namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-gcsc7 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-gcsc7 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-gcsc7 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-pmml4 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-pmml4 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-pmml4 namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-t965v namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-vsrrl namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-xx9v6 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-xx9v6 namespace: kube-system cluster: shoot kind: pod name: metrics-server-586dcd8bff-7n7nm namespace: kube-system cluster: shoot kind: pod name: metrics-server-586dcd8bff-sjjfv namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-55ptw namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-lp4n6 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-ftcw5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-zt596 namespace: kube-system cluster: shoot kind: pod name: node-exporter-rnbv9 namespace: kube-system cluster: shoot kind: pod name: node-exporter-trqtg namespace: 
kube-system cluster: shoot kind: pod name: node-problem-detector-k79bs namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-pdtdj namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-697b676499-jkgvw namespace: kube-system
Secrets in Kubernetes must not be stored as environment variables (HIGH 242415)
Pod does not use environment to inject secret.
aws cluster: seed kind: pod name: aws-custom-route-controller-7856476fd4-hsq29 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: blackbox-exporter-5d75c47dcd-2v7cs namespace: shoot--diki-comp--aws cluster: seed kind: pod name: blackbox-exporter-5d75c47dcd-d7bpd namespace: shoot--diki-comp--aws cluster: seed kind: pod name: cert-controller-manager-755dbd646b-hgxzx namespace: shoot--diki-comp--aws cluster: seed kind: pod name: cloud-controller-manager-769c9b45dd-c5vxq namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-driver-controller-7ffbd87db8-dkp27 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-controller-7669f6bfc4-nscqb namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-validation-654f9b49d7-xfjxn namespace: shoot--diki-comp--aws cluster: seed kind: pod name: csi-snapshot-validation-654f9b49d7-xs2pt namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: event-logger-7cdddb58d8-65h7q namespace: shoot--diki-comp--aws cluster: seed kind: pod name: extension-shoot-lakom-service-6df659477c-28tts namespace: shoot--diki-comp--aws cluster: seed kind: pod name: extension-shoot-lakom-service-6df659477c-5q5st namespace: shoot--diki-comp--aws cluster: seed kind: pod name: gardener-resource-manager-6d957ff4b4-56mqn namespace: shoot--diki-comp--aws cluster: seed kind: pod name: gardener-resource-manager-6d957ff4b4-b2lbj namespace: shoot--diki-comp--aws 
cluster: seed kind: pod name: kube-apiserver-76d9c64f5b-7gwf4 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: kube-state-metrics-68dfcd5d48-5mdnv namespace: shoot--diki-comp--aws cluster: seed kind: pod name: machine-controller-manager-7454c6df68-z77xw namespace: shoot--diki-comp--aws cluster: seed kind: pod name: network-problem-detector-controller-5f458c7579-82tns namespace: shoot--diki-comp--aws cluster: seed kind: pod name: plutono-567d7c946b-7xgjl namespace: shoot--diki-comp--aws cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: shoot-dns-service-645f556cf4-7xc4r namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-admission-controller-59bc4d9d8f-hxrh7 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-admission-controller-59bc4d9d8f-vf58j namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-recommender-6f499cfd88-lnbrx namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpa-updater-746fb98848-8zzf8 namespace: shoot--diki-comp--aws cluster: seed kind: pod name: vpn-seed-server-547576865c-x6fr2 namespace: shoot--diki-comp--aws cluster: shoot kind: pod name: apiserver-proxy-kx2mw namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-wtlv2 namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-82dwq namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-wh7rj namespace: kube-system cluster: shoot kind: pod name: calico-node-9nlzv namespace: kube-system cluster: shoot kind: pod name: calico-node-l94hn 
namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-x9rl9 namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-6rlcn namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-g7k2t namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-vtvrw namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-7gf59 namespace: kube-system cluster: shoot kind: pod name: coredns-5cc8785ccd-x8bs2 namespace: kube-system cluster: shoot kind: pod name: coredns-5cc8785ccd-xwwgh namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-mrv64 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-s74n2 namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-nd86n namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-vjfwc namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-4lhcz namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot kind: pod name: metrics-server-5776b47bc7-g7qjf namespace: kube-system cluster: shoot kind: pod name: metrics-server-5776b47bc7-rfmd5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-s5286 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-x5rm5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-5kv4k namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-s4wlg namespace: kube-system cluster: shoot kind: pod name: node-exporter-fkdwq namespace: kube-system cluster: shoot kind: pod name: node-exporter-xhh5n namespace: kube-system cluster: shoot kind: pod name: node-local-dns-6kjdw namespace: kube-system cluster: shoot kind: pod name: node-local-dns-ws9mx namespace: 
kube-system cluster: shoot kind: pod name: node-problem-detector-7nhkg namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-vngln namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-664f9946cc-cgkvj namespace: kube-system azure cluster: seed kind: pod name: blackbox-exporter-86c7645696-lpf4t namespace: shoot--diki-comp--azure cluster: seed kind: pod name: blackbox-exporter-86c7645696-wk9l5 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: cert-controller-manager-7bd977469b-gj7zt namespace: shoot--diki-comp--azure cluster: seed kind: pod name: cloud-controller-manager-678c6d74d6-9n8dm namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-disk-6b967795c9-w8nmj namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-driver-controller-file-7cfdfbd8fc-xgp5z namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-snapshot-controller-54b4bcd846-mlxgq namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-snapshot-validation-797f668744-685cb namespace: shoot--diki-comp--azure cluster: seed kind: pod name: csi-snapshot-validation-797f668744-t64t4 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: event-logger-5d8496f566-jbqv7 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: extension-shoot-lakom-service-c79868bf8-mkrs9 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: extension-shoot-lakom-service-c79868bf8-tddc6 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: gardener-resource-manager-78754877d5-k6cl8 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: gardener-resource-manager-78754877d5-ml2z8 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-apiserver-86b5d6dbc4-fqmls namespace: shoot--diki-comp--azure 
cluster: seed kind: pod name: kube-apiserver-86b5d6dbc4-thd52 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-controller-manager-86f5fc4fc7-fx4b5 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-scheduler-9df464f49-fswpk namespace: shoot--diki-comp--azure cluster: seed kind: pod name: kube-state-metrics-85b5bf77b4-mxf42 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: machine-controller-manager-68b74c776d-msnzv namespace: shoot--diki-comp--azure cluster: seed kind: pod name: network-problem-detector-controller-66989c7547-j6rgc namespace: shoot--diki-comp--azure cluster: seed kind: pod name: plutono-6fc5d56577-9h64s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: remedy-controller-azure-57f7db994-gv467 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: shoot-dns-service-55f4885d86-85jgc namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-admission-controller-6ccd6fc589-fxmch namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-admission-controller-6ccd6fc589-s822t namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-recommender-56bbfc87c8-lbv2s namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpa-updater-6f4b5fb546-xb778 namespace: shoot--diki-comp--azure cluster: seed kind: pod name: vpn-seed-server-576f5cc-rttdc namespace: shoot--diki-comp--azure cluster: shoot kind: pod name: apiserver-proxy-kbgdp namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-ptvb8 namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-gx79p namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-qhbs2 namespace: kube-system cluster: shoot kind: pod name: calico-node-4wmbt namespace: kube-system cluster: shoot 
kind: pod name: calico-node-8wlvp namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hf2jw namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-98jwl namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-j82pt namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-gq6ml namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-jg9nf namespace: kube-system cluster: shoot kind: pod name: cloud-node-manager-rzc7h namespace: kube-system cluster: shoot kind: pod name: cloud-node-manager-svm6w namespace: kube-system cluster: shoot kind: pod name: coredns-58fd58b4f6-kbbdp namespace: kube-system cluster: shoot kind: pod name: coredns-58fd58b4f6-pvvrz namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-hjxlx namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-disk-nsmlq namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-5ln94 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-file-qv8rp namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-bbbbr namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-qb8t6 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-kpksf namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7655f847b-4kzt2 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7655f847b-8v894 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-6b9mc namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-kbzqs namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-k22pr namespace: kube-system cluster: shoot kind: 
pod name: network-problem-detector-pod-kx6jn namespace: kube-system cluster: shoot kind: pod name: node-exporter-nbkkr namespace: kube-system cluster: shoot kind: pod name: node-exporter-ph9sx namespace: kube-system cluster: shoot kind: pod name: node-local-dns-s2lvs namespace: kube-system cluster: shoot kind: pod name: node-local-dns-zs2sb namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-8mw8p namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-p9jp4 namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-56dcf9cf9d-99tfc namespace: kube-system gcp cluster: seed kind: pod name: blackbox-exporter-c7cc77fbf-db9kq namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: blackbox-exporter-c7cc77fbf-t667q namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: cert-controller-manager-6946674f78-9dsg6 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: cloud-controller-manager-6f67b6df64-9svgn namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-driver-controller-7dd7c47666-zjpqb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-controller-fd9587fdf-2mvdf namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-validation-79df8f8c66-6kzb7 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: csi-snapshot-validation-79df8f8c66-qggvf namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: event-logger-69576b5c95-hjbwj namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: extension-shoot-lakom-service-86596f55f8-qlhnp namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: extension-shoot-lakom-service-86596f55f8-z7rjv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: gardener-resource-manager-ff5bf7fb4-4r2tv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: gardener-resource-manager-ff5bf7fb4-szjgd 
namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-apiserver-6f5746f87-5mfhz namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-apiserver-6f5746f87-mjzj9 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-controller-manager-856b7c9889-dzsbv namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-scheduler-5d4c7456bd-mvv6x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: kube-state-metrics-64d5994f8-rfzmh namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: machine-controller-manager-67b97665c9-m54jw namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: network-problem-detector-controller-66cc54677c-kvq75 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: plutono-69866c8cdb-n2c8x namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: shoot-dns-service-575bcd459-79s4m namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-admission-controller-9cffc8f78-jl676 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-admission-controller-9cffc8f78-s8flk namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-recommender-56645d8bdb-2lcmb namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpa-updater-f79b6fc6b-4rlg5 namespace: shoot--diki-comp--gcp cluster: seed kind: pod name: vpn-seed-server-67c8474dc7-blfcl namespace: shoot--diki-comp--gcp cluster: shoot kind: pod name: apiserver-proxy-rmcnj namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-v88dp namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-gmfnj namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-jjtfq namespace: kube-system cluster: shoot kind: pod name: calico-node-5bzc2 namespace: kube-system cluster: shoot kind: pod 
name: calico-node-cnwrp namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hjg6k namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-frk7j namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-rlc2z namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-5cbl7 namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-scbqx namespace: kube-system cluster: shoot kind: pod name: coredns-679b67f9f7-m46pm namespace: kube-system cluster: shoot kind: pod name: coredns-679b67f9f7-t8f7n namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-z298z namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-zgp8f namespace: kube-system cluster: shoot kind: pod name: diki-242393-ot4eirqfni namespace: kube-system cluster: shoot kind: pod name: diki-242406-uphz6x02zf namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-2blsk namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-mwnd5 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-bb9x9 namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot kind: pod name: metrics-server-7db8b88958-dz2h9 namespace: kube-system cluster: shoot kind: pod name: metrics-server-7db8b88958-rwnwc namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-x6g88 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-zl466 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-n8k2n namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-nnqtf namespace: kube-system cluster: shoot kind: pod name: node-exporter-8frqb namespace: kube-system cluster: shoot kind: pod name: 
node-exporter-xq6cg namespace: kube-system cluster: shoot kind: pod name: node-local-dns-cl4xr namespace: kube-system cluster: shoot kind: pod name: node-local-dns-kz9nr namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-mhj4m namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-rn6hv namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-59f4dbd8cd-bwf8w namespace: kube-system openstack cluster: seed kind: pod name: blackbox-exporter-6b8d699d98-46wrb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: blackbox-exporter-6b8d699d98-v88mn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: cert-controller-manager-5df68f6f5d-sgc7d namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: cloud-controller-manager-b4857486b-2h6jb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-driver-controller-5968889847-slsgn namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-controller-5d4fc5c479-dmrwv namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-validation-5fc8f5bb4b-66245 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: csi-snapshot-validation-5fc8f5bb4b-c924q namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: event-logger-6469658865-tbjft namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: extension-shoot-lakom-service-844c5dcfd6-j9wdx namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: extension-shoot-lakom-service-844c5dcfd6-wrpcb namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: gardener-resource-manager-7b4747c958-pg654 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: gardener-resource-manager-7b4747c958-rfqn2 
namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-apiserver-7fb7b9b4cd-m7mmg namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: kube-state-metrics-7f54fbdbdb-jpq78 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: machine-controller-manager-85cbdc979-mptqt namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: network-problem-detector-controller-78bbfd4757-tf8f2 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: plutono-694bff49d4-px76r namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: prometheus-shoot-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: shoot-dns-service-867b566fc5-ct8wj namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vali-0 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-admission-controller-b99c554c8-7j9lc namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-admission-controller-b99c554c8-rhbmx namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-recommender-5df469cbf4-kngl8 namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpa-updater-5dfd58d478-ph8mz namespace: shoot--diki-comp--openstack cluster: seed kind: pod name: vpn-seed-server-69d5794bb7-s7vkf namespace: shoot--diki-comp--openstack cluster: shoot kind: pod name: apiserver-proxy-qw9pr namespace: kube-system cluster: shoot kind: pod name: apiserver-proxy-qzdcp namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-2nt8f namespace: kube-system cluster: shoot kind: pod name: blackbox-exporter-858fbbb8d6-6tqbq namespace: 
kube-system cluster: shoot kind: pod name: calico-kube-controllers-7fbfb84c54-2lsh5 namespace: kube-system cluster: shoot kind: pod name: calico-node-7xv9t namespace: kube-system cluster: shoot kind: pod name: calico-node-k2pc6 namespace: kube-system cluster: shoot kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-przgw namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-bwkdh namespace: kube-system cluster: shoot kind: pod name: calico-typha-deploy-7968dd78d5-hkdc5 namespace: kube-system cluster: shoot kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-htlcp namespace: kube-system cluster: shoot kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-9zp9f namespace: kube-system cluster: shoot kind: pod name: coredns-56d45984c9-f6xtf namespace: kube-system cluster: shoot kind: pod name: coredns-56d45984c9-zgq2w namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-gcsc7 namespace: kube-system cluster: shoot kind: pod name: csi-driver-node-pmml4 namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-t965v namespace: kube-system cluster: shoot kind: pod name: egress-filter-applier-vsrrl namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot kind: pod name: kube-proxy-worker-dqty2-v1.28.10-xx9v6 namespace: kube-system cluster: shoot kind: pod name: metrics-server-586dcd8bff-7n7nm namespace: kube-system cluster: shoot kind: pod name: metrics-server-586dcd8bff-sjjfv namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-55ptw namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-host-lp4n6 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-ftcw5 namespace: kube-system cluster: shoot kind: pod name: network-problem-detector-pod-zt596 namespace: kube-system cluster: shoot kind: pod name: node-exporter-rnbv9 namespace: 
kube-system cluster: shoot kind: pod name: node-exporter-trqtg namespace: kube-system cluster: shoot kind: pod name: node-local-dns-jdng7 namespace: kube-system cluster: shoot kind: pod name: node-local-dns-r8z88 namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-k79bs namespace: kube-system cluster: shoot kind: pod name: node-problem-detector-pdtdj namespace: kube-system cluster: shoot kind: pod name: vpn-shoot-697b676499-jkgvw namespace: kube-system Kubernetes must separate user functionality (MEDIUM 242417) Gardener managed pods are not user pods aws kind: pod name: apiserver-proxy-kx2mw namespace: kube-system kind: pod name: apiserver-proxy-wtlv2 namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-82dwq namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-wh7rj namespace: kube-system kind: pod name: calico-node-9nlzv namespace: kube-system kind: pod name: calico-node-l94hn namespace: kube-system kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-x9rl9 namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-6rlcn namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-g7k2t namespace: kube-system kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-vtvrw namespace: kube-system kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-7gf59 namespace: kube-system kind: pod name: coredns-5cc8785ccd-x8bs2 namespace: kube-system kind: pod name: coredns-5cc8785ccd-xwwgh namespace: kube-system kind: pod name: csi-driver-node-mrv64 namespace: kube-system kind: pod name: csi-driver-node-s74n2 namespace: kube-system kind: pod name: egress-filter-applier-nd86n namespace: kube-system kind: pod name: egress-filter-applier-vjfwc namespace: kube-system kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-4lhcz namespace: kube-system kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system kind: pod name: metrics-server-5776b47bc7-g7qjf namespace: kube-system 
kind: pod name: metrics-server-5776b47bc7-rfmd5 namespace: kube-system kind: pod name: network-problem-detector-host-s5286 namespace: kube-system kind: pod name: network-problem-detector-host-x5rm5 namespace: kube-system kind: pod name: network-problem-detector-pod-5kv4k namespace: kube-system kind: pod name: network-problem-detector-pod-s4wlg namespace: kube-system kind: pod name: node-exporter-fkdwq namespace: kube-system kind: pod name: node-exporter-xhh5n namespace: kube-system kind: pod name: node-local-dns-6kjdw namespace: kube-system kind: pod name: node-local-dns-ws9mx namespace: kube-system kind: pod name: node-problem-detector-7nhkg namespace: kube-system kind: pod name: node-problem-detector-vngln namespace: kube-system kind: pod name: vpn-shoot-664f9946cc-cgkvj namespace: kube-system azure kind: pod name: apiserver-proxy-kbgdp namespace: kube-system kind: pod name: apiserver-proxy-ptvb8 namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-gx79p namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-qhbs2 namespace: kube-system kind: pod name: calico-node-4wmbt namespace: kube-system kind: pod name: calico-node-8wlvp namespace: kube-system kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hf2jw namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-98jwl namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-j82pt namespace: kube-system kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-gq6ml namespace: kube-system kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-jg9nf namespace: kube-system kind: pod name: cloud-node-manager-rzc7h namespace: kube-system kind: pod name: cloud-node-manager-svm6w namespace: kube-system kind: pod name: coredns-58fd58b4f6-kbbdp namespace: kube-system kind: pod name: coredns-58fd58b4f6-pvvrz namespace: kube-system kind: pod name: csi-driver-node-disk-hjxlx namespace: kube-system kind: pod name: csi-driver-node-disk-nsmlq namespace: 
kube-system kind: pod name: csi-driver-node-file-5ln94 namespace: kube-system kind: pod name: csi-driver-node-file-qv8rp namespace: kube-system kind: pod name: egress-filter-applier-bbbbr namespace: kube-system kind: pod name: egress-filter-applier-qb8t6 namespace: kube-system kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-kpksf namespace: kube-system kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system kind: pod name: metrics-server-7655f847b-4kzt2 namespace: kube-system kind: pod name: metrics-server-7655f847b-8v894 namespace: kube-system kind: pod name: network-problem-detector-host-6b9mc namespace: kube-system kind: pod name: network-problem-detector-host-kbzqs namespace: kube-system kind: pod name: network-problem-detector-pod-k22pr namespace: kube-system kind: pod name: network-problem-detector-pod-kx6jn namespace: kube-system kind: pod name: node-exporter-nbkkr namespace: kube-system kind: pod name: node-exporter-ph9sx namespace: kube-system kind: pod name: node-local-dns-s2lvs namespace: kube-system kind: pod name: node-local-dns-zs2sb namespace: kube-system kind: pod name: node-problem-detector-8mw8p namespace: kube-system kind: pod name: node-problem-detector-p9jp4 namespace: kube-system kind: pod name: vpn-shoot-56dcf9cf9d-99tfc namespace: kube-system gcp kind: pod name: apiserver-proxy-rmcnj namespace: kube-system kind: pod name: apiserver-proxy-v88dp namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-gmfnj namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-jjtfq namespace: kube-system kind: pod name: calico-node-5bzc2 namespace: kube-system kind: pod name: calico-node-cnwrp namespace: kube-system kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-hjg6k namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-frk7j namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-rlc2z namespace: kube-system kind: pod name: 
calico-typha-horizontal-autoscaler-586ff75c6b-5cbl7 namespace: kube-system kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-scbqx namespace: kube-system kind: pod name: coredns-679b67f9f7-m46pm namespace: kube-system kind: pod name: coredns-679b67f9f7-t8f7n namespace: kube-system kind: pod name: csi-driver-node-z298z namespace: kube-system kind: pod name: csi-driver-node-zgp8f namespace: kube-system kind: pod name: egress-filter-applier-2blsk namespace: kube-system kind: pod name: egress-filter-applier-mwnd5 namespace: kube-system kind: pod name: kube-proxy-worker-bex82-v1.28.10-bb9x9 namespace: kube-system kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system kind: pod name: metrics-server-7db8b88958-dz2h9 namespace: kube-system kind: pod name: metrics-server-7db8b88958-rwnwc namespace: kube-system kind: pod name: network-problem-detector-host-x6g88 namespace: kube-system kind: pod name: network-problem-detector-host-zl466 namespace: kube-system kind: pod name: network-problem-detector-pod-n8k2n namespace: kube-system kind: pod name: network-problem-detector-pod-nnqtf namespace: kube-system kind: pod name: node-exporter-8frqb namespace: kube-system kind: pod name: node-exporter-xq6cg namespace: kube-system kind: pod name: node-local-dns-cl4xr namespace: kube-system kind: pod name: node-local-dns-kz9nr namespace: kube-system kind: pod name: node-problem-detector-mhj4m namespace: kube-system kind: pod name: node-problem-detector-rn6hv namespace: kube-system kind: pod name: vpn-shoot-59f4dbd8cd-bwf8w namespace: kube-system openstack kind: pod name: apiserver-proxy-qw9pr namespace: kube-system kind: pod name: apiserver-proxy-qzdcp namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-2nt8f namespace: kube-system kind: pod name: blackbox-exporter-858fbbb8d6-6tqbq namespace: kube-system kind: pod name: calico-kube-controllers-7fbfb84c54-2lsh5 namespace: kube-system kind: pod name: calico-node-7xv9t namespace: kube-system 
kind: pod name: calico-node-k2pc6 namespace: kube-system kind: pod name: calico-node-vertical-autoscaler-5477bf8d8b-przgw namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-bwkdh namespace: kube-system kind: pod name: calico-typha-deploy-7968dd78d5-hkdc5 namespace: kube-system kind: pod name: calico-typha-horizontal-autoscaler-586ff75c6b-htlcp namespace: kube-system kind: pod name: calico-typha-vertical-autoscaler-b95cbbd-9zp9f namespace: kube-system kind: pod name: coredns-56d45984c9-f6xtf namespace: kube-system kind: pod name: coredns-56d45984c9-zgq2w namespace: kube-system kind: pod name: csi-driver-node-gcsc7 namespace: kube-system kind: pod name: csi-driver-node-pmml4 namespace: kube-system kind: pod name: egress-filter-applier-t965v namespace: kube-system kind: pod name: egress-filter-applier-vsrrl namespace: kube-system kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system kind: pod name: kube-proxy-worker-dqty2-v1.28.10-xx9v6 namespace: kube-system kind: pod name: metrics-server-586dcd8bff-7n7nm namespace: kube-system kind: pod name: metrics-server-586dcd8bff-sjjfv namespace: kube-system kind: pod name: network-problem-detector-host-55ptw namespace: kube-system kind: pod name: network-problem-detector-host-lp4n6 namespace: kube-system kind: pod name: network-problem-detector-pod-ftcw5 namespace: kube-system kind: pod name: network-problem-detector-pod-zt596 namespace: kube-system kind: pod name: node-exporter-rnbv9 namespace: kube-system kind: pod name: node-exporter-trqtg namespace: kube-system kind: pod name: node-local-dns-jdng7 namespace: kube-system kind: pod name: node-local-dns-r8z88 namespace: kube-system kind: pod name: node-problem-detector-k79bs namespace: kube-system kind: pod name: node-problem-detector-pdtdj namespace: kube-system kind: pod name: vpn-shoot-697b676499-jkgvw namespace: kube-system The Kubernetes API server must use approved cipher suites (MEDIUM 242418) Option tls-cipher-suites set to 
allowed values.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes API Server must have the SSL Certificate Authority set (MEDIUM 242419)
Option client-ca-file set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes Kubelet must have the SSL Certificate Authority set (MEDIUM 242420)
Option authentication.x509.clientCAFile set.
aws
kind: node name: ip-IP-Address.eu-west-1.compute.internal
kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v
gcp
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r
openstack
kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb

Kubernetes Controller Manager must have the SSL Certificate Authority set (MEDIUM 242421)
Option root-ca-file set.
aws
kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-controller-manager namespace: shoot--diki-comp--openstack

Kubernetes API Server must have a certificate for communication (MEDIUM 242422)
Option tls-cert-file set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack
Option tls-private-key-file set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes etcd must enable client authentication to secure service (MEDIUM 242423)
Option client-transport-security.client-cert-auth set to allowed value.
aws
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws
azure
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure
gcp
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp
openstack
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack

Kubernetes Kubelet must enable tlsPrivateKeyFile for client authentication to secure service (MEDIUM 242424)
Kubelet rotates server certificates automatically itself.
aws
kind: node name: ip-IP-Address.eu-west-1.compute.internal
kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v
gcp
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r
openstack
kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb

Kubernetes Kubelet must enable tlsCertFile for client authentication to secure service (MEDIUM 242425)
Kubelet rotates server certificates automatically itself.
aws
kind: node name: ip-IP-Address.eu-west-1.compute.internal
kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v
gcp
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r
openstack
kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb

Kubernetes etcd must have a key file for secure communication (MEDIUM 242427)
Option client-transport-security.key-file set to allowed value.
aws
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws
azure
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure
gcp
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp
openstack
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack

Kubernetes etcd must have a certificate for communication (MEDIUM 242428)
Option client-transport-security.cert-file set to allowed value.
aws
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws
azure
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure
gcp
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp
openstack
kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack
kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack

Kubernetes etcd must have the SSL Certificate Authority set (MEDIUM 242429)
Option etcd-cafile set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes etcd must have a certificate for communication (MEDIUM 242430)
Option etcd-certfile set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes etcd must have a key file for secure communication (MEDIUM 242431)
Option etcd-keyfile set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes Kubelet must enable kernel protection (HIGH 242434)
Option protectKernelDefaults set to allowed value.
aws
kind: node name: ip-IP-Address.eu-west-1.compute.internal
kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v
gcp
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r
openstack
kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb

The Kubernetes API server must have the ValidatingAdmissionWebhook enabled (HIGH 242436)
Option enable-admission-plugins set to allowed value.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes API Server must configure timeouts to limit attack surface (MEDIUM 242438)
Option request-timeout has not been set.
aws
details: defaults to 1m0s kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
details: defaults to 1m0s kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
details: defaults to 1m0s kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
details: defaults to 1m0s kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack

Kubernetes must remove old components after updated versions have been installed (MEDIUM 242442)
All found images use current versions.
aws azure gcp openstack The Kubernetes component etcd must be owned by etcd (MEDIUM 242445) File has expected owners aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws 
containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: 
pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_31.3632059657/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: 
fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod 
name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/region, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/secretAccessKey, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: 
fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/accessKeyID, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/bucketName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_34.2074945830/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/bucketName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/storageAccount, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/storageKey, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, 
ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_30.2940324903/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod 
name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod 
name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_39.2305215472/serviceaccount.json, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_39.2305215472/bucketName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_39.3264256653/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, ownerUser: 0, ownerGroup: 0 
kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: 
etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialSecret, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/authURL, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/bucketName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/domainName, 
ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/region, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/tenantName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialID, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialName, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: 
backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_27.791977657/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0.tmp, 
ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/snap/db, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0.tmp, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/safe_guard, ownerUser: 65532, ownerGroup: 65532 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_26.760285163/etcd.conf.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/namespace, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/token, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack
The Kubernetes conf files must be owned by root (MEDIUM 242446)
File has expected owners
aws
containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_03_32.2849634808/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: 
/var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~configmap/kube-scheduler-config/..2024_07_25_13_03_32.3178977814/config.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_32.4108013154/kubeconfig, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_32.4108013154/token, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca/..2024_07_25_13_03_07.1368478840/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_03_07.736850249/id_rsa, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_23.915608683/kubeconfig, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_23.915608683/token, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_02_10.2226241370/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_02_10.933493267/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_02_10.3965564115/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/audit-policy-config/..2024_07_25_13_02_10.919451044/audit-policy.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_02_10.557863803/podsecurity.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_02_10.557863803/admission-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca/..2024_07_25_13_02_10.662489473/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-encryption-secret/..2024_07_25_13_02_10.226502613/encryption-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_02_10.2933211119/id_rsa, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_02_10.2581373418/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_02_10.978118440/bundle.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/egress-selection-config/..2024_07_25_13_02_10.2023717197/egress-selector-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/static-token/..2024_07_25_13_02_10.1624455993/static_tokens.csv, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/ca.crt, ownerUser: 0, 
ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_02_10.2158392424/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws
openstack
containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca/..2024_07_25_13_01_59.991544212/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: 
fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_01_59.3581293990/id_rsa, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_03.3923270535/kubeconfig, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_03.3923270535/token, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_02_16.1569774485/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: 
/var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~configmap/kube-scheduler-config/..2024_07_25_13_02_16.2132886517/config.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_19.2500005201/token, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_19.2500005201/kubeconfig, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_00_42.2442118241/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_00_42.3330985798/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_00_42.3182125229/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/audit-policy-config/..2024_07_25_13_00_42.2870882805/audit-policy.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_00_42.3675300062/podsecurity.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_00_42.3675300062/admission-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca/..2024_07_25_13_00_42.3474913291/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-encryption-secret/..2024_07_25_13_00_42.531503639/encryption-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_00_42.322496126/id_rsa, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_00_42.1762643519/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_00_42.1536609594/bundle.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/egress-selection-config/..2024_07_25_13_00_42.3637718223/egress-selector-configuration.yaml, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/static-token/..2024_07_25_13_00_42.2571933157/static_tokens.csv, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_00_42.232080975/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack The Kubernetes Kube Proxy kubeconfig must have file permissions set to 644 or more restrictive (MEDIUM 242447) File has expected permissions aws details: fileName: 
/var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, permissions: 644 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system
details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, permissions: 644 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system
azure
details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, permissions: 644 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, permissions: 644 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
gcp
details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, permissions: 644 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, permissions: 644 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
openstack
details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, permissions: 644 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system
details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, permissions: 644 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system
The Kubernetes Kube Proxy kubeconfig must be owned by root (MEDIUM 242448)
File has expected owners
aws
details: fileName: 
/var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system
details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system
azure
details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system
gcp
details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system
openstack
details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~configmap/kube-proxy-config/config.yaml, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system
details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~secret/kubeconfig/kubeconfig, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system
The Kubernetes Kubelet certificate authority file must have file permissions set to 644 or more restrictive (MEDIUM 242449)
File has expected permissions
aws
details: fileName: /var/lib/kubelet/ca.crt, permissions: 644 kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
details: fileName: /var/lib/kubelet/ca.crt, permissions: 644 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
gcp
details: fileName: /var/lib/kubelet/ca.crt, permissions: 644 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
openstack
details: fileName: /var/lib/kubelet/ca.crt, permissions: 644 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
The Kubernetes Kubelet certificate authority must be owned by root (MEDIUM 242450)
File has expected owners
aws
details: fileName: /var/lib/kubelet/ca.crt, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal
azure
details: fileName: /var/lib/kubelet/ca.crt, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw
gcp
details: fileName: /var/lib/kubelet/ca.crt, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54
openstack
details: fileName: /var/lib/kubelet/ca.crt, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs
The Kubernetes component PKI must be owned by root (MEDIUM 242451)
File has expected owners
aws
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019, ownerUser: 0, 
ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, ownerUser: 0, ownerGroup: 0 kind: 
pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: 
etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca/..2024_07_25_13_03_07.1368478840/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca/..2024_07_25_13_03_07.1368478840, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_03_32.2849634808/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: 
/var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_03_32.2849634808, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_02_10.2226241370/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_02_10.933493267/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_02_10.3965564115/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca/..2024_07_25_13_02_10.662489473/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_02_10.2581373418/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_02_10.978118440/bundle.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_02_10.2158392424/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_02_10.2581373418, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_02_10.2158392424, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_02_10.3965564115, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca/..2024_07_25_13_02_10.662489473, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_02_10.978118440, 
ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_02_10.2226241370, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_02_10.933493267, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address5-24.pem, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address5-26.pem, ownerUser: 0, ownerGroup: 0 kind: node name: 
ip-IP-Address.eu-west-1.compute.internal cluster: shoot details: fileName: /var/lib/kubelet/pki, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, ownerUser: 0, 
ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, ownerUser: 0, 
ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-02.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-00.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot details: 
fileName: /var/lib/kubelet/pki, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421, 
ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address3-43.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address3-45.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot details: fileName: /var/lib/kubelet/pki, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot 
containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_02_16.1569774485/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: 
fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_02_16.1569774485, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, ownerUser: 0, ownerGroup: 0 kind: 
pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, ownerUser: 0, 
ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803, ownerUser: 0, ownerGroup: 0 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca/..2024_07_25_13_01_59.991544212/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: 
kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca/..2024_07_25_13_01_59.991544212, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_00_42.2442118241/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_00_42.3330985798/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: 
fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_00_42.3182125229/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca/..2024_07_25_13_00_42.3474913291/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_00_42.1762643519/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_00_42.1536609594/bundle.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/ca.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.key, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_00_42.232080975/bundle.crt, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_00_42.2442118241, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_00_42.3330985798, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_00_42.3182125229, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_00_42.1536609594, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_00_42.232080975, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca/..2024_07_25_13_00_42.3474913291, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_00_42.1762643519, ownerUser: 0, ownerGroup: 65532 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-55.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-53.pem, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot details: fileName: /var/lib/kubelet/pki, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115/ca.crt, ownerUser: 0, ownerGroup: 0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115, ownerUser: 0, ownerGroup: 
0 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system The Kubernetes kubelet KubeConfig must have file permissions set to 644 or more restrictive (MEDIUM 242452) File has expected permissions aws details: fileName: /var/lib/kubelet/kubeconfig-real, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal details: fileName: /var/lib/kubelet/config/kubelet, permissions: 644 kind: node name: ip-IP-Address.eu-west-1.compute.internal azure details: fileName: /var/lib/kubelet/kubeconfig-real, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw details: fileName: /var/lib/kubelet/config/kubelet, permissions: 644 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp details: fileName: /var/lib/kubelet/kubeconfig-real, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 details: fileName: /var/lib/kubelet/config/kubelet, permissions: 644 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack details: fileName: /var/lib/kubelet/kubeconfig-real, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs details: fileName: /var/lib/kubelet/config/kubelet, permissions: 644 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs The Kubernetes kubelet KubeConfig file must be owned by root (MEDIUM 242453) File has expected owners aws details: fileName: /var/lib/kubelet/kubeconfig-real, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal details: fileName: /var/lib/kubelet/config/kubelet, ownerUser: 0, ownerGroup: 0 kind: node name: ip-IP-Address.eu-west-1.compute.internal azure details: fileName: /var/lib/kubelet/kubeconfig-real, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw details: fileName: /var/lib/kubelet/config/kubelet, ownerUser: 0, ownerGroup: 0 kind: node name: 
shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp details: fileName: /var/lib/kubelet/kubeconfig-real, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 details: fileName: /var/lib/kubelet/config/kubelet, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack details: fileName: /var/lib/kubelet/kubeconfig-real, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs details: fileName: /var/lib/kubelet/config/kubelet, ownerUser: 0, ownerGroup: 0 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs The Kubernetes etcd must have file permissions set to 644 or more restrictive (MEDIUM 242459) File has expected permissions aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-fd95950b-9370-4572-949e-1b89bffc322c/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: 
/var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/safe_guard, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~csi/pv-shoot--garden--aws-ha-eu1-35612ac2-a2b9-4090-a96e-9769ae4951b1/mount/safe_guard, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws azure containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~csi/pv-shoot--garden--az-ha-eu1-3c6cb2de-811b-4aba-a0cf-f1adf2e54dc7/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~csi/pv--e9f0c993-3a2f-4339-9fa0-3be12b6ba0ff/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-52fa48e7-f13b-4e8e-9c28-93e60a287d73/mount/safe_guard, permissions: 600 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/safe_guard, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/snap/db, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0.tmp, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/new.etcd/member/wal/0000000000000000-0000000000000000.wal, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~csi/pv-shoot--garden--cc-ha-eu1-41094dbc-7a38-4451-9f23-2f3a958aec41/mount/safe_guard, permissions: 600 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack
The Kubernetes admin.conf must have file permissions set to 644 or more restrictive (MEDIUM 242460)
File has expected permissions
aws
containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_03_32.2849634808/bundle.crt, permissions: 644 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.crt, permissions: 640 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.key, permissions: 640 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~configmap/kube-scheduler-config/..2024_07_25_13_03_32.3178977814/config.yaml, permissions: 644 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_32.4108013154/kubeconfig, permissions: 644 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-scheduler details: fileName:
/var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_32.4108013154/token, permissions: 644 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_02_10.2226241370/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_02_10.933493267/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_02_10.3965564115/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/audit-policy-config/..2024_07_25_13_02_10.919451044/audit-policy.yaml, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws 
containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_02_10.557863803/podsecurity.yaml, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_02_10.557863803/admission-configuration.yaml, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca/..2024_07_25_13_02_10.662489473/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-encryption-secret/..2024_07_25_13_02_10.226502613/encryption-configuration.yaml, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_02_10.2933211119/id_rsa, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_02_10.2581373418/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_02_10.978118440/bundle.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~configmap/egress-selection-config/..2024_07_25_13_02_10.2023717197/egress-selector-configuration.yaml, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/static-token/..2024_07_25_13_02_10.1624455993/static_tokens.csv, permissions: 644 kind: pod name: 
kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/ca.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.key, permissions: 640 
kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_02_10.2158392424/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca/..2024_07_25_13_03_07.1368478840/bundle.crt, permissions: 644 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_03_07.736850249/id_rsa, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_23.915608683/kubeconfig, permissions: 644 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_03_23.915608683/token, permissions: 644 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws
openstack
containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca/..2024_07_25_13_01_59.991544212/bundle.crt, permissions: 644 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.crt, permissions: 640
kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_01_59.3581293990/id_rsa, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.crt, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: 
kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_03.3923270535/kubeconfig, permissions: 644 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_03.3923270535/token, permissions: 644 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_02_16.1569774485/bundle.crt, permissions: 644 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.crt, permissions: 640 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.key, permissions: 640 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~configmap/kube-scheduler-config/..2024_07_25_13_02_16.2132886517/config.yaml, permissions: 644 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: 
/var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_19.2500005201/token, permissions: 644 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/kubeconfig/..2024_07_25_13_02_19.2500005201/kubeconfig, permissions: 644 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_00_42.2442118241/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_00_42.3330985798/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_00_42.3182125229/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: 
shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/audit-policy-config/..2024_07_25_13_00_42.2870882805/audit-policy.yaml, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_00_42.3675300062/podsecurity.yaml, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/admission-config/..2024_07_25_13_00_42.3675300062/admission-configuration.yaml, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca/..2024_07_25_13_00_42.3474913291/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-encryption-secret/..2024_07_25_13_00_42.531503639/encryption-configuration.yaml, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key/..2024_07_25_13_00_42.322496126/id_rsa, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_00_42.1762643519/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_00_42.1536609594/bundle.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~configmap/egress-selection-config/..2024_07_25_13_00_42.3637718223/egress-selector-configuration.yaml, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.key, permissions: 
640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/static-token/..2024_07_25_13_00_42.2571933157/static_tokens.csv, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/ca.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_00_42.232080975/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack
The Kubernetes API Server audit logs must be enabled (MEDIUM 242461)
Option audit-policy-file set.
aws
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws
azure
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure
gcp
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp
openstack
kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack
The Kubernetes PKI CRT must have file permissions set to 644 or more restrictive (MEDIUM 242466)
File has expected permissions
aws
cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details:
fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca/..2024_07_25_13_03_07.1368478840/bundle.crt, permissions: 644 kind: pod name: 
kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_03_32.2849634808/bundle.crt, permissions: 644 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.crt, permissions: 640 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_02_10.2226241370/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver 
details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_02_10.933493267/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_02_10.3965564115/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca/..2024_07_25_13_02_10.662489473/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_02_10.2581373418/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.crt, permissions: 640 kind: 
pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/ca.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.crt, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_02_10.2158392424/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws 
cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address5-24.pem, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address5-26.pem, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/86529276-a42f-4936-b124-a9c8086e0817/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_05_26.2518867880/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-kkfk1-v1.28.10-jlnp7 namespace: kube-system azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-02.pem, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-00.pem, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot containerName: conntrack-fix details: fileName: 
/var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/c67ede99-8319-4733-8147-b982a812c98b/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_15_47.153294224/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-g7p4p-v1.28.10-rd228 namespace: kube-system gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address3-43.pem, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address3-45.pem, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/9c47266d-9ffc-404b-8ebd-3b875deb4702/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_03_47.2022085892/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-bex82-v1.28.10-vdtfc namespace: kube-system openstack cluster: seed containerName: kube-controller-manager details: fileName: 
/var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca/..2024_07_25_13_01_59.991544212/bundle.crt, permissions: 644 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.crt, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.crt, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~projected/client-ca/..2024_07_25_13_02_16.1569774485/bundle.crt, permissions: 644 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.crt, permissions: 640 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, permissions: 644 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, permissions: 644 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_00_42.2442118241/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_00_42.3330985798/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-front-proxy/..2024_07_25_13_00_42.3182125229/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.crt, 
permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca/..2024_07_25_13_00_42.3474913291/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-vpn/..2024_07_25_13_00_42.1762643519/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver 
details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/ca.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.crt, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/ca-etcd/..2024_07_25_13_00_42.232080975/bundle.crt, permissions: 644 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-55.pem, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-53.pem, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot containerName: kube-proxy details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system cluster: shoot containerName: conntrack-fix details: fileName: /var/lib/kubelet/pods/3a896a5b-121e-4002-b774-32b920cf61b3/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_13_09_26.539881115/ca.crt, permissions: 644 kind: pod name: kube-proxy-worker-dqty2-v1.28.10-p2ssj namespace: kube-system The Kubernetes PKI keys must have file permissions set to 600 or more restrictive (MEDIUM 242467) File has expected 
permissions aws cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/b4ab7c5a-7f34-4a9f-9a1a-c458680774ae/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_03_32.95238055/tls.key, permissions: 640 kind: pod name: kube-scheduler-7578c654bc-hkrb6 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_03_07.2977859912/ca.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/server/..2024_07_25_13_03_07.2872104760/tls.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/17b59f93-1234-4095-b237-047f69079654/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_03_07.424642478/ca.key, permissions: 640 kind: pod name: kube-controller-manager-744589d556-krzm2 namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/server/..2024_07_25_13_02_10.141438377/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_02_10.978118440/bundle.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_02_10.874163962/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_02_10.3397907710/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_02_10.3506294053/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_02_10.3094998726/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/d1f9c1d3-278c-44c0-b023-2b465e7f7f07/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_02_10.3837573115/tls.key, permissions: 640 kind: pod name: kube-apiserver-76d9c64f5b-8s7gv namespace: shoot--diki-comp--aws cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address5-24.pem, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address5-26.pem, permissions: 600 kind: node name: ip-IP-Address.eu-west-1.compute.internal azure cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-02.pem, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw cluster: shoot 
details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-00.pem, permissions: 600 kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw gcp cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address3-43.pem, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address3-45.pem, permissions: 600 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-client/..2024_07_25_13_01_59.3068992271/ca.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/server/..2024_07_25_13_01_59.311037195/tls.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-controller-manager details: fileName: /var/lib/kubelet/pods/fe9a8ddb-08d1-4b46-8936-78de420b80f8/volumes/kubernetes.io~secret/ca-kubelet/..2024_07_25_13_01_59.1987301483/ca.key, permissions: 640 kind: pod name: kube-controller-manager-699b9d5ddc-9dmsx namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-scheduler details: fileName: /var/lib/kubelet/pods/dd1157b0-0692-44ba-9df2-607e31628d92/volumes/kubernetes.io~secret/kube-scheduler-server/..2024_07_25_13_02_16.3362231041/tls.key, permissions: 640 kind: pod name: kube-scheduler-754b48d9b7-wm2xh namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/server/..2024_07_25_13_00_42.1009608694/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/service-account-key-bundle/..2024_07_25_13_00_42.1536609594/bundle.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kube-aggregator/..2024_07_25_13_00_42.3154059943/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/kubelet-client/..2024_07_25_13_00_42.1321475187/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/http-proxy/..2024_07_25_13_00_42.2684688169/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: /var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/tls-sni-0/..2024_07_25_13_00_42.2482923120/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: seed containerName: kube-apiserver details: fileName: 
/var/lib/kubelet/pods/48589daa-0b30-4755-b4e2-f0f91db6f456/volumes/kubernetes.io~secret/etcd-client/..2024_07_25_13_00_42.2512843323/tls.key, permissions: 640 kind: pod name: kube-apiserver-7fb7b9b4cd-7tkz9 namespace: shoot--diki-comp--openstack cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-server-2024-0IP-Address4-55.pem, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs cluster: shoot details: fileName: /var/lib/kubelet/pki/kubelet-client-2024-0IP-Address4-53.pem, permissions: 600 kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs Kubernetes Kubelet must not disable timeouts (MEDIUM 245541) Option streamingConnectionIdleTimeout set to allowed value. aws kind: node name: ip-IP-Address.eu-west-1.compute.internal kind: node name: ip-IP-Address.eu-west-1.compute.internal azure kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xr5mw kind: node name: shoot--diki-comp--azure-worker-g7p4p-z3-78697-xxs7v gcp kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-9ks54 kind: node name: shoot--diki-comp--gcp-worker-bex82-z1-7cf97-p9r2r openstack kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-7v2zs kind: node name: shoot--diki-comp--openstack-worker-dqty2-z1-65475-vw5jb Kubernetes API Server must disable basic authentication to protect information in transit (HIGH 245542) Option basic-auth-file has not been set. aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack Kubernetes endpoints must use approved organizational certificate and key pair to protect information in transit (HIGH 245544) Option kubelet-client-certificate set. 
aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack Option kubelet-client-key set. aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack Kubernetes must have a Pod Security Admission control file configured (HIGH 254800) PodSecurity is properly configured aws kind: PodSecurityConfiguration azure kind: PodSecurityConfiguration gcp kind: PodSecurityConfiguration openstack kind: PodSecurityConfiguration 🔵 Skipped The Kubernetes etcd must use TLS to protect the confidentiality of sensitive data during electronic dissemination (MEDIUM 242380) ETCD runs as a single instance, peer communication options are not used. aws kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws azure kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure gcp kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp openstack kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack The Kubernetes Scheduler must have secure binding (MEDIUM 242384) The Kubernetes Scheduler runs in a container which already has limited access to network interfaces. 
In addition ingress traffic to the Kubernetes Scheduler is restricted via network policies, making an unintended exposure less likely. aws azure gcp openstack The Kubernetes Controller Manager must have secure binding (MEDIUM 242385) The Kubernetes Controller Manager runs in a container which already has limited access to network interfaces. In addition ingress traffic to the Kubernetes Controller Manager is restricted via network policies, making an unintended exposure less likely. aws azure gcp openstack Kubernetes Kubectl cp command must give expected access and results (MEDIUM 242396) \"kubectl\" is not installed into control plane pods or worker nodes and Gardener does not offer Kubernetes v1.12 or older. aws azure gcp openstack Kubernetes DynamicAuditing must not be enabled (MEDIUM 242398) Option feature-gates.DynamicAuditing removed in Kubernetes v1.19. aws azure gcp openstack Kubernetes DynamicKubeletConfig must not be enabled (MEDIUM 242399) Option featureGates.DynamicKubeletConfig removed in Kubernetes v1.26. aws details: Used Kubernetes version 1.28.10. azure details: Used Kubernetes version 1.28.10. gcp details: Used Kubernetes version 1.28.10. openstack details: Used Kubernetes version 1.28.10. Kubernetes manifests must be owned by root (MEDIUM 242405) Gardener does not deploy any control plane component as systemd processes or static pod. aws azure gcp openstack The Kubernetes manifest files must have least privileges (MEDIUM 242408) Gardener does not deploy any control plane component as systemd processes or static pod. aws azure gcp openstack The Kubernetes API Server must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL) (MEDIUM 242410) Cannot be tested and should be enforced organizationally. Gardener uses a minimum of known and automatically opened/used/created ports/protocols/services (PPSM stands for Ports, Protocols, Service Management). 
aws azure gcp openstack The Kubernetes Scheduler must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL) (MEDIUM 242411) Cannot be tested and should be enforced organizationally. Gardener uses a minimum of known and automatically opened/used/created ports/protocols/services (PPSM stands for Ports, Protocols, Service Management). aws azure gcp openstack The Kubernetes Controllers must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL) (MEDIUM 242412) Cannot be tested and should be enforced organizationally. Gardener uses a minimum of known and automatically opened/used/created ports/protocols/services (PPSM stands for Ports, Protocols, Service Management). aws azure gcp openstack The Kubernetes etcd must enforce ports, protocols, and services (PPS) that adhere to the Ports, Protocols, and Services Management Category Assurance List (PPSM CAL) (MEDIUM 242413) Cannot be tested and should be enforced organizationally. Gardener uses a minimum of known and automatically opened/used/created ports/protocols/services (PPSM stands for Ports, Protocols, Service Management). aws azure gcp openstack Kubernetes etcd must enable client authentication to secure service (MEDIUM 242426) ETCD runs as a single instance, peer communication options are not used. 
aws kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws azure kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure gcp kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp openstack kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack Kubernetes etcd must have peer-cert-file set for secure communication (MEDIUM 242432) ETCD runs as a single instance, peer communication options are not used. aws kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws azure kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure gcp kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp openstack kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack Kubernetes etcd must have a peer-key-file set for secure communication (MEDIUM 242433) ETCD runs as a single instance, peer communication options are not used. 
aws kind: statefulSet name: etcd-main namespace: shoot--diki-comp--aws kind: statefulSet name: etcd-events namespace: shoot--diki-comp--aws azure kind: statefulSet name: etcd-main namespace: shoot--diki-comp--azure kind: statefulSet name: etcd-events namespace: shoot--diki-comp--azure gcp kind: statefulSet name: etcd-main namespace: shoot--diki-comp--gcp kind: statefulSet name: etcd-events namespace: shoot--diki-comp--gcp openstack kind: statefulSet name: etcd-main namespace: shoot--diki-comp--openstack kind: statefulSet name: etcd-events namespace: shoot--diki-comp--openstack Kubernetes must have a pod security policy set (HIGH 242437) PSPs are removed in K8s version 1.25. aws azure gcp openstack Kubernetes must contain the latest updates as authorized by IAVMs, CTOs, DTMs, and STIGs (MEDIUM 242443) Scanning/patching security vulnerabilities should be enforced organizationally. Security vulnerability scanning should be automated and maintainers should be informed automatically. aws azure gcp openstack Kubernetes component manifests must be owned by root (MEDIUM 242444) Rule is duplicate of \"242405\". aws azure gcp openstack Kubernetes kubeadm.conf must be owned by root (MEDIUM 242454) Gardener does not use \"kubeadm\" and also does not store any \"main config\" anywhere in seed or shoot (flow/component logic built-in/in-code). aws azure gcp openstack Kubernetes kubeadm.conf must have file permissions set to 644 or more restrictive (MEDIUM 242455) Gardener does not use \"kubeadm\" and also does not store any \"main config\" anywhere in seed or shoot (flow/component logic built-in/in-code). aws azure gcp openstack Kubernetes kubelet config must have file permissions set to 644 or more restrictive (MEDIUM 242456) Rule is duplicate of \"242452\". aws azure gcp openstack Kubernetes kubelet config must be owned by root (MEDIUM 242457) Rule is duplicate of \"242453\". 
aws azure gcp openstack Kubernetes API Server audit log path must be set (MEDIUM 242465) Rule is duplicate of \"242402\" aws azure gcp openstack Kubernetes must enable PodSecurity admission controller on static pods and Kubelets (HIGH 254801) Option featureGates.PodSecurity was made GA in v1.25 and removed in v1.28. aws azure gcp openstack 🔵 Accepted The Kubernetes API Server must have an audit log path set (MEDIUM 242402) Gardener can integrate with different audit logging solutions aws azure gcp openstack The Kubernetes API Server must generate audit records that identify what type of event has occurred, identify the source of the event, contain the event results, identify any users, and identify any containers associated with the event (MEDIUM 242403) Gardener can integrate with different audit logging solutions aws azure gcp openstack The Kubernetes cluster must use non-privileged host ports for user pods (MEDIUM 242414) node local dns requires port 53 in order to operate properly aws cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-6kjdw namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-6kjdw namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-ws9mx namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-ws9mx namespace: kube-system azure cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-s2lvs namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-s2lvs namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-zs2sb namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-zs2sb namespace: kube-system gcp cluster: shoot details: 
containerName: node-cache, port: 53 kind: pod name: node-local-dns-cl4xr namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-cl4xr namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-kz9nr namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-kz9nr namespace: kube-system openstack cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-jdng7 namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-jdng7 namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-r8z88 namespace: kube-system cluster: shoot details: containerName: node-cache, port: 53 kind: pod name: node-local-dns-r8z88 namespace: kube-system The Kubernetes API Server must be set to audit log max size (MEDIUM 242462) Gardener can integrate with different audit logging solutions aws azure gcp openstack The Kubernetes API Server must be set to audit log maximum backup (MEDIUM 242463) Gardener can integrate with different audit logging solutions aws azure gcp openstack The Kubernetes API Server audit log retention must be set (MEDIUM 242464) Gardener can integrate with different audit logging solutions aws azure gcp openstack Kubernetes API Server must disable token authentication to protect information in transit (HIGH 245543) All defined tokens are accepted. 
aws kind: deployment name: kube-apiserver namespace: shoot--diki-comp--aws azure kind: deployment name: kube-apiserver namespace: shoot--diki-comp--azure gcp kind: deployment name: kube-apiserver namespace: shoot--diki-comp--gcp openstack kind: deployment name: kube-apiserver namespace: shoot--diki-comp--openstack 🟠 Warning The Kubernetes component etcd must be owned by etcd (MEDIUM 242445) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. azure kind: referenceGroup name: etcd-events uid: f0537c21-2987-42a5-a15b-7cf16beff82f gcp kind: referenceGroup name: etcd-events uid: 223e03f1-a5ad-49da-b569-9e365eda153f The Kubernetes conf files must be owned by root (MEDIUM 242446) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. azure kind: referenceGroup name: kube-controller-manager-86f5fc4fc7 uid: 7fc87649-c1aa-4488-b276-446d96bc0e35 kind: referenceGroup name: kube-scheduler-9df464f49 uid: ef24775f-39b0-451e-bcd8-e577b834455b kind: referenceGroup name: kube-apiserver-86b5d6dbc4 uid: 1edbd5e8-2dc0-4081-b956-ac2faa06d320 gcp kind: referenceGroup name: kube-controller-manager-856b7c9889 uid: 60d0e948-ed0c-455a-8ce6-79099a09059d kind: referenceGroup name: kube-scheduler-5d4c7456bd uid: 1f098851-17d6-4bdd-b223-7ac36ff06508 kind: referenceGroup name: kube-apiserver-6f5746f87 uid: 886baf48-5fcd-4a34-9d81-3c3445552745 The Kubernetes component PKI must be owned by root (MEDIUM 242451) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. 
azure cluster: seed kind: referenceGroup name: etcd-events uid: f0537c21-2987-42a5-a15b-7cf16beff82f cluster: seed kind: referenceGroup name: kube-controller-manager-86f5fc4fc7 uid: 7fc87649-c1aa-4488-b276-446d96bc0e35 cluster: seed kind: referenceGroup name: kube-scheduler-9df464f49 uid: ef24775f-39b0-451e-bcd8-e577b834455b cluster: seed kind: referenceGroup name: kube-apiserver-86b5d6dbc4 uid: 1edbd5e8-2dc0-4081-b956-ac2faa06d320 gcp cluster: seed kind: referenceGroup name: etcd-events uid: 223e03f1-a5ad-49da-b569-9e365eda153f cluster: seed kind: referenceGroup name: kube-controller-manager-856b7c9889 uid: 60d0e948-ed0c-455a-8ce6-79099a09059d cluster: seed kind: referenceGroup name: kube-scheduler-5d4c7456bd uid: 1f098851-17d6-4bdd-b223-7ac36ff06508 cluster: seed kind: referenceGroup name: kube-apiserver-6f5746f87 uid: 886baf48-5fcd-4a34-9d81-3c3445552745 The Kubernetes etcd must have file permissions set to 644 or more restrictive (MEDIUM 242459) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. azure kind: referenceGroup name: etcd-events uid: f0537c21-2987-42a5-a15b-7cf16beff82f gcp kind: referenceGroup name: etcd-events uid: 223e03f1-a5ad-49da-b569-9e365eda153f The Kubernetes admin.conf must have file permissions set to 644 or more restrictive (MEDIUM 242460) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. 
azure kind: referenceGroup name: kube-controller-manager-86f5fc4fc7 uid: 7fc87649-c1aa-4488-b276-446d96bc0e35 kind: referenceGroup name: kube-scheduler-9df464f49 uid: ef24775f-39b0-451e-bcd8-e577b834455b kind: referenceGroup name: kube-apiserver-86b5d6dbc4 uid: 1edbd5e8-2dc0-4081-b956-ac2faa06d320 gcp kind: referenceGroup name: kube-controller-manager-856b7c9889 uid: 60d0e948-ed0c-455a-8ce6-79099a09059d kind: referenceGroup name: kube-scheduler-5d4c7456bd uid: 1f098851-17d6-4bdd-b223-7ac36ff06508 kind: referenceGroup name: kube-apiserver-6f5746f87 uid: 886baf48-5fcd-4a34-9d81-3c3445552745 The Kubernetes PKI CRT must have file permissions set to 644 or more restrictive (MEDIUM 242466) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. azure cluster: seed kind: referenceGroup name: kube-controller-manager-86f5fc4fc7 uid: 7fc87649-c1aa-4488-b276-446d96bc0e35 cluster: seed kind: referenceGroup name: kube-scheduler-9df464f49 uid: ef24775f-39b0-451e-bcd8-e577b834455b cluster: seed kind: referenceGroup name: etcd-events uid: f0537c21-2987-42a5-a15b-7cf16beff82f cluster: seed kind: referenceGroup name: kube-apiserver-86b5d6dbc4 uid: 1edbd5e8-2dc0-4081-b956-ac2faa06d320 gcp cluster: seed kind: referenceGroup name: etcd-events uid: 223e03f1-a5ad-49da-b569-9e365eda153f cluster: seed kind: referenceGroup name: kube-controller-manager-856b7c9889 uid: 60d0e948-ed0c-455a-8ce6-79099a09059d cluster: seed kind: referenceGroup name: kube-scheduler-5d4c7456bd uid: 1f098851-17d6-4bdd-b223-7ac36ff06508 cluster: seed kind: referenceGroup name: kube-apiserver-6f5746f87 uid: 886baf48-5fcd-4a34-9d81-3c3445552745 The Kubernetes PKI keys must have file permissions set to 600 or more restrictive (MEDIUM 242467) Reference group cannot be tested since all pods of the group are scheduled on a fully allocated node. 
azure cluster: seed kind: referenceGroup name: etcd-events uid: f0537c21-2987-42a5-a15b-7cf16beff82f cluster: seed kind: referenceGroup name: kube-controller-manager-86f5fc4fc7 uid: 7fc87649-c1aa-4488-b276-446d96bc0e35 cluster: seed kind: referenceGroup name: kube-scheduler-9df464f49 uid: ef24775f-39b0-451e-bcd8-e577b834455b cluster: seed kind: referenceGroup name: kube-apiserver-86b5d6dbc4 uid: 1edbd5e8-2dc0-4081-b956-ac2faa06d320 gcp cluster: seed kind: referenceGroup name: etcd-events uid: 223e03f1-a5ad-49da-b569-9e365eda153f cluster: seed kind: referenceGroup name: kube-controller-manager-856b7c9889 uid: 60d0e948-ed0c-455a-8ce6-79099a09059d cluster: seed kind: referenceGroup name: kube-scheduler-5d4c7456bd uid: 1f098851-17d6-4bdd-b223-7ac36ff06508 cluster: seed kind: referenceGroup name: kube-apiserver-6f5746f87 uid: 886baf48-5fcd-4a34-9d81-3c3445552745 🔴 Failed Secrets in Kubernetes must not be stored as environment variables (HIGH 242415) Pod uses environment to inject secret. gcp cluster: seed details: containerName: backup-restore, variableName: GOOGLE_STORAGE_API_ENDPOINT, keyRef: storageAPIEndpoint kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp The Kubernetes etcd must have file permissions set to 644 or more restrictive (MEDIUM 242459) File has too wide permissions aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/region, permissions: 644, 
expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/secretAccessKey, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/accessKeyID, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_34.1239384448/bucketName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_34.2074945830/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_34.1172303068/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_34.2099202019/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: 
/var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/token, permissions: 644, 
expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_31.3632059657/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_31.34789977/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore 
details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_31.2250314724/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/bucketName, permissions: 644, 
expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/storageAccount, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_30.69405982/storageKey, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_30.2940324903/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_30.20484171/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_30.1702802701/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: 
/var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/token, permissions: 644, 
expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_39.2305215472/serviceaccount.json, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_39.2305215472/bucketName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_39.3264256653/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_39.4173641049/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_39.72798489/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialSecret, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/authURL, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/bucketName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/domainName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/region, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/tenantName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialID, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/etcd-backup/..2024_07_25_12_59_27.2208747644/applicationCredentialName, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_27.791977657/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_27.2143070997/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_27.473498504/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~configmap/etcd-config-file/..2024_07_25_12_59_26.760285163/etcd.conf.yaml, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-ca-etcd/..2024_07_25_12_59_26.899830952/bundle.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/namespace, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/token, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~projected/kube-api-access-gardener/..2024_07_25_12_59_26.617148803/ca.crt, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack The Kubernetes PKI keys must have file permissions set to 600 or more restrictive (MEDIUM 242467) File has too wide permissions aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: 
shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_34.3978844949/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/a9d384ca-71b3-4ec6-af13-99948f8a9dc0/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_34.455155549/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_31.3506181544/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws cluster: seed containerName: 
etcd details: fileName: /var/lib/kubelet/pods/58d543b1-6f99-461c-8865-c8e7f8304f2f/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_31.1102049637/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--aws azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_30.317963596/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/888beae6-bc5d-4b09-9a47-9743329c77fa/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_30.162676357/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--azure gcp cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: etcd details: fileName: 
/var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_39.2209850753/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/c96eb421-2e7e-4751-bca0-11fc953bbd03/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_39.1006602421/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--gcp openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_27.3616440099/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: 
/var/lib/kubelet/pods/5850d039-ffb6-4474-8fdc-52125f861755/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_27.933106860/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-main-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: etcd details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-server-tls/..2024_07_25_12_59_26.2941679320/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack cluster: seed containerName: backup-restore details: fileName: /var/lib/kubelet/pods/02a926bc-4903-4806-9e8a-43fc0553d701/volumes/kubernetes.io~secret/client-url-etcd-client-tls/..2024_07_25_12_59_26.199464106/tls.key, permissions: 644, expectedPermissionsMax: 640 kind: pod name: etcd-events-0 namespace: shoot--diki-comp--openstack ","categories":"","description":"The latest compliance report generated against security hardened shoot clusters","excerpt":"The latest compliance report generated against security hardened shoot …","ref":"/docs/security-and-compliance/report/","tags":"","title":"Gardener Compliance 
Report"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/networking/","tags":"","title":"Networking"},{"body":" ","categories":"","description":"Red Hatter Jamie Duncan gives a technical overview of Kubernetes, an open source container orchestration system, in just five minutes.","excerpt":"Red Hatter Jamie Duncan gives a technical overview of Kubernetes, an …","ref":"/docs/resources/videos/why-kubernetes/","tags":"","title":"Why Kubernetes"},{"body":"Shared Responsibility Model Gardener, like most cloud providers’ Kubernetes offerings, is dedicated for a global setup. And just like how most cloud providers offer means to fulfil regional restrictions, Gardener also has some means built in for this purpose. Similarly, Gardener also follows a shared responsibility model where users are obliged to use the provided Gardener means in a way which results in compliance with regional restrictions.\nRegions Gardener users need to understand that Gardener is a generic tool and has no built-in knowledge about regions as geographical or political conglomerates. For Gardener, regions are only strings. To create regional restrictions is an obligation of all Gardener users who orchestrate existing Gardener functionality to reach evidence which can be audited later on.\nSupport for Regional Restrictions Gardener offers functionality to support the most important kind of regional restrictions in its global setup:\n No Restriction: All seeds in all regions can be allowed to host the control plane of all shoots. Restriction by Dedication: Shoots running in a region can be configured so that only dedicated seeds in dedicated regions are allowed to host the shoot’s control plane. This can be achieved by adding labels to a seed and subsequently restricting shoot control plane placement to appropriately labeled seeds by using the field spec.seedSelector (example). 
Restriction by Tainting: Some seeds running in some dedicated regions are not allowed to host the control plane of any shoots unless explicitly allowed. This can be achieved by tainting seeds appropriately (example) which in turn requires explicit tolerations if a shoot’s control plane should be placed on such tainted seeds (example). ","categories":"","description":"How Gardener supports regional restrictions","excerpt":"How Gardener supports regional restrictions","ref":"/docs/security-and-compliance/regional-restrictions/","tags":["task"],"title":"Regional Restrictions"},{"body":" ","categories":"","description":"In this talk Andrew Jessup walks through the essential elements of building a performant, secure and well factored micro-service in Go and how to deploy it to Google Container Engine.You'll also learn how to use Google Stackdriver to monitor, instrument, trace and even debug a production service in real time.","excerpt":"In this talk Andrew Jessup walks through the essential elements of …","ref":"/docs/resources/videos/microservices-in_kubernetes/","tags":"","title":"High Performance Microservices with Kubernetes, Go, and gRPC"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/monitoring-and-troubleshooting/","tags":"","title":"Monitor and Troubleshoot"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/guides/applications/","tags":"","title":"Applications"},{"body":" ","categories":"","description":"Sandeep Dinesh shows how you can build small containers to make your Kubernetes deployments faster and more secure.","excerpt":"Sandeep Dinesh shows how you can build small containers to make your …","ref":"/docs/resources/videos/small-container/","tags":"","title":"Building Small Containers"},{"body":"Contributing Bigger Changes Here are the guidelines you should follow when contributing larger changes to Gardener:\n Avoid proposing a big change in one single PR. 
Instead, split your work into multiple stages which are independently mergeable and create one PR for each stage. For example, if introducing a new API resource and its controller, these stages could be:\n API resource types, including defaults and generated code. API resource validation. API server storage. Admission plugin(s), if any. Controller(s), including changes to existing controllers. Split this phase further into different functional subsets if appropriate. If you realize later that changes to artifacts introduced in a previous stage are required, by all means make them and explain in the PR why they were needed.\n Consider splitting a big PR further into multiple commits to allow for more focused reviews. For example, you could add unit tests / documentation in separate commits from the rest of the code. If you have to adapt your PR to review feedback, prefer doing that also in a separate commit to make it easier for reviewers to check how their feedback has been addressed.\n To make the review process more efficient and avoid too many long discussions in the PR itself, ask for a “main reviewer” to be assigned to your change, then work with this person to make sure he or she understands it in detail, and agree together on any improvements that may be needed. If you can’t reach an agreement on certain topics, comment on the PR and invite other people to join the discussion.\n Even if you have a “main reviewer” assigned, you may still get feedback from other reviewers. In general, these “non-main reviewers” are advised to focus more on the design and overall approach rather than the implementation details. 
Make sure that you address any concerns on this level appropriately.\n ","categories":"","description":"","excerpt":"Contributing Bigger Changes Here are the guidelines you should follow …","ref":"/docs/contribute/code/contributing-bigger-changes/","tags":"","title":"Contributing Bigger Changes"},{"body":" ","categories":"","description":"In this episode of Kubernetes Best Practices, Sandeep Dinesh shows how to work with Namespaces and how they can help you manage your Kubernetes resources.","excerpt":"In this episode of Kubernetes Best Practices, Sandeep Dinesh shows how …","ref":"/docs/resources/videos/namespace/","tags":"","title":"Organizing with Namespaces"},{"body":" ","categories":"","description":"How to make your Kubernetes deployments more robust by using Liveness and Readiness probes.","excerpt":"How to make your Kubernetes deployments more robust by using Liveness …","ref":"/docs/resources/videos/livecheck-readiness/","tags":"","title":"Readiness != Liveness"},{"body":" ","categories":"","description":"Smoothly handling Google Container Engine and networking can take some practice. 
In this video, Tim Hockin and Michael Rubin discuss migrating applications to Container Engine, networking in Container Engine, use of overlays, segmenting traffic between pods and services, and the variety of options available to you.","excerpt":"Smoothly handling Google Container Engine and networking can take some …","ref":"/docs/resources/videos/in-out-networking/","tags":"","title":"The Ins and Outs of Networking"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2024/","tags":"","title":"2024"},{"body":"Overview Here you can find a variety of articles related to Gardener and keep up to date with the latest community calls, features, and highlights!\nHow to Contribute If you’d like to create a new blog post, simply follow the steps outlined in the Documentation Contribution Guide and add the topic to the corresponding folder.\n","categories":"","description":"","excerpt":"Overview Here you can find a variety of articles related to Gardener …","ref":"/blog/","tags":"","title":"Blogs"},{"body":"","categories":"","description":"","excerpt":"","ref":"/","tags":"","title":"Gardener"},{"body":"The Gardener community recently concluded its 5th Hackathon, a week-long event that brought together multiple companies to collaborate on common topics of interest. The Hackathon, held at Schlosshof Freizeitheim in Schelklingen, Germany, was a testament to the power of collective effort and open-source, producing a tremendous number of results in a short time and moving the Gardener project forward with innovative solutions.\nA Week of Collaboration and Innovation The Hackathon addressed a wide range of topics, from improving the maturity of the Gardener API to harmonizing development setups and automating additional preparation tasks for Gardener installations. 
The event also saw the introduction of new resources and configurations, the rewriting of VPN components from Bash to Golang, and the exploration of a Tailscale-based VPN to secure shoot clusters.\nKey Achievements 🗃️ OCI Helm Release Reference for ControllerDeployment: The Hackathon introduced the core.gardener.cloud/v1 API, which supports OCI repository-based Helm chart references. This innovation reduces operational complexity and enables reusability for other scenarios. 👨🏼‍💻 Local gardener-operator Development Setup with gardenlet: A new Skaffold configuration was created to harmonize the development setups for Gardener. This configuration deploys gardener-operator and its Garden CRD together with a deployment of gardenlet to register a seed cluster, allowing for a full-fledged Gardener setup. 👨🏻‍🌾 Extensions for Garden Cluster via gardener-operator: The Hackathon focused on automating additional preparation tasks for Gardener installations. The Garden controller was augmented to deploy extensions as part of its reconciliation flow, reducing operational complexity. 🪄 Gardenlet Self-Upgrades for Unmanaged Seeds: A new Gardenlet resource was introduced, allowing for the specification of deployment values and component configurations. A new controller within gardenlet watches these resources and updates the gardenlet’s Helm chart and configuration accordingly, effectively implementing self-upgrades. 🦺 Type-Safe Configurability in OperatingSystemConfig: The Hackathon improved the configurability of the OperatingSystemConfig for containerd, DNS, NTP, etc. The OperatingSystemConfig API was augmented to support containerd-config related use-cases. 👮 Expose Shoot API Server in Tailscale VPN: The Hackathon explored the use of a Tailscale-based VPN to secure shoot clusters. A document was compiled explaining how shoot owners can expose their API server within a Tailscale VPN. 
⌨️ Rewrite gardener/vpn2 from Bash to Golang: The Hackathon improved the VPN components by rewriting them in Golang. All functionality was successfully rewritten, and the pull requests have been opened for gardener/vpn2 and the integration into gardener/gardener. 🕳️ Pure IPv6-Based VPN Tunnel: The Hackathon addressed the restriction of the VPN network CIDR by switching the VPN tunnel to a pure IPv6-based network (follow-up of gardener/gardener#9597). This allows for more flexibility in network design. 👐 Harmonize Local VPN Setup with Real-World Scenario: The Hackathon aimed to align the local VPN setup with real-world scenarios regarding the VPN connection. provider-local was augmented to dynamically create Calico’s IPPool resources to emulate the real-world’s networking situation. 🐝 Support Cilium v1.15+ for HA Shoots: The Hackathon addressed the issue of Cilium v1.15+ not considering StatefulSet labels in NetworkPolicys. A prototype was developed to make the Service resources for vpn-seed-server headless. 🍞 Compression for ManagedResource Secrets: The Hackathon focused on reducing the size of Secret related to ManagedResources by leveraging the Brotli compression algorithm. This reduces network I/O and related costs, improving scalability and reducing load on the ETCD cluster. 🚛 Making Shoot Flux Extension Production-Ready: The Hackathon aimed to promote the Flux extension to “production-ready” status. Features such as reconciliation sync mode, and the option to provide additional Secret resources were added. 🧹 Move machine-controller-manager-provider-local Repository into gardener/gardener: The Hackathon focused on moving the machine-controller-manager-provider-local repository content into the gardener/gardener repository. This simplifies maintenance and development tasks. 🗄️ Stop Vendoring Third-Party Code in OS Extensions: The Hackathon aimed to avoid vendoring third-party code in the OS extensions. Two out of the four OS extensions have been adapted. 
📦 Consider Embedded Files for Local Image Builds: The Hackathon addressed the issue that changes to embedded files don’t lead to automatic rebuilds of the Gardener images by Skaffold for local development. The related hack script was augmented to detect embedded files and make them part of the list of dependencies. Note that a significant portion of the above topics has been built on top of the achievements of previous Hackathons. This continuity and progression, with each Hackathon building on the achievements of the last, is a testament to the power of sustained collaborative effort.\nLooking Ahead As we look towards the future, the Gardener community is already gearing up for the next Hackathon slated for the end of 2024. The anticipation is palpable, as these events have consistently proven to be a hotbed of creativity, innovation, and collaboration. The 5th Gardener Community Hackathon has once again demonstrated the remarkable outcomes that can be achieved when diverse minds unite to work on shared interests. The event has not only yielded an impressive array of results in a short span but has also sparked innovations that promise to propel the Gardener project to new heights. The community eagerly awaits the next Hackathon, ready to tackle new challenges and continue the journey of innovation and growth.\n","categories":"","description":"","excerpt":"The Gardener community recently concluded its 5th Hackathon, a …","ref":"/blog/2024/05-21-innovation-unleashed-a-deep-dive-into-the-5th-gardener-community-hackathon/","tags":"","title":"Innovation Unleashed: A Deep Dive into the 5th Gardener Community Hackathon"},{"body":"Use Cases In Kubernetes, on every Node the container runtime daemon pulls the container images that are configured in the Pods’ specifications running on the corresponding Node. 
Although these container images are cached on the Node’s file system after the initial pull operation, this setup has shortcomings.\nNew Nodes are often created due to events such as auto-scaling (scale up), rolling updates, or replacements of unhealthy Nodes. A new Node would need to pull the images running on it from the container registry because the Node’s cache is initially empty. Pulling an image from a registry incurs network traffic and registry costs.\nTo reduce network traffic and registry costs for your Shoot cluster, it is recommended to enable Gardener’s Registry Cache extension to run a registry as a pull-through cache in the Shoot cluster.\nThe use cases for a pull-through cache are not limited to cost savings. A pull-through cache also makes the Kubernetes cluster resilient to upstream registry failures, such as outages and failures due to rate limiting.\nSolution Gardener’s Registry Cache extension deploys and manages a pull-through cache registry in the Shoot cluster.\nA pull-through cache registry is a registry that caches container images in its storage. The first time an image is requested from the pull-through cache, it pulls the image from the upstream registry, returns it to the client, and stores it in its local storage. On subsequent requests for the same image, the pull-through cache serves the image from its storage, avoiding network traffic to the upstream registry.\nImagine that you have a DaemonSet in your Kubernetes cluster. In a cluster without a pull-through cache, every Node must pull the same container image from the upstream registry. 
In a cluster with a pull-through cache, the image is pulled once from the upstream registry and then served from the cache to all Nodes.\nA Shoot cluster setup with a registry cache for Docker Hub (docker.io).\nCost Considerations An image pull represents ingress traffic for a virtual machine (data is entering the system from outside) and egress traffic for the upstream registry (data is leaving the system).\nIngress traffic from the internet to a virtual machine is free of charge on AWS, GCP, and Azure. However, the cloud providers charge NAT gateway costs for inbound and outbound data processed by the NAT gateway based on the processed data volume (per GB). The container registry offerings on the cloud providers charge for egress traffic - again, based on the data volume (per GB).\nWith all of this in mind, the Registry Cache extension reduces NAT gateway costs for the Shoot cluster and container registry costs.\nTry It Out! We encourage you to try it! As a Gardener user, you can reduce your infrastructure costs and increase resilience by enabling the Registry Cache for your Shoot clusters. The Registry Cache extension is a great fit for long-running Shoot clusters that have a high image pull rate.\nFor more information, refer to the Registry Cache extension documentation!\n","categories":"","description":"","excerpt":"Use Cases In Kubernetes, on every Node the container runtime daemon …","ref":"/blog/2024/04-22-gardeners-registry-cache-extension-another-cost-saving-win-and-more/","tags":"","title":"Gardener's Registry Cache Extension: Another Cost Saving Win and More"},{"body":"With the rising popularity of WebAssembly (WASM) and WebAssembly System Interface (WASI) comes a variety of integration possibilities. WASM is now not only suitable for the browser, but can also be utilized for running workloads on the server. In this post we will explore how you can get started writing serverless applications powered by SpinKube on a Gardener Shoot cluster. 
This post is inspired by a similar tutorial that goes through the steps of Deploying the Spin Operator on Azure Kubernetes Service. Keep in mind that this post does not aim to define a production environment. It is meant to show that Gardener Shoot clusters are able to run WebAssembly workloads, giving users the chance to experiment and explore this cutting-edge technology.\nPrerequisites kubectl - the Kubernetes command line tool helm - the package manager for Kubernetes A running Gardener Shoot cluster Gardener Shoot Cluster For this showcase I am using a Gardener Shoot cluster on AWS infrastructure with nodes powered by Garden Linux, although the steps should be applicable for other infrastructures as well, since Gardener aims to provide a homogenous Kubernetes experience.\nAs a prerequisite for next steps, verify that you have access to your Gardener Shoot cluster.\n# Verify the access to the Gardener Shoot cluster kubectl get ns NAME STATUS AGE default Active 4m1s kube-node-lease Active 4m1s kube-public Active 4m1s kube-system Active 4m1s If you are having troubles accessing the Gardener Shoot cluster, please consult the Accessing Shoot Clusters documentation page.\nDeploy the Spin Operator As a first step, we will install the Spin Operator Custom Resource Definitions and the Runtime Class needed by wasmtime-spin-v2.\n# Install Spin Operator CRDs kubectl apply -f https://github.com/spinkube/spin-operator/releases/download/v0.1.0/spin-operator.crds.yaml # Install the Runtime Class kubectl apply -f https://github.com/spinkube/spin-operator/releases/download/v0.1.0/spin-operator.runtime-class.yaml Next, we will install cert-manager, which is required for provisioning TLS certificates used by the admission webhook of the Spin Operator. 
If you face issues installing cert-manager, please consult the cert-manager installation documentation.\n# Add and update the Jetstack repository helm repo add jetstack https://charts.jetstack.io helm repo update # Install the cert-manager chart alongside with CRDs needed by cert-manager helm install \\ cert-manager jetstack/cert-manager \\ --namespace cert-manager \\ --create-namespace \\ --version v1.14.4 \\ --set installCRDs=true In order to install the containerd-wasm-shim on the Kubernetes nodes we will use the kwasm-operator. There is also a successor of kwasm-operator - runtime-class-manager which aims to address some of the limitations of kwasm-operator and provide a production grade implementation for deploying containerd shims on Kubernetes nodes. Since kwasm-operator is easier to install, for the purpose of this post we will use it instead of the runtime-class-manager.\n# Add the kwasm helm repository helm repo add kwasm http://kwasm.sh/kwasm-operator/ helm repo update # Install KWasm operator helm install \\ kwasm-operator kwasm/kwasm-operator \\ --namespace kwasm \\ --create-namespace \\ --set kwasmOperator.installerImage=ghcr.io/spinkube/containerd-shim-spin/node-installer:v0.13.1 # Annotate all nodes in the cluster so kwasm can select them and provision the required containerd shim kubectl annotate node --all kwasm.sh/kwasm-node=true We can see that a pod has started and completed in the kwasm namespace.\nkubectl -n kwasm get pod NAME READY STATUS RESTARTS AGE ip-10-180-7-60.eu-west-1.compute.internal-provision-kwasm-qhr8r 0/1 Completed 0 8s kwasm-operator-6c76c5f94b-8zt4s 1/1 Running 0 15s The logs of the kwasm-operator also indicate that the node was provisioned with the required shim.\nkubectl -n kwasm logs kwasm-operator-6c76c5f94b-8zt4s {\"level\":\"info\",\"node\":\"ip-10-180-7-60.eu-west-1.compute.internal\",\"time\":\"2024-04-18T05:44:25Z\",\"message\":\"Trying to Deploy on ip-10-180-7-60.eu-west-1.compute.internal\"} 
{\"level\":\"info\",\"time\":\"2024-04-18T05:44:31Z\",\"message\":\"Job ip-10-180-7-60.eu-west-1.compute.internal-provision-kwasm is still Ongoing\"} {\"level\":\"info\",\"time\":\"2024-04-18T05:44:31Z\",\"message\":\"Job ip-10-180-7-60.eu-west-1.compute.internal-provision-kwasm is Completed. Happy WASMing\"} Finally we can deploy the spin-operator alongside with a shim-executor.\nhelm install spin-operator \\ --namespace spin-operator \\ --create-namespace \\ --version 0.1.0 \\ --wait \\ oci://ghcr.io/spinkube/charts/spin-operator kubectl apply -f https://github.com/spinkube/spin-operator/releases/download/v0.1.0/spin-operator.shim-executor.yaml Deploy a Spin App Let’s deploy a sample Spin application using the following command:\nkubectl apply -f https://raw.githubusercontent.com/spinkube/spin-operator/main/config/samples/simple.yaml After the CRD has been picked up by the spin-operator, a pod will be created running the sample application. Let’s explore its logs.\nkubectl logs simple-spinapp-56687588d9-nbrtq Serving http://0.0.0.0:80 Available Routes: hello: http://0.0.0.0:80/hello go-hello: http://0.0.0.0:80/go-hello We can see the available routes served by the application. Let’s port forward to the application service and test them out.\nkubectl port-forward services/simple-spinapp 8000:80 Forwarding from 127.0.0.1:8000 -\u003e 80 Forwarding from [::1]:8000 -\u003e 80 In another terminal, we can verify that the application returns a response.\ncurl http://localhost:8000/hello Hello world from Spin!% This sets the ground for further experimentation and testing. 
What the SpinApp CRD provides as capabilities and API can be explored through the SpinApp CRD reference.\nCleanup Let’s clean all deployed resources so far.\n# Delete the spin app and its executor kubectl delete spinapp simple-spinapp kubectl delete spinappexecutors.core.spinoperator.dev containerd-shim-spin # Uninstall the spin-operator chart helm -n spin-operator uninstall spin-operator # Remove the kwasm.sh/kwasm-node annotation from nodes kubectl annotate node --all kwasm.sh/kwasm-node- # Uninstall the kwasm-operator chart helm -n kwasm uninstall kwasm-operator # Uninstall the cert-manager chart helm -n cert-manager uninstall cert-manager # Delete the runtime class and SpinApp CRDs kubectl delete runtimeclass wasmtime-spin-v2 kubectl delete crd spinappexecutors.core.spinoperator.dev kubectl delete crd spinapps.core.spinoperator.dev Conclusion In my opinion, WASM on the server is here to stay. Communities are expressing more and more interest in integrating Kubernetes with WASM workloads. As shown Gardener clusters are perfectly capable of supporting this use case. This setup is a great way to start exploring the capabilities that WASM can bring to the server. As stated in the introduction, bear in mind that this post does not define a production environment, but is rather meant to define a playground suitable for exploring and trying out ideas.\n","categories":"","description":"","excerpt":"With the rising popularity of WebAssembly (WASM) and WebAssembly …","ref":"/blog/2024/04-18-spinkube-gardener-shoot-cluster/","tags":"","title":"SpinKube on Gardener - Serverless WASM on Kubernetes"},{"body":"KubeCon + CloudNativeCon Europe 2024, recently held in Paris, was a testament to the robustness of the open-source community and its pivotal role in driving advancements in AI and cloud-native technologies. 
With a record attendance of over 12,000 participants, the conference underscored the ubiquity of cloud-native architectures and the business opportunities they provide.\nAI Everywhere LLMs and GenAI took center stage at the event, with discussions on challenges such as security, data management, and energy consumption. A popular quote stated, “If #inference is the new web application, #kubernetes is the new web server”. The conference emphasized the need for more open data models for AI to democratize the technology. Cloud-native platforms offer advantages for AI innovation, such as packaging models and dependencies as Docker packages and enhancing resource management for proper model execution. The community is exploring AI workload management, including using CPUs for inferencing and preprocessing data before handing it over to GPUs. CNCF took the initiative and put together an AI whitepaper outlining the apparent synergy between cloud-native technologies and AI.\nCluster Autopilot The conference showcased popular projects in the cloud-native ecosystem, including Kubernetes, Istio, and OpenTelemetry. Kubernetes was highlighted as a platform for running massive AI workloads. The UXL Foundation aims to enable multi-vendor AI workloads on Kubernetes, allowing developers to move AI workloads without being locked into a specific infrastructure. Every vendor we interacted with has assembled an AI-powered chatbot, which performs various functions – from assessing cluster health through analyzing cost efficiency and proposing workload optimizations to troubleshooting issues and alerting for potential challenges with upcoming Kubernetes version upgrades. Sysdig went even further with a chatbot, which answers the popular question, “Do any of my products have critical CVEs in production?” and analyzes workloads’ structure and configuration. 
Some chatbots leveraged the k8sgpt project, which joined the CNCF sandbox earlier this year.\nSophisticated Fleet Management The ecosystem showcased maturity in observability, platform engineering, security, and optimization, which will help operationalize AI workloads. Data demands and costs were also in focus, touching on data observability and cloud-cost management. Cloud-native technologies, also going beyond Kubernetes, are expected to play a crucial role in managing the increasing volume of data and scaling AI. Google showcased fleet management in their Google Hosted Cloud offering (ex-Anthos). It allows for defining teams and policies at the fleet level, later applied to all the Kubernetes clusters in the fleet, irrespective of the infrastructure they run on (GCP and beyond).\nWASM Everywhere The conference also highlighted the growing interest in WebAssembly (WASM) as a portable binary instruction format for executable programs and its integration with Kubernetes and other functions. The topic here started with a dedicated WASM pre-conference day, the sessions of which are available in the following playlist. WASM is positioned as the smoother approach to software distribution and modularity, providing more lightweight runtime execution options and an easier way for app developers to enter.\nRust on the Rise Several talks were promoting Rust as an ideal programming language for cloud-native workloads. It was even promoted as suitable for writing Kubernetes controllers.\nInternal Developer Platforms The event showcased the importance of Internal Developer Platforms (IDPs), both commercial and open-source, in facilitating the development process across all types of organizations – from Allianz to Mercedes. Backstage leads the pack by a large margin, with all relevant sessions being full or at capacity. 
Much effort goes into the modularization of Backstage, which was also a notable highlight at the conference.\nSustainability Sustainability was a key theme, with discussions on the role of cloud-native technologies in promoting green practices. The KubeCost application folks put a lot of effort into emphasizing the large amount of wasted money, which hyperscalers benefit from. In parallel – the kube-green project emphasized optimizing your cluster footprint to minimize CO2 emissions. The conference also highlighted the importance of open source in creating a level playing field for multiple players to compete, fostering diverse participation, and solving global challenges.\nCustomer Stories In contrast to the Chicago KubeCon in 2023, the one in Paris outlined multiple case studies, best practices, and reference scenarios. Many enterprises and their IT teams were well represented at KubeCon - regarding sessions, sponsorships, and participation. These companies strive to excel forward, reaping the efficiency and flexibility benefits cloud-native architectures provide. We came across multiple companies using Gardener as their Kubernetes management underlay – including FUGA Cloud, STACKIT, and metal-stack Cloud. We eagerly anticipate more companies embracing Gardener at future events. The consistent feedback from these companies has been overwhelmingly positive—they absolutely love using Gardener and our shared excitement grows as the community thrives!\nNotable Talks Notable talks from leaders in the cloud-native world, including Solomon Hykes, Bob Wise, and representatives from KCP for Platforms and the United Nations, provided valuable insights into the future of AI and cloud-native technologies. All the talks are now uploaded to YouTube in the following playlist. 
Those do not include the various pre-conference days, available as separate playlists by CNCF.\nIn Conclusion… In conclusion, KubeCon 2024 showcased the intersection of AI and cloud-native technologies, the open-source community’s growth, and the cloud-native ecosystem’s maturity. Many enterprises are actively engaged there, innovating, trying, and growing their internal expertise. They’re using KubeCon as a recruiting event, expanding their internal talent pool and taking more of their internal operations and processes into their own hands. The event served as a platform for global collaboration, cross-company alignments, innovation, and the exchange of ideas, setting the stage for the future of cloud-native computing.\n","categories":"","description":"","excerpt":"KubeCon + CloudNativeCon Europe 2024, recently held in Paris, was a …","ref":"/blog/2024/04-05-kubecon-cloudnativecon-europe-2024-highlights/","tags":"","title":"KubeCon / CloudNativeCon Europe 2024 Highlights"},{"body":"","categories":"","description":"","excerpt":"","ref":"/docs/","tags":"","title":"Docs"},{"body":"Gardener Extension for certificate services \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nConfiguration Example configuration for this extension controller:\napiVersion: shoot-cert-service.extensions.config.gardener.cloud/v1alpha1 kind: Configuration issuerName: gardener restrictIssuer: true # restrict issuer to any sub-domain of shoot.spec.dns.domain (default) acme: email: john.doe@example.com server: https://acme-v02.api.letsencrypt.org/directory # privateKey: | # Optional key for Let's Encrypt account. # -----BEGIN RSA PRIVATE KEY----- # ... # -----END RSA PRIVATE KEY----- Extension-Resources Example extension resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: \"extension-certificate-service\" namespace: shoot--project--abc spec: type: shoot-cert-service When an extension resource is reconciled, the extension controller will create an instance of Cert-Management as well as an Issuer with the ACME information provided in the configuration above. These resources are placed inside the shoot namespace on the seed. Also, the controller takes care of generating the necessary RBAC resources for the seed as well as for the shoot.\nPlease note, this extension controller relies on the Gardener-Resource-Manager to deploy k8s resources to seed and shoot clusters, i.e. it never deploys them directly.\nHow to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for certificate services for shoot clusters","excerpt":"Gardener extension controller for certificate services for shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/","tags":"","title":"Certificate services"},{"body":"Changing alerting settings Certificates are normally renewed automatically 30 days before they expire. As a second line of defense, a Prometheus alert fires if a certificate is only a few days away from expiration. By default, the alert is triggered 15 days before expiration.\nYou can configure the days in the providerConfig of the extension. Setting it to 0 disables the alerting.\nIn this example, the days are changed to 3 days before expiration.\nkind: Shoot ... spec: extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 kind: CertConfig alerting: certExpirationAlertDays: 3 ","categories":"","description":"How to change the alerting on expiring certificates","excerpt":"How to change the alerting on expiring certificates","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/alerting/","tags":["task"],"title":"Changing alerting settings"},{"body":"Have you ever wondered how much more your Kubernetes cluster can scale before it breaks down?\nOf course, the answer is heavily dependent on your workloads. But be assured, any cluster will break eventually. Therefore, the best mitigation is to plan for sharding early and run multiple clusters instead of trying to optimize everything hoping to survive with a single cluster. 
Still, it is helpful to know when the time has come to scale out. This document aims at giving you the basic knowledge to keep a Gardener-managed Kubernetes cluster up and running while it scales according to your needs.\nWelcome to Planet Scale, Please Mind the Gap! For a complex, distributed system like Kubernetes it is impossible to give absolute thresholds for its scalability. Instead, the limit of a cluster’s scalability is a combination of various, interconnected dimensions.\nLet’s take a rather simple example of two dimensions - the number of Pods per Node and number of Nodes in a cluster. According to the scalability thresholds documentation, Kubernetes can scale up to 5000 Nodes and with default settings accommodate a maximum of 110 Pods on a single Node. Pushing only a single dimension towards its limit will likely harm the cluster. But if both are pushed simultaneously, any cluster will break way before reaching one dimension’s limit.\nWhat sounds rather straightforward in theory can be a bit trickier in reality. While 110 Pods is the default limit, we successfully pushed beyond that and in certain cases run up to 200 Pods per Node without breaking the cluster. This is possible in an environment where one knows and controls all workloads and cluster configurations. It still requires careful testing, though, and comes at the cost of limiting the scalability of other dimensions, like the number of Nodes.\nOf course, a Kubernetes cluster has a plethora of dimensions. 
Thus, when looking at a simple question like “How many resources can I store in ETCD?”, the only meaningful answer must be: “it depends”.\nThe following sections will help you to identify relevant dimensions and how they affect a Gardener-managed Kubernetes cluster’s scalability.\n“Official” Kubernetes Thresholds and Scalability Considerations To get started with the topic, please check the basic guidance provided by the Kubernetes community (specifically SIG Scalability):\n How we define scalability? Kubernetes Scalability Thresholds Furthermore, the problem space has been discussed in a KubeCon talk, the slides for which can be found here. You should at least read the slides before continuing.\nEssentially, it comes down to this:\n If you promise to:\n correctly configure your cluster use extensibility features “reasonably” keep the load in the cluster within recommended limits Then we promise that your cluster will function properly.\n With that knowledge in mind, let’s look at Gardener and eventually pick up the question about the number of objects in ETCD raised above.\nGardener-Specific Considerations The following considerations are based on experience with various large clusters that scaled in different dimensions. Just as explained above, pushing beyond even one of the limits is likely to cause issues at some point in time (but not guaranteed). Depending on the setup of your workloads, however, it might work unexpectedly well. Nevertheless, we urge you to make conscious decisions and to consider sharding your workloads instead. Please keep in mind - your workload affects the overall stability and scalability of a cluster significantly.\nETCD The following section is based on a setup where ETCD Pods run on a dedicated Node pool and each Node has at least 8 vCPU and 32GB memory.\nETCD has a practical space limit of 8 GB. 
It caps the number of objects one can technically have in a Kubernetes cluster.\nOf course, the number is heavily influenced by each object’s size, especially when considering that secrets and configmaps may store up to 1MB of data. Another dimension is a cluster’s churn rate. Since ETCD stores a history of the keyspace, a higher churn rate reduces the number of objects. Gardener runs compaction every 30min and defragmentation once per day during a cluster’s maintenance window to ensure proper ETCD operations. However, it is still possible to overload ETCD. If the space limit is reached, ETCD will only accept READ or DELETE requests, and manual interaction by a Gardener operator is needed to disarm the alarm once usage has dropped below the threshold.\nTo avoid such a situation, you can monitor the current ETCD usage via the “ETCD” dashboard of the monitoring stack. It gives you the current DB size, as well as historical data for the past 2 weeks. While there are improvements planned to trigger compaction and defragmentation based on DB size, an ETCD should not grow up to this threshold. A typical, healthy DB size is less than 3 GB.\nFurthermore, the dashboard has a panel called “Memory”, which indicates the memory usage of the etcd pod(s). Using more than 16GB memory is a clear red flag, and you should reduce the load on ETCD.\nAnother dimension you should be aware of is the object count in ETCD. You can check it via the “API Server” dashboard, which features an “ETCD Object Counts By Resource” panel. The overall number of objects (excluding events, as they are stored in a different etcd instance) should not exceed 100k for most use cases.\nKube API Server The following section is based on a setup where kube-apiserver instances run as Pods and are scheduled to Nodes with at least 8 vCPU and 32GB memory.\nGardener can scale the Deployment of a kube-apiserver horizontally and vertically. 
Horizontal scaling is limited to a certain number of replicas and should not concern a stakeholder much. However, the CPU / memory consumption of an individual kube-apiserver pod poses a potential threat to the overall availability of your cluster. The vertical scaling of any kube-apiserver is limited by the amount of resources available on a single Node. Outgrowing the resources of a Node will cause a downtime and render the cluster unavailable.\nIn general, continuous CPU usage of up to 3 cores and 16 GB memory per kube-apiserver pod is considered to be safe. This gives some room to absorb spikes, for example when the caches are initialized. You can check the resource consumption by selecting kube-apiserver Pods in the “Kubernetes Pods” dashboard. If these boundaries are exceeded constantly, you need to investigate and derive measures to lower the load.\nFurther information is also recorded and made available through the monitoring stack. The dashboard “API Server Request Duration and Response Size” provides insights into the request processing time of kube-apiserver Pods. Related information like request rates, dropped requests or termination codes (e.g., 429 for too many requests) can be obtained from the dashboards “API Server” and “Kubernetes API Server Details”. They provide a good indicator for how well the system is dealing with its current load.\nReducing the load on the API servers can become a challenge. To get started, you may try to:\n Use immutable secrets and configmaps where possible to save watches. This pays off, especially when you have a high number of Nodes or just lots of secrets in general. Applications interacting with the K8s API: If you know an object by its name, use it. Using label selector queries is expensive, as the filtering happens only within the kube-apiserver and not etcd, hence all resources must first pass completely from etcd to kube-apiserver. Use (single object) caches within your controllers. 
Check the “Use cache for ShootStates in Gardenlet” issue for an example. Nodes When talking about the scalability of a Kubernetes cluster, Nodes are probably mentioned in the first place… well, obviously not in this guide. While vanilla Kubernetes lists 5000 Nodes as its upper limit, pushing that dimension is not feasible. Most clusters should run with fewer than 300 Nodes. But of course, the actual limit depends on the workloads deployed and can be lower or higher. As you scale your cluster, be extra careful and closely monitor ETCD and kube-apiserver.\nThe scalability of Nodes is subject to a range of limiting factors. Some of them can only be defined upon cluster creation and remain immutable during a cluster’s lifetime. So let’s discuss the most important dimensions.\nCIDR:\nUpon cluster creation, you have to specify or use the default values for several network segments. There are dedicated CIDRs for services, Pods, and Nodes. Each defines a range of IP addresses available for the individual resource type. Obviously, the maximum number of Nodes is capped by the CIDR for Nodes. However, there is a second limiting factor, which is the pod CIDR combined with the nodeCIDRMaskSize. This mask is used to divide the pod CIDR into smaller subnets, where each block gets assigned to a node. With a /16 pod network and a /24 nodeCIDRMaskSize, a cluster can scale up to 256 Nodes. Please check Shoot Networking for details.\nEven though a /24 nodeCIDRMaskSize translates to a theoretical 256 pod IP addresses per Node, the maxPods setting should be less than 1/2 of this value. This gives the system some breathing room for churn and minimizes the risk of strange effects like mis-routed packets caused by immediate re-use of IPs.\nCloud provider capacity:\nMost of the time, Nodes in Kubernetes translate to virtual machines on a hyperscaler. 
An attempt to add more Nodes to a cluster might fail due to capacity issues resulting in an error message like this:\nCloud provider message - machine codes error: code = [Internal] message = [InsufficientInstanceCapacity: We currently do not have sufficient \u003cinstance type\u003e capacity in the Availability Zone you requested. Our system will be working on provisioning additional capacity. In heavily utilized regions, individual clusters are competing for scarce resources. So before choosing a region / zone, try to ensure that the hyperscaler supports your anticipated growth. This might be done through quota requests or by contacting the respective support teams. To mitigate such a situation, you may configure a worker pool with a different Node type and a corresponding priority expander as part of a shoot’s autoscaler section. Please consult the Autoscaler FAQ for more details.\nRolling of Node pools:\nThe overall number of Nodes is affecting the duration of a cluster’s maintenance. When upgrading a Node pool to a new OS image or Kubernetes version, all machines will be drained and deleted, and replaced with new ones. The more Nodes a cluster has, the longer this process will take, given that workloads are typically protected by PodDisruptionBudgets. Check Shoot Updates and Upgrades for details. Be sure to take this into consideration when planning maintenance.\nRoot disk:\nYou should be aware that the Node configuration impacts your workload’s performance too. Take the root disk of a Node, for example. While most hyperscalers offer the usage of HDD and SSD disks, it is strongly recommended to use SSD volumes as root disks. When there are lots of Pods on a Node or workloads making extensive use of emptyDir volumes, disk throttling becomes an issue. When a disk hits its IOPS limits, processes are stuck in IO-wait and slow down significantly. 
This can lead to a slow-down in the kubelet’s heartbeat mechanism and result in Nodes being replaced automatically, as they appear to be unhealthy. To analyze such a situation, you might have to run tools like iostat, sar or top directly on a Node.\nSwitching to an I/O optimized instance type (if offered for your infrastructure) can help to resolve the issue. Please keep in mind that disks used via PersistentVolumeClaims have I/O limits as well. Sometimes these limits are related to the size and/or can be increased for individual disks.\nCloud Provider (Infrastructure) Limits In addition to the already mentioned capacity restrictions, a cloud provider may impose other limitations to a Kubernetes cluster’s scalability. One category is the account quota defining the number of resources allowed globally or per region. Make sure to request appropriate values that suit your needs and contain a buffer, for example for having more Nodes during a rolling update.\nAnother dimension is the network throughput per VM or network interface. While you may be able to choose a network-optimized Node type for your workload to mitigate issues, you cannot influence the available bandwidth for control plane components. Therefore, please ensure that the traffic on the ETCD does not exceed 100MB/s. The ETCD dashboard provides data for monitoring this metric.\nIn some environments the upstream DNS might become an issue too and make your workloads subject to rate limiting. Given the heterogeneity of cloud providers incl. private data centers, it is not possible to give any thresholds. Still, the “CoreDNS” and “NodeLocalDNS” dashboards can help to derive a workload’s usage pattern. Check the DNS autoscaling and NodeLocalDNS documentation for available configuration options.\nWebhooks While webhooks provide powerful means to manage a cluster, they are equally powerful in breaking a cluster upon a malfunction or unavailability. 
Imagine using a policy enforcing system like Kyverno or Open Policy Agent Gatekeeper. As part of the stack, both will deploy webhooks which are invoked for almost everything that happens in a cluster. Now, if this webhook gets either overloaded or is simply not available, the cluster will stop functioning properly.\nHence, you have to ensure proper sizing, quick processing time, and availability of the webhook serving Pods when deploying webhooks. Please consult Dynamic Admission Control (Availability and Timeouts sections) for details. You should also be aware of the time added to any request that has to go through a webhook, as the kube-apiserver sends the request for mutation / validation to another pod and waits for the response. The more resources are subject to an external webhook, the more likely this will become a bottleneck when having a high churn rate on resources. Within the Gardener monitoring stack, you can check the extra time per webhook via the “API Server (Admission Details)” dashboard, which has a panel for “Duration per Webhook”.\nIn Gardener, any webhook timeout should be less than 15 seconds. Due to the separation of Kubernetes data-plane (shoot) and control-plane (seed) in Gardener, the extra hop from kube-apiserver (control-plane) to webhook (data-plane) is more expensive. Please check Shoot Status for more details.\nCustom Resource Definitions Using Custom Resource Definitions (CRD) to extend a cluster’s API is a common Kubernetes pattern and so is writing an operator to act upon custom resources. Writing an efficient controller reduces the load on the kube-apiserver and allows for better scaling. As a starting point, you might want to read Gardener’s Kubernetes Clients Guide.\nAnother problematic dimension is the usage of conversion webhooks when having resources stored in different versions. Not only do they add latency (see Webhooks) but can also block the kube-controller-manager’s garbage collection. 
If a conversion webhook is unavailable, the garbage collector fails to list all resources and does not perform any cleanup. In order to avoid such a situation, it is highly recommended to use conversion webhooks only when necessary and complete the migration to a new version as soon as possible.\nConclusion As outlined by SIG Scalability, it is quite impossible to give limits or even recommendations fitting every individual use case. Instead, this guide outlines relevant dimensions and gives rather conservative recommendations based on usage patterns observed. By combining this information, it is possible to operate and scale a cluster in a stable manner.\nWhile going beyond is certainly possible for some dimensions, it significantly increases the risk of instability. Typically, limits on the control-plane are introduced by the availability of resources like CPU or memory on a single machine and can hardly be influenced by any user. Therefore, utilizing the existing resources efficiently is key. Other parameters are controlled by a user. In these cases, careful testing may reveal actual limits for a specific use case.\nPlease keep in mind that all aspects of a workload greatly influence the stability and scalability of a Kubernetes cluster.\n","categories":"","description":"Know the boundary conditions when scaling your workloads","excerpt":"Know the boundary conditions when scaling your workloads","ref":"/docs/guides/administer-shoots/scalability/","tags":"","title":"Scalability of Gardener Managed Kubernetes Clusters"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2023/","tags":"","title":"2023"},{"body":"Developing highly available workload that can tolerate a zone outage is no trivial task. In this blog, we will explore various recommendations to get closer to that goal. 
While many recommendations are general enough, the examples are specific in how to achieve this in a Gardener-managed cluster and where/how to tweak the different control plane components. If you do not use Gardener, it may still be a worthwhile read as most settings can be influenced with most of the Kubernetes providers.\nFirst however, what is a zone outage? It sounds like a clear-cut “thing”, but it isn’t. There are many things that can go haywire. Here are some examples:\n Elevated cloud provider API error rates for individual or multiple services Network bandwidth reduced or latency increased, usually also affecting storage subsystems as they are network attached No networking at all, no DNS, machines shutting down or restarting, … Functional issues, of either the entire service (e.g., all block device operations) or only parts of it (e.g., LB listener registration) All services down, temporarily or permanently (the proverbial burning down data center 🔥) This and everything in between make it hard to prepare for such events, but you can still do a lot. The most important recommendation is to not target specific issues exclusively - tomorrow another service will fail in an unanticipated way. Also, focus more on meaningful availability than on internal signals (useful, but not as relevant as the former). Always prefer automation over manual intervention (e.g., leader election is a pretty robust mechanism, auto-scaling may be required as well, etc.).\nAlso remember that HA is costly - you need to balance it against the cost of an outage as silly as this may sound, e.g., running all this excess capacity “just in case” vs. “going down” vs. a risk-based approach in between where you have means that will kick in, but they are not guaranteed to work (e.g., if the cloud provider is out of resource capacity). 
Maybe some of your components must run at the highest possible availability level, but others not - that’s a decision only you can make.\nControl Plane The Kubernetes cluster control plane is managed by Gardener (as pods in separate infrastructure clusters to which you have no direct access) and can be set up with no failure tolerance (control plane pods will be recreated best-effort when resources are available) or one of the failure tolerance types node or zone.\nStrictly speaking, static workload does not depend on the (high) availability of the control plane, but static workload doesn’t rhyme with Cloud and Kubernetes, and it also means that when you possibly need it the most, e.g., during a zone outage, critical self-healing or auto-scaling functionality won’t be available to you and your workload if your control plane is down as well. That’s why it’s generally recommended to use the failure tolerance type zone for the control planes of productive clusters, at least in all regions that have 3+ zones. Regions that have only 1 or 2 zones don’t support the failure tolerance type zone and then your second best option is the failure tolerance type node, which means a zone outage can still take down your control plane, but individual node outages won’t.\nIn the shoot resource, this is all you need to add:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: zone # valid values are `node` and `zone` (only available if your control plane resides in a region with 3+ zones) This setting will scale out all control plane components for a Gardener cluster as necessary, so that no single zone outage can take down the control plane for longer than just a few seconds for the fail-over to take place (e.g., lease expiration and new leader election or readiness probe failure and endpoint removal). 
Components run highly available in either active-active (servers) or active-passive (controllers) mode at all times, the persistence (ETCD), which is consensus-based, will tolerate the loss of one zone and still maintain quorum and therefore remain operational. These are all patterns that we will revisit down below also for your own workload.\nWorker Pools Now that you have configured your Kubernetes cluster control plane in HA, i.e. spread it across multiple zones, you need to do the same for your own workload, but in order to do so, you need to spread your nodes across multiple zones first.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: provider: workers: - name: ... minimum: 6 maximum: 60 zones: - ... Prefer regions with at least 2, better 3+ zones and list the zones in the zones section for each of your worker pools. Whether you need 2 or 3 zones at a minimum depends on your fail-over concept:\n Consensus-based software components (like ETCD) depend on maintaining a quorum of (n/2)+1, so you need at least 3 zones to tolerate the outage of 1 zone. Primary/Secondary-based software components need just 2 zones to tolerate the outage of 1 zone. Then there are software components that can scale out horizontally. They are probably fine with 2 zones, but you also need to think about the load-shift and that the remaining zone must then pick up the work of the unhealthy zone. With 2 zones, the remaining zone must cope with an increase of 100% load. With 3 zones, the remaining zones must only cope with an increase of 50% load (per zone). In general, the question is also whether you have the fail-over capacity already up and running or not. If not, i.e. you depend on re-scheduling to a healthy zone or auto-scaling, be aware that during a zone outage, you will see a resource crunch in the healthy zones. If you have no automation, i.e. only human operators (a.k.a. 
“red button approach”), you probably will not get the machines you need and even with automation, it may be tricky. But holding the capacity available at all times is costly. In the end, that’s a decision only you can make. If you made that decision, please adapt the minimum and maximum settings for your worker pools accordingly.\nAlso, consider fall-back worker pools (with different/alternative machine types) and cluster autoscaler expanders using a priority-based strategy.\nGardener-managed clusters deploy the cluster autoscaler or CA for short and you can tweak the general CA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: clusterAutoscaler: expander: \"least-waste\" scanInterval: 10s scaleDownDelayAfterAdd: 60m scaleDownDelayAfterDelete: 0s scaleDownDelayAfterFailure: 3m scaleDownUnneededTime: 30m scaleDownUtilizationThreshold: 0.5 If you want to be ready for a sudden spike or have some buffer in general, over-provision nodes by means of “placeholder” pods with low priority and appropriate resource requests. This way, they will demand nodes to be provisioned for them, but if any pod comes up with a regular/higher priority, the low priority pods will be evicted to make space for the more important ones. Strictly speaking, this is not related to HA, but it may be important to keep this in mind as you generally want critical components to be rescheduled as fast as possible and if there is no node available, it may take 3 minutes or longer to do so (depending on the cloud provider). Besides, not only zones can fail, but also individual nodes.\nReplicas (Horizontal Scaling) Now let’s talk about your workload. In most cases, this will mean to run multiple replicas. If you cannot do that (a.k.a. you have a singleton), that’s a bad situation to be in. Maybe you can run a spare (secondary) as backup? 
If you cannot, you depend on quick detection and rescheduling of your singleton (more on that below).\nObviously, things get messier with persistence. If you have persistence, you should ideally replicate your data, i.e. let your spare (secondary) “follow” your main (primary). If your software doesn’t support that, you have to deploy other means, e.g., volume snapshotting or side-backups (specific to the software you deploy; keep the backups regional, so that you can switch to another zone at all times). If you have to do those, your HA scenario becomes more a DR scenario and terms like RPO and RTO become relevant to you:\n Recovery Point Objective (RPO): Potential data loss, i.e. how much data will you lose at most (time between backups) Recovery Time Objective (RTO): Time until recovery, i.e. how long does it take you to be operational again (time to restore) Also, keep in mind that your persistent volumes are usually zonal, i.e. once you have a volume in one zone, it’s bound to that zone and you cannot get up your pod in another zone w/o first recreating the volume yourself (Kubernetes won’t help you here directly).\nAnyway, best avoid that, if you can (from technical and cost perspective). The best solution (and also the most costly one) is to run multiple replicas in multiple zones and keep your data replicated at all times, so that your RPO is always 0 (best). That’s what we do for Gardener-managed cluster HA control planes (ETCD) as any data loss may be disastrous and lead to orphaned resources (in addition, we deploy side cars that do side-backups for disaster recovery, with full and incremental snapshots with an RPO of 5m).\nSo, how to run with multiple replicas? That’s the easiest part in Kubernetes and the two most important resources, Deployments and StatefulSet, support that out of the box:\napiVersion: apps/v1 kind: Deployment | StatefulSet spec: replicas: ... The problem comes with the number of replicas. 
It’s easy only if the number is static, e.g., 2 for active-active/passive or 3 for consensus-based software components, but what about software components that can scale out horizontally? Here you usually do not set the number of replicas statically, but make use of the horizontal pod autoscaler or HPA for short (built-in; part of the kube-controller-manager). There are also other options like the cluster proportional autoscaler, but while the former works based on metrics, the latter is more a guesstimate approach that derives the number of replicas from the number of nodes/cores in a cluster. Sometimes useful, but often blind to the actual demand.\nSo, HPA it is then for most of the cases. However, what is the resource (e.g., CPU or memory) that drives the number of desired replicas? Again, this is up to you, but CPU or memory are not always the best choices. In some cases, custom metrics may be more appropriate, e.g., requests per second (it was also for us).\nYou will have to create specific HorizontalPodAutoscaler resources for your scale target and can tweak the general HPA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: kubeControllerManager: horizontalPodAutoscaler: syncPeriod: 15s tolerance: 0.1 downscaleStabilization: 5m0s initialReadinessDelay: 30s cpuInitializationPeriod: 5m0s Resources (Vertical Scaling) While it is important to set a sufficient number of replicas, it is also important to give the pods sufficient resources (CPU and memory). This is especially true when you think about HA. When a zone goes down, you might need to get up replacement pods, if you don’t have them running already to take over the load from the impacted zone. Likewise, e.g., with active-active software components, you can expect the remaining pods to receive more load. If you cannot scale them out horizontally to serve the load, you will probably need to scale them out (or rather up) vertically. 
This is done by the vertical pod autoscaler or VPA for short (not built-in; part of the kubernetes/autoscaler repository).\nA few caveats though:\n You cannot use HPA and VPA on the same metrics as they would influence each other, which would lead to pod thrashing (more replicas require fewer resources; fewer resources require more replicas) Scaling horizontally doesn’t cause downtimes (at least not when out-scaling and only one replica is affected when in-scaling), but scaling vertically does (if the pod runs OOM anyway, but also when new recommendations are applied, resource requests for existing pods may be changed, which causes the pods to be rescheduled). Although the discussion has been going on for a very long time now, that is still not supported in-place (see KEP 1287, implementation in Kubernetes, implementation in VPA). VPA is a useful tool and Gardener-managed clusters deploy a VPA by default for you (HPA is supported anyway as it’s built into the kube-controller-manager). You will have to create specific VerticalPodAutoscaler resources for your scale target and can tweak the general VPA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: verticalPodAutoscaler: enabled: true evictAfterOOMThreshold: 10m0s evictionRateBurst: 1 evictionRateLimit: -1 evictionTolerance: 0.5 recommendationMarginFraction: 0.15 updaterInterval: 1m0s recommenderInterval: 1m0s While horizontal pod autoscaling is relatively straight-forward, it takes a long time to master vertical pod autoscaling. 
We saw performance issues, hard-coded behavior (on OOM, memory is bumped by +20% and it may take a few iterations to reach a good level), unintended pod disruptions by applying new resource requests (after 12h all targeted pods will receive new requests even though individually they would be fine without, which also drives active-passive resource consumption up), difficulties to deal with spiky workload in general (due to the algorithmic approach it takes), recommended requests may exceed node capacity, limit scaling is proportional and therefore often questionable, and more. VPA is a double-edged sword: useful and necessary, but not easy to handle.\nFor the Gardener-managed components, we mostly removed limits. Why?\n CPU limits have almost always only downsides. They cause needless CPU throttling, which is not even easily visible. CPU requests turn into cpu shares, so if the node has capacity, the pod may consume the freely available CPU, but not if you have set limits, which curtail the pod by means of cpu quota. There are only certain scenarios in which they may make sense, e.g., if you set requests=limits and thereby define a pod with guaranteed QoS, which influences your cgroup placement. However, that is difficult to do for the components you implement yourself and practically impossible for the components you just consume, because what’s the correct value for requests/limits and will it hold true also if the load increases and what happens if a zone goes down or with the next update/version of this component? If anything, CPU limits caused outages, not helped prevent them. As for memory limits, they are slightly more useful, because CPU is compressible and memory is not, so if one pod runs berserk, it may take others down (with CPU, cpu shares make it as fair as possible), depending on which OOM killer strikes (a complicated topic by itself). You don’t want the operating system OOM killer to strike as the result is unpredictable. 
Better, it’s the cgroup OOM killer or even the kubelet’s eviction (if the consumption is slow enough), as the latter even takes priorities into consideration. If your component is critical and a singleton (e.g., node daemon set pods), you are better off also without memory limits, because letting the pod go OOM because of artificial/wrong memory limits can mean that the node becomes unusable. Hence, such components are also better run with no or only a very high memory limit, so that you can eventually catch the occasional memory leak (bug). Under normal operation, if you cannot decide on a true upper limit, rather have no limits than cause endless outages through them, or have them strike when you need the pods the most (during a zone outage), when all your assumptions have gone out of the window. The downside of having poor or no limits and poor or no requests is that nodes may “die” more often. Contrary to the expectation, even for managed services, the managed service is not responsible for, and cannot guarantee, the health of a node under all circumstances, since the end user defines what is run on the nodes (shared responsibility). If the workload exhausts any resource, it will be the end of the node, e.g., by compressing the CPU too much (so that the kubelet fails to do its work), exhausting the main memory too fast, disk space, file handles, or any other resource.\nThe kubelet allows for explicit reservation of resources for operating system daemons (system-reserved) and Kubernetes daemons (kube-reserved) that are subtracted from the actual node resources; the remainder becomes the allocatable node resources for your workload/pods. All managed services configure these settings “by rule of thumb” (a balancing act), but cannot guarantee that the values won’t waste resources or will always be sufficient. You will have to fine-tune them eventually and adapt them to your needs. 
In addition, you can configure soft and hard eviction thresholds to give the kubelet some headroom to evict “greedy” pods in a controlled way. These settings can be configured for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: kubelet: systemReserved: # explicit resource reservation for operating system daemons cpu: 100m memory: 1Gi ephemeralStorage: 1Gi pid: 1000 kubeReserved: # explicit resource reservation for Kubernetes daemons cpu: 100m memory: 1Gi ephemeralStorage: 1Gi pid: 1000 evictionSoft: # soft, i.e. graceful eviction (used if the node is about to run out of resources, avoiding hard evictions) memoryAvailable: 200Mi imageFSAvailable: 10% imageFSInodesFree: 10% nodeFSAvailable: 10% nodeFSInodesFree: 10% evictionSoftGracePeriod: # caps pod's `terminationGracePeriodSeconds` value during soft evictions (specific grace periods) memoryAvailable: 1m30s imageFSAvailable: 1m30s imageFSInodesFree: 1m30s nodeFSAvailable: 1m30s nodeFSInodesFree: 1m30s evictionHard: # hard, i.e. 
immediate eviction (used if the node is out of resources, avoiding the OS generally run out of resources fail processes indiscriminately) memoryAvailable: 100Mi imageFSAvailable: 5% imageFSInodesFree: 5% nodeFSAvailable: 5% nodeFSInodesFree: 5% evictionMinimumReclaim: # additional resources to reclaim after hitting the hard eviction thresholds to not hit the same thresholds soon after again memoryAvailable: 0Mi imageFSAvailable: 0Mi imageFSInodesFree: 0Mi nodeFSAvailable: 0Mi nodeFSInodesFree: 0Mi evictionMaxPodGracePeriod: 90 # caps pod's `terminationGracePeriodSeconds` value during soft evictions (general grace periods) evictionPressureTransitionPeriod: 5m0s # stabilization time window to avoid flapping of node eviction state You can tweak these settings also individually per worker pool (spec.provider.workers.kubernetes.kubelet...), which makes sense especially with different machine types (and also workload that you may want to schedule there).\nPhysical memory is not compressible, but you can overcome this issue to some degree (alpha since Kubernetes v1.22 in combination with the feature gate NodeSwap on the kubelet) with swap memory. You can read more in this introductory blog and the docs. 
If you choose to use it (still only alpha at the time of this writing) you may also want to consider the risks associated with swap memory:\n Reduced performance predictability Reduced performance up to page thrashing Reduced security as secrets, normally held only in memory, could be swapped out to disk That said, the various options mentioned above are only remotely related to HA and will not be further explored throughout this document, but just to remind you: if a zone goes down, load patterns will shift, existing pods will probably receive more load and will require more resources (especially because it is often practically impossible to set “proper” resource requests, which drive node allocation - limits are always ignored by the scheduler) or more pods will/must be placed on the existing and/or new nodes and then these settings, which are generally critical (especially if you switch on bin-packing for Gardener-managed clusters as a cost saving measure), will become even more critical during a zone outage.\nProbes Before we go down the rabbit hole even further and talk about how to spread your replicas, we need to talk about probes first, as they will become relevant later. Kubernetes supports three kinds of probes: startup, liveness, and readiness probes. If you are a visual thinker, also check out this slide deck by Tim Hockin (Kubernetes networking SIG chair).\nBasically, the startupProbe and the livenessProbe help you restart the container, if it’s unhealthy for whatever reason, by letting the kubelet that orchestrates your containers on a node know that it’s unhealthy. The former is a special case of the latter and only applied at the startup of your container, if you need to handle the startup phase differently (e.g., with very slow starting containers) from the rest of the lifetime of the container.\nNow, the readinessProbe helps you manage the ready status of your container and thereby pod (any container that is not ready turns the pod not ready). 
This again has impact on endpoints and pod disruption budgets:\n If the pod is not ready, the endpoint will be removed and the pod will not receive traffic anymore If the pod is not ready, the pod counts into the pod disruption budget and if the budget is exceeded, no further voluntary pod disruptions will be permitted for the remaining ready pods (e.g., no eviction, no voluntary horizontal or vertical scaling, if the pod runs on a node that is about to be drained or in draining, draining will be paused until the max drain timeout passes) As you can see, all of these probes are (also) related to HA (mostly the readinessProbe, but depending on your workload, you can also leverage livenessProbe and startupProbe into your HA strategy). If Kubernetes doesn’t know about the individual status of your container/pod, it won’t do anything for you (right away). That said, later/indirectly something might/will happen via the node status that can also be ready or not ready, which influences the pods and load balancer listener registration (a not ready node will not receive cluster traffic anymore), but this process is worker pool global and reacts delayed and also doesn’t discriminate between the containers/pods on a node.\nIn addition, Kubernetes also offers pod readiness gates to amend your pod readiness with additional custom conditions (normally, only the sum of the container readiness matters, but pod readiness gates additionally count into the overall pod readiness). This may be useful if you want to block (by means of pod disruption budgets that we will talk about next) the roll-out of your workload/nodes in case some (possibly external) condition fails.\nPod Disruption Budgets One of the most important resources that help you on your way to HA are pod disruption budgets or PDB for short. 
They tell Kubernetes how to deal with voluntary pod disruptions, e.g., during the deployment of your workload, when the nodes are rolled, or just in general when a pod shall be evicted/terminated. Basically, if the budget is reached, they block all voluntary pod disruptions (at least for a while until possibly other timeouts act or things happen that leave Kubernetes no choice anymore, e.g., the node is forcefully terminated). You should always define them for your workload.\nVery important to note is that they are based on the readinessProbe, i.e. even if all of your replicas are lively, but not enough of them are ready, this blocks voluntary pod disruptions, so they are very critical and useful. Here an example (you can specify either minAvailable or maxUnavailable in absolute numbers or as percentage):\napiVersion: policy/v1 kind: PodDisruptionBudget spec: maxUnavailable: 1 selector: matchLabels: ... And please do not specify a PDB of maxUnavailable being 0 or similar. That’s pointless, even detrimental, as it blocks then even useful operations, forces always the hard timeouts that are less graceful and it doesn’t make sense in the context of HA. You cannot “force” HA by preventing voluntary pod disruptions, you must work with the pod disruptions in a resilient way. Besides, PDBs are really only about voluntary pod disruptions - something bad can happen to a node/pod at any time and PDBs won’t make this reality go away for you.\nPDBs will not always work as expected and can also get in your way, e.g., if the PDB is violated or would be violated, it may possibly block whatever you are trying to do to salvage the situation, e.g., drain a node or deploy a patch version (if the PDB is or would be violated, not even unhealthy pods would be evicted as they could theoretically become healthy again, which Kubernetes doesn’t know). 
In order to overcome this issue, it is now possible (alpha since Kubernetes v1.26 in combination with the feature gate PDBUnhealthyPodEvictionPolicy on the API server) to configure the so-called unhealthy pod eviction policy. The default is still IfHealthyBudget as a change in default would have changed the behavior (as described above), but you can now also set AlwaysAllow at the PDB (spec.unhealthyPodEvictionPolicy). For more information, please check out this discussion, the PR and this document and balance the pros and cons for yourself. In short, the new AlwaysAllow option is probably the better choice in most of the cases while IfHealthyBudget is useful only if you have frequent temporary transitions or for special cases where you have already implemented controllers that depend on the old behavior.\nPod Topology Spread Constraints Pod topology spread constraints or PTSC for short (no official abbreviation exists, but we will use this in the following) are enormously helpful to distribute your replicas across multiple zones, nodes, or any other user-defined topology domain. They complement and improve on pod (anti-)affinities that still exist and can be used in combination.\nPTSCs are an improvement, because they allow for maxSkew and minDomains. You can steer the “level of tolerated imbalance” with maxSkew, e.g., you probably want that to be at least 1, so that you can perform a rolling update, but this all depends on your deployment (maxUnavailable and maxSurge), etc. Stateful sets are a bit different (maxUnavailable) as they are bound to volumes and depend on them, so there usually cannot be 2 pods requiring the same volume. 
minDomains is a hint to tell the scheduler how far to spread, e.g., if all nodes in one zone disappeared because of a zone outage, it may “appear” as if there are only 2 zones in a 3 zones cluster and the scheduling decisions may end up wrong, so a minDomains of 3 will tell the scheduler to spread to 3 zones before adding another replica in one zone. Be careful with this setting as it also means, if one zone is down the “spread” is already at least 1, if pods run in the other zones. This is useful where you have exactly as many replicas as you have zones and you do not want any imbalance. Imbalance is critical as if you end up with one, nobody is going to do the (active) re-balancing for you (unless you deploy and configure additional non-standard components such as the descheduler). So, for instance, if you have something like a DBMS that you want to spread across 2 zones (active-passive) or 3 zones (consensus-based), you better specify minDomains of 2 respectively 3 to force your replicas into at least that many zones before adding more replicas to another zone (if supported).\nAnyway, PTSCs are critical to have, but not perfect, so we saw (unsurprisingly, because that’s how the scheduler works), that the scheduler may block the deployment of new pods because it takes the decision pod-by-pod (see for instance #109364).\nPod Affinities and Anti-Affinities As said, you can combine PTSCs with pod affinities and/or anti-affinities. Especially inter-pod (anti-)affinities may be helpful to place pods apart, e.g., because they are fall-backs for each other or you do not want multiple potentially resource-hungry “best-effort” or “burstable” pods side-by-side (noisy neighbor problem), or together, e.g., because they form a unit and you want to reduce the failure domain, reduce the network latency, and reduce the costs.\nTopology Aware Hints While topology aware hints are not directly related to HA, they are very relevant in the HA context. 
Spreading your workload across multiple zones may increase network latency and cost significantly, if the traffic is not shaped. Topology aware hints (beta since Kubernetes v1.23, replacing the now deprecated topology aware traffic routing with topology keys) help to route the traffic within the originating zone, if possible. Basically, they tell kube-proxy how to setup your routing information, so that clients can talk to endpoints that are located within the same zone.\nBe aware however, that there are some limitations. Those are called safeguards and if they strike, the hints are off and traffic is routed again randomly. Especially controversial is the balancing limitation as there is the assumption, that the load that hits an endpoint is determined by the allocatable CPUs in that topology zone, but that’s not always, if even often, the case (see for instance #113731 and #110714). So, this limitation hits far too often and your hints are off, but then again, it’s about network latency and cost optimization first, so it’s better than nothing.\nNetworking We have talked about networking only to some small degree so far (readiness probes, pod disruption budgets, topology aware hints). The most important component is probably your ingress load balancer - everything else is managed by Kubernetes. AWS, Azure, GCP, and also OpenStack offer multi-zonal load balancers, so make use of them. 
In Azure and GCP, LBs are regional whereas in AWS and OpenStack, they need to be bound to a zone, which the cloud-controller-manager does by observing the zone labels at the nodes (please note that this behavior is not always working as expected, see #570 where the AWS cloud-controller-manager is not readjusting to newly observed zones).\nPlease be reminded that even if you use a service mesh like Istio, the off-the-shelf installation/configuration usually never comes with productive settings (to simplify first-time installation and improve first-time user experience) and you will have to fine-tune your installation/configuration, much like the rest of your workload.\nRelevant Cluster Settings Following now a summary/list of the more relevant settings you may like to tune for Gardener-managed clusters:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: zone # valid values are `node` and `zone` (only available if your control plane resides in a region with 3+ zones) kubernetes: kubeAPIServer: defaultNotReadyTolerationSeconds: 300 defaultUnreachableTolerationSeconds: 300 kubelet: ... kubeScheduler: featureGates: MinDomainsInPodTopologySpread: true kubeControllerManager: nodeMonitorPeriod: 10s nodeMonitorGracePeriod: 40s horizontalPodAutoscaler: syncPeriod: 15s tolerance: 0.1 downscaleStabilization: 5m0s initialReadinessDelay: 30s cpuInitializationPeriod: 5m0s verticalPodAutoscaler: enabled: true evictAfterOOMThreshold: 10m0s evictionRateBurst: 1 evictionRateLimit: -1 evictionTolerance: 0.5 recommendationMarginFraction: 0.15 updaterInterval: 1m0s recommenderInterval: 1m0s clusterAutoscaler: expander: \"least-waste\" scanInterval: 10s scaleDownDelayAfterAdd: 60m scaleDownDelayAfterDelete: 0s scaleDownDelayAfterFailure: 3m scaleDownUnneededTime: 30m scaleDownUtilizationThreshold: 0.5 provider: workers: - name: ... minimum: 6 maximum: 60 maxSurge: 3 maxUnavailable: 0 zones: - ... 
# list of zones you want your worker pool nodes to be spread across, see above kubernetes: kubelet: ... # similar to `kubelet` above (cluster-wide settings), but here per worker pool (pool-specific settings), see above machineControllerManager: # optional, it allows to configure the machine-controller settings. machineCreationTimeout: 20m machineHealthTimeout: 10m machineDrainTimeout: 60h systemComponents: coreDNS: autoscaling: mode: horizontal # valid values are `horizontal` (driven by CPU load) and `cluster-proportional` (driven by number of nodes/cores) On spec.controlPlane.highAvailability.failureTolerance.type If set, determines the degree of failure tolerance for your control plane. zone is preferred, but only available if your control plane resides in a region with 3+ zones. See above and the docs.\nOn spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds and defaultNotReadyTolerationSeconds This is a very interesting API server setting that lets Kubernetes decide how fast to evict pods from nodes whose status condition of type Ready is either Unknown (node status unknown, a.k.a unreachable) or False (kubelet not ready) (see node status conditions; please note that kubectl shows both values as NotReady which is a somewhat “simplified” visualization).\nYou can also override the cluster-wide API server settings individually per pod:\nspec: tolerations: - key: \"node.kubernetes.io/unreachable\" operator: \"Exists\" effect: \"NoExecute\" tolerationSeconds: 0 - key: \"node.kubernetes.io/not-ready\" operator: \"Exists\" effect: \"NoExecute\" tolerationSeconds: 0 This will evict pods on unreachable or not-ready nodes immediately, but be cautious: 0 is very aggressive and may lead to unnecessary disruptions. 
Again, you must decide for your own workload and balance out the pros and cons (e.g., long startup time).\nPlease note, these settings replace spec.kubernetes.kubeControllerManager.podEvictionTimeout that was deprecated with Kubernetes v1.26 (and acted as an upper bound).\nOn spec.kubernetes.kubeScheduler.featureGates.MinDomainsInPodTopologySpread Required to be enabled for minDomains to work with PTSCs (beta since Kubernetes v1.25, but off by default). See above and the docs. This tells the scheduler how many topology domains to expect (=zones in the context of this document).\nOn spec.kubernetes.kubeControllerManager.nodeMonitorPeriod and nodeMonitorGracePeriod This is another very interesting kube-controller-manager setting that can help you speed up or slow down how fast a node shall be considered Unknown (node status unknown, a.k.a unreachable) when the kubelet is not updating its status anymore (see node status conditions), which affects eviction (see spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds and defaultNotReadyTolerationSeconds above). The shorter the time window, the faster Kubernetes will act, but the higher the chance of flapping behavior and pod thrashing, so you may want to balance that out according to your needs, otherwise stick to the default which is a reasonable compromise.\nOn spec.kubernetes.kubeControllerManager.horizontalPodAutoscaler... This configures horizontal pod autoscaling in Gardener-managed clusters. See above and the docs for the detailed fields.\nOn spec.kubernetes.verticalPodAutoscaler... This configures vertical pod autoscaling in Gardener-managed clusters. See above and the docs for the detailed fields.\nOn spec.kubernetes.clusterAutoscaler... This configures node auto-scaling in Gardener-managed clusters. 
See above and the docs for the detailed fields, especially about expanders, which may become life-saving in case of a zone outage when a resource crunch is setting in and everybody rushes to get machines in the healthy zones.\nIn case of a zone outage, it may be interesting to understand how the cluster autoscaler will put a worker pool in one zone into “back-off”. Unfortunately, the official cluster autoscaler documentation does not explain these details, but you can find hints in the source code:\nIf a node fails to come up, the node group (worker pool in that zone) will go into “back-off”, at first 5m, then exponentially longer until the maximum of 30m is reached. The “back-off” is reset after 3 hours. This in turn means, that nodes must be first considered Unknown, which happens when spec.kubernetes.kubeControllerManager.nodeMonitorPeriod.nodeMonitorGracePeriod lapses. Then they must either remain in this state until spec.provider.workers.machineControllerManager.machineHealthTimeout lapses for them to be recreated, which will fail in the unhealthy zone, or spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds lapses for the pods to be evicted (usually faster than node replacements, depending on your configuration), which will trigger the cluster autoscaler to create more capacity, but very likely in the same zone as it tries to balance its node groups at first, which will also fail in the unhealthy zone. It will be considered failed only when maxNodeProvisionTime lapses (usually close to spec.provider.workers.machineControllerManager.machineCreationTimeout) and only then put the node group into “back-off” and not retry for 5m at first and then exponentially longer. It’s critical to keep that in mind and accommodate for it. 
If you already have capacity up and running, the reaction time is usually much faster with leases (whatever you set) or endpoints (spec.kubernetes.kubeControllerManager.nodeMonitorPeriod.nodeMonitorGracePeriod), but if you depend on new/fresh capacity, the above should inform you how long you will have to wait for it.\nOn spec.provider.workers.minimum, maximum, maxSurge, maxUnavailable, zones, and machineControllerManager Each worker pool in Gardener may be configured differently. Among many other settings like machine type, root disk, Kubernetes version, kubelet settings, and many more you can also specify the lower and upper bound for the number of machines (minimum and maximum), how many machines may be added additionally during a rolling update (maxSurge) and how many machines may be in termination/recreation during a rolling update (maxUnavailable), and of course across how many zones the nodes shall be spread (zones).\nInteresting is also the configuration for Gardener’s machine-controller-manager or MCM for short that provisions, monitors, terminates, replaces, or updates machines that back your nodes:\n The shorter machineCreationTimeout is, the faster MCM will retry creating a machine/node, if the process is stuck on the cloud provider side. It is set to useful/practical timeouts for the different cloud providers and you probably don’t want to change those (in the context of HA at least). Please align with the cluster autoscaler’s maxNodeProvisionTime. The shorter machineHealthTimeout is, the faster MCM will replace machines/nodes in case the kubelet isn’t reporting back, which translates to Unknown, or reports back with NotReady, or the node-problem-detector that Gardener deploys for you reports a non-recoverable issue/condition (e.g., read-only file system). If it is too short however, you risk node and pod thrashing, so be careful. 
The shorter machineDrainTimeout is, the faster you can get rid of machines/nodes that MCM decided to remove, but this puts a cap on the grace periods and PDBs. They are respected up until the drain timeout lapses - then the machine/node will be forcefully terminated, whether or not the pods are still in termination or not even terminated because of PDBs. Those PDBs will then be violated, so be careful here as well. Please align with the cluster autoscaler’s maxGracefulTerminationSeconds. Especially the last two settings may help you recover faster from cloud provider issues.\nOn spec.systemComponents.coreDNS.autoscaling DNS is critical, in general and also within a Kubernetes cluster. Gardener-managed clusters deploy CoreDNS, a graduated CNCF project. Gardener supports 2 auto-scaling modes for it, horizontal (using HPA based on CPU) and cluster-proportional (using cluster proportional autoscaler that scales the number of pods based on the number of nodes/cores, not to be confused with the cluster autoscaler that scales nodes based on their utilization). Check out the docs, especially the trade-offs why you would choose one over the other (cluster-proportional gives you more configuration options, if CPU-based horizontal scaling is insufficient for your needs). Consider also Gardener’s feature node-local DNS to decouple you further from the DNS pods and stabilize DNS. Again, that’s not strictly related to HA, but may become important during a zone outage, when load patterns shift and pods start to initialize/resolve DNS records more frequently in bulk.\nMore Caveats Unfortunately, there are a few more things of note when it comes to HA in a Kubernetes cluster that may be “surprising” and hard to mitigate:\n If the kubelet restarts, it will report all pods as NotReady on startup until it reruns its probes (#100277), which leads to temporary endpoint and load balancer target removal (#102367). This topic is somewhat controversial. 
Gardener uses rolling updates and a jitter to spread necessary kubelet restarts as well as possible. If a kube-proxy pod on a node turns NotReady, all load balancer traffic to all pods (on this node) under services with externalTrafficPolicy local will cease as the load balancer will then take this node out of serving. This topic is somewhat controversial as well. So, please remember that externalTrafficPolicy local not only has the disadvantage of imbalanced traffic spreading, but also a dependency on the kube-proxy pod that may and will be unavailable during updates. Gardener uses rolling updates to spread necessary kube-proxy updates as well as possible. These are just a few additional considerations. They may or may not affect you, but other intricacies may. It’s a reminder to be watchful as Kubernetes may have one or two relevant quirks that you need to consider (and will probably only find out over time and with extensive testing).\nMeaningful Availability Finally, let’s go back to where we started. We recommended measuring meaningful availability. For instance, in Gardener, we do not trust only internal signals, but track also whether Gardener or the control planes that it manages are externally available through the external DNS records and load balancers, SNI-routing Istio gateways, etc. (the same path all users must take). It’s a huge difference whether the API server’s internal readiness probe passes or the user can actually reach the API server and it does what it’s supposed to do. Most likely, you will be in a similar spot and can do the same.\nWhat you do with these signals is another matter. Maybe there are some actionable metrics and you can trigger some active fail-over, maybe you can only use it to improve your HA setup altogether. 
In our case, we also use it to deploy mitigations, e.g., via our dependency-watchdog that watches, for instance, Gardener-managed API servers and shuts down components like the controller managers to avert cascading knock-on effects (e.g., melt-down if the kubelets cannot reach the API server, but the controller managers can and start taking down nodes and pods).\nEither way, understanding how users perceive your service is key to the improvement process as a whole. Even if you are not struck by a zone outage, the measures above and tracking the meaningful availability will help you improve your service.\nThank you for your interest and we wish you no or a “successful” zone outage next time. 😊\nWant to know more about Gardener? The Gardener project is Open Source and hosted on GitHub.\nFeedback and contributions are always welcome!\nAll channels for getting in touch or learning about the project are listed on our landing page. We are cordially inviting interested parties to join our bi-weekly meetings.\n","categories":"","description":"","excerpt":"Developing highly available workload that can tolerate a zone outage …","ref":"/blog/2023/03-27-high-availability-and-zone-outage-toleration/","tags":"","title":"High Availability and Zone Outage Toleration"},{"body":"Implementing High Availability and Tolerating Zone Outages Developing highly available workload that can tolerate a zone outage is no trivial task. You will find here various recommendations to get closer to that goal. While many recommendations are general enough, the examples are specific in how to achieve this in a Gardener-managed cluster and where/how to tweak the different control plane components. If you do not use Gardener, it may be still a worthwhile read.\nFirst however, what is a zone outage? It sounds like a clear-cut “thing”, but it isn’t. There are many things that can go haywire. 
Here are some examples:\n Elevated cloud provider API error rates for individual or multiple services Network bandwidth reduced or latency increased, usually also affecting storage subsystems as they are network attached No networking at all, no DNS, machines shutting down or restarting, … Functional issues, of either the entire service (e.g. all block device operations) or only parts of it (e.g. LB listener registration) All services down, temporarily or permanently (the proverbial burning down data center 🔥) This and everything in between make it hard to prepare for such events, but you can still do a lot. The most important recommendation is to not target specific issues exclusively - tomorrow another service will fail in an unanticipated way. Also, focus more on meaningful availability than on internal signals (useful, but not as relevant as the former). Always prefer automation over manual intervention (e.g. leader election is a pretty robust mechanism, auto-scaling may be required as well, etc.).\nAlso remember that HA is costly - you need to balance it against the cost of an outage, as silly as this may sound, e.g. running all this excess capacity “just in case” vs. “going down” vs. a risk-based approach in between where you have means that will kick in, but they are not guaranteed to work (e.g. if the cloud provider is out of resource capacity). 
Maybe some of your components must run at the highest possible availability level, but others not - that’s a decision only you can make.\nControl Plane The Kubernetes cluster control plane is managed by Gardener (as pods in separate infrastructure clusters to which you have no direct access) and can be set up with no failure tolerance (control plane pods will be recreated best-effort when resources are available) or one of the failure tolerance types node or zone.\nStrictly speaking, static workload does not depend on the (high) availability of the control plane, but static workload doesn’t rhyme with Cloud and Kubernetes and also means that when you possibly need it the most, e.g. during a zone outage, critical self-healing or auto-scaling functionality won’t be available to you and your workload, if your control plane is down as well. That’s why, even though the resource consumption is significantly higher, we generally recommend using the failure tolerance type zone for the control planes of production clusters, at least in all regions that have 3+ zones. Regions that have only 1 or 2 zones don’t support the failure tolerance type zone and then your second best option is the failure tolerance type node, which means a zone outage can still take down your control plane, but individual node outages won’t.\nIn the shoot resource, this is all you need to add:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: zone # valid values are `node` and `zone` (only available if your control plane resides in a region with 3+ zones) This setting will scale out all control plane components for a Gardener cluster as necessary, so that no single zone outage can take down the control plane for longer than just a few seconds for the fail-over to take place (e.g. lease expiration and new leader election or readiness probe failure and endpoint removal). 
Components run highly available in either active-active (servers) or active-passive (controllers) mode at all times, the persistence (ETCD), which is consensus-based, will tolerate the loss of one zone and still maintain quorum and therefore remain operational. These are all patterns that we will revisit down below also for your own workload.\nWorker Pools Now that you have configured your Kubernetes cluster control plane in HA, i.e. spread it across multiple zones, you need to do the same for your own workload, but in order to do so, you need to spread your nodes across multiple zones first.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: provider: workers: - name: ... minimum: 6 maximum: 60 zones: - ... Prefer regions with at least 2, better 3+ zones and list the zones in the zones section for each of your worker pools. Whether you need 2 or 3 zones at a minimum depends on your fail-over concept:\n Consensus-based software components (like ETCD) depend on maintaining a quorum of (n/2)+1, so you need at least 3 zones to tolerate the outage of 1 zone. Primary/Secondary-based software components need just 2 zones to tolerate the outage of 1 zone. Then there are software components that can scale out horizontally. They are probably fine with 2 zones, but you also need to think about the load-shift and that the remaining zone must then pick up the work of the unhealthy zone. With 2 zones, the remaining zone must cope with an increase of 100% load. With 3 zones, the remaining zones must only cope with an increase of 50% load (per zone). In general, the question is also whether you have the fail-over capacity already up and running or not. If not, i.e. you depend on re-scheduling to a healthy zone or auto-scaling, be aware that during a zone outage, you will see a resource crunch in the healthy zones. If you have no automation, i.e. only human operators (a.k.a. 
“red button approach”), you probably will not get the machines you need and even with automation, it may be tricky. But holding the capacity available at all times is costly. In the end, that’s a decision only you can make. If you made that decision, please adapt the minimum, maximum, maxSurge and maxUnavailable settings for your worker pools accordingly (visit this section for more information).\nAlso, consider fall-back worker pools (with different/alternative machine types) and cluster autoscaler expanders using a priority-based strategy.\nGardener-managed clusters deploy the cluster autoscaler or CA for short and you can tweak the general CA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: clusterAutoscaler: expander: \"least-waste\" scanInterval: 10s scaleDownDelayAfterAdd: 60m scaleDownDelayAfterDelete: 0s scaleDownDelayAfterFailure: 3m scaleDownUnneededTime: 30m scaleDownUtilizationThreshold: 0.5 If you want to be ready for a sudden spike or have some buffer in general, over-provision nodes by means of “placeholder” pods with low priority and appropriate resource requests. This way, they will demand nodes to be provisioned for them, but if any pod comes up with a regular/higher priority, the low priority pods will be evicted to make space for the more important ones. Strictly speaking, this is not related to HA, but it may be important to keep this in mind as you generally want critical components to be rescheduled as fast as possible and if there is no node available, it may take 3 minutes or longer to do so (depending on the cloud provider). Besides, not only zones can fail, but also individual nodes.\nReplicas (Horizontal Scaling) Now let’s talk about your workload. In most cases, this will mean to run multiple replicas. If you cannot do that (a.k.a. you have a singleton), that’s a bad situation to be in. Maybe you can run a spare (secondary) as backup? 
If you cannot, you depend on quick detection and rescheduling of your singleton (more on that below).\nObviously, things get messier with persistence. If you have persistence, you should ideally replicate your data, i.e. let your spare (secondary) “follow” your main (primary). If your software doesn’t support that, you have to deploy other means, e.g. volume snapshotting or side-backups (specific to the software you deploy; keep the backups regional, so that you can switch to another zone at all times). If you have to do those, your HA scenario becomes more a DR scenario and terms like RPO and RTO become relevant to you:\n Recovery Point Objective (RPO): Potential data loss, i.e. how much data will you lose at most (time between backups) Recovery Time Objective (RTO): Time until recovery, i.e. how long does it take you to be operational again (time to restore) Also, keep in mind that your persistent volumes are usually zonal, i.e. once you have a volume in one zone, it’s bound to that zone and you cannot get up your pod in another zone w/o first recreating the volume yourself (Kubernetes won’t help you here directly).\nAnyway, best avoid that, if you can (from technical and cost perspective). The best solution (and also the most costly one) is to run multiple replicas in multiple zones and keep your data replicated at all times, so that your RPO is always 0 (best). That’s what we do for Gardener-managed cluster HA control planes (ETCD) as any data loss may be disastrous and lead to orphaned resources (in addition, we deploy side cars that do side-backups for disaster recovery, with full and incremental snapshots with an RPO of 5m).\nSo, how to run with multiple replicas? That’s the easiest part in Kubernetes and the two most important resources, Deployments and StatefulSet, support that out of the box:\napiVersion: apps/v1 kind: Deployment | StatefulSet spec: replicas: ... The problem comes with the number of replicas. It’s easy only if the number is static, e.g. 
2 for active-active/passive or 3 for consensus-based software components, but what about software components that can scale out horizontally? Here you usually do not set the number of replicas statically, but make use of the horizontal pod autoscaler or HPA for short (built-in; part of the kube-controller-manager). There are also other options like the cluster proportional autoscaler, but while the former works based on metrics, the latter is more a guesstimate approach that derives the number of replicas from the number of nodes/cores in a cluster. Sometimes useful, but often blind to the actual demand.\nSo, HPA it is then for most of the cases. However, what is the resource (e.g. CPU or memory) that drives the number of desired replicas? Again, this is up to you, but CPU or memory are not always the best choices. In some cases, custom metrics may be more appropriate, e.g. requests per second (that was the case for us as well).\nYou will have to create specific HorizontalPodAutoscaler resources for your scale target and can tweak the general HPA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: kubeControllerManager: horizontalPodAutoscaler: syncPeriod: 15s tolerance: 0.1 downscaleStabilization: 5m0s initialReadinessDelay: 30s cpuInitializationPeriod: 5m0s Resources (Vertical Scaling) While it is important to set a sufficient number of replicas, it is also important to give the pods sufficient resources (CPU and memory). This is especially true when you think about HA. When a zone goes down, you might need to bring up replacement pods, if you don’t have them running already to take over the load from the impacted zone. Likewise, e.g. with active-active software components, you can expect the remaining pods to receive more load. If you cannot scale them out horizontally to serve the load, you will probably need to scale them out (or rather up) vertically. 
This is done by the vertical pod autoscaler or VPA for short (not built-in; part of the kubernetes/autoscaler repository).\nA few caveats though:\n You cannot use HPA and VPA on the same metrics as they would influence each other, which would lead to pod thrashing (more replicas require fewer resources; fewer resources require more replicas) Scaling horizontally doesn’t cause downtimes (at least not when out-scaling and only one replica is affected when in-scaling), but scaling vertically does (if the pod runs OOM anyway, but also when new recommendations are applied, resource requests for existing pods may be changed, which causes the pods to be rescheduled). Although the discussion has been going on for a very long time now, changing requests in-place is still not supported (see KEP 1287, implementation in Kubernetes, implementation in VPA). VPA is a useful tool and Gardener-managed clusters deploy a VPA by default for you (HPA is supported anyway as it’s built into the kube-controller-manager). You will have to create specific VerticalPodAutoscaler resources for your scale target and can tweak the general VPA knobs for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: verticalPodAutoscaler: enabled: true evictAfterOOMThreshold: 10m0s evictionRateBurst: 1 evictionRateLimit: -1 evictionTolerance: 0.5 recommendationMarginFraction: 0.15 updaterInterval: 1m0s recommenderInterval: 1m0s While horizontal pod autoscaling is relatively straight-forward, it takes a long time to master vertical pod autoscaling. 
We saw performance issues, hard-coded behavior (on OOM, memory is bumped by +20% and it may take a few iterations to reach a good level), unintended pod disruptions by applying new resource requests (after 12h all targeted pods will receive new requests even though individually they would be fine without, which also drives active-passive resource consumption up), difficulties to deal with spiky workload in general (due to the algorithmic approach it takes), recommended requests may exceed node capacity, limit scaling is proportional and therefore often questionable, and more. VPA is a double-edged sword: useful and necessary, but not easy to handle.\nFor the Gardener-managed components, we mostly removed limits. Why?\n CPU limits have almost always only downsides. They cause needless CPU throttling, which is not even easily visible. CPU requests turn into cpu shares, so if the node has capacity, the pod may consume the freely available CPU, but not if you have set limits, which curtail the pod by means of cpu quota. There are only certain scenarios in which they may make sense, e.g. if you set requests=limits and thereby define a pod with guaranteed QoS, which influences your cgroup placement. However, that is difficult to do for the components you implement yourself and practically impossible for the components you just consume, because what’s the correct value for requests/limits and will it hold true also if the load increases and what happens if a zone goes down or with the next update/version of this component? If anything, CPU limits caused outages, not helped prevent them. As for memory limits, they are slightly more useful, because CPU is compressible and memory is not, so if one pod runs berserk, it may take others down (with CPU, cpu shares make it as fair as possible), depending on which OOM killer strikes (a complicated topic by itself). You don’t want the operating system OOM killer to strike as the result is unpredictable. 
Better, it’s the cgroup OOM killer or even the kubelet’s eviction, if the consumption is slow enough as it takes priorities into consideration even. If your component is critical and a singleton (e.g. node daemon set pods), you are better off also without memory limits, because letting the pod go OOM because of artificial/wrong memory limits can mean that the node becomes unusable. Hence, such components also better run only with no or a very high memory limit, so that you can catch the occasional memory leak (bug) eventually, but under normal operation, if you cannot decide about a true upper limit, rather not have limits and cause endless outages through them or when you need the pods the most (during a zone outage) where all your assumptions went out of the window. The downside of having poor or no limits and poor and no requests is that nodes may “die” more often. Contrary to the expectation, even for managed services, the managed service is not responsible or cannot guarantee the health of a node under all circumstances, since the end user defines what is run on the nodes (shared responsibility). If the workload exhausts any resource, it will be the end of the node, e.g. by compressing the CPU too much (so that the kubelet fails to do its work), exhausting the main memory too fast, disk space, file handles, or any other resource.\nThe kubelet allows for explicit reservation of resources for operating system daemons (system-reserved) and Kubernetes daemons (kube-reserved) that are subtracted from the actual node resources and become the allocatable node resources for your workload/pods. All managed services configure these settings “by rule of thumb” (a balancing act), but cannot guarantee that the values won’t waste resources or always will be sufficient. You will have to fine-tune them eventually and adapt them to your needs. 
In addition, you can configure soft and hard eviction thresholds to give the kubelet some headroom to evict “greedy” pods in a controlled way. These settings can be configured for Gardener-managed clusters like this:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: kubelet: kubeReserved: # explicit resource reservation for Kubernetes daemons cpu: 100m memory: 1Gi ephemeralStorage: 1Gi pid: 1000 evictionSoft: # soft, i.e. graceful eviction (used if the node is about to run out of resources, avoiding hard evictions) memoryAvailable: 200Mi imageFSAvailable: 10% imageFSInodesFree: 10% nodeFSAvailable: 10% nodeFSInodesFree: 10% evictionSoftGracePeriod: # caps pod's `terminationGracePeriodSeconds` value during soft evictions (specific grace periods) memoryAvailable: 1m30s imageFSAvailable: 1m30s imageFSInodesFree: 1m30s nodeFSAvailable: 1m30s nodeFSInodesFree: 1m30s evictionHard: # hard, i.e. immediate eviction (used if the node is out of resources, avoiding the OS generally run out of resources fail processes indiscriminately) memoryAvailable: 100Mi imageFSAvailable: 5% imageFSInodesFree: 5% nodeFSAvailable: 5% nodeFSInodesFree: 5% evictionMinimumReclaim: # additional resources to reclaim after hitting the hard eviction thresholds to not hit the same thresholds soon after again memoryAvailable: 0Mi imageFSAvailable: 0Mi imageFSInodesFree: 0Mi nodeFSAvailable: 0Mi nodeFSInodesFree: 0Mi evictionMaxPodGracePeriod: 90 # caps pod's `terminationGracePeriodSeconds` value during soft evictions (general grace periods) evictionPressureTransitionPeriod: 5m0s # stabilization time window to avoid flapping of node eviction state You can tweak these settings also individually per worker pool (spec.provider.workers.kubernetes.kubelet...), which makes sense especially with different machine types (and also workload that you may want to schedule there).\nPhysical memory is not compressible, but you can overcome this issue to some degree (alpha since Kubernetes v1.22 
in combination with the feature gate NodeSwap on the kubelet) with swap memory. You can read more in this introductory blog and the docs. If you choose to use it (still only alpha at the time of this writing), you may also want to consider the risks associated with swap memory:\n Reduced performance predictability Reduced performance up to page thrashing Reduced security as secrets, normally held only in memory, could be swapped out to disk That said, the various options mentioned above are only remotely related to HA and will not be further explored throughout this document, but just to remind you: if a zone goes down, load patterns will shift, existing pods will probably receive more load and will require more resources (especially because it is often practically impossible to set “proper” resource requests, which drive node allocation - limits are always ignored by the scheduler) or more pods will/must be placed on the existing and/or new nodes and then these settings, which are generally critical (especially if you switch on bin-packing for Gardener-managed clusters as a cost-saving measure), will become even more critical during a zone outage.\nProbes Before we go down the rabbit hole even further and talk about how to spread your replicas, we need to talk about probes first, as they will become relevant later. Kubernetes supports three kinds of probes: startup, liveness, and readiness probes. If you are a visual thinker, also check out this slide deck by Tim Hockin (Kubernetes networking SIG chair).\nBasically, the startupProbe and the livenessProbe help you restart the container, if it’s unhealthy for whatever reason, by letting the kubelet that orchestrates your containers on a node know that it’s unhealthy. The former is a special case of the latter and only applied at the startup of your container, if you need to handle the startup phase differently (e.g. 
with very slow starting containers) from the rest of the lifetime of the container.\nNow, the readinessProbe helps you manage the ready status of your container and thereby pod (any container that is not ready turns the pod not ready). This again has impact on endpoints and pod disruption budgets:\n If the pod is not ready, the endpoint will be removed and the pod will not receive traffic anymore If the pod is not ready, the pod counts into the pod disruption budget and if the budget is exceeded, no further voluntary pod disruptions will be permitted for the remaining ready pods (e.g. no eviction, no voluntary horizontal or vertical scaling, if the pod runs on a node that is about to be drained or in draining, draining will be paused until the max drain timeout passes) As you can see, all of these probes are (also) related to HA (mostly the readinessProbe, but depending on your workload, you can also leverage livenessProbe and startupProbe into your HA strategy). If Kubernetes doesn’t know about the individual status of your container/pod, it won’t do anything for you (right away). That said, later/indirectly something might/will happen via the node status that can also be ready or not ready, which influences the pods and load balancer listener registration (a not ready node will not receive cluster traffic anymore), but this process is worker pool global and reacts delayed and also doesn’t discriminate between the containers/pods on a node.\nIn addition, Kubernetes also offers pod readiness gates to amend your pod readiness with additional custom conditions (normally, only the sum of the container readiness matters, but pod readiness gates additionally count into the overall pod readiness). 
This may be useful if you want to block (by means of pod disruption budgets that we will talk about next) the roll-out of your workload/nodes in case some (possibly external) condition fails.\nPod Disruption Budgets One of the most important resources that help you on your way to HA are pod disruption budgets or PDB for short. They tell Kubernetes how to deal with voluntary pod disruptions, e.g. during the deployment of your workload, when the nodes are rolled, or just in general when a pod shall be evicted/terminated. Basically, if the budget is reached, they block all voluntary pod disruptions (at least for a while until possibly other timeouts act or things happen that leave Kubernetes no choice anymore, e.g. the node is forcefully terminated). You should always define them for your workload.\nVery important to note is that they are based on the readinessProbe, i.e. even if all of your replicas are lively, but not enough of them are ready, this blocks voluntary pod disruptions, so they are very critical and useful. Here an example (you can specify either minAvailable or maxUnavailable in absolute numbers or as percentage):\napiVersion: policy/v1 kind: PodDisruptionBudget spec: maxUnavailable: 1 selector: matchLabels: ... And please do not specify a PDB of maxUnavailable being 0 or similar. That’s pointless, even detrimental, as it blocks then even useful operations, forces always the hard timeouts that are less graceful and it doesn’t make sense in the context of HA. You cannot “force” HA by preventing voluntary pod disruptions, you must work with the pod disruptions in a resilient way. Besides, PDBs are really only about voluntary pod disruptions - something bad can happen to a node/pod at any time and PDBs won’t make this reality go away for you.\nPDBs will not always work as expected and can also get in your way, e.g. if the PDB is violated or would be violated, it may possibly block whatever you are trying to do to salvage the situation, e.g. 
drain a node or deploy a patch version (if the PDB is or would be violated, not even unhealthy pods would be evicted as they could theoretically become healthy again, which Kubernetes doesn’t know). In order to overcome this issue, it is now possible (alpha since Kubernetes v1.26 in combination with the feature gate PDBUnhealthyPodEvictionPolicy on the API server, beta and enabled by default since Kubernetes v1.27) to configure the so-called unhealthy pod eviction policy. The default is still IfHealthyBudget as a change in default would have changed the behavior (as described above), but you can now also set AlwaysAllow at the PDB (spec.unhealthyPodEvictionPolicy). For more information, please check out this discussion, the PR and this document and balance the pros and cons for yourself. In short, the new AlwaysAllow option is probably the better choice in most of the cases while IfHealthyBudget is useful only if you have frequent temporary transitions or for special cases where you have already implemented controllers that depend on the old behavior.\nPod Topology Spread Constraints Pod topology spread constraints or PTSC for short (no official abbreviation exists, but we will use this in the following) are enormously helpful to distribute your replicas across multiple zones, nodes, or any other user-defined topology domain. They complement and improve on pod (anti-)affinities that still exist and can be used in combination.\nPTSCs are an improvement, because they allow for maxSkew and minDomains. You can steer the “level of tolerated imbalance” with maxSkew, e.g. you probably want that to be at least 1, so that you can perform a rolling update, but this all depends on your deployment (maxUnavailable and maxSurge), etc. Stateful sets are a bit different (maxUnavailable) as they are bound to volumes and depend on them, so there usually cannot be 2 pods requiring the same volume. minDomains is a hint to tell the scheduler how far to spread, e.g. 
if all nodes in one zone disappeared because of a zone outage, it may “appear” as if there are only 2 zones in a 3 zones cluster and the scheduling decisions may end up wrong, so a minDomains of 3 will tell the scheduler to spread to 3 zones before adding another replica in one zone. Be careful with this setting as it also means, if one zone is down the “spread” is already at least 1, if pods run in the other zones. This is useful where you have exactly as many replicas as you have zones and you do not want any imbalance. Imbalance is critical as if you end up with one, nobody is going to do the (active) re-balancing for you (unless you deploy and configure additional non-standard components such as the descheduler). So, for instance, if you have something like a DBMS that you want to spread across 2 zones (active-passive) or 3 zones (consensus-based), you better specify minDomains of 2 respectively 3 to force your replicas into at least that many zones before adding more replicas to another zone (if supported).\nAnyway, PTSCs are critical to have, but not perfect, so we saw (unsurprisingly, because that’s how the scheduler works), that the scheduler may block the deployment of new pods because it takes the decision pod-by-pod (see for instance #109364).\nPod Affinities and Anti-Affinities As said, you can combine PTSCs with pod affinities and/or anti-affinities. Especially inter-pod (anti-)affinities may be helpful to place pods apart, e.g. because they are fall-backs for each other or you do not want multiple potentially resource-hungry “best-effort” or “burstable” pods side-by-side (noisy neighbor problem), or together, e.g. because they form a unit and you want to reduce the failure domain, reduce the network latency, and reduce the costs.\nTopology Aware Hints While topology aware hints are not directly related to HA, they are very relevant in the HA context. 
Spreading your workload across multiple zones may increase network latency and cost significantly, if the traffic is not shaped. Topology aware hints (beta since Kubernetes v1.23, replacing the now deprecated topology aware traffic routing with topology keys) help to route the traffic within the originating zone, if possible. Basically, they tell kube-proxy how to setup your routing information, so that clients can talk to endpoints that are located within the same zone.\nBe aware however, that there are some limitations. Those are called safeguards and if they strike, the hints are off and traffic is routed again randomly. Especially controversial is the balancing limitation as there is the assumption, that the load that hits an endpoint is determined by the allocatable CPUs in that topology zone, but that’s not always, if even often, the case (see for instance #113731 and #110714). So, this limitation hits far too often and your hints are off, but then again, it’s about network latency and cost optimization first, so it’s better than nothing.\nNetworking We have talked about networking only to some small degree so far (readiness probes, pod disruption budgets, topology aware hints). The most important component is probably your ingress load balancer - everything else is managed by Kubernetes. AWS, Azure, GCP, and also OpenStack offer multi-zonal load balancers, so make use of them. 
In Azure and GCP, LBs are regional whereas in AWS and OpenStack, they need to be bound to a zone, which the cloud-controller-manager does by observing the zone labels at the nodes (please note that this behavior is not always working as expected, see #570 where the AWS cloud-controller-manager is not readjusting to newly observed zones).\nPlease be reminded that even if you use a service mesh like Istio, the off-the-shelf installation/configuration usually never comes with productive settings (to simplify first-time installation and improve first-time user experience) and you will have to fine-tune your installation/configuration, much like the rest of your workload.\nRelevant Cluster Settings Following now a summary/list of the more relevant settings you may like to tune for Gardener-managed clusters:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: zone # valid values are `node` and `zone` (only available if your control plane resides in a region with 3+ zones) kubernetes: kubeAPIServer: defaultNotReadyTolerationSeconds: 300 defaultUnreachableTolerationSeconds: 300 kubelet: ... kubeScheduler: featureGates: MinDomainsInPodTopologySpread: true kubeControllerManager: nodeMonitorGracePeriod: 40s horizontalPodAutoscaler: syncPeriod: 15s tolerance: 0.1 downscaleStabilization: 5m0s initialReadinessDelay: 30s cpuInitializationPeriod: 5m0s verticalPodAutoscaler: enabled: true evictAfterOOMThreshold: 10m0s evictionRateBurst: 1 evictionRateLimit: -1 evictionTolerance: 0.5 recommendationMarginFraction: 0.15 updaterInterval: 1m0s recommenderInterval: 1m0s clusterAutoscaler: expander: \"least-waste\" scanInterval: 10s scaleDownDelayAfterAdd: 60m scaleDownDelayAfterDelete: 0s scaleDownDelayAfterFailure: 3m scaleDownUnneededTime: 30m scaleDownUtilizationThreshold: 0.5 provider: workers: - name: ... minimum: 6 maximum: 60 maxSurge: 3 maxUnavailable: 0 zones: - ... 
# list of zones you want your worker pool nodes to be spread across, see above kubernetes: kubelet: ... # similar to `kubelet` above (cluster-wide settings), but here per worker pool (pool-specific settings), see above machineControllerManager: # optional, it allows to configure the machine-controller settings. machineCreationTimeout: 20m machineHealthTimeout: 10m machineDrainTimeout: 60h systemComponents: coreDNS: autoscaling: mode: horizontal # valid values are `horizontal` (driven by CPU load) and `cluster-proportional` (driven by number of nodes/cores) On spec.controlPlane.highAvailability.failureTolerance.type If set, determines the degree of failure tolerance for your control plane. zone is preferred, but only available if your control plane resides in a region with 3+ zones. See above and the docs.\nOn spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds and defaultNotReadyTolerationSeconds This is a very interesting API server setting that lets Kubernetes decide how fast to evict pods from nodes whose status condition of type Ready is either Unknown (node status unknown, a.k.a unreachable) or False (kubelet not ready) (see node status conditions; please note that kubectl shows both values as NotReady which is a somewhat “simplified” visualization).\nYou can also override the cluster-wide API server settings individually per pod:\nspec: tolerations: - key: \"node.kubernetes.io/unreachable\" operator: \"Exists\" effect: \"NoExecute\" tolerationSeconds: 0 - key: \"node.kubernetes.io/not-ready\" operator: \"Exists\" effect: \"NoExecute\" tolerationSeconds: 0 This will evict pods on unreachable or not-ready nodes immediately, but be cautious: 0 is very aggressive and may lead to unnecessary disruptions. Again, you must decide for your own workload and balance out the pros and cons (e.g. 
long startup time).\nPlease note that these settings replace spec.kubernetes.kubeControllerManager.podEvictionTimeout that was deprecated with Kubernetes v1.26 (and acted as an upper bound).\nOn spec.kubernetes.kubeScheduler.featureGates.MinDomainsInPodTopologySpread Required to be enabled for minDomains to work with PTSCs (beta since Kubernetes v1.25, but off by default). See above and the docs. This tells the scheduler how many topology domains to expect (= zones in the context of this document).\nOn spec.kubernetes.kubeControllerManager.nodeMonitorGracePeriod This is another very interesting kube-controller-manager setting that can help you speed up or slow down how fast a node shall be considered Unknown (node status unknown, a.k.a. unreachable) when the kubelet is not updating its status anymore (see node status conditions), which affects eviction (see spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds and defaultNotReadyTolerationSeconds above). The shorter the time window, the faster Kubernetes will act, but the higher the chance of flapping behavior and pod thrashing, so you may want to balance that out according to your needs; otherwise, stick to the default, which is a reasonable compromise.\nOn spec.kubernetes.kubeControllerManager.horizontalPodAutoscaler... This configures horizontal pod autoscaling in Gardener-managed clusters. See above and the docs for the detailed fields.\nOn spec.kubernetes.verticalPodAutoscaler... This configures vertical pod autoscaling in Gardener-managed clusters. See above and the docs for the detailed fields.\nOn spec.kubernetes.clusterAutoscaler... This configures node auto-scaling in Gardener-managed clusters. 
See above and the docs for the detailed fields, especially about expanders, which may become life-saving in case of a zone outage when a resource crunch is setting in and everybody rushes to get machines in the healthy zones.\nIn case of a zone outage, it is critical to understand how the cluster autoscaler will put a worker pool in one zone into “back-off” and what the consequences for your workload will be. Unfortunately, the official cluster autoscaler documentation does not explain these details, but you can find hints in the source code:\nIf a node fails to come up, the node group (worker pool in that zone) will go into “back-off”, at first 5m, then exponentially longer until the maximum of 30m is reached. The “back-off” is reset after 3 hours. This in turn means, that nodes must be first considered Unknown, which happens when spec.kubernetes.kubeControllerManager.nodeMonitorGracePeriod lapses (e.g. at the beginning of a zone outage). Then they must either remain in this state until spec.provider.workers.machineControllerManager.machineHealthTimeout lapses for them to be recreated, which will fail in the unhealthy zone, or spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds lapses for the pods to be evicted (usually faster than node replacements, depending on your configuration), which will trigger the cluster autoscaler to create more capacity, but very likely in the same zone as it tries to balance its node groups at first, which will fail in the unhealthy zone. It will be considered failed only when maxNodeProvisionTime lapses (usually close to spec.provider.workers.machineControllerManager.machineCreationTimeout) and only then put the node group into “back-off” and not retry for 5m (at first and then exponentially longer). 
Only then can you expect new node capacity to be brought up somewhere else.\nDuring the time of ongoing node provisioning (before a node group goes into “back-off”), the cluster autoscaler may have “virtually scheduled” pending pods onto those new upcoming nodes and will not reevaluate these pods anymore unless the node provisioning fails (which will fail during a zone outage, but the cluster autoscaler cannot know that and will therefore reevaluate its decision only after it has given up on the new nodes).\nIt’s critical to keep that in mind and accommodate it. If you already have capacity up and running, the reaction time is usually much faster with leases (whatever you set) or endpoints (spec.kubernetes.kubeControllerManager.nodeMonitorGracePeriod), but if you depend on new/fresh capacity, the above should inform you how long you will have to wait for it and for how long pods might be pending (because capacity is generally missing and pending pods may have been “virtually scheduled” to new nodes that won’t come up until the node group eventually goes into “back-off” and nodes in the healthy zones come up).\nOn spec.provider.workers.minimum, maximum, maxSurge, maxUnavailable, zones, and machineControllerManager Each worker pool in Gardener may be configured differently. Among many other settings like machine type, root disk, Kubernetes version, and kubelet settings, you can also specify the lower and upper bound for the number of machines (minimum and maximum), how many machines may be added additionally during a rolling update (maxSurge) and how many machines may be in termination/recreation during a rolling update (maxUnavailable), and of course across how many zones the nodes shall be spread (zones).\nGardener divides minimum, maximum, maxSurge, maxUnavailable values by the number of zones specified for this worker pool. This fact must be considered when you plan the sizing of your worker pools.\nExample:\n provider: workers: - name: ... 
minimum: 6 maximum: 60 maxSurge: 3 maxUnavailable: 0 zones: [\"a\", \"b\", \"c\"] The resulting MachineDeployments per zone will get minimum: 2, maximum: 20, maxSurge: 1, maxUnavailable: 0. If another zone is added, all values will be divided by 4, resulting in: Fewer workers per zone. ⚠️ One MachineDeployment with maxSurge: 0, i.e. there will be a replacement of nodes without rolling updates. Also interesting is the configuration for Gardener’s machine-controller-manager (MCM for short), which provisions, monitors, terminates, replaces, or updates machines that back your nodes:\n The shorter machineCreationTimeout is, the faster MCM will retry to create a machine/node if the process is stuck on the cloud provider side. It is set to useful/practical timeouts for the different cloud providers and you probably don’t want to change those (in the context of HA at least). Please align with the cluster autoscaler’s maxNodeProvisionTime. The shorter machineHealthTimeout is, the faster MCM will replace machines/nodes in case the kubelet isn’t reporting back, which translates to Unknown, or reports back with NotReady, or the node-problem-detector that Gardener deploys for you reports a non-recoverable issue/condition (e.g. read-only file system). If it is too short, however, you risk node and pod thrashing, so be careful. The shorter machineDrainTimeout is, the faster you can get rid of machines/nodes that MCM decided to remove, but this puts a cap on the grace periods and PDBs. They are respected up until the drain timeout lapses - then the machine/node will be forcefully terminated, regardless of whether the pods are still terminating or have not been evicted at all because of PDBs. Those PDBs will then be violated, so be careful here as well. Please align with the cluster autoscaler’s maxGracefulTerminationSeconds. 
Especially the last two settings may help you recover faster from cloud provider issues.\nOn spec.systemComponents.coreDNS.autoscaling DNS is critical, in general and also within a Kubernetes cluster. Gardener-managed clusters deploy CoreDNS, a graduated CNCF project. Gardener supports two auto-scaling modes for it, horizontal (using HPA based on CPU) and cluster-proportional (using the cluster proportional autoscaler that scales the number of pods based on the number of nodes/cores, not to be confused with the cluster autoscaler that scales nodes based on their utilization). Check out the docs, especially the trade-offs that explain why you would choose one over the other (cluster-proportional gives you more configuration options if CPU-based horizontal scaling is insufficient for your needs). Consider also Gardener’s node-local DNS feature to decouple you further from the DNS pods and stabilize DNS. Again, that’s not strictly related to HA, but may become important during a zone outage, when load patterns shift and pods start to initialize/resolve DNS records more frequently in bulk.\nMore Caveats Unfortunately, there are a few more things of note when it comes to HA in a Kubernetes cluster that may be “surprising” and hard to mitigate:\n If the kubelet restarts, it will report all pods as NotReady on startup until it reruns its probes (#100277), which leads to temporary endpoint and load balancer target removal (#102367). This topic is somewhat controversial. Gardener uses rolling updates and a jitter to spread necessary kubelet restarts as well as possible. If a kube-proxy pod on a node turns NotReady, all load balancer traffic to all pods (on this node) under services with externalTrafficPolicy local will cease as the load balancer will then take this node out of serving. This topic is somewhat controversial as well. 
So, please remember that externalTrafficPolicy local not only has the disadvantage of imbalanced traffic spreading, but also a dependency on the kube-proxy pod that may and will be unavailable during updates. Gardener uses rolling updates to spread necessary kube-proxy updates as well as possible. These are just a few additional considerations. They may or may not affect you, but other intricacies may. It’s a reminder to be watchful as Kubernetes may have one or two relevant quirks that you need to consider (and will probably only find out over time and with extensive testing).\nMeaningful Availability Finally, let’s go back to where we started. We recommended measuring meaningful availability. For instance, in Gardener, we do not trust only internal signals, but also track whether Gardener or the control planes that it manages are externally available through the external DNS records and load balancers, SNI-routing Istio gateways, etc. (the same path all users must take). It makes a huge difference whether the API server’s internal readiness probe passes or the user can actually reach the API server and it does what it’s supposed to do. Most likely, you will be in a similar spot and can do the same.\nWhat you do with these signals is another matter. Maybe there are some actionable metrics and you can trigger some active fail-over, maybe you can only use them to improve your HA setup altogether. In our case, we also use them to deploy mitigations, e.g. via our dependency-watchdog that watches, for instance, Gardener-managed API servers and shuts down components like the controller managers to avert cascading knock-on effects (e.g. melt-down if the kubelets cannot reach the API server, but the controller managers can and start taking down nodes and pods).\nEither way, understanding how users perceive your service is key to the improvement process as a whole. 
Even if you are not struck by a zone outage, the measures above and tracking the meaningful availability will help you improve your service.\nThank you for your interest.\n","categories":"","description":"","excerpt":"Implementing High Availability and Tolerating Zone Outages Developing …","ref":"/docs/guides/high-availability/best-practices/","tags":"","title":"Best Practices"},{"body":"Overview Gardener provides chaostoolkit modules to simulate compute and network outages for various cloud providers such as AWS, Azure, GCP, OpenStack/Converged Cloud, and VMware vSphere, as well as pod disruptions for any Kubernetes cluster.\nThe API, parameterization, and implementation are as homogeneous as possible across the different cloud providers, so that using them requires only minimal effort. As a Gardener user, you benefit from an additional garden module that leverages the generic modules, but exposes their functionality in the simplest, most homogeneous, and most secure way (no need to specify cloud provider credentials, cluster credentials, or filters explicitly; retrieves credentials and stores them in memory only).\nInstallation The name of the package is chaosgarden and it was developed and tested with Python 3.9+. It is published to PyPI, so that you can comfortably install it via Python’s package installer pip (you may want to create a virtual environment before installing it):\npip install chaosgarden ℹ️ If you want to use the VMware vSphere module, please note the remarks in requirements.txt for vSphere. 
Those are not contained in the published PyPI package.\nThe package can be used directly from Python scripts and supports this usage scenario with additional convenience that helps launch actions and probes in background (more on actions and probes later), so that you can compose also complex scenarios with ease.\nIf this technology is new to you, you will probably prefer the chaostoolkit CLI in combination with experiment files, so we need to install the CLI next:\npip install chaostoolkit Please verify that it was installed properly by running:\nchaos --help Usage ℹ️ We assume you are using Gardener and run Gardener-managed shoot clusters. You can also use the generic cloud provider and Kubernetes chaosgarden modules, but configuration and secrets will then differ. Please see the module docs for details.\nA Simple Experiment The most important command is the run command, but before we can use it, we need to compile an experiment file first. Let’s start with a simple one, invoking only a read-only 📖 action from chaosgarden that lists cloud provider machines and networks (depends on cloud provider) for the “first” zone of one of your shoot clusters.\nLet’s assume, your project is called my-project and your shoot is called my-shoot, then we need to create the following experiment:\n{ \"title\": \"assess-filters-impact\", \"description\": \"assess-filters-impact\", \"method\": [ { \"type\": \"action\", \"name\": \"assess-filters-impact\", \"provider\": { \"type\": \"python\", \"module\": \"chaosgarden.garden.actions\", \"func\": \"assess_cloud_provider_filters_impact\", \"arguments\": { \"zone\": 0 } } } ], \"configuration\": { \"garden_project\": \"my-project\", \"garden_shoot\": \"my-shoot\" } } We are not yet there and need one more thing to do before we can run it: We need to “target” the Gardener landscape resp. Gardener API server where you have created your shoot cluster (not to be confused with your shoot cluster API server). 
If you do not know what this is or how to download the Gardener API server kubeconfig, please follow these instructions. You can either download your personal credentials or project credentials (see creation of a serviceaccount) to interact with Gardener. For now (fastest and most convenient way, but generally not recommended), let’s use your personal credentials, but if you later plan to automate your experiments, please use proper project credentials (a serviceaccount is not bound to your person, but to the project, and can be restricted using RBAC roles and role bindings, which is why we recommend this for production).\nTo download your personal credentials, open the Gardener Dashboard and click on your avatar in the upper right corner of the page. Click “My Account”, then look for the “Access” pane, then “Kubeconfig”, then press the “Download” button and save the kubeconfig to disk. Run the following command next:\nexport KUBECONFIG=path/to/kubeconfig We are now set and you can run your first experiment:\nchaos run path/to/experiment You should see output like this (depends on cloud provider):\n[INFO] Validating the experiment's syntax [INFO] Installing signal handlers to terminate all active background threads on involuntary signals (note that SIGKILL cannot be handled). [INFO] Experiment looks valid [INFO] Running experiment: assess-filters-impact [INFO] Steady-state strategy: default [INFO] Rollbacks strategy: default [INFO] No steady state hypothesis defined. That's ok, just exploring. [INFO] Playing your experiment's method now... 
[INFO] Action: assess-filters-impact [INFO] Validating client credentials and listing probably impacted instances and/or networks with the given arguments zone='world-1a' and filters={'instances': [{'Name': 'tag-key', 'Values': ['kubernetes.io/cluster/shoot--my-project--my-shoot']}], 'vpcs': [{'Name': 'tag-key', 'Values': ['kubernetes.io/cluster/shoot--my-project--my-shoot']}]}: [INFO] 1 instance(s) would be impacted: [INFO] - i-aabbccddeeff0000 [INFO] 1 VPC(s) would be impacted: [INFO] - vpc-aabbccddeeff0000 [INFO] Let's rollback... [INFO] No declared rollbacks, let's move on. [INFO] Experiment ended with status: completed 🎉 Congratulations! You successfully ran your first chaosgarden experiment.\nA Destructive Experiment Now let’s break 🪓 your cluster. Be advised that this experiment will be destructive in the sense that we will temporarily network-partition all nodes in one availability zone (machine termination or restart is available with chaosgarden as well). That means, these nodes and their pods won’t be able to “talk” to other nodes, pods, and services. Also, the API server will become unreachable for them and the API server will report them as unreachable (confusingly shown as NotReady when you run kubectl get nodes and Unknown in the status Ready condition when you run kubectl get nodes --output yaml).\nBeing unreachable will trigger service endpoint and load balancer de-registration (when the node’s grace period lapses) as well as eventually pod eviction and machine replacement (which will continue to fail under test). We won’t run the experiment long enough for all of these effects to materialize, but the longer you run it, the more will happen, up to temporarily giving up/going into “back-off” for the affected worker pool in that zone. 
You will also see that the Kubernetes cluster autoscaler will try to create a new machine almost immediately, if pods are pending for the affected zone (which will initially fail under test, but may succeed later, which again depends on the runtime of the experiment and whether or not the cluster autoscaler goes into “back-off” or not).\nBut for now, all of this doesn’t matter as we want to start “small”. You can later read up more on the various settings and effects in our best practices guide on high availability.\nPlease create a new experiment file, this time with this content:\n{ \"title\": \"run-network-failure-simulation\", \"description\": \"run-network-failure-simulation\", \"method\": [ { \"type\": \"action\", \"name\": \"run-network-failure-simulation\", \"provider\": { \"type\": \"python\", \"module\": \"chaosgarden.garden.actions\", \"func\": \"run_cloud_provider_network_failure_simulation\", \"arguments\": { \"mode\": \"total\", \"zone\": 0, \"duration\": 60 } } } ], \"rollbacks\": [ { \"type\": \"action\", \"name\": \"rollback-network-failure-simulation\", \"provider\": { \"type\": \"python\", \"module\": \"chaosgarden.garden.actions\", \"func\": \"rollback_cloud_provider_network_failure_simulation\", \"arguments\": { \"mode\": \"total\", \"zone\": 0 } } } ], \"configuration\": { \"garden_project\": { \"type\": \"env\", \"key\": \"GARDEN_PROJECT\" }, \"garden_shoot\": { \"type\": \"env\", \"key\": \"GARDEN_SHOOT\" } } } ℹ️ There is an even more destructive action that terminates or alternatively restarts machines in a given zone 🔥 (immediately or delayed with some randomness/chaos for maximum inconvenience for the nodes and pods). You can find links to all these examples at the end of this tutorial.\nThis experiment is very similar, but this time we will break 🪓 your cluster - for 60s. If that’s too short to even see a node or pod transition from Ready to NotReady (actually Unknown), then increase the duration. 
Depending on the workload that your cluster runs, you may already see effects of the network partitioning, because it is effective immediately. It’s just that Kubernetes cannot know immediately and rather assumes that something is failing only after the node’s grace period lapses, but the actual workload is impacted immediately.\nMost notably, this experiment also has a rollbacks section, which is invoked even if you abort the experiment or it fails unexpectedly, but only if you run the CLI with the option --rollback-strategy always which we will do soon. Any chaosgarden action that can undo its activity, will do that implicitly when the duration lapses, but it is a best practice to always configure a rollbacks section in case something unexpected happens. Should you be in panic and just want to run the rollbacks section, you can remove all other actions and the CLI will execute the rollbacks section immediately.\nOne other thing is different in the second experiment as well. We now read the name of the project and the shoot from the environment, i.e. a configuration section can automatically expand environment variables. Also useful to know (not shown here), chaostoolkit supports variable substitution too, so that you have to define variables only once. Please note that you can also add a secrets section that can also automatically expand environment variables. For instance, instead of targeting the Gardener API server via $KUBECONFIG, which is supported by our chaosgarden package natively, you can also explicitly refer to it in a secrets section (for brevity reasons not shown here either).\nLet’s now run your second experiment (please watch your nodes and pods in parallel, e.g. by running watch kubectl get nodes,pods --output wide in another terminal):\nexport GARDEN_PROJECT=my-project export GARDEN_SHOOT=my-shoot chaos run --rollback-strategy always path/to/experiment The output of the run command will be similar to the one above, but longer. 
It will mention either machines or networks that were network-partitioned (depends on cloud provider), but should revert everything back to normal.\nNormally, you would not only run actions in the method section, but also probes as part of a steady state hypothesis. Such steady state hypothesis probes are run before and after the actions to validate that the “system” was in a healthy state before and gets back to a healthy state after the actions ran, hence showing that the “system” is in a steady state when not under test. Eventually, you will write your own probes that don’t even have to be part of a steady state hypothesis. We at Gardener run multi-zone (multiple zones at once) and rolling-zone (strike each zone once) outages with continuous custom probes all within the method section to validate our KPIs continuously under test (e.g. how long do the individual fail-overs take/how long is the actual outage). The most complex scenarios are even run via Python scripts as all actions and probes can also be invoked directly (which is what the CLI does).\nHigh Availability Developing highly available workloads that can tolerate a zone outage is no trivial task. 
You can find more information on how to achieve this goal in our best practices guide on high availability.\nThank you for your interest in Gardener chaos engineering and making your workload more resilient.\nFurther Reading Here are some links for further reading:\n Examples: Experiments, Scripts Gardener Chaos Engineering: GitHub, PyPI, Module Docs for Gardener Users Chaos Toolkit Core: Home Page, Installation, Concepts, GitHub ","categories":"","description":"","excerpt":"Overview Gardener provides chaostoolkit modules to simulate compute …","ref":"/docs/guides/high-availability/chaos-engineering/","tags":"","title":"Chaos Engineering"},{"body":"Highly Available Shoot Control Plane The Shoot resource offers a way to request a highly available control plane.\nFailure Tolerance Types A highly available shoot control plane can be set up with either a failure tolerance of zone or node.\nNode Failure Tolerance The failure tolerance of a node will have the following characteristics:\n Control plane components will be spread across different nodes within a single availability zone. There will not be more than one replica per node for each control plane component which has more than one replica. The worker pool should have a minimum of 3 nodes. A multi-node etcd (quorum size of 3) will be provisioned, offering zero-downtime capabilities with each member on a different node within a single availability zone. Zone Failure Tolerance The failure tolerance of a zone will have the following characteristics:\n Control plane components will be spread across different availability zones. There will be at least one replica per zone for each control plane component which has more than one replica. The Gardener scheduler will automatically select a seed which has a minimum of 3 zones to host the shoot control plane. A multi-node etcd (quorum size of 3) will be provisioned, offering zero-downtime capabilities with each member in a different zone. 
Shoot Spec To request a highly available shoot control plane, Gardener provides the following configuration in the shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: \u003cnode | zone\u003e Allowed Transitions\nIf you already have a shoot cluster with a non-HA control plane, then the following upgrades are possible:\n Upgrade of non-HA shoot control plane to HA shoot control plane with node failure tolerance. Upgrade of non-HA shoot control plane to HA shoot control plane with zone failure tolerance. However, it is essential that the seed which is currently hosting the shoot control plane is multi-zonal. If it is not, then the request to upgrade will be rejected. Note: There will be a small downtime during the upgrade, especially for etcd, which will transition from a single-node etcd cluster to a multi-node etcd cluster.\n Disallowed Transitions\nIf you already have a shoot cluster with an HA control plane, then the following transitions are not possible:\n Upgrade of HA shoot control plane from node failure tolerance to zone failure tolerance is currently not supported, mainly because already existing volumes are bound to the zone they were created in originally. Downgrade of HA shoot control plane with zone failure tolerance to node failure tolerance is currently not supported, mainly for the same reason as above, that already existing volumes are bound to the respective zones they were created in originally. Downgrade of HA shoot control plane with either node or zone failure tolerance to a non-HA shoot control plane is currently not supported, mainly because etcd-druid does not currently support scaling down of a multi-node etcd cluster to a single-node etcd cluster. Zone Outage Situation Implementing highly available software that can tolerate even a zone outage unscathed is no trivial task. You may find our HA Best Practices helpful to get closer to that goal. 
In this document, we collected many options and settings for you that Gardener also uses internally to provide a highly available service.\nDuring a zone outage, you may be forced to change your cluster setup on short notice in order to compensate for failures and shortages resulting from the outage. For instance, if the shoot cluster has worker nodes across three zones and one zone goes down, the computing power from these nodes is also gone during that time. Changing the worker pool (shoot.spec.provider.workers[]) and infrastructure (shoot.spec.provider.infrastructureConfig) configuration can eliminate this imbalance, so that there are enough machines in healthy availability zones to cope with the requests of your applications.\nGardener relies on a sophisticated reconciliation flow with several dependencies for which various flow steps wait for the readiness of prior ones. During a zone outage, this can block the entire flow, e.g., because all three etcd replicas can never be ready when a zone is down, and the required changes mentioned above can never be accomplished. For this, a special one-off annotation shoot.gardener.cloud/skip-readiness helps to skip any readiness checks in the flow.\n The shoot.gardener.cloud/skip-readiness annotation serves as a last resort if reconciliation is stuck because of important changes during an AZ outage. Use it with caution, only in exceptional cases and after a case-by-case evaluation with your Gardener landscape administrator. If used together with other operations like Kubernetes version upgrades or credential rotation, the annotation may lead to a severe outage of your shoot control plane.\n ","categories":"","description":"Failure tolerance types `node` and `zone`. Possible mitigations for zone or node outages","excerpt":"Failure tolerance types `node` and `zone`. 
Possible mitigations for …","ref":"/docs/guides/high-availability/control-plane/","tags":"","title":"Control Plane"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2022/","tags":"","title":"2022"},{"body":"Presenters This community call was led by Pawel Palucki and Alexander D. Kanevskiy.\nTopics Alexander Kanevskiy begins the community call by giving an overview of CRI-resource-manager, describing it as a “hardware aware container runtime”, and also going over what it brings to the user in terms of features and policies.\nPawel Palucki continues by giving details on the policy that will later be used in the demo and the use case demonstrated in it. He then goes over the “must have” features of any extension - observability and the ability to deploy and configure objects with it.\nThe demo then begins, mixed with slides giving further information at certain points regarding the installation process, static and dynamic configuration flow, healthchecks and recovery mode, and access to logs, among others.\nThe presentation is concluded by Pawel showcasing the new features coming to CRI-resource-manager with its next releases and sharing some tips for other extension developers.\nIf you are left with any questions regarding the content, you might find the answers at the Q\u0026A session and discussion held at the end, as well as the questions asked and answered throughout the meeting.\nRecording ","categories":"","description":"","excerpt":"Presenters This community call was led by Pawel Palucki and Alexander …","ref":"/blog/2022/10.20-gardener-community-meeting-october-2/","tags":"","title":"Community Call - Get more computing power in Gardener by overcoming Kubelet limitations with CRI-resource-manager"},{"body":"Presenters This community call was led by Raymond de Jong.\nTopics This meeting explores the uses of Cilium, an open source software used to secure the network connectivity between application services deployed using Kubernetes, and 
Hubble, the networking and security observability platform built on top of it.\nRaymond de Jong begins the meeting by giving an introduction to Cilium and eBPF and how they are both used in Kubernetes networking and services. He then goes over the ways of running Cilium - either by using a supported cloud provider or by CNI chaining.\nThe next topic introduced is the Cluster Mesh and the different use cases for it, offering high availability, shared services, local and remote service affinity, and the ability to split services.\nWith regard to security, being an identity-based security solution utilizing API-aware authorization, Cilium implements Hubble in order to increase its observability. Hubble combines hubble UI, hubble API and hubble Metrics - Grafana and Prometheus, in order to provide service dependency maps, detailed flow visibility and built-in metrics for operations and applications stability.\nThe final topic covered is the Service Mesh, offering service maps and the ability to integrate Cluster Mesh features.\nIf you are left with any questions regarding the content, you might find the answers at the Q\u0026A session and discussion held at the end, as well as the questions asked and answered throughout the meeting.\nRecording ","categories":"","description":"","excerpt":"Presenters This community call was led by Raymond de Jong.\nTopics This …","ref":"/blog/2022/10.06-gardener-community-meeting-october/","tags":"","title":"Community Call - Cilium / Isovalent Presentation"},{"body":"Manage certificates with Gardener for default domain Introduction Dealing with applications on Kubernetes which offer secure service endpoints (e.g. HTTPS) also requires you to enable secured communication via SSL/TLS. With the certificate extension enabled, Gardener can manage commonly trusted X.509 certificates for your application endpoint. 
Beyond the initial certificate request, it also handles timely renewal using the free Let’s Encrypt API.\nThere are two scenarios in which you can use the certificate extension\n You want to use a certificate for a subdomain of the shoot’s default DNS (see .spec.dns.domain of your shoot resource, e.g. short.ingress.shoot.project.default-domain.gardener.cloud). If this is your case, please keep reading this article. You want to use a certificate for a custom domain. If this is your case, please see Manage certificates with Gardener for public domain Prerequisites Before you start this guide there are a few requirements you need to fulfill:\n You have an existing shoot cluster Since you are using the default DNS name, all DNS configuration should already be done and ready.\nIssue a certificate Every X.509 certificate is represented by a Kubernetes custom resource certificate.cert.gardener.cloud in your cluster. A Certificate resource may be used to initiate a new certificate request as well as to manage its lifecycle. 
Gardener’s certificate service regularly checks the expiration timestamp of Certificates, triggers a renewal process if necessary and replaces the existing X.509 certificate with a new one.\n Your application should be able to reload replaced certificates in a timely manner to avoid service disruptions.\n Certificates can be requested via 3 resource types\n Ingress Service (type LoadBalancer) Certificate (Gardener CRD) If either of the first 2 is used, a corresponding Certificate resource will automatically be created.\nUsing an Ingress Resource apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" spec: tls: - hosts: # Must not exceed 64 characters. - short.ingress.shoot.project.default-domain.gardener.cloud # Certificate and private key reside in this secret. 
secretName: tls-secret rules: - host: short.ingress.shoot.project.default-domain.gardener.cloud http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Using a Service of type LoadBalancer apiVersion: v1 kind: Service metadata: annotations: cert.gardener.cloud/purpose: managed # Certificate and private key reside in this secret. cert.gardener.cloud/secretname: tls-secret # You may add more domains separated by commas (e.g. \"service.shoot.project.default-domain.gardener.cloud, amazing.shoot.project.default-domain.gardener.cloud\") dns.gardener.cloud/dnsnames: \"service.shoot.project.default-domain.gardener.cloud\" dns.gardener.cloud/ttl: \"600\" #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" name: test-service namespace: default spec: ports: - name: http port: 80 protocol: TCP targetPort: 8080 type: LoadBalancer Using the custom Certificate resource apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: commonName: short.ingress.shoot.project.default-domain.gardener.cloud secretRef: name: tls-secret namespace: default # Optional if using the default issuer issuerRef: name: garden If you’re interested in the current progress of your request, 
you’re advised to consult the description, more specifically the status attribute in case the issuance failed.\nRequest a wildcard certificate In order to avoid creating a separate certificate for every single endpoint, you may want to create a wildcard certificate for your shoot’s default domain.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/commonname: \"*.ingress.shoot.project.default-domain.gardener.cloud\" spec: tls: - hosts: - amazing.ingress.shoot.project.default-domain.gardener.cloud secretName: tls-secret rules: - host: amazing.ingress.shoot.project.default-domain.gardener.cloud http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Please note that this can also be achieved by directly adding an annotation to a Service of type LoadBalancer. You could also create a Certificate object with a wildcard domain.\nMore information For more information and more examples about using the certificate extension, please see Manage certificates with Gardener for public domain\n","categories":"","description":"Use the Gardener cert-management to get fully managed, publicly trusted TLS certificates","excerpt":"Use the Gardener cert-management to get fully managed, publicly …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/request_default_domain_cert/","tags":["task"],"title":"Manage certificates with Gardener for default domain"},{"body":"Manage certificates with Gardener for public domain Introduction Dealing with applications on Kubernetes which offer secure service endpoints (e.g. HTTPS) also requires you to enable secured communication via SSL/TLS. With the certificate extension enabled, Gardener can manage commonly trusted X.509 certificates for your application endpoints. 
Beyond the initial certificate request, it also handles timely renewal using the free Let’s Encrypt API.\nThere are two scenarios in which you can use the certificate extension\n You want to use a certificate for a subdomain of the shoot’s default DNS (see .spec.dns.domain of your shoot resource, e.g. short.ingress.shoot.project.default-domain.gardener.cloud). If this is your case, please see Manage certificates with Gardener for default domain You want to use a certificate for a custom domain. If this is your case, please keep reading this article. Prerequisites Before you start this guide there are a few requirements you need to fulfill:\n You have an existing shoot cluster Your custom domain is under a public top level domain (e.g. .com) Your custom zone is resolvable with a public resolver via the internet (e.g. 8.8.8.8) You have a custom DNS provider configured and working (see “DNS Providers”) As part of the Let’s Encrypt ACME challenge validation process, Gardener sets a DNS TXT entry and Let’s Encrypt checks if it can both resolve and authenticate it. Therefore, it’s important that your DNS entries are publicly resolvable. You can check this by querying a public resolver, e.g. Google’s public DNS server. If it returns an entry, your DNS is publicly visible:\n# returns the A record for cert-example.example.com using Google's DNS server (8.8.8.8) dig cert-example.example.com @8.8.8.8 A DNS provider In order to issue certificates for a custom domain you need to specify a DNS provider which is permitted to create DNS records for subdomains of your requested domain in the certificate. For example, if you request a certificate for host.example.com your DNS provider must be capable of managing subdomains of host.example.com.\nDNS providers are normally specified in the shoot manifest. 
To learn more on how to configure one, please see the DNS provider documentation.\nIssue a certificate Every X.509 certificate is represented by a Kubernetes custom resource certificate.cert.gardener.cloud in your cluster. A Certificate resource may be used to initiate a new certificate request as well as to manage its lifecycle. Gardener’s certificate service regularly checks the expiration timestamp of Certificates, triggers a renewal process if necessary and replaces the existing X.509 certificate with a new one.\n Your application should be able to reload replaced certificates in a timely manner to avoid service disruptions.\n Certificates can be requested via 4 resource types\n Ingress Service (type LoadBalancer) Gateways (both Istio gateways and from the Gateway API) Certificate (Gardener CRD) If either of the first 2 is used, a corresponding Certificate resource will be created automatically.\nUsing an Ingress Resource apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed # Optional but recommended, this is going to create the DNS entry at the same time dns.gardener.cloud/class: garden dns.gardener.cloud/ttl: \"600\" #cert.gardener.cloud/commonname: \"*.example.com\" # optional, if not specified the first name from spec.tls[].hosts is used as common name #cert.gardener.cloud/dnsnames: \"\" # optional, if not specified the names from spec.tls[].hosts are used #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to 
specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" spec: tls: - hosts: # Must not exceed 64 characters. - amazing.example.com # Certificate and private key reside in this secret. secretName: tls-secret rules: - host: amazing.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Replace the hosts and rules[].host values with your own domain and adjust the remaining Ingress attributes in accordance with your deployment (e.g. the above is for an istio Ingress controller and forwards traffic to the service amazing-svc on port 8080).\nUsing a Service of type LoadBalancer apiVersion: v1 kind: Service metadata: annotations: cert.gardener.cloud/secretname: tls-secret dns.gardener.cloud/dnsnames: example.example.com dns.gardener.cloud/class: garden # Optional dns.gardener.cloud/ttl: \"600\" cert.gardener.cloud/commonname: \"*.example.example.com\" cert.gardener.cloud/dnsnames: \"\" #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" name: test-service namespace: default spec: ports: - name: http port: 80 protocol: 
TCP targetPort: 8080 type: LoadBalancer Using a Gateway resource Please see Istio Gateways or Gateway API for details.\nUsing the custom Certificate resource apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: commonName: amazing.example.com secretRef: name: tls-secret namespace: default # Optional if using the default issuer issuerRef: name: garden # If delegated domain for DNS01 challenge should be used. This has only an effect if a CNAME record is set for # '_acme-challenge.amazing.example.com'. # For example: If a CNAME record exists '_acme-challenge.amazing.example.com' =\u003e '_acme-challenge.writable.domain.com', # the DNS challenge will be written to '_acme-challenge.writable.domain.com'. #followCNAME: true # optionally set labels for the secret #secretLabels: # key1: value1 # key2: value2 # Optionally specify the preferred certificate chain: if the CA offers multiple certificate chains, prefer the chain with an issuer matching this Subject Common Name. If no match, the default offered chain will be used. #preferredChain: \"ISRG Root X1\" # Optionally specify algorithm and key size for private key. Allowed algorithms: \"RSA\" (allowed sizes: 2048, 3072, 4096) and \"ECDSA\" (allowed sizes: 256, 384) # If not specified, RSA with 2048 is used. #privateKey: # algorithm: ECDSA # size: 384 Supported attributes Here is a list of all supported annotations regarding the certificate extension:\n Path Annotation Value Required Description N/A cert.gardener.cloud/purpose: managed Yes when using annotations Flag for Gardener that this specific Ingress or Service requires a certificate spec.commonName cert.gardener.cloud/commonname: E.g. “*.demo.example.com” or “special.example.com” Certificate and Ingress : No Service: Yes, if DNS names unset Specifies for which domain the certificate request will be created. If not specified, the names from spec.tls[].hosts are used. 
This entry must comply with the 64 character limit. spec.dnsNames cert.gardener.cloud/dnsnames: E.g. “special.example.com” Certificate and Ingress : No Service: Yes, if common name unset Additional domains the certificate should be valid for (Subject Alternative Name). If not specified, the names from spec.tls[].hosts are used. Entries in this list can be longer than 64 characters. spec.secretRef.name cert.gardener.cloud/secretname: any-name Yes for certificate and Service Specifies the secret which contains the certificate/key pair. If the secret is not available yet, it’ll be created automatically as soon as the certificate has been issued. spec.issuerRef.name cert.gardener.cloud/issuer: E.g. gardener No Specifies the issuer you want to use. Only necessary if you request certificates for custom domains. N/A cert.gardener.cloud/revoked: true otherwise always false No Use only to revoke a certificate, see reference for more details spec.followCNAME cert.gardener.cloud/follow-cname E.g. true No Specifies that the usage of a delegated domain for DNS challenges is allowed. For details, see Follow CNAME. spec.preferredChain cert.gardener.cloud/preferred-chain E.g. ISRG Root X1 No Specifies the Common Name of the issuer for selecting the certificate chain. For details, see Preferred Chain. spec.secretLabels cert.gardener.cloud/secret-labels for annotation use e.g. key1=value1,key2=value2 No Specifies labels for the certificate secret. spec.privateKey.algorithm cert.gardener.cloud/private-key-algorithm RSA, ECDSA No Specifies algorithm for private key generation. The default value depends on the configuration of the extension (the built-in default is RSA). You may request a new certificate without privateKey settings to find out the concrete defaults in your Gardener. spec.privateKey.size cert.gardener.cloud/private-key-size \"256\", \"384\", \"2048\", \"3072\", \"4096\" No Specifies size for private key generation. Allowed values for RSA are 2048, 3072, and 4096. 
For ECDSA allowed values are 256 and 384. The default values depend on the configuration of the extension (the built-in defaults are 3072 for RSA and 384 for ECDSA, respectively). Request a wildcard certificate In order to avoid creating a separate certificate for every single endpoint, you may want to create a wildcard certificate for your domain.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/commonname: \"*.example.com\" spec: tls: - hosts: - amazing.example.com secretName: tls-secret rules: - host: amazing.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Please note that this can also be achieved by directly adding an annotation to a Service of type LoadBalancer. You could also create a Certificate object with a wildcard domain.\nUsing a custom Issuer Most Gardener deployments with the certificate extension enabled have a preconfigured garden issuer. It is also usually configured to use Let’s Encrypt as the certificate provider.\nIf you need a custom issuer for a specific cluster, please see Using a custom Issuer\nQuotas For security reasons there may be a default quota on the certificate requests per day set globally in the controller registration of the shoot-cert-service.\nThe default quota only applies if there is no explicit quota defined for the issuer itself with the field requestsPerDayQuota, e.g.:\nkind: Shoot ... 
spec: extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 kind: CertConfig issuers: - email: your-email@example.com name: custom-issuer # issuer name must be specified in every custom issuer request, must not be \"garden\" server: 'https://acme-v02.api.letsencrypt.org/directory' requestsPerDayQuota: 10 DNS Propagation As stated before, cert-management uses the ACME challenge protocol to verify that you are the DNS owner of the domain for which you are requesting a certificate. This works by creating a DNS TXT record in your DNS provider under _acme-challenge.example.example.com containing a token to compare with. The TXT record is only applied during the domain validation. Typically, the record is propagated within a few minutes. But if the record is not visible to the ACME server for any reason, the certificate request is retried after several minutes. This means you may have to wait up to one hour after the propagation problem has been resolved before the certificate request is retried. 
Take a look at the events with kubectl describe ingress example for troubleshooting.\nCharacter Restrictions Because the common name is restricted to 64 characters, you may have to leave the common name unset for longer domain names.\nFor example, the following request is invalid:\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-invalid namespace: default spec: commonName: morethan64characters.ingress.shoot.project.default-domain.gardener.cloud But it is valid to request a certificate for this domain if you have left the common name unset:\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: dnsNames: - morethan64characters.ingress.shoot.project.default-domain.gardener.cloud References Gardener cert-management Managing DNS with Gardener ","categories":"","description":"Use the Gardener cert-management to get fully managed, publicly trusted TLS certificates","excerpt":"Use the Gardener cert-management to get fully managed, publicly …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/request_cert/","tags":["task"],"title":"Manage certificates with Gardener for public domain"},{"body":"Using a custom Issuer Another possibility to request certificates for custom domains is a dedicated issuer.\n Note: This is only needed if the default issuer provided by Gardener is restricted to shoot related domains or you are using domain names not visible to public DNS servers. This means that your scenario most likely doesn’t require you to add an issuer.\n Custom issuers are normally specified in the shoot manifest. If the shootIssuers feature is enabled, they can alternatively be defined in the shoot cluster.\nCustom issuer in the shoot manifest kind: Shoot ... 
spec: extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 kind: CertConfig issuers: - email: your-email@example.com name: custom-issuer # issuer name must be specified in every custom issuer request, must not be \"garden\" server: 'https://acme-v02.api.letsencrypt.org/directory' privateKeySecretName: my-privatekey # referenced resource, the private key must be stored in the secret at `data.privateKey` (optionally, only needed as alternative to auto registration) #precheckNameservers: # to provide special set of nameservers to be used for prechecking DNSChallenges for an issuer #- dns1.private.company-net:53 #- dns2.private.company-net:53 #shootIssuers: # if true, allows to specify issuers in the shoot cluster #enabled: true resources: - name: my-privatekey resourceRef: apiVersion: v1 kind: Secret name: custom-issuer-privatekey # name of secret in Gardener project If you are using an ACME provider for private domains, you may need to change the nameservers used for checking the availability of the DNS challenge’s TXT record before the certificate is requested from the ACME provider. By default, only public DNS servers may be used for this purpose. At least one of the precheckNameservers must be able to resolve the private domain names.\nUsing the custom issuer To use the custom issuer in a certificate, just specify its name in the spec.\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate spec: ... issuerRef: name: custom-issuer ... For source resources like Ingress or Service use the cert.gardener.cloud/issuer annotation.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/issuer: custom-issuer ... Custom issuer in the shoot cluster Prerequisite: The shootIssuers feature has to be enabled. It is either enabled globally in the ControllerDeployment or in the shoot manifest with:\nkind: Shoot ... 
spec: extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 kind: CertConfig shootIssuers: enabled: true # if true, allows to specify issuers in the shoot cluster ... Example for specifying an Issuer resource and its Secret directly in any namespace of the shoot cluster:\napiVersion: cert.gardener.cloud/v1alpha1 kind: Issuer metadata: name: my-own-issuer namespace: my-namespace spec: acme: domains: include: - my.own.domain.com email: some.user@my.own.domain.com privateKeySecretRef: name: my-own-issuer-secret namespace: my-namespace server: https://acme-v02.api.letsencrypt.org/directory --- apiVersion: v1 kind: Secret metadata: name: my-own-issuer-secret namespace: my-namespace type: Opaque data: privateKey: ... # replace '...' with values encoded as base64 Using the custom shoot issuer To use the custom issuer in a certificate, just specify its name and namespace in the spec.\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate spec: ... issuerRef: name: my-own-issuer namespace: my-namespace ... For source resources like Ingress or Service use the cert.gardener.cloud/issuer annotation.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/issuer: my-namespace/my-own-issuer ... ","categories":"","description":"How to define a custom issuer for a shoot cluster","excerpt":"How to define a custom issuer for a shoot cluster","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/custom_shoot_issuer/","tags":["task"],"title":"Using a custom Issuer"},{"body":"Introduction of Disruptions We need to understand that some kinds of voluntary disruptions can happen to pods. For example, they can be caused by cluster administrators who want to perform automated cluster actions, like upgrading and autoscaling clusters. 
Typical application owner actions include:\n deleting the deployment or other controller that manages the pod updating a deployment’s pod template causing a restart directly deleting a pod (e.g., by accident) Setup Pod Disruption Budgets Kubernetes offers a feature called PodDisruptionBudget (PDB) for each application. A PDB limits the number of pods of a replicated application that are down simultaneously from voluntary disruptions.\nThe most common use case is when you want to protect an application specified by one of the built-in Kubernetes controllers:\n Deployment ReplicationController ReplicaSet StatefulSet A PodDisruptionBudget has three fields:\n A label selector .spec.selector to specify the set of pods to which it applies. .spec.minAvailable which is a description of the number of pods from that set that must still be available after the eviction, even in the absence of the evicted pod. minAvailable can be either an absolute number or a percentage. .spec.maxUnavailable which is a description of the number of pods from that set that can be unavailable after the eviction. It can be either an absolute number or a percentage. Cluster Upgrade or Node Deletion Failed due to PDB Violation Misconfiguration of the PDB could block the cluster upgrade or node deletion processes. There are two main cases that can cause a misconfiguration.\nCase 1: The replica of Kubernetes controllers is 1 Only 1 replica is running: there is no replicaCount setup or replicaCount for the Kubernetes controllers is set to 1\n PDB configuration\n spec: minAvailable: 1 To fix this PDB misconfiguration, you need to change the value of replicaCount for the Kubernetes controllers to a number greater than 1\n Case 2: HPA configuration violates PDB In Kubernetes, a HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand. 
The HorizontalPodAutoscaler manages the replicas field of the Kubernetes controllers.\n There is no replicaCount setup or replicaCount for the Kubernetes controllers is set to 1\n PDB configuration\n spec: minAvailable: 1 HPA configuration\n spec: minReplicas: 1 To fix this PDB misconfiguration, you need to change the value of HPA minReplicas to be greater than 1\n Related Links Specifying a Disruption Budget for Your Application Horizontal Pod Autoscaling ","categories":"","description":"","excerpt":"Introduction of Disruptions We need to understand that some kind of …","ref":"/docs/guides/applications/pod-disruption-budget/","tags":"","title":"Specifying a Disruption Budget for Kubernetes Controllers"},{"body":"Presenters This community call was led by Jens Schneider and Lothar Gesslein.\nOverview Starting the development of a new Gardener extension can be challenging, when you are not an expert in the Gardener ecosystem yet. Therefore, the first half of this community call led by Jens Schneider aims to provide a “getting started tutorial” at a beginner level. 23Technologies have developed a minimal working example for Gardener extensions, gardener-extension-mwe, hosted in a Github repository. Jens is following the Getting started with Gardener extension development tutorial, which aims to provide exactly that.\nIn the second part of the community call, Lothar Gesslein introduces the gardener-extension-shoot-flux, which allows for the automated installation of arbitrary Kubernetes resources into shoot clusters. 
As this extension relies on Flux, an overview of Flux’s capabilities is also provided.\nYou can find the tutorials in this community call at:\n Getting started with Gardener extension development A Gardener Extension for universal Shoot Configuration If you are left with any questions regarding the content, you might find the answers at the Q\u0026A session and discussion held at the end of the meeting.\nRecording ","categories":"","description":"","excerpt":"Presenters This community call was led by Jens Schneider and Lothar …","ref":"/blog/2022/06.17-gardener-community-meeting-june/","tags":"","title":"Community Call - Gardener Extension Development"},{"body":"Presenters This community call was led by Tim Ebert and Rafael Franzke.\nOverview So far, deploying Gardener locally was not possible end-to-end. While you certainly could run the Gardener components in a minikube or kind cluster, creating shoot clusters always required registering seeds backed by cloud provider infrastructure like AWS, Azure, etc.\nConsequently, developing Gardener locally was similarly complicated, and the entry barrier for new contributors was way too high.\nIn a previous community call (Hackathon “Hack The Metal”), we already presented a new approach for overcoming these hurdles and complexities.\nNow we would like to present the Local Provider Extension for Gardener and show how it can be used to deploy Gardener locally, allowing you to quickly get your feet wet with the project.\nIn this session, Tim Ebert goes through the process of setting up a local Gardener cluster. 
After his demonstration, Rafael Franzke showcases a different approach to building your clusters locally, which, while more complicated, offers a much faster build time.\nYou can find the tutorials in this community call at:\n Deploying Gardener locally Running Gardener locally If you are left with any questions regarding the content, you might find the answers in the questions asked and answered throughout the meeting.\nRecording ","categories":"","description":"","excerpt":"Presenters This community call was led by Tim Ebert and Rafael …","ref":"/blog/2022/03.23-gardener-community-meeting-march/","tags":"","title":"Community Call - Deploying and Developing Gardener Locally"},{"body":"Presenters This community call was led by Holger Kosser, Lukas Gross and Peter Sutter.\nOverview Watch the recording of our February 2022 Community call to see how to get started with gardenctl-v2 and watch a walkthrough of gardenctl-v2 features. You’ll learn about targeting, secure shoot cluster access, SSH, and how to use cloud provider CLIs natively.\nThe session is led by Lukas Gross, who begins by giving some information on the motivations behind creating a new version of gardenctl - providing secure access to shoot clusters, enabling direct usage of kubectl and cloud provider CLIs and managing cloud provider resources for SSH access.\nHolger Kosser then takes over in order to delve deeper into the concepts behind the implementation of gardenctl-v2, going over Targeting, Gardenlogin and Cloud Provider CLIs. After that, Peter Sutter does the first demo, where he presents the main features in gardenctl-v2.\nThe next part details how to get started with gardenctl, followed by another demo. 
The landscape requirements are also discussed, as well as future plans and enhancement requests.\nYou can find the slides for this community call at Google Slides.\nIf you are left with any questions regarding the content, you might find the answers at the Q\u0026A session and discussion held at the end, as well as the questions asked and answered throughout the meeting.\nRecording ","categories":"","description":"","excerpt":"Presenters This community call was led by Holger Kosser, Lukas Gross …","ref":"/blog/2022/02.17-gardener-community-meeting-february/","tags":"","title":"Community Call - Gardenctl-v2"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2021/","tags":"","title":"2021"},{"body":"Happy New Year Gardeners! As we greet 2021, we also celebrate Gardener’s third anniversary. Gardener was born with its first open source commit on 10.1.2018 (its inception within SAP was of course some 9 months earlier):\ncommit d9619d01845db8c7105d27596fdb7563158effe1 Author: Gardener Development Community \u003cgardener.opensource@sap.com\u003e Date: Wed Jan 10 13:07:09 2018 +0100 Initial version of gardener This is the initial contribution to the Open Source Gardener project. ... Looking back, three years down the line, the project initiators were working towards a special goal: Publishing Gardener as an open source project on Github.com. Join us as we look back at how it all began, the challenges Gardener aims to solve, and why open source and the community was and is the project’s key enabler.\nGardener Kick-Off: “We opted to BUILD ourselves” Early 2017, SAP put together a small, jelled team of experts with a clear mission: work out how SAP could serve Kubernetes based environments (as a service) for all teams within the company. Later that same year, SAP also joined the CNCF as a platinum member.\nWe first deliberated intensively on the BUY options (including acquisitions, due to the size and estimated volume needed at SAP). 
There were some early products from commercial vendors and startups available that did not bind exclusively to one of the hyperscalers, but these products did not cover many of our crucial and immediate requirements for a multi-cloud environment.\nUltimately, we opted to BUILD ourselves. This decision was not made lightly, because right from the start, we knew that we would have to cover thousands of clusters, across the globe, on all kinds of infrastructures. We would have to be able to create them at scale as well as manage them 24x7. And thus, we predicted the need to invest in automation of all aspects, to keep the service TCO at a minimum, and to offer an enterprise-worthy SLA early on. This particular endeavor grew into launching the project Gardener, first internally, and ultimately fulfilling all checks, externally based on open source. Its mission statement, in a nutshell, is “Universal Kubernetes at scale”. Now, that’s quite bold. But we also had a nifty innovation that helped us tremendously along the way. And we can openly reveal the secret here: Gardener was not only built for creating Kubernetes at scale, it was also built (recursively) in Kubernetes itself.\nWhat Do You Get with Gardener? Gardener offers managed and homogeneous Kubernetes clusters on IaaS providers like AWS, Azure, GCP, AliCloud, Open Telekom Cloud, SCS, OVH and more, but also covers versatile infrastructures like OpenStack, VMware or bare metal. Day-1 and Day-2 operations are an integral part of a cluster’s feature set. This means that Gardener is not only capable of provisioning or de-provisioning thousands of clusters, but also of monitoring your cluster’s health state, upgrading components in a rolling fashion, or scaling the control plane as well as worker nodes up and down depending on the current resource demand.\nSome features mentioned above might sound familiar to you, simply because they’re squarely derived from Kubernetes. 
Concretely, if you explore a Gardener managed end-user cluster, you’ll never see the so-called “control plane components” (Kube-Apiserver, Kube-Controller-Manager, Kube-Scheduler, etc.). The reason is that they run as Pods inside another, hosting/seeding Kubernetes cluster. Speaking in Gardener terms, the latter is called a Seed cluster, and the end-user cluster is called a Shoot cluster; and thus the botanical naming scheme for Gardener was born. Further assets like infrastructure components or worker machines are modelled as managed Kubernetes objects too. This allows Gardener to leverage all the great and production proven features of Kubernetes for managing Kubernetes clusters. Our blog post on Kubernetes.io reveals more details about the architectural refinements.\nFigure 1: Gardener architecture overview End-users directly benefit from Gardener’s recursive architecture. Many of the requirements that we identified for the Gardener service turned out to be highly convenient for shoot owners. For instance, Seed clusters are usually equipped with DNS and x509 services. At the same time, these service offerings can be extended to requests coming from the Shoot clusters, i.e., end-users get domain names and certificates for their applications out of the box.\nRecognizing the Power of Open Source The Gardener team immediately profited from open source: from Kubernetes obviously, and all its ecosystem projects. That all facilitated our project’s very fast and robust development. But it does not answer:\n“Why would SAP open source a tool that clearly solves a monetizable enterprise requirement?”\nShort spoiler alert: it initially involved a leap of faith. If we just look at our own decision path, it is undeniable that developers, and with them entire industries, gravitate towards open source. We chose Linux, Containers, and Kubernetes exactly because they are open, and we could bet on network effects, especially around skills. 
The same decision process is currently replicated in thousands of companies, with the same results. Why? Because all companies are digitally transforming. They are becoming software companies as well to a certain extent. Many of them are also our customers and in many discussions, we recognized that they have the same challenges that we are solving with Gardener. This, in essence, was a key eye opener. We were confident that if we developed Gardener as open source, we’d not only seize the opportunity to shape a Kubernetes management tool that finds broad interest and adoption outside of our use case at SAP, but we could solve common challenges faster with the help of a community, and that in consequence would sustain continuous feature development.\nCoincidentally, that was also when the SAP Open Source Program Office (OSPO) was launched. It supported us making a case to develop Gardener completely as open source. Today, we can witness that this strategy has unfolded. It opened the gates not only for adoption, but for co-innovation, investment security, and user feedback directly in code. Below you can see an example of how the Gardener project benefits from this external community power as contributions are submitted right away.\nFigure 2: Example immediate community contribution Differentiating Gardener from Other Kubernetes Management Solutions Imagine that you have created a modern solid cloud native app or service, fully scalable, in containers. And the business case requires you to run the service on multiple clouds, like AWS, AliCloud, Azure, … maybe even on-premises like OpenStack or VMware. Your development team has done everything to ensure that the workload is highly portable. But they would need to qualify each provider’s managed Kubernetes offering and their custom Bill-of-Material (BoM), their versions, their deprecation plan, roadmap etc. Your TCO would explode and this is exactly what teams at SAP experienced. 
Now, with Gardener you can, instead, roll out homogeneous clusters and stay in control of your versions and a single roadmap. Across all supported providers!\nAlso, teams that have serious, or say, more demanding workloads running on Kubernetes will come to the same conclusion: They require the full management control of the Kubernetes underlay. Not only that, they need access, visibility, and all the tuning options for the control plane to safeguard their service. This is a conclusion not only from teams at SAP, but also from our community members, like PingCap, who use Gardener to serve TiDB Cloud service. Whenever you need to get serious and need more than one or two clusters, Gardener is your friend.\nWho Is Using Gardener? Well, there is SAP itself of course, but also the number of Gardener adopters and companies interested in Gardener is growing (~1700 GitHub stars), as more are challenged by multi-cluster and multi-cloud requirements.\nFlant, PingCap, STACKIT, T-Systems, Sky, or b’nerd are among these companies, to name a few. They use Gardener to either run products they sell on top or offer managed Kubernetes clusters directly to their clients, or even only components that are re-usable from Gardener.\nAn interesting journey in the open source space started with Finanz Informatik Technologie Service (FI-TS), a European Central Bank regulated and certified hoster for banks. They operate in very restricted environments, as you can imagine, and as such, they re-designed their datacenter for cloud native workloads from scratch, that is from cabling, racking and stacking to an API that serves bare metal servers. For Kubernetes-as-a-Service, they evaluated and chose Gardener because it was open and a perfect candidate. With Gardener’s extension capabilities, it was possible to bring managed Kubernetes clusters to their very own bare metal stack, metal-stack.io. Of course, this meant implementation effort. 
But by reusing the Gardener project, FI-TS was able to leverage our standard with minimal adjustments for their special use-case. Subsequently, with their contributions, SAP was able to make Gardener more open for the community.\nFull Speed Ahead with the Community in 2021 Some of the current and most active topics are about the installer (Landscaper), control plane migration, automated seed management and documentation. Once you are into Kubernetes and then Gardener, all complexity falls into place and you can make all the semantic connections yourself. But beginners who join the community without much prior knowledge should experience a gentler ramp-up. And that is currently a pain point. Experts directly ask questions about documentation not being up-to-date or clear enough. We prioritized the functionality of what you get with Gardener at the outset and need to catch up. But here is the good part: Now that we are starting the installation subject, later we will have a much broader picture of what we need to install and maintain Gardener, and how we will build it.\n In a community call last summer, we gave an overview of what we are building: The Landscaper. With this tool, we will be able to not only install a full Gardener landscape, but we will also streamline patches, updates and upgrades with the Landscaper. Gardener adopters can then attach to a release train from the project and deploy Gardener into a dev, canary and multiple production environments sequentially. Like we do at SAP.\nKey Takeaways in Three Years of Gardener #1 Open Source is Strategic Open Source is not just about using freely available libraries, components, or tools to optimize your own software production anymore. 
It is strategic, it pays off for projects like Gardener, and in the meantime it has also reached the Board Room.\n#2 Solving Concrete Challenges by Co-Innovation Users of a particular product or service increasingly vote/decide for open source variants, such as project Gardener, because that allows them to freely innovate and solve concrete challenges by developing exactly what they require (see FI-TS example). This user-centric process has tremendous advantages. It clears out the middleman and other vested interests. You have access to the full code. And lastly, if others start using and contributing to your innovation, it allows enterprises to secure their investments for the long term. And that reinforces point #1 for enterprises that have yet to create a strategic Open Source Program Office.\n#3 Cloud Native Skills Gardener solves problems by applying Kubernetes and Kubernetes principles itself. Developers and operators who are familiar with Kubernetes will immediately notice and appreciate our concept and can contribute intuitively. The Gardener maintainers feel responsible for supporting community members and contributors. Barriers will further be reduced by our ongoing landscaper and documentation efforts. This is why we are so confident about Gardener adoption.\nThe Gardener team gladly welcomes new community members, especially regarding adoption and contribution. Feel invited to try out your very own Gardener installation, join our Slack channel or community calls. We’re looking forward to seeing you there!\n","categories":"","description":"","excerpt":"Happy New Year Gardeners! As we greet 2021, we also celebrate …","ref":"/blog/2021/02.01-happy-anniversary-gardener/","tags":"","title":"Happy Anniversary, Gardener! Three Years of Open Source Kubernetes Management"},{"body":"Kubernetes is a cloud-native enabler built around the principles for a resilient, manageable, observable, highly automated, loosely coupled system. 
We know that Kubernetes is infrastructure agnostic with the help of a provider specific Cloud Controller Manager. But Kubernetes has explicitly externalized the management of the nodes. Once they appear - correctly configured - in the cluster, Kubernetes can use them. If nodes fail, Kubernetes can’t do anything about it, external tooling is required. But every tool, every provider is different. So, why not elevate node management to a first class Kubernetes citizen? Why not create a Kubernetes native resource that manages machines just like pods? Such an approach is brought to you by the Machine Controller Manager (aka MCM), which, of course, is an open sourced project. MCM gives you the following benefits:\n seamlessly manage machines/nodes with a declarative API (of course, across different cloud providers) integrate generically with the cluster autoscaler plugin with tools such as the node-problem-detector transport the immutability design principle to machine/nodes implement e.g. rolling upgrades of machines/nodes Machine Controller Manager aka MCM Machine Controller Manager is a group of cooperative controllers that manage the lifecycle of the worker machines. It is inspired by the design of Kube Controller Manager in which various sub controllers manage their respective Kubernetes Clients.\nMachine Controller Manager reconciles a set of Custom Resources namely MachineDeployment, MachineSet and Machines which are managed \u0026 monitored by their controllers MachineDeployment Controller, MachineSet Controller, Machine Controller respectively along with another cooperative controller called the Safety Controller.\nUnderstanding the sub-controllers and Custom Resources of MCM The Custom Resources MachineDeployment, MachineSet and Machines are very much analogous to the native K8s resources of Deployment, ReplicaSet and Pods respectively. So, in the context of MCM:\n MachineDeployment provides a declarative update for MachineSet and Machines. 
MachineDeployment Controller reconciles the MachineDeployment objects and manages the lifecycle of MachineSet objects. MachineDeployment consumes a provider specific MachineClass in its spec.template.spec, which is the template of the VM spec that would be spawned on the cloud by MCM. MachineSet ensures that the specified number of Machine replicas are running at a given point of time. MachineSet Controller reconciles the MachineSet objects and manages the lifecycle of Machine objects. Machines are the actual VMs running on the cloud platform provided by one of the supported cloud providers. Machine Controller is the controller that actually communicates with the cloud provider to create/update/delete machines on the cloud. There is a Safety Controller responsible for handling the unidentified or unknown behaviours from the cloud providers. Along with the above Custom Controllers and Resources, MCM requires the MachineClass to use K8s Secret that stores cloudconfig (initialization scripts used to create VMs) and cloud specific credentials. Workings of MCM Figure 1: In-Tree Machine Controller Manager In MCM, there are two K8s clusters in the scope — a Control Cluster and a Target Cluster. The Control Cluster is the K8s cluster where the MCM is installed to manage the machine lifecycle of the Target Cluster. In other words, the Control Cluster is the one where the machine-* objects are stored. The Target Cluster is where all the node objects are registered. These clusters can be two distinct clusters or the same cluster, whichever fits.\nWhen a MachineDeployment object is created, the MachineDeployment Controller creates the corresponding MachineSet object. The MachineSet Controller in-turn creates the Machine objects. 
The Machine Controller then talks to the cloud provider API and actually creates the VMs on the cloud.\nThe cloud initialization script that is introduced into the VMs via the K8s Secret consumed by the MachineClasses talks to the KCM (K8s Controller Manager) and creates the node objects. After registering themselves to the Target Cluster, nodes start sending health signals to the machine objects. That is when MCM updates the status of the machine object from Pending to Running.\nMore on Safety Controller Safety Controller contains the following functions:\nOrphan VM Handling It lists all the VMs in the cloud that match the tag of the given cluster name and maps the VMs to the Machine objects using the ProviderID field. VMs without any backing Machine objects are logged and deleted after confirmation. This handler runs every 30 minutes and is configurable via the --machine-safety-orphan-vms-period flag. Freeze Mechanism Safety Controller freezes the MachineDeployment and MachineSet controller if the number of Machine objects goes beyond a certain threshold on top of the Spec.Replicas. It can be configured by the flag --safety-up or --safety-down and also --machine-safety-overshooting-period. Safety Controller freezes the functionality of the MCM if either the target-apiserver or the control-apiserver is not reachable. Safety Controller unfreezes the MCM automatically once the situation returns to normal. A freeze label is applied on MachineDeployment/MachineSet to enforce the freeze condition. Evolution of MCM from In-Tree to Out-of-Tree (OOT) MCM supports declarative management of machines in a K8s Cluster on various cloud providers like AWS, Azure, GCP, AliCloud, OpenStack, Metal-stack, Packet, KubeVirt, VMWare, Yandex. 
It can, of course, be easily extended to support other cloud providers.\nGoing ahead, having the implementation of the Machine Controller Manager supporting too many cloud providers would be too much upkeep from both a development and a maintenance point of view. Which is why the Machine Controller component of MCM has been moved to Out-of-Tree design, where the Machine Controller for each respective cloud provider runs as an independent executable, even though typically packaged under the same deployment.\nFigure 2: Out-Of-Tree (OOT) Machine Controller Manager This OOT Machine Controller will implement a common interface to manage the VMs on the respective cloud provider. Now, while the Machine Controller deals with the Machine objects, the Machine Controller Manager (MCM) deals with higher level objects such as the MachineSet and MachineDeployment objects.\nA lot of contributions are already being made towards an OOT Machine Controller Manager for various cloud providers. Below are the links to the repositories:\n Out of Tree Machine Controller Manager for AliCloud Out of Tree Machine Controller Manager for AWS Out of Tree Machine Controller Manager for Azure Out of Tree Machine Controller Manager for GCP Out of Tree Machine Controller Manager for KubeVirt Out of Tree Machine Controller Manager for Metal Out of Tree Machine Controller Manager for vSphere Out of Tree Machine Controller Manager for Yandex Watch the Out of Tree Machine Controller Manager video on our Gardener Project YouTube channel to understand more about OOT MCM.\nWho Uses MCM? Gardener\nMCM is originally developed and employed by a K8s Control Plane as a Service called Gardener. However, the MCM’s design is elegant enough to be employed when managing the machines of any independent K8s clusters, without having to necessarily associate it with Gardener.\nMetal Stack\nMetal-stack is a set of microservices that implements Metal as a Service (MaaS). 
It enables you to turn your hardware into elastic cloud infrastructure. Metal-stack employs a Machine Controller Manager adapted to its Metal API. Check out an introduction to it in metal-stack - kubernetes on bare metal.\nSky UK Limited\nSky UK Limited (a broadcaster) migrated their Kubernetes node management from Ansible to Machine Controller Manager. Check out the How Sky is using Machine Controller Manager (MCM) and autoscaler video on our Gardener Project YouTube channel.\nAlso, other interesting use cases with MCM are implemented by Kubernetes enthusiasts, who for example adjusted the Machine Controller Manager to provision machines in the cloud to extend a local Raspberry-Pi K3s cluster. This topic is covered in detail in the 2020-07-03 Gardener Community Meeting on our Gardener Project YouTube channel.\nConclusion Machine Controller Manager is the leading automation tool for machine management for, and in, Kubernetes. And the best part is that it is open sourced. It is freely (and easily) usable and extensible, and the community more than welcomes contributions.\nIf you want to know more about Machine Controller Manager or find out about a similar scope for your solutions, feel free to visit the GitHub page machine-controller-manager. We are so excited to see what you achieve with Machine Controller Manager.\n","categories":"","description":"","excerpt":"Kubernetes is a cloud-native enabler built around the principles for a …","ref":"/blog/2021/01.25-machine-controller-manager/","tags":"","title":"Machine Controller Manager"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2020/","tags":"","title":"2020"},{"body":"STACKIT is a digital brand of Europe’s biggest retailer, the Schwarz Group, which consists of Lidl, Kaufland, as well as production and recycling companies. Following the industry trend, the Schwarz Group is in the process of a digital transformation. 
STACKIT enables this transformation by helping to modernize the internal IT of the company branches.\nWhat is STACKIT and the STACKIT Kubernetes Engine (SKE)? STACKIT started with colocation solutions for internal and external customers in Europe-based data centers, which was then expanded to a full cloud platform stack providing an IaaS layer with VMs, storage and network, as well as a PaaS layer including Cloud Foundry and a growing set of cloud services, like databases, messaging, etc.\nWith containers and Kubernetes becoming the lingua franca of the cloud, we are happy to announce the STACKIT Kubernetes Engine (SKE), which has been released as Beta in November this year. We decided to use Gardener as the cluster management engine underneath SKE - for good reasons as you will see – and we would like to share our experiences with Gardener when working on the SKE Beta release, and serve as a testimonial for this technology.\nFigure 1: STACKIT Component Diagram Why We Chose Gardener as a Cluster Management Tool We started with the Kubernetes endeavor in the beginning of 2020 with a newly formed agile team that consisted of software engineers, highly experienced in IT operations and development. After some exploration and a short conceptual phase, we had a clear-cut opinion on what the cluster management for STACKIT should look like: we were looking for a highly customizable tool that could be adapted to the specific needs of STACKIT and the Schwarz Group, e.g. in terms of network setup or the infrastructure layer it should be running on. Moreover, the tool should be scalable to a high number of managed Kubernetes clusters and should therefore provide a fully automated operation experience. As an open source project, contributing and influencing the tool, as well as collaborating with a larger community were important aspects that motivated us. Furthermore, we aimed to offer cluster management as a self-service in combination with an excellent user experience. 
Our objective was to have the managed clusters come with enterprise-grade SLAs – i.e. with “batteries included”, as some say.\nWith this mission, we started our quest through the world of Kubernetes and soon found Gardener to be a hot candidate of cluster management tools that seemed to fulfill our demands. We quickly got in contact and received a warm welcome from the Gardener community. As an interested potential adopter, but in the early days of the COVID-19 lockdown, we managed to organize an online workshop during which we got an introduction and deep dive into Gardener and discussed the STACKIT use cases. We learned that Gardener is extensible in many dimensions, and that contributions are always welcome and encouraged. Once we understood the basic Gardener concepts of Garden, Shoot and Seed clusters, its inception design and how this extends Kubernetes concepts in a natural way, we were eager to evaluate this tool in more detail.\nAfter this evaluation, we were convinced that this tool fulfilled all our requirements - a decision was made and off we went.\nHow Gardener was Adapted and Extended by SKE After becoming familiar with Gardener, we started to look into its code base to adapt it to the specific needs of the STACKIT OpenStack environment. Changes and extensions were made in order to get it integrated into the STACKIT environment, and whenever reasonable, we contributed those changes back:\n To run smoothly with the STACKIT OpenStack layer, the Gardener configuration was adapted in different places, e.g. to support CSI driver or to configure the domains of a shoot API server or ingress. Gardener was extended to support shoots and shooted seeds in dual stack and dual home setup. This is used in SKE for the communication between shooted seeds and the Garden cluster. SKE uses a private image registry for the Gardener installation in order to resolve dependencies to public image registries and to have more control over the used Gardener versions. 
To install and run Gardener with the private image registry, some new configurations need to be introduced into Gardener. Gardener is a first-class API based service, which allowed us to smoothly integrate it into the STACKIT User Interface. We were also able to jump-start and utilize the Gardener Dashboard for our Beta release by merely adjusting the look-\u0026-feel, i.e. colors, labels and icons. Figure 2: Gardener Dashboard adapted to STACKIT UI style Experience with Gardener Operations As no OpenStack installation is identical to another, getting Gardener to run stably on the STACKIT IaaS layer revealed some operational challenges. For instance, it was challenging to find the right configuration for Cinder CSI.\nTo test for its resilience, we tried to break the managed clusters with a Chaos Monkey test, e.g. by deleting services or components needed by Kubernetes and Gardener to work properly. The reconciliation feature of Gardener fixed all those problems automatically, so that damaged Shoot clusters became operational again after a short period of time. Thus, we were not able to break Shoot clusters from an end user perspective permanently, despite our efforts. Which again speaks for Gardener’s first-class cloud native design.\nWe also benefited from fruitful community support: For several challenges we contacted the community channel and help was provided in a timely manner. A lesson learned was that raising an issue in the community early on, before getting stuck too long on your own with an unresolved problem, is essential and efficient.\nSummary Gardener is used by SKE to provide a managed Kubernetes offering for internal use cases of the Schwarz Group as well as for the public cloud offering of STACKIT. Thanks to Gardener, it was possible to get from zero to a Beta release in only about half a year’s time – this speaks for itself. Within this period, we were able to integrate Gardener into the STACKIT environment, i.e. 
in its OpenStack IaaS layer, its management tools and its identity provisioning solution.\nGardener has become a vital building block in STACKIT’s cloud native platform offering. For the future, the possibility to manage clusters also on other infrastructures and hyperscalers is seen as another great opportunity for extended use cases. The open co-innovation exchange with the Gardener community member companies has also opened the door to commercial co-operation.\n","categories":"","description":"","excerpt":"STACKIT is a digital brand of Europe’s biggest retailer, the Schwarz …","ref":"/blog/2020/12.03-stackit-kubernetes-engine-with-gardener/","tags":"","title":"STACKIT Kubernetes Engine with Gardener"},{"body":"Prerequisites Please read the following background material on Authenticating.\nOverview Kubernetes on its own doesn’t provide any user management. In other words, users aren’t managed through Kubernetes resources. Whenever you refer to a human user it’s sufficient to use a unique ID, for example, an email address. Nevertheless, Gardener project owners can use an identity provider to authenticate user access for shoot clusters in the following way:\n Configure an Identity Provider using OpenID Connect (OIDC). Configure a local kubectl oidc-login to enable oidc-login. Configure the shoot cluster to share details of the OIDC-compliant identity provider with the Kubernetes API Server. Authorize an authenticated user using role-based access control (RBAC). Verify the result Note Gardener allows administrators to modify aspects of the control plane setup. It gives administrators full control of how the control plane is parameterized. While this offers much flexibility, administrators need to ensure that they don’t configure a control plane that goes beyond the service level agreements of the responsible operators team. Configure an Identity Provider Create a tenant in an OIDC compatible Identity Provider. 
For simplicity, we use Auth0, which has a free plan.\n In your tenant, create a client application to use authentication with kubectl:\n Provide a Name, choose Native as application type, and choose CREATE.\n In the tab Settings, copy the following parameters to a local text file:\n Domain\nCorresponds to the issuer in OIDC. It must be an https-secured endpoint (Auth0 requires a trailing / at the end). For more information, see Issuer Identifier.\n Client ID\n Client Secret\n Configure the client to have a callback URL of http://localhost:8000. This callback connects to your local kubectl oidc-login plugin:\n Save your changes.\n Verify that https://\u003cAuth0 Domain\u003e/.well-known/openid-configuration is reachable.\n Choose Users \u0026 Roles \u003e Users \u003e CREATE USERS to create a user with a username and password:\n Note Users must have a verified email address. Configure a Local kubectl oidc-login Install the kubectl plugin oidc-login. We highly recommend the krew installation tool, which also makes other plugins easily available.\nkubectl krew install oidc-login The response looks like this:\nUpdated the local copy of plugin index. Installing plugin: oidc-login CAVEATS: \\ | You need to setup the OIDC provider, Kubernetes API server, role binding and kubeconfig. | See https://github.com/int128/kubelogin for more. / Installed plugin: oidc-login Prepare a kubeconfig for later use:\ncp ~/.kube/config ~/.kube/config-oidc Modify the configuration of ~/.kube/config-oidc as follows:\napiVersion: v1 kind: Config ... contexts: - context: cluster: shoot--project--mycluster user: my-oidc name: shoot--project--mycluster ... 
users: - name: my-oidc user: exec: apiVersion: client.authentication.k8s.io/v1beta1 command: kubectl args: - oidc-login - get-token - --oidc-issuer-url=https://\u003cIssuer\u003e/ - --oidc-client-id=\u003cClient ID\u003e - --oidc-client-secret=\u003cClient Secret\u003e - --oidc-extra-scope=email,offline_access,profile To test our OIDC-based authentication, the context shoot--project--mycluster of ~/.kube/config-oidc is used in a later step. For now, continue to use the configuration ~/.kube/config with administration rights for your cluster.\nConfigure the Shoot Cluster Modify the shoot cluster YAML as follows, using the client ID and the domain (as issuer) from the settings of the client application you created in Auth0:\nkind: Shoot apiVersion: garden.sapcloud.io/v1beta1 metadata: name: mycluster namespace: garden-project ... spec: kubernetes: kubeAPIServer: oidcConfig: clientID: \u003cClient ID\u003e issuerURL: \"https://\u003cIssuer\u003e/\" usernameClaim: email This change of the Shoot manifest triggers a reconciliation. Once the reconciliation is finished, your OIDC configuration is applied. It doesn’t invalidate other certificate-based authentication methods. Wait for Gardener to reconcile the change. It can take up to 5 minutes.\nAuthorize an Authenticated User In Auth0, you created a user with a verified email address, test@test.com in our example. 
For simplicity, we authorize a single user identified by this email address with the cluster role view:\napiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: viewer-test roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: view subjects: - apiGroup: rbac.authorization.k8s.io kind: User name: test@test.com As administrator, apply the cluster role binding in your shoot cluster.\nVerify the Result To step into the shoes of your user, use the prepared kubeconfig file ~/.kube/config-oidc, and switch to the context that uses oidc-login:\ncd ~/.kube export KUBECONFIG=$(pwd)/config-oidc kubectl config use-context shoot--project--mycluster kubectl delegates the authentication to plugin oidc-login the first time the user uses kubectl to contact the API server, for example:\nkubectl get all The plugin opens a browser for an interactive authentication session with Auth0, and in parallel serves a local webserver for the configured callback.\n Enter your login credentials.\nYou should get a successful response from the API server:\nOpening in existing browser session. NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 100.64.0.1 \u003cnone\u003e 443/TCP 86m Note After a successful login, kubectl uses a token for authentication so that you don’t have to provide user and password for every new kubectl command. How long the token is valid can be configured. If you want to log in again earlier, reset plugin oidc-login:\n Delete directory ~/.kube/cache/oidc-login. Delete the browser cache. 
To see if your user uses the cluster role view, do some checks with kubectl auth can-i.\n The response for the following commands should be no:\nkubectl auth can-i create clusterrolebindings kubectl auth can-i get secrets kubectl auth can-i describe secrets The response for the following commands should be yes:\nkubectl auth can-i list pods kubectl auth can-i get pods If the last step is successful, you’ve configured your cluster to authenticate against an identity provider using OIDC.\nRelated Links Auth0 Pricing ","categories":"","description":"Use OpenID Connect to authenticate users to access shoot clusters","excerpt":"Use OpenID Connect to authenticate users to access shoot clusters","ref":"/docs/guides/administer-shoots/oidc-login/","tags":"","title":"Authenticating with an Identity Provider"},{"body":"Dear community, we’re happy to announce a new minor release of Gardener, in fact, the 16th in 2020! v1.13 came out just today after a couple of weeks of code improvements and feature implementations. As usual, this blog post provides brief summaries for the most notable changes that we introduce with this version. Behind the scenes (and not explicitly highlighted below) we are progressing on internal code restructurings and refactorings to ease further extensions and to enhance development productivity. Speaking of those: You might be interested in watching the recording of the last Gardener Community Meeting which includes a detailed session for v2 of Terraformer, a complete rewrite in Golang, and improved state handling.\nNotable Changes in v1.13 The main themes of Gardener’s v1.13 release are increments for feature gate promotions, scalability and robustness, and cleanups and refactorings. 
The community plans to continue on those and wants to deliver at least one more release in 2020.\nAutomatic Quotas for Gardener Resources (gardener/gardener#3072) Gardener has supported ResourceQuotas since the last release; however, it was still up to operators/administrators to create these objects in project namespaces. Obviously, in large Gardener installations with thousands of projects, this is quite a challenging task. With this release, we are shipping an improvement in the Project controller in the gardener-controller-manager that allows operators to automatically create ResourceQuotas based on configuration. Operators can distinguish via project label selectors which default quotas shall be defined for various projects. Please find more details at Gardener Controller Manager!\nResource Capacity and Reservations for Seeds (gardener/gardener#3075) The larger the Gardener landscape, the more seed clusters you require. Naturally, they have limits on how many shoots they can accommodate (based on constraints of the underlying infrastructure provider and/or seed cluster configuration). Until this release, there were no means to prevent a seed cluster from becoming overloaded (and potentially dying due to this load). Now you can define resource capacity and reservations in the gardenlet’s component configuration, similar to how the kubelet announces allocatable resources for Node objects. We are defaulting this to 250 shoots, but you might want to adapt this value for your own environment.\nDistributed Gardenlet Rollout for Shooted Seeds (gardener/gardener#3135) With the same motivation, i.e., to better cater to large landscapes, we allow operators to configure distributed rollouts of gardenlets for shooted seeds. When a new Gardener version is being deployed in landscapes with a high number of shooted seeds, gardenlets of earlier versions were immediately re-deploying copies of themselves into the shooted seeds they manage. 
This leads to a large number of new gardenlet pods that all roughly start at the same time. Depending on the size of the landscape, this may trouble the gardener-apiservers as all of them are starting to fill their caches and create watches at the same time. By default, this rollout is now randomized within a 5m time window, i.e., it may take up to 5m until all gardenlets in all seeds have been updated.\nProgressing on Beta-Promotion for APIServerSNI Feature Gate (gardener/gardener#3082, gardener/gardener#3143) The alpha APIServerSNI feature will drastically reduce the costs for load balancers in the seed clusters, thus, it is effectively contributing to Gardener’s “minimal TCO” goal. In this release we are introducing an important improvement that optimizes the connectivity when pods talk to their control plane by avoiding an extra network hop. This is realized by a MutatingWebhookConfiguration whose server runs as a sidecar container in the kube-apiserver pod in the seed (only when the APIServerSNI feature gate is enabled). The webhook injects a KUBERNETES_SERVICE_HOST environment variable into pods in the shoot which prevents the additional network hop to the apiserver-proxy on all worker nodes. You can read more about it in APIServerSNI environment variable injection.\nMore Control Plane Configurability (gardener/gardener#3141, gardener/gardener#3139) A main capability beloved by Gardener users is its openness when it comes to configurability and fine-tuning of the Kubernetes control plane components. Most managed Kubernetes offerings are not exposing options of the master components, but Gardener’s Shoot API offers a selected set of settings. With this release we are allowing to change the maximum number of (non-)mutating requests for the kube-apiserver of shoot clusters. 
Similarly, the grace period before deleting pods on failed nodes can now be fine-grained for the kube-controller-manager.\nImproved Project Resource Handling (gardener/gardener#3137, gardener/gardener#3136, gardener/gardener#3179) Projects are an important resource in the Gardener ecosystem as they enable collaboration with team members. A couple of improvements have landed into this release. Firstly, duplicates in the member list were not validated so far. With this release, the gardener-apiserver is automatically merging them, and in future releases requests with duplicates will be denied. Secondly, specific Projects may now be excluded from the stale checks if desired. Lastly, namespaces for Projects that were adopted (i.e., those that exist before the Project already) will now no longer be deleted when the Project is being deleted. Please note that this only applies for newly created Projects.\nRemoval of Deprecated Labels and Annotations (gardener/gardener#3094) The core.gardener.cloud API group succeeded the old garden.sapcloud.io API group in the beginning of 2020, however, a lot of labels and annotations with the old API group name were still supported. We have continued with the process of removing those deprecated (but replaced with the new API group name) names. Concretely, the project labels garden.sapcloud.io/role=project and project.garden.sapcloud.io/name=\u003cproject-name\u003e are no longer supported now. Similarly, the shoot.garden.sapcloud.io/use-as-seed and shoot.garden.sapcloud.io/ignore-alerts annotations got deleted. We are not finished yet, but we do small increments and plan to progress on the topic until we finally get rid of all artifacts with the old API group name.\nNodeLocalDNS Network Policy Rules Adapted (gardener/gardener#3184) The alpha NodeLocalDNS feature was already introduced and explained with Gardener v1.8 with the motivation to overcome certain bottlenecks with the horizontally auto-scaled CoreDNS in all shoot clusters. 
Unfortunately, due to a bug in the network policy rules, it was not working in all environments. We have fixed this one now, so it should be ready for further tests and investigations. Come give it a try!\nPlease bear in mind that this blog post only highlights the most noticeable changes and improvements, but there is a whole bunch more, including a ton of bug fixes in older versions! Come check out the full release notes and share your feedback in our #gardener Slack channel!\n","categories":"","description":"","excerpt":"Dear community, we’re happy to announce a new minor release of …","ref":"/blog/2020/11.23-gardener-v1.13-released/","tags":"","title":"Gardener v1.13 Released"},{"body":" This is a guest commentary from metal-stack.\nmetal-stack is software that provides an API for provisioning and managing physical servers in the data center. To categorize this product, the terms “Metal-as-a-Service” (MaaS) or “bare metal cloud” are commonly used. One reason that you stumbled upon this blog post could be that you saw errors like the following in your ETCD instances:\netcd-main-0 etcd 2020-09-03 06:00:07.556157 W | etcdserver: read-only range request \"key:\\\"/registry/deployments/shoot--pwhhcd--devcluster2/kube-apiserver\\\" \" with result \"range_response_count:1 size:9566\" took too long (13.95374909s) to execute As it turns out, 14 seconds is way too slow for running Kubernetes API servers. It makes them go into a crash loop (leader election fails). Even worse, this whole thing is self-amplifying: The longer a response takes, the more requests queue up, leading to response times increasing further and further. The system is very unlikely to recover. 😞\nOn GitHub, you can easily find the reason for this problem. Most probably your disks are too slow (see etcd-io/etcd#10860). So, if you are (like in our case) on GKE and run your ETCD on their default persistent volumes, consider moving from standard disks to SSDs and the error messages should disappear. 
A guide on how to use SSD volumes on GKE can be found at Using SSD persistent disks.\nCase closed? Well. For some people it might be. But when you are seeing this in your Gardener infrastructure, it’s likely that there is something going wrong. ETCD management is fully handled by Gardener, which makes the problem a bit more interesting to look at. This blog post strives to cover topics such as:\n Gardener operating principles Gardener architecture and ETCD management Pitfalls with multi-cloud environments Migrating GCP volumes to a new storage class We at metal-stack learned quite a lot about the capabilities of Gardener through this problem. We are happy to share this experience with a broader audience. Gardener adopters and operators, read on.\nHow Gardener Manages ETCDs In our infrastructure, we use Gardener to provision Kubernetes clusters on bare metal machines in our own data centers using metal-stack. Even if the entire stack could be running on-premise, our initial seed cluster and the metal control plane are hosted on GKE. This way, we do not need to manage a single Kubernetes cluster in our entire landscape manually. As soon as we have Gardener deployed on this initial cluster, we can spin up further Seeds in our own data centers through the concept of ManagedSeeds.\nTo make this easier to understand, let us give you a simplified picture of what our Gardener production setup looks like:\nFigure 1: Simplified View on Our Production Setup For every shoot cluster, Gardener deploys an individual, standalone ETCD as a stateful set into a shoot namespace. The deployment of the ETCD stateful set is managed by a controller called etcd-druid, which reconciles a special resource of the kind etcds.druid.gardener.cloud. This Etcd resource is getting deployed during the shoot provisioning flow in the gardenlet.\nFor failure-safety, the etcd-druid deploys the official ETCD container image along with a sidecar project called etcd-backup-restore. 
The sidecar automatically takes backups of the ETCD and stores them at a cloud provider, e.g. in S3 Buckets, Google Buckets, or similar. In case the ETCD comes up without data or with corrupted data, the sidecar looks into the backup buckets and automatically restores the latest backup before ETCD starts up. This approach spares operators the pain of manually restoring data in the event of data loss.\nNote We found the etcd-backup-restore project very intriguing. It was the inspiration for us to come up with a similar sidecar for the databases we use with metal-stack. This project is called backup-restore-sidecar. It currently supports postgres and rethinkdb databases, with more to come. Feel free to check it out if you are interested. As it’s the nature for multi-cloud applications to act upon a variety of cloud providers, with a single installation of Gardener, it is easily possible to spin up new Kubernetes clusters not only on GCP, but on other supported cloud platforms, too.\nWhen the Gardenlet deploys a resource like the Etcd resource into a shoot namespace, a provider-specific extension-controller has the chance to manipulate it through a mutating webhook. This way, a cloud provider can adjust the generic Gardener resource to fit the provider-specific needs. For every cloud that Gardener supports, there is such an extension-controller. For metal-stack, we also maintain one, called gardener-extension-provider-metal.\nNote A side note for cloud providers: By now, new cloud providers can be added fully out-of-tree, i.e. without touching any of Gardener’s sources. This works through API extensions and CRDs. Gardener handles generic resources and backpacks provider-specific configuration through raw extensions. If you are a cloud provider yourself, this is really encouraging because you can integrate with Gardener without any burdens. 
You can find documentation on how to integrate your cloud into Gardener at Adding Cloud Providers and Extensibility Overview. The Mistake Is in the Deployment This section contains code examples from Gardener v1.8. Now that we know how the ETCDs are managed by Gardener, we can come back to the original problem from the beginning of this article. It turned out that the real problem was a misconfiguration in our deployment. Gardener actually does use SSD-backed storage on GCP for ETCDs by default. During reconciliation, the gardener-extension-provider-gcp deploys a storage class called gardener.cloud-fast that enables accessing SSDs on GCP.\nBut for some reason, in our cluster we did not find such a storage class. Even more interestingly, we did not use the gardener-extension-provider-gcp for any shoot reconciliation, only for ETCD backup purposes. And that was the big mistake we made: We reconciled the shoot control plane completely with gardener-extension-provider-metal even though our initial Seed actually runs on GKE and specific parts of the shoot control plane should be reconciled by the GCP extension-controller instead!\nThis is what the initial Seed resource looked like:\napiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: initial-seed spec: ... provider: region: gke type: metal ... ... Surprisingly, this configuration was working pretty well for a long time. The initial seed properly produced the Kubernetes control planes of our managed seeds that looked like this:\n$ kubectl get controlplanes.extensions.gardener.cloud NAME TYPE PURPOSE STATUS AGE fra-equ01 metal Succeeded 85d fra-equ01-exposure metal exposure Succeeded 85d And this is another interesting observation: There are two ControlPlane resources. One regular resource and one with an exposure purpose. Gardener distinguishes between the two types for exactly this case: environments where the shoot control plane runs on a different cloud provider than the Kubernetes worker nodes. 
The regular ControlPlane resource gets reconciled by the provider configured in the Shoot resource, and the exposure type ControlPlane by the provider configured in the Seed resource.\nWith the existing configuration the gardener-extension-provider-gcp does not kick in and hence, it neither deploys the gardener.cloud-fast storage class nor does it mutate the Etcd resource to point to it. And in the end, we are left with ETCD volumes using the default storage class (which is what we do for ETCD stateful sets in the metal-stack seeds, because our default storage class uses csi-lvm that writes into logical volumes on the SSD disks in our physical servers).\nThe correction we had to make was a one-liner: Setting the provider type of the initial Seed resource to gcp.\n$ kubectl get seed initial-seed -o yaml apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: initial-seed spec: ... provider: region: gke type: gcp # \u003c-- here ... ... This change moved over the control plane exposure reconciliation to the gardener-extension-provider-gcp:\n$ kubectl get -n \u003cshoot-namespace\u003e controlplanes.extensions.gardener.cloud NAME TYPE PURPOSE STATUS AGE fra-equ01 metal Succeeded 85d fra-equ01-exposure gcp exposure Succeeded 85d And boom, after some time of waiting for all sorts of magic reconciliations taking place in the background, the missing storage class suddenly appeared:\n$ kubectl get sc NAME PROVISIONER gardener.cloud-fast kubernetes.io/gce-pd standard (default) kubernetes.io/gce-pd Also, the Etcd resource was now configured properly to point to the new storage class:\n$ kubectl get -n \u003cshoot-namespace\u003e etcd etcd-main -o yaml apiVersion: druid.gardener.cloud/v1alpha1 kind: Etcd metadata: ... name: etcd-main spec: ... storageClass: gardener.cloud-fast # \u003c-- was pointing to default storage class before! volumeClaimTemplate: main-etcd ... Note Only the etcd-main storage class gets changed to gardener.cloud-fast. 
The etcd-events configuration will still point to standard disk storage because this ETCD is much less occupied as compared to the etcd-main stateful set. The Migration Now that the deployment was in place such that this mistake would not repeat in the future, we still had the ETCDs running on the default storage class. The reconciliation does not delete the existing persistent volumes (PVs) on its own.\nTo bring production back up quickly, we temporarily moved the ETCD pods to other nodes in the GKE cluster. These were nodes which were less occupied, such that the disk throughput was a little higher than before. But surely that was not a final solution.\nFor a proper solution we had to move the ETCD data out of the standard disk PV into an SSD-based PV.\nEven though we had the etcd-backup-restore sidecar, we did not want to fully rely on the restore mechanism to do the migration. The backup should only be there for emergency situations when something goes wrong. Thus, we came up with another approach to introduce the SSD volume: GCP disk snapshots. This is how we did the migration:\n Scale down etcd-druid to zero in order to prevent it from disturbing your migration Scale down the kube-apiservers deployment to zero, then wait for the ETCD stateful set to take another clean snapshot Scale down the ETCD stateful set to zero as well (in order to prevent Gardener from trying to bring up the downscaled resources, we used small shell constructs like while true; do kubectl scale deploy etcd-druid --replicas 0 -n garden; sleep 1; done) Take a disk snapshot in GCP from the volume that is referenced by the ETCD PVC Create a new disk in GCP from the snapshot on an SSD disk Delete the existing PVC and PV of the ETCD (oops, data is now gone!) 
Manually deploy a PV into your Kubernetes cluster that references this new SSD disk Manually deploy a PVC with the name of the original PVC and let it reference the PV that you have just created Scale up the ETCD stateful set and check that ETCD is running properly (if something went terribly wrong, you still have the backup from the etcd-backup-restore sidecar, delete the PVC and PV again and let the sidecar bring up ETCD instead) Scale up the kube-apiserver deployment again Scale up etcd-druid again (stop your shell hacks ;D) This approach worked very well for us and we were able to fix our production deployment issue. And the result: We have never seen any crashing kube-apiservers again. 🎉\nConclusion As bad as problems in production are, they are the best way to learn from your mistakes. For new users of Gardener it can be pretty overwhelming to understand the rich configuration possibilities that Gardener brings. However, once you get the hang of how Gardener works, the application offers an exceptional versatility that makes it very much suitable for production use-cases like ours.\nThis example has shown how Gardener:\n Can handle arbitrary layers of infrastructure hosted by different cloud providers. Allows provider-specific tweaks to gain ideal performance for every cloud you want to support. Leverages Kubernetes core principles across the entire project architecture, making it vastly extensible and resilient. Brings useful disaster recovery mechanisms to your infrastructure (e.g. with etcd-backup-restore). We hope that you could take away something new through this blog post. With this article we also want to thank the SAP Gardener team for helping us to integrate Gardener with metal-stack. It’s been a great experience so far. 
😄 😍\n","categories":"","description":"In this case study, our friends from metal-stack lead you through their journey of migrating Gardener ETCD volumes in their production environment.","excerpt":"In this case study, our friends from metal-stack lead you through …","ref":"/blog/2020/11.20-case-study-migrating-etcd-volumes-in-production/","tags":"","title":"Case Study: Migrating ETCD Volumes in Production"},{"body":"Two months after our last Gardener release update, we are happy again to present release v1.11 and v1.12 in this blog post. Control plane migration, load balancer consolidation, and new security features are just a few topics we progressed with. As always, a detailed list of features, improvements, and bug fixes can be found in the release notes of each release. If you are going to update from a previous Gardener version, please take the time to go through the action items in the release notes.\nNotable Changes in v1.12 Release v1.12, fresh from the oven, is shipped with plenty of improvements, features, and some API changes we want to pick up in the next sections.\nDrop Functionless DNS Providers (gardener/gardener#3036) This release drops the support for the so-called functionless DNS providers. Those are providers in a shoot’s specification (.spec.dns.providers) which don’t serve the shoot’s domain (.spec.dns.domain), but are created by Gardener in the seed cluster to serve DNS requests coming from the shoot cluster. If such providers don’t specify a type or secretName, the creation or update request for the corresponding shoot is denied.\nSeed Taints (gardener/gardener#2955) In an earlier release, we reserved a dedicated section in seed.spec.settings as a replacement for disable-capacity-reservation, disable-dns, invisible taints. These already deprecated taints were still considered and synced, which gave operators enough time to switch their integration to the new settings field. 
As of version v1.12, support for them has been discontinued and they are automatically removed from seed objects. You may use the actual taint names in a future release of Gardener again.\nLoad Balancer Events During Shoot Reconciliation (gardener/gardener#3028) As Gardener is capable of managing thousands of clusters, it is crucial to keep operation efforts at a minimum. This release demonstrates this endeavor by further improving error reporting to the end user. During a shoot’s reconciliation, Gardener creates Services of type LoadBalancer in the shoot cluster, e.g. for the VPN or the Nginx-Ingress addon, and waits for a successful creation. However, in the past we experienced that issues caused by the party creating the load balancer (typically Cloud-Controller-Manager) were only exposed in the logs or as events. Gardener now fetches these event messages and propagates them to the shoot status in case of a failure. Users can then often fix the problem themselves, if for example the failure discloses an exhausted quota on the cloud provider.\nKonnectivityTunnel Feature per Shoot (gardener/gardener#3007) Since release v1.6, Gardener has been capable of reversing the tunnel direction from the seed to the shoot via the KonnectivityTunnel feature gate. With this release we make it possible to control the feature per shoot. We recommend enabling the KonnectivityTunnel selectively, as it is still in alpha state.\nReference Protection (gardener/gardener#2771, gardener/gardener 1708419) Shoot clusters may refer to external objects, like Secrets for specified DNS providers or an audit policy ConfigMap. Deleting those objects while any shoot still references them causes server errors, often only recoverable by an immense amount of manual operations effort. To prevent such scenarios, Gardener now adds a new finalizer gardener.cloud/reference-protection to these objects and removes it as soon as the object itself becomes releasable. 
Due to compatibility reasons, we decided that the handling for the audit policy ConfigMaps is delivered as an opt-in feature first, so please familiarize yourself with the necessary settings in the Gardener Controller Manager component config if you already plan to enable it.\nSupport for Resource Quotas (gardener/gardener#2627) After the Kubernetes upstream change (kubernetes/kubernetes#93537) for externalizing the backing admission plugin has been accepted, we are happy to announce the support of ResourceQuotas for Gardener offered resource kinds. ResourceQuotas allow you to specify a maximum number of objects per namespace, especially for end-user objects like Shoots or SecretBindings in a project namespace. Even though the admission plugin is enabled by default in the Gardener API Server, make sure the Kube Controller Manager runs the resourcequota controller as well.\nWatch Out Developers, Terraformer v2 is Coming! (gardener/gardener#3034) Although not related only to Gardener core, the preparation towards Terraformer v2 in the extensions library is still an important milestone to mention. With Terraformer v2, Gardener extensions using Terraform scripts will benefit from great consistency improvements. Please check out PR #3034, which demonstrates necessary steps to transition to Terraformer v2 as soon as it’s released.\nNotable Changes in v1.11 The Gardener community worked eagerly to deliver plenty of improvements with version v1.11. Those help us to further progress with topics like control plane migration, which is actively being worked on, or to harden our load balancer consolidation (APIServerSNI) feature. 
Besides improvements and fixes (full list available in release notes), this release contains major features as well, and we don’t want to miss a chance to walk you through them.\nGardener Admission Controller (gardener/gardener#2832), (gardener/gardener#2781) In this release, all admission related HTTP handlers moved from the Gardener Controller Manager (GCM) to the new component Gardener Admission Controller. The admission controller is rather a small component as opposed to GCM with regards to memory footprint and CPU consumption, and thus allows you to run multiple replicas of it much cheaper than it was before. We certainly recommend specifying the admission controller deployment with more than one replica, since it reduces the odds of a system-wide outage and increases the performance of your Gardener service.\nBesides the already known Namespace and Kubeconfig Secret validation, a new admission handler Resource-Size-Validator was added to the admission controller. It allows operators to restrict the size for all kinds of Kubernetes objects, especially sent by end-users to the Kubernetes or Gardener API Server. We address a security concern with this feature to prevent denial of service attacks in which an attacker artificially increases the size of objects to exhaust your object store, API server caches, or to let Gardener and Kubernetes controllers run out-of-memory. The documentation reveals an approach of finding the right resource size for your setup and why you should create exceptions for technical users and operators.\nDeferring Shoot Progress Reporting (gardener/gardener#2909) Shoot progress reporting is the continuous update process of a shoot’s .status.lastOperation field while the shoot is being reconciled by Gardener. Many steps are involved during reconciliation and depending on the size of your setup, the updates might become an issue for the Gardener API Server, which will refrain from processing further requests for a certain period. 
With .controllers.shoot.progressReportPeriod in Gardenlet’s component configuration, you can now delay these updates for the specified period.\nNew Policy for Controller Registrations (gardener/gardener#2896) A while ago, we added support for different policies in ControllerRegistrations which determine under which circumstances the deployments of registration controllers happen in affected seed clusters. If you specify the new policy AlwaysExceptNoShoots, the respective extension controller will be deployed to all seed clusters hosting at least one shoot cluster. After all shoot clusters from a seed are gone, the extension deployment will be deleted again. A full list of supported policies can be found at Registering Extension Controllers.\n","categories":"","description":"","excerpt":"Two months after our last Gardener release update, we are happy again …","ref":"/blog/2020/11.04-gardener-v1.11-and-v1.12-released/","tags":"","title":"Gardener v1.11 and v1.12 Released"},{"body":"The Gardener team is happy to announce that Gardener now offers support for an additional, often requested, infrastructure/virtualization technology, namely KubeVirt! Gardener can now provide Kubernetes-conformant clusters using KubeVirt managed Virtual Machines in the environment of your choice. This integration has been tested and works with any qualified Kubernetes (provider) cluster that is compatibly configured to host the required KubeVirt components, for example Red Hat OpenShift Virtualization.\nGardener enables Kubernetes consumers to centralize and efficiently operate homogeneous Kubernetes clusters across different IaaS providers and even private environments. This way the same cloud-based application version can be hosted and operated by its vendor or consumer on a variety of infrastructures. When a new customer or your development team asks for a new infrastructure provider, Gardener helps you to quickly and easily on-board your workload. 
Furthermore, on this new infrastructure, Gardener keeps the seamless Kubernetes management experience for your Kubernetes operators, while upholding the consistency of the CI/CD pipeline of your software development team.\nArchitecture and Workflow Gardener is based on the idea of three types of clusters – Garden cluster, Seed cluster and Shoot cluster (see Figure 1). The Garden cluster is used to control the entire Kubernetes environment centrally in a highly scalable design. The highly available seed clusters are used to host the end users (shoot) clusters’ control planes. Finally, the shoot clusters consist only of worker nodes to host the cloud native applications.\nFigure 1: Gardener Architecture An integration of the Gardener open source project with a new cloud provider follows a standard Gardener extensibility approach. The integration requires two new components: a provider extension and a Machine Controller Manager (MCM) extension. Both components together enable Gardener to instruct the new cloud provider. They run in the Gardener seed clusters that host the control planes of the shoots based on that cloud provider. The role of the provider extension is to manage the provider-specific aspects of the shoot clusters’ lifecycle, including infrastructure, control plane, worker nodes, and others. It works in cooperation with the MCM extension, which in particular is responsible to handle machines that are provisioned as worker nodes for the shoot clusters. To get this job done, the MCM extension leverages the VM management/API capabilities available with the respective cloud provider.\nSetting up a Kubernetes cluster always involves a flow of interdependent steps (see Figure 2), beginning with the generation of certificates and preparation of the infrastructure, continuing with the provisioning of the control plane and the worker nodes, and ending with the deployment of system components. 
Gardener can be configured to utilize the KubeVirt extensions in its generic workflow at the right extension points, and deliver the desired outcome of a KubeVirt backed cluster.\nFigure 2: Generic cluster reconciliation flow with extension points Gardener Integration with KubeVirt in Detail Integration with KubeVirt follows the Gardener extensibility concept and introduces the two new components mentioned above: the KubeVirt Provider Extension and the KubeVirt Machine Controller Manager (MCM) Extension.\nFigure 3: Gardener integration with KubeVirt The KubeVirt Provider Extension consists of three separate controllers that handle the infrastructure, the control plane, and the worker nodes of the shoot cluster, respectively.\nThe Infrastructure Controller configures the network communication between the shoot worker nodes. By default, shoot worker nodes only use the provider cluster’s pod network. To achieve a higher level of network isolation and better performance, it is possible to add more networks and replace the default pod network with a different network using container network interface (CNI) plugins available in the provider cluster. This is currently based on Multus CNI and NetworkAttachmentDefinitions.\nExample infrastructure configuration in a shoot definition:\nprovider: type: kubevirt infrastructureConfig: apiVersion: kubevirt.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: tenantNetworks: - name: network-1 config: |{ \"cniVersion\": \"0.4.0\", \"name\": \"bridge-firewall\", \"plugins\": [ { \"type\": \"bridge\", \"isGateway\": true, \"isDefaultGateway\": true, \"ipMasq\": true, \"ipam\": { \"type\": \"host-local\", \"subnet\": \"10.100.0.0/16\" } }, { \"type\": \"firewall\" } ] } default: true The Control Plane Controller deploys a Cloud Controller Manager (CCM). This is a Kubernetes control plane component that embeds cloud-specific control logic. 
As any other CCM, it runs the Node controller that is responsible for initializing Node objects, annotating and labeling them with cloud-specific information, obtaining the node’s hostname and IP addresses, and verifying the node’s health. It also runs the Service controller that is responsible for setting up load balancers and other infrastructure components for Service resources that require them.\nFinally, the Worker Controller is responsible for managing the worker nodes of the Gardener shoot clusters.\nExample worker configuration in a shoot definition:\nprovider: type: kubevirt workers: - name: cpu-worker minimum: 1 maximum: 2 machine: type: standard-1 image: name: ubuntu version: \"18.04\" volume: type: default size: 20Gi zones: - europe-west1-c For more information about configuring the KubeVirt Provider Extension as an end-user, see Using the KubeVirt provider extension with Gardener as end-user.\nEnabling Your Gardener Setup to Leverage a KubeVirt Compatible Environment The very first step required is to define the machine types (VM types) for VMs that will be available. This is achieved via the CloudProfile custom resource. 
The machine types configuration includes details such as CPU, GPU, memory, OS image, and more.\nExample CloudProfile custom resource:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: kubevirt spec: type: kubevirt providerConfig: apiVersion: kubevirt.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: ubuntu versions: - version: \"18.04\" sourceURL: \"https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img\" kubernetes: versions: - version: \"1.18.5\" machineImages: - name: ubuntu versions: - version: \"18.04\" machineTypes: - name: standard-1 cpu: \"1\" gpu: \"0\" memory: 4Gi volumeTypes: - name: default class: default regions: - name: europe-west1 zones: - name: europe-west1-b - name: europe-west1-c - name: europe-west1-d Once a machine type is defined, it can be referenced in shoot definitions. This information is used by the KubeVirt Provider Extension to generate MachineDeployment and MachineClass custom resources required by the KubeVirt MCM extension for managing the worker nodes of the shoot clusters during the reconciliation process.\nFor more information about configuring the KubeVirt Provider Extension as an operator, see Using the KubeVirt provider extension with Gardener as operator.\nKubeVirt Machine Controller Manager (MCM) Extension The KubeVirt MCM Extension is responsible for managing the VMs that are used as worker nodes of the Gardener shoot clusters using the virtualization capabilities of KubeVirt. This extension handles all necessary lifecycle management activities, such as machines creation, fetching, updating, listing, and deletion.\nThe KubeVirt MCM Extension implements the Gardener’s common driver interface for managing VMs in different cloud providers. 
As already mentioned, the KubeVirt MCM Extension uses MachineDeployments and MachineClasses (an abstraction layer that follows the Kubernetes-native declarative approach) to get instructions from the KubeVirt Provider Extension about the required machines for the shoot worker nodes. Also, the cluster autoscaler integrates with the scale subresource of the MachineDeployment resource. This way, Gardener offers a homogeneous autoscaling experience across all supported providers.\nWhen a new shoot cluster is created or when a new worker node is needed for an existing shoot cluster, a new Machine will be created, and at that time, the KubeVirt MCM extension will create a new KubeVirt VirtualMachine in the provider cluster. This VirtualMachine will be created based on a set of configurations in the MachineClass that follows the specification of the KubeVirt provider.\nThe KubeVirt MCM Extension has two main components. The MachinePlugin is responsible for handling the machine objects, and the PluginSPI is in charge of making calls to the cloud provider interface to manage its resources.\nFigure 4: KubeVirt MCM extension workflow and architecture As shown in Figure 4, the MachinePlugin receives a machine request from the MCM and starts its processing by decoding the request, doing partial validation, extracting the relevant information, and sending it to the PluginSPI.\nThe PluginSPI then creates, gets, or deletes VirtualMachines depending on the method called by the MachinePlugin. 
It extracts the kubeconfig of the provider cluster and handles all other required KubeVirt resources such as the secret that holds the cloud-init configurations, and DataVolumes that are mounted as disks to the VMs.\nSupported Environments The Gardener KubeVirt support is currently qualified on:\n KubeVirt v0.32.0 (and later) Red Hat OpenShift Container Platform 4.4 (and later) There are also plans for further improvements and new features, for example integration with CSI drivers for storage management. Details about the implementation progress can be found in the Gardener project on GitHub.\nYou can find further resources about the open source project Gardener at https://gardener.cloud.\n","categories":"","description":"","excerpt":"The Gardener team is happy to announce that Gardener now offers …","ref":"/blog/2020/10.19-gardener-integrates-with-kubevirt/","tags":"","title":"Gardener Integrates with KubeVirt"},{"body":"Do you want to understand how Gardener creates and updates Kubernetes clusters (Shoots)? Well, it’s complicated, but if you are not afraid of large diagrams and are a visual learner like me, this might be useful to you.\nIntroduction In this blog post I will share a technical diagram which attempts to tie together the various components involved when Gardener creates a Kubernetes cluster. I have created and curated the diagram, which visualizes the Shoot reconciliation flow since I started developing on Gardener. Aside from serving as a memory aid for myself, I created it in hopes that it may potentially help contributors to understand a core piece of the complex Gardener machinery. Please be advised that the diagram and components involved are large. Although it can be easily divided into multiple diagrams, I want to show all the components and connections in a single diagram to create an overview of the reconciliation flow.\nThe goal is to visualize the interactions of the components involved in the Shoot creation. 
It is not intended to serve as documentation of every component involved.\nBackground Taking a step back, the Gardener README states:\n In essence, Gardener is an extension API server that comes along with a bundle of custom controllers. It introduces new API objects in an existing Kubernetes cluster (which is called a garden cluster) in order to use them for the management of end-user Kubernetes clusters (which are called shoot clusters). These shoot clusters are described via declarative cluster specifications which are observed by the controllers. They will bring up the clusters, reconcile their state, perform automated updates and make sure they are always up and running.\n This means that Gardener, just like any Kubernetes controller, creates Kubernetes clusters (Shoots) using a reconciliation loop.\nThe Gardenlet contains the controller and reconciliation loop responsible for the creation, update, deletion, and migration of Shoot clusters (there are more, but we skip them in this article). In addition, the Gardener Controller Manager also reconciles Shoot resources, but only for seed-independent functionality such as Shoot hibernation, Shoot maintenance or quota control.\nThis blog post is about the reconciliation loop in the Gardenlet responsible for creating and updating Shoot clusters. The code can be found in the gardener/gardener repository. The reconciliation loops of the extension controllers can be found in their individual repositories.\nShoot Reconciliation Flow Diagram When Gardener creates a Shoot cluster, there are three conceptual layers involved: the Garden cluster, the Seed cluster and the Shoot cluster. Each layer represents a top-level section in the diagram (similar to a lane in a BPMN diagram).\nIt might seem confusing that the Shoot cluster itself is a layer, because the whole flow is about creating the Shoot cluster in the first place. 
I decided to introduce this separate layer to make a clear distinction between which resources exist in the Seed API server (managed by Gardener) and which in the Shoot API server (accessible by the Shoot owner).\nEach section contains several components. Components are mostly Kubernetes resources in a Gardener installation (e.g. the gardenlet deployment in the Seed cluster).\nThis is the list of components:\n(Virtual) Garden Cluster\n Gardener Extension API server Validating Provider Webhooks Project Namespace Seed Cluster\n Gardenlet Seed API server every Shoot Control Plane has a dedicated namespace in the Seed. Cloud Provider (owned by Stakeholder) Arguably part of the Shoot cluster but used by components in the Seed cluster to create the infrastructure for the Shoot. Gardener DNS extension Provider Extension (such as gardener-extension-provider-aws) Gardener Extension ETCD Druid Gardener Resource Manager Operating System Extension (such as gardener-extension-os-gardenlinux) Networking Extension (such as gardener-extension-networking-cilium) Machine Controller Manager ContainerRuntime extension (such as gardener-extension-runtime-gvisor) Shoot API server (in the Shoot Namespace in the Seed cluster) Shoot Cluster\n Cloud Provider Compute API (owned by Stakeholder) - for VM/Node creation. VM / Bare metal node hosted by Cloud Provider (in Stakeholder owned account). How to Use the Diagram The diagram:\n should be read from top to bottom - starting in the top left corner with the creation of the Shoot resource via the Gardener Extension API server. should not require extensive documentation / description. More detailed documentation on the components themselves can usually be found in the respective repository. does not show which activities execute in parallel (many) and also does not describe the exact dependencies between the steps. These can be found by looking at the source code. 
It does, however, try to put the activities in a logical order of execution during the reconciliation flow. Occasionally, there is an info box with additional information next to parts of the diagram that, in my view, require further explanation. Large example resources for the Gardener CRDs (e.g., Worker CRD, Infrastructure CRD) are placed on the left side and are referenced by a dotted line (-----).\nBe aware that Gardener is an evolving project, so the diagram will most likely already be outdated by the time you are reading this. Nevertheless, it should give a solid starting point for further explorations into the details of Gardener.\nFlow Diagram The diagram can be found below and on GitHub. There are multiple formats available (svg, vsdx, draw.io, html).\nPlease open an issue or a PR in the repository if information is missing or incorrect. Thanks!\n\n","categories":"","description":"","excerpt":"Do you want to understand how Gardener creates and updates Kubernetes …","ref":"/blog/2020/10.19-shoot-reconciliation-details/","tags":"","title":"Shoot Reconciliation Details"},{"body":"Summer holidays aren’t over yet, still, the Gardener community was able to release two new minor versions in the past weeks. Despite being limited in capacity these days, we were able to reach some major milestones, like adding Kubernetes v1.19 support and the long-delayed automated gardenlet certificate rotation. Whilst we continue to work on topics related to scalability, robustness, and better observability, we agreed to shift our focus a little more towards development productivity, code quality and unit/integration testing for the upcoming releases.\nNotable Changes in v1.10 Gardener v1.10 was a comparatively small release (measured by the number of changes) but it comes with some major features!\nKubernetes 1.19 Support (gardener/gardener#2799) The newest minor release of Kubernetes is now supported by Gardener (and all the maintained provider extensions)! 
Most notably, we have enabled CSI migration for OpenStack now that it got promoted to beta, i.e. 1.19 shoots will no longer use the in-tree Cinder volume provisioner. The CSI migration enablement for Azure got postponed (to at least 1.20) due to some issues that the Kubernetes community is trying to fix in the 1.20 release cycle. As usual, the 1.19 release notes should be considered before upgrading your shoot clusters.\nAutomated Certificate Rotation for gardenlet (gardener/gardener#2542) Similar to the kubelet, the gardenlet supports TLS bootstrapping when deployed into a new seed cluster. It will request a client certificate for the garden cluster using the CertificateSigningRequest API of Kubernetes and store the generated results in a Secret object in the garden namespace of its seed. These certificates are usually valid for one year. We have now added support for automatic renewals when the expiration dates are approaching.\nImproved Monitoring Alerts (gardener/gardener#2776) We have worked on a larger refactoring to improve reliability and accuracy of our monitoring alerts for both shoot control planes in the seed, as well as shoot system components running on worker nodes. The improvements are primarily for operators and should result in fewer false positive alerts. Also, the alerts should fire less frequently and are better grouped in order to reduce the overall number of alerts.\nSeed Deletion Protection (gardener/gardener#2732) Our validation to improve robustness and countermeasures against accidental mistakes has been improved. Earlier, it was possible to remove the use-as-seed annotation for shooted seeds or directly set the deletionTimestamp on Seed objects, despite the fact that they might still run shoot control planes. Seed deletion would not start in these cases, but such requests would disrupt the system unnecessarily and result in unexpected behaviour. 
The Gardener API server now forbids such requests if the seeds are not completely empty yet.\nLogging Improvements for Loki (multiple PRs) After we released our large logging stack refactoring (from EFK to Loki) with Gardener v1.8, we have continued to work on reliability, quality and user feedback in general. We aren’t done yet, but Gardener v1.10 includes a bunch of improvements which will help to graduate the Logging feature gate to beta and, eventually, GA.\nNotable Changes in v1.9 The v1.9 release contained tons of small improvements and adjustments in various areas of the code base and somewhat fewer new major features. However, we don’t want to miss the opportunity to highlight a few of them.\nCRI Validation in CloudProfiles (gardener/gardener#2137) A couple of releases back, we introduced support for containerd and the ContainerRuntime extension API. The supported container runtimes are operating system specific, and until now it wasn’t possible for end-users to easily figure out whether they can enable containerd or other ContainerRuntime extensions for their shoots. With this change, Gardener administrators/operators can now provide that information in the .spec.machineImages section in the CloudProfile resource. This also allows for enhanced validation and prevents misconfigurations.\nNew Shoot Event Controller (gardener/gardener#2649) The shoot controllers in both the gardener-controller-manager and gardenlet fire several Events for some important operations (e.g., automated hibernation/wake-up due to hibernation schedule, automated Kubernetes/machine image version update during maintenance, etc.). Earlier, the only way to prolong the lifetime of these events was to modify the --event-ttl command line parameter of the garden cluster’s kube-apiserver. 
This came with the disadvantage that all events were kept for a longer time (not only those related to Shoots that an operator is usually interested in and ideally wants to store for a couple of days). The new shoot event controller makes it possible to retain Shoot events longer without this drawback, by deleting non-shoot events. This helps operators and end-users to better understand which changes were applied to their shoots by Gardener.\nEarly Deployment of the Logging Stack for New Shoots (gardener/gardener#2750) Since the first introduction of the Logging feature gate two years back, the logging stack was only deployed at the very end of the shoot creation. This had the disadvantage that control plane pod logs were not kept in case the shoot creation flow was interrupted before the logging stack could be deployed. In some situations, this prevented us from fetching relevant information about why a certain control plane component crashed. We now deploy the logging stack very early in the shoot creation flow to always have access to such information.\n","categories":"","description":"","excerpt":"Summer holidays aren’t over yet, still, the Gardener community was …","ref":"/blog/2020/09.11-gardener-v1.9-and-v1.10-released/","tags":"","title":"Gardener v1.9 and v1.10 Released"},{"body":"Even if we are in the midst of the summer holidays, a new Gardener release came out yesterday: v1.8.0! Its main themes are the large change of our logging stack to Loki (which was already explained in detail in a blog post on grafana.com), more configuration options to optimize the utilization of a shoot, node-local DNS, new project roles, and significant improvements for the Kubernetes client that Gardener uses to interact with the many different clusters.\nNotable Changes Logging 2.0: EFK Stack Replaced by Loki (gardener/gardener#2515) For two years or so, Gardener has been able to optionally provision a dedicated logging stack per seed and per shoot which was based on fluent-bit, fluentd, ElasticSearch and Kibana. 
This feature was still hidden behind an alpha-level feature gate and has not been promoted to beta so far. Due to various limitations of this solution, we decided to replace the EFK stack with Loki. As we already have Prometheus and Grafana deployments for both users and operators by default for all clusters, the choice was a natural one. Please find out more on this topic in this dedicated blog post.\nCluster Identities and DNSOwner Objects (gardener/gardener#2471, gardener/gardener#2576) The shoot control plane migration topic has been ongoing for a few months already, and we are making good progress with it. A first alpha version will probably make it out soon. As part of these endeavors, we introduced cluster identities and the usage of DNSOwner objects in this release. Both are needed to gracefully migrate the DNSEntry extension objects from the old seed to the new seed as part of the control plane migration process. Please find out more on this topic in this blog post.\nNew uam Role for Project Members to Limit User Access Management Privileges (gardener/gardener#2611) In order to allow external user access management systems to integrate with Gardener and to fulfil certain compliance aspects, we have introduced a new role called uam for Project members (next to admin and viewer). Only users with this role are allowed to add human users to or remove them from the respective Project. By default, all newly created Projects assign this role only to the owner while, for backwards-compatibility reasons, it will be assigned to all members of existing Projects. Project owners can gradually revoke this access as desired. 
Interestingly, the uam role is backed by a custom RBAC verb called manage-members, i.e., the Gardener API server is only admitting changes to the human Project members if the respective user is bound to this RBAC verb.\nNew Node-Local DNS Feature for Shoots (gardener/gardener#2528) By default, we are using CoreDNS as DNS plugin in shoot clusters which we auto-scale horizontally using HPA. However, in some situations we are discovering certain bottlenecks with it, e.g., unreliable UDP connections, unnecessary node hopping, inefficient load balancing, etc. To further optimize the DNS performance for shoot clusters, it is now possible to enable a new alpha-level feature gate in the gardenlet’s componentconfig: NodeLocalDNS. If enabled, all shoots will get a new DaemonSet to run a DNS server on each node.\nMore kubelet and API Server Configurability (gardener/gardener#2574, gardener/gardener#2668) One large benefit of Gardener is that it allows you to optimize the usage of your control plane as well as worker nodes by exposing relevant configuration parameters in the Shoot API. In this version, we are adding support to configure kubelet’s values for systemReserved and kubeReserved resources as well as the kube-apiserver’s watch cache sizes. This allows end-users to get to better node utilization and/or performance for their shoot clusters.\nConfigurable Timeout Settings for machine-controller-manager (gardener/gardener#2563) One very central component in Project Gardener is the machine-controller-manager for managing the worker nodes of shoot clusters. It has extensive qualities with respect to node lifecycle management and rolling updates. As such, it uses certain timeout values, e.g. when creating or draining nodes, or when checking their health. Earlier, those were not customizable by end-users, but we are adding this possibility now. 
You can fine-tune these settings per worker pool in the Shoot API such that you can optimize the lifecycle management of your worker nodes even more!\nImproved Usage of Cached Client to Reduce Network I/O (gardener/gardener#2635, gardener/gardener#2637) In the last Gardener release, v1.7, we introduced a huge refactoring of the clients that we use to interact with the many different Kubernetes clusters. This is to further optimize the network I/O performed by leveraging watches and caches as well as possible. It’s still an alpha-level feature that must be explicitly enabled in the Gardenlet’s component configuration, but with this release we have improved certain things in order to pave the way for beta promotion. For example, we were initially also using a cached client when interacting with shoots. However, as the gardenlet runs in the seed as well (and thus can communicate cluster-internally with the kube-apiservers of the respective shoots), this cache is not necessary and is just memory overhead. We have removed it and saw the memory usage drop again. More to come!\nAWS EBS Volume Encryption by Default (gardener/gardener-extension-provider-aws#147) The Shoot API has exposed the possibility to encrypt the root disks of worker nodes for quite a while, but it was disabled by default (for backwards-compatibility reasons). With this release we have changed this default, so new shoot worker nodes will be provisioned with encrypted root disks out-of-the-box. However, the g4dn instance types of AWS don’t support this encryption, so when you use them you have to explicitly disable the encryption in the worker pool configuration.\nLiveness Probe for Gardener API Server Deployment (gardener/gardener#2647) A small, but very valuable improvement is the introduction of a liveness probe for our Gardener API server. 
As it’s built with the same library as the Kubernetes API server, it exposes two endpoints at /livez and /readyz which were created exactly for the purpose of liveness and readiness probes. With Gardener v1.8, the Helm chart contains a liveness probe configuration by default, and we are awaiting an upstream fix (kubernetes/kubernetes#93599) to also enable the readiness probe. This will make rolling updates of the Gardener API server pods smoother, i.e., prevent clients from talking to a not yet initialized or already terminating API server instance.\nWebhook Ports Changed to Enable OpenShift (gardener/gardener#2660) In order to make it possible to run Gardener on OpenShift clusters as well, we had to make a change in the port configuration for the webhooks we are using in both Gardener and the extension controllers. Earlier, all the webhook servers directly exposed port 443, i.e., a system port which is a security concern and disallowed in OpenShift. We have now changed this port everywhere and also adapted our network policies accordingly. This is most likely not the last change needed to enable this scenario, but it is a great improvement that pushes the project forward.\nIf you’re interested in more details and even more improvements, you can read all the release notes for Gardener v1.8.0.\n","categories":"","description":"","excerpt":"Even if we are in the midst of the summer holidays, a new Gardener …","ref":"/blog/2020/08.06-gardener-v1.8.0-released/","tags":"","title":"Gardener v1.8.0 Released"},{"body":"Gardener is showing successful collaboration with its growing community of contributors and adopters. With this come some success stories, including PingCAP using Gardener to implement its managed service.\nAbout PingCAP and Its TiDB Cloud PingCAP started in 2015, when three seasoned infrastructure engineers working at leading Internet companies got sick and tired of the way databases were managed, scaled and maintained. 
Seeing no good solution on the market, they decided to build their own - the open-source way. With the help of a first-class team and hundreds of contributors from around the globe, PingCAP is building a distributed NewSQL, hybrid transactional and analytical processing (HTAP) database.\nIts flagship project, TiDB, is a cloud-native distributed SQL database with MySQL compatibility, and one of the most popular open-source database projects - with 23.5K+ stars and 400+ contributors. Its sister project TiKV is a Cloud Native Interactive Landscape project.\nPingCAP envisioned their managed TiDB service, known as TiDB Cloud, to be multi-tenant, secure, cost-efficient, and to be compatible with different cloud providers. As a result, the company turned to Gardener to build their managed TiDB cloud service offering.\nTiDB Cloud Beta Preview Limitations with Other Public Managed Kubernetes Services Previously, PingCAP encountered issues while using other public managed K8s cluster services, to develop the first version of its TiDB Cloud. Their worst pain point was that they felt helpless when encountering certain malfunctions. PingCAP wasn’t able to do much to resolve these issues, except waiting for the providers’ help. More specifically, they experienced problems due to cloud-provider specific Kubernetes system upgrades, delays in the support response (which could be avoided in exchange of a costly support fee), and no control over when things got fixed.\nThere was also a lot of cloud-specific integration work needed to follow a multi-cloud strategy, which proved to be expensive both to produce and maintain. With one of these managed K8s services, you would have to integrate the instance API, as opposed to a solution like Gardener, which provides a unified API for all clouds. Such a unified API eliminates the need to worry about cloud specific-integration work altogether.\nWhy PingCAP Chose Gardener to Build TiDB Cloud “Gardener has similar concepts to Kubernetes. 
Each Kubernetes cluster is just like a Kubernetes pod, so the similar concepts apply, and the controller pattern makes Gardener easy to manage. It was also easy to extend, as the team was already very familiar with Kubernetes, so it wasn’t hard for us to extend Gardener. We also saw that Gardener has a very active community, which is always a plus!”\n- Aylei Wu, (Cloud Engineer) at PingCAP\n At first glance, PingCAP had initial reservations about using Gardener - mainly due to its adoption level (still at the beginning) and an apparent complexity of use. However, these were soon eliminated as they learned more about the solution. As Aylei Wu mentioned during the last Gardener community meeting, “a good product speaks for itself”, and once the company got familiar with Gardener, they quickly noticed that the concepts were very similar to Kubernetes, which they were already familiar with.\nThey recognized that Gardener would be their best option, as it is highly extensible and provides a unified abstraction API layer. In essence, the machines can be managed via a machine controller manager for different cloud providers - without having to worry about the individual cloud APIs.\nThey agreed that Gardener’s solution, although complex, was definitely worth it. Even though it is a relatively new solution, meaning they didn’t have access to other user testimonials, they decided to go with the service since it checked all the boxes (and as SAP was running it productively with a huge fleet). PingCAP also came to the conclusion that building a managed Kubernetes service themselves would not be easy. Even if they were to build a managed K8s service, they would have to heavily invest in development and would still end up with an even more complex platform than Gardener’s. 
For all these reasons combined, PingCAP decided to go with Gardener to build its TiDB Cloud.\nHere are certain features of Gardener that PingCAP found appealing:\n Cloud agnostic: Gardener’s abstractions for cloud-specific integrations dramatically reduce the investment in supporting more than one cloud infrastructure. Once the integration with Amazon Web Services was done, moving on to Google Cloud Platform proved to be relatively easy. (At the moment, TiDB Cloud has subscription plans available for both GCP and AWS, and they are planning to support Alibaba Cloud in the future.) Familiar concepts: Gardener is K8s native; its concepts are easily related to core Kubernetes concepts. As such, it was easy to onboard for a K8s experienced team like PingCAP’s SRE team. Easy to manage and extend: Gardener’s API and extensibility are easy to implement, which has a positive impact on the implementation, maintenance costs and time-to-market. Active community: Prompt and quality responses on Slack from the Gardener team tremendously helped to quickly onboard and produce an efficient solution. How PingCAP Built TiDB Cloud with Gardener On a technical level, PingCAP’s set-up overview includes the following:\n A Base Cluster globally, which is the top-level control plane of TiDB Cloud A Seed Cluster per cloud provider per region, which makes up the fundamental data plane of TiDB Cloud A Shoot Cluster is dynamically provisioned per tenant per cloud provider per region when requested A tenant may create one or more TiDB clusters in a Shoot Cluster As a real world example, PingCAP sets up the Base Cluster and Seed Clusters in advance. When a tenant creates its first TiDB cluster under the us-west-2 region of AWS, a Shoot Cluster will be dynamically provisioned in this region, and will host all the TiDB clusters of this tenant under us-west-2. Nevertheless, if another tenant requests a TiDB cluster in the same region, a new Shoot Cluster will be provisioned. 
Since different Shoot Clusters are located in different VPCs and can even be hosted under different AWS accounts, TiDB Cloud is able to achieve hard isolation between tenants and meet the critical security requirements for our customers.\nTo automate these processes, PingCAP creates a service in the Base Cluster, known as the TiDB Cloud “Central” service. The Central is responsible for managing shoots and the TiDB clusters in the Shoot Clusters. As shown in the following diagram, user operations go to the Central, being authenticated, authorized, validated, stored and then applied asynchronously in a controller manner. The Central will talk to the Gardener API Server to create and scale Shoot clusters. The Central will also access the Shoot API Service to deploy and reconcile components in the Shoot cluster, including control components (TiDB Operator, API Proxy, Usage Reporter for billing, etc.) and the TiDB clusters.\nTiDB Cloud on Gardener Architecture Overview What’s Next for PingCAP and Gardener With the initial success of using the project to build TiDB Cloud, PingCAP is now working heavily on the stability and day-to-day operations of TiDB Cloud on Gardener. This includes writing Infrastructure-as-Code scripts/controllers with it to achieve GitOps, building tools to help diagnose problems across regions and clusters, as well as running chaos tests to identify and eliminate potential risks. After benefiting greatly from the community, PingCAP will continue to contribute back to Gardener.\nIn the future, PingCAP also plans to support more cloud providers like AliCloud and Azure. Moreover, PingCAP may explore the opportunity of running TiDB Cloud in on-premise data centers with the constantly expanding support this project provides. Engineers at PingCAP enjoy the ease of learning from Gardener’s Kubernetes-like concepts and being able to apply them everywhere. Gone are the days of heavy integrations with different clouds and worrying about vendor stability. 
With this project, PingCAP now sees broader opportunities to land TiDB Cloud on various infrastructures to meet the needs of their global user group.\nStay tuned, more blog posts to come on how Gardener is collaborating with its contributors and adopters to bring fully-managed clusters at scale everywhere! If you want to join in on the fun, connect with our community.\n","categories":"","description":"","excerpt":"Gardener is showing successful collaboration with its growing …","ref":"/blog/2020/05.27-pingcaps-experience/","tags":"","title":"PingCAP’s Experience in Implementing Their Managed TiDB Service with Gardener"},{"body":"The Gardener project website just received a serious facelift. Here are some of the highlights:\n A completely new landing page, emphasizing both Gardener’s value proposition and the open community behind it. The Community page was reconstructed for quick access to the various community channels and will soon merge the Adopters page. It will provide a better insight into success stories from the community. Improved blog layout. One-click sharing options are available, starting with a simple copy-link button and a Twitter button; others will follow shortly. While we are at it, give it a try. Spread the word. Website builds also got to a new level with:\n Containerization. The whole build environment is containerized now, eliminating differences between local and CI/CD setup and narrowing content developers’ focus to the /documentation repository alone. Running a local server for live preview of changes as you make them when developing content for the website is now as easy as running make serve in your local /documentation clone. Numerous improvements to the build scripts. More configuration options, authenticated requests, fault tolerance and performance. Good news for Windows WSL users, who will now enjoy significantly better support. See the updated README for details on that. 
A number of improvements in layouts styles, site assets and hugo site-building techniques. But hey, THAT’S NOT ALL!\nStay tuned for more improvements around the corner. The biggest ones are aligning the documentation with the new theme and restructuring it along, more emphasis on community success stories all around, more sharing options and more than a handful of shortcodes for content development and … let’s cut the spoilers here.\nI hope you will like it. Let us know what you think about it. Feel free to leave comments and discuss on Twitter and Slack, or in case of issues - on GitHub.\nGo ahead and help us spread the word: https://gardener.cloud\n","categories":"","description":"","excerpt":"The Gardener project website just received a serious facelift. Here …","ref":"/blog/2020/05.11-new-website-same-green-flower/","tags":"","title":"New Website, Same Green Flower"},{"body":"TL;DR Note Details of the description might change in the near future since Heptio was taken over by VMWare which might result in different GitHub repositories or other changes. Please don’t hesitate to inform us in case you encounter any issues. In general, Backup and Restore (BR) covers activities enabling an organization to bring a system back in a consistent state, e.g., after a disaster or to setup a new system. These activities vary in a very broad way depending on the applications and its persistency.\nKubernetes objects like Pods, Deployments, NetworkPolicies, etc. configure Kubernetes internal components and might as well include external components like load balancer and persistent volumes of the cloud provider. The BR of external components and their configurations might be difficult to handle in case manual configurations were needed to prepare these components.\nTo set the expectations right from the beginning, this tutorial covers the BR of Kubernetes deployments which might use persistent volumes. 
The BR of any manual configuration of external components, e.g., via the cloud provider’s console, is not covered here, nor is the BR of a whole Kubernetes system.\nThis tutorial puts the focus on the open source tool Velero (formerly Heptio Ark) and its functionality to explain the BR process.\n Basically, Velero allows you to:\n back up and restore your Kubernetes cluster resources and persistent volumes (on-demand or scheduled) back up or restore all objects in your cluster, or filter resources by type, namespace, and/or label by default, all persistent volumes are backed up (configurable) replicate your production environment for development and testing environments define an expiration date per backup execute pre- and post-activities in a container of a pod when a backup is created (see Hooks) extend Velero by Plugins, e.g., for Object and Block store (see Plugins) Velero consists of a server-side component and a client tool. The server component consists of Custom Resource Definitions (CRD) and controllers to perform the activities. The client tool communicates with the K8s API server to, e.g., create objects like a Backup object.\nThe diagram below explains the backup process. When creating a backup, the Velero client makes a call to the Kubernetes API server to create a Backup object (1). The BackupController notices the new Backup object, validates the object (2) and begins the backup process (3). 
Based on the filter settings provided by the Velero client, it collects the resources in question (3). The BackupController creates a tarball with the Kubernetes objects and stores it in the backup location, e.g., AWS S3 (4), as well as snapshots of persistent volumes (5).\nThe size of the backup tarball corresponds to the number of objects in etcd. The gzipped archive contains the JSON representations of the objects.\nNote As of the writing of this tutorial, neither Velero nor any other BR tool for Shoot clusters is provided by Gardener. Getting Started First, clone the Velero GitHub repository and get the Velero client from the releases or build it from source via make all in the main directory of the cloned GitHub repository.\nTo use an AWS S3 bucket as storage for the backup files and the persistent volumes, you need to:\n create an S3 bucket as the backup target create an AWS IAM user for Velero configure the Velero server create a secret for your AWS credentials For details about this setup, check the Set Permissions for Velero documentation. Moreover, it is possible to use other supported storage providers.\nNote By default, Velero is installed in the namespace velero. To change the namespace, check the documentation. Velero offers a wide range of filter possibilities for Kubernetes resources, e.g., filtering by namespaces, labels or resource types. The filter settings can be combined and used as include or exclude, which gives great flexibility for selecting resources.\nNote Carefully set labels and/or use namespaces for your deployments to make the selection of the resources to be backed up easier. The best practice would be to check in advance which resources are selected with the defined filter. Exemplary Use Cases Below are some use cases which could give you an idea on how to use Velero. 
You can also check Velero’s documentation for other introductory examples.\nHelm Based Deployments To be able to use Helm charts in your Kubernetes cluster, you need to install the Helm client helm and the server component tiller. By default, the server component is installed in the namespace kube-system. Even if it is possible to select single deployments via the filter settings of Velero, you should consider installing tiller in a separate namespace via helm init --tiller-namespace \u003cyour namespace\u003e. This approach applies as well to all Helm charts to be deployed - consider separate namespaces for your deployments as well by using the parameter --namespace.\nTo back up a Helm based deployment, you need to back up both Tiller and the deployment. Only then can the deployments be managed via Helm. As mentioned above, the selection of resources would be easier in case they are separated in namespaces.\nSeparate Backup Locations In case you run all your Kubernetes clusters on a single cloud provider, there is probably no need to store the backups in a bucket of a different cloud provider. However, if you run Kubernetes clusters on different cloud providers, you might consider using a bucket on just one cloud provider as the target for the backups, e.g., to benefit from a lower price tag for the storage.\nBy default, Velero assumes that both the persistent volumes and the backup location are on the same cloud provider. During the setup of Velero, a secret is created using the credentials for a cloud provider user who has access to both objects (see the policies, e.g., for the AWS configuration).\nNow, since the backup location is different from the volume location, you need to follow these steps (described here for AWS):\n configure as documented the volume storage location in examples/aws/06-volumesnapshotlocation.yaml and provide the user credentials. 
In this case, the S3 related settings like the policies can be omitted\n create the bucket for the backup in the cloud provider in question and a user with the appropriate credentials and store them in a separate file similar to credentials-ark\n create a secret which contains two credentials, one for the volumes and one for the backup target, e.g., by using the command kubectl create secret generic cloud-credentials --namespace heptio-ark --from-file cloud=credentials-ark --from-file backup-target=backup-ark\n configure in the deployment manifest examples/aws/10-deployment.yaml the entries in volumeMounts, env and volumes accordingly, e.g., for a cluster running on AWS and the backup target bucket on GCP a configuration could look similar to:\n Note Some links might get broken in the near future since Heptio was taken over by VMWare which might result in different GitHub repositories or other changes. Please don’t hesitate to inform us in case you encounter any issues. Example Velero deployment # Copyright 2017 the Heptio Ark contributors. # # Licensed under the Apache License, Version 2.0 (the \"License\"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an \"AS IS\" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
--- apiVersion: apps/v1beta1 kind: Deployment metadata: namespace: velero name: velero spec: replicas: 1 template: metadata: labels: component: velero annotations: prometheus.io/scrape: \"true\" prometheus.io/port: \"8085\" prometheus.io/path: \"/metrics\" spec: restartPolicy: Always serviceAccountName: velero containers: - name: velero image: gcr.io/heptio-images/velero:latest command: - /velero args: - server volumeMounts: - name: cloud-credentials mountPath: /credentials - name: plugins mountPath: /plugins - name: scratch mountPath: /scratch env: - name: AWS_SHARED_CREDENTIALS_FILE value: /credentials/cloud - name: GOOGLE_APPLICATION_CREDENTIALS value: /credentials/backup-target - name: VELERO_SCRATCH_DIR value: /scratch volumes: - name: cloud-credentials secret: secretName: cloud-credentials - name: plugins emptyDir: {} - name: scratch emptyDir: {} finally, configure the backup storage location in examples/aws/05-backupstoragelocation.yaml to use, in this case, a GCP bucket\n Limitations Below is a potentially incomplete list of limitations. You can also consult Velero’s documentation to get up to date information.\n Only full backups of selected resources are supported. Incremental backups are not (yet) supported. However, by using filters it is possible to restrict the backup to specific resources Inconsistencies might occur in case of changes during the creation of the backup Application specific actions are not considered by default. 
However, they might be handled by using Velero’s Hooks or Plugins ","categories":"","description":"Details about backup and recovery of Kubernetes objects based on the open source tool [Velero](https://velero.io/).","excerpt":"Details about backup and recovery of Kubernetes objects based on the …","ref":"/docs/guides/administer-shoots/backup-restore/","tags":"","title":"Backup and Restore of Kubernetes Objects"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2019/","tags":"","title":"2019"},{"body":"Feature flags are used to change the behavior of a program at runtime without forcing a restart.\nAlthough they are essential in a native cloud environment, they cannot be implemented without significant effort on some platforms. Kubernetes has made this trivial. Here we will implement them through labels and annotations, but you can also implement them by connecting directly to the Kubernetes API Server.\nPossible Use Cases Turn on/off a specific instance Turn on/off the profiling of a specific instance Change the logging level, to capture detailed logs during a specific event Change caching strategy at runtime Change timeouts in production Toggle on/off some special verification ","categories":"","description":"","excerpt":"Feature flags are used to change the behavior of a program at runtime …","ref":"/blog/2019/06.11-feature-flags-in-kubernetes-applications/","tags":"","title":"Feature Flags in Kubernetes Applications"},{"body":"The kubectl command-line tool uses kubeconfig files to find the information it needs in order to choose a cluster and communicate with its API server.\nWhat happens if the kubeconfig file of your production cluster is leaked or published by accident? 
Since there is no possibility to rotate or revoke the initial kubeconfig, there is only one way to protect your infrastructure or application if the kubeconfig has leaked - delete the cluster.\nLearn more on Organizing Access Using kubeconfig Files.\n","categories":"","description":"","excerpt":"The kubectl command-line tool uses kubeconfig files to find the …","ref":"/blog/2019/06.11-organizing-access-using-kubeconfig-files/","tags":"","title":"Organizing Access Using kubeconfig Files"},{"body":"Speed up Your Terminal Workflow Use the Kubernetes command-line tool, kubectl, to deploy and manage applications on Kubernetes. Using kubectl, you can inspect cluster resources, as well as create, delete, and update components.\nYou will probably run more than a hundred kubectl commands on some days, so you should speed up your terminal workflow with some shortcuts. Of course, there are good shortcuts and bad shortcuts (lazy coding, lack of security review, etc.), but let’s stick with the positives and talk about a good shortcut: bash aliases in your .profile.\nWhat are those mysterious .profile and .bash_profile files you’ve heard about?\nNote The contents of a .profile file are executed on every log-in of the owner of the file What’s the .bash_profile then? It’s exactly the same, but under a different name. The Unix shell you are logging into, in this case bash on OS X, looks for /etc/profile and loads it if it exists. Then it looks for ~/.bash_profile, ~/.bash_login and finally ~/.profile, and loads the first one of these it finds.\nPopulating the .profile File Here is the fantastic time saver that needs to be in your shell profile:\n# time saver number one. 
shortcut for kubectl # alias k=\"kubectl\" # Start a shell in a pod AND delete the pod after leaving # alias ksh=\"kubectl run busybox -i --tty --image=busybox --restart=Never --rm -- sh\" # opens an ash shell (the busybox image ships ash, not bash) # alias kbash=\"kubectl run busybox -i --tty --image=busybox --restart=Never --rm -- ash\" # activate/export the kubeconfig.yaml in the current working directory # (single quotes so that the directory is resolved when the alias is used, not when it is defined) # alias kexport='export KUBECONFIG=$(pwd)/kubeconfig.yaml' # usage: kurl http://your-svc.namespace.cluster.local # # ideally, use your very own image for this... never trust an unknown image. alias kurl=\"docker run --rm byrnedo/alpine-curl\" All the kubectl tab completions still work fine with these aliases, so you’re not losing that speed.\nNote If the approach above does not work for you, add the following lines to your ~/.bashrc instead:\n# time saver number one. shortcut for kubectl # alias k=\"kubectl\" # Enable kubectl completion source \u003c(k completion bash | sed s/kubectl/k/g) ","categories":"","description":"Some bash tips that save you some time","excerpt":"Some bash tips that save you some time","ref":"/docs/guides/client-tools/bash-tips/","tags":"","title":"Fun with kubectl Aliases"},{"body":"","categories":"","description":"","excerpt":"","ref":"/blog/2018/","tags":"","title":"2018"},{"body":"Green Tea Matcha Cookies For a team event during the Christmas season we decided to completely reinterpret the topic cookies. :-)\nMatcha cookies have the delicate flavor and color of green tea. These soft, pillowy and chewy green tea cookies are perfect with tea. And of course they match our logo perfectly.\nIngredients 1 stick butter, softened ⅞ cup of granulated sugar 1 cup + 2 tablespoons all-purpose flour 2 eggs 1¼ tablespoons culinary grade matcha powder 1 teaspoon baking powder pinch of salt Instructions Cream together the butter and sugar in a large mixing bowl - it should be creamy colored and airy. A hand blender or stand mixer works well for this. This helps the cookie become fluffy and chewy. 
Gently incorporate the eggs to the butter mixture one at a time. In a separate bowl, sift together all the dry ingredients. Add the dry ingredients to the wet by adding a little at a time and folding or gently mixing the batter together. Keep going until you’ve incorporated all the remaining flour mixture. The dough should be a beautiful green color. Chill the dough for at least an hour - up to overnight. The longer the better! Preheat your oven to 325 F. Roll the dough into balls the size of ping pong balls and place them on a non-stick cookie sheet. Bake them for 12-15 minutes until the bottoms just start to become golden brown and the cookie no longer looks wet in the middle. Note: you can always bake them at 350 F for a less moist, fluffy cookie. It will bake faster by about 2-4 minutes 350 F so watch them closely. Remove and let cool on a rack and enjoy! Note Make sure you get culinary grade matcha powder. You should be able to find this in Asian or natural grocers.\n","categories":"","description":"","excerpt":"Green Tea Matcha Cookies For a team event during the Christmas season …","ref":"/blog/2018/12.25-gardener_cookies/","tags":"","title":"Gardener Cookies"},{"body":"…they mess up the figure.\nFor a team event during the Christmas season we decided to completely reinterpret the topic cookies… since the vegetables have gone on a well-deserved vacation. 
:-)\nGet the recipe at Gardener Cookies.\n","categories":"","description":"","excerpt":"…they mess up the figure.\nFor a team event during the Christmas season …","ref":"/blog/2018/12.22-cookies-are-dangerous/","tags":"","title":"Cookies Are Dangerous..."},{"body":"You want to experiment with Kubernetes or set up a customer scenario, but don’t want to run the cluster 24 / 7 due to cost reasons?\nGardener gives you the possibility to scale your cluster down to zero nodes.\nLearn more on Hibernate a Cluster.\n","categories":"","description":"","excerpt":"You want to experiment with Kubernetes or set up a customer scenario, …","ref":"/blog/2018/07.11-hibernate-a-cluster-to-save-money/","tags":"","title":"Hibernate a Cluster to Save Money"},{"body":"Running as Root User Whenever possible, do not run containers as root users. One could be tempted to say that in Kubernetes, the node and pods are well separated, however, the host and the container share the same kernel. If the container is compromised, a root user can damage the underlying node.\nInstead of running a root user, use RUN groupadd -r anygroup \u0026\u0026 useradd -r -g anygroup myuser to create a group and a user in it. Use the USER command to switch to this user.\nStoring Data or Logs in Containers Containers are ideal for stateless applications and should be transient. This means that no data or logs should be stored in the container, as they are lost when the container is closed. 
If absolutely necessary, you can use persistent volumes instead to persist them outside the containers.\nHowever, an ELK stack is preferred for storing and processing log files.\nLearn more on Common Kubernetes Antipattern.\n","categories":"","description":"","excerpt":"Running as Root User Whenever possible, do not run containers as root …","ref":"/blog/2018/06.11-anti-patterns/","tags":"","title":"Anti Patterns"},{"body":"In summer 2018, the Gardener project team asked Kinvolk to execute several penetration tests in its role as a third-party contractor. The goal of this ongoing work is to increase the security of all Gardener stakeholders in the open source community. Following the Gardener architecture, the control plane of a Gardener managed shoot cluster resides in the corresponding seed cluster. This is a Control-Plane-as-a-Service with a network air gap.\nAlong the way we found various kinds of security issues, for example, due to misconfiguration or missing isolation, as well as two special problems with upstream Kubernetes and its Control-Plane-as-a-Service architecture.\nLearn more on Auditing Kubernetes for Secure Setup.\n","categories":"","description":"","excerpt":"In summer 2018, the Gardener project team asked Kinvolk to execute …","ref":"/blog/2018/06.11-auditing-kubernetes-for-secure-setup/","tags":"","title":"Auditing Kubernetes for Secure Setup"},{"body":"Microservices tend to use smaller runtimes but you can use what you have today - and this can be a problem in Kubernetes.\nSwitching your architecture from a monolith to microservices has many advantages, both in the way you write software and the way it is used throughout its lifecycle. In this post, I attempt to cover one problem which does not get as much attention and discussion - the size of the technology stack.\nGeneral Purpose Technology Stack There is a tendency to be more generalized in development and to apply this pattern to all services. 
One feels that a homogeneous image of the technology stack is good if it is the same for all services.\nOne forgets, however, that a large percentage of the integrated infrastructure is not used by all services in the same way, and is therefore only a burden. Thus, resources are wasted and the entire application becomes expensive in operation and scales very badly.\nLight Technology Stack Due to the lightweight nature of your service, you can run more containers on a physical server or virtual machine. The result is higher resource utilization.\nAdditionally, microservices are developed and deployed as containers independently of each other. This means that a development team can develop, optimize, and deploy a microservice without impacting other subsystems.\n","categories":"","description":"","excerpt":"Microservices tend to use smaller runtimes but you can use what you …","ref":"/blog/2018/06.11-big-things-come-in-small-packages/","tags":"","title":"Big Things Come in Small Packages"},{"body":"The Gardener project team has analyzed the impact of the Gardener CVE-2018-2475 and the Kubernetes CVE-2018-1002105 on the Gardener Community Setup. Following some recommendations it is possible to mitigate both vulnerabilities.\n","categories":"","description":"","excerpt":"The Gardener project team has analyzed the impact of the Gardener …","ref":"/blog/2018/06.11-hardening-the-gardener-community-setup/","tags":"","title":"Hardening the Gardener Community Setup"},{"body":" Kubernetes is only available in Docker for Mac 17.12 CE and higher on the Edge channel. Kubernetes support is not included in Docker for Mac Stable releases. To find out more about Stable and Edge channels and how to switch between them, see general configuration. Docker for Mac 17.12 CE (and higher) Edge includes a standalone Kubernetes server that runs on Mac, so that you can test deploying your Docker workloads on Kubernetes. 
The Kubernetes client command, kubectl, is included and configured to connect to the local Kubernetes server. If you have kubectl already installed and pointing to some other environment, such as minikube or a GKE cluster, be sure to change the context so that kubectl is pointing to docker-for-desktop. Read more on Docker.com.\nI recommend setting up your shell to show which KUBECONFIG is active.\n","categories":"","description":"","excerpt":" Kubernetes is only available in Docker for Mac 17.12 CE and higher …","ref":"/blog/2018/06.11-kubernetes-is-available-in-docker-for-mac-17-12-ce/","tags":"","title":"Kubernetes is Available in Docker for Mac 17.12 CE"},{"body":"…or DENY all traffic from other namespaces\nYou can configure a NetworkPolicy to deny all traffic from other namespaces while allowing all traffic coming from the same namespace the pod is deployed to. There are many reasons why you may choose to configure Kubernetes network policies:\n Isolate multi-tenant deployments Regulatory compliance Ensure containers assigned to different environments (e.g. dev/staging/prod) cannot interfere with each other Learn more on Namespace Isolation.\n","categories":"","description":"","excerpt":"…or DENY all traffic from other namespaces\nYou can configure a …","ref":"/blog/2018/06.11-namespace-isolation/","tags":"","title":"Namespace Isolation"},{"body":"Should I use:\n❌ one namespace per user/developer? ❌ one namespace per team? ❌ one per service type? ❌ one namespace per application type? 😄 one namespace per running instance of your application? Apply the Principle of Least Privilege\nAll user accounts should run with as few privileges as possible at all times, and also launch applications with as few privileges as possible. If you share a cluster among different users separated only by namespaces, each user has access to all namespaces and services by default. 
It can happen that a user accidentally uses and destroys the namespace of a production application or the namespace of another developer.\nKeep in mind - By default namespaces don’t provide:\n Network Isolation Access Control Audit Logging on user level ","categories":"","description":"","excerpt":"Should I use:\n❌ one namespace per user/developer? ❌ one namespace per …","ref":"/blog/2018/06.11-namespace-scope/","tags":"","title":"Namespace Scope"},{"body":"The efs-provisioner allows you to mount EFS storage as PersistentVolumes in Kubernetes. It consists of a container that has access to an AWS EFS resource. The container reads a configmap containing the EFS filesystem ID, the AWS region and the name identifying the efs-provisioner. This name will be used later when you create a storage class.\nWhy EFS When you have an application running on multiple nodes that requires shared access to a file system. When you have an application that requires multiple virtual machines to access the same file system at the same time, AWS EFS is a tool that you can use. EFS supports encryption. EFS is SSD based storage and its storage capacity and pricing will scale in or out as needed, so there is no need for the system administrator to do additional operations. It can grow to a petabyte scale. EFS now supports NFSv4 lock upgrading and downgrading, so yes, you can use sqlite with EFS… even if it was possible before. EFS is easy to set up. Why Not EFS Sometimes when you think about using a service like EFS, you may also think about vendor lock-in and its negative sides. Making an EFS backup may decrease your production FS performance; the throughput used by backups counts towards your total file system throughput. EFS is expensive when compared to EBS (roughly twice the price of EBS storage). EFS is not the magical solution for all your distributed FS problems; it can be slow in many cases. Test, benchmark, and measure to ensure that EFS is a good solution for your use case. 
EFS’s distributed architecture results in a latency overhead for each file read/write operation. If you have the possibility to use a CDN, don’t use EFS for those files; use it only for the files which can’t be stored in a CDN. Don’t use EFS as a caching system; sometimes you could be doing this unintentionally. Last but not least, even if EFS is a fully managed NFS, you will face performance problems in many cases, and resolving them takes time and effort. ","categories":"","description":"","excerpt":"The efs-provisioner allows you to mount EFS storage as …","ref":"/blog/2018/06.11-readwritemany-dynamically-provisioned-persistent-volumes-using-amazon-efs/","tags":"","title":"ReadWriteMany - Dynamically Provisioned Persistent Volumes Using Amazon EFS"},{"body":"The storage is definitely the most complex and important part of an application setup. Once this part is completed, one of the most problematic parts is solved.\nMounting an S3 bucket into a pod using FUSE allows you to access data stored in S3 via the filesystem. The mount is a pointer to an S3 location, so the data is never synced locally. Once mounted, any pod can read from or even write to that directory without the need for explicit keys.\nIt can be used, for example, to import and parse large amounts of data into a database.\nLearn more on Shared S3 Storage.\n","categories":"","description":"","excerpt":"The storage is definitely the most complex and important part of an …","ref":"/blog/2018/06.11-shared-storage-with-s3-backend/","tags":"","title":"Shared Storage with S3 Backend"},{"body":"One thing that always bothered me was that I couldn’t get the logs of several pods at once with kubectl. A simple tail -f \u003cpath-to-logfile\u003e isn’t possible. 
Certainly, you can use kubectl logs -f \u003cpod-id\u003e, but it doesn’t help if you want to monitor more than one pod at a time.\nThis is something you really need a lot, at least if you run several instances of a pod behind a deployment and you don’t have a log viewer service like Kibana set up.\nIn that case, kubetail comes to the rescue. It is a small bash script that allows you to aggregate the log files of several pods at the same time in a simple way. The script is called kubetail and is available on GitHub.\n","categories":"","description":"","excerpt":"One thing that always bothered me was that I couldn’t get the logs of …","ref":"/blog/2018/06.11-watching-logs-of-several-pods/","tags":"","title":"Watching Logs of Several Pods"},{"body":"Multi-node etcd cluster instances via etcd-druid This document proposes an approach (along with some alternatives) to support provisioning and management of multi-node etcd cluster instances via etcd-druid and etcd-backup-restore.\nContent Multi-node etcd cluster instances via etcd-druid Content Goal Background and Motivation Single-node etcd cluster Multi-node etcd-cluster Dynamic multi-node etcd cluster Prior Art ETCD Operator from CoreOS etcdadm from kubernetes-sigs Etcd Cluster Operator from Improbable-Engineering General Approach to ETCD Cluster Management Bootstrapping Assumptions Adding a new member to an etcd cluster Note Alternative Managing Failures Removing an existing member from an etcd cluster Restarting an existing member of an etcd cluster Recovering an etcd cluster from failure of majority of members Kubernetes Context Alternative ETCD Configuration Alternative Data Persistence Persistent Ephemeral Disk In-memory How to detect if valid metadata exists in an etcd member Recommendation How to detect if valid data exists in an etcd member Recommendation Separating peer and client traffic Cutting off client requests Manipulating Client Service podSelector Health Check Backup Failure Alternative Status Members 
Note Member name as the key Member Leases Conditions ClusterSize Alternative Decision table for etcd-druid based on the status 1. Pink of health Observed state Recommended Action 2. Member status is out of sync with their leases Observed state Recommended Action 3. All members are Ready but AllMembersReady condition is stale Observed state Recommended Action 4. Not all members are Ready but AllMembersReady condition is stale Observed state Recommended Action 5. Majority members are Ready but Ready condition is stale Observed state Recommended Action 6. Majority members are NotReady but Ready condition is stale Observed state Recommended Action 7. Some members have been in Unknown status for a while Observed state Recommended Action 8. Some member pods are not Ready but have not had the chance to update their status Observed state Recommended Action 9. Quorate cluster with a minority of members NotReady Observed state Recommended Action 10. Quorum lost with a majority of members NotReady Observed state Recommended Action 11. Scale up of a healthy cluster Observed state Recommended Action 12. Scale down of a healthy cluster Observed state Recommended Action 13. Superfluous member entries in Etcd status Observed state Recommended Action Decision table for etcd-backup-restore during initialization 1. First member during bootstrap of a fresh etcd cluster Observed state Recommended Action 2. Addition of a new following member during bootstrap of a fresh etcd cluster Observed state Recommended Action 3. Restart of an existing member of a quorate cluster with valid metadata and data Observed state Recommended Action 4. Restart of an existing member of a quorate cluster with valid metadata but without valid data Observed state Recommended Action 5. Restart of an existing member of a quorate cluster without valid metadata Observed state Recommended Action 6. 
Restart of an existing member of a non-quorate cluster with valid metadata and data Observed state Recommended Action 7. Restart of the first member of a non-quorate cluster without valid data Observed state Recommended Action 8. Restart of a following member of a non-quorate cluster without valid data Observed state Recommended Action Backup Leading ETCD main container’s sidecar is the backup leader Independent leader election between backup-restore sidecars History Compaction Defragmentation Work-flows in etcd-backup-restore Work-flows independent of leader election in all members Work-flows only on the leading member High Availability Zonal Cluster - Single Availability Zone Alternative Regional Cluster - Multiple Availability Zones Alternative PodDisruptionBudget Rolling updates to etcd members Follow Up Ephemeral Volumes Shoot Control-Plane Migration Performance impact of multi-node etcd clusters Metrics, Dashboards and Alerts Costs Future Work Gardener Ring Autonomous Shoot Clusters Optimization of recovery from non-quorate cluster with some member containing valid data Optimization of rolling updates to unhealthy etcd clusters Goal Enhance etcd-druid and etcd-backup-restore to support provisioning and management of multi-node etcd cluster instances within a single Kubernetes cluster. The etcd CRD interface should be simple to use. It should preferably work with just setting the spec.replicas field to the desired value and should not require any more configuration in the CRD than currently required for the single-node etcd instances. The spec.replicas field is part of the scale sub-resource implementation in Etcd CRD. The single-node and multi-node scenarios must be automatically identified and managed by etcd-druid and etcd-backup-restore. The etcd clusters (single-node or multi-node) managed by etcd-druid and etcd-backup-restore must automatically recover from failures (even quorum loss) and disaster (e.g. 
etcd member persistence/data loss) as much as possible. It must be possible to dynamically scale an etcd cluster horizontally (even between single-node and multi-node scenarios) by simply scaling the Etcd scale sub-resource. It must be possible to (optionally) schedule the individual members of an etcd cluster on different nodes or even infrastructure availability zones (within the hosting Kubernetes cluster). Though this proposal tries to cover most aspects related to single-node and multi-node etcd clusters, there are some more points that are not goals for this document but are still in the scope of either etcd-druid/etcd-backup-restore and/or gardener. In such cases, a high-level description of how they can be addressed in the future is given at the end of the document.\nBackground and Motivation Single-node etcd cluster At present, etcd-druid supports only single-node etcd cluster instances. The advantages of this approach are given below.\n The problem domain is smaller. There are no leader election and quorum related issues to be handled. It is simpler to set up and manage a single-node etcd cluster. Single-node etcd cluster instances have less request latency than multi-node etcd clusters because there is no requirement to replicate the changes to the other members before committing the changes. etcd-druid provisions etcd cluster instances as pods (actually as statefulsets) in a Kubernetes cluster and Kubernetes is quick (\u003c20s) to restart container/pods if they go down. Also, etcd-druid is currently only used by gardener to provision etcd clusters to act as back-ends for Kubernetes control-planes and Kubernetes control-plane components (kube-apiserver, kubelet, kube-controller-manager, kube-scheduler etc.) can tolerate etcd going down and recover when it comes back up. 
Single-node etcd clusters incur less cost (CPU, memory and storage). It is easy to cut off client requests if backups fail by using readinessProbe on the etcd-backup-restore healthz endpoint to minimize the gap between the latest revision and the backup revision. The disadvantages of using single-node etcd clusters are given below.\n The database verification step by etcd-backup-restore can introduce additional delays whenever etcd container/pod restarts (in total ~20-25s). This can be much longer if a database restoration is required, especially if there are incremental snapshots that need to be replayed (this can be mitigated by compacting the incremental snapshots in the background). Kubernetes control-plane components can go into CrashloopBackoff if etcd is down for some time. This is mitigated by the dependency-watchdog. But Kubernetes control-plane components require a lot of resources and create a lot of load on the etcd cluster and the apiserver when they come out of CrashloopBackoff, especially in medium or large sized clusters (\u003e 20 nodes). Maintenance operations such as updates to etcd (and updates to etcd-druid or etcd-backup-restore), rolling updates to the nodes of the underlying Kubernetes cluster and vertical scaling of etcd pods are disruptive because they cause etcd pods to be restarted. The vertical scaling of etcd pods is somewhat mitigated during scale down by doing it only during the target clusters’ maintenance window. But scale up is still disruptive. We currently use some form of elastic storage (via persistentvolumeclaims) for storing etcd data, which has some upper bounds on the I/O latency and throughput. This can potentially be a problem for large clusters (\u003e 220 nodes). Also, some cloud providers (e.g. Azure) take a long time to attach/detach volumes to and from machines which increases the downtime of the Kubernetes components that depend on etcd. 
It is difficult to use ephemeral/local storage (to achieve better latency/throughput as well as to circumvent volume attachment/detachment) for single-node etcd cluster instances. Multi-node etcd-cluster The advantages of introducing support for multi-node etcd clusters via etcd-druid are below.\n A multi-node etcd cluster is highly available. It can tolerate disruption to individual etcd pods as long as the quorum is not lost (i.e. more than half the etcd member pods are healthy and ready). Maintenance operations such as updates to etcd (and updates to etcd-druid or etcd-backup-restore), rolling updates to the nodes of the underlying Kubernetes cluster and vertical scaling of etcd pods can be done non-disruptively by respecting poddisruptionbudgets for the various multi-node etcd cluster instances hosted on that cluster. Kubernetes control-plane components do not see any etcd cluster downtime unless quorum is lost (which is expected to be a lot less frequent than the current frequency of etcd container/pod restarts). We can consider using ephemeral/local storage for multi-node etcd cluster instances because individual member restarts can afford to take time to restore from backup before (re)joining the etcd cluster, as the remaining members serve the requests in the meantime. High-availability across availability zones is also possible by specifying (anti)affinity for the etcd pods (possibly via kupid). Some disadvantages of using multi-node etcd clusters, due to which it might still be desirable, in some cases, to continue to use single-node etcd cluster instances in the gardener context, are given below.\n Multi-node etcd cluster instances are more complex to manage. The problem domain is larger including the following. Leader election Quorum loss Managing rolling changes Backups to be taken from only the leading member. 
It is more complex to cut off client requests if backups fail in order to keep the gap between the latest revision and the backup revision under control. 
Multi-node etcd cluster instances incur more cost (CPU, memory and storage). Dynamic multi-node etcd cluster Though it is not part of this proposal, it is conceivable to convert a single-node etcd cluster into a multi-node etcd cluster temporarily to perform some disruptive operation (etcd, etcd-backup-restore or etcd-druid updates, etcd cluster vertical scaling and perhaps even node rollout) and convert it back to a single-node etcd cluster once the disruptive operation has been completed. Though this will necessarily still involve a downtime because scaling from a single-node etcd cluster to a three-node etcd cluster will involve etcd pod restarts, it is still probable that it can be managed with a shorter downtime than we see at present for single-node etcd clusters (on the other hand, converting a three-node etcd cluster to a five-node etcd cluster can be non-disruptive).\nThis is definitely not to argue in favour of such a dynamic approach in all cases (eventually, if/when dynamic multi-node etcd clusters are supported). On the contrary, it makes sense to make use of static (fixed in size) multi-node etcd clusters for production scenarios because of the high-availability.\nPrior Art ETCD Operator from CoreOS etcd operator\nProject status: archived\nThis project is no longer actively developed or maintained. The project exists here for historical reference. If you are interested in the future of the project and taking over stewardship, please contact etcd-dev@googlegroups.com.\n etcdadm from kubernetes-sigs etcdadm is a command-line tool for operating an etcd cluster. It makes it easy to create a new cluster, add a member to, or remove a member from an existing cluster. Its user experience is inspired by kubeadm.\n It is a tool more tailored for manual command-line based management of etcd clusters with no APIs. 
It also makes no assumptions about the underlying platform on which the etcd clusters are provisioned and hence, doesn’t leverage any capabilities of Kubernetes.\nEtcd Cluster Operator from Improbable-Engineering Etcd Cluster Operator\nEtcd Cluster Operator is an Operator for automating the creation and management of etcd inside of Kubernetes. It provides a custom resource definition (CRD) based API to define etcd clusters with Kubernetes resources, and enable management with native Kubernetes tooling.\n Out of all the alternatives listed here, this one seems to be the only possible viable alternative. Parts of its design/implementations are similar to some of the approaches mentioned in this proposal. However, we still don’t propose to use it for the following reasons.\n The project is still in an early phase and is not mature enough to be consumed as is in productive scenarios of ours. The restoration part is completely different, which makes it difficult to adopt as-is and requires a lot of re-work to fit the current restoration semantics of etcd-backup-restore, making the usage counter-productive. General Approach to ETCD Cluster Management Bootstrapping There are three ways to bootstrap an etcd cluster which are static, etcd discovery and DNS discovery. Out of these, the static way is the simplest (and probably faster to bootstrap the cluster) and has the least external dependencies. Hence, it is preferred in this proposal. But it requires that the initial (during bootstrapping) etcd cluster size (number of members) is already known before bootstrapping and that all of the members are already addressable (DNS, IP, TLS etc.). Such information needs to be passed to the individual members during startup using the following static configuration.\n ETCD_INITIAL_CLUSTER The list of peer URLs including all the members. This must be the same as the advertised peer URLs configuration. This can also be passed as initial-cluster flag to etcd. 
ETCD_INITIAL_CLUSTER_STATE This should be set to new while bootstrapping an etcd cluster. ETCD_INITIAL_CLUSTER_TOKEN This is a token to distinguish the etcd cluster from any other etcd cluster in the same network. Assumptions ETCD_INITIAL_CLUSTER can use DNS instead of IP addresses. We need to verify this by deleting a pod (as against scaling down the statefulset) to ensure that the pod IP changes and see if the recreated pod (by the statefulset controller) re-joins the cluster automatically. DNS for the individual members is known or computable. This is true in the case of etcd-druid setting up an etcd cluster using a single statefulset. But it may not necessarily be true in other cases (multiple statefulsets per etcd cluster, deployments instead of statefulsets, or an etcd cluster with members distributed across more than one Kubernetes cluster). Adding a new member to an etcd cluster A new member can be added to an existing etcd cluster instance using the following steps.\n If the latest backup snapshot exists, restore the member’s etcd data to the latest backup snapshot. This can reduce the load on the leader to bring the new member up to date when it joins the cluster. If the latest backup snapshot doesn’t exist or if the latest backup snapshot is not accessible (please see backup failure) and if the cluster itself is quorate, then the new member can be started with an empty data directory. But this will be suboptimal because the new member will fetch all the data from the leading member to get up-to-date. The cluster is informed that a new member is being added using the MemberAdd API including information like the member name and its advertised peer URLs. The new etcd member is then started with ETCD_INITIAL_CLUSTER_STATE=existing apart from other required configuration. 
This proposal recommends this approach.\nNote If there are incremental snapshots (taken by etcd-backup-restore), they cannot be applied because that requires the member to be started in isolation without joining the cluster which is not possible. This is acceptable if the number of incremental snapshots is kept relatively small. This adds one more reason to increase the priority of the issue of incremental snapshot compaction. There is a time window, between the MemberAdd call and the new member joining the cluster and getting up to date, where the cluster is vulnerable to leader elections which could be disruptive. Alternative With v3.4, the new raft learner approach can be used to mitigate some of the possible disruptions mentioned above. Then the steps will be as follows.\n If the latest backup snapshot exists, restore the member’s etcd data to the latest backup snapshot. This can reduce the load on the leader to bring the new member up to date when it joins the cluster. The cluster is informed that a new member is being added using the MemberAddAsLearner API including information like the member name and its advertised peer URLs. The new etcd member is then started with ETCD_INITIAL_CLUSTER_STATE=existing apart from other required configuration. Once the new member (learner) is up to date, it can be promoted to a full voting member by using the MemberPromote API. This approach is new and involves more steps and is not recommended in this proposal. It can be considered in future enhancements.\nManaging Failures A multi-node etcd cluster may face failures of different kinds during its life-cycle. The actions that need to be taken to manage these failures depend on the failure mode.\nRemoving an existing member from an etcd cluster If a member of an etcd cluster becomes unhealthy, it must be explicitly removed from the etcd cluster, as soon as possible. This can be done by using the MemberRemove API. 
This ensures that only healthy members participate as voting members.\nA member of an etcd cluster may be removed not just for managing failures but also for other reasons such as -\n The etcd cluster is being scaled down, i.e. the cluster size is being reduced. An existing member is being replaced by a new one for some reason (e.g. upgrades). If the majority of the members of the etcd cluster are healthy and the member that is unhealthy/being removed happens to be the leader at that moment then the etcd cluster will automatically elect a new leader. But if only a minority of the etcd cluster members are healthy after removing the member then the cluster will no longer be quorate and will stop accepting write requests. Such an etcd cluster needs to be recovered via some kind of disaster-recovery.\nRestarting an existing member of an etcd cluster If the existing member of an etcd cluster restarts and retains an uncorrupted data directory after the restart, then it can simply re-join the cluster as an existing member without any API calls or configuration changes. This is because the relevant metadata (including member ID and cluster ID) are maintained in the write ahead logs. However, if it doesn’t retain an uncorrupted data directory after the restart, then it must first be removed and added as a new member.\nRecovering an etcd cluster from failure of majority of members If a majority of members of an etcd cluster fail but if they retain their uncorrupted data directory then they can be simply restarted and they will re-form the existing etcd cluster when they come up. However, if they do not retain their uncorrupted data directory, then the etcd cluster must be recovered from the latest snapshot in the backup. This is very similar to bootstrapping with the additional initial step of restoring the latest snapshot in each of the members. However, the same limitation about incremental snapshots, as in the case of adding a new member, applies here. 
But unlike in the case of adding a new member, not applying incremental snapshots is not acceptable in the case of etcd cluster recovery. Hence, if incremental snapshots are required to be applied, the etcd cluster must be recovered in the following steps.\n Restore a new single-member cluster using the latest snapshot. Apply incremental snapshots on the single-member cluster. Take a full snapshot which can now be used while adding the remaining members. Add new members using the latest snapshot created in the step above. Kubernetes Context Users will provision an etcd cluster in a Kubernetes cluster by creating an etcd CRD resource instance. A multi-node etcd cluster is indicated if the spec.replicas field is set to any value greater than 1. The etcd-druid will add validation to ensure that the spec.replicas value is an odd number according to the requirements of etcd. The etcd-druid controller will provision a statefulset with the etcd main container and the etcd-backup-restore sidecar container. It will pass on the spec.replicas field from the etcd resource to the statefulset. It will also supply the right pre-computed configuration to both the containers. The statefulset controller will create the pods based on the pod template in the statefulset spec and these individual pods will be the members that form the etcd cluster. This approach makes it possible to satisfy the assumption that the DNS for the individual members of the etcd cluster must be known/computable. This can be achieved by using a headless service (along with the statefulset) for each etcd cluster instance. Then we can address individual pods/etcd members via the predictable DNS name of \u003cstatefulset_name\u003e-{0|1|2|3|…|n}.\u003cheadless_service_name\u003e from within the Kubernetes namespace (or from outside the Kubernetes namespace by appending .\u003cnamespace\u003e.svc.\u003ccluster_domain\u003e suffix). 
The etcd-druid controller can compute the above configurations automatically based on the spec.replicas in the etcd resource.\nThis proposal recommends this approach.\nAlternative One statefulset is used for each member (instead of one statefulset for all members). While this approach gives the flexibility to have different pod specifications for the individual members, it makes managing the individual members (e.g. rolling updates) more complicated. Hence, this approach is not recommended.\nETCD Configuration As mentioned in the general approach section, there are differences in the configuration that needs to be passed to individual members of an etcd cluster in different scenarios such as bootstrapping, adding a new member, removing a member, restarting an existing member etc. Managing such differences in configuration for individual pods of a statefulset is tricky in the recommended approach of using a single statefulset to manage all the member pods of an etcd cluster. This is because statefulset uses the same pod template for all its pods.\nThe recommendation is for etcd-druid to provision the base configuration template in a ConfigMap which is passed to all the pods via the pod template in the StatefulSet. The initialization flow of etcd-backup-restore (which is invoked every time the etcd container is (re)started) is then enhanced to generate the customized etcd configuration for the corresponding member pod (in a shared volume between etcd and the backup-restore containers) based on the supplied template configuration. This requires etcd-backup-restore to have a mechanism to detect which of the scenarios listed above applies during any given member container/pod restart.\nAlternative As mentioned above, one statefulset is used for each member of the etcd cluster. Then different configuration (generated directly by etcd-druid) can be passed in the pod templates of the different statefulsets. 
Though this approach is advantageous in the context of managing the different configuration, it is not recommended in this proposal because it makes the rest of the management (e.g. rolling updates) more complicated.\nData Persistence The type of persistence used to store etcd data (including the member ID and cluster ID) has an impact on the steps that are needed to be taken when the member pods or containers (minority of them or majority) need to be recovered.\nPersistent Like the single-node case, persistentvolumes can be used to persist ETCD data for all the member pods. The individual member pods then get their own persistentvolumes. The advantage is that individual members retain their member ID across pod restarts and even pod deletion/recreation across Kubernetes nodes. This means that member pods that crash (or are unhealthy) can be restarted automatically (by configuring livenessProbe) and they will re-join the etcd cluster using their existing member ID without any need for explicit etcd cluster management.\nThe disadvantages of this approach are as follows.\n The number of persistentvolumes increases linearly with the cluster size which is a cost-related concern. Network-mounted persistentvolumes might eventually become a performance bottleneck under heavy load for a latency-sensitive component like ETCD. Volume attach/detach issues when associated with etcd cluster instances cause downtimes to the target shoot clusters that are backed by those etcd cluster instances. Ephemeral The ephemeral volumes use-case is considered as an optimization and may be planned as a follow-up action.\nDisk Ephemeral persistence can be achieved in Kubernetes by using either emptyDir volumes or local persistentvolumes to persist ETCD data. The advantages of this approach are as follows.\n Potentially faster disk I/O. The number of persistent volumes does not increase linearly with the cluster size (at least not technically). 
Issues related to volume attachment/detachment can be avoided. The main disadvantage of using ephemeral persistence is that the individual members may retain their identity and data across container restarts but not across pod deletion/recreation across Kubernetes nodes. If the data is lost then on restart of the member pod, the older member (represented by the container) has to be removed and a new member has to be added.\nUsing emptyDir ephemeral persistence has the disadvantage that the volume doesn’t have its own identity. So, even if the member pod is recreated and scheduled on the same node as before, it will not retain the identity as the persistence is lost. But it has the advantage that scheduling of pods is unencumbered especially during pod recreation as they are free to be scheduled anywhere.\nUsing local persistentvolumes has the advantage that the volume has its own identity and hence, a recreated member pod will retain its identity if scheduled on the same node. But it has the disadvantage of tying down the member pod to a node which is a problem if the node becomes unhealthy requiring etcd-druid to take additional actions (such as deleting the local persistent volume).\nBased on these constraints, if ephemeral persistence is opted for, it is recommended to use emptyDir ephemeral persistence.\nIn-memory In-memory ephemeral persistence can be achieved in Kubernetes by using emptyDir with medium: Memory. In this case, a tmpfs (RAM-backed file-system) volume will be used. In addition to the advantages of ephemeral persistence, this approach can achieve the fastest possible disk I/O. Similarly, in addition to the disadvantages of ephemeral persistence, in-memory persistence has the following additional disadvantages.\n More memory required for the individual member pods. Individual members may not at all retain their data and identity across container restarts let alone across pod restarts/deletion/recreation across Kubernetes nodes. I.e. 
every time an etcd container restarts, the old member (represented by the container) will have to be removed and a new member has to be added. How to detect if valid metadata exists in an etcd member Since a member is much more likely to lack valid metadata in the WAL files in the ephemeral persistence scenario, one option is to pass the information that ephemeral persistence is being used to the etcd-backup-restore sidecar (say, via command-line flags or environment variables).\nBut in principle, it might be better to determine this from the WAL files directly so that the possibility of corrupted WAL files also gets handled correctly. To do this, the wal package has some functions that might be useful.\nRecommendation Using the wal package to verify whether valid metadata exists might be performance intensive. So, the performance impact needs to be measured. If the performance impact is acceptable (both in terms of resource usage and time), it is recommended to use this way to verify if the member contains valid metadata. Otherwise, alternatives such as a simple check that the WAL folder exists coupled with the static information about use of persistent or ephemeral storage might be considered.\nHow to detect if valid data exists in an etcd member The initialization sequence in etcd-backup-restore already includes database verification. This would suffice to determine if the member has valid data.\nRecommendation Though ephemeral persistence has performance and logistics advantages, it is recommended to start with persistent data for the member pods. In addition to the reasons and concerns listed above, there is also the additional concern that in case of backup failure, the risk of additional data loss is a bit higher if ephemeral persistence is used (simultaneous quorum loss is sufficient) when compared to persistent storage (simultaneous quorum loss with majority persistence loss is needed). 
The risk might still be acceptable but the idea is to gain experience about how frequently member containers/pods get restarted/recreated, how frequently leader election happens among members of an etcd cluster and how frequently etcd clusters lose quorum. Based on this experience, we can move towards using ephemeral (perhaps even in-memory) persistence for the member pods.\nSeparating peer and client traffic The current single-node ETCD cluster implementation in etcd-druid and etcd-backup-restore uses a single service object to act as the entry point for the client traffic. There is no separation or distinction between the client and peer traffic because there is not much benefit to be had by making that distinction.\nIn the multi-node ETCD cluster scenario, it makes sense to distinguish between and separate the peer and client traffic. This can be done by using two services.\n peer To be used for peer communication. This could be a headless service. client To be used for client communication. This could be a normal ClusterIP service like it is in the single-node case. The main advantage of this approach is that it makes it possible (if needed) to allow only peer to peer communication while blocking client communication. Such a thing might be required during some phases of some maintenance tasks (manual or automated).\nCutting off client requests At present, in the single-node ETCD instances, etcd-druid configures the readinessProbe of the etcd main container to probe the healthz endpoint of the etcd-backup-restore sidecar which considers the status of the latest backup upload in addition to the regular checks about etcd and the sidecar being up and healthy. This has the effect of setting the etcd main container (and hence the etcd pod) as not ready if the latest backup upload failed. 
This results in the endpoints controller removing the pod IP address from the endpoints list for the service which eventually cuts off ingress traffic coming into the etcd pod via the etcd client service. The rationale for this is to fail early when the backup upload fails rather than continuing to serve requests while the gap between the last backup and the current data increases, which might lead to an unacceptably large amount of data loss if disaster strikes.\nThis approach will not work in the multi-node scenario because we need the individual member pods to be able to talk to each other to maintain the cluster quorum when backup upload fails but need to cut off only client ingress traffic.\nIt is recommended to separate the backup health condition tracking from taking appropriate remedial actions. With that, the backup health condition tracking is now separated into the BackupReady condition in the Etcd resource status and the cutting off of client traffic (which could now be done for more reasons than failed backups) can be achieved in a different way described below.\nManipulating Client Service podSelector The client traffic can be cut off by updating (manually or automatically by some component) the podSelector of the client service to add an additional label (say, unhealthy or disabled) such that the podSelector no longer matches the member pods created by the statefulset. This will result in the client ingress traffic being cut off. 
The peer service is left unmodified so that peer communication is always possible.\nHealth Check The etcd main container and the etcd-backup-restore sidecar containers will be configured with livenessProbe and readinessProbe which will indicate the health of the containers and effectively the corresponding ETCD cluster member pod.\nBackup Failure As described above, using readinessProbe failures based on the latest backup failure is not viable in the multi-node ETCD scenario.\nThough cutting off traffic by manipulating client service podSelector is workable, it may not be desirable.\nIt is recommended that on backup failure, the leading etcd-backup-restore sidecar (the one that is responsible for taking backups at that point in time, as explained in the backup section below) updates the BackupReady condition in the Etcd status and raises a high-priority alert to the landscape operators but does not cut off the client traffic.\nThe reasoning behind this decision to not cut off the client traffic on backup failure is to allow the Kubernetes cluster’s control plane (which relies on the ETCD cluster) to keep functioning as long as possible and to avoid bringing down the control plane due to a missed backup.\nThe risk of this approach is that with a cascaded sequence of failures (on top of the backup failure), there is a chance of more data loss than the frequency of backup would otherwise indicate.\nTo be precise, the risk of such an additional data loss manifests only when backup failure as well as a special case of quorum loss (majority of the members are not ready) happen in such a way that the ETCD cluster needs to be re-bootstrapped from the backup. 
As described here, re-bootstrapping the ETCD cluster requires restoration from the latest backup only when a majority of members no longer have uncorrupted data persistence.\nIf persistent storage is used, this will happen only when a backup failure and the failure of a majority of the disks/volumes backing the ETCD cluster members occur simultaneously. This would indeed be rare and might be an acceptable risk.\nIf ephemeral storage is used (especially, in-memory), the data loss will happen if a majority of the ETCD cluster members become NotReady (requiring a pod restart) at the same time as the backup failure. This may not be as rare as majority members’ disk/volume failure. The risk can be somewhat mitigated at least for planned maintenance operations by postponing potentially disruptive maintenance operations when BackupReady condition is false (vertical scaling, rolling updates, evictions due to node roll-outs).\nBut in practice (when ephemeral storage is used), the current proposal suggests restoring from the latest full backup even when a minority of ETCD members (even a single pod) restart, both to speed up the process of the new member catching up to the latest revision and to avoid load on the leading member which needs to supply the data to bring the new member up-to-date. But as described here, in case of a minority member failure while using ephemeral storage, it is possible to restart the new member with empty data and let it fetch all the data from the leading member (only if backup is not accessible). Though this is suboptimal, it is workable given the constraints and conditions. With this, the risk of additional data loss in the case of ephemeral storage arises only if backup failure as well as quorum loss happen. While this is still less rare than the risk of additional data loss in case of persistent storage, the risk might be tolerable, provided the risk of quorum loss is not too high. 
This needs to be monitored/evaluated before opting for ephemeral storage.\nGiven these constraints, it is better to dynamically avoid/postpone some potentially disruptive operations when BackupReady condition is false. This has the effect of allowing n/2 members to be evicted when the backups are healthy and completely disabling evictions when backups are not healthy.\n Skip/postpone potentially disruptive maintenance operations (listed below) when the BackupReady condition is false. Vertical scaling. Rolling updates. Basically, any updates to the StatefulSet spec, which includes vertical scaling. Dynamically toggle the minAvailable field of the PodDisruptionBudget between n/2 + 1 and n (where n is the ETCD desired cluster size) whenever the BackupReady condition toggles between true and false. This will mean that etcd-backup-restore becomes Kubernetes-aware. But there might be reasons for making etcd-backup-restore Kubernetes-aware anyway (e.g. to update the etcd resource status with latest full snapshot details). This enhancement should keep etcd-backup-restore backward compatible. I.e. it should be possible to use etcd-backup-restore Kubernetes-unaware as before this proposal. This is possible either by auto-detecting the existence of kubeconfig or by an explicit command-line flag (such as --enable-client-service-updates which can be defaulted to false for backward compatibility).\nAlternative The alternative is for etcd-druid to implement the above functionality.\nBut etcd-druid is centrally deployed in the host Kubernetes cluster and cannot scale well horizontally. So, it can potentially be a bottleneck if it is involved in the regular health check mechanism for all the etcd clusters it manages. 
Also, the recommended approach above is more robust because it can work even if etcd-druid is down when the backup upload of a particular etcd cluster fails.\nStatus It is desirable (for the etcd-druid and landscape administrators/operators) to maintain/expose status of the etcd cluster instances in the status sub-resource of the Etcd CRD. The proposed structure for maintaining the status is as shown in the example below.\napiVersion: druid.gardener.cloud/v1alpha1 kind: Etcd metadata: name: etcd-main spec: replicas: 3 ... ... status: ... conditions: - type: Ready # Condition type for the readiness of the ETCD cluster status: \"True\" # Indicates if the ETCD Cluster is ready or not lastHeartbeatTime: \"2020-11-10T12:48:01Z\" lastTransitionTime: \"2020-11-10T12:48:01Z\" reason: Quorate # Quorate|QuorumLost - type: AllMembersReady # Condition type for the readiness of all the members of the ETCD cluster status: \"True\" # Indicates if all the members of the ETCD Cluster are ready lastHeartbeatTime: \"2020-11-10T12:48:01Z\" lastTransitionTime: \"2020-11-10T12:48:01Z\" reason: AllMembersReady # AllMembersReady|NotAllMembersReady - type: BackupReady # Condition type for the readiness of the backup of the ETCD cluster status: \"True\" # Indicates if the backup of the ETCD cluster is ready lastHeartbeatTime: \"2020-11-10T12:48:01Z\" lastTransitionTime: \"2020-11-10T12:48:01Z\" reason: FullBackupSucceeded # FullBackupSucceeded|IncrementalBackupSucceeded|FullBackupFailed|IncrementalBackupFailed ... clusterSize: 3 ... replicas: 3 ... 
members: - name: etcd-main-0 # member pod name id: 272e204152 # member Id role: Leader # Member|Leader status: Ready # Ready|NotReady|Unknown lastTransitionTime: \"2020-11-10T12:48:01Z\" reason: LeaseSucceeded # LeaseSucceeded|LeaseExpired|UnknownGracePeriodExceeded|PodNotReady - name: etcd-main-1 # member pod name id: 272e204153 # member Id role: Member # Member|Leader status: Ready # Ready|NotReady|Unknown lastTransitionTime: \"2020-11-10T12:48:01Z\" reason: LeaseSucceeded # LeaseSucceeded|LeaseExpired|UnknownGracePeriodExceeded|PodNotReady This proposal recommends that etcd-druid (preferably, the custodian controller in etcd-druid) maintains most of the information in the status of the Etcd resources described above.\nOne exception to this is the BackupReady condition which is recommended to be maintained by the leading etcd-backup-restore sidecar container. This will mean that etcd-backup-restore becomes Kubernetes-aware. But there are other reasons for making etcd-backup-restore Kubernetes-aware anyway (e.g. to maintain health conditions). This enhancement should keep etcd-backup-restore backward compatible. I.e. it should be possible to use etcd-backup-restore Kubernetes-unaware as before this proposal. This is possible either by auto-detecting the existence of kubeconfig or by an explicit command-line flag (such as --enable-etcd-status-updates which can be defaulted to false for backward compatibility).\nMembers The members section of the status is intended to be maintained by etcd-druid (preferably, the custodian controller of etcd-druid) based on the leases of the individual members.\nNote An earlier design in this proposal was for the individual etcd-backup-restore sidecars to update the corresponding status.members entries themselves. 
But this was redesigned to use member leases to avoid conflicts arising from frequent updates and the limitations in the support for Server-Side Apply in some versions of Kubernetes.\nThe spec.holderIdentity field in the leases is used to communicate the ETCD member id and role between the etcd-backup-restore sidecars and etcd-druid.\nMember name as the key In an ETCD cluster, the member id is the unique identifier for a member. However, this proposal recommends using a single StatefulSet whose pods form the members of the ETCD cluster and Pods of a StatefulSet have uniquely indexed names as well as uniquely addressable DNS.\nThis proposal recommends that the name of the member (which is the same as the name of the member Pod) be used as the unique key to identify a member in the members array. This can, to some extent, minimise the need to clean up superfluous entries in the members array after the member pods are gone, because the replacement pods for any member will share the same name and will overwrite the entry with a possibly new member id.\nThere is still the possibility of not only superfluous entries in the members array but also superfluous members in the ETCD cluster for which there is no corresponding pod in the StatefulSet anymore.\nFor example, if an ETCD cluster is scaled up from 3 to 5 and the new members fail constantly due to insufficient resources, and the ETCD cluster is then scaled back down to 3, the failing member pods may not have the chance to clean up their member entries (from the members array as well as from the ETCD cluster), leading to superfluous members in the cluster that may have an adverse effect on the quorum of the cluster.\nHence, the superfluous entries in both the members array as well as the ETCD cluster need to be cleaned up as appropriate.\nMember Leases One Kubernetes lease object per desired ETCD member is maintained by etcd-druid (preferably, the custodian controller in etcd-druid). 
The lease objects will be created in the same namespace as their owning Etcd object and will have the same name as the member to which they correspond (which, in turn, would be the same as the name of the pod in which the member ETCD process runs).\nThe lease objects are created and deleted only by etcd-druid but are continually renewed within the leaseDurationSeconds by the individual etcd-backup-restore sidecars (corresponding to their members) if the corresponding ETCD member is ready and is part of the ETCD cluster.\nThis will mean that etcd-backup-restore becomes Kubernetes-aware. But there are other reasons for making etcd-backup-restore Kubernetes-aware anyway (e.g. to maintain health conditions). This enhancement should keep etcd-backup-restore backward compatible. I.e. it should be possible to use etcd-backup-restore Kubernetes-unaware as before this proposal. This is possible either by auto-detecting the existence of kubeconfig or by an explicit command-line flag (such as --enable-etcd-lease-renewal which can be defaulted to false for backward compatibility).\nA member entry in the Etcd resource status would be marked as Ready (with reason: LeaseSucceeded) if the corresponding pod is ready and the corresponding lease has not yet expired. The member entry would be marked as NotReady if the corresponding pod is not ready (with reason: PodNotReady) or as Unknown if the corresponding lease has expired (with reason: LeaseExpired).\nWhile renewing the lease, the etcd-backup-restore sidecars also maintain the ETCD member id and their role (Leader or Member) separated by : in the spec.holderIdentity field of the corresponding lease object since this information is only available to the ETCD member processes and the etcd-backup-restore sidecars (e.g. 272e204152:Leader or 272e204153:Member). 
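The lease handling described above can be illustrated with a minimal sketch. It is written in Python purely for brevity and is not the etcd-druid implementation; the names parse_holder_identity, MemberLease and member_status are hypothetical. The precedence between a non-ready pod and an expired lease is an assumption of this sketch, since the proposal does not say which reason wins when both apply.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

VALID_ROLES = ("Leader", "Member")


def parse_holder_identity(holder_identity: str) -> Tuple[Optional[str], Optional[str]]:
    """Split "<member id>:<role>" into (id, role).

    Returns (None, None) for an empty value (a freshly created lease)
    or for anything that does not match the expected shape.
    """
    member_id, sep, role = holder_identity.partition(":")
    if not sep or not member_id or role not in VALID_ROLES:
        return None, None
    return member_id, role


@dataclass
class MemberLease:
    """Hypothetical, simplified view of a member lease."""
    holder_identity: str
    renew_time: datetime
    lease_duration_seconds: int

    def expired(self, now: datetime) -> bool:
        return now > self.renew_time + timedelta(seconds=self.lease_duration_seconds)


def member_status(pod_ready: bool, lease: MemberLease, now: datetime) -> Tuple[str, str]:
    """Derive (status, reason) for a status.members entry.

    Assumption: a non-ready pod takes precedence over an expired lease.
    """
    if not pod_ready:
        return "NotReady", "PodNotReady"
    if lease.expired(now):
        return "Unknown", "LeaseExpired"
    return "Ready", "LeaseSucceeded"


if __name__ == "__main__":
    now = datetime(2020, 11, 10, 12, 48, 1, tzinfo=timezone.utc)
    lease = MemberLease("272e204152:Leader", now - timedelta(seconds=5), 10)
    print(parse_holder_identity(lease.holder_identity))
    print(member_status(True, lease, now))
```

Keeping the parsing tolerant of an empty value matters because the field is only filled in once a sidecar renews the lease for the first time.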
When the lease objects are created by etcd-druid, the spec.holderIdentity field would be empty.\nThe value in spec.holderIdentity in the leases is parsed and copied onto the id and role fields of the corresponding status.members by etcd-druid.\nConditions The conditions section in the status describes the overall condition of the ETCD cluster. The condition type Ready indicates if the ETCD cluster as a whole is ready to serve requests (i.e. the cluster is quorate) even though some minority of the members are not ready. The condition type AllMembersReady indicates if all the members of the ETCD cluster are ready. The distinction between these conditions could be significant for both external consumers of the status as well as etcd-druid itself. Some maintenance operations might be safe to do (e.g. rolling updates) only when all members of the cluster are ready. The condition type BackupReady indicates if the most recent backup upload (full or incremental) succeeded. This information also might be significant because some maintenance operations might be safe to do (e.g. anything that involves re-bootstrapping the ETCD cluster) only when backup is ready.\nThe Ready and AllMembersReady conditions can be maintained by etcd-druid based on the status in the members section. The BackupReady condition will be maintained by the leading etcd-backup-restore sidecar that is in charge of taking backups.\nMore condition types could be introduced in the future if specific purposes arise.\nClusterSize The clusterSize field contains the current size of the ETCD cluster. It will be actively kept up-to-date by etcd-druid in all scenarios.\n Before bootstrapping the ETCD cluster (during cluster creation or later bootstrapping because of quorum failure), etcd-druid will clear the status.members array and set status.clusterSize to be equal to spec.replicas. 
While the ETCD cluster is quorate, etcd-druid will actively set status.clusterSize to be equal to length of the status.members whenever the length of the array changes (say, due to scaling of the ETCD cluster). Given that clusterSize reliably represents the size of the ETCD cluster, it can be used to calculate the Ready condition.\nAlternative The alternative is for etcd-druid to maintain the status in the Etcd status sub-resource. But etcd-druid is centrally deployed in the host Kubernetes cluster and cannot scale well horizontally. So, it can potentially be a bottleneck if it is involved in regular health check mechanism for all the etcd clusters it manages. Also, the recommended approach above is more robust because it can work even if etcd-druid is down when the backup upload of a particular etcd cluster fails.\nDecision table for etcd-druid based on the status The following decision table describes the various criteria etcd-druid takes into consideration to determine the different etcd cluster management scenarios and the corresponding reconciliation actions it must take. The general principle is to detect the scenario and take the minimum action to move the cluster along the path to good health. The path from any one scenario to a state of good health will typically involve going through multiple reconciliation actions which probably take the cluster through many other cluster management scenarios. Especially, it is proposed that individual members auto-heal where possible, even in the case of the failure of a majority of members of the etcd cluster and that etcd-druid takes action only if the auto-healing doesn’t happen for a configured period of time.\n1. Pink of health Observed state Cluster Size Desired: n Current: n StatefulSet replicas Desired: n Ready: n Etcd status members Total: n Ready: n Members NotReady for long enough to be evicted, i.e. 
lastTransitionTime \u003e notReadyGracePeriod: 0 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: 0 conditions: Ready: true AllMembersReady: true BackupReady: true Recommended Action Nothing to do\n2. Member status is out of sync with their leases Observed state Cluster Size Desired: n Current: n StatefulSet replicas Desired: n Ready: n Etcd status members Total: n Ready: r Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: 0 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: l conditions: Ready: true AllMembersReady: true BackupReady: true Recommended Action Mark the l members corresponding to the expired leases as Unknown with reason LeaseExpired and with id populated from spec.holderIdentity of the lease if they are not already updated so.\nMark the n - l members corresponding to the active leases as Ready with reason LeaseSucceeded and with id populated from spec.holderIdentity of the lease if they are not already updated so.\nPlease refer here for more details.\n3. All members are Ready but AllMembersReady condition is stale Observed state Cluster Size Desired: N/A Current: N/A StatefulSet replicas Desired: n Ready: N/A Etcd status members Total: n Ready: n Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: 0 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: 0 conditions: Ready: N/A AllMembersReady: false BackupReady: N/A Recommended Action Mark the status condition type AllMembersReady to true.\n4. 
Not all members are Ready but AllMembersReady condition is stale Observed state Cluster Size\n Desired: N/A Current: N/A StatefulSet replicas\n Desired: n Ready: N/A Etcd status\n members Total: N/A Ready: r where 0 \u003c= r \u003c n Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: nr where 0 \u003c nr \u003c n Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: u where 0 \u003c u \u003c n Members with expired lease: h where 0 \u003c h \u003c n conditions: Ready: N/A AllMembersReady: true BackupReady: N/A where (nr + u + h) \u003e 0 or r \u003c n\n Recommended Action Mark the status condition type AllMembersReady to false.\n5. Majority members are Ready but Ready condition is stale Observed state Cluster Size\n Desired: N/A Current: N/A StatefulSet replicas\n Desired: n Ready: N/A Etcd status\n members Total: n Ready: r where r \u003e n/2 Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: nr where 0 \u003c nr \u003c n/2 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: u where 0 \u003c u \u003c n/2 Members with expired lease: N/A conditions: Ready: false AllMembersReady: N/A BackupReady: N/A where 0 \u003c (nr + u + h) \u003c n/2\n Recommended Action Mark the status condition type Ready to true.\n6. Majority members are NotReady but Ready condition is stale Observed state Cluster Size\n Desired: N/A Current: N/A StatefulSet replicas\n Desired: n Ready: N/A Etcd status\n members Total: n Ready: r where 0 \u003c r \u003c n Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: nr where 0 \u003c nr \u003c n Members with readiness status Unknown long enough to be considered NotReady, i.e. 
lastTransitionTime \u003e unknownGracePeriod: u where 0 \u003c u \u003c n Members with expired lease: N/A conditions: Ready: true AllMembersReady: N/A BackupReady: N/A where (nr + u + h) \u003e n/2 or r \u003c n/2\n Recommended Action Mark the status condition type Ready to false.\n7. Some members have been in Unknown status for a while Observed state Cluster Size Desired: N/A Current: n StatefulSet replicas Desired: N/A Ready: N/A Etcd status members Total: N/A Ready: N/A Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: N/A Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: u where u \u003c= n Members with expired lease: N/A conditions: Ready: N/A AllMembersReady: N/A BackupReady: N/A Recommended Action Mark the u members as NotReady in Etcd status with reason: UnknownGracePeriodExceeded.\n8. Some member pods are not Ready but have not had the chance to update their status Observed state Cluster Size Desired: N/A Current: n StatefulSet replicas Desired: n Ready: s where s \u003c n Etcd status members Total: N/A Ready: N/A Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: N/A Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: N/A Members with expired lease: N/A conditions: Ready: N/A AllMembersReady: N/A BackupReady: N/A Recommended Action Mark the n - s members (corresponding to the pods that are not Ready) as NotReady in Etcd status with reason: PodNotReady\n9. Quorate cluster with a minority of members NotReady Observed state Cluster Size Desired: N/A Current: n StatefulSet replicas Desired: N/A Ready: N/A Etcd status members Total: n Ready: n - f Members NotReady for long enough to be evicted, i.e. 
lastTransitionTime \u003e notReadyGracePeriod: f where f \u003c n/2 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: N/A conditions: Ready: true AllMembersReady: false BackupReady: true Recommended Action Delete the f NotReady member pods to force restart of the pods if they do not automatically restart via failed livenessProbe. The expectation is that they will either re-join the cluster as an existing member or remove themselves and join as new members on restart of the container or pod and renew their leases.\n10. Quorum lost with a majority of members NotReady Observed state Cluster Size Desired: N/A Current: n StatefulSet replicas Desired: N/A Ready: N/A Etcd status members Total: n Ready: n - f Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: f where f \u003e= n/2 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: N/A Members with expired lease: N/A conditions: Ready: false AllMembersReady: false BackupReady: true Recommended Action Scale down the StatefulSet to replicas: 0. Ensure that all member pods are deleted. Ensure that all the members are removed from Etcd status. Delete and recreate all the member leases. Recover the cluster from loss of quorum as discussed here.\n11. Scale up of a healthy cluster Observed state Cluster Size Desired: d Current: n where d \u003e n StatefulSet replicas Desired: N/A Ready: n Etcd status members Total: n Ready: n Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: 0 Members with readiness status Unknown long enough to be considered NotReady, i.e. 
lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: 0 conditions: Ready: true AllMembersReady: true BackupReady: true Recommended Action Add d - n new members by scaling the StatefulSet to replicas: d. The rest of the StatefulSet spec need not be updated until the next cluster bootstrapping (alternatively, the rest of the StatefulSet spec can be updated pro-actively once the new members join the cluster. This will trigger a rolling update).\nAlso, create the additional member leases for the d - n new members.\n12. Scale down of a healthy cluster Observed state Cluster Size Desired: d Current: n where d \u003c n StatefulSet replicas Desired: n Ready: n Etcd status members Total: n Ready: n Members NotReady for long enough to be evicted, i.e. lastTransitionTime \u003e notReadyGracePeriod: 0 Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: 0 Members with expired lease: 0 conditions: Ready: true AllMembersReady: true BackupReady: true Recommended Action Remove n - d existing members (numbered d, d + 1 … n - 1) by scaling the StatefulSet to replicas: d. The StatefulSet spec need not be updated until the next cluster bootstrapping (alternatively, the StatefulSet spec can be updated pro-actively once the superfluous members exit the cluster. This will trigger a rolling update).\nAlso, delete the member leases for the n - d members being removed.\nThe superfluous entries in the members array will be cleaned up as explained here. The superfluous members in the ETCD cluster will be cleaned up by the leading etcd-backup-restore sidecar.\n13. Superfluous member entries in Etcd status Observed state Cluster Size Desired: N/A Current: n StatefulSet replicas Desired: n Ready: n Etcd status members Total: m where m \u003e n Ready: N/A Members NotReady for long enough to be evicted, i.e. 
lastTransitionTime \u003e notReadyGracePeriod: N/A Members with readiness status Unknown long enough to be considered NotReady, i.e. lastTransitionTime \u003e unknownGracePeriod: N/A Members with expired lease: N/A conditions: Ready: N/A AllMembersReady: N/A BackupReady: N/A Recommended Action Remove the superfluous m - n member entries from Etcd status (numbered n, n+1 … m). Remove the superfluous m - n member leases if they exist. The superfluous members in the ETCD cluster will be cleaned up by the leading etcd-backup-restore sidecar.\nDecision table for etcd-backup-restore during initialization As discussed above, the initialization sequence of etcd-backup-restore in a member pod needs to generate suitable etcd configuration for its etcd container. It also might have to handle the etcd database verification and restoration functionality differently in different scenarios.\nThe initialization sequence itself is proposed to be as follows. It is an enhancement of the existing initialization sequence. The details of the decisions to be taken during the initialization are given below.\n1. First member during bootstrap of a fresh etcd cluster Observed state Cluster Size: n Etcd status members: Total: 0 Ready: 0 Status contains own member: false Data persistence WAL directory has cluster/ member metadata: false Data directory is valid and up-to-date: false Backup Backup exists: false Backup has incremental snapshots: false Recommended Action Generate etcd configuration with n initial cluster peer URLs and initial cluster state new and return success.\n2. 
Addition of a new following member during bootstrap of a fresh etcd cluster Observed state Cluster Size: n Etcd status members: Total: m where 0 \u003c m \u003c n Ready: m Status contains own member: false Data persistence WAL directory has cluster/ member metadata: false Data directory is valid and up-to-date: false Backup Backup exists: false Backup has incremental snapshots: false Recommended Action Generate etcd configuration with n initial cluster peer URLs and initial cluster state new and return success.\n3. Restart of an existing member of a quorate cluster with valid metadata and data Observed state Cluster Size: n Etcd status members: Total: m where m \u003e n/2 Ready: r where r \u003e n/2 Status contains own member: true Data persistence WAL directory has cluster/ member metadata: true Data directory is valid and up-to-date: true Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action Re-use previously generated etcd configuration and return success.\n4. Restart of an existing member of a quorate cluster with valid metadata but without valid data Observed state Cluster Size: n Etcd status members: Total: m where m \u003e n/2 Ready: r where r \u003e n/2 Status contains own member: true Data persistence WAL directory has cluster/ member metadata: true Data directory is valid and up-to-date: false Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action Remove self as a member (old member ID) from the etcd cluster as well as Etcd status. Add self as a new member of the etcd cluster as well as in the Etcd status. If backups do not exist, create an empty data and WAL directory. If backups exist, restore only the latest full snapshot (please see here for the reason for not restoring incremental snapshots). Generate etcd configuration with n initial cluster peer URLs and initial cluster state existing and return success.\n5. 
Restart of an existing member of a quorate cluster without valid metadata Observed state Cluster Size: n Etcd status members: Total: m where m \u003e n/2 Ready: r where r \u003e n/2 Status contains own member: true Data persistence WAL directory has cluster/ member metadata: false Data directory is valid and up-to-date: N/A Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action Remove self as a member (old member ID) from the etcd cluster as well as Etcd status. Add self as a new member of the etcd cluster as well as in the Etcd status. If backups do not exist, create an empty data and WAL directory. If backups exist, restore only the latest full snapshot (please see here for the reason for not restoring incremental snapshots). Generate etcd configuration with n initial cluster peer URLs and initial cluster state existing and return success.\n6. Restart of an existing member of a non-quorate cluster with valid metadata and data Observed state Cluster Size: n Etcd status members: Total: m where m \u003c n/2 Ready: r where r \u003c n/2 Status contains own member: true Data persistence WAL directory has cluster/ member metadata: true Data directory is valid and up-to-date: true Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action Re-use previously generated etcd configuration and return success.\n7. Restart of the first member of a non-quorate cluster without valid data Observed state Cluster Size: n Etcd status members: Total: 0 Ready: 0 Status contains own member: false Data persistence WAL directory has cluster/ member metadata: N/A Data directory is valid and up-to-date: false Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action If backups do not exist, create an empty data and WAL directory. If backups exist, restore the latest full snapshot. Start a single-node embedded etcd with initial cluster peer URLs containing only own peer URL and initial cluster state new. 
If incremental snapshots exist, apply them serially (honouring source transactions). Take and upload a full snapshot after incremental snapshots are applied successfully (please see here for more reasons why). Generate etcd configuration with n initial cluster peer URLs and initial cluster state new and return success.\n8. Restart of a following member of a non-quorate cluster without valid data Observed state Cluster Size: n Etcd status members: Total: m where 1 \u003c m \u003c n Ready: r where 1 \u003c r \u003c n Status contains own member: false Data persistence WAL directory has cluster/ member metadata: N/A Data directory is valid and up-to-date: false Backup Backup exists: N/A Backup has incremental snapshots: N/A Recommended Action If backups do not exist, create an empty data and WAL directory. If backups exist, restore only the latest full snapshot (please see here for the reason for not restoring incremental snapshots). Generate etcd configuration with n initial cluster peer URLs and initial cluster state existing and return success.\nBackup Only one of the etcd-backup-restore sidecars among the members is required to take the backup for a given ETCD cluster. This can be called a backup leader. There are two possibilities to ensure this.\nLeading ETCD main container’s sidecar is the backup leader The backup-restore sidecar could poll the etcd cluster and/or its own etcd main container to see if it is the leading member in the etcd cluster. This information can be used by the backup-restore sidecars to decide that the sidecar of the leading etcd main container is the backup leader (i.e. responsible for taking/uploading backups regularly).\nThe advantages of this approach are as follows.\n The approach is operationally and conceptually simple. The leading etcd container and backup-restore sidecar are always located in the same pod. Network traffic between the backup container and the etcd cluster will always be local. 
The disadvantage is that this approach may not age well in the future if we think about moving the backup-restore container into a separate pod rather than a sidecar container.\nIndependent leader election between backup-restore sidecars We could use the etcd lease mechanism to perform leader election among the backup-restore sidecars. For example, using something like go.etcd.io/etcd/clientv3/concurrency.\nThe advantages and disadvantages are pretty much the opposite of the approach above. The advantage is that this approach may age well in the future if we think about moving the backup-restore container into a separate pod rather than a sidecar container.\nThe disadvantages are as follows.\n The approach is operationally and conceptually a bit complex. The leading etcd container and backup-restore sidecar might potentially belong to different pods. Network traffic between the backup container and the etcd cluster might potentially be across nodes. History Compaction This proposal recommends configuring automatic history compaction on the individual members.\nDefragmentation Defragmentation is already triggered periodically by etcd-backup-restore. This proposal recommends enhancing this functionality so that it is performed only by the leading backup-restore container. The defragmentation must be performed only when the etcd cluster is in full health and must be done in a rolling manner for each member to avoid disruption. The leading member should be defragmented last, after all the rest of the members have been defragmented, to minimise potential leadership changes caused by defragmentation. If the etcd cluster is unhealthy when it is time to trigger scheduled defragmentation, the defragmentation must be postponed until the cluster becomes healthy. This check must be done before triggering defragmentation for each member.\nWork-flows in etcd-backup-restore There are different work-flows in etcd-backup-restore. 
Some existing flows like initialization, scheduled backups and defragmentation have been enhanced or modified. Some new work-flows like status updates have been introduced. Some of these work-flows are sensitive to which etcd-backup-restore container is leading and some are not.\nThe life-cycle of these work-flows is shown below. Work-flows independent of leader election in all members Serve the HTTP API that all members are currently expected to support. However, HTTP API calls that take an out-of-sync delta or full snapshot should delegate the incoming HTTP requests to the leading sidecar; one possible approach to achieve this is via an HTTP reverse proxy. Check the health of the respective etcd member and renew the corresponding member lease. Work-flows only on the leading member Take backups (full and incremental) at configured regular intervals Defragment all the members sequentially at configured regular intervals Clean up superfluous members from the ETCD cluster for which there is no corresponding pod (the ordinal in the pod name is greater than the cluster size) at regular intervals (or whenever the Etcd resource status changes by watching it) The cleanup of superfluous entries in status.members array is already covered here High Availability Considering that high availability is the primary reason for using a multi-node etcd cluster, it makes sense to distribute the individual member pods of the etcd cluster across different physical nodes. 
If the underlying Kubernetes cluster has nodes from multiple availability zones, it makes sense to also distribute the member pods across nodes from different availability zones.\nOne possibility to do this is via SelectorSpreadPriority of kube-scheduler but this is only best-effort and may not always be enforced strictly.\nIt is better to use pod anti-affinity to enforce such distribution of member pods.\nZonal Cluster - Single Availability Zone A zonal cluster is configured to consist of nodes belonging to only a single availability zone in a region of the cloud provider. In such a case, we can at best distribute the member pods of a multi-node etcd cluster instance only across different nodes in the configured availability zone.\nThis can be done by specifying pod anti-affinity in the specification of the member pods using kubernetes.io/hostname as the topology key.\napiVersion: apps/v1 kind: StatefulSet ... spec: ... template: ... spec: ... affinity: podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: {} # podSelector that matches the member pods of the given etcd cluster instance topologyKey: \"kubernetes.io/hostname\" ... ... ... The recommendation is to keep etcd-druid agnostic of such topics related to scheduling and cluster-topology and to use kupid to orthogonally inject the desired pod anti-affinity.\nAlternative Another option is to build the functionality into etcd-druid to include the required pod anti-affinity when it provisions the StatefulSet that manages the member pods. While this has the advantage of avoiding a dependency on an external component like kupid, the disadvantage is that we might need to address development or testing use-cases where it might be desirable to avoid distributing member pods and schedule them on as few nodes as possible. 
Also, as mentioned below, kupid can be used to distribute member pods of an etcd cluster instance across nodes in a single availability zone as well as across nodes in multiple availability zones with very minor variation. This keeps the solution uniform regardless of the topology of the underlying Kubernetes cluster.\nRegional Cluster - Multiple Availability Zones A regional cluster is configured to consist of nodes belonging to multiple availability zones (typically, three) in a region of the cloud provider. In such a case, we can distribute the member pods of a multi-node etcd cluster instance across nodes belonging to different availability zones.\nThis can be done by specifying pod anti-affinity in the specification of the member pods using topology.kubernetes.io/zone as the topology key. In Kubernetes clusters running a Kubernetes release older than 1.17, the older (and now deprecated) failure-domain.beta.kubernetes.io/zone might have to be used as the topology key.\napiVersion: apps/v1 kind: StatefulSet ... spec: ... template: ... spec: ... affinity: podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: {} # podSelector that matches the member pods of the given etcd cluster instance topologyKey: \"topology.kubernetes.io/zone\" ... ... ... The recommendation is to keep etcd-druid agnostic of such topics related to scheduling and cluster-topology and to use kupid to orthogonally inject the desired pod anti-affinity.\nAlternative Another option is to build the functionality into etcd-druid to include the required pod anti-affinity when it provisions the StatefulSet that manages the member pods. While this has the advantage of avoiding a dependency on an external component like kupid, the disadvantage is that such built-in support necessarily limits what kind of topologies of the underlying cluster will be supported. 
Hence, it is better to keep etcd-druid altogether agnostic of issues related to scheduling and cluster-topology.\nPodDisruptionBudget This proposal recommends that etcd-druid should deploy PodDisruptionBudget (minAvailable set to floor(\u003ccluster size\u003e/2) + 1) for multi-node etcd clusters (if AllMembersReady condition is true) so that any planned disruptive operation can honour the disruption budget and preserve high availability of the etcd cluster while performing potentially disruptive maintenance operations.\nAlso, it is recommended to toggle the minAvailable field between floor(\u003ccluster size\u003e/2) and \u003cnumber of members with status Ready true\u003e whenever the AllMembersReady condition toggles between true and false. This is to disable eviction of any member pods when not all members are Ready.\nIn case of a conflict, the recommendation is to use the highest of the applicable values for minAvailable.\nRolling updates to etcd members Any changes to the Etcd resource spec that might result in a change to StatefulSet spec or otherwise result in a rolling update of member pods should be applied/propagated by etcd-druid only when the etcd cluster is fully healthy to reduce the risk of quorum loss during the updates. This would include vertical autoscaling changes (via HVPA). If the cluster status is unhealthy (i.e. if either AllMembersReady or BackupReady conditions are false), etcd-druid must restore it to full health before proceeding with such operations that lead to rolling updates. This can be further optimized in the future to handle the cases where rolling updates can still be performed on an etcd cluster that is not fully healthy.\nFollow Up Ephemeral Volumes See section Ephemeral Volumes.\nShoot Control-Plane Migration This proposal adds support for multi-node etcd clusters but it should not have significant impact on shoot control-plane migration any more than what is already present in the single-node etcd cluster scenario. 
But to be sure, this needs to be discussed further.\nPerformance impact of multi-node etcd clusters Multi-node etcd clusters incur a cost on write performance as compared to single-node etcd clusters. This performance impact needs to be measured and documented. Here, we should compare different persistence options for the multi-node etcd clusters so that we have all the information necessary to take the decision balancing the high-availability, performance and costs.\nMetrics, Dashboards and Alerts There are already metrics exported by etcd and etcd-backup-restore which are visualized in monitoring dashboards and also used in triggering alerts. These might have hidden assumptions about single-node etcd clusters. These might need to be enhanced and potentially new metrics, dashboards and alerts configured to cover the multi-node etcd cluster scenario.\nEspecially, a high priority alert must be raised if BackupReady condition becomes false.\nCosts Multi-node etcd clusters will clearly involve higher cost (when compared with single-node etcd clusters) just going by the CPU and memory usage for the additional members. Also, the different options for persistence for etcd data for the members will have different cost implications. Such cost impact needs to be assessed and documented to help navigate the trade-offs between high availability, performance and costs.\nFuture Work Gardener Ring Gardener Ring requires provisioning and management of an etcd cluster with the members distributed across more than one Kubernetes cluster. This cannot be achieved by etcd-druid alone which has only the view of a single Kubernetes cluster. An additional component that has the view of all the Kubernetes clusters involved in setting up the gardener ring will be required to achieve this. 
However, etcd-druid can be used by such a higher-level component/controller (for example, by supplying the initial cluster configuration) such that individual etcd-druid instances in the individual Kubernetes clusters can manage the corresponding etcd cluster members.\nAutonomous Shoot Clusters Autonomous Shoot Clusters also will require a highly available etcd cluster to back its control-plane and the multi-node support proposed here can be leveraged in that context. However, the current proposal will not meet all the needs of an autonomous shoot cluster. Some additional components will be required that have the overall view of the autonomous shoot cluster and they can use etcd-druid to manage the multi-node etcd cluster. But this scenario may be different from that of Gardener Ring in that the individual etcd members of the cluster may not be hosted on different Kubernetes clusters.\nOptimization of recovery from non-quorate cluster with some member containing valid data It might be possible to optimize the actions during the recovery of a non-quorate cluster where some of the members contain valid data and some others don’t. The optimization involves verifying the data of the valid members to determine which member has the most recent data (even considering the latest backup) so that the full snapshot can be taken from it before recovering the etcd cluster. 
Such an optimization can be attempted in the future.\nOptimization of rolling updates to unhealthy etcd clusters As mentioned above, optimizations to proceed with rolling updates to unhealthy etcd clusters (without first restoring the cluster to full health) can be pursued in future work.\n","categories":"","description":"","excerpt":"Multi-node etcd cluster instances via etcd-druid This document …","ref":"/docs/other-components/etcd-druid/proposals/01-multi-node-etcd-clusters/","tags":"","title":"01 Multi Node Etcd Clusters"},{"body":"Snapshot Compaction for Etcd Current Problem To ensure recoverability of Etcd, backups of the database are taken at regular intervals. Backups are of two types: Full Snapshots and Incremental Snapshots.\nFull Snapshots A full snapshot is a snapshot of the complete database at a given point in time. The size of the database keeps changing with time and typically the size is relatively large (measured in 100s of megabytes or even in gigabytes). For this reason, full snapshots are taken at relatively long intervals.\nIncremental Snapshots Incremental snapshots are collections of events on the Etcd database, obtained by running a WATCH API call on Etcd. At relatively short intervals, all the events accumulated through the WATCH API call are saved in a file, which is called an incremental snapshot.\nRecovery from the Snapshots Recovery from Full Snapshots As the full snapshots are snapshots of the complete database, the whole database can be recovered from a full snapshot in one go. Etcd provides an API call to restore the database from a full snapshot file.\nRecovery from Incremental Snapshots Delta snapshots are collections of retrospective Etcd events. So, to restore from an incremental snapshot file, the events from the file need to be applied sequentially to the Etcd database through Etcd Put/Delete API calls. 
Because this restoration depends on sequential Etcd calls, restoring from Incremental Snapshot files can take a long time if numerous commands are captured in the Incremental Snapshot files.\nDelta snapshots are applied on top of a running Etcd database. So, if there is an inconsistency between the state of the database at the point of applying and the state of the database when the delta snapshot commands were captured, restoration will fail.\nCurrently, in the Gardener setup, Etcd is restored from the last full snapshot and then the delta snapshots, which were captured after the last full snapshot.\nThe main problem with this is that the complete restoration time can be unacceptably large if the rate of change coming into the etcd database is quite high because there are a large number of events in the delta snapshots to be applied sequentially. A secondary problem is that, though auto-compaction is enabled for etcd, it is not quick enough to compact all the changes from the incremental snapshots being re-applied during the relatively short period of time of restoration (as compared to the actual period of time when the incremental snapshots were accumulated). This may cause the etcd pod (the backup-restore sidecar container, to be precise) to run out of memory and/or storage space even if it is sufficient for normal operations.\nSolution Compaction command To help with the problem mentioned earlier, our proposal is to introduce a compact subcommand in etcdbrctl. On execution of the compact command, a separate embedded Etcd process will be started where the Etcd data will be restored from the snapstore (exactly as in the restoration scenario today). Then the new Etcd database will be compacted and defragmented using Etcd API calls. The compaction will strip the Etcd database of old revisions as per the Etcd auto-compaction configuration. The defragmentation will free up the unused fragment memory space released after compaction. 
Then a full snapshot of the compacted database will be saved in snapstore which then can be used as the base snapshot during any subsequent restoration (or backup compaction).\nHow the solution works The newly introduced compact command does not disturb the running Etcd while compacting the backup snapshots. The command is designed to run potentially separately (from the main Etcd process/container/pod). Etcd Druid can be configured to run the newly introduced compact command as a separate job (scheduled periodically) based on total number of Etcd events accumulated after the most recent full snapshot.\nEtcd-druid flags: Etcd-druid introduces the following flags to configure the compaction job:\n --enable-backup-compaction (default false): Set this flag to true to enable the automatic compaction of etcd backups when the threshold value denoted by CLI flag --etcd-events-threshold is exceeded. --compaction-workers (default 3): Number of worker threads of the CompactionJob controller. The controller creates a backup compaction job if a certain etcd event threshold is reached. If compaction is enabled, the value for this flag must be greater than zero. --etcd-events-threshold (default 1000000): Total number of etcd events that can be allowed before a backup compaction job is triggered. --active-deadline-duration (default 3h): Duration after which a running backup compaction job will be terminated. --metrics-scrape-wait-duration (default 0s): Duration to wait for after compaction job is completed, to allow Prometheus metrics to be scraped. Points to take care while saving the compacted snapshot: As compacted snapshot and the existing periodic full snapshots are taken by different processes running in different pods but accessing same store to save the snapshots, some problems may arise:\n When uploading the compacted snapshot to the snapstore, there is the problem of how does the restorer know when to start using the newly compacted snapshot. 
This communication needs to be atomic. With a regular schedule for compaction that happens potentially separately from the main etcd pod, is there a need for regular scheduled full snapshots anymore? We are planning to introduce new directory structure, under v2 prefix, for saving the snapshots (compacted and full), as mentioned in details below. But for backward compatibility, we also need to consider the older directory, which is currently under v1 prefix, during accessing snapshots. How to swap full snapshot with compacted snapshot atomically Currently, full snapshots and the subsequent delta snapshots are grouped under same prefix path in the snapstore. When a full snapshot is created, it is placed under a prefix/directory with the name comprising of timestamp. Then subsequent delta snapshots are also pushed into the same directory. Thus each prefix/directory contains a single full snapshot and the subsequent delta snapshots. So far, it is the job of ETCDBR to start main Etcd process and snapshotter process which takes full snapshot and delta snapshot periodically. But as per our proposal, compaction will be running as parallel process to main Etcd process and snapshotter process. So we can’t reliably co-ordinate between the processes to achieve switching to the compacted snapshot as the base snapshot atomically.\nCurrent Directory Structure - Backup-192345 - Full-Snapshot-0-1-192345 - Incremental-Snapshot-1-100-192355 - Incremental-Snapshot-100-200-192365 - Incremental-Snapshot-200-300-192375 - Backup-192789 - Full-Snapshot-0-300-192789 - Incremental-Snapshot-300-400-192799 - Incremental-Snapshot-400-500-192809 - Incremental-Snapshot-500-600-192819 To solve the problem, proposal is:\n ETCDBR will take the first full snapshot after it starts main Etcd Process and snapshotter process. After taking the first full snapshot, snapshotter will continue taking full snapshots. 
On the other hand, ETCDBR compactor command will be run as periodic job in a separate pod and use the existing full or compacted snapshots to produce further compacted snapshots. Full snapshots and compacted snapshots will be named after same fashion. So, there is no need of any mechanism to choose which snapshots(among full and compacted snapshot) to consider as base snapshots. Flatten the directory structure of backup folder. Save all the full snapshots, delta snapshots and compacted snapshots under same directory/prefix. Restorer will restore from full/compacted snapshots and delta snapshots sorted based on the revision numbers in name (or timestamp if the revision numbers are equal). Proposed Directory Structure Backup : - Full-Snapshot-0-1-192355 (Taken by snapshotter) - Incremental-Snapshot-revision-1-100-192365 - Incremental-Snapshot-revision-100-200-192375 - Full-Snapshot-revision-0-200-192379 (Taken by snapshotter) - Incremental-Snapshot-revision-200-300-192385 - Full-Snapshot-revision-0-300-192386 (Taken by compaction job) - Incremental-Snapshot-revision-300-400-192396 - Incremental-Snapshot-revision-400-500-192406 - Incremental-Snapshot-revision-500-600-192416 - Full-Snapshot-revision-0-600-192419 (Taken by snapshotter) - Full-Snapshot-revision-0-600-192420 (Taken by compaction job) What happens to the delta snapshots that were compacted? The proposed compaction sub-command in etcdbrctl (and hence, the CronJob provisioned by etcd-druid that will schedule it at a regular interval) would only upload the compacted full snapshot. It will not delete the snapshots (delta or full snapshots) that were compacted. These snapshots which were superseded by a freshly uploaded compacted snapshot would follow the same life-cycle as other older snapshots. I.e. they will be garbage collected according to the configured backup snapshot retention policy. 
For example, if an exponential retention policy is configured and if compaction is done every 30m then there might be at most 48 additional (compacted) full snapshots (24h * 2) in the backup for the latest day. As time rolls forward to the next day, these additional compacted snapshots (along with the delta snapshots that were compacted into them) will get garbage collected retaining only one full snapshot for the day before according to the retention policy.\nFuture work In the future, we plan to stop the snapshotter just after taking the first full snapshot. Then, the compaction job will be solely responsible for taking subsequent full snapshots. The directory structure would then look like the following:\nBackup : - Full-Snapshot-0-1-192355 (Taken by snapshotter) - Incremental-Snapshot-revision-1-100-192365 - Incremental-Snapshot-revision-100-200-192375 - Incremental-Snapshot-revision-200-300-192385 - Full-Snapshot-revision-0-300-192386 (Taken by compaction job) - Incremental-Snapshot-revision-300-400-192396 - Incremental-Snapshot-revision-400-500-192406 - Incremental-Snapshot-revision-500-600-192416 - Full-Snapshot-revision-0-600-192420 (Taken by compaction job) Backward Compatibility Restoration : The changes to handle the newly proposed backup directory structure must be backward compatible with older structures at least for restoration because we need to restore from backups in the older structure. This includes the support for restoring from a backup without a metadata file if that is used in the actual implementation. Backup : For new snapshots (even on a backup containing the older structure), the new structure may be used. The new structure must be set up automatically including creating the base full snapshot. Garbage collection : The existing functionality of garbage collection of snapshots (full and incremental) according to the backup retention policy must be compatible with both old and new backup folder structure. I.e. 
the snapshots in the older backup structure must be retained in their own structure and the snapshots in the proposed backup structure should be retained in the proposed structure. Once all the snapshots in the older backup structure go out of the retention policy and are garbage collected, we can think of removing the support for older backup folder structure. Note: Compactor will run parallel to current snapshotter process and work only if there is any full snapshot already present in the store. By current design, a full snapshot will be taken if no full snapshot exists yet or the existing full snapshot is older than 24 hours. It is not a limitation but a design choice. As per proposed design, the backup storage will contain both periodic full snapshots as well as periodic compacted snapshots. Restorer will pick up whichever base snapshot is the latest.\n","categories":"","description":"","excerpt":"Snapshot Compaction for Etcd Current Problem To ensure recoverability …","ref":"/docs/other-components/etcd-druid/proposals/02-snapshot-compaction/","tags":"","title":"02 Snapshot Compaction"},{"body":"Scaling-up a single-node to multi-node etcd cluster deployed by etcd-druid To mark a cluster for scale-up from single node to multi-node etcd, just patch the etcd custom resource’s .spec.replicas from 1 to 3 (for example).\nChallenges for scale-up An etcd cluster with a single replica doesn’t have any peers, so no peer communication is required; hence the peer URL may or may not be TLS enabled. However, while scaling up from single node etcd to multi-node etcd, there will be a requirement to have peer communication between members of the etcd cluster. Peer communication is required for various reasons, for instance for members to sync up cluster state, data, and to perform leader election or any cluster wide operation like removal or addition of a member etc. Hence in a multi-node etcd cluster we need to have a TLS-enabled peer URL for peer communication. 
Providing the correct configuration to start new etcd members, which differs from bootstrapping a cluster because these new etcd members will join an existing cluster. Approach We first went through the etcd doc of update-advertise-peer-urls to find out information regarding peer URL updates. Interestingly, etcd doc has mentioned the following:\nTo update the advertise peer URLs of a member, first update it explicitly via member command and then restart the member. But we can’t assume peer URL is not TLS enabled for single-node cluster as it depends on end-user. A user may or may not enable the TLS for peer URL for a single node etcd cluster. So, how do we detect whether peer URL TLS was enabled or not when the cluster is marked for scale-up?\nDetecting if peerURL TLS is enabled or not For this we use an annotation in member lease object member.etcd.gardener.cloud/tls-enabled set by backup-restore sidecar of etcd. As etcd configuration is provided by backup-restore, it can find out whether TLS is enabled or not and accordingly set this annotation member.etcd.gardener.cloud/tls-enabled to either true or false in member lease object. And with the help of this annotation and config-map values etcd-druid is able to detect whether there is a change in a peer URL or not.\nEtcd-Druid helps in scaling up etcd cluster Now, it is detected whether peer URL was TLS enabled or not for single node etcd cluster. Etcd-druid can now use this information to take action:\n If peer URL was already TLS enabled then no action is required from etcd-druid side. Etcd-druid can proceed with scaling up the cluster. If peer URL was not TLS enabled then etcd-druid has to intervene and make sure peer URL is TLS enabled first for the single node before marking the cluster for scale-up. Action taken by etcd-druid to enable the peerURL TLS Etcd-druid will update the etcd-bootstrap config-map with new config like initial-cluster,initial-advertise-peer-urls etc. 
Backup-restore will detect this change and update the member lease annotation to member.etcd.gardener.cloud/tls-enabled: \"true\". In case the peer URL TLS has been changed to enabled: Etcd-druid will add tasks to the deployment flow: Check if peer TLS has been enabled for existing StatefulSet pods, by checking the member leases for the annotation member.etcd.gardener.cloud/tls-enabled. If peer TLS enablement is pending for any of the members, then check and patch the StatefulSet with the peer TLS volume mounts, if not already patched. This will cause a rolling update of the existing StatefulSet pods, which allows etcd-backup-restore to update the member peer URL in the etcd cluster. Requeue this reconciliation flow until peer TLS has been enabled for all the existing etcd members. After PeerURL is TLS enabled After peer URL TLS enablement for single node etcd cluster, etcd-druid adds a scale-up annotation: gardener.cloud/scaled-to-multi-node to the etcd statefulset and etcd-druid will patch the statefulset’s .spec.replicas to 3 (for example). The statefulset controller will then bring up new pods (etcd with backup-restore as a sidecar). Now etcd’s sidecar, i.e. backup-restore, will check whether this member is already a part of a cluster or not, and in case it is unable to check (maybe due to some network issues), backup-restore checks the presence of this annotation: gardener.cloud/scaled-to-multi-node in etcd statefulset to detect scale-up. If it finds out it is the scale-up case then backup-restore adds the new etcd member as a learner first and then starts the etcd learner by providing the correct configuration. 
Once learner gets in sync with the etcd cluster leader, it will get promoted to a voting member.\nProviding the correct etcd config As backup-restore detects that it’s a scale-up scenario, backup-restore sets initial-cluster-state to existing as this member will join an existing cluster and it calculates the rest of the config from the updated config-map provided by etcd-druid.\nFuture improvements: The need of restarting etcd pods twice will change in the future. please refer: https://github.com/gardener/etcd-backup-restore/issues/538\n","categories":"","description":"","excerpt":"Scaling-up a single-node to multi-node etcd cluster deployed by …","ref":"/docs/other-components/etcd-druid/proposals/03-scaling-up-an-etcd-cluster/","tags":"","title":"03 Scaling Up An Etcd Cluster"},{"body":"Question You have deployed an application with a web UI or an internal endpoint in your Kubernetes (K8s) cluster. How to access this endpoint without an external load balancer (e.g., Ingress)?\nThis tutorial presents two options:\n Using Kubernetes port forward Using Kubernetes apiserver proxy Please note that the options described here are mostly for quick testing or troubleshooting your application. For enabling access to your application for productive environment, please refer to the official Kubernetes documentation.\nSolution 1: Using Kubernetes Port Forward You could use the port forwarding functionality of kubectl to access the pods from your local host without involving a service.\nTo access any pod follow these steps:\n Run kubectl get pods Note down the name of the pod in question as \u003cyour-pod-name\u003e Run kubectl port-forward \u003cyour-pod-name\u003e \u003clocal-port\u003e:\u003cyour-app-port\u003e Run a web browser or curl locally and enter the URL: http(s)://localhost:\u003clocal-port\u003e In addition, kubectl port-forward allows using a resource name, such as a deployment name or service name, to select a matching pod to port forward. 
More details can be found in the Kubernetes documentation.\nThe main drawback of this approach is that the pod’s name changes as soon as it is restarted. Moreover, you need to have a web browser on your client and you need to make sure that the local port is not already used by an application running on your system. Finally, the port forwarding is sometimes canceled for nonobvious reasons. This makes the approach somewhat unreliable. A more stable possibility is based on accessing the app via the apiserver proxy, which accesses the corresponding service.\nSolution 2: Using the apiserver Proxy of Your Kubernetes Cluster There are several different proxies in Kubernetes. In this tutorial we will be using apiserver proxy to enable the access to the services in your cluster without Ingress. Unlike the first solution, here a service is required.\nUse the following format to compose a URL for accessing your service through an existing proxy on the Kubernetes cluster:\nhttps://\u003cyour-cluster-master\u003e/api/v1/namespaces/\u003cyour-namespace\u003e/services/\u003cyour-service\u003e:\u003cyour-service-port\u003e/proxy/\u003cservice-endpoint\u003e\nExample:\n your-main-cluster your-namespace your-service your-service-port your-service-endpoint url to access service api.testclstr.cpet.k8s.sapcloud.io default nginx-svc 80 / https://api.testclstr.cpet.k8s.sapcloud.io/api/v1/namespaces/default/services/nginx-svc:80/proxy/ api.testclstr.cpet.k8s.sapcloud.io default docker-nodejs-svc 4500 /cpu?baseNumber=4 https://api.testclstr.cpet.k8s.sapcloud.io/api/v1/namespaces/default/services/docker-nodejs-svc:4500/proxy/cpu?baseNumber=4 For more details on the format, please refer to the official Kubernetes documentation.\nNote There are applications which do not support relative URLs yet, e.g. Prometheus (as of November, 2022). This typically leads to missing JavaScript objects, which could be investigated with your browser’s development tools. 
If such an issue occurs, please use the port-forward approach described above. ","categories":"","description":"","excerpt":"Question You have deployed an application with a web UI or an internal …","ref":"/docs/guides/applications/access-pod-from-local/","tags":"","title":"Access a Port of a Pod Locally"},{"body":"Access Restrictions The dashboard can be configured with access restrictions.\nAccess restrictions are shown for regions that have a matching label in the CloudProfile\n regions: - name: pangaea-north-1 zones: - name: pangaea-north-1a - name: pangaea-north-1b - name: pangaea-north-1c labels: seed.gardener.cloud/eu-access: \"true\" If the user selects the access restriction, spec.seedSelector.matchLabels[key] will be set. When selecting an option, metadata.annotations[optionKey] will be set. The value that is set depends on the configuration. See 2. under Configuration section below.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: annotations: support.gardener.cloud/eu-access-for-cluster-addons: \"true\" support.gardener.cloud/eu-access-for-cluster-nodes: \"true\" ... spec: seedSelector: matchLabels: seed.gardener.cloud/eu-access: \"true\" In order for the shoot (with enabled access restriction) to be scheduled on a seed, the seed needs to have the label set. E.g.\napiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: labels: seed.gardener.cloud/eu-access: \"true\" ... Configuration As gardener administrator:\n you can control the visibility of the chips with the accessRestriction.items[].display.visibleIf and accessRestriction.items[].options[].display.visibleIf property. E.g. in this example the access restriction chip is shown if the value is true and the option is shown if the value is false. you can control the value of the input field (switch / checkbox) with the accessRestriction.items[].input.inverted and accessRestriction.items[].options[].input.inverted property. Setting the inverted property to true will invert the value. 
That means that when selecting the input field, the value will be 'false' instead of 'true'. you can configure the text that is displayed when no access restriction options are available by setting accessRestriction.noItemsText example values.yaml: accessRestriction: noItemsText: No access restriction options available for region {region} and cloud profile {cloudProfile} items: - key: seed.gardener.cloud/eu-access display: visibleIf: true # title: foo # optional title, if not defined key will be used # description: bar # optional description displayed in a tooltip input: title: EU Access description: | This service is offered to you with our regular SLAs and 24x7 support for the control plane of the cluster. 24x7 support for cluster add-ons and nodes is only available if you meet the following conditions: options: - key: support.gardener.cloud/eu-access-for-cluster-addons display: visibleIf: false # title: bar # optional title, if not defined key will be used # description: baz # optional description displayed in a tooltip input: title: No personal data is used as name or in the content of Gardener or Kubernetes resources (e.g. 
Gardener project name or Kubernetes namespace, configMap or secret in Gardener or Kubernetes) description: | If you can't comply, only third-level/dev support at usual 8x5 working hours in EEA will be available to you for all cluster add-ons such as DNS and certificates, Calico overlay network and network policies, kube-proxy and services, and everything else that would require direct inspection of your cluster through its API server inverted: true - key: support.gardener.cloud/eu-access-for-cluster-nodes display: visibleIf: false input: title: No personal data is stored in any Kubernetes volume except for container file system, emptyDirs, and persistentVolumes (in particular, not on hostPath volumes) description: | If you can't comply, only third-level/dev support at usual 8x5 working hours in EEA will be available to you for all node-related components such as Docker and Kubelet, the operating system, and everything else that would require direct inspection of your nodes through a privileged pod or SSH inverted: true ","categories":"","description":"","excerpt":"Access Restrictions The dashboard can be configured with access …","ref":"/docs/dashboard/access-restrictions/","tags":"","title":"Access Restrictions"},{"body":"Access to the Garden Cluster for Extensions Gardener offers different means to provide or equip registered extensions with a kubeconfig which may be used to connect to the garden cluster.\nAdmission Controllers For extensions with an admission controller deployment, gardener-operator injects a token-based kubeconfig as a volume and volume mount. The token is valid for 12h, automatically renewed, and associated with a dedicated ServiceAccount in the garden cluster. 
The path to this kubeconfig is revealed under the GARDEN_KUBECONFIG environment variable, also added to the pod spec(s).\nExtensions on Seed Clusters Extensions that are installed on seed clusters via a ControllerInstallation can simply read the kubeconfig file specified by the GARDEN_KUBECONFIG environment variable to create a garden cluster client. With this, they use a short-lived token (valid for 12h) associated with a dedicated ServiceAccount in the seed-\u003cseed-name\u003e namespace to securely access the garden cluster. The used ServiceAccounts are granted permissions in the garden cluster similar to gardenlet clients.\nBackground Historically, gardenlet has been the only component running in the seed cluster that has access to both the seed cluster and the garden cluster. Accordingly, extensions running on the seed cluster didn’t have access to the garden cluster.\nStarting from Gardener v1.74.0, there is a new mechanism for components running on seed clusters to get access to the garden cluster. For this, gardenlet runs an instance of the TokenRequestor for requesting tokens that can be used to communicate with the garden cluster.\nUsing Gardenlet-Managed Garden Access By default, extensions are equipped with secure access to the garden cluster using a dedicated ServiceAccount without requiring any additional action. They can simply read the file specified by the GARDEN_KUBECONFIG and construct a garden client with it.\nWhen installing a ControllerInstallation, gardenlet creates two secrets in the installation’s namespace: a generic garden kubeconfig (generic-garden-kubeconfig-\u003chash\u003e) and a garden access secret (garden-access-extension). 
Note that the ServiceAccount created based on this access secret will be created in the respective seed-* namespace in the garden cluster and labelled with controllerregistration.core.gardener.cloud/name=\u003cname\u003e.\nAdditionally, gardenlet injects volume, volumeMounts, and two environment variables into all (init) containers in all objects in the apps and batch API groups:\n GARDEN_KUBECONFIG: points to the path where the generic garden kubeconfig is mounted. SEED_NAME: set to the name of the Seed where the extension is installed. This is useful for restricting watches in the garden cluster to relevant objects. If an object already contains the GARDEN_KUBECONFIG environment variable, it is not overwritten and injection of volume and volumeMounts is skipped.\nFor example, a Deployment deployed via a ControllerInstallation will be mutated as follows:\napiVersion: apps/v1 kind: Deployment metadata: name: gardener-extension-provider-local annotations: reference.resources.gardener.cloud/secret-795f7ca6: garden-access-extension reference.resources.gardener.cloud/secret-d5f5a834: generic-garden-kubeconfig-81fb3a88 spec: template: metadata: annotations: reference.resources.gardener.cloud/secret-795f7ca6: garden-access-extension reference.resources.gardener.cloud/secret-d5f5a834: generic-garden-kubeconfig-81fb3a88 spec: containers: - name: gardener-extension-provider-local env: - name: GARDEN_KUBECONFIG value: /var/run/secrets/gardener.cloud/garden/generic-kubeconfig/kubeconfig - name: SEED_NAME value: local volumeMounts: - mountPath: /var/run/secrets/gardener.cloud/garden/generic-kubeconfig name: garden-kubeconfig readOnly: true volumes: - name: garden-kubeconfig projected: defaultMode: 420 sources: - secret: items: - key: kubeconfig path: kubeconfig name: generic-garden-kubeconfig-81fb3a88 optional: false - secret: items: - key: token path: token name: garden-access-extension optional: false The generic garden kubeconfig will look like this:\napiVersion: v1 kind: 
Config clusters: - cluster: certificate-authority-data: LS0t... server: https://garden.local.gardener.cloud:6443 name: garden contexts: - context: cluster: garden user: extension name: garden current-context: garden users: - name: extension user: tokenFile: /var/run/secrets/gardener.cloud/garden/generic-kubeconfig/token Manually Requesting a Token for the Garden Cluster Seed components that need to communicate with the garden cluster can request a token in the garden cluster by creating a garden access secret. This secret has to be labelled with resources.gardener.cloud/purpose=token-requestor and resources.gardener.cloud/class=garden, e.g.:\napiVersion: v1 kind: Secret metadata: name: garden-access-example namespace: example labels: resources.gardener.cloud/purpose: token-requestor resources.gardener.cloud/class: garden annotations: serviceaccount.resources.gardener.cloud/name: example type: Opaque This will instruct gardenlet to create a new ServiceAccount named example in its own seed-\u003cseed-name\u003e namespace in the garden cluster, request a token for it, and populate the token in the secret’s data under the token key.\nPermissions in the Garden Cluster Both the SeedAuthorizer and the SeedRestriction plugin handle extension clients and generally grant the same permissions in the garden cluster to them as to gardenlet clients. With this, extensions are restricted to work with objects in the garden cluster that are related to the seed they are running on, just like gardenlet. Note that if the plugins are not enabled, extension clients are only granted read access to global resources like CloudProfiles (this is granted to all authenticated users). There are a few exceptions to the granted permissions as documented here.\nAdditional Permissions If an extension needs access to additional resources in the garden cluster (e.g., extension-specific custom resources), permissions need to be granted via the usual RBAC means. 
Let’s consider the following example: An extension requires the privileges to create authorization.k8s.io/v1.SubjectAccessReviews (which is not covered by the “default” permissions mentioned above). This requires a human Gardener operator to create a ClusterRole in the garden cluster with the needed rules:\napiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: extension-create-subjectaccessreviews annotations: authorization.gardener.cloud/extensions-serviceaccount-selector: '{\"matchLabels\":{\"controllerregistration.core.gardener.cloud/name\":\"\u003cextension-name\u003e\"}}' labels: authorization.gardener.cloud/custom-extensions-permissions: \"true\" rules: - apiGroups: - authorization.k8s.io resources: - subjectaccessreviews verbs: - create Note the annotation authorization.gardener.cloud/extensions-serviceaccount-selector which contains a label selector for ServiceAccounts.\nThere is a controller that is part of gardener-controller-manager which takes care of maintaining the respective ClusterRoleBinding resources. It binds all ServiceAccounts in the seed namespaces in the garden cluster (i.e., all extension clients) whose labels match. You can read more about this controller here.\nCustom Permissions If an extension wants to create a dedicated ServiceAccount for accessing the garden cluster without automatically inheriting all permissions of the gardenlet, it first needs to create a garden access secret in its extension namespace in the seed cluster:\napiVersion: v1 kind: Secret metadata: name: my-custom-component namespace: \u003cextension-namespace\u003e labels: resources.gardener.cloud/purpose: token-requestor resources.gardener.cloud/class: garden annotations: serviceaccount.resources.gardener.cloud/name: my-custom-component-extension-foo serviceaccount.resources.gardener.cloud/labels: '{\"foo\":\"bar\"}' type: Opaque ❗️️Do not prefix the service account name with extension- to prevent inheriting the gardenlet permissions! 
It is still recommended to add the extension name (e.g., as a suffix) for easier identification where this ServiceAccount comes from.\nNext, you can follow the same approach described above. However, the authorization.gardener.cloud/extensions-serviceaccount-selector annotation should not contain controllerregistration.core.gardener.cloud/name=\u003cextension-name\u003e but rather custom labels, e.g. foo=bar.\nThis way, the created ServiceAccount will only get the permissions of above ClusterRole and nothing else.\nRenewing All Garden Access Secrets Operators can trigger an automatic renewal of all garden access secrets in a given Seed and their requested ServiceAccount tokens, e.g., when rotating the garden cluster’s ServiceAccount signing key. For this, the Seed has to be annotated with gardener.cloud/operation=renew-garden-access-secrets.\n","categories":"","description":"","excerpt":"Access to the Garden Cluster for Extensions Gardener offers different …","ref":"/docs/gardener/extensions/garden-api-access/","tags":"","title":"Access to the Garden Cluster for Extensions"},{"body":"Accessing Shoot Clusters After creation of a shoot cluster, end-users require a kubeconfig to access it. There are several options available to get to such kubeconfig.\nshoots/adminkubeconfig Subresource The shoots/adminkubeconfig subresource allows users to dynamically generate temporary kubeconfigs that can be used to access shoot cluster with cluster-admin privileges. The credentials associated with this kubeconfig are client certificates which have a very short validity and must be renewed before they expire (by calling the subresource endpoint again).\nThe username associated with such kubeconfig will be the same which is used for authenticating to the Gardener API. 
Apart from this advantage, the created kubeconfig will not be persisted anywhere.\nIn order to request such a kubeconfig, you can run the following commands (targeting the garden cluster):\nexport NAMESPACE=garden-my-namespace export SHOOT_NAME=my-shoot export KUBECONFIG=\u003ckubeconfig for garden cluster\u003e # can be set using \"gardenctl target --garden \u003clandscape\u003e\" kubectl create \\ -f \u003c(printf '{\"spec\":{\"expirationSeconds\":600}}') \\ --raw /apis/core.gardener.cloud/v1beta1/namespaces/${NAMESPACE}/shoots/${SHOOT_NAME}/adminkubeconfig | \\ jq -r \".status.kubeconfig\" | \\ base64 -d You also can use controller-runtime client (\u003e= v0.14.3) to create such a kubeconfig from your go code like so:\nexpiration := 10 * time.Minute expirationSeconds := int64(expiration.Seconds()) adminKubeconfigRequest := \u0026authenticationv1alpha1.AdminKubeconfigRequest{ Spec: authenticationv1alpha1.AdminKubeconfigRequestSpec{ ExpirationSeconds: \u0026expirationSeconds, }, } err := client.SubResource(\"adminkubeconfig\").Create(ctx, shoot, adminKubeconfigRequest) if err != nil { return err } config = adminKubeconfigRequest.Status.Kubeconfig In Python, you can use the native kubernetes client to create such a kubeconfig like this:\n# This script first loads an existing kubeconfig from your system, and then sends a request to the Gardener API to create a new kubeconfig for a shoot cluster. # The received kubeconfig is then decoded and a new API client is created for interacting with the shoot cluster. 
import base64 import json from kubernetes import client, config import yaml # Set configuration options shoot_name=\"my-shoot\" # Name of the shoot project_namespace=\"garden-my-namespace\" # Namespace of the project # Load kubeconfig from default ~/.kube/config config.load_kube_config() api = client.ApiClient() # Create kubeconfig request kubeconfig_request = { 'apiVersion': 'authentication.gardener.cloud/v1alpha1', 'kind': 'AdminKubeconfigRequest', 'spec': { 'expirationSeconds': 600 } } response = api.call_api(resource_path=f'/apis/core.gardener.cloud/v1beta1/namespaces/{project_namespace}/shoots/{shoot_name}/adminkubeconfig', method='POST', body=kubeconfig_request, auth_settings=['BearerToken'], _preload_content=False, _return_http_data_only=True, ) decoded_kubeconfig = base64.b64decode(json.loads(response.data)[\"status\"][\"kubeconfig\"]).decode('utf-8') print(decoded_kubeconfig) # Create an API client to interact with the shoot cluster shoot_api_client = config.new_client_from_config_dict(yaml.safe_load(decoded_kubeconfig)) v1 = client.CoreV1Api(shoot_api_client) Note: The gardenctl-v2 tool simplifies targeting shoot clusters. It automatically downloads a kubeconfig that uses the gardenlogin kubectl auth plugin. This transparently manages authentication and certificate renewal without containing any credentials.\n shoots/viewerkubeconfig Subresource The shoots/viewerkubeconfig subresource works similarly to the shoots/adminkubeconfig. The difference is that it returns a kubeconfig with read-only access for all APIs except the core/v1.Secret API and the resources which are specified in the spec.kubernetes.kubeAPIServer.encryptionConfig field in the Shoot (see this document).\nIn order to request such a kubeconfig, you can follow almost the same code as above - the only difference is that you need to use the viewerkubeconfig subresource. 
For example, in bash this looks like this:\nexport NAMESPACE=garden-my-namespace export SHOOT_NAME=my-shoot kubectl create \\ -f \u003c(printf '{\"spec\":{\"expirationSeconds\":600}}') \\ --raw /apis/core.gardener.cloud/v1beta1/namespaces/${NAMESPACE}/shoots/${SHOOT_NAME}/viewerkubeconfig | \\ jq -r \".status.kubeconfig\" | \\ base64 -d The examples for other programming languages are similar to the above and can be adapted accordingly.\nOpenID Connect The kube-apiserver of shoot clusters can be provided with OpenID Connect configuration via the Shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: kubernetes: oidcConfig: ... It is the end-user’s responsibility to incorporate the OpenID Connect configurations in the kubeconfig for accessing the cluster (i.e., Gardener will not automatically generate the kubeconfig based on these OIDC settings). The recommended way is using the kubectl plugin called kubectl oidc-login for OIDC authentication.\nIf you want to use the same OIDC configuration for all your shoots by default, then you can use the ClusterOpenIDConnectPreset and OpenIDConnectPreset API resources. They allow defaulting the .spec.kubernetes.kubeAPIServer.oidcConfig fields for newly created Shoots such that you don’t have to repeat yourself every time (similar to PodPreset resources in Kubernetes). ClusterOpenIDConnectPreset specified OIDC configuration applies to Projects and Shoots cluster-wide (hence, only available to Gardener operators), while OpenIDConnectPreset is Project-scoped. Shoots have to “opt-in” for such defaulting by using the oidc=enable label.\nFor further information on (Cluster)OpenIDConnectPreset, refer to ClusterOpenIDConnectPreset and OpenIDConnectPreset.\nFor shoots with Kubernetes version \u003e= 1.30, which have StructuredAuthenticationConfiguration feature gate enabled (enabled by default), it is advised to use Structured Authentication instead of configuring .spec.kubernetes.kubeAPIServer.oidcConfig. 
If oidcConfig is configured, it is translated into an AuthenticationConfiguration file to use for Structured Authentication configuration.\nStructured Authentication For shoots with Kubernetes version \u003e= 1.30, which have StructuredAuthenticationConfiguration feature gate enabled (enabled by default), kube-apiserver of shoot clusters can be provided with Structured Authentication configuration via the Shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: kubernetes: kubeAPIServer: structuredAuthentication: configMapName: name-of-configmap-containing-authentication-config The configMapName references a user-created ConfigMap in the project namespace containing the AuthenticationConfiguration in its config.yaml data field. Here is an example of such a ConfigMap:\napiVersion: v1 kind: ConfigMap metadata: name: name-of-configmap-containing-authentication-config namespace: garden-my-project data: config.yaml: | apiVersion: apiserver.config.k8s.io/v1alpha1 kind: AuthenticationConfiguration jwt: - issuer: url: https://issuer1.example.com audiences: - audience1 - audience2 claimMappings: username: expression: 'claims.username' groups: expression: 'claims.groups' uid: expression: 'claims.uid' claimValidationRules: expression: 'claims.hd == \"example.com\"' message: \"the hosted domain name must be example.com\" Currently, only apiVersion: apiserver.config.k8s.io/v1alpha1 is supported. The user is responsible for the validity of the configured JWTAuthenticators.\nStatic Token kubeconfig Note: Static token kubeconfig is not available for Shoot clusters using Kubernetes version \u003e= 1.27. The shoots/adminkubeconfig subresource should be used instead.\n This kubeconfig contains a static token and provides cluster-admin privileges. It is created by default and persisted in the \u003cshoot-name\u003e.kubeconfig secret in the project namespace in the garden cluster.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... 
spec: kubernetes: enableStaticTokenKubeconfig: true ... It is not the recommended method to access the shoot cluster, as the static token kubeconfig has some security flaws associated with it:\n The static token in the kubeconfig doesn’t have any expiration date. Read Credentials Rotation for Shoot Clusters to learn how to rotate the static token. The static token doesn’t have any user identity associated with it. The user in that token will always be system:cluster-admin, irrespective of the person accessing the cluster. Hence, it is impossible to audit the events in the cluster. When the enableStaticTokenKubeconfig field is not explicitly set in the Shoot spec:\n for Shoot clusters using Kubernetes version \u003c 1.26, the field is defaulted to true. for Shoot clusters using Kubernetes version \u003e= 1.26, the field is defaulted to false. Note: Starting with Kubernetes 1.27, the enableStaticTokenKubeconfig field will be locked to false.\n ","categories":"","description":"","excerpt":"Accessing Shoot Clusters After creation of a shoot cluster, end-users …","ref":"/docs/gardener/shoot_access/","tags":"","title":"Accessing Shoot Clusters"},{"body":"Overview In order to add GitHub documentation to the website that is hosted outside of the main repository, you need to make changes to the central manifest. 
You can usually find it in the \u003corganization-name\u003e/\u003crepo-name\u003e/.docforge/ folder, for example gardener/documentation/.docforge.\nSample codeblock:\n- dir: machine-controller-manager structure: - file: _index.md frontmatter: title: Machine Controller Manager weight: 1 description: Declarative way of managing machines for Kubernetes cluster source: https://github.com/gardener/machine-controller-manager/blob/master/README.md - fileTree: https://github.com/gardener/machine-controller-manager/tree/master/docs This short code snippet adds a whole repository worth of content and contains examples of some of the most important elements:\n - dir: \u003cdir-name\u003e - the name of the directory in the navigation path structure: - required after using dir; shows that the following lines contain a file structure - file: _index.md - the content will be a single file; also creates an index file frontmatter: - allows for manual setting/overwriting of the various properties a file can have source: \u003clink\u003e - where the content for the file element is located - fileTree: \u003clink\u003e - the content will be a whole folder; also gives the location of the content Check the Notes and Tips section for useful advice when making changes to the manifest files.\nAdding Existing Documentation You can use the following templates in order to add documentation to the website that exists in other GitHub repositories.\nNote Proper indentation is incredibly important, as yaml relies on it for nesting! 
Adding a Single File You can add a single topic to the website by providing a link to it in the manifest.\n- dir: \u003cdir-name\u003e structure: - file: \u003cfile-name\u003e frontmatter: title: \u003ctopic-name\u003e description: \u003ctopic-description\u003e weight: \u003cweight\u003e source: https://github.com/\u003cpath\u003e/\u003cfile\u003e Example - dir: dashboard structure: - file: _index.md frontmatter: title: Dashboard description: The web UI for managing your projects and clusters weight: 3 source: https://github.com/gardener/dashboard/blob/master/README.md Adding Multiple Files You can also add multiple topics to the website at once, either through linking a whole folder or a manifest that contains the documentation structure.\nNote If the content you want to add does not have an _index.md file in it, it won’t show up as a single section on the website. You can fix this by adding the following after the structure: element:\n- file: _index.md frontmatter: title: \u003ctopic-name\u003e description: \u003ctopic-description\u003e weight: \u003cweight\u003e Linking a Folder - dir: \u003cdir-name\u003e structure: - fileTree: https://github.com/\u003cpath\u003e/\u003cfolder\u003e Example - dir: development structure: - fileTree: https://github.com/gardener/gardener/tree/master/docs/development Linking a Manifest File - dir: \u003cdir-name\u003e structure: - manifest: https://github.com/\u003cpath\u003e/manifest.yaml Example - dir: extensions structure: - manifest: https://github.com/gardener/documentation/blob/master/.docforge/documentation/gardener-extensions/gardener-extensions.yaml Notes and Tips If you want to place a file inside of an already existing directory in the main repo, you need to create a dir element that matches its name. If one already exists, simply add your link to its structure element. You can chain multiple files, folders, and manifests inside of a single structure element. For examples of frontmatter elements, see the Style Guide. 
","categories":"","description":"","excerpt":"Overview In order to add GitHub documentation to the website that is …","ref":"/docs/contribute/documentation/adding-existing-documentation/","tags":"","title":"Adding Already Existing Documentation"},{"body":"Adding support for a new provider Steps to be followed while implementing a new (hyperscale) provider are mentioned below. This is the easiest way to add new provider support using a blueprint code.\nHowever, you may also develop your machine controller from scratch, which would provide you with more flexibility. First, however, make sure that your custom machine controller adheres to the Machine.Status struct defined in the MachineAPIs. This will make sure the MCM can act with higher-level controllers like MachineSet and MachineDeployment controller. The key is the Machine.Status.CurrentStatus.Phase key that indicates the status of the machine object.\nOur strong recommendation would be to follow the steps below. This provides the most flexibility required to support machine management for adding new providers. And if you feel to extend the functionality, feel free to update our machine controller libraries.\nSetting up your repository Create a new empty repository named machine-controller-manager-provider-{provider-name} on GitHub username/project. Do not initialize this repository with a README. Copy the remote repository URL (HTTPS/SSH) to this repository displayed once you create this repository. Now, on your local system, create directories as required. {your-github-username} given below could also be {github-project} depending on where you have created the new repository. mkdir -p $GOPATH/src/github.com/{your-github-username} Navigate to this created directory. cd $GOPATH/src/github.com/{your-github-username} Clone this repository on your local machine. 
git clone git@github.com:gardener/machine-controller-manager-provider-sampleprovider.git Rename the directory from machine-controller-manager-provider-sampleprovider to machine-controller-manager-provider-{provider-name}. mv machine-controller-manager-provider-sampleprovider machine-controller-manager-provider-{provider-name} Navigate into the newly-created directory. cd machine-controller-manager-provider-{provider-name} Update the remote origin URL to the newly created repository’s URL you had copied above. git remote set-url origin git@github.com:{your-github-username}/machine-controller-manager-provider-{provider-name}.git Rename GitHub project from gardener to {github-org/your-github-username} wherever you have cloned the repository above. Also, edit all occurrences of the word sampleprovider to {provider-name} in the code. You can use the hack script given below to do both. make rename-project PROJECT_NAME={github-org/your-github-username} PROVIDER_NAME={provider-name} e.g.: make rename-project PROJECT_NAME=gardener PROVIDER_NAME=AmazonWebServices (or) make rename-project PROJECT_NAME=githubusername PROVIDER_NAME=AWS Now, commit your changes and push them upstream. git add -A git commit -m \"Renamed SampleProvider to {provider-name}\" git push origin master Code changes required The contract between the Machine Controller Manager (MCM) and the Machine Controller (MC) AKA driver has been documented here and the machine error codes can be found here. You may refer to them for any queries.\n⚠️\n Keep in mind that there should be a unique way to map between machine objects and VMs. This can be done by mapping machine object names with VM-Name/ tags/ other metadata. Optionally, there should also be a unique way to map a VM to its machine class object. This can be done by tagging VM objects with tags/resource groups associated with the machine class. 
Steps to integrate Update the pkg/provider/apis/provider_spec.go specification file to reflect the structure of the ProviderSpec blob. It typically contains the machine template details in the MachineClass object. Follow the sample spec provided already in the file. A sample provider specification can be found here. Fill in the methods described at pkg/provider/core.go to manage VMs on your cloud provider. Comments are provided above each method to help you fill them in with the desired REQUEST and RESPONSE parameters. A sample provider implementation for these methods can be found here. Fill in the required CreateMachine() and DeleteMachine() methods. Optionally fill in methods like GetMachineStatus(), InitializeMachine(), ListMachines(), and GetVolumeIDs(). You may choose to fill these in once the required methods are working. GetVolumeIDs() expects VolumeIDs to be decoded from the volumeSpec based on the cloud provider. There is also an OPTIONAL method GenerateMachineClassForMigration() that helps in migration of {ProviderSpecific}MachineClass to MachineClass CR (custom resource). This only makes sense if you have an existing in-tree implementation acting on different CRD types that you would like to migrate. If not, you MUST return an error (machine error UNIMPLEMENTED) to avoid processing this step. Perform validation of APIs that you have described and make it a part of your methods as required at each request. Write unit tests and verify that they pass with your implementation by running make test. make test Tidy the Go dependencies. make tidy Update the sample YAML files in the kubernetes/ directory to provide sample files through which the working of the machine controller can be tested. Update README.md to reflect any additional changes. Testing your code changes Make sure $TARGET_KUBECONFIG points to the cluster where you wish to manage machines. 
Likewise, $CONTROL_NAMESPACE represents the namespaces where MCM is looking for machine CR objects, and $CONTROL_KUBECONFIG points to the cluster that holds these machine CRs.\n On the first terminal running at $GOPATH/src/github.com/{github-org/your-github-username}/machine-controller-manager-provider-{provider-name}, Run the machine controller (driver) using the command below. make start On the second terminal pointing to $GOPATH/src/github.com/gardener, Clone the latest MCM code git clone git@github.com:gardener/machine-controller-manager.git Navigate to the newly-created directory. cd machine-controller-manager Deploy the required CRDs from the machine-controller-manager repo, kubectl apply -f kubernetes/crds Run the machine-controller-manager in the master branch make start On the third terminal pointing to $GOPATH/src/github.com/{github-org/your-github-username}/machine-controller-manager-provider-{provider-name} Fill in the object files given below and deploy them as described below. Deploy the machine-class kubectl apply -f kubernetes/machine-class.yaml Deploy the kubernetes secret if required. kubectl apply -f kubernetes/secret.yaml Deploy the machine object and make sure it joins the cluster successfully. kubectl apply -f kubernetes/machine.yaml Once the machine joins, you can test by deploying a machine-deployment. Deploy the machine-deployment object and make sure it joins the cluster successfully. kubectl apply -f kubernetes/machine-deployment.yaml Make sure to delete both the machine and machine-deployment objects after use. kubectl delete -f kubernetes/machine.yaml kubectl delete -f kubernetes/machine-deployment.yaml Releasing your docker image Make sure you have logged into gcloud/docker using the CLI. To release your docker image, run the following. make release IMAGE_REPOSITORY=\u003clink-to-image-repo\u003e A sample kubernetes deploy file can be found at kubernetes/deployment.yaml. 
Update the same (with your desired MCM and MC images) to deploy your MCM pod. ","categories":"","description":"","excerpt":"Adding support for a new provider Steps to be followed while …","ref":"/docs/other-components/machine-controller-manager/cp_support_new/","tags":"","title":"Adding Support for a Cloud Provider"},{"body":"Extension Admission The extensions are expected to validate their respective resources for their extension specific configurations, when the resources are newly created or updated. For example, provider extensions would validate spec.provider.infrastructureConfig and spec.provider.controlPlaneConfig in the Shoot resource and spec.providerConfig in the CloudProfile resource, networking extensions would validate spec.networking.providerConfig in the Shoot resource. As best practice, the validation should be performed only if there is a change in the spec of the resource. Please find an exemplary implementation in the gardener/gardener-extension-provider-aws repository.\nWhen a resource is newly created or updated, Gardener adds an extension label for all the extension types referenced in the spec of the resource. This label is of the form \u003cextension-type\u003e.extensions.gardener.cloud/\u003cextension-name\u003e : \"true\". For example, an extension label for a provider extension type aws looks like provider.extensions.gardener.cloud/aws : \"true\". The extensions should add object selectors in their admission webhooks for these labels, to filter out the objects they are responsible for. At present, these labels are added to BackupEntrys, BackupBuckets, CloudProfiles, Seeds, SecretBindings and Shoots. 
Please see the types_constants.go file for the full list of extension labels.\n","categories":"","description":"","excerpt":"Extension Admission The extensions are expected to validate their …","ref":"/docs/gardener/extensions/admission/","tags":"","title":"Admission"},{"body":"Admission Configuration for the PodSecurity Admission Plugin If you wish to add your custom configuration for the PodSecurity plugin, you can do so in the Shoot spec under .spec.kubernetes.kubeAPIServer.admissionPlugins by adding:\nadmissionPlugins: - name: PodSecurity config: apiVersion: pod-security.admission.config.k8s.io/v1 kind: PodSecurityConfiguration # Defaults applied when a mode label is not set. # # Level label values must be one of: # - \"privileged\" (default) # - \"baseline\" # - \"restricted\" # # Version label values must be one of: # - \"latest\" (default) # - specific version like \"v1.25\" defaults: enforce: \"privileged\" enforce-version: \"latest\" audit: \"privileged\" audit-version: \"latest\" warn: \"privileged\" warn-version: \"latest\" exemptions: # Array of authenticated usernames to exempt. usernames: [] # Array of runtime class names to exempt. runtimeClasses: [] # Array of namespaces to exempt. namespaces: [] For proper functioning of Gardener, kube-system namespace will also be automatically added to the exemptions.namespaces list.\n","categories":"","description":"Adding custom configuration for the `PodSecurity` plugin in `.spec.kubernetes.kubeAPIServer.admissionPlugins`","excerpt":"Adding custom configuration for the `PodSecurity` plugin in …","ref":"/docs/gardener/pod-security/","tags":"","title":"Admission Configuration for the `PodSecurity` Admission Plugin"},{"body":"See who is using Gardener Gardener adopters in production environments that have publicly shared details of their usage. 
SAP uses Gardener to deploy and manage Kubernetes clusters at scale in a uniform way across infrastructures (AWS, Azure, GCP, Alicloud, as well as generic interfaces to OpenStack and vSphere). Workloads include Databases (SAP HANA Cloud), Big Data (SAP Data Intelligence), Kyma, many other cloud native applications, and diverse business workloads. Gardener can now be run by customers on the Public Cloud Platform of the leading European Cloud Provider OVHcloud. ScaleUp Technologies runs Gardener within their public Openstack Clouds (Hamburg, Berlin, Düsseldorf). Their clients run all kinds of workloads on top of Gardener maintained Kubernetes clusters ranging from databases to Software-as-a-Service applications. Finanz Informatik Technologie Services GmbH uses Gardener to offer k8s as a service for customers in the financial industry in Germany. It is built on top of a “metal as a service” infrastructure implemented from scratch for k8s workloads in mind. The result is k8s on top of bare metal in minutes. PingCAP TiDB, is a cloud-native distributed SQL database with MySQL compatibility, and one of the most popular open-source database projects - with 23.5K+ stars and 400+ contributors. Its sister project TiKV is a Cloud Native Interactive Landscape project. PingCAP envisioned their managed TiDB service, known as TiDB Cloud, to be multi-tenant, secure, cost-efficient, and to be compatible with different cloud providers and they chose Gardener. Beezlabs uses Gardener to deliver Intelligent Process Automation platform, on multiple cloud providers and reduce costs and lock-in risks. b’nerd uses Gardener as the core technology for its own managed Kubernetes as a Service solution and operates multiple Gardener installations for several cloud hosting service providers. STACKIT is a digital brand of Europe’s biggest retailer, the Schwarz Group, which includes Lidl, Kaufland, but also production and recycling companies. 
It uses Gardener to offer public and private Kubernetes as a service in its own data centers in Europe and aims to become the cloud provider for German and European small and mid-sized companies. Supporting and managing multiple application landscapes on-premises and across different hyperscaler infrastructures can be painful. At T-Systems we use Gardener both for internal usage and to manage clusters for our customers. We love the openness of the project, the flexibility and the architecture that allows us to manage clusters around the world with only one team from one single pane of glass and to meet industry specific certification standards. Sovereignty by design is another great value that the technology implicitly brings along. The German-based company 23 Technologies uses Gardener to offer an enterprise-class Kubernetes engine for industrial use cases as well as cloud service providers and offers managed and professional services for it. 23T is also the team behind okeanos.dev, a public service that can be used by anyone to try out Gardener. B1 Systems GmbH is an international provider of Linux \u0026 Open Source consulting, training, managed service \u0026 support. We were founded in 2004 and are based in Germany. Our team of 140 Linux experts offers tailor-made solutions based on cloud \u0026 container technologies, virtualization \u0026 high availability as well as monitoring, system \u0026 configuration management. B1 uses Gardener internally and also sets up solutions/environments for customers. finleap connect GmbH is the leading independent Open Banking platform provider in Europe. It enables companies across a multitude of industries to provide the next generation of financial services by understanding how customers transact and interact. 
With its “full-stack” platform of solutions, finleap connect makes it possible for its clients to compliantly access the financial transactions data of customers, enrich said data with analytics tools, provide digital banking services and deliver high-quality, digital financial services products and services to customers. Gardener uniquely enables us to deploy our platform in Europe and across the globe in a uniform way on the providers preferred by our customers. Codesphere is a Cloud IDE with integrated and automated deployment of web apps. It uses Gardener internally to manage clusters that host customer deployments and internal systems all over the world. plusserver combines its own cloud offerings with hyperscaler platforms to provide individually tailored multi-cloud solutions. The plusserver Kubernetes Engine (PSKE) based on Gardener reduces the complexity in managing multi-cloud environments and enables companies to orchestrate their containers and cloud-native applications across a variety of platforms such as plusserver’s pluscloud open or hyperscalers such as AWS, either by mouseclick or via an API. With PSKE, companies remain vendor-independent and profit from guaranteed data sovereignty and data security due to GDPR-compliant cloud platforms in the certified plusserver data centers in Germany. Fuga Cloud uses Gardener as the basis for its Enterprise Managed Kubernetes (EMK), a platform that simplifies the management of your k8s and provides insight into usage and performance. The other Fuga Cloud services can be added with a mouse click, and the choice of another cloud provider is a negotiable option. Fuga Cloud stands for Digital Sovereignty, Data Portability and GDPR compatibility. metalstack.cloud uses Gardener and is based on the open-source software metal-stack.io, which is developed for regulated financial institutions. The focus here is on the highest possible security and compliance conformity. 
This makes metalstack.cloud perfect for running enterprise-grade container applications and provides your workloads with the highest possible performance. Cleura uses Gardener to power its Container Orchestration Engine for Cleura Public Cloud and Cleura Compliant Cloud. Cleura Container Orchestration Engine simplifies the creation and management of Kubernetes clusters through their user-friendly Cleura Cloud Management Panel or API, allowing users to focus on deploying applications instead of maintaining the underlying infrastructure. PITS Globale Datenrettungsdienste is a data recovery company located in Germany specializing in recovering lost or damaged files from hard drives, solid-state drives, flash drives, and other storage media. Gardener is used to handle highly-loaded internal infrastructure and provide reliable, fully-managed K8 cluster solutions. If you’re using Gardener and you aren’t on this list, submit a pull request! ","categories":"","description":"","excerpt":"See who is using Gardener Gardener adopters in production environments …","ref":"/adopter/","tags":"","title":"Adopters"},{"body":"Alerting Gardener uses Prometheus to gather metrics from each component. A Prometheus is deployed in each shoot control plane (on the seed) which is responsible for gathering control plane and cluster metrics. Prometheus can be configured to fire alerts based on these metrics and send them to an Alertmanager. The Alertmanager is responsible for sending the alerts to users and operators. 
This document describes how to set up alerting for:\n end-users/stakeholders/customers operators/administrators Alerting for Users To receive email alerts as a user, set the following values in the shoot spec:\nspec: monitoring: alerting: emailReceivers: - john.doe@example.com emailReceivers is a list of emails that will receive alerts if something is wrong with the shoot cluster.\nAlerting for Operators Currently, Gardener supports two options for alerting:\n Email Alerting Sending Alerts to an External Alertmanager Email Alerting Gardener provides the option to deploy an Alertmanager into each seed. This Alertmanager is responsible for sending out alerts to operators for each shoot cluster in the seed. Only email alerts are supported by the Alertmanager managed by Gardener. This is configurable by setting the Gardener controller manager configuration values alerting. See Gardener Configuration and Usage on how to configure the Gardener’s SMTP secret. If the values are set, a secret with the label gardener.cloud/role: alerting will be created in the garden namespace of the garden cluster. This secret will be used by each Alertmanager in each seed.\nExternal Alertmanager The Alertmanager supports different kinds of alerting configurations. The Alertmanager provided by Gardener only supports email alerts. If email is not sufficient, then alerts can be sent to an external Alertmanager. Prometheus will send alerts to a URL and then alerts will be handled by the external Alertmanager. This external Alertmanager is operated and configured by the operator (i.e. Gardener does not configure or deploy this Alertmanager). To configure sending alerts to an external Alertmanager, create a secret in the virtual garden cluster in the garden namespace with the label: gardener.cloud/role: alerting. This secret needs to contain a URL to the external Alertmanager and information regarding authentication. 
Supported authentication types are:\n No Authentication (none) Basic Authentication (basic) Mutual TLS (certificate) Remote Alertmanager Examples Note: The url value cannot be prepended with http or https.\n # No Authentication apiVersion: v1 kind: Secret metadata: labels: gardener.cloud/role: alerting name: alerting-auth namespace: garden data: # No Authentication auth_type: base64(none) url: base64(external.alertmanager.foo) # Basic Auth auth_type: base64(basic) url: base64(external.alertmanager.foo) username: base64(admin) password: base64(password) # Mutual TLS auth_type: base64(certificate) url: base64(external.alertmanager.foo) ca.crt: base64(ca) tls.crt: base64(certificate) tls.key: base64(key) insecure_skip_verify: base64(false) # Email Alerts (internal alertmanager) auth_type: base64(smtp) auth_identity: base64(internal.alertmanager.auth_identity) auth_password: base64(internal.alertmanager.auth_password) auth_username: base64(internal.alertmanager.auth_username) from: base64(internal.alertmanager.from) smarthost: base64(internal.alertmanager.smarthost) to: base64(internal.alertmanager.to) type: Opaque Configuring Your External Alertmanager Please refer to the Alertmanager documentation on how to configure an Alertmanager.\nWe recommend you use at least the following inhibition rules in your Alertmanager configuration to prevent excessive alerts:\ninhibit_rules: # Apply inhibition if the alert name is the same. - source_match: severity: critical target_match: severity: warning equal: ['alertname', 'service', 'cluster'] # Stop all alerts for type=shoot if there are VPN problems. - source_match: service: vpn target_match_re: type: shoot equal: ['type', 'cluster'] # Stop warning and critical alerts if there is a blocker - source_match: severity: blocker target_match_re: severity: ^(critical|warning)$ equal: ['cluster'] # If the API server is down inhibit no worker nodes alert. No worker nodes depends on kube-state-metrics which depends on the API server. 
- source_match: service: kube-apiserver target_match_re: service: nodes equal: ['cluster'] # If API server is down inhibit kube-state-metrics alerts. - source_match: service: kube-apiserver target_match_re: severity: info equal: ['cluster'] # No Worker nodes depends on kube-state-metrics. Inhibit no worker nodes if kube-state-metrics is down. - source_match: service: kube-state-metrics-shoot target_match_re: service: nodes equal: ['cluster'] Below is a graph visualizing the inhibition rules:\n","categories":"","description":"","excerpt":"Alerting Gardener uses Prometheus to gather metrics from each …","ref":"/docs/gardener/monitoring/alerting/","tags":"","title":"Alerting"},{"body":"Overview Sometimes operators want to find out why a certain node got removed. This guide helps to identify possible causes. There are a few potential reasons why nodes can be removed:\n broken node: a node becomes unhealthy and machine-controller-manager terminates it in an attempt to replace the unhealthy node with a new one scale-down: cluster-autoscaler sees that a node is under-utilized and therefore scales down a worker pool node rolling: configuration changes to a worker pool (or cluster) require all nodes of one or all worker pools to be rolled and thus all nodes to be replaced. Some possible changes are: the K8s/OS version changing machine types Helpful information can be obtained by using the logging stack. See Logging Stack for how to utilize the logging information in Gardener.\nFind Out Whether the Node Was unhealthy Check the Node Events A good first indication on what happened to a node can be obtained from the node’s events. 
Events are scraped and ingested into the logging system, so they can be found in the explore tab of Grafana (make sure to select loki as datasource) with a query like {job=\"event-logging\"} | unpack | object=\"Node/\u003cnode-name\u003e\" or find any event mentioning the node in question via a broader query like {job=\"event-logging\"}|=\"\u003cnode-name\u003e\".\nA potential result might reveal:\n{\"_entry\":\"Node ip-10-55-138-185.eu-central-1.compute.internal status is now: NodeNotReady\",\"count\":1,\"firstTimestamp\":\"2023-04-05T12:02:08Z\",\"lastTimestamp\":\"2023-04-05T12:02:08Z\",\"namespace\":\"default\",\"object\":\"Node/ip-10-55-138-185.eu-central-1.compute.internal\",\"origin\":\"shoot\",\"reason\":\"NodeNotReady\",\"source\":\"node-controller\",\"type\":\"Normal\"} Check machine-controller-manager Logs If a node was getting unhealthy, the last conditions can be found in the logs of the machine-controller-manager by using a query like {pod_name=~\"machine-controller-manager.*\"}|=\"\u003cnode-name\u003e\".\nCaveat: every node resource is backed by a corresponding machine resource managed by machine-controller-manager. Usually two corresponding node and machine resources have the same name with the exception of AWS. Here you first need to find with the above query the corresponding machine name, typically via a log like this\n2023-04-05 12:02:08 {\"log\":\"Conditions of Machine \\\"shoot--demo--cluster-pool-z1-6dffc-jh4z4\\\" with providerID \\\"aws:///eu-central-1/i-0a6ad1ca4c2e615dc\\\" and backing node \\\"ip-10-55-138-185.eu-central-1.compute.internal\\\" are changing\",\"pid\":\"1\",\"severity\":\"INFO\",\"source\":\"machine_util.go:629\"} This reveals that node ip-10-55-138-185.eu-central-1.compute.internal is backed by machine shoot--demo--cluster-pool-z1-6dffc-jh4z4. 
On infrastructures other than AWS you can omit this step.\nWith the machine name at hand, now search for log entries with {pod_name=~\"machine-controller-manager.*\"}|=\"\u003cmachine-name\u003e\". In case the node had failing conditions, you’d find logs like this:\n2023-04-05 12:02:08 {\"log\":\"Machine shoot--demo--cluster-pool-z1-6dffc-jh4z4 is unhealthy - changing MachineState to Unknown. Node conditions: [{Type:ClusterNetworkProblem Status:False LastHeartbeatTime:2023-04-05 11:58:39 +0000 UTC LastTransitionTime:2023-03-23 11:59:29 +0000 UTC Reason:NoNetworkProblems Message:no cluster network problems} ... {Type:Ready Status:Unknown LastHeartbeatTime:2023-04-05 11:55:27 +0000 UTC LastTransitionTime:2023-04-05 12:02:07 +0000 UTC Reason:NodeStatusUnknown Message:Kubelet stopped posting node status.}]\",\"pid\":\"1\",\"severity\":\"WARN\",\"source\":\"machine_util.go:637\"} In the example above, the reason for an unhealthy node was that kubelet failed to renew its heartbeat. Typical reasons would be either a broken VM (that couldn’t execute kubelet anymore) or a broken network. Note that some VM terminations performed by the infrastructure provider are actually expected (e.g., scheduled events on AWS).\nIn both cases, the infrastructure provider might be able to provide more information on particular VM or network failures.\nWhatever the failure condition might have been, if a node gets unhealthy, it will be terminated by machine-controller-manager after the machineHealthTimeout has elapsed (this parameter can be configured in your shoot spec).\nCheck the Node Logs For each node the kernel and kubelet logs, as well as a few others, are scraped and can be queried with this query {nodename=\"\u003cnode-name\u003e\"} This might reveal OS specific issues or, in the absence of any logs (e.g., after the node went unhealthy), might indicate a network disruption or sudden VM termination. 
Note that some VM terminations performed by the infrastructure provider are actually expected (e.g., scheduled events on AWS).\nInfrastructure providers might be able to provide more information on particular VM failures in such cases.\nCheck the Network Problem Detector Dashboard If your Gardener installation utilizes gardener-extension-shoot-networking-problemdetector, you can check the dashboard named “Network Problem Detector” in Grafana for hints on network issues on the node of interest.\nScale-Down In general, scale-downs are managed by the cluster-autoscaler, its logs can be found with the query {container_name=\"cluster-autoscaler\"}. Attempts to remove a node can be found with the query {container_name=\"cluster-autoscaler\"}|=\"Scale-down: removing empty node\"\nIf a scale-down has caused disruptions in your workload, consider protecting your workload by adding PodDisruptionBudgets (see the autoscaler FAQ for more options).\nNode Rolling Node rolling can be caused by, e.g.:\n change of the K8s minor version of the cluster or a worker pool change of the OS version of the cluster or a worker pool change of the disk size/type or machine size/type of a worker pool change of node labels Changes like the above are done by altering the shoot specification and thus are recorded in the external auditlog system that is configured for the garden cluster.\n","categories":"","description":"Utilize Gardener's Monitoring and Logging to analyze removal and failures of nodes","excerpt":"Utilize Gardener's Monitoring and Logging to analyze removal and …","ref":"/docs/guides/monitoring-and-troubleshooting/analysing-node-failures/","tags":"","title":"Analyzing Node Removal and Failures"},{"body":"Specification ProviderSpec Schema Machine Machine is the representation of a physical or virtual machine.\n Field Type Description apiVersion string machine.sapcloud.io/v1alpha1 kind string Machine metadata Kubernetes meta/v1.ObjectMeta ObjectMeta for machine object\nRefer to the 
Kubernetes API documentation for the fields of the metadata field. spec MachineSpec Spec contains the specification of the machine\n class ClassSpec (Optional) Class contains the machineclass attributes of a machine\n providerID string (Optional) ProviderID represents the provider’s unique ID given to a machine\n nodeTemplate NodeTemplateSpec (Optional) NodeTemplateSpec describes the data a node should have when created from a template\n MachineConfiguration MachineConfiguration (Members of MachineConfiguration are embedded into this type.) (Optional) Configuration for the machine-controller.\n status MachineStatus Status contains fields depicting the status\n MachineClass MachineClass can be used to templatize and re-use provider configuration across multiple Machines / MachineSets / MachineDeployments.\n Field Type Description apiVersion string machine.sapcloud.io/v1alpha1 kind string MachineClass metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. nodeTemplate NodeTemplate (Optional) NodeTemplate contains subfields to track all node resources and other node info required to scale nodegroup from zero\n credentialsSecretRef Kubernetes core/v1.SecretReference CredentialsSecretRef can optionally store the credentials (in this case the SecretRef does not need to store them). 
This might be useful if multiple machine classes with the same credentials but different user-datas are used.\n providerSpec k8s.io/apimachinery/pkg/runtime.RawExtension Provider-specific configuration to use during node creation.\n provider string Provider is the combination of name and location of cloud-specific drivers.\n secretRef Kubernetes core/v1.SecretReference SecretRef stores the necessary secrets such as credentials or userdata.\n MachineDeployment MachineDeployment enables declarative updates for machines and MachineSets.\n Field Type Description apiVersion string machine.sapcloud.io/v1alpha1 kind string MachineDeployment metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec MachineDeploymentSpec (Optional) Specification of the desired behavior of the MachineDeployment.\n replicas int32 (Optional) Number of desired machines. This is a pointer to distinguish between explicit zero and not specified. Defaults to 0.\n selector Kubernetes meta/v1.LabelSelector (Optional) Label selector for machines. Existing MachineSets whose machines are selected by this will be the ones affected by this MachineDeployment.\n template MachineTemplateSpec Template describes the machines that will be created.\n strategy MachineDeploymentStrategy (Optional) The MachineDeployment strategy to use to replace existing machines with new ones.\n minReadySeconds int32 (Optional) Minimum number of seconds for which a newly created machine should be ready without any of its container crashing, for it to be considered available. Defaults to 0 (machine will be considered available as soon as it is ready)\n revisionHistoryLimit *int32 (Optional) The number of old MachineSets to retain to allow rollback. 
This is a pointer to distinguish between explicit zero and not specified.\n paused bool (Optional) Indicates that the MachineDeployment is paused and will not be processed by the MachineDeployment controller.\n rollbackTo RollbackConfig (Optional) DEPRECATED. The config this MachineDeployment is rolling back to. Will be cleared after rollback is done.\n progressDeadlineSeconds *int32 (Optional) The maximum time in seconds for a MachineDeployment to make progress before it is considered to be failed. The MachineDeployment controller will continue to process failed MachineDeployments and a condition with a ProgressDeadlineExceeded reason will be surfaced in the MachineDeployment status. Note that progress will not be estimated during the time a MachineDeployment is paused. This is not set by default, which is treated as infinite deadline.\n status MachineDeploymentStatus (Optional) Most recently observed status of the MachineDeployment.\n MachineSet MachineSet TODO\n Field Type Description apiVersion string machine.sapcloud.io/v1alpha1 kind string MachineSet metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. 
spec MachineSetSpec (Optional) replicas int32 (Optional) selector Kubernetes meta/v1.LabelSelector (Optional) machineClass ClassSpec (Optional) template MachineTemplateSpec (Optional) minReadySeconds int32 (Optional) status MachineSetStatus (Optional) ClassSpec (Appears on: MachineSetSpec, MachineSpec) ClassSpec is the class specification of machine\n Field Type Description apiGroup string API group to which it belongs\n kind string Kind for machine class\n name string Name of machine class\n ConditionStatus (string alias)\n (Appears on: MachineDeploymentCondition, MachineSetCondition) ConditionStatus are valid condition statuses\nCurrentStatus (Appears on: MachineStatus) CurrentStatus contains information about the current status of Machine.\n Field Type Description phase MachinePhase timeoutActive bool lastUpdateTime Kubernetes meta/v1.Time Last update time of current status\n LastOperation (Appears on: MachineSetStatus, MachineStatus, MachineSummary) LastOperation suggests the last operation performed on the object\n Field Type Description description string Description of the current operation\n errorCode string (Optional) ErrorCode of the current operation if any\n lastUpdateTime Kubernetes meta/v1.Time Last update time of current operation\n state MachineState State of operation\n type MachineOperationType Type of operation\n MachineConfiguration (Appears on: MachineSpec) MachineConfiguration describes the configurations useful for the machine-controller.\n Field Type Description drainTimeout Kubernetes meta/v1.Duration (Optional) MachineDraintimeout is the timeout after which machine is forcefully deleted.\n healthTimeout Kubernetes meta/v1.Duration (Optional) MachineHealthTimeout is the timeout after which machine is declared unhealthy/failed.\n creationTimeout Kubernetes meta/v1.Duration (Optional) MachineCreationTimeout is the timeout after which machine creation is declared failed.\n maxEvictRetries *int32 (Optional) MaxEvictRetries is the number of 
retries that will be attempted while draining the node.\n nodeConditions *string (Optional) NodeConditions are the set of conditions if set to true for MachineHealthTimeOut, machine will be declared failed.\n MachineDeploymentCondition (Appears on: MachineDeploymentStatus) MachineDeploymentCondition describes the state of a MachineDeployment at a certain point.\n Field Type Description type MachineDeploymentConditionType Type of MachineDeployment condition.\n status ConditionStatus Status of the condition, one of True, False, Unknown.\n lastUpdateTime Kubernetes meta/v1.Time The last time this condition was updated.\n lastTransitionTime Kubernetes meta/v1.Time Last time the condition transitioned from one status to another.\n reason string The reason for the condition’s last transition.\n message string A human readable message indicating details about the transition.\n MachineDeploymentConditionType (string alias)\n (Appears on: MachineDeploymentCondition) MachineDeploymentConditionType are valid conditions of MachineDeployments\nMachineDeploymentSpec (Appears on: MachineDeployment) MachineDeploymentSpec is the specification of the desired behavior of the MachineDeployment.\n Field Type Description replicas int32 (Optional) Number of desired machines. This is a pointer to distinguish between explicit zero and not specified. Defaults to 0.\n selector Kubernetes meta/v1.LabelSelector (Optional) Label selector for machines. Existing MachineSets whose machines are selected by this will be the ones affected by this MachineDeployment.\n template MachineTemplateSpec Template describes the machines that will be created.\n strategy MachineDeploymentStrategy (Optional) The MachineDeployment strategy to use to replace existing machines with new ones.\n minReadySeconds int32 (Optional) Minimum number of seconds for which a newly created machine should be ready without any of its container crashing, for it to be considered available. 
Defaults to 0 (machine will be considered available as soon as it is ready)\n revisionHistoryLimit *int32 (Optional) The number of old MachineSets to retain to allow rollback. This is a pointer to distinguish between explicit zero and not specified.\n paused bool (Optional) Indicates that the MachineDeployment is paused and will not be processed by the MachineDeployment controller.\n rollbackTo RollbackConfig (Optional) DEPRECATED. The config this MachineDeployment is rolling back to. Will be cleared after rollback is done.\n progressDeadlineSeconds *int32 (Optional) The maximum time in seconds for a MachineDeployment to make progress before it is considered to be failed. The MachineDeployment controller will continue to process failed MachineDeployments and a condition with a ProgressDeadlineExceeded reason will be surfaced in the MachineDeployment status. Note that progress will not be estimated during the time a MachineDeployment is paused. This is not set by default, which is treated as infinite deadline.\n MachineDeploymentStatus (Appears on: MachineDeployment) MachineDeploymentStatus is the most recently observed status of the MachineDeployment.\n Field Type Description observedGeneration int64 (Optional) The generation observed by the MachineDeployment controller.\n replicas int32 (Optional) Total number of non-terminated machines targeted by this MachineDeployment (their labels match the selector).\n updatedReplicas int32 (Optional) Total number of non-terminated machines targeted by this MachineDeployment that have the desired template spec.\n readyReplicas int32 (Optional) Total number of ready machines targeted by this MachineDeployment.\n availableReplicas int32 (Optional) Total number of available machines (ready for at least minReadySeconds) targeted by this MachineDeployment.\n unavailableReplicas int32 (Optional) Total number of unavailable machines targeted by this MachineDeployment. 
This is the total number of machines that are still required for the MachineDeployment to have 100% available capacity. They may either be machines that are running but not yet available or machines that still have not been created.\n conditions []MachineDeploymentCondition Represents the latest available observations of a MachineDeployment’s current state.\n collisionCount *int32 (Optional) Count of hash collisions for the MachineDeployment. The MachineDeployment controller uses this field as a collision avoidance mechanism when it needs to create the name for the newest MachineSet.\n failedMachines []*github.com/gardener/machine-controller-manager/pkg/apis/machine/v1alpha1.MachineSummary (Optional) FailedMachines has summary of machines on which lastOperation Failed\n MachineDeploymentStrategy (Appears on: MachineDeploymentSpec) MachineDeploymentStrategy describes how to replace existing machines with new ones.\n Field Type Description type MachineDeploymentStrategyType (Optional) Type of MachineDeployment. Can be “Recreate” or “RollingUpdate”. Default is RollingUpdate.\n rollingUpdate RollingUpdateMachineDeployment (Optional) Rolling update config params. Present only if MachineDeploymentStrategyType =\nRollingUpdate. 
TODO: Update this to follow our convention for oneOf, whatever we decide it to be.\n MachineDeploymentStrategyType (string alias)\n (Appears on: MachineDeploymentStrategy) MachineDeploymentStrategyType are valid strategy types for rolling MachineDeployments\nMachineOperationType (string alias)\n (Appears on: LastOperation) MachineOperationType is a label for the operation performed on a machine object.\nMachinePhase (string alias)\n (Appears on: CurrentStatus) MachinePhase is a label for the condition of a machine at the current time.\nMachineSetCondition (Appears on: MachineSetStatus) MachineSetCondition describes the state of a machine set at a certain point.\n Field Type Description type MachineSetConditionType Type of machine set condition.\n status ConditionStatus Status of the condition, one of True, False, Unknown.\n lastTransitionTime Kubernetes meta/v1.Time (Optional) The last time the condition transitioned from one status to another.\n reason string (Optional) The reason for the condition’s last transition.\n message string (Optional) A human readable message indicating details about the transition.\n MachineSetConditionType (string alias)\n (Appears on: MachineSetCondition) MachineSetConditionType is the condition on machineset object\nMachineSetSpec (Appears on: MachineSet) MachineSetSpec is the specification of a MachineSet.\n Field Type Description replicas int32 (Optional) selector Kubernetes meta/v1.LabelSelector (Optional) machineClass ClassSpec (Optional) template MachineTemplateSpec (Optional) minReadySeconds int32 (Optional) MachineSetStatus (Appears on: MachineSet) MachineSetStatus holds the most recently observed status of MachineSet.\n Field Type Description replicas int32 Replicas is the number of actual replicas.\n fullyLabeledReplicas int32 (Optional) The number of pods that have labels matching the labels of the pod template of the replicaset.\n readyReplicas int32 (Optional) The number of ready replicas for this replica set.\n 
availableReplicas int32 (Optional) The number of available replicas (ready for at least minReadySeconds) for this replica set.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed by the controller.\n machineSetCondition []MachineSetCondition (Optional) Represents the latest available observations of a replica set’s current state.\n lastOperation LastOperation LastOperation performed\n failedMachines []github.com/gardener/machine-controller-manager/pkg/apis/machine/v1alpha1.MachineSummary (Optional) FailedMachines has summary of machines on which lastOperation Failed\n MachineSpec (Appears on: Machine, MachineTemplateSpec) MachineSpec is the specification of a Machine.\n Field Type Description class ClassSpec (Optional) Class contains the machineclass attributes of a machine\n providerID string (Optional) ProviderID represents the provider’s unique ID given to a machine\n nodeTemplate NodeTemplateSpec (Optional) NodeTemplateSpec describes the data a node should have when created from a template\n MachineConfiguration MachineConfiguration (Members of MachineConfiguration are embedded into this type.) (Optional) Configuration for the machine-controller.\n MachineState (string alias)\n (Appears on: LastOperation) MachineState is a current state of the operation.\nMachineStatus (Appears on: Machine) MachineStatus holds the most recently observed status of Machine.\n Field Type Description conditions []Kubernetes core/v1.NodeCondition Conditions of this machine, same as node\n lastOperation LastOperation Last operation refers to the status of the last operation performed\n currentStatus CurrentStatus Current status of the machine object\n lastKnownState string (Optional) LastKnownState can store details of the last known state of the VM by the plugins. 
It can be used by future operation calls to determine current infrastructure state\n MachineSummary MachineSummary stores the summary of machine.\n Field Type Description name string Name of the machine object\n providerID string ProviderID represents the provider’s unique ID given to a machine\n lastOperation LastOperation Last operation refers to the status of the last operation performed\n ownerRef string OwnerRef\n MachineTemplateSpec (Appears on: MachineDeploymentSpec, MachineSetSpec) MachineTemplateSpec describes the data a machine should have when created from a template\n Field Type Description metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object’s metadata. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec MachineSpec (Optional) Specification of the desired behavior of the machine. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status\n class ClassSpec (Optional) Class contains the machineclass attributes of a machine\n providerID string (Optional) ProviderID represents the provider’s unique ID given to a machine\n nodeTemplate NodeTemplateSpec (Optional) NodeTemplateSpec describes the data a node should have when created from a template\n MachineConfiguration MachineConfiguration (Members of MachineConfiguration are embedded into this type.) 
(Optional) Configuration for the machine-controller.\n NodeTemplate (Appears on: MachineClass) NodeTemplate contains subfields to track all node resources and other node info required to scale nodegroup from zero\n Field Type Description capacity Kubernetes core/v1.ResourceList Capacity contains subfields to track all node resources required to scale nodegroup from zero\n instanceType string Instance type of the node belonging to nodeGroup\n region string Region of the expected node belonging to nodeGroup\n zone string Zone of the expected node belonging to nodeGroup\n architecture *string (Optional) CPU Architecture of the node belonging to nodeGroup\n NodeTemplateSpec (Appears on: MachineSpec) NodeTemplateSpec describes the data a node should have when created from a template\n Field Type Description metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec Kubernetes core/v1.NodeSpec (Optional) NodeSpec describes the attributes that a node is created with.\n podCIDR string (Optional) PodCIDR represents the pod IP range assigned to the node.\n podCIDRs []string (Optional) podCIDRs represents the IP ranges assigned to the node for usage by Pods on that node. If this field is specified, the 0th entry must match the podCIDR field. It may contain at most 1 value for each of IPv4 and IPv6.\n providerID string (Optional) ID of the node assigned by the cloud provider in the format: ://\n unschedulable bool (Optional) Unschedulable controls node schedulability of new pods. By default, node is schedulable. More info: https://kubernetes.io/docs/concepts/nodes/node/#manual-node-administration\n taints []Kubernetes core/v1.Taint (Optional) If specified, the node’s taints.\n configSource Kubernetes core/v1.NodeConfigSource (Optional) Deprecated: Previously used to specify the source of the node’s configuration for the DynamicKubeletConfig feature. 
This feature is removed.\n externalID string (Optional) Deprecated. Not all kubelets will set this field. Remove field after 1.13. see: https://issues.k8s.io/61966\n RollbackConfig (Appears on: MachineDeploymentSpec) RollbackConfig is the config to rollback a MachineDeployment\n Field Type Description revision int64 (Optional) The revision to rollback to. If set to 0, rollback to the last revision.\n RollingUpdateMachineDeployment (Appears on: MachineDeploymentStrategy) RollingUpdateMachineDeployment is the spec to control the desired behavior of rolling update.\n Field Type Description maxUnavailable k8s.io/apimachinery/pkg/util/intstr.IntOrString (Optional) The maximum number of machines that can be unavailable during the update. Value can be an absolute number (ex: 5) or a percentage of desired machines (ex: 10%). Absolute number is calculated from percentage by rounding down. This can not be 0 if MaxSurge is 0. By default, a fixed value of 1 is used. Example: when this is set to 30%, the old MC can be scaled down to 70% of desired machines immediately when the rolling update starts. Once new machines are ready, old MC can be scaled down further, followed by scaling up the new MC, ensuring that the total number of machines available at all times during the update is at least 70% of desired machines.\n maxSurge k8s.io/apimachinery/pkg/util/intstr.IntOrString (Optional) The maximum number of machines that can be scheduled above the desired number of machines. Value can be an absolute number (ex: 5) or a percentage of desired machines (ex: 10%). This can not be 0 if MaxUnavailable is 0. Absolute number is calculated from percentage by rounding up. By default, a value of 1 is used. Example: when this is set to 30%, the new MC can be scaled up immediately when the rolling update starts, such that the total number of old and new machines do not exceed 130% of desired machines. 
Once old machines have been killed, new MC can be scaled up further, ensuring that the total number of machines running at any time during the update is at most 130% of desired machines.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Specification ProviderSpec Schema Machine Machine is the …","ref":"/docs/other-components/machine-controller-manager/documents/apis/","tags":"","title":"Apis"},{"body":"Overview Similar to the kube-apiserver, the gardener-apiserver comes with a few in-tree managed admission plugins. If you want to get an overview of the what and why of admission plugins then this document might be a good start.\nThis document lists all existing admission plugins with a short explanation of what each is responsible for.\nClusterOpenIDConnectPreset, OpenIDConnectPreset (both enabled by default)\nThese admission controllers react on CREATE operations for Shoots. If the Shoot does not specify any OIDC configuration (.spec.kubernetes.kubeAPIServer.oidcConfig=nil), then it tries to find a matching ClusterOpenIDConnectPreset or OpenIDConnectPreset, respectively. If there are multiple matches, then the one with the highest weight “wins”. In this case, the admission controller will default the OIDC configuration in the Shoot.\nControllerRegistrationResources (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for ControllerRegistrations. It validates that there exists only one ControllerRegistration in the system that is primarily responsible for a given kind/type resource combination. 
This prevents misconfiguration by the Gardener administrator/operator.\nCustomVerbAuthorizer (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for Projects and NamespacedCloudProfiles.\nFor Projects it validates whether the user is bound to an RBAC role with the modify-spec-tolerations-whitelist verb in case the user tries to change the .spec.tolerations.whitelist field of the respective Project resource. Usually, regular project members are not bound to this custom verb, allowing the Gardener administrator to manage certain toleration whitelists on Project basis.\nFor NamespacedCloudProfiles it validates whether the user is assigned an RBAC role with the modify-spec-kubernetes verb when attempting to change the .spec.kubernetes field, or the modify-spec-machineimages verb when attempting to change the .spec.machineImages field of the respective NamespacedCloudProfile resource.\nDeletionConfirmation (enabled by default)\nThis admission controller reacts on DELETE operations for Projects, Shoots, and ShootStates. It validates that the respective resource is annotated with a deletion confirmation annotation, namely confirmation.gardener.cloud/deletion=true. Only if this annotation is present it allows the DELETE operation to pass. This prevents users from accidental/undesired deletions. In addition, it applies the “four-eyes principle for deletion” concept if the Project is configured accordingly. Find all information about it in this document.\nFurthermore, this admission controller reacts on CREATE or UPDATE operations for Shoots. It makes sure that the deletion.gardener.cloud/confirmed-by annotation is properly maintained in case the Shoot deletion is confirmed with above mentioned annotation.\nExposureClass (enabled by default)\nThis admission controller reacts on Create operations for Shoots. 
It mutates Shoot resources which have an ExposureClass referenced by merging both their shootSelectors and/or tolerations into the Shoot resource.\nExtensionValidator (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for BackupEntrys, BackupBuckets, Seeds, and Shoots. For all the various extension types in the specifications of these objects, it validates whether there exists a ControllerRegistration in the system that is primarily responsible for the stated extension type(s). This prevents misconfigurations that would otherwise allow users to create such resources with extension types that don’t exist in the cluster, effectively leading to failing reconciliation loops.\nExtensionLabels (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for BackupBuckets, BackupEntrys, CloudProfiles, Seeds, SecretBindings and Shoots. For all the various extension types in the specifications of these objects, it adds a corresponding label in the resource. This would allow extension admission webhooks to filter out the resources they are responsible for and ignore all others. This label is of the form \u003cextension-type\u003e.extensions.gardener.cloud/\u003cextension-name\u003e : \"true\". For example, an extension label for provider extension type aws, looks like provider.extensions.gardener.cloud/aws : \"true\".\nProjectValidator (enabled by default)\nThis admission controller reacts on CREATE operations for Projects. It prevents creating Projects with a non-empty .spec.namespace if the value in .spec.namespace does not start with garden-.\n⚠️ This admission plugin will be removed in a future release and its business logic will be incorporated into the static validation of the gardener-apiserver.\nResourceQuota (enabled by default)\nThis admission controller enables object count ResourceQuotas for Gardener resources, e.g. 
Shoots, SecretBindings, Projects, etc.\n ⚠️ In addition to this admission plugin, the ResourceQuota controller must be enabled for the Kube-Controller-Manager of your Garden cluster.\n ResourceReferenceManager (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for CloudProfiles, Projects, SecretBindings, Seeds, and Shoots. Generally, it checks whether referred resources stated in the specifications of these objects exist in the system (e.g., if a referenced Secret exists). However, it also has some special behaviours for certain resources:\n CloudProfiles: It rejects removing Kubernetes or machine image versions if there is at least one Shoot that refers to them. Projects: It sets the .spec.createdBy field for newly created Project resources, and defaults the .spec.owner field in case it is empty (to the same value of .spec.createdBy). Shoots: It sets the gardener.cloud/created-by=\u003cusername\u003e annotation for newly created Shoot resources. SeedValidator (enabled by default)\nThis admission controller reacts on DELETE operations for Seeds. Rejects the deletion if Shoot(s) reference the seed cluster.\nShootDNS (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for Shoots. It tries to assign a default domain to the Shoot. It also validates the DNS configuration (.spec.dns) for shoots.\nShootNodeLocalDNSEnabledByDefault (disabled by default)\nThis admission controller reacts on CREATE operations for Shoots. If enabled, it will enable node local dns within the shoot cluster (for more information, see NodeLocalDNS Configuration) by setting spec.systemComponents.nodeLocalDNS.enabled=true for newly created Shoots. 
Already existing Shoots and new Shoots that explicitly disable node local dns (spec.systemComponents.nodeLocalDNS.enabled=false) will not be affected by this admission plugin.\nShootQuotaValidator (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for Shoots. It validates the resource consumption declared in the specification against applicable Quota resources. Only if the applicable Quota resources admit the configured resources in the Shoot then it allows the request. Applicable Quotas are referred in the SecretBinding that is used by the Shoot.\nShootResourceReservation (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for Shoots. It injects the Kubernetes.Kubelet.KubeReserved setting for kubelet either as global setting for a shoot or on a per worker pool basis. If the admission configuration (see this example) for the ShootResourceReservation plugin contains useGKEFormula: false (the default), then it sets a static default resource reservation for the shoot.\nIf useGKEFormula: true is set, then the plugin injects resource reservations based on the machine type similar to GKE’s formula for resource reservation into each worker pool. Already existing resource reservations are not modified; this also means that resource reservations are not automatically updated if the machine type for a worker pool is changed. If a shoot contains global resource reservations, then no per worker pool resource reservations are injected.\nShootVPAEnabledByDefault (disabled by default)\nThis admission controller reacts on CREATE operations for Shoots. If enabled, it will enable the managed VerticalPodAutoscaler components (for more information, see Vertical Pod Auto-Scaling) by setting spec.kubernetes.verticalPodAutoscaler.enabled=true for newly created Shoots. 
Already existing Shoots and new Shoots that explicitly disable VPA (spec.kubernetes.verticalPodAutoscaler.enabled=false) will not be affected by this admission plugin.\nShootTolerationRestriction (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for Shoots. It validates the .spec.tolerations used in Shoots against the whitelist of its Project, or against the whitelist configured in the admission controller’s configuration, respectively. Additionally, it defaults the .spec.tolerations in Shoots with those configured in its Project, and those configured in the admission controller’s configuration, respectively.\nShootValidator (enabled by default)\nThis admission controller reacts on CREATE, UPDATE and DELETE operations for Shoots. It validates certain configurations in the specification against the referred CloudProfile (e.g., machine images, machine types, used Kubernetes version, …). Generally, it performs validations that cannot be handled by the static API validation due to their dynamic nature (e.g., when something needs to be checked against referred resources). Additionally, it takes over certain defaulting tasks (e.g., default machine image for worker pools, default Kubernetes version).\nShootManagedSeed (enabled by default)\nThis admission controller reacts on UPDATE and DELETE operations for Shoots. It validates certain configuration values in the specification that are specific to ManagedSeeds (e.g. the nginx-addon of the Shoot has to be disabled, the Shoot VPA has to be enabled). It rejects the deletion if the Shoot is referred to by a ManagedSeed.\nManagedSeedValidator (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for ManagedSeeds. It validates certain configuration values in the specification against the referred Shoot, for example Seed provider, network ranges, DNS domain, etc. 
Similar to ShootValidator, it performs validations that cannot be handled by the static API validation due to their dynamic nature. Additionally, it performs certain defaulting tasks, making sure that configuration values that are not specified are defaulted to the values of the referred Shoot, for example Seed provider, network ranges, DNS domain, etc.\nManagedSeedShoot (enabled by default)\nThis admission controller reacts on DELETE operations for ManagedSeeds. It rejects the deletion if there are Shoots that are scheduled onto the Seed that is registered by the ManagedSeed.\nShootDNSRewriting (disabled by default)\nThis admission controller reacts on CREATE operations for Shoots. If enabled, it adds a set of common suffixes configured in its admission plugin configuration to the Shoot (spec.systemComponents.coreDNS.rewriting.commonSuffixes) (for more information, see DNS Search Path Optimization). Already existing Shoots will not be affected by this admission plugin.\nNamespacedCloudProfileValidator (enabled by default)\nThis admission controller reacts on CREATE and UPDATE operations for NamespacedCloudProfiles. It primarily validates if the referenced parent CloudProfile exists in the system. In addition, the admission controller ensures that the NamespacedCloudProfile only configures new machine types, and does not overwrite those from the parent CloudProfile.\n","categories":"","description":"A list of all gardener managed admission plugins together with their responsibilities","excerpt":"A list of all gardener managed admission plugins together with their …","ref":"/docs/gardener/concepts/apiserver-admission-plugins/","tags":"","title":"APIServer Admission Plugins"},{"body":"Official Definition - What is Kubernetes? 
“Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.”\n Introduction - Basic Principle The foundation of the Gardener (providing Kubernetes Clusters as a Service) is Kubernetes itself, because Kubernetes is the go-to solution to manage software in the Cloud, even when it’s Kubernetes itself (see also OpenStack which is provisioned more and more on top of Kubernetes as well).\nWhile self-hosting, meaning to run Kubernetes components inside Kubernetes, is a popular topic in the community, we apply a special pattern catering to the needs of our cloud platform to provision hundreds or even thousands of clusters. We take a so-called “seed” cluster and seed the control plane (such as the API server, scheduler, controllers, etcd persistence and others) of an end-user cluster, which we call “shoot” cluster, as pods into the “seed” cluster. That means that one “seed” cluster, of which we will have one per IaaS and region, hosts the control planes of multiple “shoot” clusters. That allows us to avoid dedicated hardware/virtual machines for the “shoot” cluster control planes. We simply put the control plane into pods/containers and since the “seed” cluster watches them, they can be deployed with a replica count of 1 and only need to be scaled out when the control plane gets under pressure, but no longer for HA reasons. At the same time, the deployments get simpler (standard Kubernetes deployment) and easier to update (standard Kubernetes rolling update). The actual “shoot” cluster consists only of the worker nodes (no control plane) and therefore the users may get full administrative access to their clusters.\nSetting The Scene - Components and Procedure We provide a central operator UI, which we call the “Gardener Dashboard”. 
It talks to a dedicated cluster, which we call the “Garden” cluster, and uses custom resources managed by an aggregated API server (one of the general extension concepts of Kubernetes) to represent “shoot” clusters. In this “Garden” cluster runs the “Gardener”, which is basically a Kubernetes controller that watches the custom resources and acts upon them, i.e. creates, updates/modifies, or deletes “shoot” clusters. The creation basically follows these steps:\n Create a namespace in the “seed” cluster for the “shoot” cluster, which will host the “shoot” cluster control plane. Generate secrets and credentials, which the worker nodes will need to talk to the control plane. Create the infrastructure (using Terraform), which basically consists of the network setup. Deploy the “shoot” cluster control plane into the “shoot” namespace in the “seed” cluster, containing the “machine-controller-manager” pod. Create machine CRDs in the “seed” cluster, describing the configuration and the number of worker machines for the “shoot” (the machine-controller-manager watches the CRDs and creates virtual machines from them). Wait for the “shoot” cluster API server to become responsive (pods will be scheduled, persistent volumes and load balancers are created by Kubernetes via the respective cloud provider). Finally, we deploy kube-system daemons like kube-proxy and further add-ons like the dashboard into the “shoot” cluster and the cluster becomes active. Overview Architecture Diagram Detailed Architecture Diagram Note: The kubelet, as well as the pods inside the “shoot” cluster, talks through the front-door (load balancer IP; public Internet) to its “shoot” cluster API server running in the “seed” cluster. 
The reverse communication from the API server to the pod, service, and node networks happens through a VPN connection that we deploy into the “seed” and “shoot” clusters.\n","categories":["Users"],"description":"The concepts behind the Gardener architecture","excerpt":"The concepts behind the Gardener architecture","ref":"/docs/gardener/concepts/architecture/","tags":"","title":"Architecture"},{"body":"Audit a Kubernetes Cluster The shoot cluster is a Kubernetes cluster and its kube-apiserver handles the audit events. In order to define which audit events must be logged, a proper audit policy file must be passed to the Kubernetes API server. You can find more information about auditing a Kubernetes cluster in the Auditing topic.\nDefault Audit Policy By default, Gardener will deploy the shoot cluster with the audit policy defined in the kube-apiserver package.\nCustom Audit Policy If you need a specific audit policy for your shoot cluster, you can deploy the required audit policy in the garden cluster as a ConfigMap resource and set up your shoot to refer to this ConfigMap. Note that the policy must be stored under the key policy in the data section of the ConfigMap.\nFor example, deploy the auditpolicy ConfigMap in the same namespace as your Shoot resource:\nkubectl apply -f example/95-configmap-custom-audit-policy.yaml then set your shoot to refer to that ConfigMap (only related fields are shown):\nspec: kubernetes: kubeAPIServer: auditConfig: auditPolicy: configMapRef: name: auditpolicy Gardener validates that the Shoot resource refers only to an existing ConfigMap containing a valid audit policy, and rejects the Shoot on failure. 
If you want to switch back to the default audit policy, you have to remove the section\nauditPolicy: configMapRef: name: \u003cconfigmap-name\u003e from the shoot spec.\nRolling Out Changes to the Audit Policy Gardener does not automatically roll out changes to the Audit Policy, in order to minimize the number of Shoot reconciliations and avoid hitting cloud provider rate limits, etc. Gardener will pick up the changes on the next reconciliation of Shoots referencing the Audit Policy ConfigMap. If users want to roll out Audit Policy changes immediately, they can manually trigger a Shoot reconciliation as described in triggering an immediate reconciliation. This is similar to changes to the cloud provider secret referenced by Shoots.\n","categories":"","description":"How to define a custom audit policy through a `ConfigMap` and reference it in the shoot spec","excerpt":"How to define a custom audit policy through a `ConfigMap` and …","ref":"/docs/gardener/shoot_auditpolicy/","tags":"","title":"Audit a Kubernetes Cluster"},{"body":"Increasing the Security of All Gardener Stakeholders In summer 2018, the Gardener project team asked Kinvolk to execute several penetration tests in its role as a third-party contractor. The goal of this ongoing work was to increase the security of all Gardener stakeholders in the open source community. Following the Gardener architecture, the control plane of a Gardener managed shoot cluster resides in the corresponding seed cluster. 
This is a Control-Plane-as-a-Service with a network air gap.\nAlong the way we found various kinds of security issues, for example, due to misconfiguration or missing isolation, as well as two special problems with upstream Kubernetes and its Control-Plane-as-a-Service architecture.\nMajor Findings From this experience, we’d like to share a few examples of security issues that could happen on a Kubernetes installation and how to fix them.\nAlban Crequy (Kinvolk) and Dirk Marwinski (SAP SE) gave a presentation entitled Hardening Multi-Cloud Kubernetes Clusters as a Service at KubeCon 2018 in Shanghai presenting some of the findings.\nHere is a summary of the findings:\n Privilege escalation due to insecure configuration of the Kubernetes API server\n Root cause: Same certificate authority (CA) is used for both the API server and the proxy that allows accessing the API server. Risk: Users can get access to the API server. Recommendation: Always use different CAs. Exploration of the control plane network with malicious HTTP-redirects\n Root cause: See detailed description below. Risk: Provoked error message contains full HTTP payload from an existing endpoint which can be exploited. The contents of the payload depend on your setup, but can potentially be user data, configuration data, and credentials. Recommendation: Use the latest version of Gardener. Ensure the seed cluster’s container network supports network policies. Clusters that have been created with Kubify are not protected as Flannel is used there which doesn’t support network policies. 
Reading private AWS metadata via Grafana\n Root cause: By configuring a new custom data source in Grafana, we could send HTTP requests to target the control plane network. Risk: Users can get the “user-data” for the seed cluster from the metadata service and retrieve a kubeconfig for that Kubernetes cluster. Recommendation: Lock down Grafana features to only what’s necessary in this setup, block all unnecessary outgoing traffic, move Grafana to a different network, lock down unauthenticated endpoints. Scenario 1: Privilege Escalation with Insecure API Server In most configurations, different components connect directly to the Kubernetes API server, often using a kubeconfig with a client certificate. The API server is started with the flag:\n/hyperkube apiserver --client-ca-file=/srv/kubernetes/ca/ca.crt ... The API server will check whether the client certificate presented by kubectl, kubelet, scheduler or another component is really signed by the configured certificate authority for clients.\nThe API server can have many clients of various kinds\nHowever, it is possible to configure the API server differently for use with an intermediate authenticating proxy. The proxy will authenticate the client with its own custom method and then issue HTTP requests to the API server with additional HTTP headers specifying the user name and group name. The API server should only accept HTTP requests with HTTP headers from a legitimate proxy. To allow the API server to check incoming requests, you need to pass it a list of certificate authorities (CAs). Requests coming from a proxy are only accepted if they use a client certificate that is signed by one of the CAs of that list.\n--requestheader-client-ca-file=/srv/kubernetes/ca/ca-proxy.crt --requestheader-username-headers=X-Remote-User --requestheader-group-headers=X-Remote-Group API server clients can reach the API server through an authenticating proxy\nSo far, so good. 
But what happens if the malicious user “Mallory” tries to connect directly to the API server and reuses the HTTP headers to pretend to be someone else?\nWhat happens when a client bypasses the proxy, connecting directly to the API server?\nWith a correct configuration, Mallory’s kubeconfig will have a certificate signed by the API server certificate authority but not signed by the proxy certificate authority. So the API server will not accept the extra HTTP header “X-Remote-Group: system:masters”.\nYou only run into an issue when the same certificate authority is used for both the API server and the proxy. Then, any Kubernetes client certificate can be used to take the role of a different user or group as the API server will accept the user header and group header.\nThe kubectl tool does not normally add those HTTP headers but it’s pretty easy to generate the corresponding HTTP requests manually.\nWe worked on improving the Kubernetes documentation to make it clearer that this configuration should be avoided.\nScenario 2: Exploration of the Control Plane Network with Malicious HTTP-Redirects The API server is a central component of Kubernetes and many components initiate connections to it, including the kubelet running on worker nodes. Most of the requests from those clients will end up updating Kubernetes objects (pods, services, deployments, and so on) in the etcd database but the API server usually does not need to initiate TCP connections itself.\nThe API server is mostly a component that receives requests\nHowever, there are exceptions. Some kubectl commands will trigger the API server to open a new connection to the kubelet. kubectl exec is one of those commands. In order to get the standard I/Os from the pod, the API server will start an HTTP connection to the kubelet on the worker node where the pod is running. 
Depending on the container runtime used, it can be done in different ways, but one way to do it is for the kubelet to reply with an HTTP-302 redirection to the Container Runtime Interface (CRI). Basically, the kubelet is telling the API server to get the streams from CRI itself directly instead of forwarding. The redirection from the kubelet will only change the port and path from the URL; the IP address will not be changed because the kubelet and the CRI component run on the same worker node.\nBut the API server also initiates some connections, for example, to worker nodes\nIt’s often quite easy for users of a Kubernetes cluster to get access to worker nodes and tamper with the kubelet. They could be given explicit SSH access or they could be given a kubeconfig with enough privileges to create privileged pods or even just pods with “host” volumes.\nIn contrast, users (even those with “system:masters” permissions or “root” rights) are often not given access to the control plane. On setups like, for example, GKE or Gardener, the control plane is running on separate nodes, with a different administrative access. It could be hosted on a different cloud provider account. So users are not free to explore the internal networking of the control plane.\nWhat would happen if a user tampered with the kubelet to make it maliciously redirect kubectl exec requests to a different random endpoint? Most likely the given endpoint would not speak the streaming server protocol, so there would be an error. However, the full HTTP payload from the endpoint is included in the error message printed by kubectl exec.\nThe API server is tricked to connect to other components\nThe impact of this issue depends on the specific setup. But in many configurations, we could find a metadata service (such as the AWS metadata service) containing user data, configurations and credentials. 
The setup we explored had a different AWS account and a different EC2 instance profile for the worker nodes and the control plane. This issue allowed users to get access to the AWS metadata service in the context of the control plane, which they should not have access to.\nWe have reported this issue to the Kubernetes Security mailing list and the public pull request that addresses the issue has been merged (PR#66516). It provides a way to enforce HTTP redirect validation (disabled by default).\nBut there are several other ways that users could trigger the API server to generate HTTP requests and get the reply payload back, so it is advised to isolate the API server and other components from the network as an additional precautionary measure. Depending on where the API server runs, it could be with Kubernetes Network Policies, EC2 Security Groups or just iptables directly. Following the defense in depth principle, it is a good idea to apply the API server HTTP redirect validation when it is available as well as firewall rules.\nIn Gardener, this has been fixed with Kubernetes network policies along with changes to ensure the API server does not need to contact the metadata service. You can see more details in the announcements on the Gardener mailing list. This is tracked in CVE-2018-2475.\nTo be protected from this issue, stakeholders should:\n Use the latest version of Gardener. Ensure the seed cluster’s container network supports network policies. Clusters that have been created with Kubify are not protected as Flannel is used there which doesn’t support network policies. Scenario 3: Reading Private AWS Metadata via Grafana For our tests, we had access to a Kubernetes setup where users are not only given access to the API server in the control plane, but also to a Grafana instance that is used to gather data from their Kubernetes clusters via Prometheus. The control plane is managed and users don’t have access to the nodes that it runs on. 
They can only access the API server and Grafana via a load balancer. The internal network of the control plane is therefore hidden from users.\nPrometheus and Grafana can be used to monitor worker nodes\nUnfortunately, that setup was not protecting the control plane network from nosy users. By configuring a new custom data source in Grafana, we could send HTTP requests to target the control plane network, for example the AWS metadata service. The reply payload is not displayed on the Grafana Web UI but it is possible to access it from the debugging console of the Chrome browser.\nCredentials can be retrieved from the debugging console of Chrome\nAdding a Grafana data source is a way to issue HTTP requests to arbitrary targets\nIn that installation, users could get the “user-data” for the seed cluster from the metadata service and retrieve a kubeconfig for that Kubernetes cluster.\nThere are many possible measures to avoid this situation: lock down Grafana features to only what’s necessary in this setup, block all unnecessary outgoing traffic, move Grafana to a different network, or lock down unauthenticated endpoints, among others.\nConclusion The three scenarios above show pitfalls with a Kubernetes setup. A lot of them were specific to the Kubernetes installation: different cloud providers or different configurations will show different weaknesses. Users should no longer be given access to Grafana.\n","categories":"","description":"A few insecure configurations in Kubernetes","excerpt":"A few insecure configurations in Kubernetes","ref":"/docs/guides/applications/insecure-configuration/","tags":"","title":"Auditing Kubernetes for Secure Setup"},{"body":"Packages:\n authentication.gardener.cloud/v1alpha1 authentication.gardener.cloud/v1alpha1 Package v1alpha1 is a version of the API. 
“authentication.gardener.cloud/v1alpha1” API is already used for CRD registration and must not be served by the API server.\nResource Types: AdminKubeconfigRequest AdminKubeconfigRequest can be used to request a kubeconfig with admin credentials for a Shoot cluster.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec AdminKubeconfigRequestSpec Spec is the specification of the AdminKubeconfigRequest.\n expirationSeconds int64 (Optional) ExpirationSeconds is the requested validity duration of the credential. The credential issuer may return a credential with a different validity duration so a client needs to check the ‘expirationTimestamp’ field in a response. Defaults to 1 hour.\n status AdminKubeconfigRequestStatus Status is the status of the AdminKubeconfigRequest.\n AdminKubeconfigRequestSpec (Appears on: AdminKubeconfigRequest) AdminKubeconfigRequestSpec contains the expiration time of the kubeconfig.\n Field Description expirationSeconds int64 (Optional) ExpirationSeconds is the requested validity duration of the credential. The credential issuer may return a credential with a different validity duration so a client needs to check the ‘expirationTimestamp’ field in a response. 
Defaults to 1 hour.\n AdminKubeconfigRequestStatus (Appears on: AdminKubeconfigRequest) AdminKubeconfigRequestStatus is the status of the AdminKubeconfigRequest containing the kubeconfig and expiration of the credential.\n Field Description kubeconfig []byte Kubeconfig contains the kubeconfig with cluster-admin privileges for the shoot cluster.\n expirationTimestamp Kubernetes meta/v1.Time ExpirationTimestamp is the expiration timestamp of the returned credential.\n ViewerKubeconfigRequest ViewerKubeconfigRequest can be used to request a kubeconfig with viewer credentials (excluding Secrets) for a Shoot cluster.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ViewerKubeconfigRequestSpec Spec is the specification of the ViewerKubeconfigRequest.\n expirationSeconds int64 (Optional) ExpirationSeconds is the requested validity duration of the credential. The credential issuer may return a credential with a different validity duration so a client needs to check the ‘expirationTimestamp’ field in a response. Defaults to 1 hour.\n status ViewerKubeconfigRequestStatus Status is the status of the ViewerKubeconfigRequest.\n ViewerKubeconfigRequestSpec (Appears on: ViewerKubeconfigRequest) ViewerKubeconfigRequestSpec contains the expiration time of the kubeconfig.\n Field Description expirationSeconds int64 (Optional) ExpirationSeconds is the requested validity duration of the credential. The credential issuer may return a credential with a different validity duration so a client needs to check the ‘expirationTimestamp’ field in a response. 
Defaults to 1 hour.\n ViewerKubeconfigRequestStatus (Appears on: ViewerKubeconfigRequest) ViewerKubeconfigRequestStatus is the status of the ViewerKubeconfigRequest containing the kubeconfig and expiration of the credential.\n Field Description kubeconfig []byte Kubeconfig contains the kubeconfig with viewer privileges (excluding Secrets) for the shoot cluster.\n expirationTimestamp Kubernetes meta/v1.Time ExpirationTimestamp is the expiration timestamp of the returned credential.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n authentication.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/authentication/","tags":"","title":"Authentication"},{"body":"Authentication of Gardener Control Plane Components Against the Garden Cluster Note: This document refers to Gardener’s API server, admission controller, controller manager and scheduler components. Any reference to the term Gardener control plane component can be replaced with any of the mentioned above.\n There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy a Gardener control plane component is to not provide a kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution is to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see the example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.\u003cGardenerControlPlaneComponent\u003e.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.\u003cGardenerControlPlaneComponent\u003e.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to setup the authentication is to create a service account and the respective roles will be bound to this service account in the target cluster. Then use the generated service account token and craft a kubeconfig, which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.deployment.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution is to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. 
This approach does not provide a solution for the client certificate rotation. However, this setup can be achieved by setting both .Values.global.deployment.virtualGarden.enabled: true and .Values.global.deployment.virtualGarden.\u003cGardenerControlPlaneComponent\u003e.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then, projected service accounts tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.deployment.virtualGarden.enabled: true and .Values.global.deployment.virtualGarden.\u003cGardenerControlPlaneComponent\u003e.user.name. Note: username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e\n Set .Values.global.\u003cGardenerControlPlaneComponent\u003e.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.\u003cGardenerControlPlaneComponent\u003e.serviceAccountTokenVolumeProjection.audience. Note: audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e.\n Craft a kubeconfig (see the example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Authentication of Gardener Control Plane Components Against the Garden …","ref":"/docs/gardener/deployment/authentication_gardener_control_plane/","tags":"","title":"Authentication Gardener Control Plane"},{"body":"Overview The project resource operations that are performed manually in the dashboard or via kubectl can be automated using the Gardener API and a Service Account authorized to perform them.\nCreate a Service Account Prerequisites You are logged on to the Gardener Dashboard You have created a project Steps Select your project and choose MEMBERS from the menu on the left.\n Locate the section Service Accounts and choose +.\n Enter the service account details.\nThe following Roles are available:\n Role Granted Permissions Owner Combines the Admin, UAM and Service Account Manager roles. There can only be one owner per project. You can change the owner on the project administration page. Admin Allows to manage resources inside the project (e.g. secrets, shoots, configmaps and similar) and to manage permissions for service accounts. Note that the Admin role has read-only access to service accounts. Viewer Provides read access to project details and shoots. Has access to shoots but is not able to create new ones. Cannot read cloud provider secrets. UAM Allows to add/modify/remove human users, service accounts or groups to/from the project member list. In case an external UAM system is connected via a service account, only this account should get the UAM role. 
Service Account Manager Allows to manage service accounts inside the project namespace and request tokens for them. The permissions of the created service accounts are instead managed by the Admin role. For security reasons this role should not be assigned to service accounts. In particular it should be ensured that the service account is not able to refresh service account tokens forever. Choose CREATE. Use the Service Account To use the service account, download or copy its kubeconfig. With it you can connect to the API endpoint of your Gardener project.\n Note: The downloaded kubeconfig contains the service account credentials. Treat with care.\n Delete the Service Account Choose Delete Service Account to delete it.\nRelated Links Service Account Manager ","categories":"","description":"","excerpt":"Overview The project resource operations that are performed manually …","ref":"/docs/dashboard/automated-resource-management/","tags":"","title":"Automating Project Resource Management"},{"body":"Overview This document describes the used autoscaling mechanism for several components.\nGarden or Shoot Cluster etcd By default, if none of the autoscaling modes is requested the etcd is deployed with static resources, without autoscaling.\nHowever, there are two supported autoscaling modes for the Garden or Shoot cluster etcd.\n HVPA\nIn HVPA mode, the etcd is scaled by the hvpa-controller. The gardenlet/gardener-operator is creating an HVPA resource for the etcd (main or events). 
The HVPA enables a vertical scaling for etcd.\nThe HVPA mode is the used autoscaling mode when the HVPA feature gate is enabled and the VPAForETCD feature gate is disabled.\n VPA\nIn VPA mode, the etcd is scaled by a native VPA resource.\nThe VPA mode is the used autoscaling mode when the VPAForETCD feature gate is enabled (takes precedence over the HVPA feature gate).\n [!NOTE] Starting with release v1.97, the VPAForETCD feature gate is enabled by default.\n For both of the autoscaling modes downscaling is handled more pessimistically to prevent many subsequent etcd restarts. Thus, for production and infrastructure Shoot clusters (or all Garden clusters), downscaling is deactivated for the main etcd. For all other Shoot clusters, lower advertised requests/limits are only applied during the Shoot’s maintenance time window.\nShoot Kubernetes API Server There are three supported autoscaling modes for the Shoot Kubernetes API server.\n Baseline\nIn Baseline mode, the Shoot Kubernetes API server is scaled by active HPA and VPA in passive, recommend-only mode.\nThe API server resource requests are computed based on the Shoot’s minimum Nodes count:\n Range Resource Requests [0, 2] 800m, 800Mi (2, 10] 1000m, 1100Mi (10, 50] 1200m, 1600Mi (50, 100] 2500m, 5200Mi (100, inf.) 3000m, 5200Mi The API server’s min replicas count is 2, the max replicas count - 3.\nThe Baseline mode is the used autoscaling mode when the HVPA and VPAAndHPAForAPIServer feature gates are not enabled.\n HVPA\nIn HVPA mode, the Shoot Kubernetes API server is scaled by the hvpa-controller. The gardenlet is creating an HVPA resource for the API server. The HVPA resource is backed by HPA and VPA both in recommend-only mode. The hvpa-controller is responsible for enabling simultaneous horizontal and vertical scaling by incorporating the recommendations from the HPA and VPA.\nThe initial API server resource requests are 500m and 1Gi. HVPA’s HPA is scaling only on CPU (average utilization 80%). 
HVPA’s VPA max allowed values are 8 CPU and 25G.\nThe API server’s min replicas count is 2, the max replicas count - 3.\nThe HVPA mode is the used autoscaling mode when the HVPA feature gate is enabled (and the VPAAndHPAForAPIServer feature gate is disabled).\n VPAAndHPA\nIn VPAAndHPA mode, the Shoot Kubernetes API server is scaled simultaneously by VPA and HPA on the same metric (CPU and memory usage). The pod-thrashing cycle between VPA and HPA scaling on the same metric is avoided by configuring the HPA to scale on average usage (not on average utilization) and by picking the target average utilization values in sync with VPA’s allowed maximums. This makes it possible for VPA to first scale vertically on CPU/memory usage. Once all Pods’ average CPU/memory usage is close to exceeding the VPA’s allowed maximum CPU/memory (the HPA’s target average utilization, 1/7 less than VPA’s allowed maximums), HPA is scaling horizontally (by adding a new replica).\nThe VPAAndHPA mode is introduced to address disadvantages with HVPA: additional component; modifies the deployment triggering unnecessary rollouts; vertical scaling only at max replicas; stuck vertical resource requests when scaling in again; etc.\nThe initial API server resource requests are 250m and 500Mi. VPA’s max allowed values are 7 CPU and 28G. HPA’s average target usage values are 6 CPU and 24G.\nThe API server’s min replicas count is 2, the max replicas count - 6.\nThe VPAAndHPA mode is the used autoscaling mode when the VPAAndHPAForAPIServer feature gate is enabled (takes precedence over the HVPA feature gate).\n [!NOTE] Starting with release v1.101, the VPAAndHPAForAPIServer feature gate is enabled by default.\n In all scaling modes the min replicas count of 2 is imposed by the High Availability of Shoot Control Plane Components.\nThe gardenlet sets the initial API server resource requests only when the Deployment is not found. 
When the Deployment exists, it is not overwriting the kube-apiserver container resources.\nDisabling Scale Down for Components in the Shoot Control Plane Some Shoot clusters’ control plane components can be overloaded and can have very high resource usage. The existing autoscaling solution could be imperfect to cover these cases. Scale down actions for such overloaded components could be disruptive.\nTo prevent such disruptive scale-down actions it is possible to disable scale down of the etcd, Kubernetes API server and Kubernetes controller manager in the Shoot control plane by annotating the Shoot with alpha.control-plane.scaling.shoot.gardener.cloud/scale-down-disabled=true.\nThe following specifics apply when disabling scale-down for the Kubernetes API server component:\n In Baseline and HVPA modes the HPA’s min and max replicas count are set to 4. In VPAAndHPA mode if the HPA resource exists and HPA’s spec.minReplicas is not nil then the min replicas count is max(spec.minReplicas, status.desiredReplicas). When scale-down is disabled, this allows operators to specify a custom value for HPA spec.minReplicas and this value not to be reverted by gardenlet. I.e., HPA does scale down to min replicas but not below min replicas. HPA’s max replicas count is 6. Note: The alpha.control-plane.scaling.shoot.gardener.cloud/scale-down-disabled annotation is alpha and can be removed anytime without further notice. Only use it if you know what you are doing.\n Virtual Kubernetes API Server and Gardener API Server The virtual Kubernetes API server’s autoscaling is the same as the Shoot Kubernetes API server’s with the following differences:\n The initial API server resource requests are 600m and 512Mi in all autoscaling modes. The min replicas count is 2 for a non-HA virtual cluster and 3 for an HA virtual cluster. The max replicas count is 6. In HVPA mode, HVPA’s HPA is scaling on both CPU and memory (average utilization 80% for both). 
The Gardener API server’s autoscaling is the same as the Shoot Kubernetes API server’s with the following differences:\n The initial API server resource requests are 600m and 512Mi in all autoscaling modes. The min replicas count is 2 for a non-HA virtual cluster and 3 for an HA virtual cluster. The max replicas count is 6. In HVPA mode, HVPA’s HPA is scaling on both CPU and memory (average utilization 80% for both). In HVPA mode, HVPA’s VPA max allowed values are 4 CPU and 25G. ","categories":"","description":"","excerpt":"Overview This document describes the used autoscaling mechanism for …","ref":"/docs/gardener/autoscaling-specifics-for-components/","tags":"","title":"Autoscaling Specifics for Components"},{"body":"Azure Permissions The following document describes the Azure actions required to manage a Shoot cluster on Azure, split by the different Azure providers/services.\nBe aware that some actions are only required if particular deployment scenarios or features (e.g. bring your own vNet, use Azure-file, let the Shoot act as Seed, etc.) are used.\nMicrosoft.Compute # Required if a non zonal cluster based on Availability Set should be used. Microsoft.Compute/availabilitySets/delete Microsoft.Compute/availabilitySets/read Microsoft.Compute/availabilitySets/write # Required to let Kubernetes manage Azure disks. Microsoft.Compute/disks/delete Microsoft.Compute/disks/read Microsoft.Compute/disks/write # Required to fetch meta information about disk and virtual machine sizes. Microsoft.Compute/locations/diskOperations/read Microsoft.Compute/locations/operations/read Microsoft.Compute/locations/vmSizes/read # Required if csi snapshot capabilities should be used and/or the Shoot should act as a Seed. Microsoft.Compute/snapshots/delete Microsoft.Compute/snapshots/read Microsoft.Compute/snapshots/write # Required to let Gardener/Machine-Controller-Manager manage the cluster nodes/machines. 
Microsoft.Compute/virtualMachines/delete Microsoft.Compute/virtualMachines/read Microsoft.Compute/virtualMachines/start/action Microsoft.Compute/virtualMachines/write # Required if a non zonal cluster based on VMSS Flex (VMO) should be used. Microsoft.Compute/virtualMachineScaleSets/delete Microsoft.Compute/virtualMachineScaleSets/read Microsoft.Compute/virtualMachineScaleSets/write Microsoft.ManagedIdentity # Required if a user provided Azure managed identity should be attached to the cluster nodes. Microsoft.ManagedIdentity/userAssignedIdentities/assign/action Microsoft.ManagedIdentity/userAssignedIdentities/read Microsoft.MarketplaceOrdering # Required if nodes/machines should be created with images hosted on the Azure Marketplace. Microsoft.MarketplaceOrdering/offertypes/publishers/offers/plans/agreements/read Microsoft.MarketplaceOrdering/offertypes/publishers/offers/plans/agreements/write Microsoft.Network # Required to let Kubernetes manage services of type 'LoadBalancer'. Microsoft.Network/loadBalancers/backendAddressPools/join/action Microsoft.Network/loadBalancers/delete Microsoft.Network/loadBalancers/read Microsoft.Network/loadBalancers/write # Required in case the Shoot should use NatGateway(s). Microsoft.Network/natGateways/delete Microsoft.Network/natGateways/join/action Microsoft.Network/natGateways/read Microsoft.Network/natGateways/write # Required to let Gardener/Machine-Controller-Manager manage the cluster nodes/machines. Microsoft.Network/networkInterfaces/delete Microsoft.Network/networkInterfaces/ipconfigurations/join/action Microsoft.Network/networkInterfaces/ipconfigurations/read Microsoft.Network/networkInterfaces/join/action Microsoft.Network/networkInterfaces/read Microsoft.Network/networkInterfaces/write # Required to let Gardener maintain the basic infrastructure of the Shoot cluster and maintain LoadBalancer services. 
Microsoft.Network/networkSecurityGroups/delete Microsoft.Network/networkSecurityGroups/join/action Microsoft.Network/networkSecurityGroups/read Microsoft.Network/networkSecurityGroups/write # Required for managing LoadBalancers and NatGateways. Microsoft.Network/publicIPAddresses/delete Microsoft.Network/publicIPAddresses/join/action Microsoft.Network/publicIPAddresses/read Microsoft.Network/publicIPAddresses/write # Required for managing the basic infrastructure of a cluster and maintaining LoadBalancer services. Microsoft.Network/routeTables/delete Microsoft.Network/routeTables/join/action Microsoft.Network/routeTables/read Microsoft.Network/routeTables/routes/delete Microsoft.Network/routeTables/routes/read Microsoft.Network/routeTables/routes/write Microsoft.Network/routeTables/write # Required to let Gardener maintain the basic infrastructure of the Shoot cluster. # Only a subset is required for the bring your own vNet scenario. Microsoft.Network/virtualNetworks/delete # not required for bring your own vnet Microsoft.Network/virtualNetworks/read Microsoft.Network/virtualNetworks/subnets/delete Microsoft.Network/virtualNetworks/subnets/join/action Microsoft.Network/virtualNetworks/subnets/read Microsoft.Network/virtualNetworks/subnets/write Microsoft.Network/virtualNetworks/write # not required for bring your own vnet Microsoft.Resources # Required to let Gardener maintain the basic infrastructure of the Shoot cluster. Microsoft.Resources/subscriptions/resourceGroups/delete Microsoft.Resources/subscriptions/resourceGroups/read Microsoft.Resources/subscriptions/resourceGroups/write Microsoft.Storage # Required if Azure File should be used and/or if the Shoot should act as Seed. 
Microsoft.Storage/operations/read Microsoft.Storage/storageAccounts/blobServices/containers/delete Microsoft.Storage/storageAccounts/blobServices/containers/read Microsoft.Storage/storageAccounts/blobServices/containers/write Microsoft.Storage/storageAccounts/blobServices/read Microsoft.Storage/storageAccounts/delete Microsoft.Storage/storageAccounts/listkeys/action Microsoft.Storage/storageAccounts/read Microsoft.Storage/storageAccounts/write ","categories":"","description":"","excerpt":"Azure Permissions The following document describes the required Azure …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/azure-permissions/","tags":"","title":"Azure Permissions"},{"body":"Overview Kubernetes uses etcd as the key-value store for its resource definitions. Gardener supports the backup and restore of etcd. It is the responsibility of the shoot owners to backup the workload data.\nGardener uses an etcd-backup-restore component to backup the etcd backing the Shoot cluster regularly and restore it in case of disaster. It is deployed as sidecar via etcd-druid. This doc mainly focuses on the backup and restore configuration used by Gardener when deploying these components. For more details on the design and internal implementation details, please refer to GEP-06 and the documentation on individual repositories.\nBucket Provisioning Refer to the backup bucket extension document to find out details about configuring the backup bucket.\nBackup Policy etcd-backup-restore supports full snapshot and delta snapshots over full snapshot. In Gardener, this configuration is currently hard-coded to the following parameters:\n Full Snapshot schedule: Daily, 24hr interval. For each Shoot, the schedule time in a day is randomized based on the configured Shoot maintenance window. Delta Snapshot schedule: At 5min interval. If aggregated events size since last snapshot goes beyond 100Mib. 
Backup History / Garbage backup deletion policy: Gardener configures backup restore to have Exponential garbage collection policy. As per policy, the following backups are retained: All full backups and delta backups for the previous hour. Latest full snapshot of each previous hour for the day. Latest full snapshot of each previous day for 7 days. Latest full snapshot of the previous 4 weeks. Garbage Collection is configured at 12hr interval. Listing: Gardener doesn’t have any API to list out the backups. To find the backups list, an admin can check out the BackupEntry resource associated with the Shoot which holds the bucket and prefix details on the object store. Restoration The restoration process of etcd is automated through the etcd-backup-restore component from the latest snapshot. Gardener doesn’t support Point-In-Time-Recovery (PITR) of etcd. In case of an etcd disaster, the etcd is recovered from the latest backup automatically. For further details, please refer to the Restoration topic. Post restoration of etcd, the Shoot reconciliation loop brings the cluster back to its previous state.\nAgain, the Shoot owner is responsible for maintaining the backup/restore of their workload. Gardener only takes care of the cluster’s etcd.\n","categories":["Users"],"description":"Understand the etcd backup and restore capabilities of Gardener","excerpt":"Understand the etcd backup and restore capabilities of Gardener","ref":"/docs/gardener/concepts/backup-restore/","tags":"","title":"Backup and Restore"},{"body":"Contract: BackupBucket Resource The Gardener project features a sub-project called etcd-backup-restore to take periodic backups of etcd backing Shoot clusters. It demands the bucket (or its equivalent in different object store providers) to be created and configured externally with appropriate credentials. 
The BackupBucket resource takes this responsibility in Gardener.\nBefore introducing the BackupBucket extension resource, Gardener was using Terraform in order to create and manage these provider-specific resources (e.g., see AWS Backup). Now, Gardener commissions an external, provider-specific controller to take over this task. You can also refer to the backupInfra proposal documentation to get an idea about how the transition was done and understand the resource in a broader scope.\nWhat Is the Scope of a Bucket? A bucket will be provisioned per Seed. So, a backup of every Shoot created on that Seed will be stored under a different shoot-specific prefix under the bucket. The backup of a Shoot rescheduled to a different Seed will continue to use the same bucket.\nWhat Is the Lifespan of a BackupBucket? The bucket associated with BackupBucket will be created at the creation of the Seed. And as per current implementation, it will also be deleted on deletion of the Seed, if there isn’t any BackupEntry resource associated with it.\nIn the future, we plan to introduce a schedule for BackupBucket - the deletion logic for the BackupBucket resource, which will reschedule it on different available Seeds on deletion or failure of a health check for the currently associated seed. In that case, the BackupBucket will be deleted only if there isn’t any schedulable Seed available and there isn’t any associated BackupEntry resource.\nWhat Needs to Be Implemented to Support a New Infrastructure Provider? 
As part of the seed flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: BackupBucket metadata: name: foo spec: type: azure providerConfig: \u003csome-optional-provider-specific-backupbucket-configuration\u003e region: eu-west-1 secretRef: name: backupprovider namespace: shoot--foo--bar The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used to create the needed resources. This provider secret will be configured by the Gardener operator in the Seed resource and propagated over there by the seed controller.\nAfter your controller has created the required bucket, if required, it generates the secret to access the objects in the bucket and puts a reference to it in status. This secret is supposed to be used by Gardener, or eventually a BackupEntry resource and etcd-backup-restore component, to back up the etcd.\nIn order to support a new infrastructure provider, you need to write a controller that watches all BackupBuckets with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the Azure provider.\nReferences and Additional Resources BackupBucket API Reference Exemplary Implementation for the Azure Provider BackupEntry Resource Documentation Shared Bucket Proposal ","categories":"","description":"","excerpt":"Contract: BackupBucket Resource The Gardener project features a …","ref":"/docs/gardener/extensions/backupbucket/","tags":"","title":"BackupBucket"},{"body":"Contract: BackupEntry Resource The Gardener project features a sub-project called etcd-backup-restore to take periodic backups of etcd backing Shoot clusters. It demands the bucket (or its equivalent in different object store providers) access credentials to be created and configured externally with appropriate credentials. 
The BackupEntry resource takes this responsibility in Gardener to provide this information by creating a secret specific to the component.\nThat being said, the core motivation for introducing this resource was to support retention of backups post deletion of Shoot. The etcd-backup-restore components take responsibility of garbage collecting old backups out of the defined period. Once a shoot is deleted, we need to persist the backups for a few days. Hence, Gardener uses the BackupEntry resource for this housekeeping work post deletion of a Shoot. The BackupEntry resource is responsible for the shoot-specific prefix under the referenced bucket.\nBefore introducing the BackupEntry extension resource, Gardener was using Terraform in order to create and manage these provider-specific resources (e.g., see AWS Backup). Now, Gardener commissions an external, provider-specific controller to take over this task. You can also refer to the backupInfra proposal documentation to get an idea about how the transition was done and understand the resource in a broader scope.\nWhat Is the Lifespan of a BackupEntry? The bucket associated with BackupEntry will be created by using a BackupBucket resource. The BackupEntry resource will be created as a part of the Shoot creation. But resources might continue to exist post deletion of a Shoot (see gardenlet for more details).\nWhat Needs to be Implemented to Support a New Infrastructure Provider? 
As part of the shoot flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: BackupEntry metadata: name: shoot--foo--bar spec: type: azure providerConfig: \u003csome-optional-provider-specific-backup-bucket-configuration\u003e backupBucketProviderStatus: \u003csome-optional-provider-specific-backup-bucket-status\u003e region: eu-west-1 bucketName: foo secretRef: name: backupprovider namespace: shoot--foo--bar The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used to create the needed resources. This provider secret will be propagated from the BackupBucket resource by the shoot controller.\nYour controller is supposed to create the etcd-backup secret in the control plane namespace of a shoot. This secret is supposed to be used by Gardener or eventually by the etcd-backup-restore component to back up the etcd. The controller implementation should clean up the objects created under the shoot specific prefix in the bucket equivalent to the name of the BackupEntry resource.\nIn order to support a new infrastructure provider, you need to write a controller that watches all the BackupEntry resources with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the Azure provider.\nReferences and Additional Resources BackupEntry API Reference Exemplary Implementation for the Azure Provider BackupBucket Resource Documentation Shared Bucket Proposal Gardener-controller-manager-component-config API Specification ","categories":"","description":"","excerpt":"Contract: BackupEntry Resource The Gardener project features a …","ref":"/docs/gardener/extensions/backupentry/","tags":"","title":"BackupEntry"},{"body":"Contract: Bastion Resource The Gardener project allows users to connect to Shoot worker nodes via SSH. 
As nodes are usually firewalled and not directly accessible from the public internet, GEP-15 introduced the concept of “Bastions”. A bastion is a dedicated server that only serves to allow SSH ingress to the worker nodes.\nBastion resources contain the user’s public SSH key and IP address, in order to provision the server accordingly: The public key is put onto the Bastion and SSH ingress is only authorized for the given IP address (in fact, it’s not a single IP address, but a set of IP ranges, however for most purposes a single IP is used).\nWhat Is the Lifespan of a Bastion? Once a Bastion has been created in the garden, it will be replicated to the appropriate seed cluster, where a controller then reconciles a server and firewall rules etc., on the cloud provider used by the target Shoot. When the Bastion is ready (i.e. has a public IP), that IP is stored in the Bastion’s status and from there it is picked up by the garden cluster and gardenctl eventually.\nTo make multiple SSH sessions possible, the existence of the Bastion is not directly tied to the execution of gardenctl: users can exit out of gardenctl and use ssh manually to connect to the bastion and worker nodes.\nHowever, Bastions have an expiry date, after which they will be garbage collected.\nWhen SSH access is set to false for the Shoot in the workers settings (see Shoot Worker Nodes Settings), Bastion resources are deleted during Shoot reconciliation and new Bastions are prevented from being created.\nWhat Needs to Be Implemented to Support a New Infrastructure Provider? 
As part of the shoot flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Bastion metadata: name: mybastion namespace: shoot--foo--bar spec: type: aws # userData is base64-encoded cloud provider user data; this contains the # user's SSH key userData: IyEvYmluL2Jhc2ggL....Nlcgo= ingress: - ipBlock: cidr: 192.88.99.0/32 # this is most likely the user's IP address Your controller is supposed to create a new instance at the given cloud provider, firewall it to only allow SSH (TCP port 22) from the given IP blocks, and then configure the firewall for the worker nodes to allow SSH from the bastion instance. When a Bastion is deleted, all these changes need to be reverted.\nImplementation Details ConfigValidator Interface For bastion controllers, the generic Reconciler also delegates to a ConfigValidator interface that contains a single Validate method. This method is called by the generic Reconciler at the beginning of every reconciliation, and can be implemented by the extension to validate the .spec.providerConfig part of the Bastion resource with the respective cloud provider, typically the existence and validity of cloud provider resources such as VPCs, images, etc.\nThe Validate method returns a list of errors. If this list is non-empty, the generic Reconciler will fail with an error. 
This error will have the error code ERR_CONFIGURATION_PROBLEM, unless there is at least one error in the list that has its ErrorType field set to field.ErrorTypeInternal.\nReferences and Additional Resources Bastion API Reference Exemplary Implementation for the AWS Provider GEP-15 ","categories":"","description":"","excerpt":"Contract: Bastion Resource The Gardener project allows users to …","ref":"/docs/gardener/extensions/bastion/","tags":"","title":"Bastion"},{"body":"CA Rotation in Extensions GEP-18 proposes adding support for automated rotation of Shoot cluster certificate authorities (CAs). This document outlines all the requirements that Gardener extensions need to fulfill in order to support the CA rotation feature.\nRequirements for Shoot Cluster CA Rotation Extensions must not rely on static CA Secret names managed by the gardenlet, because their names are changing during CA rotation. Extensions cannot issue or use client certificates for authenticating against shoot API servers. Instead, they should use short-lived auto-rotated ServiceAccount tokens via gardener-resource-manager’s TokenRequestor. Also see Conventions and TokenRequestor documents. Extensions need to generate dedicated CAs for signing server certificates (e.g. cloud-controller-manager). There should be one CA per controller and purpose in order to bind the lifecycle to the reconciliation cycle of the respective object for which it is created. CAs managed by extensions should be rotated in lock-step with the shoot cluster CA. When the user triggers a rotation, the gardenlet writes phase and initiation time to Shoot.status.credentials.rotation.certificateAuthorities.{phase,lastInitiationTime}. See GEP-18 for a detailed description on what needs to happen in each phase. Extensions can retrieve this information from Cluster.shoot.status. 
Utilities for Secrets Management In order to fulfill the requirements listed above, extension controllers can reuse the SecretsManager that the gardenlet uses to manage all shoot cluster CAs, certificates, and other secrets as well. It implements the core logic for managing secrets that need to be rotated, auto-renewed, etc.\nAdditionally, there are utilities for reusing SecretsManager in extension controllers. They already implement the above requirements based on the Cluster resource and allow focusing on the extension controllers’ business logic.\nFor example, a simple SecretsManager usage in an extension controller could look like this:\nconst ( // identity for SecretsManager instance in ControlPlane controller identity = \"provider-foo-controlplane\" // secret config name of the dedicated CA caControlPlaneName = \"ca-provider-foo-controlplane\" ) func Reconcile() { var ( cluster *extensionscontroller.Cluster client client.Client // define wanted secrets with options secretConfigs = []extensionssecretsmanager.SecretConfigWithOptions{ { // dedicated CA for ControlPlane controller Config: \u0026secretutils.CertificateSecretConfig{ Name: caControlPlaneName, CommonName: \"ca-provider-foo-controlplane\", CertType: secretutils.CACert, }, // persist CA so that it gets restored on control plane migration Options: []secretsmanager.GenerateOption{secretsmanager.Persist()}, }, { // server cert for control plane component Config: \u0026secretutils.CertificateSecretConfig{ Name: \"cloud-controller-manager\", CommonName: \"cloud-controller-manager\", DNSNames: kutil.DNSNamesForService(\"cloud-controller-manager\", namespace), CertType: secretutils.ServerCert, }, // sign with our dedicated CA Options: []secretsmanager.GenerateOption{secretsmanager.SignedByCA(caControlPlaneName)}, }, } ) // initialize SecretsManager based on Cluster object sm, err := extensionssecretsmanager.SecretsManagerForCluster(ctx, logger.WithName(\"secretsmanager\"), clock.RealClock{}, client, cluster, 
identity, secretConfigs) // generate all wanted secrets (first CAs, then the rest) secrets, err := extensionssecretsmanager.GenerateAllSecrets(ctx, sm, secretConfigs) // cleanup any secrets that are not needed any more (e.g. after rotation) err = sm.Cleanup(ctx) } Please pay attention to the following points:\n There should be one SecretsManager identity per controller (and purpose if applicable) in order to prevent conflicts between different instances. E.g., there should be different identities for Infrastructure, Worker controller, etc., and the ControlPlane controller should use dedicated SecretsManager identities per purpose (e.g. provider-foo-controlplane and provider-foo-controlplane-exposure). All other points in Reusing the SecretsManager in Other Components. ","categories":"","description":"","excerpt":"CA Rotation in Extensions GEP-18 proposes adding support for automated …","ref":"/docs/gardener/extensions/ca-rotation/","tags":"","title":"CA Rotation"},{"body":"Gardener Extension for Calico Networking \nThis controller operates on the Network resource in the extensions.gardener.cloud/v1alpha1 API group. It manages those objects that are requesting Calico Networking configuration (.spec.type=calico):\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Network metadata: name: calico-network namespace: shoot--core--test-01 spec: type: calico clusterCIDR: 192.168.0.0/24 serviceCIDR: 10.96.0.0/24 providerConfig: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig overlay: enabled: false Please find a concrete example in the example folder. All the Calico specific configuration should be configured in the providerConfig section. 
If additional configuration is required, it should be added to the networking-calico chart in controllers/networking-calico/charts/internal/calico/values.yaml and corresponding code parts should be adapted (for example in controllers/networking-calico/pkg/charts/utils.go).\nOnce the network resource is applied, the networking-calico controller would then create all the necessary managed-resources which should be picked up by the gardener-resource-manager which will then apply all the network extensions resources to the shoot cluster.\nFinally after successful reconciliation an output similar to the one below should be expected.\n status: lastOperation: description: Successfully reconciled network lastUpdateTime: \"...\" progress: 100 state: Succeeded type: Reconcile observedGeneration: 1 providerStatus: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkStatus Compatibility The following table lists known compatibility issues of this extension controller with other Gardener components.\n Calico Extension Gardener Action Notes \u003e= v1.30.0 \u003c v1.63.0 Please first update Gardener components to \u003e= v1.63.0. Without the mentioned minimum Gardener version, Calico Pods are not only scheduled to dedicated system component nodes in the shoot cluster. How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig pointed to the cluster you want to connect to. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for the Calico CNI network plugin","excerpt":"Gardener extension controller for the Calico CNI network plugin","ref":"/docs/extensions/network-extensions/gardener-extension-networking-calico/","tags":"","title":"Calico CNI"},{"body":"While it is possible, we highly recommend not to use privileged containers in your productive environment.\n","categories":"","description":"","excerpt":"While it is possible, we highly recommend not to use privileged …","ref":"/docs/faq/privileged-containers/","tags":"","title":"Can I run privileged containers?"},{"body":"There is no automatic migration of major/minor versions of Kubernetes. You need to update your clusters manually or press the Upgrade button in the Dashboard.\nBefore updating a cluster you should be aware of the potential errors this might cause. The following video will dive into a Kubernetes outage in production that Monzo experienced, its causes and effects, and the architectural and operational lessons learned.\n It is therefore recommended to first update your test cluster and validate it before performing changes on a productive environment.\n","categories":"","description":"","excerpt":"There is no automatic migration of major/minor versions of Kubernetes. …","ref":"/docs/faq/automatic-upgrade/","tags":"","title":"Can Kubernetes upgrade automatically?"},{"body":"Backing up your Kubernetes cluster is possible through the use of specialized software like Velero. 
Velero consists of a server side component and a client tool that allow you to backup or restore all objects in your cluster, as well as the cluster resources and persistent volumes.\n","categories":"","description":"","excerpt":"Backing up your Kubernetes cluster is possible through the use of …","ref":"/docs/faq/backup/","tags":"","title":"Can you backup your Kubernetes cluster resources?"},{"body":"The migration of clusters or content from one cluster to another is out of scope for the Gardener project. For such scenarios you may consider using tools like Velero.\n","categories":"","description":"","excerpt":"The migration of clusters or content from one cluster to another is …","ref":"/docs/faq/automatic-migrate/","tags":"","title":"Can you migrate the content of one cluster to another cluster?"},{"body":"Changing the API This document describes the steps that need to be performed when changing the API. It provides guidance for API changes to both the Gardener system in general and component configurations.\nGenerally, as Gardener is a Kubernetes-native extension, it follows the same API conventions and guidelines as Kubernetes itself. The Kubernetes API Conventions as well as Changing the API topics already provide a good overview and general explanation of the basic concepts behind it. We are following the same approaches.\nGardener API The Gardener API is defined in the pkg/apis/{core,extensions,settings} directories and is the main point of interaction with the system. It must be ensured that the API is always backwards-compatible.\nChanging the API Checklist when changing the API:\n Modify the field(s) in the respective Golang files of all external versions and the internal version. Make sure new fields are being added as “optional” fields, i.e., they are of pointer types, they have the // +optional comment, and they have the omitempty JSON tag. Make sure that the existing field numbers in the protobuf tags are not changed. 
Do not copy protobuf tags from other fields but create them with make generate WHAT=\"protobuf\". If necessary, implement/adapt the conversion logic defined in the versioned APIs (e.g., pkg/apis/core/v1beta1/conversions*.go). If necessary, implement/adapt defaulting logic defined in the versioned APIs (e.g., pkg/apis/core/v1beta1/defaults*.go). Run the code generation: make generate If necessary, implement/adapt validation logic defined in the internal API (e.g., pkg/apis/core/validation/validation*.go). If necessary, adapt the exemplary YAML manifests of the Gardener resources defined in example/*.yaml. In most cases, it makes sense to add/adapt the documentation for administrators/operators and/or end-users in the docs folder to provide information on purpose and usage of the added/changed fields. When opening the pull request, always add a release note so that end-users are becoming aware of the changes. Removing a Field If fields shall be removed permanently from the API, then a proper deprecation period must be adhered to so that end-users have enough time to adapt their clients.\nOnce the deprecation period is over, the field should be dropped from the API in a two-step process, i.e., in two release cycles. In the first step, all the usages in the code base should be dropped. In the second step, the field should be dropped from API. We need to follow this two-step process because there can be the case where gardener-apiserver is upgraded to a new version in which the field has been removed but other controllers are still on the old version of Gardener. This can lead to nil pointer exceptions or other unexpected behaviour.\nThe steps for removing a field from the code base are:\n The field in the external version(s) has to be commented out with appropriate doc string that the protobuf number of the corresponding field is reserved. 
Example:\n-\tSeedTemplate *gardencorev1beta1.SeedTemplate `json:\"seedTemplate,omitempty\" protobuf:\"bytes,2,opt,name=seedTemplate\"` +\t// SeedTemplate is tombstoned to show why 2 is reserved protobuf tag. +\t// SeedTemplate *gardencorev1beta1.SeedTemplate `json:\"seedTemplate,omitempty\" protobuf:\"bytes,2,opt,name=seedTemplate\"` The reasoning behind this is to prevent the same protobuf number being used by a new field. Introducing a new field with the same protobuf number would be a breaking change for clients still using the old protobuf definitions that have the old field for the given protobuf number. The field in the internal version can be removed.\n A unit test has to be added to make sure that a new field does not reuse the already reserved protobuf tag.\n Example of field removal can be found in the Remove seedTemplate field from ManagedSeed API PR.\nComponent Configuration APIs Most Gardener components have a component configuration that follows similar principles to the Gardener API. Those component configurations are defined in pkg/{controllermanager,gardenlet,scheduler},pkg/apis/config. Hence, the above checklist also applies for changes to those APIs. However, since these APIs are only used internally and only during the deployment of Gardener, the guidelines with respect to changes and backwards-compatibility are slightly relaxed. If necessary, it is allowed to remove fields without a proper deprecation period if the release note uses the breaking operator keywords.\nIn addition to the above checklist:\n If necessary, then adapt the Helm chart of Gardener defined in charts/gardener. Adapt the values.yaml file as well as the manifest templates. ","categories":"","description":"","excerpt":"Changing the API This document describes the steps that need to be …","ref":"/docs/gardener/changing-the-api/","tags":"","title":"Changing the API"},{"body":"CI/CD As an execution environment for CI/CD workloads, we use Concourse. 
We however abstract from the underlying “build executor” and instead offer a Pipeline Definition Contract, through which components declare their build pipelines as required.\nIn order to run continuous delivery workloads for all components contributing to the Gardener project, we operate a central service.\nTypical workloads encompass the execution of tests and builds of a variety of technologies, as well as building and publishing container images, typically containing build results.\nWe are building our CI/CD offering around some principles:\n container-native - each workload is executed within a container environment. Components may customise used container images automation - pipelines are generated without manual interaction self-service - components customise their pipelines by changing their sources standardisation Learn more on our: Build Pipeline Reference Manual\n","categories":"","description":"","excerpt":"CI/CD As an execution environment for CI/CD workloads, we use …","ref":"/docs/contribute/code/cicd/","tags":"","title":"CI/CD"},{"body":"Gardener Extension for cilium Networking \nThis controller operates on the Network resource in the extensions.gardener.cloud/v1alpha1 API group. It manages those objects that are requesting cilium Networking configuration (.spec.type=cilium):\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Network metadata: name: cilium-network namespace: shoot--foo--bar spec: type: cilium podCIDR: 10.244.0.0/16 serviceCIDR: 10.96.0.0/24 providerConfig: apiVersion: cilium.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig # hubble: # enabled: true # store: kubernetes Please find a concrete example in the example folder. All the cilium specific configuration should be configured in the providerConfig section. 
If additional configuration is required, it should be added to the networking-cilium chart in controllers/networking-cilium/charts/internal/cilium/values.yaml and corresponding code parts should be adapted (for example in controllers/networking-cilium/pkg/charts/utils.go).\nOnce the network resource is applied, the networking-cilium controller would then create all the necessary managed-resources which should be picked up by the gardener-resource-manager which will then apply all the network extensions resources to the shoot cluster.\nFinally after successful reconciliation an output similar to the one below should be expected.\n status: lastOperation: description: Successfully reconciled network lastUpdateTime: \"...\" progress: 100 state: Succeeded type: Reconcile observedGeneration: 1 How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig pointed to the cluster you want to connect to. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation Docs for cilium user ","categories":"","description":"Gardener extension controller for the Cilium CNI network plugin","excerpt":"Gardener extension controller for the Cilium CNI network plugin","ref":"/docs/extensions/network-extensions/gardener-extension-networking-cilium/","tags":"","title":"Cilium CNI"},{"body":"Cleanup of Shoot Clusters in Deletion When a shoot cluster is deleted then Gardener tries to gracefully remove most of the Kubernetes resources inside the cluster. This prevents infrastructure or other artifacts from remaining after the shoot deletion.\nThe cleanup is performed in four steps. Some resources are deleted with a grace period, and all resources are forcefully deleted (by removing blocking finalizers) after some time to not block the cluster deletion entirely.\nCleanup steps:\n All ValidatingWebhookConfigurations and MutatingWebhookConfigurations are deleted with a 5m grace period. Forceful finalization happens after 5m. All APIServices and CustomResourceDefinitions are deleted with a 5m grace period. Forceful finalization happens after 1h. All CronJobs, DaemonSets, Deployments, Ingresses, Jobs, Pods, ReplicaSets, ReplicationControllers, Services, StatefulSets, PersistentVolumeClaims are deleted with a 5m grace period. Forceful finalization happens after 5m. If the Shoot is annotated with shoot.gardener.cloud/skip-cleanup=true, then only Services and PersistentVolumeClaims are considered.\n All VolumeSnapshots and VolumeSnapshotContents are deleted with a 5m grace period. Forceful finalization happens after 1h. 
It is possible to override the finalization grace periods via annotations on the Shoot:\n shoot.gardener.cloud/cleanup-webhooks-finalize-grace-period-seconds (for the resources handled in step 1) shoot.gardener.cloud/cleanup-extended-apis-finalize-grace-period-seconds (for the resources handled in step 2) shoot.gardener.cloud/cleanup-kubernetes-resources-finalize-grace-period-seconds (for the resources handled in step 3) ⚠️ If \"0\" is provided, then all resources are finalized immediately without waiting for any graceful deletion. Please be aware that this might lead to orphaned infrastructure artifacts.\n","categories":"","description":"","excerpt":"Cleanup of Shoot Clusters in Deletion When a shoot cluster is deleted …","ref":"/docs/gardener/shoot_cleanup/","tags":"","title":"Cleanup of Shoot Clusters in Deletion"},{"body":"CLI Flags Etcd-druid exposes the following CLI flags that allow for configuring its behavior.\n CLI Flag Component Description Default feature-gates etcd-druid A set of key=value pairs that describe feature gates for alpha/experimental features. Please check feature-gates for more information. \"\" metrics-bind-address controller-manager The IP address that the metrics endpoint binds to. \"\" metrics-port controller-manager The port used for the metrics endpoint. 8080 metrics-addr controller-manager The fully qualified address:port that the metrics endpoint binds to.\nDeprecated: this field will eventually be removed. Please use --metrics-bind-address and --metrics-port instead. \":8080\" webhook-server-bind-address controller-manager The IP address on which to listen for the HTTPS webhook server. \"\" webhook-server-port controller-manager The port on which to listen for the HTTPS webhook server. 9443 webhook-server-tls-server-cert-dir controller-manager The path to a directory containing the server’s TLS certificate and key (the files must be named tls.crt and tls.key respectively). 
\"/etc/webhook-server-tls\" enable-leader-election controller-manager Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager. false leader-election-id controller-manager Name of the resource that leader election will use for holding the leader lock. \"druid-leader-election\" leader-election-resource-lock controller-manager Specifies which resource type to use for leader election. Supported options are ’endpoints’, ‘configmaps’, ’leases’, ’endpointsleases’ and ‘configmapsleases’.\nDeprecated. Will be removed in the future in favour of using only leases as the leader election resource lock for the controller manager. \"leases\" disable-lease-cache controller-manager Disable cache for lease.coordination.k8s.io resources. false etcd-workers etcd-controller Number of workers spawned for concurrent reconciles of etcd spec and status changes. If not specified then default of 3 is assumed. 3 ignore-operation-annotation etcd-controller Specifies whether to ignore or honour the annotation gardener.cloud/operation: reconcile on resources to be reconciled.\nDeprecated: please use --enable-etcd-spec-auto-reconcile instead. false enable-etcd-spec-auto-reconcile etcd-controller If true then automatically reconciles Etcd Spec. If false, waits for explicit annotation gardener.cloud/operation: reconcile to be placed on the Etcd resource to trigger reconcile. false disable-etcd-serviceaccount-automount etcd-controller If true then .automountServiceAccountToken will be set to false for the ServiceAccount created for etcd StatefulSets. false etcd-status-sync-period etcd-controller Period after which an etcd status sync will be attempted. 15s etcd-member-notready-threshold etcd-controller Threshold after which an etcd member is considered not ready if the status was unknown before. 5m etcd-member-unknown-threshold etcd-controller Threshold after which an etcd member is considered unknown. 
1m enable-backup-compaction compaction-controller Enable automatic compaction of etcd backups. false compaction-workers compaction-controller Number of worker threads of the CompactionJob controller. The controller creates a backup compaction job if a certain etcd event threshold is reached. If compaction is enabled, the value for this flag must be greater than zero. 3 etcd-events-threshold compaction-controller Total number of etcd events that can be allowed before a backup compaction job is triggered. 1000000 active-deadline-duration compaction-controller Duration after which a running backup compaction job will be terminated. 3h metrics-scrape-wait-duration compaction-controller Duration to wait for after compaction job is completed, to allow Prometheus metrics to be scraped. 0s etcd-copy-backups-task-workers etcdcopybackupstask-controller Number of worker threads for the etcdcopybackupstask controller. 3 secret-workers secret-controller Number of worker threads for the secrets controller. 10 enable-etcd-components-webhook etcdcomponents-webhook Enable EtcdComponents Webhook to prevent unintended changes to resources managed by etcd-druid. false reconciler-service-account etcdcomponents-webhook The fully qualified name of the service account used by etcd-druid for reconciling etcd resources. If unspecified, the default service account mounted for etcd-druid will be used. \u003cetcd-druid-service-account\u003e etcd-components-exempt-service-accounts etcdcomponents-webhook The comma-separated list of fully qualified names of service accounts that are exempt from EtcdComponents Webhook checks. 
\"\" ","categories":"","description":"","excerpt":"CLI Flags Etcd-druid exposes the following CLI flags that allow for …","ref":"/docs/other-components/etcd-druid/deployment/cli-flags/","tags":"","title":"Cli Flags"},{"body":"Cluster Resource As part of the extensibility epic, a lot of responsibility that was previously taken over by Gardener directly has now been shifted to extension controllers running in the seed clusters. These extensions often serve a well-defined purpose, e.g. the management of DNS records, infrastructure, etc. We have introduced a couple of extension CRDs in the seeds whose specification is written by Gardener, and which are acted up by the extensions.\nHowever, the extensions sometimes require more information that is not directly part of the specification. One example of that is the GCP infrastructure controller which needs to know the shoot’s pod and service network. Another example is the Azure infrastructure controller which requires some information out of the CloudProfile resource. The problem is that Gardener does not know which extension requires which information so that it can write it into their specific CRDs.\nIn order to deal with this problem we have introduced the Cluster extension resource. This CRD is written into the seeds, however, it does not contain a status, so it is not expected that something acts upon it. Instead, you can treat it like a ConfigMap which contains data that might be interesting for you. In the context of Gardener, seeds and shoots, and extensibility the Cluster resource contains the CloudProfile, Seed, and Shoot manifest. Extension controllers can take whatever information they want out of it that might help completing their individual tasks.\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Cluster metadata: name: shoot--foo--bar spec: cloudProfile: apiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile ... seed: apiVersion: core.gardener.cloud/v1beta1 kind: Seed ... 
shoot: apiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... The resource is written by Gardener before it starts the reconciliation flow of the shoot.\n⚠️ All Gardener components use the core.gardener.cloud/v1beta1 version, i.e., the Cluster resource will contain the objects in this version.\nImportant Information that Should Be Taken into Account There are some fields in the Shoot specification that might be interesting to take into account.\n .spec.hibernation.enabled={true,false}: Extension controllers might want to behave differently if the shoot is hibernated or not (probably they might want to scale down their control plane components, for example). .status.lastOperation.state=Failed: If Gardener sets the shoot’s last operation state to Failed, it means that Gardener won’t automatically retry to finish the reconciliation/deletion flow because an error occurred that could not be resolved within the last 24h (default). In this case, end-users are expected to manually re-trigger the reconciliation flow in case they want Gardener to try again. Extension controllers are expected to follow the same principle. This means they have to read the shoot state out of the Cluster resource. Extension Resources Not Associated with a Shoot In some cases, Gardener may create extension resources that are not associated with a shoot, but are needed to support some functionality internal to Gardener. Such resources will be created in the garden namespace of a seed cluster.\nFor example, if the managed ingress controller is active on the seed, Gardener will create a DNSRecord resource(s) in the garden namespace of the seed cluster for the ingress DNS record.\nExtension controllers that may be expected to reconcile extension resources in the garden namespace should make sure that they can tolerate the absence of a cluster resource. 
This means that they should not attempt to read the cluster resource in such cases, or if they do they should ignore the “not found” error.\nReferences and Additional Resources Cluster API (Golang Specification) ","categories":"","description":"","excerpt":"Cluster Resource As part of the extensibility epic, a lot of …","ref":"/docs/gardener/extensions/cluster/","tags":"","title":"Cluster"},{"body":"Relation Between Gardener API and Cluster API (SIG Cluster Lifecycle) In essence, the Cluster API harmonizes how to get to clusters, while Gardener goes one step further and also harmonizes the clusters themselves. The Cluster API delegates the specifics to so-called providers for infrastructures or control planes via specific CR(D)s, while Gardener only has one cluster CR(D). Different Cluster API providers, e.g. for AWS, Azure, GCP, etc., give you vastly different Kubernetes clusters. In contrast, Gardener gives you the exact same clusters with the exact same K8s version, operating system, control plane configuration like for API server or kubelet, add-ons like overlay network, HPA/VPA, DNS and certificate controllers, ingress and network policy controllers, control plane monitoring and logging stacks, down to the behavior of update procedures, auto-scaling, self-healing, etc., on all supported infrastructures. These homogeneous clusters are an essential goal for Gardener, as its main purpose is to simplify operations for teams that need to develop and ship software on Kubernetes clusters on a plethora of infrastructures (a.k.a. multi-cloud).\nIncidentally, Gardener influenced the Machine API in the Cluster API with its Machine Controller Manager and was the first to adopt it. You can find more information on that in the joint SIG Cluster Lifecycle KubeCon talk where @hardikdr from our Gardener team in India spoke.\nThat means that we follow the Cluster API with great interest and are active members. It was completely overhauled from v1alpha1 to v1alpha2. 
But because v1alpha2 made too many assumptions about the bring-up of masters and was enforcing master machine operations (for more information, see The Cluster API Book: “As of v1alpha2, Machine-Based is the only control plane type that Cluster API supports”), services that managed their control planes differently like GKE or Gardener couldn’t adopt it. In 2020 v1alpha3 was introduced and made it possible (again) to integrate managed services like GKE or Gardener. The mapping from the Gardener API to the Cluster API is mostly syntactic.\nTo wrap it up, while the Cluster API knows about clusters, it doesn’t know about their make-up. With Gardener, we wanted to go beyond that and harmonize the make-up of the clusters themselves and make them homogeneous across all supported infrastructures. Gardener can therefore deliver homogeneous clusters with exactly the same configuration and behavior on all infrastructures (see also Gardener’s coverage in the official conformance test grid).\nWith Cluster API v1alpha3 and the support for declarative control plane management, it has become possible (again) to enable Kubernetes managed services like GKE or Gardener. We would be more than happy if the community were interested in contributing a Gardener control plane provider.\n","categories":["Users"],"description":"Understand the evolution of the Gardener API and its relation to the Cluster API","excerpt":"Understand the evolution of the Gardener API and its relation to the …","ref":"/docs/gardener/concepts/cluster-api/","tags":"","title":"Cluster API"},{"body":"Community Calls Join our community calls to connect with other Gardener enthusiasts and watch cool presentations.\nWhat content can you expect?\n Gardener core developers roll out new information, share knowledge with the members and demonstrate new service capabilities. Adopters and contributors share their use-cases, experience and exchange on future requirements. 
If you want to receive updates, sign up here:\n Gardener Google Group The recordings are published on the Gardener Project YouTube channel. Topic Speaker Date and Time Link Get more computing power in Gardener by overcoming Kubelet limitations with CRI-resource-manager Pawel Palucki, Alexander D. Kanevskiy October 20, 2022 Recording Summary Cilium / Isovalent Presentation Raymond de Jong October 6, 2022 Recording Summary Gardener Extension Development - From scratch to the gardener-extension-shoot-flux Jens Schneider, Lothar Gesslein June 9, 2022 Recording Summary Deploying and Developing Gardener Locally (Without Any External Infrastructure!) Tim Ebert, Rafael Franzke March 17, 2022 Recording Summary Gardenctl-v2 Holger Kosser, Lukas Gross, Peter Sutter February 17, 2022 Recording Summary Google Calendar\n Presenting a Topic If there is a topic you would like to present, message us in our #gardener slack channel or get in touch with Jessica Katz.\n","categories":"","description":"","excerpt":"Community Calls Join our community calls to connect with other …","ref":"/community/","tags":"","title":"Community"},{"body":"Checklist For Adding New Components Adding new components that run in the garden, seed, or shoot cluster is theoretically quite simple - we just need a Deployment (or other similar workload resource), the respective container image, and maybe a bit of configuration. In practice, however, there are a couple of things to keep in mind in order to make the deployment production-ready. This document provides a checklist for them that you can walk through.\nGeneral Avoid usage of Helm charts (example)\nNowadays, we use Golang components instead of Helm charts for deploying components to a cluster. Please find a typical structure of such components in the provided metrics_server.go file (configuration values are typically managed in a Values structure). 
There are a few exceptions (e.g., Istio) still using charts, however the default should be using a Golang-based implementation. For the exceptional cases, use Golang’s embed package to embed the Helm chart directory (example 1, example 2).\n Choose the proper deployment way (example 1 (direct application w/ client), example 2 (using ManagedResource), example 3 (mixed scenario))\nFor historic reasons, resources related to shoot control plane components are applied directly with the client. All other resources (seed or shoot system components) are deployed via gardener-resource-manager’s Resource controller (ManagedResources) since it performs health checks out-of-the-box and has a lot of other features (see its documentation for more information). Components that can run as both seed system component or shoot control plane component (e.g., VPA or kube-state-metrics) can make use of these utility functions.\n Use unique ConfigMaps/Secrets (example 1, example 2)\nUnique ConfigMaps/Secrets are immutable for modification and have a unique name. This has a couple of benefits, e.g. the kubelet doesn’t watch these resources, and it is always clear which resource contains which data since it cannot be changed. As a consequence, unique/immutable ConfigMaps/Secret are superior to checksum annotations on the pod templates. Stale/unused ConfigMaps/Secrets are garbage-collected by gardener-resource-manager’s GarbageCollector. There are utility functions (see examples above) for using unique ConfigMaps/Secrets in Golang components. It is essential to inject the annotations into the workload resource to make the garbage-collection work.\nNote that some ConfigMaps/Secrets should not be unique (e.g., those containing monitoring or logging configuration). The reason is that the old revision stays in the cluster even if unused until the garbage-collector acts. 
During this time, they would be wrongly aggregated to the full configuration.\n Manage certificates/secrets via secrets manager (example)\nYou should use the secrets manager for the management of any kind of credentials. This makes sure that credentials rotation works out-of-the-box without you requiring to think about it. Generally, do not use client certificates (see the Security section).\n Consider hibernation when calculating replica count (example)\nShoot clusters can be hibernated meaning that all control plane components in the shoot namespace in the seed cluster are scaled down to zero and all worker nodes are terminated. If your component runs in the seed cluster then you have to consider this case and provide the proper replica count. There is a utility function available (see example).\n Ensure task dependencies are as precise as possible in shoot flows (example 1, example 2)\nOnly define the minimum of needed dependency tasks in the shoot reconciliation/deletion flows.\n Handle shoot system components\nShoot system components deployed by gardener-resource-manager are labelled with resource.gardener.cloud/managed-by: gardener. This makes Gardener adding required label selectors and tolerations so that non-DaemonSet managed Pods will exclusively run on selected nodes (for more information, see System Components Webhook). DaemonSets on the other hand, should generally tolerate any NoSchedule or NoExecute taints so that they can run on any Node, regardless of user added taints.\n Images Do not hard-code container image references (example 1, example 2, example 3)\nWe define all image references centrally in the imagevector/containers.yaml file. 
Hence, the image references must not be hard-coded in the pod template spec but read from this so-called image vector instead.\n Do not use container images from registries that don’t support IPv6 (example: image vector, prow configuration)\nRegistries such as ECR, GHCR (ghcr.io), MCR (mcr.microsoft.com) don’t support pulling images over IPv6.\nCheck if the upstream image is also maintained in a registry that supports IPv6 natively such as Artifact Registry, Quay (quay.io). If there is such an image, use the image from the registry with IPv6 support.\nIf the image is not available in a registry with IPv6 then copy the image to the gardener GCR. There is a prow job copying images that are needed in gardener components from a source registry to the gardener GCR under the prefix europe-docker.pkg.dev/gardener-project/releases/3rd/ (see the documentation or gardener/ci-infra#619).\nIf you want to use a new image from a registry without IPv6 support or upgrade an already used image to a newer tag, please open a PR to the ci-infra repository that modifies the job’s list of images to copy: images.yaml.\n Do not use container images from Docker Hub (example: image vector, prow configuration)\nThere is a strict rate-limit that applies to the Docker Hub registry. As described in 2., use another registry (if possible) or copy the image to the gardener GCR.\n Do not use Shoot container images that are not multi-arch\nGardener supports Shoot clusters with both amd64 and arm64 based worker Nodes. amd64 container images cannot run on arm64 worker Nodes and vice-versa.\n Security Use a dedicated ServiceAccount and disable auto-mount (example)\nComponents that need to talk to the API server of their runtime cluster must always use a dedicated ServiceAccount (do not use default), with automountServiceAccountToken set to false. 
This makes gardener-resource-manager’s TokenInvalidator invalidate the static token secret and its ProjectedTokenMount webhook inject a projected token automatically.\n Use shoot access tokens instead of client certificates (example)\nFor components that need to talk to a target cluster different from their runtime cluster (e.g., running in seed cluster but talking to shoot) the gardener-resource-manager’s TokenRequestor should be used to manage a so-called “shoot access token”.\n Define RBAC roles with minimal privileges (example)\nThe component’s ServiceAccount (if it exists) should have as few privileges as possible. Consequently, please define proper RBAC roles for it. This might include a combination of ClusterRoles and Roles. Please do not provide elevated privileges due to laziness (e.g., because there is already a ClusterRole that can be extended vs. creating a Role only when access to a single namespace is needed).\n Use NetworkPolicys to restrict network traffic\nYou should restrict both ingress and egress traffic to/from your component as much as possible to ensure that it only gets access to/from other components if really needed. Gardener provides a few default policies for typical usage scenarios. For more information, see NetworkPolicys In Garden, Seed, Shoot Clusters.\n Do not run containers in privileged mode (example, example 2)\nAvoid running containers with privileged=true. Instead, define the needed Linux capabilities.\n Do not run containers as root (example)\nAvoid running containers as root. Usually, components such as Kubernetes controllers and admission webhook servers don’t need root user capabilities to do their jobs.\nThe problem with running as root starts with how the container is first built. Unless a non-privileged user is configured in the Dockerfile, container build systems by default set up the container with the root user. 
Add a non-privileged user to your Dockerfile or use a base image with a non-root user (for example the nonroot images from distroless such as gcr.io/distroless/static-debian12:nonroot).\nIf the image is an upstream one, then consider configuring a securityContext for the container/Pod with a non-privileged user. For more information, see Configure a Security Context for a Pod or Container.\n Choose the proper Seccomp profile (example 1, example 2)\nFor components deployed in the Seed cluster, the Seccomp profile will be defaulted to RuntimeDefault by gardener-resource-manager’s SeccompProfile webhook which works well for the majority of components. However, in some special cases you might need to overwrite it.\nThe gardener-resource-manager’s SeccompProfile webhook is not enabled for a Shoot cluster. For components deployed in the Shoot cluster, it is required [*] to explicitly specify the Seccomp profile.\n[*] It is required because if a component deployed in the Shoot cluster does not specify a Seccomp profile and cannot run with the RuntimeDefault Seccomp profile, then enabling the .spec.kubernetes.kubelet.seccompDefault field in the Shoot spec would break the corresponding component.\n High Availability / Stability Specify the component type label for high availability (example)\nTo support high-availability deployments, gardener-resource-manager’s HighAvailabilityConfig webhook injects the proper specification like replica or topology spread constraints. You only need to specify the type label. For more information, see High Availability Of Deployed Components.\n Define a PodDisruptionBudget (example)\nClosely related to high availability but also to stability in general: The definition of a PodDisruptionBudget with maxUnavailable=1 should be provided by default.\n Choose the right PriorityClass (example)\nEach cluster runs many components with different priorities. Gardener provides a set of default PriorityClasses. 
For more information, see Priority Classes.\n Consider defining liveness and readiness probes (example)\nTo ensure smooth rolling update behaviour, consider the definition of liveness and/or readiness probes.\n Mark node-critical components (example)\nTo ensure user workload pods are only scheduled to Nodes where all node-critical components are ready, these components need to tolerate the node.gardener.cloud/critical-components-not-ready taint (NoSchedule effect). Also, such DaemonSets and the included PodTemplates need to be labelled with node.gardener.cloud/critical-component=true. For more information, see Readiness of Shoot Worker Nodes.\n Consider making a Service topology-aware (example)\nTo reduce costs and to improve the network traffic latency in multi-zone Seed clusters, consider making a Service topology-aware, if applicable. In short, when a Service is topology-aware, Kubernetes routes network traffic to the Endpoints (Pods) which are located in the same zone where the traffic originated from. In this way, the cross availability zone traffic is avoided. See Topology-Aware Traffic Routing.\n Enable leader election unconditionally for controllers (example 1, example 2, example 3)\nEnable leader election unconditionally for controllers independently from the number of replicas or from the high availability configurations. Having leader election enabled even for a single replica Deployment prevents having two Pods active at the same time. 
Otherwise, there are some corner cases that can result in two active Pods - Deployment rolling update or kubelet stops running on a Node and is not able to terminate the old replica while kube-controller-manager creates a new replica to match the Deployment’s desired replicas count.\n Scalability Provide resource requirements (example)\nAll components should define reasonable (initial) CPU and memory requests and avoid limits (especially CPU limits) unless you know the healthy range for your component (almost impossible with most components today), but no more than the node allocatable remainder (after daemonset pods) of the largest eligible machine type. Scheduling only takes requests into account!\n Define a VerticalPodAutoscaler (example)\nWe typically (need to) perform vertical auto-scaling for containers that have a significant usage (\u003e50m/100M) and a significant usage spread over time (\u003e2x) by defining a VerticalPodAutoscaler with updatePolicy.updateMode Auto, containerPolicies[].controlledValues RequestsOnly, reasonable minAllowed configuration and no maxAllowed configuration (will be taken care of in Gardener environments for you/capped at the largest eligible machine type).\n Define a HorizontalPodAutoscaler if needed (example)\nIf your component is capable of scaling horizontally, you should consider defining a HorizontalPodAutoscaler.\n [!NOTE] For more information and concrete configuration hints, please see our best practices guide for pod auto scaling and especially the summary and recommendations sections.\n Observability / Operations Productivity Provide monitoring scrape config and alerting rules (example 1, example 2)\nComponents should provide scrape configuration and alerting rules for Prometheus/Alertmanager if appropriate. This should be done inside a dedicated monitoring.go file. 
Extensions should follow the guidelines described in Extensions Monitoring Integration.\n Provide logging parsers and filters (example 1, example 2)\nComponents should provide parsers and filters for fluent-bit, if appropriate. This should be done inside a dedicated logging.go file. Extensions should follow the guidelines described in Fluent-bit log parsers and filters.\n Set the revisionHistoryLimit to 2 for Deployments (example)\nIn order to allow easy inspection of two ReplicaSets to quickly find the changes that led to a rolling update, the revision history limit should be set to 2.\n Define health checks (example 1)\ngardener-operator’s and gardenlet’s care controllers regularly check the health status of components relevant to the respective cluster (garden/seed/shoot). For shoot control plane components, you need to extend the lists of components to make sure your component is checked (see the example above). For components deployed via ManagedResource, please consult the respective care controller documentation for more information (garden, seed, shoot).\n Configure automatic restarts in shoot maintenance time window (example 1, example 2)\nGardener offers to restart components during the maintenance time window. For more information, see Restart Control Plane Controllers and Restart Some Core Addons. 
You can consider adding the needed label to your control plane component to get this automatic restart (probably not needed for most components).\n ","categories":"","description":"","excerpt":"Checklist For Adding New Components Adding new components that run in …","ref":"/docs/gardener/component-checklist/","tags":"","title":"Component Checklist"},{"body":"Concept Title (the topic title can also be placed in the frontmatter)\nOverview This section provides an overview of the topic and the information provided in it.\nRelevant heading 1 This section gives the user all the information needed in order to understand the topic.\nRelevant subheading This adds additional information that belongs to the topic discussed in the parent heading.\nRelevant heading 2 This section gives the user all the information needed in order to understand the topic.\nRelated Links Link 1 Link 2 ","categories":"","description":"Describes the contents of a concept topic","excerpt":"Describes the contents of a concept topic","ref":"/docs/contribute/documentation/style-guide/concept_template/","tags":"","title":"Concept Topic Structure"},{"body":"Deployment of the shoot DNS service extension Disclaimer: This document is NOT a step by step deployment guide for the shoot DNS service extension and only contains some configuration specifics regarding the deployment of different components via the helm charts residing in the shoot DNS service extension repository.\ngardener-extension-admission-shoot-dns-service Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-shoot-dns-service component will be to not provide kubeconfig at all. 
This way in-cluster configuration and an automounted service account token will be used. The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution will be to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to setup the authentication will be to create a service account and the respective roles will be bound to this service account in the target cluster. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution will be to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. 
This approach does not provide a solution for the client certificate rotation. However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the shoot DNS service extension Disclaimer: This …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/configuration/","tags":"","title":"Configuration"},{"body":"Configuring the Rsyslog Relp Extension Introduction As a cluster owner, you might need audit logs on a Shoot node level. With these audit logs you can track actions on your nodes like privilege escalation, file integrity, process executions, and who is the user that performed these actions. Such information is essential for the security of your Shoot cluster. Linux operating systems collect such logs via the auditd and journald daemons. However, these logs can be lost if they are only kept locally on the operating system. You need a reliable way to send them to a remote server where they can be stored for longer time periods and retrieved when necessary.\nRsyslog offers a solution for that. It gathers and processes logs from auditd and journald and then forwards them to a remote server. Moreover, rsyslog can make use of the RELP protocol so that logs are sent reliably and no messages are lost.\nThe shoot-rsyslog-relp extension is used to configure rsyslog on each Shoot node so that the following can take place:\n Rsyslog reads logs from the auditd and journald sockets. The logs are filtered based on the program name and syslog severity of the message. The logs are enriched with metadata containing the name of the Project in which the Shoot is created, the name of the Shoot, the UID of the Shoot, and the hostname of the node on which the log event occurred. 
The enriched logs are sent to the target remote server via the RELP protocol. The following graph shows a rough outline of how that looks in a Shoot cluster: Shoot Configuration The extension is not globally enabled and must be configured per Shoot cluster. The Shoot specification has to be adapted to include the shoot-rsyslog-relp extension configuration, which specifies the target server to which logs are forwarded, its port, and some optional rsyslog settings described in the examples below.\nBelow is an example shoot-rsyslog-relp extension configuration as part of the Shoot spec:\nkind: Shoot metadata: name: bar namespace: garden-foo ... spec: extensions: - type: shoot-rsyslog-relp providerConfig: apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: RsyslogRelpConfig # Set the target server to which logs are sent. The server must support the RELP protocol. target: some.rsyslog-relp.server # Set the port of the target server. port: 10250 # Define rules to select logs from which programs and with what syslog severity # are forwarded to the target server. loggingRules: - severity: 4 programNames: [\"kubelet\", \"audisp-syslog\"] - severity: 1 programNames: [\"audisp-syslog\"] # Define an interval of 90 seconds at which the current connection is broken and re-established. # By default this value is 0 which means that the connection is never broken and re-established. rebindInterval: 90 # Set the timeout for relp sessions to 90 seconds. If set too low, valid sessions may be considered # dead and tried to recover. timeout: 90 # Set how often an action is retried before it is considered to have failed. # Failed actions discard log messages. Setting `-1` here means that messages are never discarded. resumeRetryCount: -1 # Configures rsyslog to report continuation of action suspension, e.g. when the connection to the target # server is broken. 
reportSuspensionContinuation: true # Add tls settings if tls should be used to encrypt the connection to the target server. tls: enabled: true # Use `name` authentication mode for the tls connection. authMode: name # Only allow connections if the server's name is `some.rsyslog-relp.server` permittedPeer: - \"some.rsyslog-relp.server\" # Reference to the resource which contains certificates used for the tls connection. # It must be added to the `.spec.resources` field of the Shoot. secretReferenceName: rsyslog-relp-tls # Instruct librelp on the Shoot nodes to use the gnutls tls library. tlsLib: gnutls # Add auditConfig settings if you want to customize node level auditing. auditConfig: enabled: true # Reference to the resource which contains the audit configuration. # It must be added to the `.spec.resources` field of the Shoot. configMapReferenceName: audit-config resources: # Add the rsyslog-relp-tls secret in the resources field of the Shoot spec. - name: rsyslog-relp-tls resourceRef: apiVersion: v1 kind: Secret name: rsyslog-relp-tls-v1 - name: audit-config resourceRef: apiVersion: v1 kind: ConfigMap name: audit-config-v1 ... Choosing Which Log Messages to Send to the Target Server The .loggingRules field defines rules about which logs should be sent to the target server. When a log is processed by rsyslog, it is compared against the list of rules in order. If the program name and the syslog severity of the log messages matches the rule, the message is forwarded to the target server. 
The following table describes the syslog severity and their corresponding codes:\nNumerical Severity Code 0 Emergency: system is unusable 1 Alert: action must be taken immediately 2 Critical: critical conditions 3 Error: error conditions 4 Warning: warning conditions 5 Notice: normal but significant condition 6 Informational: informational messages 7 Debug: debug-level messages Below is an example with a .loggingRules section that will only forward logs from the kubelet program with syslog severity of 6 or lower and any other program with syslog severity of 2 or lower:\napiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: RsyslogRelpConfig target: localhost port: 1520 loggingRules: - severity: 6 programNames: [\"kubelet\"] - severity: 2 You can use a minimal shoot-rsyslog-relp extension configuration to forward all logs to the target server:\napiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: RsyslogRelpConfig target: some.rsyslog-relp.server port: 10250 loggingRules: - severity: 7 Securing the Communication to the Target Server with TLS The communication to the target server is not encrypted by default. To enable encryption, set the .tls.enabled field in the shoot-rsyslog-relp extension configuration to true. In this case, an immutable secret which contains the TLS certificates used to establish the TLS connection to the server must be created in the same project namespace as your Shoot.\nAn example Secret is given below:\n Note: The secret must be immutable\n kind: Secret apiVersion: v1 metadata: name: rsyslog-relp-tls-v1 namespace: garden-foo immutable: true data: ca: |-----BEGIN BEGIN RSA PRIVATE KEY----- ... -----END RSA PRIVATE KEY----- crt: |-----BEGIN BEGIN RSA PRIVATE KEY----- ... -----END RSA PRIVATE KEY----- key: |-----BEGIN BEGIN RSA PRIVATE KEY----- ... 
-----END RSA PRIVATE KEY----- The Secret must be referenced in the Shoot’s .spec.resources field and the corresponding resource entry must be referenced in the .tls.secretReferenceName of the shoot-rsyslog-relp extension configuration:\nkind: Shoot metadata: name: bar namespace: garden-foo ... spec: extensions: - type: shoot-rsyslog-relp providerConfig: apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: RsyslogRelpConfig target: some.rsyslog-relp.server port: 10250 loggingRules: - severity: 7 tls: enabled: true secretReferenceName: rsyslog-relp-tls resources: - name: rsyslog-relp-tls resourceRef: apiVersion: v1 kind: Secret name: rsyslog-relp-tls-v1 ... You can set a few additional parameters for the TLS connection: .tls.authMode, tls.permittedPeer, and tls.tlsLib. Refer to the rsyslog documentation for more information on these parameters:\n .tls.authMode .tls.permittedPeer .tls.tlsLib Configuring the Audit Daemon on the Shoot Nodes The shoot-rsyslog-relp extension also allows you to configure the Audit Daemon (auditd) on the Shoot nodes.\nBy default, the audit rules located under the /etc/audit/rules.d directory on your Shoot’s nodes will be moved to /etc/audit/rules.d.original and the following rules will be placed under the /etc/audit/rules.d directory: 00-base-config.rules, 10-privilege-escalation.rules, 11-privilege-special.rules, 12-system-integrity.rules. 
Next, augenrules --load will be called and the audit daemon (auditd) will be restarted so that the new rules can take effect.\nAlternatively, you can define your own auditd rules to be placed on your Shoot’s nodes by using the following configuration:\napiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: Auditd auditRules: |## First rule - delete all existing rules -D ## Now define some custom rules -a exit,always -F arch=b64 -S setuid -S setreuid -S setgid -S setregid -F auid\u003e0 -F auid!=-1 -F key=privilege_escalation -a exit,always -F arch=b64 -S execve -S execveat -F euid=0 -F auid\u003e0 -F auid!=-1 -F key=privilege_escalation In this case the original rules are also backed up in the /etc/audit/rules.d.original directory.\nTo deploy this configuration, it must be embedded in an immutable ConfigMap.\n [!NOTE] The data key storing this configuration must be named auditd.\n An example ConfigMap is given below:\napiVersion: v1 kind: ConfigMap metadata: name: audit-config-v1 namespace: garden-foo immutable: true data: auditd: |apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: Auditd auditRules: | ## First rule - delete all existing rules -D ## Now define some custom rules -a exit,always -F arch=b64 -S setuid -S setreuid -S setgid -S setregid -F auid\u003e0 -F auid!=-1 -F key=privilege_escalation -a exit,always -F arch=b64 -S execve -S execveat -F euid=0 -F auid\u003e0 -F auid!=-1 -F key=privilege_escalation After creating such a ConfigMap, it must be included in the Shoot’s spec.resources array and then referenced from the providerConfig.auditConfig.configMapReferenceName field in the shoot-rsyslog-relp extension configuration.\nAn example configuration is given below:\nkind: Shoot metadata: name: bar namespace: garden-foo ... 
spec: extensions: - type: shoot-rsyslog-relp providerConfig: apiVersion: rsyslog-relp.extensions.gardener.cloud/v1alpha1 kind: RsyslogRelpConfig target: some.rsyslog-relp.server port: 10250 loggingRules: - severity: 7 auditConfig: enabled: true configMapReferenceName: audit-config resources: - name: audit-config resourceRef: apiVersion: v1 kind: ConfigMap name: audit-config-v1 Finally, by setting providerConfig.auditConfig.enabled to false in the shoot-rsyslog-relp extension configuration, the original audit rules on your Shoot’s nodes will not be modified and auditd will not be restarted.\nExamples on how the providerConfig.auditConfig.enabled field functions are given below:\n The following deploys the extension default audit rules as of today: providerConfig: auditConfig: enabled: true The following deploys only the rules specified in the referenced ConfigMap: providerConfig: auditConfig: enabled: true configMapReferenceName: audit-config Both of the following do not deploy any audit rules: providerConfig: auditConfig: enabled: false configMapReferenceName: audit-config providerConfig: auditConfig: enabled: false ","categories":"","description":"","excerpt":"Configuring the Rsyslog Relp Extension Introduction As a cluster …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/configuration/","tags":"","title":"Configuration"},{"body":"Gardener Configuration and Usage Gardener automates the full lifecycle of Kubernetes clusters as a service. Additionally, it has several extension points allowing external controllers to plug-in to the lifecycle. As a consequence, there are several configuration options for the various custom resources that are partially required.\nThis document describes the:\n Configuration and usage of Gardener as operator/administrator. Configuration and usage of Gardener as end-user/stakeholder/customer. 
Configuration and Usage of Gardener as Operator/Administrator When we use the terms “operator/administrator”, we refer to both the people deploying and operating Gardener. Gardener consists of the following components:\n gardener-apiserver, a Kubernetes-native API extension that serves custom resources in the Kubernetes-style (like Seeds and Shoots), and a component that contains multiple admission plugins. gardener-admission-controller, an HTTP(S) server with several handlers to be used in a ValidatingWebhookConfiguration. gardener-controller-manager, a component consisting of multiple controllers that implement reconciliation and deletion flows for some of the custom resources (e.g., it contains the logic for maintaining Shoots, reconciling Projects). gardener-scheduler, a component that assigns newly created Shoot clusters to appropriate Seed clusters. gardenlet, a component running in seed clusters and consisting of multiple controllers that implement reconciliation and deletion flows for some of the custom resources (e.g., it contains the logic for reconciliation and deletion of Shoots). Each of these components has various configuration options. The gardener-apiserver uses the standard API server library maintained by the Kubernetes community, and as such it mainly supports command line flags. Other components use so-called componentconfig files that describe their configuration in a Kubernetes-style versioned object.\nConfiguration File for Gardener Admission Controller The Gardener admission controller only supports one command line flag, which should be a path to a valid admission-controller configuration file. Please take a look at this example configuration.\nConfiguration File for Gardener Controller Manager The Gardener controller manager only supports one command line flag, which should be a path to a valid controller-manager configuration file. 
Please take a look at this example configuration.\nConfiguration File for Gardener Scheduler The Gardener scheduler also only supports one command line flag, which should be a path to a valid scheduler configuration file. Please take a look at this example configuration. Information about the concepts of the Gardener scheduler can be found at Gardener Scheduler.\nConfiguration File for gardenlet The gardenlet also only supports one command line flag, which should be a path to a valid gardenlet configuration file. Please take a look at this example configuration. Information about the concepts of the gardenlet can be found at gardenlet.\nSystem Configuration After successful deployment of the four components, you need to set up the system. Let’s first focus on some “static” configuration. When the gardenlet starts, it scans the garden namespace of the garden cluster for Secrets that have influence on its reconciliation loops, mainly the Shoot reconciliation:\n Internal domain secret - contains the DNS provider credentials (having appropriate privileges) which will be used to create/delete the so-called “internal” DNS records for the Shoot clusters, please see this yaml file for an example.\n This secret is used in order to establish a stable endpoint for shoot clusters, which is used internally by all control plane components. The DNS records are normal DNS records but called “internal” in our scenario because only the kubeconfigs for the control plane components use this endpoint when talking to the shoot clusters. It is forbidden to change the internal domain secret if there are existing shoot clusters. 
Default domain secrets (optional) - contain the DNS provider credentials (having appropriate privileges) which will be used to create/delete DNS records for a default domain for shoots (e.g., example.com), please see this yaml file for an example.\n Not every end-user/stakeholder/customer has its own domain, however, Gardener needs to create a DNS record for every shoot cluster. As landscape operator you might want to define a default domain owned and controlled by you that is used for all shoot clusters that don’t specify their own domain. If you have multiple default domain secrets defined you can add a priority as an annotation (dns.gardener.cloud/domain-default-priority) to select which domain should be used for new shoots during creation. The domain with the highest priority is selected during shoot creation. If there is no annotation defined, the default priority is 0, also all non integer values are considered as priority 0. Alerting secrets (optional) - contain the alerting configuration and credentials for the AlertManager to send email alerts. It is also possible to configure the monitoring stack to send alerts to an AlertManager not deployed by Gardener to handle alerting. Please see this yaml file for an example.\n If email alerting is configured: An AlertManager is deployed into each seed cluster that handles the alerting for all shoots on the seed cluster. Gardener will inject the SMTP credentials into the configuration of the AlertManager. The AlertManager will send emails to the configured email address in case any alerts are firing. If an external AlertManager is configured: Each shoot has a Prometheus responsible for monitoring components and sending out alerts. The alerts will be sent to a URL configured in the alerting secret. This external AlertManager is not managed by Gardener and can be configured however the operator sees fit. Supported authentication types are no authentication, basic, or mutual TLS. 
Global monitoring secrets (optional) - contains basic authentication credentials for the Prometheus aggregating metrics for all clusters.\n These secrets are synced to each seed cluster and used to gain access to the aggregate monitoring components. Shoot Service Account Issuer secret (optional) - contains the configuration needed to centrally configure gardenlets in order to implement GEP-24. Please see the example configuration for more details. In addition to that, the ShootManagedIssuer gardenlet feature gate should be enabled in order for configurations to take effect.\n This secret contains the hostname which will be used to configure the shoot’s managed issuer, therefore the value of the hostname should not be changed once configured. [!CAUTION] Gardener Operator manages this field automatically if Gardener Discovery Server is enabled and does not provide a way to change the default value of it as of now. It calculates it based on the first ingress domain for the runtime Garden cluster. The domain is prefixed with “discovery.” using the formula discovery.{garden.spec.runtimeCluster.ingress.domains[0]}. If you are not yet using Gardener Operator but plan to enable the ShootManagedIssuer feature gate, it is EXTREMELY important to follow the same convention as Gardener Operator, so that during migration to Gardener Operator the hostname can stay the same and avoid disruptions for shoots that already have a managed service account issuer.\n Apart from this “static” configuration there are several custom resources extending the Kubernetes API and used by Gardener. As an operator/administrator, you have to configure some of them to make the system work.\nConfiguration and Usage of Gardener as End-User/Stakeholder/Customer As an end-user/stakeholder/customer, you are using a Gardener landscape that has been setup for you by another team. You don’t need to care about how Gardener itself has to be configured or how it has to be deployed. 
Take a look at Gardener API Server - the topic describes which resources are offered by Gardener. You may want to have a more detailed look at Projects, SecretBindings, Shoots, and (Cluster)OpenIDConnectPresets.\n","categories":"","description":"","excerpt":"Gardener Configuration and Usage Gardener automates the full lifecycle …","ref":"/docs/gardener/configuration/","tags":"","title":"Configuration"},{"body":"Configure Dependency Watchdog Components Prober Dependency watchdog prober command takes command-line-flags which are meant to fine-tune the prober. In addition, a ConfigMap is also mounted to the container which provides tuning knobs for all the probes that the prober starts.\nCommand line arguments Prober can be configured via the following flags:\n Flag Name Type Required Default Value Description kube-api-burst int No 10 Burst to use while talking with kubernetes API server. The number must be \u003e= 0. If it is 0 then a default value of 10 will be used kube-api-qps float No 5.0 Maximum QPS (queries per second) allowed when talking with kubernetes API server. The number must be \u003e= 0. If it is 0 then a default value of 5.0 will be used concurrent-reconciles int No 1 Maximum number of concurrent reconciles config-file string Yes NA Path of the config file containing the configuration to be used for all probes metrics-bind-addr string No “:9643” The TCP address that the controller should bind to for serving prometheus metrics health-bind-addr string No “:9644” The TCP address that the controller should bind to for serving health probes enable-leader-election bool No false In case prober deployment has more than 1 replica for high availability, then it will be set up in an active-passive mode. Out of many replicas one will become the leader and the rest will be passive followers waiting to acquire leadership in case the leader dies. leader-election-namespace string No “garden” Namespace in which leader election resource will be created. 
It should be the same namespace where DWD pods are deployed leader-elect-lease-duration time.Duration No 15s The duration that non-leader candidates will wait after observing a leadership renewal until attempting to acquire leadership of a led but unrenewed leader slot. This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate. This is only applicable if leader election is enabled. leader-elect-renew-deadline time.Duration No 10s The interval between attempts by the acting master to renew a leadership slot before it stops leading. This must be less than or equal to the lease duration. This is only applicable if leader election is enabled. leader-elect-retry-period time.Duration No 2s The duration the clients should wait between attempting acquisition and renewal of a leadership. This is only applicable if leader election is enabled. You can view an example kubernetes prober deployment YAML to see how these command line args are configured.\nProber Configuration A probe configuration is mounted as ConfigMap to the container. The path to the config file is configured via config-file command line argument as mentioned above. Prober will start one probe per Shoot control plane hosted within the Seed cluster. Each such probe will run asynchronously and will periodically connect to the Kube ApiServer of the Shoot. Configuration below will influence each such probe.\nYou can view an example YAML configuration provided as data in a ConfigMap here.\n Name Type Required Default Value Description kubeConfigSecretName string Yes NA Name of the kubernetes Secret which has the encoded KubeConfig required to connect to the Shoot control plane Kube ApiServer via an internal domain. This typically uses the local cluster DNS. probeInterval metav1.Duration No 10s Interval with which each probe will run. initialDelay metav1.Duration No 30s Initial delay for the probe to become active. 
Only applicable when the probe is created for the first time. probeTimeout metav1.Duration No 30s In each run of the probe it will attempt to connect to the Shoot Kube ApiServer. probeTimeout defines the timeout after which a single run of the probe will fail. backoffJitterFactor float64 No 0.2 Jitter with which a probe is run. dependentResourceInfos []prober.DependentResourceInfo Yes NA Detailed below. kcmNodeMonitorGraceDuration metav1.Duration Yes NA It is the node-monitor-grace-period set in the kcm flags. Used to determine whether a node lease can be considered expired. nodeLeaseFailureFraction float64 No 0.6 is used to determine the maximum number of leases that can be expired for a lease probe to succeed. DependentResourceInfo If a lease probe fails, then it scales down the dependent resources defined by this property. Similarly, if the lease probe is now successful, then it scales up the dependent resources defined by this property.\nEach dependent resource info has the following properties:\n Name Type Required Default Value Description ref autoscalingv1.CrossVersionObjectReference Yes NA It is a collection of ApiVersion, Kind and Name for a kubernetes resource thus serving as an identifier. optional bool Yes NA It is possible that a dependent resource is optional for a Shoot control plane. This property enables a probe to determine the correct behavior in case it is unable to find the resource identified via ref. scaleUp prober.ScaleInfo No Captures the configuration to scale up this resource. Detailed below. scaleDown prober.ScaleInfo No Captures the configuration to scale down this resource. Detailed below. NOTE: Since each dependent resource is a target for scale up/down, it is mandatory that the resource reference points to a kubernetes resource which has a scale subresource.\n ScaleInfo How to scale a DependentResourceInfo is captured in ScaleInfo. 
It has the following properties:\n Name Type Required Default Value Description level int Yes NA Detailed below. initialDelay metav1.Duration No 0s (No initial delay) Once a decision is taken to scale a resource then via this property a delay can be induced before triggering the scale of the dependent resource. timeout metav1.Duration No 30s Defines the timeout for the scale operation to finish for a dependent resource. Determining target replicas\nProber cannot assume any target replicas during a scale-up operation for the following reasons:\n Kubernetes resources could be set to provide high availability and the number of replicas could vary from one shoot control plane to the other. In Gardener the number of replicas of pods in the shoot namespace is controlled by the shoot control plane configuration. If Horizontal Pod Autoscaler has been configured for a kubernetes dependent resource then it could potentially change the spec.replicas for a deployment/statefulset. Given the above constraints, let's look at how prober determines the target replicas during scale-down or scale-up operations.\n Scale-Up: Primary responsibility of a probe while performing a scale-up is to restore the replicas of a kubernetes dependent resource to the value it had prior to scale-down. In order to do that it updates the following for each dependent resource that requires a scale-up:\n spec.replicas: Checks if dependency-watchdog.gardener.cloud/replicas is set. If it is, then it will take the value stored against this key as the target replicas. To be a valid value it should always be greater than 0. If dependency-watchdog.gardener.cloud/replicas annotation is not present then it falls back to the hard coded default value for scale-up which is set to 1. Removes the annotation dependency-watchdog.gardener.cloud/replicas if it exists. 
Scale-Down: To scale down a dependent kubernetes resource it does the following:\n Adds an annotation dependency-watchdog.gardener.cloud/replicas and sets its value to the current value of spec.replicas. Updates spec.replicas to 0. Level\nEach dependent resource that should be scaled up or down is associated to a level. Levels are ordered and processed in ascending order (starting with 0 assigning it the highest priority). Consider the following configuration:\ndependentResourceInfos: - ref: kind: \"Deployment\" name: \"kube-controller-manager\" apiVersion: \"apps/v1\" scaleUp: level: 1 scaleDown: level: 0 - ref: kind: \"Deployment\" name: \"machine-controller-manager\" apiVersion: \"apps/v1\" scaleUp: level: 1 scaleDown: level: 1 - ref: kind: \"Deployment\" name: \"cluster-autoscaler\" apiVersion: \"apps/v1\" scaleUp: level: 0 scaleDown: level: 2 Let us order the dependent resources by their respective levels for both scale-up and scale-down. We get the following order:\nScale Up Operation\nOrder of scale up will be:\n cluster-autoscaler kube-controller-manager and machine-controller-manager will be scaled up concurrently after cluster-autoscaler has been scaled up. Scale Down Operation\nOrder of scale down will be:\n kube-controller-manager machine-controller-manager after (1) has been scaled down. cluster-autoscaler after (2) has been scaled down. Disable/Ignore Scaling A probe can be configured to ignore scaling of configured dependent kubernetes resources. To do that one must set dependency-watchdog.gardener.cloud/ignore-scaling annotation to true on the scalable resource for which scaling should be ignored.\nWeeder Dependency watchdog weeder command also (just like the prober command) takes command-line-flags which are meant to fine-tune the weeder. 
In addition a ConfigMap is also mounted to the container which helps in defining the dependency of pods on endpoints.\nCommand Line Arguments Weeder can be configured with the same flags as that for prober described under command-line-arguments section You can find an example weeder deployment YAML to see how these command line args are configured.\nWeeder Configuration Weeder configuration is mounted as ConfigMap to the container. The path to the config file is configured via config-file command line argument as mentioned above. Weeder will start one go routine per podSelector per endpoint on an endpoint event as described in weeder internal concepts.\nYou can view the example YAML configuration provided as data in a ConfigMap here.\n Name Type Required Default Value Description watchDuration *metav1.Duration No 5m0s The time duration for which watch is kept on dependent pods to see if anyone turns to CrashLoopBackoff servicesAndDependantSelectors map[string]DependantSelectors Yes NA Endpoint name and its corresponding dependent pods. More info below. DependantSelectors If the service recovers from downtime, then weeder starts to watch for CrashLoopBackOff pods. These pods are identified by info stored in this property.\n Name Type Required Default Value Description podSelectors []*metav1.LabelSelector Yes NA This is a list of Label selector ","categories":"","description":"","excerpt":"Configure Dependency Watchdog Components Prober Dependency watchdog …","ref":"/docs/other-components/dependency-watchdog/deployment/configure/","tags":"","title":"Configure"},{"body":"Configuring the Logging Stack via gardenlet Configurations Enable the Logging In order to install the Gardener logging stack, the logging.enabled configuration option has to be enabled in the Gardenlet configuration:\nlogging: enabled: true From now on, each Seed is going to have a logging stack which will collect logs from all pods and some systemd services. 
Logs related to Shoots with testing purpose are dropped in the fluent-bit output plugin. Shoots with a purpose different than testing have the same type of log aggregator (but different instance) as the Seed. The logs can be viewed in Plutono in the garden namespace for the Seed components and in the respective shoot control plane namespaces.\nEnable Logs from the Shoot’s Node systemd Services The logs from the systemd services on each node can be retrieved by enabling the logging.shootNodeLogging option in the gardenlet configuration:\nlogging: enabled: true shootNodeLogging: shootPurposes: - \"evaluation\" - \"deployment\" Under the shootPurposes section, list all the shoot purposes for which the Shoot node logging feature will be enabled. Specifying the testing purpose has no effect because this purpose prevents the logging stack installation. Logs can be viewed in the operator Plutono. The dedicated labels are unit, syslog_identifier, and nodename in the Explore menu.\nConfiguring Central Vali Storage Capacity By default, the central Vali has 100Gi of storage capacity. To overwrite the current central Vali storage capacity, the logging.vali.garden.storage setting in the gardenlet’s component configuration should be altered. If you need to increase it, you can do so without losing the current data by specifying a higher capacity. By doing so, the Vali’s PersistentVolume capacity will be increased instead of deleting the current PV. 
However, if you specify less capacity, then the PersistentVolume will be deleted and with it the logs, too.\nlogging: enabled: true vali: garden: storage: \"200Gi\" ","categories":"","description":"","excerpt":"Configuring the Logging Stack via gardenlet Configurations Enable the …","ref":"/docs/gardener/deployment/configuring_logging/","tags":"","title":"Configuring Logging"},{"body":"Configuring the Registry Cache Extension Introduction Use Case For a Shoot cluster, the containerd daemon of every Node goes to the internet and fetches an image that it doesn’t have locally in the Node’s image cache. New Nodes are often created due to events such as auto-scaling (scale up), rolling update, or replacement of unhealthy Node. Such a new Node would need to pull all of the images of the Pods running on it from the internet because the Node’s cache is initially empty. Pulling an image from a registry produces network traffic and registry costs. To avoid these network traffic and registry costs, you can use the registry-cache extension to run a registry as pull-through cache.\nThe following diagram shows a rough outline of how an image pull looks like for a Shoot cluster without registry cache: Solution The registry-cache extension deploys and manages a registry in the Shoot cluster that runs as pull-through cache. The used registry implementation is distribution/distribution.\nHow does it work? When the extension is enabled, a registry cache for each configured upstream is deployed to the Shoot cluster. Along with this, the containerd daemon on the Shoot cluster Nodes gets configured to use as a mirror the Service IP address of the deployed registry cache. For example, if a registry cache for upstream docker.io is requested via the Shoot spec, then containerd gets configured to first pull the image from the deployed cache in the Shoot cluster. 
If this image pull operation fails, containerd falls back to the upstream itself (docker.io in that case).\nThe first time an image is requested from the pull-through cache, it pulls the image from the configured upstream registry and stores it locally, before handing it back to the client. On subsequent requests, the pull-through cache is able to serve the image from its own storage.\n [!NOTE] The used registry implementation (distribution/distribution) supports mirroring of only one upstream registry.\n The following diagram shows a rough outline of how an image pull looks like for a Shoot cluster with registry cache: Shoot Configuration The extension is not globally enabled and must be configured per Shoot cluster. The Shoot specification has to be adapted to include the registry-cache extension configuration.\nBelow is an example of registry-cache extension configuration as part of the Shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: crazy-botany namespace: garden-dev spec: extensions: - type: registry-cache providerConfig: apiVersion: registry.extensions.gardener.cloud/v1alpha3 kind: RegistryConfig caches: - upstream: docker.io volume: size: 100Gi # storageClassName: premium - upstream: ghcr.io - upstream: quay.io garbageCollection: ttl: 0s secretReferenceName: quay-credentials - upstream: my-registry.io:5000 remoteURL: http://my-registry.io:5000 # ... resources: - name: quay-credentials resourceRef: apiVersion: v1 kind: Secret name: quay-credentials-v1 The providerConfig field is required.\nThe providerConfig.caches field contains information about the registry caches to deploy. It is a required field. At least one cache has to be specified.\nThe providerConfig.caches[].upstream field is the remote registry host to cache. It is a required field. The value must be a valid DNS subdomain (RFC 1123) and optionally a port (i.e. \u003chost\u003e[:\u003cport\u003e]). 
It must not include a scheme.\nThe providerConfig.caches[].remoteURL optional field is the remote registry URL. If configured, it must include an https:// or http:// scheme. If the field is not configured, the remote registry URL defaults to https://\u003cupstream\u003e. In case the upstream is docker.io, it defaults to https://registry-1.docker.io.\nThe providerConfig.caches[].volume field contains settings for the registry cache volume. The registry-cache extension deploys a StatefulSet with a volume claim template. A PersistentVolumeClaim is created with the configured size and StorageClass name.\nThe providerConfig.caches[].volume.size field is the size of the registry cache volume. Defaults to 10Gi. The size must be a positive quantity (greater than 0). This field is immutable. See Increase the cache disk size on how to resize the disk. The extension defines alerts for the volume. See Alerting for Users on how to enable notifications for Shoot cluster alerts.\nThe providerConfig.caches[].volume.storageClassName field is the name of the StorageClass used by the registry cache volume. This field is immutable. If the field is not specified, then the default StorageClass will be used.\nThe providerConfig.caches[].garbageCollection.ttl field is the time to live of a blob in the cache. If the field is set to 0s, the garbage collection is disabled. Defaults to 168h (7 days). See the Garbage Collection section for more details.\nThe providerConfig.caches[].secretReferenceName is the name of the reference for the Secret containing the upstream registry credentials. To cache images from a private registry, credentials to the upstream registry should be supplied. 
For more details, see How to provide credentials for upstream registry.\n [!NOTE] It is only possible to provide one set of credentials for one private upstream registry.\n Garbage Collection When the registry cache receives a request for an image that is not present in its local store, it fetches the image from the upstream, returns it to the client and stores the image in the local store. The registry cache runs a scheduler that deletes images when their time to live (ttl) expires. When adding an image to the local store, the registry cache also adds a time to live for the image. The ttl defaults to 168h (7 days) and is configurable. The garbage collection can be disabled by setting the ttl to 0s. Requesting an image from the registry cache does not extend the time to live of the image. Hence, an image is always garbage collected from the registry cache store when its ttl expires. At the time of writing this document, there is no functionality for garbage collection based on disk size - e.g., garbage collecting images when a certain disk usage threshold is passed. The garbage collection cannot be enabled once it is disabled. This constraint is added to mitigate distribution/distribution#4249.\nIncrease the Cache Disk Size When there is no available disk space, the registry cache continues to respond to requests. However, it cannot store the remotely fetched images locally because it has no free disk space. In such case, it is simply acting as a proxy without being able to cache the images in its local store. The disk has to be resized to ensure that the registry cache continues to cache images.\nThere are two alternatives to enlarge the cache’s disk size:\n[Alternative 1] Resize the PVC To enlarge the PVC’s size, perform the following steps:\n Make sure that the KUBECONFIG environment variable is targeting the correct Shoot cluster.\n Find the PVC name to resize for the desired upstream. 
The example below fetches the PVC for the docker.io upstream:\nkubectl -n kube-system get pvc -l upstream-host=docker.io Patch the PVC’s size to the desired size. The example below patches the size of a PVC to 10Gi:\nkubectl -n kube-system patch pvc $PVC_NAME --type merge -p '{\"spec\":{\"resources\":{\"requests\": {\"storage\": \"10Gi\"}}}}' Make sure that the PVC gets resized. Describe the PVC to check the resize operation result:\nkubectl -n kube-system describe pvc -l upstream-host=docker.io Drawback of this approach: The cache’s size in the Shoot spec (providerConfig.caches[].volume.size) diverges from the PVC’s size.\n [Alternative 2] Remove and Readd the Cache There is always the option to remove the cache from the Shoot spec and to re-add it with the updated size.\n Drawback of this approach: The already cached images get lost and the cache starts with an empty disk.\n High Availability The registry cache runs with a single replica. This fact may lead to concerns about high availability such as “What happens when the registry cache is down? Does containerd fail to pull the image?”. As outlined in the How does it work? section, containerd is configured to fall back to the upstream registry if it fails to pull the image from the registry cache. Hence, when the registry cache is unavailable, containerd’s image pull operations are not affected because containerd falls back to pulling the image from the upstream registry.\nPossible Pitfalls The used registry implementation (the Distribution project) supports mirroring of only one upstream registry. The extension deploys a pull-through cache for each configured upstream. us-docker.pkg.dev, europe-docker.pkg.dev, and asia-docker.pkg.dev are different upstreams. Hence, configuring pkg.dev as upstream won’t cache images from us-docker.pkg.dev, europe-docker.pkg.dev, or asia-docker.pkg.dev. 
Limitations Images that are pulled before a registry cache Pod is running or before a registry cache Service is reachable from the corresponding Node won’t be cached - containerd will pull these images directly from the upstream.\nThe reasoning behind this limitation is that a registry cache Pod is running in the Shoot cluster. To have a registry cache’s Service cluster IP reachable from containerd running on the Node, the registry cache Pod has to be running and kube-proxy has to configure iptables/IPVS rules for the registry cache Service. If kube-proxy hasn’t configured iptables/IPVS rules for the registry cache Service, then the image pull times (and new Node bootstrap times) will be increased significantly. For more detailed explanations, see point 2. and gardener/gardener-extension-registry-cache#68.\nThat’s why the registry configuration on a Node is applied only after the registry cache Service is reachable from the Node. The gardener-node-agent.service systemd unit sends requests to the registry cache’s Service. Once the registry cache responds with HTTP 200, the unit creates the needed registry configuration file (hosts.toml).\nAs a result, for images from Shoot system components:\n On Shoot creation with the registry cache extension enabled, a registry cache is unable to cache all of the images from the Shoot system components. Usually, until the registry cache Pod is running, containerd pulls from upstream the images from Shoot system components (before the registry configuration gets applied). On new Node creation for existing Shoot with the registry cache extension enabled, a registry cache is unable to cache most of the images from Shoot system components. The reachability of the registry cache Service requires the Service network to be set up, i.e., the kube-proxy for that new Node to be running and to have set up iptables/IPVS configuration for the registry cache Service. 
containerd requests will time out in 30s in case kube-proxy hasn’t configured iptables/IPVS rules for the registry cache Service - the image pull times will increase significantly.\ncontainerd is configured to fall back to the upstream itself if a request against the cache fails. However, if the cluster IP of the registry cache Service does not exist or if kube-proxy hasn’t configured iptables/IPVS rules for the registry cache Service, then containerd requests against the registry cache time out in 30 seconds. This significantly increases the image pull times because containerd does multiple requests as part of the image pull (HEAD request to resolve the manifest by tag, GET request for the manifest by SHA, GET requests for blobs)\nExample: If the Service of a registry cache is deleted, then a new Service will be created. containerd’s registry config will still contain the old Service’s cluster IP. containerd requests against the old Service’s cluster IP will time out and containerd will fall back to upstream.\n Image pull of docker.io/library/alpine:3.13.2 from the upstream takes ~2s while image pull of the same image with invalid registry cache cluster IP takes ~2m.2s. Image pull of eu.gcr.io/gardener-project/gardener/ops-toolbelt:0.18.0 from the upstream takes ~10s while image pull of the same image with invalid registry cache cluster IP takes ~3m.10s. Amazon Elastic Container Registry is currently not supported. For details see distribution/distribution#4383.\n ","categories":"","description":"Learn what is the use-case for a pull-through cache, how to enable it and configure it","excerpt":"Learn what is the use-case for a pull-through cache, how to enable it …","ref":"/docs/extensions/others/gardener-extension-registry-cache/registry-cache/configuration/","tags":"","title":"Configuring the Registry Cache Extension"},{"body":"Configuring the Registry Mirror Extension Introduction Use Case containerd allows registry mirrors to be configured. 
Use cases are:\n Usage of public mirror(s) - for example, circumvent issues with the upstream registry such as rate limiting, outages, and others. Usage of private mirror(s) - for example, reduce network costs by using a private mirror running in the same network. Solution The registry-mirror extension allows registry mirrors to be configured via the Shoot spec directly.\nHow does it work? When the extension is enabled, the containerd daemon on the Shoot cluster Nodes gets configured to use the requested mirrors. For example, if for the upstream docker.io the mirror https://mirror.gcr.io is configured in the Shoot spec, then containerd gets configured to first pull the image from the mirror (https://mirror.gcr.io in that case). If this image pull operation fails, containerd falls back to the upstream itself (docker.io in that case).\nThe extension is based on the contract described in containerd Registry Configuration. The corresponding upstream documentation in containerd is Registry Configuration - Introduction.\nShoot Configuration The Shoot specification has to be adapted to include the registry-mirror extension configuration.\nBelow is an example of registry-mirror extension configuration as part of the Shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: crazy-botany namespace: garden-dev spec: extensions: - type: registry-mirror providerConfig: apiVersion: mirror.extensions.gardener.cloud/v1alpha1 kind: MirrorConfig mirrors: - upstream: docker.io hosts: - host: \"https://mirror.gcr.io\" capabilities: [\"pull\"] The providerConfig field is required.\nThe providerConfig.mirrors field contains information about the registry mirrors to configure. It is a required field. At least one mirror has to be specified.\nThe providerConfig.mirrors[].upstream field is the remote registry host to mirror. It is a required field. The value must be a valid DNS subdomain (RFC 1123) and optionally a port (i.e. 
\u003chost\u003e[:\u003cport\u003e]). It must not include a scheme.\nThe providerConfig.mirrors[].hosts field represents the mirror hosts to be used for the upstream. At least one mirror host has to be specified.\nThe providerConfig.mirrors[].hosts[].host field is the mirror host. It is a required field. The value must include a scheme - http:// or https://.\nThe providerConfig.mirrors[].hosts[].capabilities field represents the operations a host is capable of performing. This also represents the set of operations that the mirror host may be trusted to perform. Defaults to [\"pull\"]. The supported values are pull and resolve. See the capabilities field documentation for more information on which operations are considered trusted ones against public/private mirrors.\n","categories":"","description":"Learn what is the use-case for a registry mirror, how to enable and configure it","excerpt":"Learn what is the use-case for a registry mirror, how to enable and …","ref":"/docs/extensions/others/gardener-extension-registry-cache/registry-mirror/configuration/","tags":"","title":"Configuring the Registry Mirror Extension"},{"body":"Connect Kubectl In Kubernetes, the configuration for accessing your cluster is in a format known as kubeconfig, which is stored as a file. It contains details such as cluster API server addresses and access credentials or a command to obtain access credentials from a kubectl credential plugin. In general, treat a kubeconfig as sensitive data. Tools like kubectl use the kubeconfig to connect and authenticate to a cluster and perform operations on it. Learn more about kubeconfig and kubectl on kubernetes.io.\nTools In this guide, we reference the following tools:\n kubectl: Command-line tool for running commands against Kubernetes clusters. It allows you to control various aspects of your cluster, such as creating or modifying resources, viewing resource status, and debugging your applications. 
kubelogin: kubectl credential plugin used for OIDC authentication, which is required for the (OIDC) Garden cluster kubeconfig gardenlogin: kubectl credential plugin used for Shoot authentication as system:masters, which is required for the (gardenlogin) Shoot cluster kubeconfig gardenctl: Optional. Command-line tool to administrate one or many Garden, Seed and Shoot clusters. Use this tool to setup gardenlogin and gardenctl itself, configure access to clusters and configure cloud provider CLI tools. Connect Kubectl to a Shoot Cluster In order to connect to a Shoot cluster, you first have to install and setup gardenlogin.\nYou can obtain the kubeconfig for the Shoot cluster either by downloading it from the Gardener dashboard or by copying the gardenctl target command from the dashboard and executing it.\nSetup Gardenlogin Prerequisites You are logged on to the Gardener dashboard. The dashboard admin has configured OIDC for the dashboard. You have installed kubelogin You have installed gardenlogin To setup gardenlogin, you need to:\n Download the kubeconfig for the Garden cluster Configure gardenlogin Download Kubeconfig for the Garden Cluster Navigate to the MY ACCOUNT page on the dashboard by clicking on the user avatar -\u003e MY ACCOUNT. Under the Access section, download the kubeconfig. Configure Gardenlogin Configure gardenlogin by following the installation instruction on the dashboard:\n Select your project from the dropdown on the left Choose CLUSTERS and select your cluster in the list. Choose the Show information about gardenlogin info icon and follow the configuration hints. [!IMPORTANT] Use the previously downloaded kubeconfig for the Garden cluster as the kubeconfig path. 
Do not use the gardenlogin Shoot cluster kubeconfig here.\n Download and Setup Kubeconfig for a Shoot Cluster The gardenlogin kubeconfig for the Shoot cluster can be obtained in various ways:\n Copy and run the gardenctl target command from the dashboard Download from the Gardener dashboard Copy and Run gardenctl target Command Using the gardenctl target command you can quickly set or switch between clusters. The command sets the scope for the next operation, e.g., it ensures that the KUBECONFIG env variable always points to the currently targeted cluster.\nTo target a Shoot cluster:\n Copy the gardenctl target command from the dashboard\n Paste and run the command in the terminal application, for example:\n $ gardenctl target --garden landscape-dev --project core --shoot mycluster Successfully targeted shoot \"mycluster\" Your KUBECONFIG env variable is now pointing to the current target (also visible with gardenctl target view -o yaml). You can now run kubectl commands against your Shoot cluster.\n$ kubectl get namespaces The command connects to the cluster and lists its namespaces.\nKUBECONFIG Env Var not Setup Correctly If your KUBECONFIG env variable does not point to the current target, you will see the following message after running the gardenctl target command:\nWARN The KUBECONFIG environment variable does not point to the current target of gardenctl. Run `gardenctl kubectl-env --help` on how to configure the KUBECONFIG environment variable accordingly In this case you would need to run the following command (assuming bash as your current shell). For other shells, consult the gardenctl kubectl-env --help documentation.\n$ eval \"$(gardenctl kubectl-env bash)\" Download from Dashboard Select your project from the dropdown on the left, then choose CLUSTERS and locate your cluster in the list. 
Choose the key icon to bring up a dialog with the access options.\nIn the Kubeconfig - Gardenlogin section the options are to show gardenlogin info, download, copy or view the kubeconfig for the cluster.\nThe same options are also available in the Access section in the cluster details screen. To find it, choose a cluster from the list.\n Choose the download icon to download the kubeconfig as a file on your local system.\n Connecting to the Cluster In the following command, replace \u003cpath-to-gardenlogin-kubeconfig\u003e with the actual path to the file where you stored the kubeconfig downloaded in the previous step.\n$ kubectl --kubeconfig=\u003cpath-to-gardenlogin-kubeconfig\u003e get namespaces The command connects to the cluster and lists its namespaces.\nExporting KUBECONFIG environment variable Since many kubectl commands will be used, it’s a good idea to take advantage of every opportunity to shorten the expressions. The kubectl tool has a fallback strategy for looking up a kubeconfig to work with. For example, it looks for the KUBECONFIG environment variable, whose value is the path to the kubeconfig file meant to be used. Export the variable:\n$ export KUBECONFIG=\u003cpath-to-gardenlogin-kubeconfig\u003e Again, replace \u003cpath-to-gardenlogin-kubeconfig\u003e with the actual path to the kubeconfig for the cluster you want to connect to.\nWhat’s next? Using Dashboard Terminal ","categories":"","description":"","excerpt":"Connect Kubectl In Kubernetes, the configuration for accessing your …","ref":"/docs/dashboard/connect-kubectl/","tags":"","title":"Connect Kubectl"},{"body":"Connectivity Shoot Connectivity We measure the connectivity from the shoot to the API Server. This is done via the blackbox exporter which is deployed in the shoot’s kube-system namespace. Prometheus will scrape the blackbox exporter and then the exporter will try to access the API Server. Metrics are exposed that indicate whether the connection was successful or not. 
This can be seen in the Kubernetes Control Plane Status dashboard under the API Server Connectivity panel. The shoot line represents the connectivity from the shoot.\nSeed Connectivity In addition to the shoot connectivity, we also measure the seed connectivity. This means trying to reach the API Server from the seed via the external fully qualified domain name of the API server. The connectivity is also displayed in the above panel as the seed line. Both seed and shoot connectivity are shown below.\n","categories":["Users"],"description":"","excerpt":"Connectivity Shoot Connectivity We measure the connectivity from the …","ref":"/docs/gardener/monitoring/connectivity/","tags":"","title":"Connectivity"},{"body":"Problem Two of the most common causes of this problem are specifying the wrong container image or trying to use private images without providing registry credentials.\nNote There is no observable difference in pod status between a missing image and incorrect registry permissions. In either case, Kubernetes will report an ErrImagePull status for the pods. For this reason, this article deals with both scenarios. Example Let’s see an example. 
We’ll create a pod named fail, referencing a non-existent Docker image:\nkubectl run -i --tty fail --image=tutum/curl:1.123456 The command doesn’t return and you can terminate the process with Ctrl+C.\nError Analysis We can then inspect our pods and see that we have one pod with a status of ErrImagePull or ImagePullBackOff.\n$ (minikube) kubectl get pods NAME READY STATUS RESTARTS AGE client-5b65b6c866-cs4ch 1/1 Running 1 1m fail-6667d7685d-7v6w8 0/1 ErrImagePull 0 \u003cinvalid\u003e vuejs-578574b75f-5x98z 1/1 Running 0 1d $ (minikube) For some additional information, we can describe the failing pod.\nkubectl describe pod fail-6667d7685d-7v6w8 As you can see in the events section, your image can’t be pulled:\nName: fail-6667d7685d-7v6w8 Namespace: default Node: minikube/192.168.64.10 Start Time: Wed, 22 Nov 2017 10:01:59 +0100 Labels: pod-template-hash=2223832418 run=fail Annotations: kubernetes.io/created-by={\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicaSet\",\"namespace\":\"default\",\"name\":\"fail-6667d7685d\",\"uid\":\"cc4ccb3f-cf63-11e7-afca-4a7a1fa05b3f\",\"a... . . . . 
Events: FirstSeen\tLastSeen\tCount\tFrom\tSubObjectPath\tType\tReason\tMessage ---------\t--------\t-----\t----\t-------------\t--------\t------\t------- 1m\t1m\t1\tdefault-scheduler\tNormal\tScheduled\tSuccessfully assigned fail-6667d7685d-7v6w8 to minikube 1m\t1m\t1\tkubelet, minikube\tNormal\tSuccessfulMountVolume\tMountVolume.SetUp succeeded for volume \"default-token-9fr6r\" 1m\t6s\t4\tkubelet, minikube\tspec.containers{fail}\tNormal\tPulling\tpulling image \"tutum/curl:1.123456\" 1m\t5s\t4\tkubelet, minikube\tspec.containers{fail}\tWarning\tFailed\tFailed to pull image \"tutum/curl:1.123456\": rpc error: code = Unknown desc = Error response from daemon: manifest for tutum/curl:1.123456 not found 1m\t\u003cinvalid\u003e\t10\tkubelet, minikube\tWarning\tFailedSync\tError syncing pod 1m\t\u003cinvalid\u003e\t6\tkubelet, minikube\tspec.containers{fail}\tNormal\tBackOff\tBack-off pulling image \"tutum/curl:1.123456\" Why couldn’t Kubernetes pull the image? There are three primary candidates besides network connectivity issues:\n The image tag is incorrect The image doesn’t exist Kubernetes doesn’t have permissions to pull that image If you don’t notice a typo in your image tag, then it’s time to test using your local machine. I usually start by running docker pull on my local development machine with the exact same image tag. In this case, I would run docker pull tutum/curl:1.123456.\nIf this succeeds, then it probably means that Kubernetes doesn’t have the correct permissions to pull that image.\nAdd the docker registry user/pwd to your cluster:\nkubectl create secret docker-registry dockersecret --docker-server=https://index.docker.io/v1/ --docker-username=\u003cusername\u003e --docker-password=\u003cpassword\u003e --docker-email=\u003cemail\u003e If the exact image tag fails, then I will test without an explicit image tag:\ndocker pull tutum/curl This command will attempt to pull the latest tag. 
If this succeeds, then that means the originally specified tag doesn’t exist. Go to the Docker registry and check which tags are available for this image.\nIf docker pull tutum/curl (without an exact tag) fails, then we have a bigger problem - that image does not exist at all in our image registry.\n","categories":"","description":"Wrong Container Image or Invalid Registry Permissions","excerpt":"Wrong Container Image or Invalid Registry Permissions","ref":"/docs/guides/applications/missing-registry-permission/","tags":"","title":"Container Image Not Pulled"},{"body":"Introduction A container image should use a fixed tag or the SHA of the image. It should not use the tags latest, head, canary, or other tags that are designed to be floating.\nProblem If you have encountered this issue, you have probably done something along the lines of:\n Deploy anything using an image tag (e.g., cp-enablement/awesomeapp:1.0) Fix a bug in awesomeapp Build a new image and push it with the same tag (cp-enablement/awesomeapp:1.0) Update the deployment Realize that the bug is still present Repeat steps 3-5 without any improvement The problem relates to how Kubernetes decides whether to do a docker pull when starting a container. Since we tagged our image as :1.0, the default pull policy is IfNotPresent. The Kubelet already has a local copy of cp-enablement/awesomeapp:1.0, so it doesn’t attempt to do a docker pull. When the new Pods come up, they’re still using the old broken Docker image.\nThere are a couple of ways to resolve this, with the recommended one being to use unique tags.\nSolution In order to fix the problem, you can use the following bash script that runs anytime the deployment is updated to create a new tag and push it to the registry.\n#!/usr/bin/env bash # Set the docker image name and the corresponding repository # Ensure that you change them in the deployment.yml as well. # You must be logged in with docker login. 
# # CHANGE THIS TO YOUR Docker.io SETTINGS # PROJECT=awesomeapp REPOSITORY=cp-enablement # causes the shell to exit if any subcommand or pipeline returns a non-zero status. # set -e # set debug mode # set -x # build my nodeJS app # npm run build # get the latest version ID from the Docker.io registry and increment it # VERSION=$(curl https://registry.hub.docker.com/v1/repositories/$REPOSITORY/$PROJECT/tags | sed -e 's/[][]//g' -e 's/\"//g' -e 's/ //g' | tr '}' '\\n' | awk -F: '{print $3}' | grep v| tail -n 1) VERSION=${VERSION:1} ((VERSION++)) VERSION=\"v$VERSION\" # build the new docker image # echo '\u003e\u003e\u003e Building new image' docker build -t $REPOSITORY/$PROJECT:$VERSION . echo '\u003e\u003e\u003e Push new image' docker push $REPOSITORY/$PROJECT:$VERSION ","categories":"","description":"Updating images in your cluster during development","excerpt":"Updating images in your cluster during development","ref":"/docs/guides/applications/image-pull-policy/","tags":"","title":"Container Image Not Updating"},{"body":"containerd Registry Configuration containerd supports configuring registries and mirrors. Using this native containerd feature, Shoot owners can configure containerd to use public or private mirrors for a given upstream registry. More details about the registry configuration can be found in the corresponding upstream documentation.\ncontainerd Registry Configuration Patterns At the time of writing this document, containerd supports two patterns for configuring registries/mirrors.\n Note: Trying to use both of the patterns at the same time is not supported by containerd. Exactly one of the configuration patterns has to be followed.\n Old and Deprecated Pattern The old and deprecated pattern is specifying registry.mirrors and registry.configs in containerd’s config.toml file. See the upstream documentation. 
Example of the old and deprecated pattern:\nversion = 2 [plugins.\"io.containerd.grpc.v1.cri\".registry] [plugins.\"io.containerd.grpc.v1.cri\".registry.mirrors] [plugins.\"io.containerd.grpc.v1.cri\".registry.mirrors.\"docker.io\"] endpoint = [\"https://public-mirror.example.com\"] In the above example, containerd is configured to first try to pull docker.io images from a configured endpoint (https://public-mirror.example.com). If the image is not available in https://public-mirror.example.com, then containerd will fall back to the upstream registry (docker.io) and will pull the image from there.\nHosts Directory Pattern The hosts directory pattern is the new and recommended pattern for configuring registries. It is available starting with containerd@v1.5.0. See the upstream documentation. The above example in the hosts directory pattern looks as follows. The /etc/containerd/config.toml file has the following section:\nversion = 2 [plugins.\"io.containerd.grpc.v1.cri\".registry] config_path = \"/etc/containerd/certs.d\" The following hosts directory structure has to be created:\n$ tree /etc/containerd/certs.d /etc/containerd/certs.d └── docker.io └── hosts.toml Finally, for the docker.io upstream registry, we configure a hosts.toml file as follows:\nserver = \"https://registry-1.docker.io\" [host.\"https://public-mirror.example.com\"] capabilities = [\"pull\", \"resolve\"] Configuring containerd Registries for a Shoot Gardener supports configuring containerd registries on a Shoot using the new hosts directory pattern. For each Shoot Node, Gardener creates the /etc/containerd/certs.d directory and adds the following section to containerd’s /etc/containerd/config.toml file:\n[plugins.\"io.containerd.grpc.v1.cri\".registry] # gardener-managed config_path = \"/etc/containerd/certs.d\" This allows Shoot owners to use the hosts directory pattern to configure registries for containerd. 
To do this, the Shoot owners need to create a directory under /etc/containerd/certs.d that is named with the upstream registry host name. In the newly created directory, a hosts.toml file needs to be created. For more details, see the hosts directory pattern section and the upstream documentation.\nThe registry-cache Extension There is a Gardener-native extension named registry-cache that supports:\n Configuring containerd registry mirrors based on the above-described contract. The feature is added in registry-cache@v0.6.0. Running pull through cache(s) in the Shoot. For more details, see the registry-cache documentation.\n","categories":"","description":"","excerpt":"containerd Registry Configuration containerd supports configuring …","ref":"/docs/gardener/containerd-registry-configuration/","tags":"","title":"containerd Registry Configuration"},{"body":"Gardener Container Runtime Extension At the lowest layers of a Kubernetes node is the software that, among other things, starts and stops containers. It is called “Container Runtime”. The most widely known container runtime is Docker, but it is not alone in this space. In fact, the container runtime space has been rapidly evolving.\nKubernetes supports different container runtimes using Container Runtime Interface (CRI) – a plugin interface which enables kubelet to use a wide variety of container runtimes.\nGardener supports creation of Worker machines using CRI. For more information, see CRI Support.\nMotivation Prior to the Container Runtime Extensibility concept, Gardener used Docker as the only container runtime to use in shoot worker machines. 
Because of the wide variety of different container runtimes offering multiple important features (for example, enhanced security concepts), it is important to enable end users to use other container runtimes as well.\nThe ContainerRuntime Extension Resource Here is what a typical ContainerRuntime resource would look like:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: ContainerRuntime metadata: name: my-container-runtime spec: binaryPath: /var/bin/containerruntimes type: gvisor workerPool: name: worker-ubuntu selector: matchLabels: worker.gardener.cloud/pool: worker-ubuntu Gardener deploys one ContainerRuntime resource per worker pool per CRI. To exemplify this, consider a Shoot having two worker pools (worker-one, worker-two) using containerd as the CRI as well as gvisor and kata as enabled container runtimes. Gardener would deploy four ContainerRuntime resources. For worker-one: one ContainerRuntime for type gvisor and one for type kata. The same resources are deployed for worker-two.\nSupporting a New Container Runtime Provider To add support for another container runtime (e.g., gvisor, kata-containers), a container runtime extension controller needs to be implemented. It should support Gardener’s supported CRI plugins.\nThe container runtime extension should install the necessary resources into the shoot cluster (e.g., RuntimeClasses), and it should copy the runtime binaries to the relevant worker machines into the path given by spec.binaryPath. Gardener labels the shoot nodes according to the CRI configured: worker.gardener.cloud/cri-name=\u003cvalue\u003e (e.g., worker.gardener.cloud/cri-name=containerd) and adds one label for each of the container runtimes configured for the shoot Worker machine: containerruntime.worker.gardener.cloud/\u003ccontainer-runtime-type-value\u003e=true (e.g., containerruntime.worker.gardener.cloud/gvisor=true). 
The way to install the binaries is by creating a daemon set which copies the binaries from an image in a docker registry to the relevant labeled Worker’s nodes (avoid downloading binaries from the internet, so that isolated environments are catered for as well).\nFor additional reference, please have a look at the runtime-gvisor provider extension, which provides more information on how to configure the necessary charts, as well as the actuators required to reconcile container runtime inside the Shoot cluster to the desired state.\n","categories":"","description":"","excerpt":"Gardener Container Runtime Extension At the lowest layers of a …","ref":"/docs/gardener/extensions/containerruntime/","tags":"","title":"ContainerRuntime"},{"body":"You are welcome to contribute code to Gardener in order to fix a bug or to implement a new feature.\nThe following rules govern code contributions:\n Contributions must be licensed under the Apache 2.0 License You need to sign the Contributor License Agreement. We are using CLA assistant providing a click-through workflow for accepting the CLA. For company contributors, the company additionally needs to sign a corporate license agreement. See the following sections for details. ","categories":"","description":"","excerpt":"You are welcome to contribute code to Gardener in order to fix a bug …","ref":"/docs/contribute/code/","tags":"","title":"Contributing Code"},{"body":"You are welcome to contribute documentation to Gardener.\nThe following rules govern documentation contributions:\n Contributions must be licensed under the Creative Commons Attribution 4.0 International License You need to sign the Contributor License Agreement. We are using CLA assistant providing a click-through workflow for accepting the CLA. For company contributors, the company additionally needs to sign a corporate license agreement. See the following sections for details. 
","categories":"","description":"","excerpt":"You are welcome to contribute documentation to Gardener.\nThe following …","ref":"/contribute/docs/","tags":"","title":"Contributing Documentation"},{"body":"How to contribute? Contributions are always welcome!\nIn order to contribute ensure that you have the development environment setup and you familiarize yourself with required steps to build, verify-quality and test.\nSetting up development environment Installing Go Minimum Golang version required: 1.18. On MacOS run:\nbrew install go For other OS, follow the installation instructions.\nInstalling Git Git is used as version control for dependency-watchdog. On MacOS run:\nbrew install git If you do not have git installed already then please follow the installation instructions.\nInstalling Docker In order to test dependency-watchdog containers you will need a local kubernetes setup. Easiest way is to first install Docker. This becomes a pre-requisite to setting up either a vanilla KIND/minikube cluster or a local Gardener cluster.\nOn MacOS run:\nbrew install -cash docker For other OS, follow the installation instructions.\nInstalling Kubectl To interact with the local Kubernetes cluster you will need kubectl. On MacOS run:\nbrew install kubernetes-cli For other IS, follow the installation instructions.\nGet the sources Clone the repository from Github:\ngit clone https://github.com/gardener/dependency-watchdog.git Using Makefile For every change following make targets are recommended to run.\n# build the code changes \u003e make build # ensure that all required checks pass \u003e make verify # this will check formatting, linting and will run unit tests # if you do not wish to run tests then you can use the following make target. \u003e make check All tests should be run and the test coverage should ideally not reduce. 
Please ensure that you have read the testing guidelines.\nBefore raising a pull request, ensure that every new file you introduce has a license header. To add license headers you can run this make target:\n\u003e make add-license-headers # This will add license headers to any file which does not already have it. NOTE: Also have a look at the Makefile as it has other targets that are not mentioned here.\n Raising a Pull Request To raise a pull request do the following:\n Create a fork of dependency-watchdog Add dependency-watchdog as upstream remote via git remote add upstream https://github.com/gardener/dependency-watchdog It is recommended that you create a git branch and push all your changes for the pull-request. Ensure that while you work on your pull-request, you continue to rebase the changes from upstream to your branch. To do that execute the following command: git pull --rebase upstream master We prefer clean commits. If you have multiple commits in the pull-request, then squash the commits to a single commit. You can do this via interactive git rebase command. For example if your PR branch is ahead of remote origin HEAD by 5 commits then you can execute the following command and pick the first commit and squash the remaining commits. git rebase -i HEAD~5 #actual number from the head will depend upon how many commits your branch is ahead of remote origin master ","categories":"","description":"","excerpt":"How to contribute? Contributions are always welcome!\nIn order to …","ref":"/docs/other-components/dependency-watchdog/contribution/","tags":"","title":"Contribution"},{"body":"Endpoints and Ports of a Shoot Control-Plane With the reversed VPN tunnel, there are no endpoints with open ports in the shoot cluster required by Gardener. In order to allow communication to the shoot’s control-plane in the seed cluster, there are endpoints shared by multiple shoots of a seed cluster. 
Depending on the configured zones or exposure classes, there are different endpoints in a seed cluster. The IP address(es) can be determined by a DNS query for the API Server URL. The main entry-point into the seed cluster is the load balancer of the Istio ingress-gateway service. Depending on the infrastructure provider, there can be one IP address per zone.\nThe load balancer of the Istio ingress-gateway service exposes the following TCP ports:\n 443 for requests to the shoot API Server. The request is dispatched according to the set TLS SNI extension. 8443 for requests to the shoot API Server via api-server-proxy, dispatched based on the proxy protocol target, which is the IP address of kubernetes.default.svc.cluster.local in the shoot. 8132 to establish the reversed VPN connection. It’s dispatched according to an HTTP header value. kube-apiserver via SNI DNS entries for api.\u003cexternal-domain\u003e and api.\u003cshoot\u003e.\u003cproject\u003e.\u003cinternal-domain\u003e point to the load balancer of an Istio ingress-gateway service. The Kubernetes client sets the server name to api.\u003cexternal-domain\u003e or api.\u003cshoot\u003e.\u003cproject\u003e.\u003cinternal-domain\u003e. Based on SNI, the connection is forwarded to the respective API Server at TCP layer. There is no TLS termination at the Istio ingress-gateway. TLS termination happens on the shoot’s API Server. Traffic is end-to-end encrypted between the client and the API Server. The certificate authority and authentication are defined in the corresponding kubeconfig. Details can be found in GEP-08.\nkube-apiserver via apiserver-proxy Inside the shoot cluster, the API Server can also be reached by the cluster internal name kubernetes.default.svc.cluster.local. The apiserver-proxy pods are deployed in the host network as a daemonset and intercept connections to the Kubernetes service IP address. 
The destination address is changed to the cluster IP address of the service kube-apiserver.\u003cshoot-namespace\u003e.svc.cluster.local in the seed cluster. The connections are forwarded via the HAProxy PROXY protocol to the Istio ingress-gateway in the seed cluster. The Istio ingress-gateway forwards the connection to the respective shoot API Server by its cluster IP address. As TLS termination happens at the API Server, the traffic is end-to-end encrypted the same way as with SNI.\nDetails can be found in GEP-11.\nReversed VPN Tunnel As the API Server has to be able to connect to endpoints in the shoot cluster, a VPN connection is established. This VPN connection is initiated from a VPN client in the shoot cluster. The VPN client connects to the Istio ingress-gateway and is forwarded to the VPN server in the control-plane namespace of the shoot. Once the VPN tunnel between the VPN client in the shoot and the VPN server in the seed cluster is established, the API Server can connect to nodes, services and pods in the shoot cluster.\nMore details can be found in the usage document and GEP-14.\n","categories":"","description":"","excerpt":"Endpoints and Ports of a Shoot Control-Plane With the reversed VPN …","ref":"/docs/gardener/control-plane-endpoints-and-ports/","tags":"","title":"Control Plane Endpoints And Ports"},{"body":"Control Plane Migration Prerequisites The Seeds involved in the control plane migration must have backups enabled - their .spec.backup fields cannot be nil.\nShootState ShootState is an API resource which stores non-reconstructible state and data required to completely recreate a Shoot’s control plane on a new Seed. 
The ShootState resource is created on Shoot creation in its Project namespace and the required state/data is persisted during Shoot creation or reconciliation.\nShoot Control Plane Migration Triggering the migration is done by changing the Shoot’s .spec.seedName to a Seed that differs from the .status.seedName, we call this Seed a \"Destination Seed\". This action can only be performed by an operator (see Triggering the Migration). If the Destination Seed does not have a backup and restore configuration, the change to spec.seedName is rejected. Additionally, this Seed must not be set for deletion and must be healthy.\nIf the Shoot has different .spec.seedName and .status.seedName, a process is started to prepare the Control Plane for migration:\n .status.lastOperation is changed to Migrate. Kubernetes API Server is stopped and the extension resources are annotated with gardener.cloud/operation=migrate. Full snapshot of the ETCD is created and terminating of the Control Plane in the Source Seed is initiated. If the process is successful, we update the status of the Shoot by setting the .status.seedName to the null value. That way, a restoration is triggered in the Destination Seed and .status.lastOperation is changed to Restore. 
The control plane migration is completed when the Restore operation has completed successfully.\nThe etcd backups will be copied over to the BackupBucket of the Destination Seed during control plane migration and any future backups will be uploaded there.\nTriggering the Migration For control plane migration, operators with the necessary RBAC can use the shoots/binding subresource to change the .spec.seedName, with the following commands:\nNAMESPACE=my-namespace SHOOT_NAME=my-shoot DEST_SEED_NAME=destination-seed kubectl get --raw /apis/core.gardener.cloud/v1beta1/namespaces/${NAMESPACE}/shoots/${SHOOT_NAME} | jq -c '.spec.seedName = \"'${DEST_SEED_NAME}'\"' | kubectl replace --raw /apis/core.gardener.cloud/v1beta1/namespaces/${NAMESPACE}/shoots/${SHOOT_NAME}/binding -f - | jq -r '.spec.seedName' [!IMPORTANT] When migrating Shoots to a Destination Seed with different provider type from the Source Seed, make sure of the following:\nPods running in the Destination Seed must have network connectivity to the backup storage provider of the Source Seed so that etcd backups can be copied successfully. Otherwise, the Restore operation will get stuck at the Waiting until etcd backups are copied step. However, if you do end up in this case, you can still finish the control plane migration by following the guide to manually copy etcd backups.\nThe nodes of your Shoot cluster must have network connectivity to the Shoot’s kube-apiserver and the vpn-seed-server once they are migrated to the Destination Seed. Otherwise, the Restore operation will get stuck at the Waiting until the Kubernetes API server can connect to the Shoot workers step. 
However, if you do end up in this case and cannot allow network traffic from the nodes to the Shoot’s control plane, you can annotate the Shoot with the shoot.gardener.cloud/skip-readiness annotation so that the Restore operation finishes, and then use the shoots/binding subresource to migrate the control plane back to the Source Seed.\n Copying ETCD Backups Manually During the Restore Operation Following is a workaround that can be used to copy etcd backups manually in situations where a Shoot’s control plane has been moved to a Destination Seed and the pods running in it lack network connectivity to the Source Seed’s storage provider:\n Follow the instructions in the etcd-backup-restore getting started documentation on how to run the etcdbrctl command locally or in a container. Follow the instructions in the passing-credentials guide on how to set up the required credentials for the copy operation depending on the storage providers for which you want to perform it. Use the etcdbrctl copy command to copy the backups by following the instructions in the etcdbrctl copy guide After you have successfully copied the etcd backups, wait for the EtcdCopyBackupsTask custom resource to be created in the Shoot’s control plane on the Destination Seed, if it does not already exist. 
Afterwards, mark it as successful by patching it using the following command: SHOOT_NAME=my-shoot PROJECT_NAME=my-project kubectl patch -n shoot--${PROJECT_NAME}--${SHOOT_NAME} etcdcopybackupstask ${SHOOT_NAME} --subresource status --type merge -p \"{\\\"status\\\":{\\\"conditions\\\":[{\\\"type\\\":\\\"Succeeded\\\",\\\"status\\\":\\\"True\\\",\\\"reason\\\":\\\"manual copy successful\\\",\\\"message\\\":\\\"manual copy successful\\\",\\\"lastTransitionTime\\\":\\\"$(date -Iseconds)\\\",\\\"lastUpdateTime\\\":\\\"$(date -Iseconds)\\\"}]}}\" After the main-etcd becomes Ready, and the source-etcd-backup secret is deleted from the Shoot’s control plane, remove the finalizer on the source extensions.gardener.cloud/v1alpha1.BackupEntry in the Destination Seed so that it can be deleted successfully (the resource name uses the following format: source-shoot--\u003cproject-name\u003e--\u003cshoot-name\u003e--\u003cuid\u003e). This is necessary as the Destination Seed will not have network connectivity to the Source Seed’s storage provider and the deletion will fail. Once the control plane migration has finished successfully, make sure to manually clean up the source backup directory in the Source Seed’s storage provider. ","categories":"","description":"","excerpt":"Control Plane Migration Prerequisites The Seeds involved in the …","ref":"/docs/gardener/control_plane_migration/","tags":"","title":"Control Plane Migration"},{"body":"Registering Extension Controllers Extensions are registered in the garden cluster via ControllerRegistration resources. Deployment for respective extensions are specified via ControllerDeployment resources. 
Gardener evaluates the registrations and deployments and creates ControllerInstallation resources which describe the request “please install this controller X to this seed Y”.\nSimilar to how CloudProfile or Seed resources get into the system, the Gardener administrator must deploy the ControllerRegistration and ControllerDeployment resources (this does not happen automatically in any way - the administrator decides which extensions shall be enabled).\nThe specification mainly describes which of Gardener’s extension CRDs are managed, for example:\napiVersion: core.gardener.cloud/v1 kind: ControllerDeployment metadata: name: os-gardenlinux helm: ociRepository: ref: registry.example.com/os-gardenlinux/charts/os-gardenlinux:1.0.0 # or a base64-encoded, gzip'ed, tar'ed extension controller chart # rawChart: H4sIFAAAAAAA/yk... values: foo: bar --- apiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration metadata: name: os-gardenlinux spec: deployment: deploymentRefs: - name: os-gardenlinux resources: - kind: OperatingSystemConfig type: gardenlinux primary: true This information tells Gardener that there is an extension controller that can handle OperatingSystemConfig resources of type gardenlinux. A reference to the shown ControllerDeployment specifies how the deployment of the extension controller is accomplished.\nAlso, it specifies that this controller is the primary one responsible for the lifecycle of the OperatingSystemConfig resource. Setting primary to false would allow registering additional, secondary controllers that may also watch/react to the OperatingSystemConfig/gardenlinux resources. However, only the primary controller may change/update the main status of the extension object (which is used to “communicate” with the gardenlet). Particularly, only the primary controller may set .status.lastOperation, .status.lastError, .status.observedGeneration, and .status.state. 
Secondary controllers may contribute to the .status.conditions[] if they like, of course.\nSecondary controllers might be helpful in scenarios where additional tasks need to be completed which are not part of the reconciliation logic of the primary controller but separated out into a dedicated extension.\n⚠️ There must be exactly one primary controller for every registered kind/type combination. Also, please note that the primary field cannot be changed after creation of the ControllerRegistration.\nDeploying Extension Controllers Submitting the above ControllerDeployment and ControllerRegistration will create a ControllerInstallation resource:\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerInstallation metadata: name: os-gardenlinux spec: deploymentRef: name: os-gardenlinux registrationRef: name: os-gardenlinux seedRef: name: aws-eu1 This resource expresses that Gardener requires the os-gardenlinux extension controller to run on the aws-eu1 seed cluster.\ngardener-controller-manager automatically determines which extension is required on which seed cluster and will only create ControllerInstallation objects for those. Also, it will automatically delete ControllerInstallations referencing extension controllers that are no longer required on a seed (e.g., because all shoots on it have been deleted). There are additional configuration options, please see the Deployment Configuration Options section. After gardener-controller-manager has written the ControllerInstallation resource, gardenlet picks it up and installs the controller on the respective Seed using the referenced ControllerDeployment.\nIt is sufficient to create a Helm chart and deploy it together with some static configuration values. For this, operators have to provide the deployment information in the ControllerDeployment.helm section:\n... helm: rawChart: H4sIFAAAAAAA/yk... 
values: foo: bar You can check out hack/generate-controller-registration.yaml for generating a ControllerDeployment including a controller helm chart.\nIf ControllerDeployment.helm is specified, gardenlet either decodes the provided Helm chart (.helm.rawChart) or pulls the chart from the referenced OCI Repository (.helm.ociRepository). When referencing an OCI Repository, you have several options in how to specify where to pull the chart:\nhelm: ociRepository: # full ref with either tag or digest, or both ref: registry.example.com/foo:1.0.0@sha256:abc --- helm: ociRepository: # repository and tag repository: registry.example.com tag: 1.0.0 --- helm: ociRepository: # repository and digest repository: registry.example.com digest: sha256:abc --- helm: ociRepository: # when specifying both tag and digest, the tag is ignored. repository: registry.example.com tag: 1.0.0 digest: sha256:abc Gardenlet caches the downloaded chart in memory. It is recommended to always specify a digest, because if it is not specified, gardenlet needs to fetch the manifest in every reconciliation to compare the digest with the local cache.\nNo matter where the chart originates from, gardenlet deploys it with the provided static configuration (.helm.values). The chart and the values can be updated at any time - Gardener will recognize it and re-trigger the deployment process. 
In order to allow extensions to get information about the garden and the seed cluster, gardenlet mixes in certain properties into the values (root level) of every deployed Helm chart:\ngardener: version: \u003cgardener-version\u003e garden: clusterIdentity: \u003cuuid-of-gardener-installation\u003e genericKubeconfigSecretName: \u003cgeneric-garden-kubeconfig-secret-name\u003e seed: name: \u003cseed-name\u003e clusterIdentity: \u003cseed-cluster-identity\u003e annotations: \u003cseed-annotations\u003e labels: \u003cseed-labels\u003e provider: \u003cseed-provider-type\u003e region: \u003cseed-region\u003e volumeProvider: \u003cseed-first-volume-provider\u003e volumeProviders: \u003cseed-volume-providers\u003e ingressDomain: \u003cseed-ingress-domain\u003e protected: \u003cseed-protected-taint\u003e visible: \u003cseed-visible-setting\u003e taints: \u003cseed-taints\u003e networks: \u003cseed-networks\u003e blockCIDRs: \u003cseed-networks-blockCIDRs\u003e spec: \u003cseed-spec\u003e gardenlet: featureGates: \u003cgardenlet-feature-gates\u003e Extensions can use this information in their Helm chart in case they require knowledge about the garden and the seed environment. The list might be extended in the future.\ngardenlet reports whether the extension controller has been installed successfully and running in the ControllerInstallation status:\nstatus: conditions: - lastTransitionTime: \"2024-05-16T13:04:16Z\" lastUpdateTime: \"2024-05-16T13:04:16Z\" message: The controller running in the seed cluster is healthy. reason: ControllerHealthy status: \"True\" type: Healthy - lastTransitionTime: \"2024-05-16T13:04:06Z\" lastUpdateTime: \"2024-05-16T13:04:06Z\" message: The controller was successfully installed in the seed cluster. reason: InstallationSuccessful status: \"True\" type: Installed - lastTransitionTime: \"2024-05-16T13:04:16Z\" lastUpdateTime: \"2024-05-16T13:04:16Z\" message: The controller has been rolled out successfully. 
reason: ControllerRolledOut status: \"False\" type: Progressing - lastTransitionTime: \"2024-05-16T13:03:39Z\" lastUpdateTime: \"2024-05-16T13:03:39Z\" message: chart could be rendered successfully. reason: RegistrationValid status: \"True\" type: Valid Deployment Configuration Options The .spec.deployment resource allows to configure a deployment policy. There are the following policies:\n OnDemand (default): Gardener will demand the deployment and deletion of the extension controller to/from seed clusters dynamically. It will automatically determine (based on other resources like Shoots) whether it is required and decide accordingly. Always: Gardener will demand the deployment of the extension controller to seed clusters independent of whether it is actually required or not. This might be helpful if you want to add a new component/controller to all seed clusters by default. Another use-case is to minimize the durations until extension controllers get deployed and ready in case you have highly fluctuating seed clusters. AlwaysExceptNoShoots: Similar to Always, but if the seed does not have any shoots, then the extension is not being deployed. It will be deleted from a seed after the last shoot has been removed from it. Also, the .spec.deployment.seedSelector allows to specify a label selector for seed clusters. Only if it matches the labels of a seed, then it will be deployed to it. Please note that a seed selector can only be specified for secondary controllers (primary=false for all .spec.resources[]).\nExtensions in the Garden Cluster Itself The Shoot resource itself will contain some provider-specific data blobs. As a result, some extensions might also want to run in the garden cluster, e.g., to provide ValidatingWebhookConfigurations for validating the correctness of their provider-specific blobs:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-aws namespace: garden-dev spec: ... 
cloud: type: aws region: eu-west-1 providerConfig: apiVersion: aws.cloud.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: # specify either 'id' or 'cidr' # id: vpc-123456 cidr: 10.250.0.0/16 internal: - 10.250.112.0/22 public: - 10.250.96.0/22 workers: - 10.250.0.0/19 zones: - eu-west-1a ... In the above example, Gardener itself does not understand the AWS-specific provider configuration for the infrastructure. However, if this part of the Shoot resource should be validated, then you should run an AWS-specific component in the garden cluster that registers a webhook. You can do it similarly if you want to default some fields of a resource (by using a MutatingWebhookConfiguration).\nAgain, similar to how Gardener is deployed to the garden cluster, these components must be deployed and managed by the Gardener administrator.\nExtension Resource Configurations The Extension resource allows injecting arbitrary steps into the shoot reconciliation flow that are unknown to Gardener. Hence, it is slightly special and allows further configuration when registering it:\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration metadata: name: extension-foo spec: resources: - kind: Extension type: foo primary: true globallyEnabled: true reconcileTimeout: 30s lifecycle: reconcile: AfterKubeAPIServer delete: BeforeKubeAPIServer migrate: BeforeKubeAPIServer The globallyEnabled=true option specifies that the Extension/foo object shall be created by default for all shoots (unless they opted out by setting .spec.extensions[].enabled=false in the Shoot spec).\nThe reconcileTimeout tells Gardener how long it should wait during its shoot reconciliation flow for the Extension/foo’s reconciliation to finish.\nExtension Lifecycle The lifecycle field tells Gardener when to perform a certain action on the Extension resource during the reconciliation flows. If omitted, then the default behaviour will be applied. 
Please find more information on the defaults in the explanation below. Possible values for each control flow are AfterKubeAPIServer, BeforeKubeAPIServer, and AfterWorker. Let’s take the following configuration and explain it.\n ... lifecycle: reconcile: AfterKubeAPIServer delete: BeforeKubeAPIServer migrate: BeforeKubeAPIServer reconcile: AfterKubeAPIServer means that the extension resource will be reconciled after the successful reconciliation of the kube-apiserver during shoot reconciliation. This is also the default behaviour if this value is not specified. During shoot hibernation, the opposite rule is applied, meaning that in this case the reconciliation of the extension will happen before the kube-apiserver is scaled to 0 replicas. On the other hand, if the extension needs to be reconciled before the kube-apiserver and scaled down after it, then the value BeforeKubeAPIServer should be used. delete: BeforeKubeAPIServer means that the extension resource will be deleted before the kube-apiserver is destroyed during shoot deletion. This is the default behaviour if this value is not specified. migrate: BeforeKubeAPIServer means that the extension resource will be migrated before the kube-apiserver is destroyed in the source cluster during control plane migration. This is the default behaviour if this value is not specified. The restoration of the control plane follows the reconciliation control flow. The lifecycle value AfterWorker is only available during reconcile. When specified, the extension resource will be reconciled after the workers are deployed. This is useful for extensions that want to deploy a workload in the shoot control plane and want to wait for the workload to run and get ready on a node. 
During shoot creation, the extension will start its reconciliation before the first workers have joined the cluster; they will become available at some later point.\n","categories":"","description":"","excerpt":"Registering Extension Controllers Extensions are registered in the …","ref":"/docs/gardener/extensions/controllerregistration/","tags":"","title":"ControllerRegistration"},{"body":"Controllers etcd-druid is an operator to manage etcd clusters, and follows the Operator pattern for Kubernetes. It makes use of the Kubebuilder framework which makes it quite easy to define Custom Resources (CRs) such as Etcds and EtcdCopyBackupsTasks through Custom Resource Definitions (CRDs), and define controllers for these CRDs. etcd-druid uses Kubebuilder to define the Etcd CR and its corresponding controllers.\nAll controllers that are a part of etcd-druid reside in package internal/controller, as sub-packages.\nEtcd-druid currently consists of the following controllers, each having its own responsibility:\n etcd : responsible for the reconciliation of the Etcd CR spec, which allows users to run etcd clusters within the specified Kubernetes cluster, and also responsible for periodically updating the Etcd CR status with the up-to-date state of the managed etcd cluster. compaction : responsible for snapshot compaction. etcdcopybackupstask : responsible for the reconciliation of the EtcdCopyBackupsTask CR, which helps perform the job of copying snapshot backups from one object store to another. secret : responsible for making sure Secrets being referenced by Etcd resources are not deleted while in use. Package Structure The typical package structure for the controllers that are part of etcd-druid is shown with the compaction controller:\ninternal/controller/compaction ├── config.go ├── reconciler.go └── register.go config.go: contains all the logic for the configuration of the controller, including feature gate activations, CLI flag parsing and validations. 
register.go: contains the logic for registering the controller with the etcd-druid controller manager. reconciler.go: contains the controller reconciliation logic. Each controller package also contains auxiliary files which are relevant to that specific controller.\nController Manager A manager is first created for all controllers that are a part of etcd-druid. The controller manager is responsible for all the controllers that are associated with CRDs. Once the manager is Start()ed, all the controllers that are registered with it are started.\nEach controller is built using a controller builder, configured with details such as the type of object being reconciled, owned objects whose owner object is reconciled, event filters (predicates), etc. Predicates are filters which allow controllers to filter which type of events the controller should respond to and which ones to ignore.\nThe logic relevant to the controller manager like the creation of the controller manager and registering each of the controllers with the manager, is contained in internal/manager/manager.go.\nEtcd Controller The etcd controller is responsible for the reconciliation of the Etcd resource spec and status. It handles the provisioning and management of the etcd cluster. 
Different components that are required for the functioning of the cluster like Leases, ConfigMaps, and the Statefulset for the etcd cluster are all deployed and managed by the etcd controller.\nAdditionally, etcd controller also periodically updates the Etcd resource status with the latest available information from the etcd cluster, as well as results and errors from the recent-most reconciliation of the Etcd resource spec.\nThe etcd controller is essential to the functioning of the etcd cluster and etcd-druid, thus the minimum number of worker threads is 1 (default being 3), controlled by the CLI flag --etcd-workers.\nEtcd Spec Reconciliation While building the controller, an event filter is set such that the behavior of the controller, specifically for Etcd update operations, depends on the gardener.cloud/operation: reconcile annotation. This is controlled by the --enable-etcd-spec-auto-reconcile CLI flag, which, if set to false, tells the controller to perform reconciliation only when this annotation is present. If the flag is set to true, the controller will reconcile the etcd cluster anytime the Etcd spec, and thus generation, changes, and the next queued event for it is triggered.\n Note: Creation and deletion of Etcd resources are not affected by the above flag or annotation.\n The reason this filter is present is that any disruption in the Etcd resource due to reconciliation (due to changes in the Etcd spec, for example) while workloads are being run would cause unwanted downtimes to the etcd cluster. Hence, any user who wishes to avoid such disruptions, can choose to set the --enable-etcd-spec-auto-reconcile CLI flag to false. An example of this is Gardener’s gardenlet, which reconciles the Etcd resource only during a shoot cluster’s maintenance window.\nThe controller adds a finalizer to the Etcd resource in order to ensure that it does not get deleted until all dependent resources managed by etcd-druid, aka managed components, are properly cleaned up. 
Only the etcd controller can delete a resource once it adds finalizers to it. This ensures that the proper deletion flow steps are followed while deleting the resource. During deletion flow, managed components are deleted in parallel.\nEtcd Status Updates The Etcd resource status is updated periodically by etcd controller, the interval for which is determined by the CLI flag --etcd-status-sync-period.\nStatus fields of the Etcd resource such as LastOperation, LastErrors and ObservedGeneration, are updated to reflect the result of the recent reconciliation of the Etcd resource spec.\n LastOperation holds information about the last operation performed on the etcd cluster, indicated by fields Type, State, Description and LastUpdateTime. Additionally, a field RunID indicates the unique ID assigned to the specific reconciliation run, to allow for better debugging of issues. LastErrors is a slice of errors encountered by the last reconciliation run. Each error consists of fields Code to indicate the custom etcd-druid error code for the error, a human-readable Description, and the ObservedAt time when the error was seen. ObservedGeneration indicates the latest generation of the Etcd resource that etcd-druid has “observed” and consequently reconciled. It helps identify whether a change in the Etcd resource spec was acted upon by druid or not. Status fields of the Etcd resource which correspond to the StatefulSet like CurrentReplicas, ReadyReplicas and Replicas are updated to reflect those of the StatefulSet by the controller.\nStatus fields related to the etcd cluster itself, such as Members, PeerUrlTLSEnabled and Ready are updated as follows:\n Cluster Membership: The controller updates the information about etcd cluster membership like Role, Status, Reason, LastTransitionTime and identifying information like the Name and ID. For the Status field, the member is checked for the Ready condition, where the member can be in Ready, NotReady and Unknown statuses. 
Etcd resource conditions are indicated by status field Conditions. The condition checks that are currently performed are:\n AllMembersReady: indicates readiness of all members of the etcd cluster. Ready: indicates overall readiness of the etcd cluster in serving traffic. BackupReady: indicates health of the etcd backups, i.e., whether etcd backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. DataVolumesReady: indicates health of the persistent volumes containing the etcd data. Compaction Controller The compaction controller deploys the snapshot compaction job whenever required. To understand the rationale behind this controller, please read snapshot-compaction.md. The controller watches the number of events accumulated as part of delta snapshots in the etcd cluster’s backups, and triggers a snapshot compaction when the number of delta events crosses the set threshold, which is configurable through the --etcd-events-threshold CLI flag (1M events by default).\nThe controller watches for changes in snapshot Leases associated with Etcd resources. It checks the full and delta snapshot Leases and calculates the difference in events between the latest delta snapshot and the previous full snapshot, and initiates the compaction job if the event threshold is crossed.\nThe number of worker threads for the compaction controller needs to be greater than or equal to 0 (default 3), controlled by the CLI flag --compaction-workers. This is unlike other controllers which need at least one worker thread for the proper functioning of etcd-druid as snapshot compaction is not a core functionality for the etcd clusters to be deployed. The compaction controller should be explicitly enabled by the user, through the --enable-backup-compaction CLI flag.\nEtcdCopyBackupsTask Controller The etcdcopybackupstask controller is responsible for deploying the etcdbrctl copy command as a job. 
This controller reacts to create/update events arising from EtcdCopyBackupsTask resources, and deploys the EtcdCopyBackupsTask job with source and target backup storage providers as arguments, which are derived from source and target bucket secrets referenced by the EtcdCopyBackupsTask resource.\nThe number of worker threads for the etcdcopybackupstask controller needs to be greater than or equal to 0 (default being 3), controlled by the CLI flag --etcd-copy-backups-task-workers. This is unlike other controllers, which need at least one worker thread for the proper functioning of etcd-druid, because EtcdCopyBackupsTask is not a core functionality for the etcd clusters to be deployed.\nSecret Controller The secret controller’s primary responsibility is to add a finalizer on Secrets referenced by the Etcd resource. The secret controller is registered for Secrets, and the controller keeps a watch on the Etcd CR. This finalizer is added to ensure that Secrets which are referenced by the Etcd CR aren’t deleted while still being used by the Etcd resource.\nEvents arising from the Etcd resource are mapped to a list of Secrets such as backup and TLS secrets that are referenced by the Etcd resource, and are enqueued into the request queue, which the reconciler then acts on.\nThe number of worker threads for the secret controller must be at least 1 (default being 10) for this core controller, controlled by the CLI flag --secret-workers, since the referenced TLS and infrastructure access secrets are essential to the proper functioning of the etcd cluster.\n","categories":"","description":"","excerpt":"Controllers etcd-druid is an operator to manage etcd clusters, and …","ref":"/docs/other-components/etcd-druid/concepts/controllers/","tags":"","title":"Controllers"},{"body":"Controlling the Kubernetes Versions for Specific Worker Pools Since Gardener v1.36, worker pools can have different Kubernetes versions specified than the control plane.\nIn earlier Gardener versions, all worker 
pools inherited the Kubernetes version of the control plane. Once the Kubernetes version of the control plane was modified, all worker pools were updated as well (either by rolling the nodes in case of a minor version change, or in-place for patch version changes).\nIn order to gracefully perform Kubernetes upgrades (triggering a rolling update of the nodes) with workloads sensitive to restarts (e.g., those dealing with lots of data), it might be necessary to perform the upgrade process gradually. In such cases, the Kubernetes version for the worker pools can be pinned (.spec.provider.workers[].kubernetes.version) while the control plane Kubernetes version (.spec.kubernetes.version) is updated. This results in the nodes being untouched while the control plane is upgraded. Now a new worker pool (with the version equal to the control plane version) can be added. Administrators can then reschedule their workloads to the new worker pool according to their upgrade requirements and processes.\nExample Usage in a Shoot spec: kubernetes: version: 1.27.4 provider: workers: - name: data1 kubernetes: version: 1.26.8 - name: data2 If .kubernetes.version is not specified in a worker pool, then the Kubernetes version of the kubelet is inherited from the control plane (.spec.kubernetes.version), i.e., in the above example, the data2 pool will use 1.27.4. If .kubernetes.version is specified in a worker pool, then it must meet the following constraints: It must be at most two minor versions lower than the control plane version. If it was not specified before, then no downgrade is possible (you cannot set it to 1.26.8 while .spec.kubernetes.version is already 1.27.4). The “two minor version skew” is only possible if the worker pool version is set to the control plane version and then the control plane was updated gradually by two minor versions. 
If the version is removed from the worker pool, only one minor version difference is allowed to the control plane (you cannot upgrade a pool from version 1.25.0 to 1.27.0 in one go). Automatic updates of Kubernetes versions (see Shoot Maintenance) also apply to worker pool Kubernetes versions.\n","categories":"","description":"","excerpt":"Controlling the Kubernetes Versions for Specific Worker Pools Since …","ref":"/docs/gardener/worker_pool_k8s_versions/","tags":"","title":"Controlling the Kubernetes Versions for Specific Worker Pools"},{"body":"Contract: ControlPlane Resource Most Kubernetes clusters require a cloud-controller-manager or CSI drivers in order to work properly. Before introducing the ControlPlane extension resource, Gardener had several different Helm charts for the cloud-controller-manager deployments for the various providers. Now, Gardener commissions an external, provider-specific controller to take over this task.\nWhich control plane resources are required? As mentioned in the controlplane customization webhooks document, Gardener shall not deploy any cloud-controller-manager or any other provider-specific component. Instead, it creates a ControlPlane CRD that should be picked up by provider extensions. Its purpose is to trigger the deployment of such provider-specific components in the shoot namespace in the seed cluster.\nWhat needs to be implemented to support a new infrastructure provider? 
As part of the shoot flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: ControlPlane metadata: name: control-plane namespace: shoot--foo--bar spec: type: openstack region: europe-west1 secretRef: name: cloudprovider namespace: shoot--foo--bar providerConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerProvider: provider zone: eu-1a cloudControllerManager: featureGates: CustomResourceValidation: true infrastructureProviderStatus: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus networks: floatingPool: id: vpc-1234 subnets: - purpose: nodes id: subnetid The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used for the shoot cluster. However, the most important sections are the .spec.providerConfig and the .spec.infrastructureProviderStatus. The first one contains an embedded declaration of the provider-specific configuration for the control plane (that cannot be known by Gardener itself). You are responsible for designing what this configuration looks like. Gardener does not evaluate it but just copies this part from what has been provided by the end-user in the Shoot resource. The second one contains the output of the Infrastructure resource (that might be relevant for the CCM config).\nIn order to support a new control plane provider, you need to write a controller that watches all ControlPlanes with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the Alicloud provider.\nThe control plane controller often deploys resources (e.g. pods/deployments) into the Shoot namespace in the Seed as part of its ControlPlane reconciliation loop. 
Because the namespace contains network policies that by default deny all ingress and egress traffic, the pods may need to have proper labels matching the selectors of the network policies in order to allow the required network traffic. Otherwise, they won’t be allowed to talk to certain other components (e.g., the kube-apiserver of the shoot). For more information, see NetworkPolicys In Garden, Seed, Shoot Clusters.\nNon-Provider Specific Information Required for Infrastructure Creation Most providers might require further information that is not provider specific but already part of the shoot resource. One example for this is the GCP control plane controller, which needs the Kubernetes version of the shoot cluster (because it already uses the in-tree Kubernetes cloud-controller-manager). As Gardener cannot know which information is required by providers, it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information that is not part of the Infrastructure resource itself.\nReferences and Additional Resources ControlPlane API (Golang Specification) Exemplary Implementation for the Alicloud Provider ","categories":"","description":"","excerpt":"Contract: ControlPlane Resource Most Kubernetes clusters require a …","ref":"/docs/gardener/extensions/controlplane/","tags":"","title":"ControlPlane"},{"body":"Contract: ControlPlane Resource with Purpose exposure Some Kubernetes clusters require additional deployments required by the seed cloud provider in order to work properly, e.g. AWS Load Balancer Readvertiser. Before using ControlPlane resources with purpose exposure, Gardener had different Helm charts for the deployments for the various providers. Now, Gardener commissions an external, provider-specific controller to take over this task.\nWhich control plane resources are required? 
As mentioned in the controlplane document, Gardener shall not deploy any other provider-specific component. Instead, it creates a ControlPlane CRD with purpose exposure that should be picked up by provider extensions. Its purpose is to trigger the deployment of such provider-specific components in the shoot namespace in the seed cluster that are needed to expose the kube-apiserver.\nThe shoot cluster’s kube-apiserver is exposed via a Service of type LoadBalancer from the shoot provider (you may run the control plane of an Azure shoot in a GCP seed). It’s the seed provider extension controller that should act on the ControlPlane resources with purpose exposure.\nIf SNI is enabled, then the Service from above is of type ClusterIP and Gardener will not create ControlPlane resources with purpose exposure.\nWhat needs to be implemented to support a new infrastructure provider? As part of the shoot flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: ControlPlane metadata: name: control-plane-exposure namespace: shoot--foo--bar spec: type: aws purpose: exposure region: europe-west1 secretRef: name: cloudprovider namespace: shoot--foo--bar The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used for the shoot cluster. It is most likely not needed, but it is still added for some potential corner cases. If you don’t need it, then just ignore it. The .spec.region contains the region of the seed cluster.\nIn order to support a control plane provider with purpose exposure, you need to write a controller or expand the existing controlplane controller that watches all ControlPlanes with .spec.type=\u003cmy-provider-name\u003e and purpose exposure. 
You can take a look at the below referenced example implementation for the AWS provider.\nNon-Provider Specific Information Required for Infrastructure Creation Most providers might require further information that is not provider specific but already part of the shoot resource. As Gardener cannot know which information is required by providers, it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information.\nReferences and Additional Resources ControlPlane API (Golang Specification) Exemplary Implementation for the AWS Provider AWS Load Balancer Readvertiser ","categories":"","description":"","excerpt":"Contract: ControlPlane Resource with Purpose exposure Some Kubernetes …","ref":"/docs/gardener/extensions/controlplane-exposure/","tags":"","title":"ControlPlane Exposure"},{"body":"ControlPlane Customization Webhooks Gardener creates the Shoot controlplane in several steps of the Shoot flow. At different points of this flow, it:\n Deploys standard controlplane components such as kube-apiserver, kube-controller-manager, and kube-scheduler by creating the corresponding deployments, services, and other resources in the Shoot namespace. Initiates the deployment of custom controlplane components by ControlPlane controllers by creating a ControlPlane resource in the Shoot namespace. In order to apply any provider-specific changes to the configuration provided by Gardener for the standard controlplane components, cloud extension providers can install mutating admission webhooks for the resources created by Gardener in the Shoot namespace.\nWhat needs to be implemented to support a new cloud provider? 
In order to support a new cloud provider, you should install “controlplane” mutating webhooks for any of the following resources:\n Deployment with name kube-apiserver, kube-controller-manager, or kube-scheduler Service with name kube-apiserver OperatingSystemConfig with any name, and purpose reconcile See Contract Specification for more details on the contract that Gardener and webhooks should adhere to regarding the content of the above resources.\nYou can install 2 different kinds of controlplane webhooks:\n Shoot, or controlplane webhooks apply changes needed by the Shoot cloud provider, for example the --cloud-provider command line flag of kube-apiserver and kube-controller-manager. Such webhooks should only operate on Shoot namespaces labeled with shoot.gardener.cloud/provider=\u003cprovider\u003e. Seed, or controlplaneexposure webhooks apply changes needed by the Seed cloud provider, for example annotations on the kube-apiserver service to ensure cloud-specific load balancers are correctly provisioned for a service of type LoadBalancer. Such webhooks should only operate on Shoot namespaces labeled with seed.gardener.cloud/provider=\u003cprovider\u003e. The labels shoot.gardener.cloud/provider and seed.gardener.cloud/provider are added by Gardener when it creates the Shoot namespace.\nThe resources mutated by the “controlplane” mutating webhooks are labeled with provider.extensions.gardener.cloud/mutated-by-controlplane-webhook: true by gardenlet. The provider extensions can add an object selector to their “controlplane” mutating webhooks to not intercept requests for unrelated objects.\nContract Specification This section specifies the contract that Gardener and webhooks should adhere to in order to ensure smooth interoperability. Note that this contract can’t be specified formally and is therefore easy to violate, especially by Gardener. 
The Gardener team will nevertheless do its best to adhere to this contract in the future and to ensure via additional measures (tests, validations) that it’s not unintentionally broken. If it needs to be changed intentionally, this can only happen after proper communication has taken place to ensure that the affected provider webhooks could be adapted to work with the new version of the contract.\n Note: The contract described below may not necessarily be what Gardener does currently (as of May 2019). Rather, it reflects the target state after changes for Gardener extensibility have been introduced.\n kube-apiserver To deploy kube-apiserver, Gardener shall create a deployment and a service both named kube-apiserver in the Shoot namespace. They can be mutated by webhooks to apply any provider-specific changes to the standard configuration provided by Gardener.\nThe pod template of the kube-apiserver deployment shall contain a container named kube-apiserver.\nThe command field of the kube-apiserver container shall contain the kube-apiserver command line. It shall contain a number of provider-independent flags that should be ignored by webhooks, such as:\n admission plugins (--enable-admission-plugins, --disable-admission-plugins) secure communications (--etcd-cafile, --etcd-certfile, --etcd-keyfile, …) audit log (--audit-log-*) ports (--secure-port) The kube-apiserver command line shall not contain any provider-specific flags, such as:\n --cloud-provider --cloud-config These flags can be added by webhooks if needed.\nThe kube-apiserver command line may contain a number of additional provider-independent flags. In general, webhooks should ignore these unless they are known to interfere with the desired kube-apiserver behavior for the specific provider. Among the flags to be considered are:\n --endpoint-reconciler-type --advertise-address --feature-gates Gardener uses SNI to expose the apiserver. 
In this case, Gardener will label the kube-apiserver’s Deployment with core.gardener.cloud/apiserver-exposure: gardener-managed label (deprecated, the label will no longer be added as of v1.80) and expects that the --endpoint-reconciler-type and --advertise-address flags are not modified.\nThe --enable-admission-plugins flag may contain admission plugins that are not compatible with CSI plugins such as PersistentVolumeLabel. Webhooks should therefore ensure that such admission plugins are either explicitly enabled (if CSI plugins are not used) or disabled (otherwise).\nThe env field of the kube-apiserver container shall not contain any provider-specific environment variables (so it will be empty). If any provider-specific environment variables are needed, they should be added by webhooks.\nThe volumes field of the pod template of the kube-apiserver deployment, and respectively the volumeMounts field of the kube-apiserver container shall not contain any provider-specific Secret or ConfigMap resources. If such resources should be mounted as volumes, this should be done by webhooks.\nThe kube-apiserver Service may be of type LoadBalancer, but shall not contain any provider-specific annotations that may be needed to actually provision a load balancer resource in the Seed provider’s cloud. If any such annotations are needed, they should be added by webhooks (typically controlplaneexposure webhooks).\nThe kube-apiserver Service will be of type ClusterIP. In this case, Gardener will label this Service with core.gardener.cloud/apiserver-exposure: gardener-managed label (deprecated, the label will no longer be added as of v1.80) and expects that no mutations happen.\nkube-controller-manager To deploy kube-controller-manager, Gardener shall create a deployment named kube-controller-manager in the Shoot namespace. 
It can be mutated by webhooks to apply any provider-specific changes to the standard configuration provided by Gardener.\nThe pod template of the kube-controller-manager deployment shall contain a container named kube-controller-manager.\nThe command field of the kube-controller-manager container shall contain the kube-controller-manager command line. It shall contain a number of provider-independent flags that should be ignored by webhooks, such as:\n --kubeconfig, --authentication-kubeconfig, --authorization-kubeconfig --leader-elect secure communications (--tls-cert-file, --tls-private-key-file, …) cluster CIDR and identity (--cluster-cidr, --cluster-name) sync settings (--concurrent-deployment-syncs, --concurrent-replicaset-syncs) horizontal pod autoscaler (--horizontal-pod-autoscaler-*) ports (--port, --secure-port) The kube-controller-manager command line shall not contain any provider-specific flags, such as:\n --cloud-provider --cloud-config --configure-cloud-routes --external-cloud-volume-plugin These flags can be added by webhooks if needed.\nThe kube-controller-manager command line may contain a number of additional provider-independent flags. In general, webhooks should ignore these unless they are known to interfere with the desired kube-controller-manager behavior for the specific provider. Among the flags to be considered are:\n --feature-gates The env field of the kube-controller-manager container shall not contain any provider-specific environment variables (so it will be empty). If any provider-specific environment variables are needed, they should be added by webhooks.\nThe volumes field of the pod template of the kube-controller-manager deployment, and respectively the volumeMounts field of the kube-controller-manager container shall not contain any provider-specific Secret or ConfigMap resources. 
If such resources should be mounted as volumes, this should be done by webhooks.\nkube-scheduler To deploy kube-scheduler, Gardener shall create a deployment named kube-scheduler in the Shoot namespace. It can be mutated by webhooks to apply any provider-specific changes to the standard configuration provided by Gardener.\nThe pod template of the kube-scheduler deployment shall contain a container named kube-scheduler.\nThe command field of the kube-scheduler container shall contain the kube-scheduler command line. It shall contain a number of provider-independent flags that should be ignored by webhooks, such as:\n --config --authentication-kubeconfig, --authorization-kubeconfig secure communications (--tls-cert-file, --tls-private-key-file, …) ports (--port, --secure-port) The kube-scheduler command line may contain additional provider-independent flags. In general, webhooks should ignore these unless they are known to interfere with the desired kube-scheduler behavior for the specific provider. Among the flags to be considered are:\n --feature-gates The kube-scheduler command line can’t contain provider-specific flags, and it makes no sense to specify provider-specific environment variables or mount provider-specific Secret or ConfigMap resources as volumes.\netcd-main and etcd-events To deploy etcd, Gardener shall create two Etcd resources named etcd-main and etcd-events in the Shoot namespace. They can be mutated by webhooks to apply any provider-specific changes to the standard configuration provided by Gardener.\nGardener shall configure the Etcd resource completely to set up an etcd cluster which uses the default storage class of the seed cluster.\ncloud-controller-manager Gardener shall not deploy a cloud-controller-manager. If it is needed, it should be added by a ControlPlane controller.\nCSI Controllers Gardener shall not deploy a CSI controller. 
If it is needed, it should be added by a ControlPlane controller.\nkubelet To specify the kubelet configuration, Gardener shall create an OperatingSystemConfig resource with any name and purpose reconcile in the Shoot namespace. It can therefore also be mutated by webhooks to apply any provider-specific changes to the standard configuration provided by Gardener. Gardener may write multiple such resources with different types to the same Shoot namespace if multiple OSs are used.\nThe OSC resource shall contain a unit named kubelet.service, containing the corresponding systemd unit configuration file. The [Service] section of this file shall contain a single ExecStart option having the kubelet command line as its value.\nThe OSC resource shall contain a file with path /var/lib/kubelet/config/kubelet, which contains a KubeletConfiguration resource in YAML format. Most of the flags that can be specified in the kubelet command line can alternatively be specified as options in this configuration as well.\nThe kubelet command line shall contain a number of provider-independent flags that should be ignored by webhooks, such as:\n --config --bootstrap-kubeconfig, --kubeconfig --network-plugin (and, if it equals cni, also --cni-bin-dir and --cni-conf-dir) --node-labels The kubelet command line shall not contain any provider-specific flags, such as:\n --cloud-provider --cloud-config --provider-id These flags can be added by webhooks if needed.\nThe kubelet command line / configuration may contain a number of additional provider-independent flags / options. In general, webhooks should ignore these unless they are known to interfere with the desired kubelet behavior for the specific provider. Among the flags / options to be considered are:\n --enable-controller-attach-detach (enableControllerAttachDetach) - should be set to true if CSI plugins are used, but in general can also be ignored since its default value is also true, and this should work both with and without CSI plugins. 
--feature-gates (featureGates) - should contain a list of specific feature gates if CSI plugins are used. If CSI plugins are not used, the corresponding feature gates can be ignored since enabling them should not harm in any way. ","categories":"","description":"","excerpt":"ControlPlane Customization Webhooks Gardener creates the Shoot …","ref":"/docs/gardener/extensions/controlplane-webhooks/","tags":"","title":"ControlPlane Webhooks"},{"body":"General Conventions All the extensions that are registered to Gardener are deployed to the seed clusters on which they are required (also see ControllerRegistration).\nSome of these extensions might need to create global resources in the seed (e.g., ClusterRoles), i.e., it’s important to have a naming scheme to avoid conflicts as it cannot be checked or validated upfront that two extensions don’t use the same names.\nConsequently, this page should help answering some general questions that might come up when it comes to developing an extension.\nPriorityClasses Extensions are not supposed to create and use self-defined PriorityClasses. Instead, they can and should rely on well-known PriorityClasses managed by gardenlet.\nHigh Availability of Deployed Components Extensions might deploy components via Deployments, StatefulSets, etc., as part of the shoot control plane, or the seed or shoot system components. In case a seed or shoot cluster is highly available, there are various failure tolerance types. For more information, see Highly Available Shoot Control Plane. Accordingly, the replicas, topologySpreadConstraints or affinity settings of the deployed components might need to be adapted.\nInstead of doing this one-by-one for each and every component, extensions can rely on a mutating webhook provided by Gardener. Please refer to High Availability of Deployed Components for details.\nTo reduce costs and to improve the network traffic latency in multi-zone clusters, extensions can make a Service topology-aware. 
Please refer to this document for details.\nIs there a naming scheme for (global) resources? As there is no formal process to validate non-existence of conflicts between two extensions, please follow these naming schemes when creating resources (especially, when creating global resources, but it’s in general a good idea for most created resources):\nThe resource name should be prefixed with extensions.gardener.cloud:\u003cextension-type\u003e-\u003cextension-name\u003e:\u003cresource-name\u003e, for example:\n extensions.gardener.cloud:provider-aws:some-controller-manager extensions.gardener.cloud:extension-certificate-service:cert-broker How to create resources in the shoot cluster? Some extensions might not only create resources in the seed cluster itself but also in the shoot cluster. Usually, every extension comes with a ServiceAccount and the required RBAC permissions when it gets installed to the seed. However, there are no credentials for the shoot for every extension.\nExtensions are supposed to use ManagedResources to manage resources in shoot clusters. gardenlet deploys gardener-resource-manager instances into all shoot control planes, that will reconcile ManagedResources without a specified class (spec.class=null) in shoot clusters. Mind that Gardener acts on ManagedResources with the origin=gardener label. In order to prevent unwanted behavior, extensions should omit the origin label or provide their own unique value for it when creating such resources.\nIf you need to deploy a non-DaemonSet resource, Gardener automatically ensures that it only runs on nodes that are allowed to host system components and extensions. For more information, see System Components Webhook.\nHow to create kubeconfigs for the shoot cluster? Historically, Gardener extensions used to generate kubeconfigs with client certificates for components they deploy into the shoot control plane. For this, they reused the shoot cluster CA secret (ca) to issue new client certificates. 
With gardener/gardener#4661 we moved away from using client certificates in favor of short-lived, auto-rotated ServiceAccount tokens. These tokens are managed by gardener-resource-manager’s TokenRequestor. Extensions are supposed to reuse this mechanism for requesting tokens and a generic-token-kubeconfig for authenticating against shoot clusters.\nWith GEP-18 (Shoot cluster CA rotation), a dedicated CA will be used for signing client certificates (gardener/gardener#5779) which will be rotated when triggered by the shoot owner. With this, extensions cannot reuse the ca secret anymore to issue client certificates. Hence, extensions must switch to short-lived ServiceAccount tokens in order to support the CA rotation feature.\nThe generic-token-kubeconfig secret contains the CA bundle for establishing trust to shoot API servers. However, as the secret is immutable, its name changes with the rotation of the cluster CA. Extensions need to look up the generic-token-kubeconfig.secret.gardener.cloud/name annotation on the respective Cluster object in order to determine which secret contains the current CA bundle. The helper function extensionscontroller.GenericTokenKubeconfigSecretNameFromCluster can be used for this task.\nYou can take a look at CA Rotation in Extensions for more details on the CA rotation feature in regard to extensions.\nHow to create certificates for the shoot cluster? Gardener creates several certificate authorities (CA) that are used to create server certificates for various components. For example, the shoot’s etcd has its own CA, the kube-aggregator has its own CA as well, and both are different to the actual cluster’s CA.\nWith GEP-18 (Shoot cluster CA rotation), extensions are required to do the same and generate dedicated CAs for their components (e.g. for signing a server certificate for cloud-controller-manager). 
They must not depend on the CA secrets managed by gardenlet.\nPlease see CA Rotation in Extensions for the exact requirements that extensions need to fulfill in order to support the CA rotation feature.\nHow to enforce a Pod Security Standard for extension namespaces? The pod-security.kubernetes.io/enforce namespace label enforces the Pod Security Standards.\nYou can set the pod-security.kubernetes.io/enforce label for extension namespaces by adding the security.gardener.cloud/pod-security-enforce annotation to your ControllerRegistration. The value of the annotation would be the value set for the pod-security.kubernetes.io/enforce label. It is advised to set the annotation with the most restrictive pod security standard that your extension pods comply with.\nIf you are using the ./hack/generate-controller-registration.sh script to generate your ControllerRegistration, you can use the -e, --pod-security-enforce option to set the security.gardener.cloud/pod-security-enforce annotation. If the option is not set, it defaults to baseline.\n","categories":"","description":"","excerpt":"General Conventions All the extensions that are registered to Gardener …","ref":"/docs/gardener/extensions/conventions/","tags":"","title":"Conventions"},{"body":"Packages:\n core.gardener.cloud/v1beta1 core.gardener.cloud/v1beta1 Package v1beta1 is a version of the API.\nResource Types: BackupBucket BackupEntry CloudProfile ControllerDeployment ControllerInstallation ControllerRegistration ExposureClass InternalSecret NamespacedCloudProfile Project Quota SecretBinding Seed Shoot ShootState BackupBucket BackupBucket holds details about backup bucket\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string BackupBucket metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec BackupBucketSpec Specification of the Backup Bucket.\n provider BackupBucketProvider Provider holds the details of cloud provider of the object store. This field is immutable.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to BackupBucket resource.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n seedName string (Optional) SeedName holds the name of the seed allocated to BackupBucket for running controller. This field is immutable.\n status BackupBucketStatus Most recently observed status of the Backup Bucket.\n BackupEntry BackupEntry holds details about shoot backup.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string BackupEntry metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec BackupEntrySpec (Optional) Spec contains the specification of the Backup Entry.\n bucketName string BucketName is the name of backup bucket for this Backup Entry.\n seedName string (Optional) SeedName holds the name of the seed to which this BackupEntry is scheduled\n status BackupEntryStatus (Optional) Status contains the most recently observed status of the Backup Entry.\n CloudProfile CloudProfile represents certain properties about a provider environment.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string CloudProfile metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec CloudProfileSpec (Optional) Spec defines the provider environment properties.\n caBundle string (Optional) CABundle is a certificate bundle which will be installed onto every host machine of shoot cluster targeting this profile.\n kubernetes KubernetesSettings Kubernetes contains constraints regarding allowed values of the ‘kubernetes’ block in the Shoot specification.\n machineImages []MachineImage MachineImages contains constraints regarding allowed values for machine images in the Shoot specification.\n machineTypes []MachineType MachineTypes contains constraints regarding allowed values for machine types in the ‘workers’ block in the Shoot specification.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig contains provider-specific configuration for the profile.\n regions []Region Regions contains constraints regarding allowed values for regions and zones.\n seedSelector SeedSelector (Optional) SeedSelector contains an optional list of labels on Seed resources that marks those seeds whose shoots may use this provider profile. An empty list means that all seeds of the same provider type are supported. This is useful for environments that are of the same type (like openstack) but may have different “instances”/landscapes. Optionally a list of possible providers can be added to enable cross-provider scheduling. 
By default, the provider type of the seed must match the shoot’s provider.\n type string Type is the name of the provider.\n volumeTypes []VolumeType (Optional) VolumeTypes contains constraints regarding allowed values for volume types in the ‘workers’ block in the Shoot specification.\n bastion Bastion (Optional) Bastion contains the machine and image properties\n ControllerDeployment ControllerDeployment contains information about how this controller is deployed.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string ControllerDeployment metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. type string Type is the deployment type.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension ProviderConfig contains type-specific configuration. It contains assets that deploy the controller.\n ControllerInstallation ControllerInstallation represents an installation request for an external controller.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string ControllerInstallation metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ControllerInstallationSpec Spec contains the specification of this installation. If the object’s deletion timestamp is set, this field is immutable.\n registrationRef Kubernetes core/v1.ObjectReference RegistrationRef is used to reference a ControllerRegistration resource. The name field of the RegistrationRef is immutable.\n seedRef Kubernetes core/v1.ObjectReference SeedRef is used to reference a Seed resource. 
The name field of the SeedRef is immutable.\n deploymentRef Kubernetes core/v1.ObjectReference (Optional) DeploymentRef is used to reference a ControllerDeployment resource.\n status ControllerInstallationStatus Status contains the status of this installation.\n ControllerRegistration ControllerRegistration represents a registration of an external controller.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string ControllerRegistration metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ControllerRegistrationSpec Spec contains the specification of this registration. If the object’s deletion timestamp is set, this field is immutable.\n resources []ControllerResource (Optional) Resources is a list of combinations of kinds (DNSProvider, Infrastructure, Generic, …) and their actual types (aws-route53, gcp, auditlog, …).\n deployment ControllerRegistrationDeployment (Optional) Deployment contains information for how this controller is deployed.\n ExposureClass ExposureClass represents a control plane endpoint exposure strategy.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string ExposureClass metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. handler string Handler is the name of the handler which applies the control plane endpoint exposure strategy. This field is immutable.\n scheduling ExposureClassScheduling (Optional) Scheduling holds information how to select applicable Seed’s for ExposureClass usage. This field is immutable.\n InternalSecret InternalSecret holds secret data of a certain type. 
The total bytes of the values in the Data field must be less than MaxSecretSize bytes.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string InternalSecret metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object’s metadata. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata\nRefer to the Kubernetes API documentation for the fields of the metadata field. immutable bool (Optional) Immutable, if set to true, ensures that data stored in the Secret cannot be updated (only object metadata can be modified). If not set to true, the field can be modified at any time. Defaulted to nil.\n data map[string][]byte (Optional) Data contains the secret data. Each key must consist of alphanumeric characters, ‘-’, ‘_’ or ‘.’. The serialized form of the secret data is a base64 encoded string, representing the arbitrary (possibly non-string) data value here. Described in https://tools.ietf.org/html/rfc4648#section-4\n stringData map[string]string (Optional) stringData allows specifying non-binary secret data in string form. It is provided as a write-only input field for convenience. All keys and values are merged into the data field on write, overwriting any existing values. The stringData field is never output when reading from the API.\n type Kubernetes core/v1.SecretType (Optional) Used to facilitate programmatic handling of secret data. More info: https://kubernetes.io/docs/concepts/configuration/secret/#secret-types\n NamespacedCloudProfile NamespacedCloudProfile represents certain properties about a provider environment.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string NamespacedCloudProfile metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec NamespacedCloudProfileSpec Spec defines the provider environment properties.\n caBundle string (Optional) CABundle is a certificate bundle which will be installed onto every host machine of shoot cluster targeting this profile.\n kubernetes KubernetesSettings (Optional) Kubernetes contains constraints regarding allowed values of the ‘kubernetes’ block in the Shoot specification.\n machineImages []MachineImage (Optional) MachineImages contains constraints regarding allowed values for machine images in the Shoot specification.\n machineTypes []MachineType (Optional) MachineTypes contains constraints regarding allowed values for machine types in the ‘workers’ block in the Shoot specification.\n volumeTypes []VolumeType (Optional) VolumeTypes contains constraints regarding allowed values for volume types in the ‘workers’ block in the Shoot specification.\n parent CloudProfileReference Parent contains a reference to a CloudProfile it inherits from.\n status NamespacedCloudProfileStatus Most recently observed status of the NamespacedCloudProfile.\n Project Project holds certain properties about a Gardener project.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string Project metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ProjectSpec (Optional) Spec defines the project properties.\n createdBy Kubernetes rbac/v1.Subject (Optional) CreatedBy is a subject representing a user name, an email address, or any other identifier of a user who created the project. This field is immutable.\n description string (Optional) Description is a human-readable description of what the project is used for.\n owner Kubernetes rbac/v1.Subject (Optional) Owner is a subject representing a user name, an email address, or any other identifier of a user owning the project. 
IMPORTANT: Be aware that this field will be removed in the v1 version of this API in favor of the owner role. The only way to change the owner will be by moving the owner role. In this API version the only way to change the owner is to use this field. TODO: Remove this field in favor of the owner role in v1.\n purpose string (Optional) Purpose is a human-readable explanation of the project’s purpose.\n members []ProjectMember (Optional) Members is a list of subjects representing a user name, an email address, or any other identifier of a user, group, or service account that has a certain role.\n namespace string (Optional) Namespace is the name of the namespace that has been created for the Project object. A nil value means that Gardener will determine the name of the namespace. This field is immutable.\n tolerations ProjectTolerations (Optional) Tolerations contains the tolerations for taints on seed clusters.\n dualApprovalForDeletion []DualApprovalForDeletion (Optional) DualApprovalForDeletion contains configuration for the dual approval concept for resource deletion.\n status ProjectStatus (Optional) Most recently observed status of the Project.\n Quota Quota represents a quota on resources consumed by shoot clusters either per project or per provider secret.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string Quota metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec QuotaSpec (Optional) Spec defines the Quota constraints.\n clusterLifetimeDays int32 (Optional) ClusterLifetimeDays is the lifetime of a Shoot cluster in days before it will be terminated automatically.\n metrics Kubernetes core/v1.ResourceList Metrics is a list of resources which will be put under constraints.\n scope Kubernetes core/v1.ObjectReference Scope is the scope of the Quota object, either ‘project’, ‘secret’ or ‘workloadidentity’. 
This field is immutable.\n SecretBinding SecretBinding represents a binding to a secret in the same or another namespace.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string SecretBinding metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret object in the same or another namespace. This field is immutable.\n quotas []Kubernetes core/v1.ObjectReference (Optional) Quotas is a list of references to Quota objects in the same or another namespace. This field is immutable.\n provider SecretBindingProvider (Optional) Provider defines the provider type of the SecretBinding. This field is immutable.\n Seed Seed represents an installation request for an external controller.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string Seed metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec SeedSpec Spec contains the specification of this installation.\n backup SeedBackup (Optional) Backup holds the object store configuration for the backups of shoot (currently only etcd). If it is not specified, then there won’t be any backups taken for shoots associated with this seed. 
If backup field is present in seed, then backups of the etcd from shoot control plane will be stored under the configured object store.\n dns SeedDNS DNS contains DNS-relevant information about this seed cluster.\n networks SeedNetworks Networks defines the pod, service and worker network of the Seed cluster.\n provider SeedProvider Provider defines the provider type and region for this Seed cluster.\n taints []SeedTaint (Optional) Taints describes taints on the seed.\n volume SeedVolume (Optional) Volume contains settings for persistentvolumes created in the seed cluster.\n settings SeedSettings (Optional) Settings contains certain settings for this seed cluster.\n ingress Ingress (Optional) Ingress configures Ingress specific settings of the Seed cluster. This field is immutable.\n status SeedStatus Status contains the status of this installation.\n Shoot Shoot represents a Shoot cluster created and managed by Gardener.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string Shoot metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ShootSpec (Optional) Specification of the Shoot cluster. If the object’s deletion timestamp is set, this field is immutable.\n addons Addons (Optional) Addons contains information about enabled/disabled addons and their configuration.\n cloudProfileName string (Optional) CloudProfileName is a name of a CloudProfile object. 
This field will be deprecated soon, use CloudProfile instead.\n dns DNS (Optional) DNS contains information about the DNS settings of the Shoot.\n extensions []Extension (Optional) Extensions contain type and provider information for Shoot extensions.\n hibernation Hibernation (Optional) Hibernation contains information whether the Shoot is suspended or not.\n kubernetes Kubernetes Kubernetes contains the version and configuration settings of the control plane components.\n networking Networking (Optional) Networking contains information about cluster networking such as CNI Plugin type, CIDRs, …etc.\n maintenance Maintenance (Optional) Maintenance contains information about the time window for maintenance operations and which operations should be performed.\n monitoring Monitoring (Optional) Monitoring contains information about custom monitoring configurations for the shoot.\n provider Provider Provider contains all provider-specific and provider-relevant information.\n purpose ShootPurpose (Optional) Purpose is the purpose class for this cluster.\n region string Region is a name of a region. This field is immutable.\n secretBindingName string (Optional) SecretBindingName is the name of a SecretBinding that has a reference to the provider secret. The credentials inside the provider secret will be used to create the shoot in the respective account. The field is mutually exclusive with CredentialsBindingName. 
This field is immutable.\n seedName string (Optional) SeedName is the name of the seed cluster that runs the control plane of the Shoot.\n seedSelector SeedSelector (Optional) SeedSelector is an optional selector which must match a seed’s labels for the shoot to be scheduled on that seed.\n resources []NamedResourceReference (Optional) Resources holds a list of named resource references that can be referred to in extension configs by their names.\n tolerations []Toleration (Optional) Tolerations contains the tolerations for taints on seed clusters.\n exposureClassName string (Optional) ExposureClassName is the optional name of an exposure class to apply a control plane endpoint exposure strategy. This field is immutable.\n systemComponents SystemComponents (Optional) SystemComponents contains the settings of system components in the control or data plane of the Shoot cluster.\n controlPlane ControlPlane (Optional) ControlPlane contains general settings for the control plane of the shoot.\n schedulerName string (Optional) SchedulerName is the name of the responsible scheduler which schedules the shoot. If not specified, the default scheduler takes over. This field is immutable.\n cloudProfile CloudProfileReference (Optional) CloudProfile contains a reference to a CloudProfile or a NamespacedCloudProfile.\n credentialsBindingName string (Optional) CredentialsBindingName is the name of a CredentialsBinding that has a reference to the provider credentials. The credentials will be used to create the shoot in the respective account. 
The field is mutually exclusive with SecretBindingName.\n status ShootStatus (Optional) Most recently observed status of the Shoot cluster.\n ShootState ShootState contains a snapshot of the Shoot’s state required to migrate the Shoot’s control plane to a new Seed.\n Field Description apiVersion string core.gardener.cloud/v1beta1 kind string ShootState metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ShootStateSpec (Optional) Specification of the ShootState.\n gardener []GardenerResourceData (Optional) Gardener holds the data required to generate resources deployed by the gardenlet\n extensions []ExtensionResourceState (Optional) Extensions holds the state of custom resources reconciled by extension controllers in the seed\n resources []ResourceData (Optional) Resources holds the data of resources referred to by extension controller states\n APIServerLogging (Appears on: KubeAPIServerConfig) APIServerLogging contains configuration for the logs level and http access logs\n Field Description verbosity int32 (Optional) Verbosity is the kube-apiserver log verbosity level Defaults to 2.\n httpAccessVerbosity int32 (Optional) HTTPAccessVerbosity is the kube-apiserver access logs level\n APIServerRequests (Appears on: KubeAPIServerConfig) APIServerRequests contains configuration for request-specific settings for the kube-apiserver.\n Field Description maxNonMutatingInflight int32 (Optional) MaxNonMutatingInflight is the maximum number of non-mutating requests in flight at a given time. When the server exceeds this, it rejects requests.\n maxMutatingInflight int32 (Optional) MaxMutatingInflight is the maximum number of mutating requests in flight at a given time. 
When the server exceeds this, it rejects requests.\n Addon (Appears on: KubernetesDashboard, NginxIngress) Addon allows enabling or disabling a specific addon and is used to derive from.\n Field Description enabled bool Enabled indicates whether the addon is enabled or not.\n Addons (Appears on: ShootSpec) Addons is a collection of configuration for specific addons which are managed by the Gardener.\n Field Description kubernetesDashboard KubernetesDashboard (Optional) KubernetesDashboard holds configuration settings for the kubernetes dashboard addon.\n nginxIngress NginxIngress (Optional) NginxIngress holds configuration settings for the nginx-ingress addon.\n AdmissionPlugin (Appears on: KubeAPIServerConfig) AdmissionPlugin contains information about a specific admission plugin and its corresponding configuration.\n Field Description name string Name is the name of the plugin.\n config k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) Config is the configuration of the plugin.\n disabled bool (Optional) Disabled specifies whether this plugin should be disabled.\n kubeconfigSecretName string (Optional) KubeconfigSecretName specifies the name of a secret containing the kubeconfig for this admission plugin.\n Alerting (Appears on: Monitoring) Alerting contains information about how alerting will be done (i.e. 
who will receive alerts and how).\n Field Description emailReceivers []string (Optional) MonitoringEmailReceivers is a list of recipients for alerts\n AuditConfig (Appears on: KubeAPIServerConfig) AuditConfig contains settings for audit of the api server\n Field Description auditPolicy AuditPolicy (Optional) AuditPolicy contains configuration settings for audit policy of the kube-apiserver.\n AuditPolicy (Appears on: AuditConfig) AuditPolicy contains audit policy for kube-apiserver\n Field Description configMapRef Kubernetes core/v1.ObjectReference (Optional) ConfigMapRef is a reference to a ConfigMap object in the same namespace, which contains the audit policy for the kube-apiserver.\n AvailabilityZone (Appears on: Region) AvailabilityZone is an availability zone.\n Field Description name string Name is an availability zone name.\n unavailableMachineTypes []string (Optional) UnavailableMachineTypes is a list of machine type names that are not available in this zone.\n unavailableVolumeTypes []string (Optional) UnavailableVolumeTypes is a list of volume type names that are not available in this zone.\n BackupBucketProvider (Appears on: BackupBucketSpec) BackupBucketProvider holds the details of cloud provider of the object store.\n Field Description type string Type is the type of provider.\n region string Region is the region of the bucket.\n BackupBucketSpec (Appears on: BackupBucket) BackupBucketSpec is the specification of a Backup Bucket.\n Field Description provider BackupBucketProvider Provider holds the details of cloud provider of the object store. 
This field is immutable.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to BackupBucket resource.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n seedName string (Optional) SeedName holds the name of the seed allocated to BackupBucket for running controller. This field is immutable.\n BackupBucketStatus (Appears on: BackupBucket) BackupBucketStatus holds the most recently observed status of the Backup Bucket.\n Field Description providerStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderStatus is the configuration passed to BackupBucket resource.\n lastOperation LastOperation (Optional) LastOperation holds information about the last operation on the BackupBucket.\n lastError LastError (Optional) LastError holds information about the last occurred error during an operation.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this BackupBucket. 
It corresponds to the BackupBucket’s generation, which is updated on mutation by the API Server.\n generatedSecretRef Kubernetes core/v1.SecretReference (Optional) GeneratedSecretRef is reference to the secret generated by backup bucket, which will have object store specific credentials.\n BackupEntrySpec (Appears on: BackupEntry) BackupEntrySpec is the specification of a Backup Entry.\n Field Description bucketName string BucketName is the name of backup bucket for this Backup Entry.\n seedName string (Optional) SeedName holds the name of the seed to which this BackupEntry is scheduled\n BackupEntryStatus (Appears on: BackupEntry) BackupEntryStatus holds the most recently observed status of the Backup Entry.\n Field Description lastOperation LastOperation (Optional) LastOperation holds information about the last operation on the BackupEntry.\n lastError LastError (Optional) LastError holds information about the last occurred error during an operation.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this BackupEntry. It corresponds to the BackupEntry’s generation, which is updated on mutation by the API Server.\n seedName string (Optional) SeedName is the name of the seed to which this BackupEntry is currently scheduled. This field is populated at the beginning of a create/reconcile operation. 
It is used when moving the BackupEntry between seeds.\n migrationStartTime Kubernetes meta/v1.Time (Optional) MigrationStartTime is the time when a migration to a different seed was initiated.\n Bastion (Appears on: CloudProfileSpec) Bastion contains the bastions creation info\n Field Description machineImage BastionMachineImage (Optional) MachineImage contains the bastions machine image properties\n machineType BastionMachineType (Optional) MachineType contains the bastions machine type properties\n BastionMachineImage (Appears on: Bastion) BastionMachineImage contains the bastions machine image properties\n Field Description name string Name of the machine image\n version string (Optional) Version of the machine image\n BastionMachineType (Appears on: Bastion) BastionMachineType contains the bastions machine type properties\n Field Description name string Name of the machine type\n CARotation (Appears on: ShootCredentialsRotation) CARotation contains information about the certificate authority credential rotation.\n Field Description phase CredentialsRotationPhase Phase describes the phase of the certificate authority credential rotation.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the certificate authority credential rotation was successfully completed.\n lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the certificate authority credential rotation was initiated.\n lastInitiationFinishedTime Kubernetes meta/v1.Time (Optional) LastInitiationFinishedTime is the recent time when the certificate authority credential rotation initiation was completed.\n lastCompletionTriggeredTime Kubernetes meta/v1.Time (Optional) LastCompletionTriggeredTime is the recent time when the certificate authority credential rotation completion was triggered.\n CRI (Appears on: MachineImageVersion, Worker) CRI contains information about the Container Runtimes.\n Field 
Description name CRIName The name of the CRI library. Supported values are containerd.\n containerRuntimes []ContainerRuntime (Optional) ContainerRuntimes is the list of the required container runtimes supported for a worker pool.\n CRIName (string alias)\n (Appears on: CRI) CRIName is a type alias for the CRI name string.\nCloudProfileReference (Appears on: NamespacedCloudProfileSpec, ShootSpec) CloudProfileReference holds the information about a CloudProfile or a NamespacedCloudProfile.\n Field Description kind string Kind contains a CloudProfile kind.\n name string Name contains the name of the referenced CloudProfile.\n CloudProfileSpec (Appears on: CloudProfile, NamespacedCloudProfileStatus) CloudProfileSpec is the specification of a CloudProfile. It must contain exactly one of its defined keys.\n Field Description caBundle string (Optional) CABundle is a certificate bundle which will be installed onto every host machine of shoot cluster targeting this profile.\n kubernetes KubernetesSettings Kubernetes contains constraints regarding allowed values of the ‘kubernetes’ block in the Shoot specification.\n machineImages []MachineImage MachineImages contains constraints regarding allowed values for machine images in the Shoot specification.\n machineTypes []MachineType MachineTypes contains constraints regarding allowed values for machine types in the ‘workers’ block in the Shoot specification.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig contains provider-specific configuration for the profile.\n regions []Region Regions contains constraints regarding allowed values for regions and zones.\n seedSelector SeedSelector (Optional) SeedSelector contains an optional list of labels on Seed resources that marks those seeds whose shoots may use this provider profile. An empty list means that all seeds of the same provider type are supported. 
This is useful for environments that are of the same type (like openstack) but may have different “instances”/landscapes. Optionally a list of possible providers can be added to enable cross-provider scheduling. By default, the provider type of the seed must match the shoot’s provider.\n type string Type is the name of the provider.\n volumeTypes []VolumeType (Optional) VolumeTypes contains constraints regarding allowed values for volume types in the ‘workers’ block in the Shoot specification.\n bastion Bastion (Optional) Bastion contains the machine and image properties\n ClusterAutoscaler (Appears on: Kubernetes) ClusterAutoscaler contains the configuration flags for the Kubernetes cluster autoscaler.\n Field Description scaleDownDelayAfterAdd Kubernetes meta/v1.Duration (Optional) ScaleDownDelayAfterAdd defines how long after scale up that scale down evaluation resumes (default: 1 hour).\n scaleDownDelayAfterDelete Kubernetes meta/v1.Duration (Optional) ScaleDownDelayAfterDelete defines how long after node deletion that scale down evaluation resumes, defaults to scanInterval (default: 0 secs).\n scaleDownDelayAfterFailure Kubernetes meta/v1.Duration (Optional) ScaleDownDelayAfterFailure defines how long after scale down failure that scale down evaluation resumes (default: 3 mins).\n scaleDownUnneededTime Kubernetes meta/v1.Duration (Optional) ScaleDownUnneededTime defines how long a node should be unneeded before it is eligible for scale down (default: 30 mins).\n scaleDownUtilizationThreshold float64 (Optional) ScaleDownUtilizationThreshold defines the threshold in fraction (0.0 - 1.0) under which a node is being removed (default: 0.5).\n scanInterval Kubernetes meta/v1.Duration (Optional) ScanInterval defines how often the cluster is reevaluated for scale up or down (default: 10 secs).\n expander ExpanderMode (Optional) Expander defines the algorithm to use during scale up (default: least-waste). 
See: https://github.com/gardener/autoscaler/blob/machine-controller-manager-provider/cluster-autoscaler/FAQ.md#what-are-expanders.\n maxNodeProvisionTime Kubernetes meta/v1.Duration (Optional) MaxNodeProvisionTime defines how long CA waits for node to be provisioned (default: 20 mins).\n maxGracefulTerminationSeconds int32 (Optional) MaxGracefulTerminationSeconds is the number of seconds CA waits for pod termination when trying to scale down a node (default: 600).\n ignoreTaints []string (Optional) IgnoreTaints specifies a list of taint keys to ignore in node templates when considering to scale a node group.\n newPodScaleUpDelay Kubernetes meta/v1.Duration (Optional) NewPodScaleUpDelay specifies how long CA should ignore newly created pods before they have to be considered for scale-up (default: 0s).\n maxEmptyBulkDelete int32 (Optional) MaxEmptyBulkDelete specifies the maximum number of empty nodes that can be deleted at the same time (default: 10).\n ignoreDaemonsetsUtilization bool (Optional) IgnoreDaemonsetsUtilization allows CA to ignore DaemonSet pods when calculating resource utilization for scaling down (default: false).\n verbosity int32 (Optional) Verbosity allows CA to modify its log level (default: 2).\n ClusterAutoscalerOptions (Appears on: Worker) ClusterAutoscalerOptions contains the cluster autoscaler configurations for a worker pool.\n Field Description scaleDownUtilizationThreshold float64 (Optional) ScaleDownUtilizationThreshold defines the threshold in fraction (0.0 - 1.0) under which a node is being removed.\n scaleDownGpuUtilizationThreshold float64 (Optional) ScaleDownGpuUtilizationThreshold defines the threshold in fraction (0.0 - 1.0) of gpu resources under which a node is being removed.\n scaleDownUnneededTime Kubernetes meta/v1.Duration (Optional) ScaleDownUnneededTime defines how long a node should be unneeded before it is eligible for scale down.\n scaleDownUnreadyTime Kubernetes meta/v1.Duration (Optional) ScaleDownUnreadyTime defines 
how long an unready node should be unneeded before it is eligible for scale down.\n maxNodeProvisionTime Kubernetes meta/v1.Duration (Optional) MaxNodeProvisionTime defines how long CA waits for node to be provisioned.\n Condition (Appears on: ControllerInstallationStatus, SeedStatus, ShootStatus) Condition holds the information about the state of a resource.\n Field Description type ConditionType Type of the condition.\n status ConditionStatus Status of the condition, one of True, False, Unknown.\n lastTransitionTime Kubernetes meta/v1.Time Last time the condition transitioned from one status to another.\n lastUpdateTime Kubernetes meta/v1.Time Last time the condition was updated.\n reason string The reason for the condition’s last transition.\n message string A human readable message indicating details about the transition.\n codes []ErrorCode (Optional) Well-defined error codes in case the condition reports a problem.\n ConditionStatus (string alias)\n (Appears on: Condition) ConditionStatus is the status of a condition.\nConditionType (string alias)\n (Appears on: Condition) ConditionType is a string alias.\nContainerRuntime (Appears on: CRI) ContainerRuntime contains information about worker’s available container runtime\n Field Description type string Type is the type of the Container Runtime.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to container runtime resource.\n ControlPlane (Appears on: ShootSpec) ControlPlane holds information about the general settings for the control plane of a shoot.\n Field Description highAvailability HighAvailability (Optional) HighAvailability holds the configuration settings for high availability of the control plane of a shoot.\n ControllerDeploymentPolicy (string alias)\n (Appears on: ControllerRegistrationDeployment) ControllerDeploymentPolicy is a string alias.\nControllerInstallationSpec (Appears on: ControllerInstallation) ControllerInstallationSpec 
is the specification of a ControllerInstallation.\n Field Description registrationRef Kubernetes core/v1.ObjectReference RegistrationRef is used to reference a ControllerRegistration resource. The name field of the RegistrationRef is immutable.\n seedRef Kubernetes core/v1.ObjectReference SeedRef is used to reference a Seed resource. The name field of the SeedRef is immutable.\n deploymentRef Kubernetes core/v1.ObjectReference (Optional) DeploymentRef is used to reference a ControllerDeployment resource.\n ControllerInstallationStatus (Appears on: ControllerInstallation) ControllerInstallationStatus is the status of a ControllerInstallation.\n Field Description conditions []Condition (Optional) Conditions represents the latest available observations of a ControllerInstallation’s current state.\n providerStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderStatus contains type-specific status.\n ControllerRegistrationDeployment (Appears on: ControllerRegistrationSpec) ControllerRegistrationDeployment contains information for how this controller is deployed.\n Field Description policy ControllerDeploymentPolicy (Optional) Policy controls how the controller is deployed. It defaults to ‘OnDemand’.\n seedSelector Kubernetes meta/v1.LabelSelector (Optional) SeedSelector contains an optional label selector for seeds. Only if the labels match then this controller will be considered for a deployment. An empty list means that all seeds are selected.\n deploymentRefs []DeploymentRef (Optional) DeploymentRefs holds references to ControllerDeployments. 
Only one element is supported currently.\n ControllerRegistrationSpec (Appears on: ControllerRegistration) ControllerRegistrationSpec is the specification of a ControllerRegistration.\n Field Description resources []ControllerResource (Optional) Resources is a list of combinations of kinds (DNSProvider, Infrastructure, Generic, …) and their actual types (aws-route53, gcp, auditlog, …).\n deployment ControllerRegistrationDeployment (Optional) Deployment contains information for how this controller is deployed.\n ControllerResource (Appears on: ControllerRegistrationSpec) ControllerResource is a combination of a kind (DNSProvider, Infrastructure, Generic, …) and the actual type for this kind (aws-route53, gcp, auditlog, …).\n Field Description kind string Kind is the resource kind, for example “OperatingSystemConfig”.\n type string Type is the resource type, for example “coreos” or “ubuntu”.\n globallyEnabled bool (Optional) GloballyEnabled determines if this ControllerResource is required by all Shoot clusters. This field is defaulted to false when kind is “Extension”.\n reconcileTimeout Kubernetes meta/v1.Duration (Optional) ReconcileTimeout defines how long Gardener should wait for the resource reconciliation. This field is defaulted to 3m0s when kind is “Extension”.\n primary bool (Optional) Primary determines if the controller backed by this ControllerRegistration is responsible for the extension resource’s lifecycle. This field defaults to true. There must be exactly one primary controller for this kind/type combination. This field is immutable.\n lifecycle ControllerResourceLifecycle (Optional) Lifecycle defines a strategy that determines when different operations on a ControllerResource should be performed. This field is defaulted in the following way when kind is “Extension”. 
Reconcile: “AfterKubeAPIServer” Delete: “BeforeKubeAPIServer” Migrate: “BeforeKubeAPIServer”\n workerlessSupported bool (Optional) WorkerlessSupported specifies whether this ControllerResource supports Workerless Shoot clusters. This field is only relevant when kind is “Extension”.\n ControllerResourceLifecycle (Appears on: ControllerResource) ControllerResourceLifecycle defines the lifecycle of a controller resource.\n Field Description reconcile ControllerResourceLifecycleStrategy (Optional) Reconcile defines the strategy during reconciliation.\n delete ControllerResourceLifecycleStrategy (Optional) Delete defines the strategy during deletion.\n migrate ControllerResourceLifecycleStrategy (Optional) Migrate defines the strategy during migration.\n ControllerResourceLifecycleStrategy (string alias)\n (Appears on: ControllerResourceLifecycle) ControllerResourceLifecycleStrategy is a string alias.\nCoreDNS (Appears on: SystemComponents) CoreDNS contains the settings of the Core DNS components running in the data plane of the Shoot cluster.\n Field Description autoscaling CoreDNSAutoscaling (Optional) Autoscaling contains the settings related to autoscaling of the Core DNS components running in the data plane of the Shoot cluster.\n rewriting CoreDNSRewriting (Optional) Rewriting contains the setting related to rewriting of requests, which are obviously incorrect due to the unnecessary application of the search path.\n CoreDNSAutoscaling (Appears on: CoreDNS) CoreDNSAutoscaling contains the settings related to autoscaling of the Core DNS components running in the data plane of the Shoot cluster.\n Field Description mode CoreDNSAutoscalingMode The mode of the autoscaling to be used for the Core DNS components running in the data plane of the Shoot cluster. 
Supported values are horizontal and cluster-proportional.\n CoreDNSAutoscalingMode (string alias)\n (Appears on: CoreDNSAutoscaling) CoreDNSAutoscalingMode is a type alias for the Core DNS autoscaling mode string.\nCoreDNSRewriting (Appears on: CoreDNS) CoreDNSRewriting contains the setting related to rewriting requests, which are obviously incorrect due to the unnecessary application of the search path.\n Field Description commonSuffixes []string (Optional) CommonSuffixes are expected to be the suffix of a fully qualified domain name. Each suffix should contain at least one or two dots (‘.’) to prevent accidental clashes.\n CredentialsRotationPhase (string alias)\n (Appears on: CARotation, ETCDEncryptionKeyRotation, ServiceAccountKeyRotation) CredentialsRotationPhase is a string alias.\nDNS (Appears on: ShootSpec) DNS holds information about the provider, the hosted zone id and the domain.\n Field Description domain string (Optional) Domain is the external available domain of the Shoot cluster. This domain will be written into the kubeconfig that is handed out to end-users. This field is immutable.\n providers []DNSProvider (Optional) Providers is a list of DNS providers that shall be enabled for this shoot cluster. Only relevant if a non-default domain is used.\nDeprecated: Configuring multiple DNS providers is deprecated and will be forbidden in a future release. Please use the DNS extension provider config (e.g. 
shoot-dns-service) for additional providers.\n DNSIncludeExclude (Appears on: DNSProvider) DNSIncludeExclude contains information about which domains shall be included/excluded.\n Field Description include []string (Optional) Include is a list of domains that shall be included.\n exclude []string (Optional) Exclude is a list of domains that shall be excluded.\n DNSProvider (Appears on: DNS) DNSProvider contains information about a DNS provider.\n Field Description domains DNSIncludeExclude (Optional) Domains contains information about which domains shall be included/excluded for this provider.\nDeprecated: This field is deprecated and will be removed in a future release. Please use the DNS extension provider config (e.g. shoot-dns-service) for additional configuration.\n primary bool (Optional) Primary indicates that this DNSProvider is used for shoot related domains.\nDeprecated: This field is deprecated and will be removed in a future release. Please use the DNS extension provider config (e.g. shoot-dns-service) for additional and non-primary providers.\n secretName string (Optional) SecretName is a name of a secret containing credentials for the stated domain and the provider. When not specified, the Gardener will use the cloud provider credentials referenced by the Shoot and try to find respective credentials there (primary provider only). Specifying this field may override this behavior, i.e. forcing the Gardener to only look into the given secret.\n type string (Optional) Type is the DNS provider type.\n zones DNSIncludeExclude (Optional) Zones contains information about which hosted zones shall be included/excluded for this provider.\nDeprecated: This field is deprecated and will be removed in a future release. Please use the DNS extension provider config (e.g. 
shoot-dns-service) for additional configuration.\n DataVolume (Appears on: Worker) DataVolume contains information about a data volume.\n Field Description name string Name of the volume to make it referenceable.\n type string (Optional) Type is the type of the volume.\n size string VolumeSize is the size of the volume.\n encrypted bool (Optional) Encrypted determines if the volume should be encrypted.\n DeploymentRef (Appears on: ControllerRegistrationDeployment) DeploymentRef contains information about ControllerDeployment references.\n Field Description name string Name is the name of the ControllerDeployment that is being referred to.\n DualApprovalForDeletion (Appears on: ProjectSpec) DualApprovalForDeletion contains configuration for the dual approval concept for resource deletion.\n Field Description resource string Resource is the name of the resource this applies to.\n selector Kubernetes meta/v1.LabelSelector Selector is the label selector for the resources.\n includeServiceAccounts bool (Optional) IncludeServiceAccounts specifies whether the concept also applies when deletion is triggered by ServiceAccounts. 
Defaults to true.\n ETCDEncryptionKeyRotation (Appears on: ShootCredentialsRotation) ETCDEncryptionKeyRotation contains information about the ETCD encryption key credential rotation.\n Field Description phase CredentialsRotationPhase Phase describes the phase of the ETCD encryption key credential rotation.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the ETCD encryption key credential rotation was successfully completed.\n lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the ETCD encryption key credential rotation was initiated.\n lastInitiationFinishedTime Kubernetes meta/v1.Time (Optional) LastInitiationFinishedTime is the recent time when the ETCD encryption key credential rotation initiation was completed.\n lastCompletionTriggeredTime Kubernetes meta/v1.Time (Optional) LastCompletionTriggeredTime is the recent time when the ETCD encryption key credential rotation completion was triggered.\n EncryptionConfig (Appears on: KubeAPIServerConfig) EncryptionConfig contains customizable encryption configuration of the API server.\n Field Description resources []string Resources contains the list of resources that shall be encrypted in addition to secrets. Each item is a Kubernetes resource name in plural (resource or resource.group) that should be encrypted. Note that configuring a custom resource is only supported for versions \u003e= 1.26. Wildcards are not supported for now. 
See https://github.com/gardener/gardener/blob/master/docs/usage/etcd_encryption_config.md for more details.\n ErrorCode (string alias)\n (Appears on: Condition, LastError) ErrorCode is a string alias.\nExpanderMode (string alias)\n (Appears on: ClusterAutoscaler) ExpanderMode is type used for Expander values\nExpirableVersion (Appears on: KubernetesSettings, MachineImageVersion) ExpirableVersion contains a version and an expiration date.\n Field Description version string Version is the version identifier.\n expirationDate Kubernetes meta/v1.Time (Optional) ExpirationDate defines the time at which this version expires.\n classification VersionClassification (Optional) Classification defines the state of a version (preview, supported, deprecated)\n ExposureClassScheduling (Appears on: ExposureClass) ExposureClassScheduling holds information to select applicable Seed’s for ExposureClass usage.\n Field Description seedSelector SeedSelector (Optional) SeedSelector is an optional label selector for Seed’s which are suitable to use the ExposureClass.\n tolerations []Toleration (Optional) Tolerations contains the tolerations for taints on Seed clusters.\n Extension (Appears on: ShootSpec) Extension contains type and provider information for Shoot extensions.\n Field Description type string Type is the type of the extension resource.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to extension resource.\n disabled bool (Optional) Disabled allows to disable extensions that were marked as ‘globally enabled’ by Gardener administrators.\n ExtensionResourceState (Appears on: ShootStateSpec) ExtensionResourceState contains the kind of the extension custom resource and its last observed state in the Shoot’s namespace on the Seed cluster.\n Field Description kind string Kind (type) of the extension custom resource\n name string (Optional) Name of the extension custom resource\n purpose string (Optional) Purpose of 
the extension custom resource\n state k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) State of the extension resource\n resources []NamedResourceReference (Optional) Resources holds a list of named resource references that can be referred to in the state by their names.\n FailureTolerance (Appears on: HighAvailability) FailureTolerance describes information about failure tolerance level of a highly available resource.\n Field Description type FailureToleranceType Type specifies the type of failure that the highly available resource can tolerate\n FailureToleranceType (string alias)\n (Appears on: FailureTolerance) FailureToleranceType specifies the type of failure that a highly available shoot control plane can tolerate.\nGardener (Appears on: SeedStatus, ShootStatus) Gardener holds the information about the Gardener version that operated a resource.\n Field Description id string ID is the container id of the Gardener which last acted on a resource.\n name string Name is the hostname (pod name) of the Gardener which last acted on a resource.\n version string Version is the version of the Gardener which last acted on a resource.\n GardenerResourceData (Appears on: ShootStateSpec) GardenerResourceData holds the data which is used to generate resources, deployed in the Shoot’s control plane.\n Field Description name string Name of the object required to generate resources\n type string Type of the object\n data k8s.io/apimachinery/pkg/runtime.RawExtension Data contains the payload required to generate resources\n labels map[string]string (Optional) Labels are labels of the object\n HelmControllerDeployment HelmControllerDeployment configures how an extension controller is deployed using helm. This is the legacy structure that used to be defined in gardenlet’s ControllerInstallation controller for ControllerDeployment’s with type=helm. 
While this is not a proper API type, we need to define the structure in the API package so that we can convert it to the internal API version in the new representation.\n Field Description chart []byte Chart is a Helm chart tarball.\n values Kubernetes apiextensions/v1.JSON Values is a map of values for the given chart.\n ociRepository OCIRepository (Optional) OCIRepository defines where to pull the chart.\n Hibernation (Appears on: ShootSpec) Hibernation contains information whether the Shoot is suspended or not.\n Field Description enabled bool (Optional) Enabled specifies whether the Shoot needs to be hibernated or not. If it is true, the Shoot’s desired state is to be hibernated. If it is false or nil, the Shoot’s desired state is to be awakened.\n schedules []HibernationSchedule (Optional) Schedules determine the hibernation schedules.\n HibernationSchedule (Appears on: Hibernation) HibernationSchedule determines the hibernation schedule of a Shoot. A Shoot will be regularly hibernated at each start time and will be woken up at each end time. Start or End can be omitted, though at least one of them has to be specified.\n Field Description start string (Optional) Start is a Cron spec at which time a Shoot will be hibernated.\n end string (Optional) End is a Cron spec at which time a Shoot will be woken up.\n location string (Optional) Location is the time location in which both start and end shall be evaluated.\n HighAvailability (Appears on: ControlPlane) HighAvailability specifies the configuration settings for high availability for a resource. 
Typical usages could be to configure HA for shoot control plane or for seed system components.\n Field Description failureTolerance FailureTolerance FailureTolerance holds information about failure tolerance level of a highly available resource.\n HorizontalPodAutoscalerConfig (Appears on: KubeControllerManagerConfig) HorizontalPodAutoscalerConfig contains horizontal pod autoscaler configuration settings for the kube-controller-manager. Note: Descriptions were taken from the Kubernetes documentation.\n Field Description cpuInitializationPeriod Kubernetes meta/v1.Duration (Optional) The period after which a ready pod transition is considered to be the first.\n downscaleStabilization Kubernetes meta/v1.Duration (Optional) The configurable window at which the controller will choose the highest recommendation for autoscaling.\n initialReadinessDelay Kubernetes meta/v1.Duration (Optional) The configurable period at which the horizontal pod autoscaler considers a Pod “not yet ready” given that it’s unready and it has transitioned to unready during that time.\n syncPeriod Kubernetes meta/v1.Duration (Optional) The period for syncing the number of pods in horizontal pod autoscaler.\n tolerance float64 (Optional) The minimum change (from 1.0) in the desired-to-actual metrics ratio for the horizontal pod autoscaler to consider scaling.\n IPFamily (string alias)\n (Appears on: Networking, SeedNetworks) IPFamily is a type for specifying an IP protocol version to use in Gardener clusters.\nIngress (Appears on: SeedSpec) Ingress configures the Ingress specific settings of the cluster\n Field Description domain string Domain specifies the IngressDomain of the cluster pointing to the ingress controller endpoint. It will be used to construct ingress URLs for system applications running in Shoot/Garden clusters. 
Once set this field is immutable.\n controller IngressController Controller configures a Gardener managed Ingress Controller listening on the ingressDomain\n IngressController (Appears on: Ingress) IngressController enables a Gardener managed Ingress Controller listening on the ingressDomain\n Field Description kind string Kind defines which kind of IngressController to use. At the moment only nginx is supported\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig specifies infrastructure specific configuration for the ingressController\n KubeAPIServerConfig (Appears on: Kubernetes) KubeAPIServerConfig contains configuration settings for the kube-apiserver.\n Field Description KubernetesConfig KubernetesConfig (Members of KubernetesConfig are embedded into this type.) admissionPlugins []AdmissionPlugin (Optional) AdmissionPlugins contains the list of user-defined admission plugins (additional to those managed by Gardener), and, if desired, the corresponding configuration.\n apiAudiences []string (Optional) APIAudiences are the identifiers of the API. The service account token authenticator will validate that tokens used against the API are bound to at least one of these audiences. Defaults to [“kubernetes”].\n auditConfig AuditConfig (Optional) AuditConfig contains configuration settings for the audit of the kube-apiserver.\n oidcConfig OIDCConfig (Optional) OIDCConfig contains configuration settings for the OIDC provider.\n runtimeConfig map[string]bool (Optional) RuntimeConfig contains information about enabled or disabled APIs.\n serviceAccountConfig ServiceAccountConfig (Optional) ServiceAccountConfig contains configuration settings for the service account handling of the kube-apiserver.\n watchCacheSizes WatchCacheSizes (Optional) WatchCacheSizes contains configuration of the API server’s watch cache sizes. 
Configuring these flags might be useful for large-scale Shoot clusters with a lot of parallel update requests and a lot of watching controllers (e.g. large ManagedSeed clusters). When the API server’s watch cache’s capacity is too small to cope with the amount of update requests and watchers for a particular resource, it might happen that controller watches are permanently stopped with too old resource version errors. Starting from kubernetes v1.19, the API server’s watch cache size is adapted dynamically and setting the watch cache size flags will have no effect, except when setting it to 0 (which disables the watch cache).\n requests APIServerRequests (Optional) Requests contains configuration for request-specific settings for the kube-apiserver.\n enableAnonymousAuthentication bool (Optional) EnableAnonymousAuthentication defines whether anonymous requests to the secure port of the API server should be allowed (flag --anonymous-auth). See: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/\n eventTTL Kubernetes meta/v1.Duration (Optional) EventTTL controls the amount of time to retain events. Defaults to 1h.\n logging APIServerLogging (Optional) Logging contains configuration for the log level and HTTP access logs.\n defaultNotReadyTolerationSeconds int64 (Optional) DefaultNotReadyTolerationSeconds indicates the tolerationSeconds of the toleration for notReady:NoExecute that is added by default to every pod that does not already have such a toleration (flag --default-not-ready-toleration-seconds). The field has effect only when the DefaultTolerationSeconds admission plugin is enabled. Defaults to 300.\n defaultUnreachableTolerationSeconds int64 (Optional) DefaultUnreachableTolerationSeconds indicates the tolerationSeconds of the toleration for unreachable:NoExecute that is added by default to every pod that does not already have such a toleration (flag --default-unreachable-toleration-seconds). 
The field has effect only when the DefaultTolerationSeconds admission plugin is enabled. Defaults to 300.\n encryptionConfig EncryptionConfig (Optional) EncryptionConfig contains customizable encryption configuration of the Kube API server.\n structuredAuthentication StructuredAuthentication (Optional) StructuredAuthentication contains configuration settings for structured authentication to the kube-apiserver. This field is only available for Kubernetes v1.30 or later.\n KubeControllerManagerConfig (Appears on: Kubernetes) KubeControllerManagerConfig contains configuration settings for the kube-controller-manager.\n Field Description KubernetesConfig KubernetesConfig (Members of KubernetesConfig are embedded into this type.) horizontalPodAutoscaler HorizontalPodAutoscalerConfig (Optional) HorizontalPodAutoscalerConfig contains horizontal pod autoscaler configuration settings for the kube-controller-manager.\n nodeCIDRMaskSize int32 (Optional) NodeCIDRMaskSize defines the mask size for node cidr in cluster (default is 24). This field is immutable.\n podEvictionTimeout Kubernetes meta/v1.Duration (Optional) PodEvictionTimeout defines the grace period for deleting pods on failed nodes. Defaults to 2m.\nDeprecated: The corresponding kube-controller-manager flag --pod-eviction-timeout is deprecated in favor of the kube-apiserver flags --default-not-ready-toleration-seconds and --default-unreachable-toleration-seconds. The --pod-eviction-timeout flag has no effect when taint-based eviction is enabled. Taint-based eviction is beta (enabled by default) since Kubernetes 1.13 and GA since Kubernetes 1.18. 
Hence, instead of setting this field, set the spec.kubernetes.kubeAPIServer.defaultNotReadyTolerationSeconds and spec.kubernetes.kubeAPIServer.defaultUnreachableTolerationSeconds.\n nodeMonitorGracePeriod Kubernetes meta/v1.Duration (Optional) NodeMonitorGracePeriod defines the grace period before an unresponsive node is marked unhealthy.\n KubeProxyConfig (Appears on: Kubernetes) KubeProxyConfig contains configuration settings for the kube-proxy.\n Field Description KubernetesConfig KubernetesConfig (Members of KubernetesConfig are embedded into this type.) mode ProxyMode (Optional) Mode specifies which proxy mode to use. Defaults to IPTables.\n enabled bool (Optional) Enabled indicates whether kube-proxy should be deployed or not. Depending on the networking extension, switching kube-proxy off might be rejected. Consulting the respective documentation of the used networking extension is recommended before using this field. Defaults to true if not specified.\n KubeSchedulerConfig (Appears on: Kubernetes) KubeSchedulerConfig contains configuration settings for the kube-scheduler.\n Field Description KubernetesConfig KubernetesConfig (Members of KubernetesConfig are embedded into this type.) kubeMaxPDVols string (Optional) KubeMaxPDVols allows configuring the KUBE_MAX_PD_VOLS environment variable for the kube-scheduler. Please find more information here: https://kubernetes.io/docs/concepts/storage/storage-limits/#custom-limits Note that using this field is considered alpha-/experimental-level and is at your own risk. You should be aware of all the side-effects and consequences when changing it.\n profile SchedulingProfile (Optional) Profile configures the scheduling profile for the cluster. 
If not specified, the used profile is “balanced” (provides the default kube-scheduler behavior).\n KubeletConfig (Appears on: Kubernetes, WorkerKubernetes) KubeletConfig contains configuration settings for the kubelet.\n Field Description KubernetesConfig KubernetesConfig (Members of KubernetesConfig are embedded into this type.) cpuCFSQuota bool (Optional) CPUCFSQuota allows you to disable/enable CPU throttling for Pods.\n cpuManagerPolicy string (Optional) CPUManagerPolicy allows to set alternative CPU management policies (default: none).\n evictionHard KubeletConfigEviction (Optional) EvictionHard describes a set of eviction thresholds (e.g. memory.available\u003c1Gi) that if met would trigger a Pod eviction. Default: memory.available: “100Mi/1Gi/5%” nodefs.available: “5%” nodefs.inodesFree: “5%” imagefs.available: “5%” imagefs.inodesFree: “5%”\n evictionMaxPodGracePeriod int32 (Optional) EvictionMaxPodGracePeriod describes the maximum allowed grace period (in seconds) to use when terminating pods in response to a soft eviction threshold being met. Default: 90\n evictionMinimumReclaim KubeletConfigEvictionMinimumReclaim (Optional) EvictionMinimumReclaim configures the amount of resources below the configured eviction threshold that the kubelet attempts to reclaim whenever the kubelet observes resource pressure. Default: 0 for each resource\n evictionPressureTransitionPeriod Kubernetes meta/v1.Duration (Optional) EvictionPressureTransitionPeriod is the duration for which the kubelet has to wait before transitioning out of an eviction pressure condition. Default: 4m0s\n evictionSoft KubeletConfigEviction (Optional) EvictionSoft describes a set of eviction thresholds (e.g. memory.available\u003c1.5Gi) that if met over a corresponding grace period would trigger a Pod eviction. 
Default: memory.available: “200Mi/1.5Gi/10%” nodefs.available: “10%” nodefs.inodesFree: “10%” imagefs.available: “10%” imagefs.inodesFree: “10%”\n evictionSoftGracePeriod KubeletConfigEvictionSoftGracePeriod (Optional) EvictionSoftGracePeriod describes a set of eviction grace periods (e.g. memory.available=1m30s) that correspond to how long a soft eviction threshold must hold before triggering a Pod eviction. Default: memory.available: 1m30s nodefs.available: 1m30s nodefs.inodesFree: 1m30s imagefs.available: 1m30s imagefs.inodesFree: 1m30s\n maxPods int32 (Optional) MaxPods is the maximum number of Pods that are allowed by the Kubelet. Default: 110\n podPidsLimit int64 (Optional) PodPIDsLimit is the maximum number of process IDs per pod allowed by the kubelet.\n failSwapOn bool (Optional) FailSwapOn makes the Kubelet fail to start if swap is enabled on the node. (default true).\n kubeReserved KubeletConfigReserved (Optional) KubeReserved is the configuration for resources reserved for kubernetes node components (mainly kubelet and container runtime). When updating these values, be aware that cgroup resizes may not succeed on active worker nodes. Look for the NodeAllocatableEnforced event to determine if the configuration was applied. Default: cpu=80m,memory=1Gi,pid=20k\n systemReserved KubeletConfigReserved (Optional) SystemReserved is the configuration for resources reserved for system processes not managed by kubernetes (e.g. journald). When updating these values, be aware that cgroup resizes may not succeed on active worker nodes. Look for the NodeAllocatableEnforced event to determine if the configuration was applied.\nDeprecated: Separately configuring resource reservations for system processes is deprecated in Gardener and will be forbidden starting from Kubernetes 1.31. Please merge existing resource reservations into the kubeReserved field. 
TODO(MichaelEischer): Drop this field after support for Kubernetes 1.30 is dropped.\n imageGCHighThresholdPercent int32 (Optional) ImageGCHighThresholdPercent describes the percent of the disk usage which triggers image garbage collection. Default: 50\n imageGCLowThresholdPercent int32 (Optional) ImageGCLowThresholdPercent describes the percent of the disk to which garbage collection attempts to free. Default: 40\n serializeImagePulls bool (Optional) SerializeImagePulls describes whether the images are pulled one at a time. Default: true\n registryPullQPS int32 (Optional) RegistryPullQPS is the limit of registry pulls per second. The value must not be a negative number. Setting it to 0 means no limit. Default: 5\n registryBurst int32 (Optional) RegistryBurst is the maximum size of bursty pulls, temporarily allows pulls to burst to this number, while still not exceeding registryPullQPS. The value must not be a negative number. Only used if registryPullQPS is greater than 0. Default: 10\n seccompDefault bool (Optional) SeccompDefault enables the use of RuntimeDefault as the default seccomp profile for all workloads. This requires the corresponding SeccompDefault feature gate to be enabled as well. This field is only available for Kubernetes v1.25 or later.\n containerLogMaxSize k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) A quantity defines the maximum size of the container log file before it is rotated. For example: “5Mi” or “256Ki”. Default: 100Mi\n containerLogMaxFiles int32 (Optional) Maximum number of container log files that can be present for a container.\n protectKernelDefaults bool (Optional) ProtectKernelDefaults ensures that the kernel tunables are equal to the kubelet defaults. Defaults to true for Kubernetes v1.26 or later.\n streamingConnectionIdleTimeout Kubernetes meta/v1.Duration (Optional) StreamingConnectionIdleTimeout is the maximum time a streaming connection can be idle before the connection is automatically closed. 
This field cannot be set lower than “30s” or greater than “4h”. Default: “4h” for Kubernetes \u003c v1.26. “5m” for Kubernetes \u003e= v1.26.\n memorySwap MemorySwapConfiguration (Optional) MemorySwap configures swap memory available to container workloads.\n KubeletConfigEviction (Appears on: KubeletConfig) KubeletConfigEviction contains kubelet eviction thresholds supporting either a resource.Quantity or a percentage based value.\n Field Description memoryAvailable string (Optional) MemoryAvailable is the threshold for the free memory on the host server.\n imageFSAvailable string (Optional) ImageFSAvailable is the threshold for the free disk space in the imagefs filesystem (docker images and container writable layers).\n imageFSInodesFree string (Optional) ImageFSInodesFree is the threshold for the available inodes in the imagefs filesystem.\n nodeFSAvailable string (Optional) NodeFSAvailable is the threshold for the free disk space in the nodefs filesystem (docker volumes, logs, etc).\n nodeFSInodesFree string (Optional) NodeFSInodesFree is the threshold for the available inodes in the nodefs filesystem.\n KubeletConfigEvictionMinimumReclaim (Appears on: KubeletConfig) KubeletConfigEvictionMinimumReclaim contains configuration for the kubelet eviction minimum reclaim.\n Field Description memoryAvailable k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) MemoryAvailable is the threshold for the memory reclaim on the host server.\n imageFSAvailable k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) ImageFSAvailable is the threshold for the disk space reclaim in the imagefs filesystem (docker images and container writable layers).\n imageFSInodesFree k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) ImageFSInodesFree is the threshold for the inodes reclaim in the imagefs filesystem.\n nodeFSAvailable k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) NodeFSAvailable is the threshold for the disk space reclaim in the nodefs filesystem 
(docker volumes, logs, etc).\n nodeFSInodesFree k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) NodeFSInodesFree is the threshold for the inodes reclaim in the nodefs filesystem.\n KubeletConfigEvictionSoftGracePeriod (Appears on: KubeletConfig) KubeletConfigEvictionSoftGracePeriod contains grace periods for kubelet eviction thresholds.\n Field Description memoryAvailable Kubernetes meta/v1.Duration (Optional) MemoryAvailable is the grace period for the MemoryAvailable eviction threshold.\n imageFSAvailable Kubernetes meta/v1.Duration (Optional) ImageFSAvailable is the grace period for the ImageFSAvailable eviction threshold.\n imageFSInodesFree Kubernetes meta/v1.Duration (Optional) ImageFSInodesFree is the grace period for the ImageFSInodesFree eviction threshold.\n nodeFSAvailable Kubernetes meta/v1.Duration (Optional) NodeFSAvailable is the grace period for the NodeFSAvailable eviction threshold.\n nodeFSInodesFree Kubernetes meta/v1.Duration (Optional) NodeFSInodesFree is the grace period for the NodeFSInodesFree eviction threshold.\n KubeletConfigReserved (Appears on: KubeletConfig) KubeletConfigReserved contains reserved resources for daemons\n Field Description cpu k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) CPU is the reserved cpu.\n memory k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) Memory is the reserved memory.\n ephemeralStorage k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) EphemeralStorage is the reserved ephemeral-storage.\n pid k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) PID is the reserved process-ids.\n Kubernetes (Appears on: ShootSpec) Kubernetes contains the version and configuration variables for the Shoot control plane.\n Field Description clusterAutoscaler ClusterAutoscaler (Optional) ClusterAutoscaler contains the configuration flags for the Kubernetes cluster autoscaler.\n kubeAPIServer KubeAPIServerConfig (Optional) KubeAPIServer contains configuration settings for the 
kube-apiserver.\n kubeControllerManager KubeControllerManagerConfig (Optional) KubeControllerManager contains configuration settings for the kube-controller-manager.\n kubeScheduler KubeSchedulerConfig (Optional) KubeScheduler contains configuration settings for the kube-scheduler.\n kubeProxy KubeProxyConfig (Optional) KubeProxy contains configuration settings for the kube-proxy.\n kubelet KubeletConfig (Optional) Kubelet contains configuration settings for the kubelet.\n version string (Optional) Version is the semantic Kubernetes version to use for the Shoot cluster. Defaults to the highest supported minor and patch version given in the referenced cloud profile. The version can be omitted completely or partially specified, e.g. \u003cmajor\u003e.\u003cminor\u003e.\n verticalPodAutoscaler VerticalPodAutoscaler (Optional) VerticalPodAutoscaler contains the configuration flags for the Kubernetes vertical pod autoscaler.\n enableStaticTokenKubeconfig bool (Optional) EnableStaticTokenKubeconfig indicates whether static token kubeconfig secret will be created for the Shoot cluster. Defaults to true for Shoots with Kubernetes versions \u003c 1.26. Defaults to false for Shoots with Kubernetes versions \u003e= 1.26. Starting Kubernetes 1.27 the field will be locked to false.\n KubernetesConfig (Appears on: KubeAPIServerConfig, KubeControllerManagerConfig, KubeProxyConfig, KubeSchedulerConfig, KubeletConfig) KubernetesConfig contains common configuration fields for the control plane components.\n Field Description featureGates map[string]bool (Optional) FeatureGates contains information about enabled feature gates.\n KubernetesDashboard (Appears on: Addons) KubernetesDashboard describes configuration values for the kubernetes-dashboard addon.\n Field Description Addon Addon (Members of Addon are embedded into this type.) 
authenticationMode string (Optional) AuthenticationMode defines the authentication mode for the kubernetes-dashboard.\n KubernetesSettings (Appears on: CloudProfileSpec, NamespacedCloudProfileSpec) KubernetesSettings contains constraints regarding allowed values of the ‘kubernetes’ block in the Shoot specification.\n Field Description versions []ExpirableVersion (Optional) Versions is the list of allowed Kubernetes versions with optional expiration dates for Shoot clusters.\n LastError (Appears on: BackupBucketStatus, BackupEntryStatus, ShootStatus) LastError indicates the last occurred error for an operation on a resource.\n Field Description description string A human readable message indicating details about the last error.\n taskID string (Optional) ID of the task which caused this last error\n codes []ErrorCode (Optional) Well-defined error codes of the last error(s).\n lastUpdateTime Kubernetes meta/v1.Time (Optional) Last time the error was reported\n LastMaintenance (Appears on: ShootStatus) LastMaintenance holds information about a maintenance operation on the Shoot.\n Field Description description string A human-readable message containing details about the operations performed in the last maintenance.\n triggeredTime Kubernetes meta/v1.Time TriggeredTime is the time when maintenance was triggered.\n state LastOperationState Status of the last maintenance operation, one of Processing, Succeeded, Error.\n failureReason string (Optional) FailureReason holds the information about the last maintenance operation failure reason.\n LastOperation (Appears on: BackupBucketStatus, BackupEntryStatus, SeedStatus, ShootStatus) LastOperation indicates the type and the state of the last operation, along with a description message and a progress indicator.\n Field Description description string A human readable message indicating details about the last operation.\n lastUpdateTime Kubernetes meta/v1.Time Last time the operation state transitioned from one to another.\n 
progress int32 The progress in percentage (0-100) of the last operation.\n state LastOperationState Status of the last operation, one of Aborted, Processing, Succeeded, Error, Failed.\n type LastOperationType Type of the last operation, one of Create, Reconcile, Delete, Migrate, Restore.\n LastOperationState (string alias)\n (Appears on: LastMaintenance, LastOperation) LastOperationState is a string alias.\nLastOperationType (string alias)\n (Appears on: LastOperation) LastOperationType is a string alias.\nLoadBalancerServicesProxyProtocol (Appears on: SeedSettingLoadBalancerServices, SeedSettingLoadBalancerServicesZones) LoadBalancerServicesProxyProtocol controls whether ProxyProtocol is (optionally) allowed for the load balancer services.\n Field Description allowed bool Allowed controls whether the ProxyProtocol is optionally allowed for the load balancer services. This should only be enabled if the load balancer services are already using ProxyProtocol or will be reconfigured to use it soon. Until the load balancers are configured with ProxyProtocol, enabling this setting may allow clients to spoof their source IP addresses. The option allows a migration from non-ProxyProtocol to ProxyProtocol without downtime (depending on the infrastructure). Defaults to false.\n Machine (Appears on: Worker) Machine contains information about the machine type and image.\n Field Description type string Type is the machine type of the worker group.\n image ShootMachineImage (Optional) Image holds information about the machine image to use for all nodes of this pool. It will default to the latest version of the first image stated in the referenced CloudProfile if no value has been provided.\n architecture string (Optional) Architecture is CPU architecture of machines in this worker pool.\n MachineControllerManagerSettings (Appears on: Worker) MachineControllerManagerSettings contains configurations for different worker-pools. Eg. 
MachineDrainTimeout, MachineHealthTimeout.\n Field Description machineDrainTimeout Kubernetes meta/v1.Duration (Optional) MachineDrainTimeout is the period after which machine is forcefully deleted.\n machineHealthTimeout Kubernetes meta/v1.Duration (Optional) MachineHealthTimeout is the period after which machine is declared failed.\n machineCreationTimeout Kubernetes meta/v1.Duration (Optional) MachineCreationTimeout is the period after which creation of the machine is declared failed.\n maxEvictRetries int32 (Optional) MaxEvictRetries are the number of eviction retries on a pod after which drain is declared failed, and forceful deletion is triggered.\n nodeConditions []string (Optional) NodeConditions are the set of conditions if set to true for the period of MachineHealthTimeout, machine will be declared failed.\n MachineImage (Appears on: CloudProfileSpec, NamespacedCloudProfileSpec) MachineImage defines the name and multiple versions of the machine image in any environment.\n Field Description name string Name is the name of the image.\n versions []MachineImageVersion Versions contains versions, expiration dates and container runtimes of the machine image\n updateStrategy MachineImageUpdateStrategy (Optional) UpdateStrategy is the update strategy to use for the machine image. Possible values are: - patch: update to the latest patch version of the current minor version. - minor: update to the latest minor and patch version. - major: always update to the overall latest version (default).\n MachineImageUpdateStrategy (string alias)\n (Appears on: MachineImage) MachineImageUpdateStrategy is the update strategy to use for a machine image\nMachineImageVersion (Appears on: MachineImage) MachineImageVersion is an expirable version with list of supported container runtimes and interfaces\n Field Description ExpirableVersion ExpirableVersion (Members of ExpirableVersion are embedded into this type.) 
cri []CRI (Optional) CRI is the list of container runtimes and interfaces supported by this version.\n architectures []string (Optional) Architectures is the list of CPU architectures of the machine image in this version.\n kubeletVersionConstraint string (Optional) KubeletVersionConstraint is a constraint describing the supported kubelet versions by the machine image in this version. If the field is not specified, it is assumed that the machine image in this version supports all kubelet versions. Examples: - ‘\u003e= 1.26’ - supports only kubelet versions greater than or equal to 1.26 - ‘\u003c 1.26’ - supports only kubelet versions less than 1.26\n MachineType (Appears on: CloudProfileSpec, NamespacedCloudProfileSpec) MachineType contains certain properties of a machine type.\n Field Description cpu k8s.io/apimachinery/pkg/api/resource.Quantity CPU is the number of CPUs for this machine type.\n gpu k8s.io/apimachinery/pkg/api/resource.Quantity GPU is the number of GPUs for this machine type.\n memory k8s.io/apimachinery/pkg/api/resource.Quantity Memory is the amount of memory for this machine type.\n name string Name is the name of the machine type.\n storage MachineTypeStorage (Optional) Storage is the amount of storage associated with the root volume of this machine type.\n usable bool (Optional) Usable defines if the machine type can be used for shoot clusters.\n architecture string (Optional) Architecture is the CPU architecture of this machine type.\n MachineTypeStorage (Appears on: MachineType) MachineTypeStorage is the amount of storage associated with the root volume of this machine type.\n Field Description class string Class is the class of the storage type.\n size k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) StorageSize is the storage size.\n type string Type is the type of the storage.\n minSize k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) MinSize is the minimal supported storage size. 
This overrides any other common minimum size configuration from spec.volumeTypes[*].minSize.\n Maintenance (Appears on: ShootSpec) Maintenance contains information about the time window for maintenance operations and which operations should be performed.\n Field Description autoUpdate MaintenanceAutoUpdate (Optional) AutoUpdate contains information about which constraints should be automatically updated.\n timeWindow MaintenanceTimeWindow (Optional) TimeWindow contains information about the time window for maintenance operations.\n confineSpecUpdateRollout bool (Optional) ConfineSpecUpdateRollout prevents that changes/updates to the shoot specification will be rolled out immediately. Instead, they are rolled out during the shoot’s maintenance time window. There is one exception that will trigger an immediate roll out which is changes to the Spec.Hibernation.Enabled field.\n MaintenanceAutoUpdate (Appears on: Maintenance) MaintenanceAutoUpdate contains information about which constraints should be automatically updated.\n Field Description kubernetesVersion bool KubernetesVersion indicates whether the patch Kubernetes version may be automatically updated (default: true).\n machineImageVersion bool (Optional) MachineImageVersion indicates whether the machine image version may be automatically updated (default: true).\n MaintenanceTimeWindow (Appears on: Maintenance) MaintenanceTimeWindow contains information about the time window for maintenance operations.\n Field Description begin string Begin is the beginning of the time window in the format HHMMSS+ZONE, e.g. “220000+0100”. If not present, a random value will be computed.\n end string End is the end of the time window in the format HHMMSS+ZONE, e.g. “220000+0100”. 
If not present, the value will be computed based on the “Begin” value.\n MemorySwapConfiguration (Appears on: KubeletConfig) MemorySwapConfiguration contains kubelet swap configuration For more information, please see KEP: 2400-node-swap\n Field Description swapBehavior SwapBehavior (Optional) SwapBehavior configures swap memory available to container workloads. May be one of {“LimitedSwap”, “UnlimitedSwap”} defaults to: LimitedSwap\n Monitoring (Appears on: ShootSpec) Monitoring contains information about the monitoring configuration for the shoot.\n Field Description alerting Alerting (Optional) Alerting contains information about the alerting configuration for the shoot cluster.\n NamedResourceReference (Appears on: ExtensionResourceState, ShootSpec) NamedResourceReference is a named reference to a resource.\n Field Description name string Name of the resource reference.\n resourceRef Kubernetes autoscaling/v1.CrossVersionObjectReference ResourceRef is a reference to a resource.\n NamespacedCloudProfileSpec (Appears on: NamespacedCloudProfile) NamespacedCloudProfileSpec is the specification of a NamespacedCloudProfile.\n Field Description caBundle string (Optional) CABundle is a certificate bundle which will be installed onto every host machine of shoot cluster targeting this profile.\n kubernetes KubernetesSettings (Optional) Kubernetes contains constraints regarding allowed values of the ‘kubernetes’ block in the Shoot specification.\n machineImages []MachineImage (Optional) MachineImages contains constraints regarding allowed values for machine images in the Shoot specification.\n machineTypes []MachineType (Optional) MachineTypes contains constraints regarding allowed values for machine types in the ‘workers’ block in the Shoot specification.\n volumeTypes []VolumeType (Optional) VolumeTypes contains constraints regarding allowed values for volume types in the ‘workers’ block in the Shoot specification.\n parent CloudProfileReference Parent contains a 
reference to a CloudProfile it inherits from.\n NamespacedCloudProfileStatus (Appears on: NamespacedCloudProfile) NamespacedCloudProfileStatus holds the most recently observed status of the NamespacedCloudProfile.\n Field Description cloudProfileSpec CloudProfileSpec CloudProfile is the most recently generated CloudProfile of the NamespacedCloudProfile.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this NamespacedCloudProfile.\n Networking (Appears on: ShootSpec) Networking defines networking parameters for the shoot cluster.\n Field Description type string (Optional) Type identifies the type of the networking plugin. This field is immutable.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to network resource.\n pods string (Optional) Pods is the CIDR of the pod network. This field is immutable.\n nodes string (Optional) Nodes is the CIDR of the entire node network. This field is mutable.\n services string (Optional) Services is the CIDR of the service network. This field is immutable.\n ipFamilies []IPFamily (Optional) IPFamilies specifies the IP protocol versions to use for shoot networking. This field is immutable. See https://github.com/gardener/gardener/blob/master/docs/usage/ipv6.md. Defaults to [“IPv4”].\n NetworkingStatus (Appears on: ShootStatus) NetworkingStatus contains information about cluster networking such as CIDRs.\n Field Description pods []string (Optional) Pods are the CIDRs of the pod network.\n nodes []string (Optional) Nodes are the CIDRs of the node network.\n services []string (Optional) Services are the CIDRs of the service network.\n NginxIngress (Appears on: Addons) NginxIngress describes configuration values for the nginx-ingress addon.\n Field Description Addon Addon (Members of Addon are embedded into this type.) 
loadBalancerSourceRanges []string (Optional) LoadBalancerSourceRanges is a list of allowed IP sources for NginxIngress\n config map[string]string (Optional) Config contains custom configuration for the nginx-ingress-controller configuration. See https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/nginx-configuration/configmap.md#configuration-options\n externalTrafficPolicy Kubernetes core/v1.ServiceExternalTrafficPolicy (Optional) ExternalTrafficPolicy controls the .spec.externalTrafficPolicy value of the load balancer Service exposing the nginx-ingress. Defaults to Cluster.\n NodeLocalDNS (Appears on: SystemComponents) NodeLocalDNS contains the settings of the node local DNS components running in the data plane of the Shoot cluster.\n Field Description enabled bool Enabled indicates whether node local DNS is enabled or not.\n forceTCPToClusterDNS bool (Optional) ForceTCPToClusterDNS indicates whether the connection from the node local DNS to the cluster DNS (Core DNS) will be forced to TCP or not. Default, if unspecified, is to enforce TCP.\n forceTCPToUpstreamDNS bool (Optional) ForceTCPToUpstreamDNS indicates whether the connection from the node local DNS to the upstream DNS (infrastructure DNS) will be forced to TCP or not. Default, if unspecified, is to enforce TCP.\n disableForwardToUpstreamDNS bool (Optional) DisableForwardToUpstreamDNS indicates whether requests from node local DNS to upstream DNS should be disabled. 
Default, if unspecified, is to forward requests for external domains to upstream DNS\n OCIRepository (Appears on: HelmControllerDeployment) OCIRepository configures where to pull an OCI Artifact, that could contain for example a Helm Chart.\n Field Description ref string (Optional) Ref is the full artifact Ref and takes precedence over all other fields.\n repository string (Optional) Repository is a reference to an OCI artifact repository.\n tag string (Optional) Tag is the image tag to pull.\n digest string (Optional) Digest of the image to pull, takes precedence over tag.\n OIDCConfig (Appears on: KubeAPIServerConfig) OIDCConfig contains configuration settings for the OIDC provider. Note: Descriptions were taken from the Kubernetes documentation.\n Field Description caBundle string (Optional) If set, the OpenID server’s certificate will be verified by one of the authorities in the oidc-ca-file, otherwise the host’s root CA set will be used.\n clientAuthentication OpenIDConnectClientAuthentication (Optional) ClientAuthentication can optionally contain client configuration used for kubeconfig generation.\nDeprecated: This field has no implemented use and will be forbidden starting from Kubernetes 1.31. Its use was planned for generating OIDC kubeconfig https://github.com/gardener/gardener/issues/1433 TODO(AleksandarSavchev): Drop this field after support for Kubernetes 1.30 is dropped.\n clientID string (Optional) The client ID for the OpenID Connect client, must be set.\n groupsClaim string (Optional) If provided, the name of a custom OpenID Connect claim for specifying user groups. The claim value is expected to be a string or array of strings. 
This flag is experimental, please see the authentication documentation for further details.\n groupsPrefix string (Optional) If provided, all groups will be prefixed with this value to prevent conflicts with other authentication strategies.\n issuerURL string (Optional) The URL of the OpenID issuer, only HTTPS scheme will be accepted. Used to verify the OIDC JSON Web Token (JWT).\n requiredClaims map[string]string (Optional) key=value pairs that describe a required claim in the ID Token. If set, the claim is verified to be present in the ID Token with a matching value.\n signingAlgs []string (Optional) List of allowed JOSE asymmetric signing algorithms. JWTs with an ‘alg’ header value not in this list will be rejected. Values are defined by RFC 7518 https://tools.ietf.org/html/rfc7518#section-3.1\n usernameClaim string (Optional) The OpenID claim to use as the user name. Note that claims other than the default (‘sub’) are not guaranteed to be unique and immutable. This flag is experimental, please see the authentication documentation for further details. (default “sub”)\n usernamePrefix string (Optional) If provided, all usernames will be prefixed with this value. If not provided, username claims other than ‘email’ are prefixed by the issuer URL to avoid clashes. 
To skip any prefixing, provide the value ‘-’.\n ObservabilityRotation (Appears on: ShootCredentialsRotation) ObservabilityRotation contains information about the observability credential rotation.\n Field Description lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the observability credential rotation was initiated.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the observability credential rotation was successfully completed.\n OpenIDConnectClientAuthentication (Appears on: OIDCConfig) OpenIDConnectClientAuthentication contains configuration for OIDC clients.\n Field Description extraConfig map[string]string (Optional) Extra configuration added to kubeconfig’s auth-provider. Must not be any of idp-issuer-url, client-id, client-secret, idp-certificate-authority, idp-certificate-authority-data, id-token or refresh-token\n secret string (Optional) The client Secret for the OpenID Connect client.\n ProjectMember (Appears on: ProjectSpec) ProjectMember is a member of a project.\n Field Description Subject Kubernetes rbac/v1.Subject (Members of Subject are embedded into this type.) Subject is representing a user name, an email address, or any other identifier of a user, group, or service account that has a certain role.\n role string Role represents the role of this member. IMPORTANT: Be aware that this field will be removed in the v1 version of this API in favor of the roles list. 
TODO: Remove this field in favor of the roles list in v1.\n roles []string (Optional) Roles represents the list of roles of this member.\n ProjectPhase (string alias)\n (Appears on: ProjectStatus) ProjectPhase is a label for the condition of a project at the current time.\nProjectSpec (Appears on: Project) ProjectSpec is the specification of a Project.\n Field Description createdBy Kubernetes rbac/v1.Subject (Optional) CreatedBy is a subject representing a user name, an email address, or any other identifier of a user who created the project. This field is immutable.\n description string (Optional) Description is a human-readable description of what the project is used for.\n owner Kubernetes rbac/v1.Subject (Optional) Owner is a subject representing a user name, an email address, or any other identifier of a user owning the project. IMPORTANT: Be aware that this field will be removed in the v1 version of this API in favor of the owner role. The only way to change the owner will be by moving the owner role. In this API version the only way to change the owner is to use this field. TODO: Remove this field in favor of the owner role in v1.\n purpose string (Optional) Purpose is a human-readable explanation of the project’s purpose.\n members []ProjectMember (Optional) Members is a list of subjects representing a user name, an email address, or any other identifier of a user, group, or service account that has a certain role.\n namespace string (Optional) Namespace is the name of the namespace that has been created for the Project object. A nil value means that Gardener will determine the name of the namespace. 
This field is immutable.\n tolerations ProjectTolerations (Optional) Tolerations contains the tolerations for taints on seed clusters.\n dualApprovalForDeletion []DualApprovalForDeletion (Optional) DualApprovalForDeletion contains configuration for the dual approval concept for resource deletion.\n ProjectStatus (Appears on: Project) ProjectStatus holds the most recently observed status of the project.\n Field Description observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this project.\n phase ProjectPhase Phase is the current phase of the project.\n staleSinceTimestamp Kubernetes meta/v1.Time (Optional) StaleSinceTimestamp contains the timestamp when the project was first discovered to be stale/unused.\n staleAutoDeleteTimestamp Kubernetes meta/v1.Time (Optional) StaleAutoDeleteTimestamp contains the timestamp when the project will be garbage-collected/automatically deleted because it’s stale/unused.\n lastActivityTimestamp Kubernetes meta/v1.Time (Optional) LastActivityTimestamp contains the timestamp from the last activity performed in this project.\n ProjectTolerations (Appears on: ProjectSpec) ProjectTolerations contains the tolerations for taints on seed clusters.\n Field Description defaults []Toleration (Optional) Defaults contains a list of tolerations that are added to the shoots in this project by default.\n whitelist []Toleration (Optional) Whitelist contains a list of tolerations that are allowed to be added to the shoots in this project. Please note that this list may only be added by users having the spec-tolerations-whitelist verb for project resources.\n Provider (Appears on: ShootSpec) Provider contains provider-specific information that is handed over to the provider-specific extension controller.\n Field Description type string Type is the type of the provider. 
This field is immutable.\n controlPlaneConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ControlPlaneConfig contains the provider-specific control plane config blob. Please look up the concrete definition in the documentation of your provider extension.\n infrastructureConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) InfrastructureConfig contains the provider-specific infrastructure config blob. Please look up the concrete definition in the documentation of your provider extension.\n workers []Worker (Optional) Workers is a list of worker groups.\n workersSettings WorkersSettings (Optional) WorkersSettings contains settings for all workers.\n ProxyMode (string alias)\n (Appears on: KubeProxyConfig) ProxyMode available in Linux platform: ‘userspace’ (older, going to be EOL), ‘iptables’ (newer, faster), ‘ipvs’ (newest, better in performance and scalability). As of now only ‘iptables’ and ‘ipvs’ are supported by Gardener. In Linux platform, if the iptables proxy is selected, regardless of how, but the system’s kernel or iptables versions are insufficient, this always falls back to the userspace proxy. IPVS mode will be enabled when proxy mode is set to ‘ipvs’, and the fall back path is firstly iptables and then userspace.\nQuotaSpec (Appears on: Quota) QuotaSpec is the specification of a Quota.\n Field Description clusterLifetimeDays int32 (Optional) ClusterLifetimeDays is the lifetime of a Shoot cluster in days before it will be terminated automatically.\n metrics Kubernetes core/v1.ResourceList Metrics is a list of resources which will be put under constraints.\n scope Kubernetes core/v1.ObjectReference Scope is the scope of the Quota object, either ‘project’, ‘secret’ or ‘workloadidentity’. 
This field is immutable.\n Region (Appears on: CloudProfileSpec) Region contains certain properties of a region.\n Field Description name string Name is a region name.\n zones []AvailabilityZone (Optional) Zones is a list of availability zones in this region.\n labels map[string]string (Optional) Labels is an optional set of key-value pairs that contain certain administrator-controlled labels for this region. It can be used by Gardener administrators/operators to provide additional information about a region, e.g. wrt quality, reliability, access restrictions, etc.\n ResourceData (Appears on: ShootStateSpec) ResourceData holds the data of a resource referred to by an extension controller state.\n Field Description CrossVersionObjectReference Kubernetes autoscaling/v1.CrossVersionObjectReference (Members of CrossVersionObjectReference are embedded into this type.) data k8s.io/apimachinery/pkg/runtime.RawExtension Data of the resource\n ResourceWatchCacheSize (Appears on: WatchCacheSizes) ResourceWatchCacheSize contains configuration of the API server’s watch cache size for one specific resource.\n Field Description apiGroup string (Optional) APIGroup is the API group of the resource for which the watch cache size should be configured. An unset value is used to specify the legacy core API (e.g. for secrets).\n resource string Resource is the name of the resource for which the watch cache size should be configured (in lowercase plural form, e.g. secrets).\n size int32 CacheSize specifies the watch cache size that should be configured for the specified resource.\n SSHAccess (Appears on: WorkersSettings) SSHAccess contains settings regarding ssh access to the worker nodes.\n Field Description enabled bool Enabled indicates whether the SSH access to the worker nodes is ensured to be enabled or disabled in systemd. 
Defaults to true.\n SchedulingProfile (string alias)\n (Appears on: KubeSchedulerConfig) SchedulingProfile is a string alias used for scheduling profile values.\nSecretBindingProvider (Appears on: SecretBinding) SecretBindingProvider defines the provider type of the SecretBinding.\n Field Description type string Type is the type of the provider.\nFor backwards compatibility, the field can contain multiple providers separated by a comma. However the usage of single SecretBinding (hence Secret) for different cloud providers is strongly discouraged.\n SeedBackup (Appears on: SeedSpec) SeedBackup contains the object store configuration for backups for shoot (currently only etcd).\n Field Description provider string Provider is a provider name. This field is immutable.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to BackupBucket resource.\n region string (Optional) Region is a region name. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a Secret object containing the cloud provider credentials for the object store where backups should be stored. 
It should have enough privileges to manipulate the objects as well as buckets.\n SeedDNS (Appears on: SeedSpec) SeedDNS contains DNS-relevant information about this seed cluster.\n Field Description provider SeedDNSProvider (Optional) Provider configures a DNSProvider\n SeedDNSProvider (Appears on: SeedDNS) SeedDNSProvider configures a DNSProvider for Seeds\n Field Description type string Type describes the type of the dns-provider, for example aws-route53\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a Secret object containing cloud provider credentials used for registering external domains.\n SeedNetworks (Appears on: SeedSpec) SeedNetworks contains CIDRs for the pod, service and node networks of a Kubernetes cluster.\n Field Description nodes string (Optional) Nodes is the CIDR of the node network. This field is immutable.\n pods string Pods is the CIDR of the pod network. This field is immutable.\n services string Services is the CIDR of the service network. This field is immutable.\n shootDefaults ShootNetworks (Optional) ShootDefaults contains the default networks CIDRs for shoots.\n blockCIDRs []string (Optional) BlockCIDRs is a list of network addresses that should be blocked for shoot control plane components running in the seed cluster.\n ipFamilies []IPFamily (Optional) IPFamilies specifies the IP protocol versions to use for seed networking. This field is immutable. See https://github.com/gardener/gardener/blob/master/docs/usage/ipv6.md. 
Defaults to [“IPv4”].\n SeedProvider (Appears on: SeedSpec) SeedProvider defines the provider-specific information of this Seed cluster.\n Field Description type string Type is the name of the provider.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to Seed resource.\n region string Region is a name of a region.\n zones []string (Optional) Zones is the list of availability zones the seed cluster is deployed to.\n SeedSelector (Appears on: CloudProfileSpec, ExposureClassScheduling, ShootSpec) SeedSelector contains constraints for selecting seed to be usable for shoots using a profile\n Field Description LabelSelector Kubernetes meta/v1.LabelSelector (Members of LabelSelector are embedded into this type.) (Optional) LabelSelector is optional and can be used to select seeds by their label settings\n providerTypes []string (Optional) Providers is optional and can be used by restricting seeds by their provider type. ‘*’ can be used to enable seeds regardless of their provider type.\n SeedSettingDependencyWatchdog (Appears on: SeedSettings) SeedSettingDependencyWatchdog controls the dependency-watchdog settings for the seed.\n Field Description weeder SeedSettingDependencyWatchdogWeeder (Optional) Weeder controls the weeder settings for the dependency-watchdog for the seed.\n prober SeedSettingDependencyWatchdogProber (Optional) Prober controls the prober settings for the dependency-watchdog for the seed.\n SeedSettingDependencyWatchdogProber (Appears on: SeedSettingDependencyWatchdog) SeedSettingDependencyWatchdogProber controls the prober settings for the dependency-watchdog for the seed.\n Field Description enabled bool Enabled controls whether the probe controller(prober) of the dependency-watchdog should be enabled. 
This controller scales down the kube-controller-manager, machine-controller-manager and cluster-autoscaler of shoot clusters in case their respective kube-apiserver is not reachable via its external ingress in order to avoid melt-down situations.\n SeedSettingDependencyWatchdogWeeder (Appears on: SeedSettingDependencyWatchdog) SeedSettingDependencyWatchdogWeeder controls the weeder settings for the dependency-watchdog for the seed.\n Field Description enabled bool Enabled controls whether the endpoint controller(weeder) of the dependency-watchdog should be enabled. This controller helps to alleviate the delay where control plane components remain unavailable by finding the respective pods in CrashLoopBackoff status and restarting them once their dependants become ready and available again.\n SeedSettingExcessCapacityReservation (Appears on: SeedSettings) SeedSettingExcessCapacityReservation controls the excess capacity reservation for shoot control planes in the seed.\n Field Description enabled bool (Optional) Enabled controls whether the default excess capacity reservation should be enabled. 
When not specified, the functionality is enabled.\n configs []SeedSettingExcessCapacityReservationConfig (Optional) Configs configures excess capacity reservation deployments for shoot control planes in the seed.\n SeedSettingExcessCapacityReservationConfig (Appears on: SeedSettingExcessCapacityReservation) SeedSettingExcessCapacityReservationConfig configures excess capacity reservation deployments for shoot control planes in the seed.\n Field Description resources Kubernetes core/v1.ResourceList Resources specify the resource requests and limits of the excess-capacity-reservation pod.\n nodeSelector map[string]string (Optional) NodeSelector specifies the node where the excess-capacity-reservation pod should run.\n tolerations []Kubernetes core/v1.Toleration (Optional) Tolerations specify the tolerations for the excess-capacity-reservation pod.\n SeedSettingLoadBalancerServices (Appears on: SeedSettings) SeedSettingLoadBalancerServices controls certain settings for services of type load balancer that are created in the seed.\n Field Description annotations map[string]string (Optional) Annotations is a map of annotations that will be injected/merged into every load balancer service object.\n externalTrafficPolicy Kubernetes core/v1.ServiceExternalTrafficPolicy (Optional) ExternalTrafficPolicy describes how nodes distribute service traffic they receive on one of the service’s “externally-facing” addresses. Defaults to “Cluster”.\n zones []SeedSettingLoadBalancerServicesZones (Optional) Zones controls settings, which are specific to the single-zone load balancers in a multi-zonal setup. Can be empty for single-zone seeds. Each specified zone has to relate to one of the zones in seed.spec.provider.zones.\n proxyProtocol LoadBalancerServicesProxyProtocol (Optional) ProxyProtocol controls whether ProxyProtocol is (optionally) allowed for the load balancer services. 
Defaults to nil, which is equivalent to not allowing ProxyProtocol.\n SeedSettingLoadBalancerServicesZones (Appears on: SeedSettingLoadBalancerServices) SeedSettingLoadBalancerServicesZones controls settings, which are specific to the single-zone load balancers in a multi-zonal setup.\n Field Description name string Name is the name of the zone as specified in seed.spec.provider.zones.\n annotations map[string]string (Optional) Annotations is a map of annotations that will be injected/merged into the zone-specific load balancer service object.\n externalTrafficPolicy Kubernetes core/v1.ServiceExternalTrafficPolicy (Optional) ExternalTrafficPolicy describes how nodes distribute service traffic they receive on one of the service’s “externally-facing” addresses. Defaults to “Cluster”.\n proxyProtocol LoadBalancerServicesProxyProtocol (Optional) ProxyProtocol controls whether ProxyProtocol is (optionally) allowed for the load balancer services. Defaults to nil, which is equivalent to not allowing ProxyProtocol.\n SeedSettingScheduling (Appears on: SeedSettings) SeedSettingScheduling controls settings for scheduling decisions for the seed.\n Field Description visible bool Visible controls whether the gardener-scheduler shall consider this seed when scheduling shoots. Invisible seeds are not considered by the scheduler.\n SeedSettingTopologyAwareRouting (Appears on: SeedSettings) SeedSettingTopologyAwareRouting controls certain settings for topology-aware traffic routing in the seed. See https://github.com/gardener/gardener/blob/master/docs/operations/topology_aware_routing.md.\n Field Description enabled bool Enabled controls whether certain Services deployed in the seed cluster should be topology-aware. 
These Services are etcd-main-client, etcd-events-client, kube-apiserver, gardener-resource-manager and vpa-webhook.\n SeedSettingVerticalPodAutoscaler (Appears on: SeedSettings) SeedSettingVerticalPodAutoscaler controls certain settings for the vertical pod autoscaler components deployed in the seed.\n Field Description enabled bool Enabled controls whether the VPA components shall be deployed into the garden namespace in the seed cluster. It is enabled by default because Gardener heavily relies on a VPA being deployed. You should only disable this if your seed cluster already has another, manually/custom managed VPA deployment.\n SeedSettings (Appears on: SeedSpec) SeedSettings contains certain settings for this seed cluster.\n Field Description excessCapacityReservation SeedSettingExcessCapacityReservation (Optional) ExcessCapacityReservation controls the excess capacity reservation for shoot control planes in the seed.\n scheduling SeedSettingScheduling (Optional) Scheduling controls settings for scheduling decisions for the seed.\n loadBalancerServices SeedSettingLoadBalancerServices (Optional) LoadBalancerServices controls certain settings for services of type load balancer that are created in the seed.\n verticalPodAutoscaler SeedSettingVerticalPodAutoscaler (Optional) VerticalPodAutoscaler controls certain settings for the vertical pod autoscaler components deployed in the seed.\n dependencyWatchdog SeedSettingDependencyWatchdog (Optional) DependencyWatchdog controls certain settings for the dependency-watchdog components deployed in the seed.\n topologyAwareRouting SeedSettingTopologyAwareRouting (Optional) TopologyAwareRouting controls certain settings for topology-aware traffic routing in the seed. 
See https://github.com/gardener/gardener/blob/master/docs/operations/topology_aware_routing.md.\n SeedSpec (Appears on: Seed, SeedTemplate) SeedSpec is the specification of a Seed.\n Field Description backup SeedBackup (Optional) Backup holds the object store configuration for the backups of shoot (currently only etcd). If it is not specified, then there won’t be any backups taken for shoots associated with this seed. If backup field is present in seed, then backups of the etcd from shoot control plane will be stored under the configured object store.\n dns SeedDNS DNS contains DNS-relevant information about this seed cluster.\n networks SeedNetworks Networks defines the pod, service and worker network of the Seed cluster.\n provider SeedProvider Provider defines the provider type and region for this Seed cluster.\n taints []SeedTaint (Optional) Taints describes taints on the seed.\n volume SeedVolume (Optional) Volume contains settings for persistentvolumes created in the seed cluster.\n settings SeedSettings (Optional) Settings contains certain settings for this seed cluster.\n ingress Ingress (Optional) Ingress configures Ingress specific settings of the Seed cluster. This field is immutable.\n SeedStatus (Appears on: Seed) SeedStatus is the status of a Seed.\n Field Description gardener Gardener (Optional) Gardener holds information about the Gardener which last acted on the Shoot.\n kubernetesVersion string (Optional) KubernetesVersion is the Kubernetes version of the seed cluster.\n conditions []Condition (Optional) Conditions represents the latest available observations of a Seed’s current state.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this Seed. It corresponds to the Seed’s generation, which is updated on mutation by the API Server.\n clusterIdentity string (Optional) ClusterIdentity is the identity of the Seed cluster. 
This field is immutable.\n capacity Kubernetes core/v1.ResourceList (Optional) Capacity represents the total resources of a seed.\n allocatable Kubernetes core/v1.ResourceList (Optional) Allocatable represents the resources of a seed that are available for scheduling. Defaults to Capacity.\n clientCertificateExpirationTimestamp Kubernetes meta/v1.Time (Optional) ClientCertificateExpirationTimestamp is the timestamp at which gardenlet’s client certificate expires.\n lastOperation LastOperation (Optional) LastOperation holds information about the last operation on the Seed.\n SeedTaint (Appears on: SeedSpec) SeedTaint describes a taint on a seed.\n Field Description key string Key is the taint key to be applied to a seed.\n value string (Optional) Value is the taint value corresponding to the taint key.\n SeedTemplate SeedTemplate is a template for creating a Seed object.\n Field Description metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec SeedSpec (Optional) Specification of the desired behavior of the Seed.\n backup SeedBackup (Optional) Backup holds the object store configuration for the backups of shoot (currently only etcd). If it is not specified, then there won’t be any backups taken for shoots associated with this seed. 
If backup field is present in seed, then backups of the etcd from shoot control plane will be stored under the configured object store.\n dns SeedDNS DNS contains DNS-relevant information about this seed cluster.\n networks SeedNetworks Networks defines the pod, service and worker network of the Seed cluster.\n provider SeedProvider Provider defines the provider type and region for this Seed cluster.\n taints []SeedTaint (Optional) Taints describes taints on the seed.\n volume SeedVolume (Optional) Volume contains settings for persistentvolumes created in the seed cluster.\n settings SeedSettings (Optional) Settings contains certain settings for this seed cluster.\n ingress Ingress (Optional) Ingress configures Ingress specific settings of the Seed cluster. This field is immutable.\n SeedVolume (Appears on: SeedSpec) SeedVolume contains settings for persistentvolumes created in the seed cluster.\n Field Description minimumSize k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) MinimumSize defines the minimum size that should be used for PVCs in the seed.\n providers []SeedVolumeProvider (Optional) Providers is a list of storage class provisioner types for the seed.\n SeedVolumeProvider (Appears on: SeedVolume) SeedVolumeProvider is a storage class provisioner type.\n Field Description purpose string Purpose is the purpose of this provider.\n name string Name is the name of the storage class provisioner type.\n ServiceAccountConfig (Appears on: KubeAPIServerConfig) ServiceAccountConfig is the kube-apiserver configuration for service accounts.\n Field Description issuer string (Optional) Issuer is the identifier of the service account token issuer. The issuer will assert this identifier in “iss” claim of issued tokens. This value is used to generate new service account tokens. This value is a string or URI. 
Defaults to URI of the API server.\n extendTokenExpiration bool (Optional) ExtendTokenExpiration turns on projected service account expiration extension during token generation, which helps safe transition from legacy token to bound service account token feature. If this flag is enabled, admission injected tokens would be extended up to 1 year to prevent unexpected failure during transition, ignoring value of service-account-max-token-expiration.\n maxTokenExpiration Kubernetes meta/v1.Duration (Optional) MaxTokenExpiration is the maximum validity duration of a token created by the service account token issuer. If an otherwise valid TokenRequest with a validity duration larger than this value is requested, a token will be issued with a validity duration of this value. This field must be within [30d,90d].\n acceptedIssuers []string (Optional) AcceptedIssuers is an additional set of issuers that are used to determine which service account tokens are accepted. These values are not used to generate new service account tokens. 
Only useful when service account tokens are also issued by another external system or a change of the current issuer that is used for generating tokens is being performed.\n ServiceAccountKeyRotation (Appears on: ShootCredentialsRotation) ServiceAccountKeyRotation contains information about the service account key credential rotation.\n Field Description phase CredentialsRotationPhase Phase describes the phase of the service account key credential rotation.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the service account key credential rotation was successfully completed.\n lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the service account key credential rotation was initiated.\n lastInitiationFinishedTime Kubernetes meta/v1.Time (Optional) LastInitiationFinishedTime is the recent time when the service account key credential rotation initiation was completed.\n lastCompletionTriggeredTime Kubernetes meta/v1.Time (Optional) LastCompletionTriggeredTime is the recent time when the service account key credential rotation completion was triggered.\n ShootAdvertisedAddress (Appears on: ShootStatus) ShootAdvertisedAddress contains information for the shoot’s Kube API server.\n Field Description name string Name of the advertised address. e.g. external\n url string The URL of the API Server. e.g. 
https://api.foo.bar or https://1.2.3.4\n ShootCredentials (Appears on: ShootStatus) ShootCredentials contains information about the shoot credentials.\n Field Description rotation ShootCredentialsRotation (Optional) Rotation contains information about the credential rotations.\n ShootCredentialsRotation (Appears on: ShootCredentials) ShootCredentialsRotation contains information about the rotation of credentials.\n Field Description certificateAuthorities CARotation (Optional) CertificateAuthorities contains information about the certificate authority credential rotation.\n kubeconfig ShootKubeconfigRotation (Optional) Kubeconfig contains information about the kubeconfig credential rotation.\n sshKeypair ShootSSHKeypairRotation (Optional) SSHKeypair contains information about the ssh-keypair credential rotation.\n observability ObservabilityRotation (Optional) Observability contains information about the observability credential rotation.\n serviceAccountKey ServiceAccountKeyRotation (Optional) ServiceAccountKey contains information about the service account key credential rotation.\n etcdEncryptionKey ETCDEncryptionKeyRotation (Optional) ETCDEncryptionKey contains information about the ETCD encryption key credential rotation.\n ShootKubeconfigRotation (Appears on: ShootCredentialsRotation) ShootKubeconfigRotation contains information about the kubeconfig credential rotation.\n Field Description lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the kubeconfig credential rotation was initiated.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the kubeconfig credential rotation was successfully completed.\n ShootMachineImage (Appears on: Machine) ShootMachineImage defines the name and the version of the shoot’s machine image in any environment. 
Has to be defined in the respective CloudProfile.\n Field Description name string Name is the name of the image.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the shoot’s individual configuration passed to an extension resource.\n version string (Optional) Version is the version of the shoot’s image. If version is not provided, it will be defaulted to the latest version from the CloudProfile.\n ShootNetworks (Appears on: SeedNetworks) ShootNetworks contains the default networks CIDRs for shoots.\n Field Description pods string (Optional) Pods is the CIDR of the pod network.\n services string (Optional) Services is the CIDR of the service network.\n ShootPurpose (string alias)\n (Appears on: ShootSpec) ShootPurpose is a type alias for string.\nShootSSHKeypairRotation (Appears on: ShootCredentialsRotation) ShootSSHKeypairRotation contains information about the ssh-keypair credential rotation.\n Field Description lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the ssh-keypair credential rotation was initiated.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the ssh-keypair credential rotation was successfully completed.\n ShootSpec (Appears on: Shoot, ShootTemplate) ShootSpec is the specification of a Shoot.\n Field Description addons Addons (Optional) Addons contains information about enabled/disabled addons and their configuration.\n cloudProfileName string (Optional) CloudProfileName is a name of a CloudProfile object. 
This field will be deprecated soon, use CloudProfile instead.\n dns DNS (Optional) DNS contains information about the DNS settings of the Shoot.\n extensions []Extension (Optional) Extensions contain type and provider information for Shoot extensions.\n hibernation Hibernation (Optional) Hibernation contains information whether the Shoot is suspended or not.\n kubernetes Kubernetes Kubernetes contains the version and configuration settings of the control plane components.\n networking Networking (Optional) Networking contains information about cluster networking such as CNI Plugin type, CIDRs, …etc.\n maintenance Maintenance (Optional) Maintenance contains information about the time window for maintenance operations and which operations should be performed.\n monitoring Monitoring (Optional) Monitoring contains information about custom monitoring configurations for the shoot.\n provider Provider Provider contains all provider-specific and provider-relevant information.\n purpose ShootPurpose (Optional) Purpose is the purpose class for this cluster.\n region string Region is a name of a region. This field is immutable.\n secretBindingName string (Optional) SecretBindingName is the name of a SecretBinding that has a reference to the provider secret. The credentials inside the provider secret will be used to create the shoot in the respective account. The field is mutually exclusive with CredentialsBindingName. 
This field is immutable.\n seedName string (Optional) SeedName is the name of the seed cluster that runs the control plane of the Shoot.\n seedSelector SeedSelector (Optional) SeedSelector is an optional selector which must match a seed’s labels for the shoot to be scheduled on that seed.\n resources []NamedResourceReference (Optional) Resources holds a list of named resource references that can be referred to in extension configs by their names.\n tolerations []Toleration (Optional) Tolerations contains the tolerations for taints on seed clusters.\n exposureClassName string (Optional) ExposureClassName is the optional name of an exposure class to apply a control plane endpoint exposure strategy. This field is immutable.\n systemComponents SystemComponents (Optional) SystemComponents contains the settings of system components in the control or data plane of the Shoot cluster.\n controlPlane ControlPlane (Optional) ControlPlane contains general settings for the control plane of the shoot.\n schedulerName string (Optional) SchedulerName is the name of the responsible scheduler which schedules the shoot. If not specified, the default scheduler takes over. This field is immutable.\n cloudProfile CloudProfileReference (Optional) CloudProfile contains a reference to a CloudProfile or a NamespacedCloudProfile.\n credentialsBindingName string (Optional) CredentialsBindingName is the name of a CredentialsBinding that has a reference to the provider credentials. The credentials will be used to create the shoot in the respective account. 
The field is mutually exclusive with SecretBindingName.\n ShootStateSpec (Appears on: ShootState) ShootStateSpec is the specification of the ShootState.\n Field Description gardener []GardenerResourceData (Optional) Gardener holds the data required to generate resources deployed by the gardenlet\n extensions []ExtensionResourceState (Optional) Extensions holds the state of custom resources reconciled by extension controllers in the seed\n resources []ResourceData (Optional) Resources holds the data of resources referred to by extension controller states\n ShootStatus (Appears on: Shoot) ShootStatus holds the most recently observed status of the Shoot cluster.\n Field Description conditions []Condition (Optional) Conditions represents the latest available observations of a Shoot’s current state.\n constraints []Condition (Optional) Constraints represents conditions of a Shoot’s current state that constrain some operations on it.\n gardener Gardener Gardener holds information about the Gardener which last acted on the Shoot.\n hibernated bool IsHibernated indicates whether the Shoot is currently hibernated.\n lastOperation LastOperation (Optional) LastOperation holds information about the last operation on the Shoot.\n lastErrors []LastError (Optional) LastErrors holds information about the last occurred error(s) during an operation.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this Shoot. It corresponds to the Shoot’s generation, which is updated on mutation by the API Server.\n retryCycleStartTime Kubernetes meta/v1.Time (Optional) RetryCycleStartTime is the start time of the last retry cycle (used to determine how often an operation must be retried until we give up).\n seedName string (Optional) SeedName is the name of the seed cluster that runs the control plane of the Shoot. This value is only written after a successful create/reconcile operation. 
It will be used when control planes are moved between Seeds.\n technicalID string TechnicalID is the name that is used for creating the Seed namespace, the infrastructure resources, and basically everything that is related to this particular Shoot. This field is immutable.\n uid k8s.io/apimachinery/pkg/types.UID UID is a unique identifier for the Shoot cluster to avoid portability between Kubernetes clusters. It is used to compute unique hashes. This field is immutable.\n clusterIdentity string (Optional) ClusterIdentity is the identity of the Shoot cluster. This field is immutable.\n advertisedAddresses []ShootAdvertisedAddress (Optional) List of addresses that are relevant to the shoot. These include the Kube API server address and also the service account issuer.\n migrationStartTime Kubernetes meta/v1.Time (Optional) MigrationStartTime is the time when a migration to a different seed was initiated.\n credentials ShootCredentials (Optional) Credentials contains information about the shoot credentials.\n lastHibernationTriggerTime Kubernetes meta/v1.Time (Optional) LastHibernationTriggerTime indicates the last time when the hibernation controller managed to change the hibernation settings of the cluster\n lastMaintenance LastMaintenance (Optional) LastMaintenance holds information about the last maintenance operations on the Shoot.\n encryptedResources []string (Optional) EncryptedResources is the list of resources in the Shoot which are currently encrypted. Secrets are encrypted by default and are not part of the list. 
See https://github.com/gardener/gardener/blob/master/docs/usage/etcd_encryption_config.md for more details.\n networking NetworkingStatus (Optional) Networking contains information about cluster networking such as CIDRs.\n ShootTemplate ShootTemplate is a template for creating a Shoot object.\n Field Description metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ShootSpec (Optional) Specification of the desired behavior of the Shoot.\n addons Addons (Optional) Addons contains information about enabled/disabled addons and their configuration.\n cloudProfileName string (Optional) CloudProfileName is a name of a CloudProfile object. This field will be deprecated soon, use CloudProfile instead.\n dns DNS (Optional) DNS contains information about the DNS settings of the Shoot.\n extensions []Extension (Optional) Extensions contain type and provider information for Shoot extensions.\n hibernation Hibernation (Optional) Hibernation contains information whether the Shoot is suspended or not.\n kubernetes Kubernetes Kubernetes contains the version and configuration settings of the control plane components.\n networking Networking (Optional) Networking contains information about cluster networking such as CNI Plugin type, CIDRs, …etc.\n maintenance Maintenance (Optional) Maintenance contains information about the time window for maintenance operations and which operations should be performed.\n monitoring Monitoring (Optional) Monitoring contains information about custom monitoring configurations for the shoot.\n provider Provider Provider contains all provider-specific and provider-relevant information.\n purpose ShootPurpose (Optional) Purpose is the purpose class for this cluster.\n region string Region is a name of a region. 
This field is immutable.\n secretBindingName string (Optional) SecretBindingName is the name of a SecretBinding that has a reference to the provider secret. The credentials inside the provider secret will be used to create the shoot in the respective account. The field is mutually exclusive with CredentialsBindingName. This field is immutable.\n seedName string (Optional) SeedName is the name of the seed cluster that runs the control plane of the Shoot.\n seedSelector SeedSelector (Optional) SeedSelector is an optional selector which must match a seed’s labels for the shoot to be scheduled on that seed.\n resources []NamedResourceReference (Optional) Resources holds a list of named resource references that can be referred to in extension configs by their names.\n tolerations []Toleration (Optional) Tolerations contains the tolerations for taints on seed clusters.\n exposureClassName string (Optional) ExposureClassName is the optional name of an exposure class to apply a control plane endpoint exposure strategy. This field is immutable.\n systemComponents SystemComponents (Optional) SystemComponents contains the settings of system components in the control or data plane of the Shoot cluster.\n controlPlane ControlPlane (Optional) ControlPlane contains general settings for the control plane of the shoot.\n schedulerName string (Optional) SchedulerName is the name of the responsible scheduler which schedules the shoot. If not specified, the default scheduler takes over. This field is immutable.\n cloudProfile CloudProfileReference (Optional) CloudProfile contains a reference to a CloudProfile or a NamespacedCloudProfile.\n credentialsBindingName string (Optional) CredentialsBindingName is the name of a CredentialsBinding that has a reference to the provider credentials. The credentials will be used to create the shoot in the respective account. 
The field is mutually exclusive with SecretBindingName.\n StructuredAuthentication (Appears on: KubeAPIServerConfig) StructuredAuthentication contains authentication config for kube-apiserver.\n Field Description configMapName string ConfigMapName is the name of the ConfigMap in the project namespace which contains AuthenticationConfiguration for the kube-apiserver.\n SwapBehavior (string alias)\n (Appears on: MemorySwapConfiguration) SwapBehavior configures swap memory available to container workloads\nSystemComponents (Appears on: ShootSpec) SystemComponents contains the settings of system components in the control or data plane of the Shoot cluster.\n Field Description coreDNS CoreDNS (Optional) CoreDNS contains the settings of the Core DNS components running in the data plane of the Shoot cluster.\n nodeLocalDNS NodeLocalDNS (Optional) NodeLocalDNS contains the settings of the node local DNS components running in the data plane of the Shoot cluster.\n Toleration (Appears on: ExposureClassScheduling, ProjectTolerations, ShootSpec) Toleration is a toleration for a seed taint.\n Field Description key string Key is the toleration key to be applied to a project or shoot.\n value string (Optional) Value is the toleration value corresponding to the toleration key.\n VersionClassification (string alias)\n (Appears on: ExpirableVersion) VersionClassification is the logical state of a version.\nVerticalPodAutoscaler (Appears on: Kubernetes) VerticalPodAutoscaler contains the configuration flags for the Kubernetes vertical pod autoscaler.\n Field Description enabled bool Enabled specifies whether the Kubernetes VPA shall be enabled for the shoot cluster.\n evictAfterOOMThreshold Kubernetes meta/v1.Duration (Optional) EvictAfterOOMThreshold defines the threshold that will lead to pod eviction in case it OOMed in less than the given threshold since its start and if it has only one container (default: 10m0s).\n evictionRateBurst int32 (Optional) EvictionRateBurst defines the 
burst of pods that can be evicted (default: 1)\n evictionRateLimit float64 (Optional) EvictionRateLimit defines the number of pods that can be evicted per second. A rate limit set to 0 or -1 will disable the rate limiter (default: -1).\n evictionTolerance float64 (Optional) EvictionTolerance defines the fraction of replica count that can be evicted for update in case more than one pod can be evicted (default: 0.5).\n recommendationMarginFraction float64 (Optional) RecommendationMarginFraction is the fraction of usage added as the safety margin to the recommended request (default: 0.15).\n updaterInterval Kubernetes meta/v1.Duration (Optional) UpdaterInterval is the interval how often the updater should run (default: 1m0s).\n recommenderInterval Kubernetes meta/v1.Duration (Optional) RecommenderInterval is the interval how often metrics should be fetched (default: 1m0s).\n targetCPUPercentile float64 (Optional) TargetCPUPercentile is the usage percentile that will be used as a base for CPU target recommendation. Doesn’t affect CPU lower bound, CPU upper bound nor memory recommendations. (default: 0.9)\n recommendationLowerBoundCPUPercentile float64 (Optional) RecommendationLowerBoundCPUPercentile is the usage percentile that will be used for the lower bound on CPU recommendation. (default: 0.5)\n recommendationUpperBoundCPUPercentile float64 (Optional) RecommendationUpperBoundCPUPercentile is the usage percentile that will be used for the upper bound on CPU recommendation. (default: 0.95)\n targetMemoryPercentile float64 (Optional) TargetMemoryPercentile is the usage percentile that will be used as a base for memory target recommendation. Doesn’t affect memory lower bound nor memory upper bound. (default: 0.9)\n recommendationLowerBoundMemoryPercentile float64 (Optional) RecommendationLowerBoundMemoryPercentile is the usage percentile that will be used for the lower bound on memory recommendation. 
(default: 0.5)\n recommendationUpperBoundMemoryPercentile float64 (Optional) RecommendationUpperBoundMemoryPercentile is the usage percentile that will be used for the upper bound on memory recommendation. (default: 0.95)\n Volume (Appears on: Worker) Volume contains information about the volume type, size, and encryption.\n Field Description name string (Optional) Name of the volume to make it referencable.\n type string (Optional) Type is the type of the volume.\n size string VolumeSize is the size of the volume.\n encrypted bool (Optional) Encrypted determines if the volume should be encrypted.\n VolumeType (Appears on: CloudProfileSpec, NamespacedCloudProfileSpec) VolumeType contains certain properties of a volume type.\n Field Description class string Class is the class of the volume type.\n name string Name is the name of the volume type.\n usable bool (Optional) Usable defines if the volume type can be used for shoot clusters.\n minSize k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) MinSize is the minimal supported storage size.\n WatchCacheSizes (Appears on: KubeAPIServerConfig) WatchCacheSizes contains configuration of the API server’s watch cache sizes.\n Field Description default int32 (Optional) Default configures the default watch cache size of the kube-apiserver (flag --default-watch-cache-size, defaults to 100). See: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/\n resources []ResourceWatchCacheSize (Optional) Resources configures the watch cache size of the kube-apiserver per resource (flag --watch-cache-sizes). 
See: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/\n Worker (Appears on: Provider) Worker is the base definition of a worker group.\n Field Description annotations map[string]string (Optional) Annotations is a map of key/value pairs for annotations for all the Node objects in this worker pool.\n caBundle string (Optional) CABundle is a certificate bundle which will be installed onto every machine of this worker pool.\n cri CRI (Optional) CRI contains configurations of CRI support of every machine in the worker pool. Defaults to a CRI with name containerd.\n kubernetes WorkerKubernetes (Optional) Kubernetes contains configuration for Kubernetes components related to this worker pool.\n labels map[string]string (Optional) Labels is a map of key/value pairs for labels for all the Node objects in this worker pool.\n name string Name is the name of the worker group.\n machine Machine Machine contains information about the machine type and image.\n maximum int32 Maximum is the maximum number of machines to create. This value is divided by the number of configured zones for a fair distribution.\n minimum int32 Minimum is the minimum number of machines to create. This value is divided by the number of configured zones for a fair distribution.\n maxSurge k8s.io/apimachinery/pkg/util/intstr.IntOrString (Optional) MaxSurge is maximum number of machines that are created during an update. This value is divided by the number of configured zones for a fair distribution.\n maxUnavailable k8s.io/apimachinery/pkg/util/intstr.IntOrString (Optional) MaxUnavailable is the maximum number of machines that can be unavailable during an update. 
This value is divided by the number of configured zones for a fair distribution.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the provider-specific configuration for this worker pool.\n taints []Kubernetes core/v1.Taint (Optional) Taints is a list of taints for all the Node objects in this worker pool.\n volume Volume (Optional) Volume contains information about the volume type and size.\n dataVolumes []DataVolume (Optional) DataVolumes contains a list of additional worker volumes.\n kubeletDataVolumeName string (Optional) KubeletDataVolumeName contains the name of a dataVolume that should be used for storing kubelet state.\n zones []string (Optional) Zones is a list of availability zones that are used to evenly distribute this worker pool. Optional as not every provider may support availability zones.\n systemComponents WorkerSystemComponents (Optional) SystemComponents contains configuration for system components related to this worker pool\n machineControllerManager MachineControllerManagerSettings (Optional) MachineControllerManagerSettings contains configurations for different worker-pools. Eg. MachineDrainTimeout, MachineHealthTimeout.\n sysctls map[string]string (Optional) Sysctls is a map of kernel settings to apply on all machines in this worker pool.\n clusterAutoscaler ClusterAutoscalerOptions (Optional) ClusterAutoscaler contains the cluster autoscaler configurations for the worker pool.\n WorkerKubernetes (Appears on: Worker) WorkerKubernetes contains configuration for Kubernetes components related to this worker pool.\n Field Description kubelet KubeletConfig (Optional) Kubelet contains configuration settings for all kubelets of this worker pool. If set, all spec.kubernetes.kubelet settings will be overwritten for this worker pool (no merge of settings).\n version string (Optional) Version is the semantic Kubernetes version to use for the Kubelet in this Worker Group. 
If not specified the kubelet version is derived from the global shoot cluster kubernetes version. version must be equal or lower than the version of the shoot kubernetes version. Only one minor version difference to other worker groups and global kubernetes version is allowed.\n WorkerSystemComponents (Appears on: Worker) WorkerSystemComponents contains configuration for system components related to this worker pool\n Field Description allow bool Allow determines whether the pool should be allowed to host system components or not (defaults to true)\n WorkersSettings (Appears on: Provider) WorkersSettings contains settings for all workers.\n Field Description sshAccess SSHAccess (Optional) SSHAccess contains settings regarding ssh access to the worker nodes.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n core.gardener.cloud/v1beta1 core.gardener.cloud/v1beta1 …","ref":"/docs/gardener/api-reference/core/","tags":"","title":"Core"},{"body":"Packages:\n core.gardener.cloud/v1 core.gardener.cloud/v1 Package v1 is a version of the API.\nResource Types: ControllerDeployment ControllerDeployment ControllerDeployment contains information about how this controller is deployed.\n Field Description apiVersion string core.gardener.cloud/v1 kind string ControllerDeployment metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
helm HelmControllerDeployment (Optional) Helm configures that an extension controller is deployed using helm.\n HelmControllerDeployment (Appears on: ControllerDeployment) HelmControllerDeployment configures how an extension controller is deployed using helm.\n Field Description rawChart []byte (Optional) RawChart is the base64-encoded, gzip’ed, tar’ed extension controller chart.\n values Kubernetes apiextensions/v1.JSON (Optional) Values are the chart values.\n ociRepository OCIRepository (Optional) OCIRepository defines where to pull the chart.\n OCIRepository (Appears on: HelmControllerDeployment) OCIRepository configures where to pull an OCI Artifact, that could contain for example a Helm Chart.\n Field Description ref string (Optional) Ref is the full artifact Ref and takes precedence over all other fields.\n repository string (Optional) Repository is a reference to an OCI artifact repository.\n tag string (Optional) Tag is the image tag to pull.\n digest string (Optional) Digest of the image to pull, takes precedence over tag. The value should be in the format ‘sha256:’.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n core.gardener.cloud/v1 core.gardener.cloud/v1 Package …","ref":"/docs/gardener/api-reference/core-v1/","tags":"","title":"Core V1"},{"body":"Gardener Extension for CoreOS Container Linux \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller operates on the OperatingSystemConfig resource in the extensions.gardener.cloud/v1alpha1 API group. It supports CoreOS Container Linux and Flatcar Container Linux (“a friendly fork of CoreOS Container Linux”).\nThe controller manages those objects that are requesting CoreOS Container Linux configuration (.spec.type=coreos) or Flatcar Container Linux configuration (.spec.type=flatcar):\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: coreos units: ... files: ... Please find a concrete example in the example folder.\nAfter reconciliation the resulting data will be stored in a secret within the same namespace (as the config itself might contain confidential data). The name of the secret will be written into the resource’s .status field:\n... status: ... cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default command: /usr/bin/coreos-cloudinit -from-file=\u003cpath\u003e units: - docker-monitor.service - kubelet-monitor.service - kubelet.service The secret has one data key cloud_config that stores the generation.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. 
Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for the CoreOS/FlatCar Container Linux operating system","excerpt":"Gardener extension controller for the CoreOS/FlatCar Container Linux …","ref":"/docs/extensions/os-extensions/gardener-extension-os-coreos/","tags":"","title":"CoreOS/FlatCar OS"},{"body":"Create a Shoot Cluster As you have already prepared an example Shoot manifest in the steps described in the development documentation, please open another Terminal pane/window with the KUBECONFIG environment variable pointing to the Garden development cluster and send the manifest to the Kubernetes API server:\nkubectl apply -f your-shoot-aws.yaml You should see that Gardener has immediately picked up your manifest and has started to deploy the Shoot cluster.\nIn order to investigate what is happening in the Seed cluster, please download its proper Kubeconfig yourself (see next paragraph). The namespace of the Shoot cluster in the Seed cluster will look like that: shoot-johndoe-johndoe-1, whereas the first johndoe is your namespace in the Garden cluster (also called “project”) and the johndoe-1 suffix is the actual name of the Shoot cluster.\nTo connect to the newly created Shoot cluster, you must download its Kubeconfig as well. 
Please connect to the proper Seed cluster, navigate to the Shoot namespace, and download the Kubeconfig from the kubecfg secret in that namespace.\nDelete a Shoot Cluster In order to delete your cluster, you have to set an annotation confirming the deletion first, and trigger the deletion after that. You can use the prepared delete shoot script which takes the Shoot name as first parameter. The namespace can be specified by the second parameter, but it is optional. If you don’t state it, it defaults to your namespace (the username you are logged in with to your machine).\n./hack/usage/delete shoot johndoe-1 johndoe (the hack bash script can be found at GitHub)\nConfigure a Shoot Cluster Alert Receiver The receiver of the Shoot alerts can be configured from the .spec.monitoring.alerting.emailReceivers section in the Shoot specification. The value of the field has to be a list of valid mail addresses.\nThe alerting for the Shoot clusters is handled by the Prometheus Alertmanager. The Alertmanager will be deployed next to the control plane when the Shoot resource specifies .spec.monitoring.alerting.emailReceivers and if an SMTP secret exists.\nIf the field gets removed then the Alertmanager will also be removed during the next reconciliation of the cluster. The opposite is also valid if the field is added to an existing cluster.\n","categories":"","description":"","excerpt":"Create a Shoot Cluster As you have already prepared an example Shoot …","ref":"/docs/guides/administer-shoots/create-delete-shoot/","tags":"","title":"Create / Delete a Shoot Cluster"},{"body":"Overview Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on Alibaba Cloud.\nPrerequisites You have created an Alibaba Cloud account. You have access to the Gardener dashboard and have permissions to create projects. 
Steps Go to the Gardener dashboard and create a project.\n To be able to add shoot clusters to this project, you must first create a technical user on Alibaba Cloud with sufficient permissions.\n Choose Secrets, then the plus icon and select AliCloud.\n To copy the policy for Alibaba Cloud from the Gardener dashboard, click on the help icon for Alibaba Cloud secrets, and choose copy .\n Create a custom policy in Alibaba Cloud:\n Log on to your Alibaba account and choose RAM \u003e Permissions \u003e Policies.\n Enter the name of your policy.\n Select Script.\n Paste the policy that you copied from the Gardener dashboard to this custom policy.\n Choose OK.\n In the Alibaba Cloud console, create a new technical user:\n Choose RAM \u003e Users.\n Choose Create User.\n Enter a logon and display name for your user.\n Select Open API Access.\n Choose OK.\n After the user is created, AccessKeyId and AccessKeySecret are generated and displayed. Remember to save them. The AccessKey is used later to create secrets for Gardener.\n Assign the policy you created to the technical user:\n Choose RAM \u003e Permissions \u003e Grants.\n Choose Grant Permission.\n Select Alibaba Cloud Account.\n Assign the policy you’ve created before to the technical user.\n Create your secret.\n Type the name of your secret. Copy and paste the Access Key ID and Secret Access Key you saved when you created the technical user on Alibaba Cloud. Choose Add secret. 
After completing these steps, you should see your newly created secret in the Infrastructure Secrets section.\n To create a new cluster, choose Clusters and then the plus sign in the upper right corner.\n In the Create Cluster section:\n Select AliCloud in the Infrastructure tab.\n Type the name of your cluster in the Cluster Details tab.\n Choose the secret you created before in the Infrastructure Details tab.\n Choose Create.\n Wait for your cluster to get created.\n Result After completing the steps in this tutorial, you will be able to see and download the kubeconfig of your cluster. With it you can create shoot clusters on Alibaba Cloud. Persistent volumes in your shoot cluster must be at least 20 GiB in size. If you choose smaller sizes in your Kubernetes PV definition, the allocation of cloud disk space on Alibaba Cloud fails.\n ","categories":"","description":"","excerpt":"Overview Gardener allows you to create a Kubernetes cluster on …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/tutorials/kubernetes-cluster-on-alicloud-with-gardener/kubernetes-cluster-on-alicloud-with-gardener/","tags":"","title":"Create a Kubernetes Cluster on Alibaba Cloud with Gardener"},{"body":"Overview Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on Azure.\nPrerequisites You have created an Azure account. You have access to the Gardener dashboard and have permissions to create projects. You have an Azure Service Principal assigned to your subscription. Steps Go to the Gardener dashboard and create a Project.\n Get the properties of your Azure AD tenant, Subscription and Service Principal.\nBefore you can provision and access a Kubernetes cluster on Azure, you need to add the Azure service principal, AD tenant and subscription credentials in Gardener. 
Gardener needs the credentials to provision and operate the Azure infrastructure for your Kubernetes cluster.\nEnsure that the Azure service principal is assigned the actions defined in the Azure Permissions within your subscription. If no fine-grained permissions/actions are required, the built-in Contributor role can simply be assigned.\n Tenant ID\nTo find your TenantID, follow this guide.\n SubscriptionID\nTo find your SubscriptionID, search for and select Subscriptions. After that, copy the SubscriptionID from your subscription of choice. Service Principal (SPN)\nA service principal consists of a ClientID (also called ApplicationID) and a Client Secret. For more information, see Application and service principal objects in Azure Active Directory. You need to obtain the:\n Client ID\nAccess the Azure Portal and navigate to the Active Directory service. Within the service, navigate to App registrations and select your service principal. Copy the ClientID you see there.\n Client Secret\nSecrets for the Azure Account/Service Principal can be generated/rotated via the Azure Portal. After copying your ClientID, in the Detail view of your Service Principal navigate to Certificates \u0026 secrets. In that section, you can generate a new secret.\n Choose Secrets, then the plus icon and select Azure.\n Create your secret.\n Type the name of your secret. Copy and paste the TenantID, SubscriptionID and the Service Principal credentials (ClientID and ClientSecret). Choose Add secret. 
After completing these steps, you should see your newly created secret in the Infrastructure Secrets section.\n Register resource providers for your subscription.\n Go to your Azure dashboard Navigate to Subscriptions -\u003e \u003cyour_subscription\u003e Pick resource providers from the sidebar Register microsoft.Network Register microsoft.Compute To create a new cluster, choose Clusters and then the plus sign in the upper right corner.\n In the Create Cluster section:\n Select Azure in the Infrastructure tab. Type the name of your cluster in the Cluster Details tab. Choose the secret you created before in the Infrastructure Details tab. Choose Create. Wait for your cluster to get created.\n Result After completing the steps in this tutorial, you will be able to see and download the kubeconfig of your cluster.\n","categories":"","description":"","excerpt":"Overview Gardener allows you to create a Kubernetes cluster on …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/tutorials/kubernetes-cluster-on-azure-with-gardener/kubernetes-cluster-on-azure-with-gardener/","tags":"","title":"Create a Kubernetes Cluster on Azure with Gardener"},{"body":"Overview Gardener can create a new VPC, or use an existing one for your shoot cluster. Depending on your needs, you may want to create shoot(s) into an already created VPC. The tutorial describes how to create a shoot cluster into an existing AWS VPC. The steps are identical for Alicloud, Azure, and GCP. 
Please note that the existing VPC must be in the same region as the shoot cluster that you want to deploy into the VPC.\nTL;DR If .spec.provider.infrastructureConfig.networks.vpc.cidr is specified, Gardener will create a new VPC with the given CIDR block and will delete it on shoot deletion.\nIf .spec.provider.infrastructureConfig.networks.vpc.id is specified, Gardener will use the existing VPC and won’t delete it on shoot deletion.\nNote It’s not recommended to create a shoot cluster into a VPC that is managed by Gardener (that is, created for another shoot cluster). In this case the deletion of the initial shoot cluster will fail to delete the VPC because there will be resources attached to it.\nGardener won’t delete any manually created (unmanaged) resources in your cloud provider account.\n 1. Configure the AWS CLI The aws configure command is a convenient way to set up your AWS CLI. It will prompt you for your credentials and settings which will be used in the following AWS CLI invocations:\naws configure AWS Access Key ID [None]: \u003cACCESS_KEY_ID\u003e AWS Secret Access Key [None]: \u003cSECRET_ACCESS_KEY\u003e Default region name [None]: \u003cDEFAULT_REGION\u003e Default output format [None]: \u003cDEFAULT_OUTPUT_FORMAT\u003e 2. Create a VPC Create the VPC by running the following command:\naws ec2 create-vpc --cidr-block \u003ccidr-block\u003e { \"Vpc\": { \"VpcId\": \"vpc-ff7bbf86\", \"InstanceTenancy\": \"default\", \"Tags\": [], \"CidrBlockAssociations\": [ { \"AssociationId\": \"vpc-cidr-assoc-6e42b505\", \"CidrBlock\": \"10.0.0.0/16\", \"CidrBlockState\": { \"State\": \"associated\" } } ], \"Ipv6CidrBlockAssociationSet\": [], \"State\": \"pending\", \"DhcpOptionsId\": \"dopt-38f7a057\", \"CidrBlock\": \"10.0.0.0/16\", \"IsDefault\": false } } Gardener requires the VPC to have DNS support enabled, i.e., the attributes enableDnsSupport and enableDnsHostnames must be set to true. 
The enableDnsSupport attribute is enabled by default; enableDnsHostnames is not. Set the enableDnsHostnames attribute to true:\naws ec2 modify-vpc-attribute --vpc-id vpc-ff7bbf86 --enable-dns-hostnames 3. Create an Internet Gateway Gardener also requires that an internet gateway is attached to the VPC. You can create one by using:\naws ec2 create-internet-gateway { \"InternetGateway\": { \"Tags\": [], \"InternetGatewayId\": \"igw-c0a643a9\", \"Attachments\": [] } } and attach it to the VPC using:\naws ec2 attach-internet-gateway --internet-gateway-id igw-c0a643a9 --vpc-id vpc-ff7bbf86 4. Create the Shoot Prepare your shoot manifest (for reference, see the example manifests). Please make sure that you choose the region in which you created the VPC earlier (step 2). Also, put your VPC ID in the .spec.provider.infrastructureConfig.networks.vpc.id field:\nspec: region: \u003caws-region-of-vpc\u003e provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: id: vpc-ff7bbf86 # ... Apply your shoot manifest:\nkubectl apply -f your-shoot-aws.yaml Ensure that the shoot cluster is properly created:\nkubectl get shoot $SHOOT_NAME -n $SHOOT_NAMESPACE NAME CLOUDPROFILE VERSION SEED DOMAIN OPERATION PROGRESS APISERVER CONTROL NODES SYSTEM AGE \u003cSHOOT_NAME\u003e aws 1.15.0 aws \u003cSHOOT_DOMAIN\u003e Succeeded 100 True True True True 20m ","categories":"","description":"","excerpt":"Overview Gardener can create a new VPC, or use an existing one for …","ref":"/docs/guides/administer-shoots/create-shoot-into-existing-aws-vpc/","tags":"","title":"Create a Shoot Cluster Into an Existing AWS VPC"},{"body":"Overview Gardener allows you to create a Kubernetes cluster on different infrastructure providers. This tutorial will guide you through the process of creating a cluster on GCP.\nPrerequisites You have created a GCP account. 
You have access to the Gardener dashboard and have permissions to create projects. Steps Go to the Gardener dashboard and create a Project.\n Check which roles are required by Gardener.\n Choose Secrets, then the plus icon and select GCP.\n Click on the help button .\n Create a service account with the correct roles in GCP:\n Create a new service account in GCP.\n Enter the name and description of your service account.\n Assign the roles required by Gardener.\n Choose Done.\n Create a key for your service:\n Locate your service account, then choose Actions and Manage keys.\n Choose Add Key, then Create new key.\n Save the private key of the service account in JSON format.\n Note Save the key of the user, it’s used later to create secrets for Gardener. Enable the Google Compute API by following these steps.\n When you are finished, you should see the following page:\n Enable the Google IAM API by following these steps.\n When you are finished, you should see the following page:\n On the Gardener dashboard, choose Secrets and then the plus sign . Select GCP from the drop down menu to add a new GCP secret.\n Create your secret.\n Type the name of your secret. Select your Cloud Profile. Copy and paste the contents of the .JSON file you saved when you created the secret key on GCP. Choose Add secret. After completing these steps, you should see your newly created secret in the Infrastructure Secrets section.\n To create a new cluster, choose Clusters and then the plus sign in the upper right corner.\n In the Create Cluster section:\n Select GCP in the Infrastructure tab. Type the name of your cluster in the Cluster Details tab. Choose the secret you created before in the Infrastructure Details tab. Choose Create. 
Wait for your cluster to get created.\n Result After completing the steps in this tutorial, you will be able to see and download the kubeconfig of your cluster.\n","categories":"","description":"","excerpt":"Overview Gardener allows you to create a Kubernetes cluster on …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/tutorials/kubernetes-cluster-on-gcp-with-gardener/kubernetes-cluster-on-gcp-with-gardener/","tags":"","title":"Create a Kubernetes Cluster on GCP with Gardener"},{"body":"Custom containerd Configuration In case a Shoot cluster uses containerd, it is possible to make the containerd process load custom configuration files. Gardener initializes containerd with the following statement:\nimports = [\"/etc/containerd/conf.d/*.toml\"] This means that all *.toml files in the /etc/containerd/conf.d directory will be imported and merged with the default configuration. To prevent unintended configuration overwrites, please be aware that containerd merges config sections, not individual keys (see here and here). Please consult the upstream containerd documentation for more information.\n ⚠️ Note that this only applies to nodes which were newly created after gardener/gardener@v1.51 was deployed. Existing nodes are not affected.\n ","categories":"","description":"","excerpt":"Custom containerd Configuration In case a Shoot cluster uses …","ref":"/docs/gardener/custom-containerd-config/","tags":"","title":"Custom containerd Configuration"},{"body":"Custom DNS Configuration Gardener provides Kubernetes-Clusters-As-A-Service where all the system components (e.g., kube-proxy, networking, dns) are managed. 
As a result, Gardener needs to ensure and auto-correct additional configuration to those system components to avoid unnecessary downtime.\nIn some cases, auto-correcting system components can prevent users from deploying applications on top of the cluster that require some customization; DNS configuration is a good example.\nTo allow for customizations for DNS configuration (that could potentially lead to downtime) while having the option to “undo”, we utilize the import plugin from CoreDNS [1], which enables in-line configuration changes.\nHow to use To customize your CoreDNS cluster config, you can simply edit a ConfigMap named coredns-custom in the kube-system namespace. By editing this ConfigMap, you are modifying the CoreDNS configuration, so care is advised.\nFor example, to apply new config to CoreDNS that would point all .global DNS requests to another DNS pod, simply edit the configuration as follows:\napiVersion: v1 kind: ConfigMap metadata: name: coredns-custom namespace: kube-system data: istio.server: |global:8053 { errors cache 30 forward . 1.2.3.4 } corefile.override: |# \u003csome-plugin\u003e \u003csome-plugin-config\u003e debug whoami The port number 8053 in global:8053 is the specific port that CoreDNS is bound to and cannot be changed to any other port if it should act on ordinary name resolution requests from pods. Otherwise, CoreDNS will open a second port, but you are responsible for directing the traffic to this port. The kube-dns service in the kube-system namespace will direct name resolution requests within the cluster to port 8053 on the CoreDNS pods. Moreover, additional network policies are needed to allow corresponding ingress traffic to CoreDNS pods. In order for the destination DNS server to be reachable, it must listen on port 53 as it is required by network policies. 
Other ports are only possible if additional network policies allow corresponding egress traffic from CoreDNS pods.\nIt is important to have the ConfigMap keys ending with *.server (if you would like to add a new server) or *.override if you want to customize the current server configuration (setting both is optional).\n[Optional] Reload CoreDNS As Gardener configures the reload plugin of CoreDNS, a restart of the CoreDNS components is typically not necessary to propagate ConfigMap changes. However, if you don’t want to wait for the default (30s) to kick in, you can roll out your CoreDNS deployment using:\nkubectl -n kube-system rollout restart deploy coredns This will reload the config into CoreDNS.\nThe approach we follow here was inspired by AKS’s approach [2].\nAnti-Pattern Applying a configuration that is incompatible with the running version of CoreDNS is an anti-pattern (plugin configuration sometimes changes, so simply applying a configuration can break DNS).\nIf incompatible changes are applied by mistake, simply delete the content of the ConfigMap and re-apply. This should bring the cluster DNS back to a functioning state.\nNode Local DNS Custom DNS configuration may not work as expected in conjunction with NodeLocalDNS. With NodeLocalDNS, ordinary DNS queries targeted at the upstream DNS servers, i.e. non-kubernetes domains, will not end up at CoreDNS, but will instead be directly sent to the upstream DNS server. Therefore, configuration applying to non-kubernetes entities, e.g. the istio.server block in the custom DNS configuration example, may not have any effect with NodeLocalDNS enabled. If this kind of custom configuration is required, forwarding to upstream DNS has to be disabled. This can be done by setting the option (spec.systemComponents.nodeLocalDNS.disableForwardToUpstreamDNS) in the Shoot resource to true:\n... spec: ... systemComponents: nodeLocalDNS: enabled: true disableForwardToUpstreamDNS: true ... 
References [1] Import plugin [2] AKS Custom DNS\n","categories":"","description":"","excerpt":"Custom DNS Configuration Gardener provides …","ref":"/docs/gardener/custom-dns-config/","tags":"","title":"Custom DNS Configuration"},{"body":"Custom Shoot Fields The Dashboard supports custom shoot fields, which can be configured to be displayed on the cluster list and cluster details page. Custom fields do not show up on the ALL_PROJECTS page.\nProject administration page: Each custom field configuration is shown with its own chip.\nClick on the chip to show more details for the custom field configuration.\nCustom fields can be shown on the cluster list, if showColumn is enabled. See configuration below for more details. In this example, a custom field for the Shoot status was configured.\nCustom fields can be shown in a dedicated card (Custom Fields) on the cluster details page, if showDetails is enabled. See configuration below for more details.\nConfiguration Property Type Default Required Description name String ✔️ Name of the custom field path String ✔️ Path in shoot resource, of which the value must be of primitive type (no object / array). Use lodash get path syntax, e.g. metadata.labels[\"shoot.gardener.cloud/status\"] or spec.networking.type icon String MDI icon for field on the cluster details page. See https://materialdesignicons.com/ for available icons. Must be in the format: mdi-\u003cicon-name\u003e. tooltip String Tooltip for the custom field that appears when hovering with the mouse over the value defaultValue String/Number Default value, in case there is no value for the given path showColumn Bool true Field shall appear as column in the cluster list columnSelectedByDefault Bool true Indicates if field shall be selected by default on the cluster list (not hidden by default) weight Number 0 Defines the order of the column. The built-in columns start with a weight of 100, increasing by 100 (200, 300, etc.) 
sortable Bool true Indicates if column is sortable on the cluster list searchable Bool true Indicates if column is searchable on the cluster list showDetails Bool true Indicates if field shall appear in a dedicated card (Custom Fields) on the cluster details page Editor for Custom Shoot Fields The Gardener Dashboard now includes an editor for custom shoot fields, allowing users to configure these fields directly from the dashboard without needing to use kubectl. This editor can be accessed from the project administration page.\nAccessing the Editor Navigate to the project administration page. Scroll down to the Custom Fields for Shoots section. Click on the gear icon to open the configuration panel for custom fields. Adding a New Custom Field In the Configure Custom Fields for Shoot Clusters panel, click on the + ADD NEW FIELD button. Fill in the details for the new custom field in the Add New Field form. Refer to the Configuration section for detailed descriptions of each field.\n Click the ADD button to save the new custom field.\n Example Custom shoot fields can be defined per project by specifying metadata.annotations[\"dashboard.gardener.cloud/shootCustomFields\"]. 
The following is an example project yaml:\napiVersion: core.gardener.cloud/v1beta1 kind: Project metadata: annotations: dashboard.gardener.cloud/shootCustomFields: |{ \"shootStatus\": { \"name\": \"Shoot Status\", \"path\": \"metadata.labels[\\\"shoot.gardener.cloud/status\\\"]\", \"icon\": \"mdi-heart-pulse\", \"tooltip\": \"Indicates the health status of the cluster\", \"defaultValue\": \"unknown\", \"showColumn\": true, \"columnSelectedByDefault\": true, \"weight\": 950, \"searchable\": true, \"sortable\": true, \"showDetails\": true }, \"networking\": { \"name\": \"Networking Type\", \"path\": \"spec.networking.type\", \"icon\": \"mdi-table-network\", \"showColumn\": false } } ","categories":"","description":"","excerpt":"Custom Shoot Fields The Dashboard supports custom shoot fields, which …","ref":"/docs/dashboard/custom-fields/","tags":"","title":"Custom Fields"},{"body":"Overview Seccomp (secure computing mode) is a security facility in the Linux kernel for restricting the set of system calls applications can make.\nStarting from Kubernetes v1.3.0, the Seccomp feature is in Alpha. To configure it on a Pod, the following annotations can be used:\n seccomp.security.alpha.kubernetes.io/pod: \u003cseccomp-profile\u003e where \u003cseccomp-profile\u003e is the seccomp profile to apply to all containers in a Pod. container.seccomp.security.alpha.kubernetes.io/\u003ccontainer-name\u003e: \u003cseccomp-profile\u003e where \u003cseccomp-profile\u003e is the seccomp profile to apply to \u003ccontainer-name\u003e in a Pod. More details can be found in the PodSecurityPolicy documentation.\nInstallation of a Custom Profile By default, kubelet loads custom Seccomp profiles from /var/lib/kubelet/seccomp/. 
There are two ways in which Seccomp profiles can be added to a Node:\n they can be baked into the machine image or added at runtime. This guide focuses on creating those profiles via a DaemonSet.\nCreate a file called seccomp-profile.yaml with the following content:\napiVersion: v1 kind: ConfigMap metadata: name: seccomp-profile namespace: kube-system data: my-profile.json: |{ \"defaultAction\": \"SCMP_ACT_ALLOW\", \"syscalls\": [ { \"name\": \"chmod\", \"action\": \"SCMP_ACT_ERRNO\" } ] } Note The policy above is a very simple one and not suitable for complex applications. The default docker profile can be used as a reference. Feel free to modify it to your needs. Apply the ConfigMap in your cluster:\n$ kubectl apply -f seccomp-profile.yaml configmap/seccomp-profile created The next step is to create the DaemonSet Seccomp installer. It’s going to copy the policy from above to /var/lib/kubelet/seccomp/my-profile.json.\nCreate a file called seccomp-installer.yaml with the following content:\napiVersion: apps/v1 kind: DaemonSet metadata: name: seccomp-installer namespace: kube-system labels: security: seccomp spec: selector: matchLabels: security: seccomp template: metadata: labels: security: seccomp spec: initContainers: - name: installer image: alpine:3.10.0 command: [\"/bin/sh\", \"-c\", \"cp -r -L /seccomp/*.json /host/seccomp/\"] volumeMounts: - name: profiles mountPath: /seccomp - name: hostseccomp mountPath: /host/seccomp readOnly: false containers: - name: pause image: k8s.gcr.io/pause:3.1 terminationGracePeriodSeconds: 5 volumes: - name: hostseccomp hostPath: path: /var/lib/kubelet/seccomp - name: profiles configMap: name: seccomp-profile Create the installer and wait until it’s ready on all Nodes:\n$ kubectl apply -f seccomp-installer.yaml daemonset.apps/seccomp-installer created $ kubectl -n kube-system get pods -l security=seccomp NAME READY STATUS RESTARTS AGE seccomp-installer-wjbxq 1/1 Running 0 21s Create a Pod Using a Custom Seccomp Profile Finally, we want to create 
a Pod which uses our new Seccomp profile my-profile.json.\nCreate a file called my-seccomp-pod.yaml with the following content:\napiVersion: v1 kind: Pod metadata: name: seccomp-app namespace: default annotations: seccomp.security.alpha.kubernetes.io/pod: \"localhost/my-profile.json\" # you can specify seccomp profile per container. If you add another profile you can configure # it for a specific container - 'pause' in this case. # container.seccomp.security.alpha.kubernetes.io/pause: \"localhost/some-other-profile.json\" spec: containers: - name: pause image: k8s.gcr.io/pause:3.1 Create the Pod and see that it’s running:\n$ kubectl apply -f my-seccomp-pod.yaml pod/seccomp-app created $ kubectl get pod seccomp-app NAME READY STATUS RESTARTS AGE seccomp-app 1/1 Running 0 42s Troubleshooting If an invalid or nonexistent profile is used, then the Pod will be stuck in the ContainerCreating phase:\nbroken-seccomp-pod.yaml:\napiVersion: v1 kind: Pod metadata: name: broken-seccomp namespace: default annotations: seccomp.security.alpha.kubernetes.io/pod: \"localhost/not-existing-profile.json\" spec: containers: - name: pause image: k8s.gcr.io/pause:3.1 $ kubectl apply -f broken-seccomp-pod.yaml pod/broken-seccomp created $ kubectl get pod broken-seccomp NAME READY STATUS RESTARTS AGE broken-seccomp 1/1 ContainerCreating 0 2m $ kubectl describe pod broken-seccomp Name: broken-seccomp Namespace: default .... 
Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 18s default-scheduler Successfully assigned kube-system/broken-seccomp to docker-desktop Warning FailedCreatePodSandBox 4s (x2 over 18s) kubelet, docker-desktop Failed create pod sandbox: rpc error: code = Unknown desc = failed to make sandbox docker config for pod \"broken-seccomp\": failed to generate sandbox security options for sandbox \"broken-seccomp\": failed to generate seccomp security options for container: cannot load seccomp profile \"/var/lib/kubelet/seccomp/not-existing-profile.json\": open /var/lib/kubelet/seccomp/not-existing-profile.json: no such file or directory Related Links Seccomp A Seccomp Overview Seccomp Security Profiles for Docker Using Seccomp to Limit the Kernel Attack Surface ","categories":"","description":"","excerpt":"Overview Seccomp (secure computing mode) is a security facility in the …","ref":"/docs/guides/applications/secure-seccomp/","tags":"","title":"Custom Seccomp Profile"},{"body":"Theming and Branding Motivation Gardener landscape administrators should have the possibility to change the appearance and the branding of the Gardener Dashboard via configuration without the need to touch the code.\nBranding It is possible to change the branding of the Gardener Dashboard when using the helm chart in the frontendConfig.branding map. The following configuration properties are supported:\n name description default documentTitle Title of the browser window Gardener Dashboard productName Name of the Gardener product Gardener productTitle Title of the Gardener product displayed below the logo. It could also contain information about the specific Gardener instance (e.g. Development, Canary, Live) Gardener productTitleSuperscript Superscript next to the product title. 
To suppress the superscript, set to false Production version (e.g. 1.73.1) productSlogan Slogan that is displayed under the product title and on the login page Universal Kubernetes at Scale productLogoUrl URL for the product logo. You can also use data: scheme for development. For production it is recommended to provide static assets /static/assets/logo.svg teaserHeight Height of the teaser in the GMainNavigation component 200 teaserTemplate Custom HTML template to replace the teaser content refer to GTeaser loginTeaserHeight Height of the login teaser in the GLogin component 260 loginTeaserTemplate Custom HTML template to replace the login teaser content refer to GLoginTeaser loginFooterHeight Height of the login footer in the GLogin component 24 loginFooterTemplate Custom HTML template to replace the login footer content refer to GLoginFooter loginHints Links { title: string; href: string; } to product related sites shown below the login button undefined oidcLoginTitle Title of tabstrip for loginType OIDC OIDC oidcLoginText Text shown above the login button on the OIDC tabstrip Press Login to be redirected to\nconfigured OpenID Connect Provider. Colors Gardener Dashboard has been built with Vuetify. We use Vuetify’s built-in theming support to centrally configure colors that are used throughout the web application. Colors can be configured for both light and dark themes. Configuration is done via the helm chart, see the respective theme section there. Colors can be specified as HTML color code (e.g. #FF0000 for red) or by referencing a color (e.g. grey.darken3 or shades.white) from Vuetify’s Material Design Color Pack.\nThe following colors can be configured:\n name usage primary icons, chips, buttons, popovers, etc. anchor links main-background main navigation, login page main-navigation-title text color on main navigation toolbar-background background color for toolbars in cards, dialogs, etc. toolbar-title text color for toolbars in cards, dialogs, etc. 
action-button buttons in tables and cards, e.g. cluster details page info notification info popups, texts and status tags success notification success popups, texts and status tags warning notification warning popups, texts and status tags error notification error popups, texts and status tags unknown status tags with unknown severity … all other Vuetify theme colors If you use the helm chart, you can configure those with frontendConfig.themes.light for the light theme and frontendConfig.themes.dark for the dark theme. The customization example below shows a possible custom color theme configuration.\nLogos and Icons It is also possible to exchange the Dashboard logo and icons. You can replace the assets folder when using the helm chart in the frontendConfig.assets map.\nAttention: You need to set values for all files as mapping the volume will overwrite all files. It is not possible to exchange single files.\nThe files have to be encoded as base64 for the chart - to generate the encoded files for the values.yaml of the helm chart, you can use the following shorthand with bash or zsh on Linux systems. If you use macOS, install coreutils with brew (brew install coreutils) or remove the -w0 parameter.\ncat \u003c\u003c EOF ### ### COPY EVERYTHING BELOW THIS LINE ### assets: favicon-16x16.png: | $(cat frontend/public/static/assets/favicon-16x16.png | base64 -w0) favicon-32x32.png: | $(cat frontend/public/static/assets/favicon-32x32.png | base64 -w0) favicon-96x96.png: | $(cat frontend/public/static/assets/favicon-96x96.png | base64 -w0) favicon.ico: | $(cat frontend/public/static/assets/favicon.ico | base64 -w0) logo.svg: | $(cat frontend/public/static/assets/logo.svg | base64 -w0) EOF Then, swap in the base64 encoded version of your files where needed.\nCustomization Example The following example configuration in values.yaml shows most of the possibilities to achieve a custom theming and branding:\nglobal: dashboard: frontendConfig: # ... 
branding: productName: Nucleus productTitle: Nucleus productSlogan: Supercool Cluster Service teaserHeight: 160 teaserTemplate: |\u003cdiv class=\"text-center px-2\" \u003e \u003ca href=\"/\" class=\"text-decoration-none\" \u003e \u003cimg src=\"{{ productLogoUrl }}\" width=\"80\" height=\"80\" alt=\"{{ productName }} Logo\" class=\"pointer-events-none\" \u003e \u003cdiv class=\"font-weight-thin text-grey-lighten-4\" style=\"font-size: 32px; line-height: 32px; letter-spacing: 2px;\" \u003e {{ productTitle }} \u003c/div\u003e \u003cdiv class=\"text-body-1 font-weight-normal text-primary mt-1\"\u003e {{ productSlogan }} \u003c/div\u003e \u003c/a\u003e \u003c/div\u003e loginTeaserHeight: 296 loginTeaserTemplate: |\u003cdiv class=\"d-flex flex-column align-center justify-center bg-main-background-darken-1 pa-3\" style=\"min-height: {{ minHeight }}px\" \u003e \u003cimg src=\"{{ productLogoUrl }}\" alt=\"Login to {{ productName }}\" width=\"140\" height=\"140\" class=\"mt-2\" \u003e \u003cdiv class=\"text-h3 text-center font-weight-thin text-white mt-4\"\u003e {{ productTitle }} \u003c/div\u003e \u003cdiv class=\"text-h5 text-center font-weight-light text-primary mt-1\"\u003e {{ productSlogan }} \u003c/div\u003e \u003c/div\u003e loginFooterTemplate: |\u003cdiv class=\"text-anchor text-caption\"\u003e Copyright 2023 by Nucleus Corporation \u003c/div\u003e loginHints: - title: Support href: https://gardener.cloud - title: Documentation href: https://gardener.cloud/docs oidcLoginTitle: IDS oidcLoginText: Press LOGIN to be redirected to the Nucleus Identity Service. 
themes: light: primary: '#354a5f' anchor: '#5b738b' main-background: '#354a5f' main-navigation-title: '#f5f6f7' toolbar-background: '#354a5f' toolbar-title: '#f5f6f7' action-button: '#354a5f' dark: primary: '#5b738b' anchor: '#5b738b' background: '#273849' surface: '#1d2b37' main-background: '#1a2733' main-navigation-title: '#f5f6f7' toolbar-background: '#0e1e2a' toolbar-title: '#f5f6f7' action-button: '#5b738b' assets: favicon-16x16.png: | iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAMAAAAoLQ9TAAAABGdBTUEAALGPC/xhBQAAACBjSFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAABHVBMVEUAAAALgGILgWIKgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIMgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIKgGILgGILgGIKgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGL///8Vq54LAAAAXXRSTlMAAAAAFmu96Pv0TRQ5NB0FLLn67j1X8fDdnSoR3CJC9cZx9/7ZtAna9rvFe28Cl552ZLDUIS7X5UnfVOrrOKS+Q7/Kz61IAwQKC8gVEwYIyQxgZQYCVn9+IFjR7/wm8JKCAAAAAWJLR0ReBNZhuwAAAAd0SU1FB+cKCgkYLrOE10YAAADMSURBVBjTXY7XUsIAFETvxhJQQwlEEhUlCEizVxANCIoQwAIobf//NyTjMDqcp5192D0ivwCra+uqz78h2NzSAkEgFNZJRqICYztmWju7YXrsxQX7B/OQsJOHqRiZzggCR8zmVJX5QpYsQnB8Qp6e+fTzC/LyCqLg+oa3d6lS+Z4VA/AOHx6dau3JqDeeX+ApKGi+tpy22+lCkYVWz3l7X5E/0KstFc2Pz/6iwMDVhvbXtz3U3MF8dDS2JtOZpz2bTqzxSEyd/9BN4RI/8jsrfdR558kAAAAldEVYdGRhdGU6Y3JlYXRlADIwMjMtMTAtMTBUMDk6MjQ6MzMrMDA6MDC+UDWaAAAAJXRFWHRkYXRlOm1vZGlmeQAyMDIzLTEwLTEwVDA5OjI0OjMzKzAwOjAwzw2NJgAAABJ0RVh0ZXhpZjpFeGlmT2Zmc2V0ADI2UxuiZQAAABh0RVh0ZXhpZjpQaXhlbFhEaW1lbnNpb24AMTUwO0W0KAAAABh0RVh0ZXhpZjpQaXhlbFlEaW1lbnNpb24AMTUwpkpVXgAAACB0RVh0c29mdHdhcmUAaHR0cHM6Ly9pbWFnZW1hZ2ljay5vcme8zx2dAAAAGHRFWHRUaHVtYjo6RG9jdW1lbnQ6OlBhZ2VzADGn/7svAAAAGHRFWHRUaHVtYjo6SW1hZ2U6OkhlaWdodAAxOTJAXXFVAAAAF3RFWHRUaHVtYjo6SW1hZ2U6OldpZHRoADE5MtOsIQgAAAAZdEVYdFRodW1iOjpNaW1ldHlwZQBpbWFnZS9wbmc/slZOAAAAF3RFWHRUaHVtYjo6TVRpbWUAMTY5NjkyOTg3M4YMipU
AAAAPdEVYdFRodW1iOjpTaXplADBCQpSiPuwAAABWdEVYdFRodW1iOjpVUkkAZmlsZTovLy9tbnRsb2cvZmF2aWNvbnMvMjAyMy0xMC0xMC9kNzEyMWM2YzM2OTg3NmQ0MGUxY2EyMjVlYjg3MGZjYi5pY28ucG5nU19BKAAAAABJRU5ErkJggg== favicon-32x32.png: | iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAMAAABEpIrGAAAABGdBTUEAALGPC/xhBQAAACBjSFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAACTFBMVEUAAAALgWIKil4LgGIKgGEMf2MNgGIMgGILgGEKgWEIhVwGgl8JgVwMgGMKgGMNgGEKf2EMgWMMf2ILhmILhGIKjFwHgmYMf2EKgWMGgGIKgGIIg2UAgGMCkGINfGIMfGILfmELfWELgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIKgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgWELgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIMgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIKgWELgGILgGILgGIMgmILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIKg2ALgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGIMgGELgGILgGILgGILgGILgGILgGILgGILgGILgGILgWILgGILgGILgGILgGILgGILgGILgGIKgGILgGILgGILgGILgGILgGILgmIMgWELgGIHgWILgGIMgWIHgGILgGILgGILgGILgGILgGILgWMLgGILgGILgGILgGMLgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGILgGL///+Wa9azAAAAwnRSTlMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGVWDrNTs+v77InTF9YIGVs5sAjhdYk0wDwVv8PlPBan96sF/MwLoL5PhcAwgy8kTePiAAmj8nwJe7qbmlNNzNOTE3bfvfm3YdRS8pdcuQuLDEvFRKtmVAXx7A6KjCHE2LeVfBbPKZNxLlo2KqgOHuT9Hi6BZAna2mlwRAQGeAQ4BAemOJSYXAwsfIwNAY32d0L1BGjlq0fLzazoNcu0Jrf5EbikAAAABYktHRMOKaI5CAAAAB3RJTUUH5woKCRgus4TXRgAAAdNJREFUOMtjYMAEjExKyiqqauoamloqzJjSLKzaOrp6hyBAH1MBs4GhkeYhGDAGKmA2MTUzN7OwtGJjB8pzWNvYwqUP2dmDFDg4HjrkZOfs4urmzszJbIgkf8jDE6SAywvK9fbxZfbzR5I/FBAIVMDNHATjB4eEhoHtDw+GCEREMoKcFeUEUxGtHAOiYuPiIfwEZh6QgkRnEEcvSevQoeQUkLxylB1YPskS7Etm31QQLy09I/NQONAwrazsHIgBuQ5gBbzMPmBuXn5BIYguKlaGGFBSCg0m5rJysEBFZRWIqq6phRhQVw9TwNGgBRIo9/OMPXRIvbGpGSyv5gkPZ+biWpDvwlta2w4darfqAMt3dvHxIyJAqRsUgD3MvYcO9fVPAAXJxEkCgshRVD95ytSp04SAjoiYPiN55qzZc5iFUWNRRHTuPDFxoIL5C+oX1i/Ckg6AQAKsQFKKARegm4LF+BT4LHFeiscEafHiZctXyOBQICu3ctWK1WvWrF6xaq28LHpar1Fat37DxpxNKUlJKZs2z09fv2VrD
SysRBR8t22ftWMnIjMAgeau+I279+xVFGFgqG/cN1MdRRKuaP9EHdMDDI5LDuEBzgcZDhEAlCsAAOGIeNYQEfj6AAAAJXRFWHRkYXRlOmNyZWF0ZQAyMDIzLTEwLTEwVDA5OjI0OjMzKzAwOjAwvlA1mgAAACV0RVh0ZGF0ZTptb2RpZnkAMjAyMy0xMC0xMFQwOToyNDozMyswMDowMM8NjSYAAAASdEVYdGV4aWY6RXhpZk9mZnNldAAyNlMbomUAAAAYdEVYdGV4aWY6UGl4ZWxYRGltZW5zaW9uADE1MDtFtCgAAAAYdEVYdGV4aWY6UGl4ZWxZRGltZW5zaW9uADE1MKZKVV4AAAAgdEVYdHNvZnR3YXJlAGh0dHBzOi8vaW1hZ2VtYWdpY2sub3JnvM8dnQAAABh0RVh0VGh1bWI6OkRvY3VtZW50OjpQYWdlcwAxp/+7LwAAABh0RVh0VGh1bWI6OkltYWdlOjpIZWlnaHQAMTkyQF1xVQAAABd0RVh0VGh1bWI6OkltYWdlOjpXaWR0aAAxOTLTrCEIAAAAGXRFWHRUaHVtYjo6TWltZXR5cGUAaW1hZ2UvcG5nP7JWTgAAABd0RVh0VGh1bWI6Ok1UaW1lADE2OTY5Mjk4NzOGDIqVAAAAD3RFWHRUaHVtYjo6U2l6ZQAwQkKUoj7sAAAAVnRFWHRUaHVtYjo6VVJJAGZpbGU6Ly8vbW50bG9nL2Zhdmljb25zLzIwMjMtMTAtMTAvZDcxMjFjNmMzNjk4NzZkNDBlMWNhMjI1ZWI4NzBmY2IuaWNvLnBuZ1NfQSgAAAAASUVORK5CYII= favicon-96x96.png: | iVBORw0KGgoAAAANSUhEUgAAAGAAAABgCAYAAADimHc4AAAABGdBTUEAALGPC/xhBQAAACBjSFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAABmJLR0QA/wD/AP+gvaeTAAAAB3RJTUUH5woKCRgus4TXRgAADuRJREFUeNrtnXuMXNV9xz/fc2dm3+uAwcEGg02IAwQIOC1xkrZGQQ0NiUpVaNVWRXjXJkLlj7RVEeA0Zmy3Uas0D6UoCZRXMIoSUNsoioQSqlBISQhpbew4xjwMNgYbG3vxvmZ2Hvf8+seZNbvE3r3z2J21mY80snfunXvPPd97f/d3fuf3u1e0mERndjUgQZySyGA6HTgHOANYAJwJLK78/5TKpxdoB9KVj4AyUARywDDwFvAN4D6A0ex9AKSafcBzgY71/ZiQM5uH2UJhHzD0QYxlwFnAQuA0oBuICB38zg8T/oUgRAroBOYTRFz8zn2/awXoWt8HwskzD+NCMy4DLQc+DJwl6AIyTO7UapkoTAwUVwJPTFjhXSVAT/ZmPMOITMbMn4OxwuBThE5fDHTM4O5LQO5xwE348l0jQFe2H08uA6kLDX8N8EngIqCH+s7ypBSB0RTgJ3x5cguQvYIe3gfykRnvM3QN8GfAB4G2WW5NARiKmaz2SStAV3YNRYaIsdNlugb4S+ByZtbMTMWY4IhhaIIROikF6MyuxmRRxrovAW4G/ojgiTSTnIkBeYd3bxuhk06Armw/yLdhugq4lXDWz4XjzIEdBiBlR790tW5tLtKd7cehDkzXA18CPsbc6HyAI5iGELz/9re/nLJxHRvWIAxQSlgXZp2YtQMZUBpwhhTWwYOVQQWwAiKPaSS2jqJTkVz232b06LrDYKoLb6uA2wgDqLmCAfuEigBb3QNHF0wpgPPjtspWAKuA00G9hBFhB5AW5io7KIXOZwQ0jHEYOBApfwDY25XtfwnsVaFhDwWBjQ/H66VzfR9ePiOvPzd0OyFcMJeIgdcRJWzygqSX5weA6wkjw2opA0eAN0H7DbYJfibY0bW+f/d7X75gdP+ybeQ/v6mmI+ta3wfOImJdDbqFudf5432wO0r5YhxPHnIkFWAUGKM2AVKEOMppwAX
ASqDPYBfGUweWPveoyulfdmy4cQDzPn/HvYk33LOxj0LBkU77i0F/Byyb9a5NxhjwaqngfLqrMGnBNDfh8evFRghRvUYQAfOA5cBfAXfL+Kbz8bURnELf1+je0JdoQ96LVNqfhulm4COz1Jm1cAA4KGcM3vrQpAXTCHA0lnSEEFJtNBHhZnkt8HUz+1LXOds+HnuXCWHh49OZXY33zsn0KeAa5o63cyxeB9481oKpBXjbXA0AQzPYQBHi7TcAdzlslbB5nXf007Oh/zdWbt/YjzPDyS/lqHMwp9mDahDAxk2Q7DAwOAsNTRHiNBuBWyTObBN0ZCeLkIoFThFwNfDbs9CuehgDngfyx1o4pQDxUS9UI8DhWWz0AuCvgbW5mIVCdG284ehCw2PmFxNMT88stqsW3kL8anTfmNcxYq7RVL+Mn3iWzBWXQRDqw4Rh/WyNnjPA+QJJtgXv8pkrL6Ft5XJKpImwq4AbaV5wLSm7gW9kutMDisoUH986aeG0nSnAiTLwCiGmPZv0AJ/FdL2HdosjDJGm3A18guBNNYocIWTcSAzYCQwgY+QLD/7GCtMK4JynHLtY8CLHsWMzzHzgJgeXp2JPxTNYDKxI0v4EHAF+BNwFvNbgtpeAX3jnB012zBWmPYChdQ/gnMdgP/BGgxuYlGVAXzmtUytjkw8Bi+rcZhnYCnyBELI+TPDEGskBYIvzLi5G6WOukPgMUmjgSw1uYFIccLW8PpmOyhFwMXBqHdsbA/4duEmObxFM2Z8QJuIbyU6D5w0o//3dxz2w6ZFhwRXdCVii3zSeBcCnSz46Fzgvcdt/k2HgHuBWr/hp75UhTFNe2OD2loHNiIOaYsY52UGYMFMR2E6wmc1iBaaPUXuoOQfcbbKN5t0eZymEX05wZ9M1bvN4HEY8qS2UplqpGhME8CvCsLoeSoQb+i8JA5RBkl9Vi4A/BpbUsN8i8LDEV62cOuicIdENug44t85jOhZbMJ7lUvjcuuOvlCh+4p3DeY/BHsEuQjpHrQjYJfE1sAEzLgKtAK6sdOxUY5P2ynrtNez3KeBfYvOvR2mwPNDOpcCnk/ZDFeSBH8dEBxyeL+r4Ed5EV0B+3T0gIeeGgSepz19OAVeacRNopFjU/ZJuAfoJeZOHpvitI9woowT7mchrwJ0Q7XBEmAfa1IlxLbVdTdOxS/BkRBxrUhbQsQ8oEc458D4WPA3sq7OBaeAzZqxta+NM7/2QSv5Jw24H1hMGfY2iBDxi8JgRm0+pYvDsYuAPaPzZ7wlX206A1DQCJD6Tio9vpu2K5YDGCLH8er0GR7C9eYn/JXIlobzENuAgcCkh87hedoD+wRm7zQlnBqIN6AP+sJo+SMhexD/HJXZGKRjOPjBtJyRGQJrUIeC/gZEGNLYbWAX6veLeqNIaG3OyR4CvMLU5SkIR+E9gu0lEisGEsLMJZ38tM3xT4YFHMXsmShsjd0w/512VAAYUKZrgv4DnGtToJcCazNnlM/CAPN40BnwHeITgT9fKy8CjBmMZ8ph3pIXM9AnC9Gij2Q087NCgJUw3rUqAkWy4mxt+N/AoTO3jJkTASkxXoVjmHcoYeAYQ9xJc31rwwE9NbJeMt7LfAaBsdgYhI7q3AW2fSAw8auIZD/j2ZJ511aNJ78pAVCAIsKtBjZ8PXIdFi0CMrL0fnAG2DXiY2oKAR4DHMBvFoLMyqWMhjjQT88fbEZtc2Y+YE2O33Z/oR1ULkF+3KRxGuFn+B40L4a4ALs+QoTPbh6WEYpUEP6TiUVTJS8AWIYtIo9DqNuAKQoZGIxkA7gLbbJEjty55ZkdN8RQhMHLAd4FnG3QQpwJXF91Yl4B0VAopX+Eqe4xwiSfFA78w8YYJ/NGfaiHwOzTW9SwD30c8grmSy1R3y6pJgJHsvZjAO54DNtGY+WIHfATTeSDKxRQixnB5hcFfNaHwPLAlNepGiAyTH/f9fwt4fwPaOpEtwDejOD7knRhe+2BVP655QiO37my
cp0wwQz+gPm9lnCWYLk8N92IyUAR4LMTtX6xiO68DL8SdHlcWhvCR6yAk6zYyg2KP4MsRfos5FyIGVVL7jJKy4CAei/YjvkoYIddLD7C83DPYI+9w7cVKCaIdInhDSc3QPhRG01Yx/vK2iJBB0ahypAPAl5H9wMvFw9lkN913UteU3ui6+1C7USa1lZAO/kIDDuxS0BkAQ7c+FK4EFxfANpPcG9qLhbC5FIM5hJ1D48zPQcJA8UEz5aM6Atl1z6nms/eQspKX7EfAF6nOVByLc8HOlInOjTfg2kpYnDbQCyTLzisCu5BKAOYjpHIEXEJjfP/XgH+S9C2JQUgzuLb2LO+GpJiMZu/HjALY94DPA9vq2Fw36HylvHM+Yvi2hypGQwMkCwIWgT3OqayjU1HKEELotYSxx4kJ96LbBHeb+SHJMZq9q66+a5g7Npq9n87s6jHg+8KGCUUSH6X6eEsbcIGPleboGMMQDFUSA6ajBOz3sQ9D0RD87JZxNrXb/yHCwPPOlHjaG+WRGm3+O2loklUuey84Sj4V/RhxM/CvBD++Gh8+ApZYRbj27KrxhwHkCQOe6ShQmTYVLtyAjdOpbRI/B/wMWIv4mzGf+5+i+fJwgwpLYAay3HLr7sWVY/+37k9/bVgW7LPAnQQvZohkYvQKmwfgVHEjUUyy2NN4LQP2dix+PsmTuGJCFPYnwDqg38RdBvvjDd8lP014uVpmJKV7NHsfG7mPU/7x+pGBtZt+0ruh7xlvWgpcRkhxXEKI9XcSznhPONtGCO7d0+FvQ2bBlUxOnsqYxDhqczoIk0CeyfPPVvlulJA+vgf4NfAEYgtoH1DKVVE0Ui2zUaLPyuwPeYpNtNPlfOiIdsEpiC6MFBALcqAhk41iVgKLQQaGKQLoldlXgNXT7G6rwV8AO1BlC2KpjKsJiVedlfUKBNEPEdzK/YRw8oB3FGVYLkE8/4QQoF4616+GGgRoOyVGOFLvKTO2JxN571MGkQSG814Wx92upKKncNvMd/axmMtVJXXz1ue+PfHPmOqcgVnhpCrUPhFpCdBkWgI0mZYATaYlQJNpCdBkWgI0mZYATaYlQJNpCdBkWgI0mZYATaYlQJNpCdBkWgI0mZYATaYlQJM5GQU4IaZZxzmRpiSNkJp4mONX1ovwrpY5N/V4PE4kAXKERK/vJVjv1WY3tkWLFi1atJiOOeeydWTXEOOUodRmch1m1qaQst4GlqkUjkUEF3rcjfaVT1xJaSwCBYOCpILM54ukCxHe8tnq67hmklkXYF52DRGogKUMSzvRZnAapoXAArDxJ63PJ6SU9xJqx3oIz5bIELy38TfVQUjGLfH26wNHCC7rMCEje4Dgvh4ChVxQ2X7BIW8UhEptqByDDc6yQDMqQFd2FViMorbIzHowWwCcLjjLQr3WUsIjKMfTx3sJzwPqmIG2GSFzepQgyiBBlL3AK4IXLZQfvYl0UNKwxYUYRYw2OCV9Ig09yM71fRgmZ64dWQ/oLIzzCQ/GOJeQln428B7CmTz+4stmUnn7B0VCYcerhCzpl4HnEDvBXsM07OXHhCx3R2OqY6j34HuyNyKQl6UN5mO2FOw8wrN+PkR4uN742X0iDfogmLPxq+Q1Qn3Ys6CXkF4RHHamkoEN1/F+nKoF6M6uwuFdmVSv4AzBRRaeKX0RwawsItw0G/0gpGYTE2oK9hEqQbcLnjHYbvBGivKQx/mRKs3VtAL0ZvtxYe8pk80z0zLC2b2CUO2ykBPzDK+X8StkP/B/hKqerZK9INNgBGUPDE1TT3ZcAbqzawguX3yaifMwPk6oeryQcONsp/n2e65ghLq0vcAO4OeIp2S8BNEhsHjkON7VpA7sCpUoDujFuADso8DvEp4Rt4D66mzfTYwRyp42Az8F/RzxHMH78qMTas7Ukb2BjNooU+7EWEowLb9PuJEuYfbfOnqyUSB4Vc8SHrvzNOKVFKlc0QqkIrn3lq28nNDpKwnvDOukZV4aRRu
hT5cBnwGex3iiTPmxSG5zykzfJtj1RZx8nstcQoRB5nKCE3OdmXakgKua3bJ3IRHBkVl8Ms4Jn1C0BGgyLQGaTEuAJtMSoMm0BGgyLQGaTEuAJtMSoMn8P4f/JnJ3AKQjAAAAJXRFWHRkYXRlOmNyZWF0ZQAyMDIzLTEwLTEwVDA5OjI0OjMzKzAwOjAwvlA1mgAAACV0RVh0ZGF0ZTptb2RpZnkAMjAyMy0xMC0xMFQwOToyNDozMyswMDowMM8NjSYAAAASdEVYdGV4aWY6RXhpZk9mZnNldAAyNlMbomUAAAAYdEVYdGV4aWY6UGl4ZWxYRGltZW5zaW9uADE1MDtFtCgAAAAYdEVYdGV4aWY6UGl4ZWxZRGltZW5zaW9uADE1MKZKVV4AAAAgdEVYdHNvZnR3YXJlAGh0dHBzOi8vaW1hZ2VtYWdpY2sub3JnvM8dnQAAABh0RVh0VGh1bWI6OkRvY3VtZW50OjpQYWdlcwAxp/+7LwAAABh0RVh0VGh1bWI6OkltYWdlOjpIZWlnaHQAMTkyQF1xVQAAABd0RVh0VGh1bWI6OkltYWdlOjpXaWR0aAAxOTLTrCEIAAAAGXRFWHRUaHVtYjo6TWltZXR5cGUAaW1hZ2UvcG5nP7JWTgAAABd0RVh0VGh1bWI6Ok1UaW1lADE2OTY5Mjk4NzOGDIqVAAAAD3RFWHRUaHVtYjo6U2l6ZQAwQkKUoj7sAAAAVnRFWHRUaHVtYjo6VVJJAGZpbGU6Ly8vbW50bG9nL2Zhdmljb25zLzIwMjMtMTAtMTAvZDcxMjFjNmMzNjk4NzZkNDBlMWNhMjI1ZWI4NzBmY2IuaWNvLnBuZ1NfQSgAAAAASUVORK5CYII= favicon.ico: | AAABAAEAEBAAAAEAIABoBAAAFgAAACgAAAAQAAAAIAAAAAEAIAAAAAAAAAQAAAAAAAAAAAAAAAAAAAAAAABigAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL3WKAC/pigAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv/YoAL/2KAC/9igAv6YoAL3WKACyBigAtYYoALnWKAC9FigAvvYoAL/GKAC/9igAv/YoAL/2KAC/9igAv8YoAL72KAC9FigAudYoALWGKACyAAAAAAYoALAGKACwJigAsVYoALNGKAC1ZigAtxYoALfmKAC35igAtxYoALVmKACzRigAsVYoALAmKACwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABigAsAYoALBGKAC2BigAtlYoAKBmKACgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYoALAGKACwhigAu/YoALyWKACgxigAoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGKACwBigAsIYoALv2KAC8ligAoMYoAKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABhggoAYoALAGKACwRigAsKYoALC2KAC75igAvIYoALFGKACxNigAsGYoALAGGCCgAAAAAAAAAAAAAAAABigAsAYoALAGKACzhigAukYoALv2KAC0JigAu/YoALymKAC1digAvQYoALrmKAC0higAsDYoALAAAAAAAAAAAAYoALAGKACy5igAvXYoAL/2KAC+ZigAtJYoAL2WKAC99igAtUYoAL6mKAC/9igAvrYoALV2KACwBigAsAYoALAGKADAJigAuXYoAL/2KAC/9igAueYoALdmKAC/xigAv6YoALZGKAC7BigAv/YoAL/2KAC9RigAshYoALAGKACwBigAsdYoAL2mKAC/9igAv2YoALu2KAC+higAvoYoAL/2KAC8VigAt6YoAL9mKAC/9igAv/YoALb2KACwB
igAsAYoALQWKAC/ZigAv/YoAL/2KAC/9igAvGYoALcWKAC/digAv+YoAL2WKAC/BigAv/YoAL/2KAC7RigAsJYoALAGKAC1digAvyYoAL8GKAC91igAueYoALKmKACxFigAu5YoAL/2KAC/9igAv/YoAL/2KAC/9igAvcYoALImKACwBigAsUYoALOGKACzRigAsdYoALBWKACwBigAsAYoALLGKAC7ligAv6YoAL/2KAC/9igAv/YoAL72KACz0AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYoEKAGKACwBigAsVYoALa2KAC71igAvpYoAL+2KAC/RigAtNAAAAAAAAAAAAAAAAwAMAAPw/AAD8PwAA/D8AAPAPAADgAwAAwAMAAIABAACAAQAAgAAAAIAAAACDAAAA/4AAAA== logo.svg: | PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAyNCAyNCI+PHBhdGggZmlsbD0iIzBCODA2MiIgZD0iTTIsMjJWMjBDMiwyMCA3LDE4IDEyLDE4QzE3LDE4IDIyLDIwIDIyLDIwVjIySDJNMTEuMyw5LjFDMTAuMSw1LjIgNCw2LjEgNCw2LjFDNCw2LjEgNC4yLDEzLjkgOS45LDEyLjdDOS41LDkuOCA4LDkgOCw5QzEwLjgsOSAxMSwxMi40IDExLDEyLjRWMTdDMTEuMywxNyAxMS43LDE3IDEyLDE3QzEyLjMsMTcgMTIuNywxNyAxMywxN1YxMi44QzEzLDEyLjggMTMsOC45IDE2LDcuOUMxNiw3LjkgMTQsMTAuOSAxNCwxMi45QzIxLDEzLjYgMjEsNCAyMSw0QzIxLDQgMTIuMSwzIDExLjMsOS4xWiIgLz48L3N2Zz4= Login Screen In this example, the login screen now displays the custom logo in a different size. The product title is also shown, and the OIDC tabstrip title and text have been changed to a custom-specific one. Product-related links are displayed below the login button. The footer contains a copyright notice for the custom company.\nTeaser in Main Navigation The template approach is also used in this case to change the font-size and line-height of the product title and slogan. 
The product version (superscript) is omitted.\nAbout Dialog Changing the productLogoUrl and the productName automatically affects the appearance of the About Dialog and the document title.\n","categories":"","description":"","excerpt":"Theming and Branding Motivation Gardener landscape administrators …","ref":"/docs/dashboard/customization/","tags":"","title":"Customization"},{"body":"Documentation Index Overview Gardener Landing Page gardener.cloud Usage Working with Projects Project Operations Automating Project Resource Management Use the Webterminal Terminal Shortcuts Connect kubectl Custom Shoot Fields Operations Configure Access Restrictions Theming and Branding Webterminals Development Dashboard Architecture Setting Up a Local Development Environment Testing Hotfixes ","categories":"","description":"","excerpt":"Documentation Index Overview Gardener Landing Page gardener.cloud …","ref":"/docs/dashboard/readme/","tags":"","title":"Dashboard"},{"body":"Data Disk Restore From Image Table of Contents Summary Motivation Goals Non-Goals Proposal Alternatives Summary Currently, we have no support either in the shoot spec or in the MCM GCP Provider for restoring GCP Data Disks from images.\nMotivation The primary motivation is to support Integration of vSMP MemoryOne in Azure. We implemented support for this in AWS via Support for data volume snapshot ID. In GCP we have the option to restore a data disk from a custom image, which is more convenient and flexible.\nGoals Extend the GCP provider specific WorkerConfig section in the shoot YAML and support provider configuration for data disks, so that a data disk can be created from a supplied image name. Proposal Shoot Specification Currently, there is no support for provider specific configuration of data disks in a GCP shoot spec. 
The following shows an example configuration at the time of this proposal:\nproviderConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: interface: NVME encryption: # optional, skipped detail here serviceAccount: email: foo@bar.com scopes: - https://www.googleapis.com/auth/cloud-platform gpu: acceleratorType: nvidia-tesla-t4 count: 1 We propose that the worker config section be enhanced to support data disk configuration:\nproviderConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: interface: NVME encryption: # optional, skipped detail here dataVolumes: # \u003c-- NEW SUB_SECTION - name: vsmp1 image: imgName serviceAccount: email: foo@bar.com scopes: - https://www.googleapis.com/auth/cloud-platform gpu: acceleratorType: nvidia-tesla-t4 count: 1 In the above, the imgName specified in providerConfig.dataVolumes.image is the name of an image previously created by a tool or process. See Google Cloud Create Image.\nThe MCM GCP Provider will ensure that, when a VM instance is instantiated, the data disk(s) for the VM are created with the source image set to the provided imgName. The mechanics of this are left to the MCM GCP provider. See the image param of the --create-disk flag in Google Cloud Instance Creation\n","categories":"","description":"","excerpt":"Data Disk Restore From Image Table of Contents Summary Motivation …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/proposals/datadisk-image-restore/","tags":"","title":"Data Disk Restore From Image"},{"body":"Default Seccomp Profile and Configuration This is a short guide describing how to enable the defaulting of seccomp profiles for Gardener managed workloads in the seed. Running pods in Unconfined (seccomp disabled) mode is undesirable since this is the least restrictive profile. Also, keep in mind that any privileged container will always run as Unconfined. 
More information about seccomp can be found in this Kubernetes tutorial.\nSetting the Seccomp Profile to RuntimeDefault for Seed Clusters To address the above issue, Gardener provides a webhook that is capable of mutating pods in the seed clusters, explicitly providing them with a seccomp profile type of RuntimeDefault. This profile is defined by the container runtime and represents a set of default syscalls that are allowed or not.\nspec: securityContext: seccompProfile: type: RuntimeDefault A Pod is mutated when all of the following preconditions are fulfilled:\n The Pod is created in a Gardener managed namespace. The Pod is NOT labeled with seccompprofile.resources.gardener.cloud/skip. The Pod does NOT explicitly specify .spec.securityContext.seccompProfile.type. How to Configure To enable this feature, the gardenlet DefaultSeccompProfile feature gate must be set to true.\nfeatureGates: DefaultSeccompProfile: true Please refer to the examples in this yaml file for more information.\nOnce the feature gate is enabled, the webhook will be registered and configured for the seed cluster. Newly created pods will be mutated to have their seccomp profile set to RuntimeDefault.\n Note: Please note that this feature is still in Alpha, so you might see instabilities every now and then.\n Setting the Seccomp Profile to RuntimeDefault for Shoot Clusters You can enable the use of RuntimeDefault as the default seccomp profile for all workloads. If enabled, the kubelet will use the RuntimeDefault seccomp profile by default, which is defined by the container runtime, instead of using the Unconfined mode. 
More information for this feature can be found in the Kubernetes documentation.\nTo use seccomp profile defaulting, you must run the kubelet with the SeccompDefault feature gate enabled (this is the default).\nHow to Configure To enable this feature, the kubelet seccompDefault configuration parameter must be set to true in the shoot’s spec.\nspec: kubernetes: version: 1.25.0 kubelet: seccompDefault: true Please refer to the examples in this yaml file for more information.\n","categories":"","description":"Enable the use of `RuntimeDefault` as the default seccomp profile through `spec.kubernetes.kubelet.seccompDefault`","excerpt":"Enable the use of `RuntimeDefault` as the default seccomp profile …","ref":"/docs/gardener/default_seccomp_profile/","tags":"","title":"Default Seccomp Profile"},{"body":"Defaulting Strategy and Developer Guidelines This document walks you through:\n Conventions to be followed when writing defaulting functions How to write a test for a defaulting function The document is aimed towards developers who want to contribute code and need to write defaulting code and unit tests covering the defaulting functions, as well as maintainers and reviewers who review code. It serves as a common guide that we commit to follow in our project to ensure consistency in our defaulting code, good coverage for high confidence, and good maintainability.\nWriting defaulting code Every kubernetes type should have a dedicated defaults_*.go file. For instance, if you have a Shoot type, there should be a corresponding defaults_shoot.go file containing all defaulting logic for that type. If there is only one type under an api group then we can just have types.go and a corresponding defaults.go. For instance, resourcemanager api has only one types.go, hence in this case only defaults.go file would suffice. Aim to segregate each struct type into its own SetDefaults_* function. 
These functions encapsulate the defaulting logic specific to the corresponding struct type, enhancing modularity and maintainability. For example, the ServerConfiguration struct in the resourcemanager API has a corresponding SetDefaults_ServerConfiguration() function. ⚠️ Make sure to run the make generate WHAT=codegen command when a new SetDefaults_* function is added; it generates the zz_generated.defaults.go file containing the overall defaulting function.\nWriting unit tests for defaulting code Each test case should validate the overall defaulting function SetObjectDefaults_* generated by defaulter-gen and not a specific SetDefaults_*. This way we also test whether the zz_generated.defaults.go was generated correctly. For example, the spec.machineImages[].updateStrategy field in the CloudProfile is defaulted as follows: https://github.com/gardener/gardener/blob/ff5a5be6049777b0695659a50189e461e1b17796/pkg/apis/core/v1beta1/defaults_cloudprofile.go#L23-L29 The defaulting should be tested with the overall defaulting function SetObjectDefaults_CloudProfile (and not with SetDefaults_MachineImage): https://github.com/gardener/gardener/blob/ff5a5be6049777b0695659a50189e461e1b17796/pkg/apis/core/v1beta1/defaults_cloudprofile_test.go#L40-L47\n Test each defaulting function carefully to ensure:\n Proper defaulting behaviour when fields are empty or nil. Note that some fields may be optional and should not be defaulted.\n Preservation of existing values, ensuring that defaulting does not accidentally overwrite them.\nFor example, when the spec.secretRef.namespace field of a SecretBinding is nil, it should be defaulted to the namespace of the SecretBinding object. However, the spec.secretRef.namespace field should not be overwritten by the defaulting logic if it is already set. 
https://github.com/gardener/gardener/blob/ff5a5be6049777b0695659a50189e461e1b17796/pkg/apis/core/v1beta1/defaults_secretbinding_test.go#L26-L54\n ","categories":"","description":"","excerpt":"Defaulting Strategy and Developer Guidelines This document walks you …","ref":"/docs/gardener/defaulting/","tags":"","title":"Defaulting"},{"body":"DEP-NN: Your short, descriptive title Table of Contents Summary Motivation Goals Non-Goals Proposal Alternatives Summary Motivation Goals Non-Goals Proposal Alternatives ","categories":"","description":"","excerpt":"DEP-NN: Your short, descriptive title Table of Contents Summary …","ref":"/docs/other-components/etcd-druid/proposals/00-template/","tags":"","title":"DEP Title"},{"body":"Testing We follow the BDD-style testing principles and are leveraging the Ginkgo framework along with Gomega as the matcher library. To execute the existing tests, you can use\nmake test # runs tests make verify # runs static code checks and test There is an additional command for analyzing the code coverage of the tests. Ginkgo will generate standard Golang cover profiles, which the Go Cover Tool translates into an HTML file. Another command helps you clean up the temporary cover profile files and the HTML report from the filesystem:\nmake test-cov open gardener.coverage.html make test-cov-clean sigs.k8s.io/controller-runtime env test Some of the integration tests in Gardener use the sigs.k8s.io/controller-runtime/pkg/envtest package. It sets up a temporary control plane (etcd + kube-apiserver) against which the integration tests can run. The test and test-cov rules in the Makefile prepare this env test automatically by downloading the respective binaries (if not yet present) and setting the necessary environment variables.\nYou can also run go test or ginkgo without the test/test-cov rules. 
In this case, you have to set the KUBEBUILDER_ASSETS environment variable to the path that contains the etcd + kube-apiserver binaries, or you need to have the binaries pre-installed under /usr/local/kubebuilder/bin.\nDependency Management We are using go modules for dependency management. In order to add a new package dependency to the project, you can perform go get \u003cPACKAGE\u003e@\u003cVERSION\u003e or edit the go.mod file and append the package along with the version you want to use.\nUpdating Dependencies The Makefile contains a rule called revendor which performs go mod vendor and go mod tidy. go mod vendor resets the main module’s vendor directory to include all packages needed to build and test all the main module’s packages. It does not include test code for vendored packages. go mod tidy makes sure go.mod matches the source code in the module. It adds any missing modules necessary to build the current module’s packages and dependencies, and it removes unused modules that don’t provide any relevant packages.\nmake revendor The dependencies are installed into the vendor folder, which should be added to the VCS.\nWarning Make sure that you test the code after you have updated the dependencies! ","categories":"","description":"","excerpt":"Testing We follow the BDD-style testing principles and are leveraging …","ref":"/docs/contribute/code/dependencies/","tags":"","title":"Dependencies"},{"body":"Dependency Management We are using go modules for dependency management. In order to add a new package dependency to the project, you can perform go get \u003cPACKAGE\u003e@\u003cVERSION\u003e or edit the go.mod file and append the package along with the version you want to use.\nUpdating Dependencies The Makefile contains a rule called tidy which performs go mod tidy:\n go mod tidy makes sure go.mod matches the source code in the module. 
It adds any missing modules necessary to build the current module’s packages and dependencies, and it removes unused modules that don’t provide any relevant packages. make tidy ⚠️ Make sure that you test the code after you have updated the dependencies!\nExported Packages This repository contains several packages that could be considered “exported packages”, in a sense that they are supposed to be reused in other Go projects. For example:\n Gardener’s API packages: pkg/apis Library for building Gardener extensions: extensions Gardener’s Test Framework: test/framework There are a few more folders in this repository (non-Go sources) that are reused across projects in the Gardener organization:\n GitHub templates: .github Concourse / cc-utils related helpers: hack/.ci Development, build and testing helpers: hack These packages feature a dummy doc.go file to allow other Go projects to pull them in as go mod dependencies.\nThese packages are explicitly not supposed to be used in other projects (consider them as “non-exported”):\n API validation packages: pkg/apis/*/*/validation Operation package (main Gardener business logic regarding Seed and Shoot clusters): pkg/gardenlet/operation Third party code: third_party Currently, we don’t have a mechanism yet for selectively syncing out these exported packages into dedicated repositories like kube’s staging mechanism (publishing-bot).\nImport Restrictions We want to make sure that other projects can depend on this repository’s “exported” packages without pulling in the entire repository (including “non-exported” packages) or a high number of other unwanted dependencies. 
Hence, we have to be careful when adding new imports or references between our packages.\n ℹ️ General rule of thumb: the mentioned “exported” packages should be as self-contained as possible and depend on as few other packages in the repository and other projects as possible.\n In order to support that rule and automatically check compliance with that goal, we leverage import-boss. The tool checks all imports of the given packages (including transitive imports) against rules defined in .import-restrictions files in each directory. An import is allowed if it matches at least one allowed prefix and does not match any forbidden prefixes.\n Note: '' (the empty string) is a prefix of everything. For more details, see the import-boss topic.\n import-boss is executed on every pull request and blocks the PR if it doesn’t comply with the defined import restrictions. You can also run it locally using make check.\nImport restrictions should be changed in the following situations:\n We spot a new pattern of imports across our packages that was not restricted before but makes it more difficult for other projects to depend on our “exported” packages. In that case, the imports should be further restricted to disallow such problematic imports, and the code/package structure should be reworked to comply with the newly given restrictions. We want to share code between packages, but existing import restrictions prevent us from doing so. In that case, please consider what additional dependencies it will pull in, when loosening existing restrictions. Also consider possible alternatives, like code restructurings or extracting shared code into dedicated packages for minimal impact on dependent projects. 
","categories":"","description":"","excerpt":"Dependency Management We are using go modules for dependency …","ref":"/docs/gardener/dependencies/","tags":"","title":"Dependencies"},{"body":"Documentation Index Concepts Prober Weeder Development Contributions Testing Setup Dependency Watchdog using local Garden cluster Deployment Configure dependency watchdog ","categories":"","description":"","excerpt":"Documentation Index Concepts Prober Weeder Development …","ref":"/docs/other-components/dependency-watchdog/readme/","tags":"","title":"Dependency Watchdog"},{"body":"Deploying Gardenlets Gardenlets act as decentralized agents to manage the shoot clusters of a seed cluster.\nProcedure After you have deployed the Gardener control plane, you need one or more seed clusters in order to be able to create shoot clusters.\nYou can either register an existing cluster as “seed” (this could also be the cluster in which the control plane runs), or you can create new clusters (typically shoots, i.e., this approach registers at least one first initial seed) and then register them as “seeds”.\nThe following sections describe the scenarios.\nRegister A First Seed Cluster If you have not registered a seed cluster yet (thus, you need to deploy a first, so-called “unmanaged seed”), your approach depends on how you deployed the Gardener control plane.\nGardener Control Plane Deployed Via gardener/controlplane Helm chart You can follow Deploy a gardenlet Manually.\nGardener Control Plane Deployed Via gardener-operator If you want to register the same cluster in which gardener-operator runs, or if you want to register another cluster that is reachable (network-wise) for gardener-operator, you can follow Deploy gardenlet via gardener-operator. If you want to register a cluster that is not reachable (network-wise) (e.g., because it runs behind a firewall), you can follow Deploy a gardenlet Manually. 
Register Further Seed Clusters If you already have a seed cluster, and you want to deploy further seed clusters (so-called “managed seeds”), you can follow Deploy a gardenlet Automatically.\n","categories":"","description":"","excerpt":"Deploying Gardenlets Gardenlets act as decentralized agents to manage …","ref":"/docs/gardener/deployment/deploy_gardenlet/","tags":"","title":"Deploy Gardenlet"},{"body":"Deploy a gardenlet Automatically The gardenlet can automatically deploy itself into shoot clusters, and register them as seed clusters. These clusters are called “managed seeds” (aka “shooted seeds”). This procedure is the preferred way to add additional seed clusters, because shoot clusters already come with production-grade qualities that are also demanded for seed clusters.\nPrerequisites The only prerequisite is to register an initial cluster as a seed cluster that already has a manually deployed gardenlet (for a step-by-step manual installation guide, see Deploy a Gardenlet Manually).\n [!TIP] The initial seed cluster can be the garden cluster itself, but for better separation of concerns, it is recommended to only register other clusters as seeds.\n Auto-Deployment of Gardenlets into Shoot Clusters For better scalability of your Gardener landscape (e.g., when the total number of Shoots grows), you usually need more seed clusters, which you can create as follows:\n Use the initial seed cluster (“unmanaged seed”) to create shoot clusters that you later register as seed clusters. The gardenlet deployed in the initial cluster can deploy itself into the shoot clusters (which eventually registers them as seeds) if ManagedSeed resources are created. The advantage of this approach is that only one initial gardenlet installation is required. 
Every other managed seed cluster gets an automatically deployed gardenlet.\nRelated Links ManagedSeeds: Register Shoot as Seed ","categories":"","description":"","excerpt":"Deploy a gardenlet Automatically The gardenlet can automatically …","ref":"/docs/gardener/deployment/deploy_gardenlet_automatically/","tags":"","title":"Deploy Gardenlet Automatically"},{"body":"Deploy a gardenlet Manually Manually deploying a gardenlet is usually only required if the Kubernetes cluster to be registered as a seed cluster is managed via third-party tooling (i.e., the Kubernetes cluster is not a shoot cluster, so Deploy a gardenlet Automatically cannot be used). In this case, gardenlet needs to be deployed manually, meaning that its Helm chart must be installed.\n [!TIP] Once you’ve deployed a gardenlet manually, you can deploy new gardenlets automatically. The manually deployed gardenlet is then used as a template for the new gardenlets. For more information, see Deploy a gardenlet Automatically.\n Prerequisites Kubernetes Cluster that Should Be Registered as a Seed Cluster Verify that the cluster has a supported Kubernetes version.\n Determine the nodes, pods, and services CIDR of the cluster. You need to configure this information in the Seed configuration. Gardener uses this information to check that the shoot cluster isn’t created with overlapping CIDR ranges.\n Every seed cluster needs an Ingress controller which distributes external requests to internal components like Plutono and Prometheus. 
For this, configure the following lines in your Seed resource:\nspec: dns: provider: type: aws-route53 secretRef: name: ingress-secret namespace: garden ingress: domain: ingress.my-seed.example.com controller: kind: nginx providerConfig: \u003csome-optional-provider-specific-config-for-the-ingressController\u003e Procedure Overview Prepare the garden cluster: Create a bootstrap token secret in the kube-system namespace of the garden cluster Create RBAC roles for the gardenlet to allow bootstrapping in the garden cluster Prepare the gardenlet Helm chart. Automatically register shoot cluster as a seed cluster. Deploy the gardenlet Check that the gardenlet is successfully deployed Create a Bootstrap Token Secret in the kube-system Namespace of the Garden Cluster The gardenlet needs to talk to the Gardener API server residing in the garden cluster.\nUse gardenlet’s ability to request a signed certificate for the garden cluster by leveraging Kubernetes Certificate Signing Requests. The gardenlet performs a TLS bootstrapping process that is similar to the Kubelet TLS Bootstrapping. Make sure that the API server of the garden cluster has bootstrap token authentication enabled.\nThe client credentials required for the gardenlet’s TLS bootstrapping process need to be either token or certificate (OIDC isn’t supported) and have permissions to create a Certificate Signing Request (CSR). It’s recommended to use bootstrap tokens due to their desirable security properties (such as a limited token lifetime).\nTherefore, first create a bootstrap token secret for the garden cluster:\napiVersion: v1 kind: Secret metadata: # Name MUST be of form \"bootstrap-token-\u003ctoken id\u003e\" name: bootstrap-token-07401b namespace: kube-system # Type MUST be 'bootstrap.kubernetes.io/token' type: bootstrap.kubernetes.io/token stringData: # Human readable description. Optional. description: \"Token to be used by the gardenlet for Seed `sweet-seed`.\" # Token ID and secret. Required. 
token-id: 07401b # 6 characters token-secret: f395accd246ae52d # 16 characters # Expiration. Optional. # expiration: 2017-03-10T03:22:11Z # Allowed usages. usage-bootstrap-authentication: \"true\" usage-bootstrap-signing: \"true\" When you later prepare the gardenlet Helm chart, a kubeconfig based on this token is shared with the gardenlet upon deployment.\nPrepare the gardenlet Helm Chart This section only describes the minimal configuration, using the global configuration values of the gardenlet Helm chart. For an overview over all values, see the configuration values. We refer to the global configuration values as gardenlet configuration in the following procedure.\n Create a gardenlet configuration gardenlet-values.yaml based on this template.\n Create a bootstrap kubeconfig based on the bootstrap token created in the garden cluster.\nReplace the \u003cbootstrap-token\u003e with token-id.token-secret (from our previous example: 07401b.f395accd246ae52d) from the bootstrap token secret.\napiVersion: v1 kind: Config current-context: gardenlet-bootstrap@default clusters: - cluster: certificate-authority-data: \u003cca-of-garden-cluster\u003e server: https://\u003cendpoint-of-garden-cluster\u003e name: default contexts: - context: cluster: default user: gardenlet-bootstrap name: gardenlet-bootstrap@default users: - name: gardenlet-bootstrap user: token: \u003cbootstrap-token\u003e In the gardenClientConnection.bootstrapKubeconfig section of your gardenlet configuration, provide the bootstrap kubeconfig together with a name and namespace to the gardenlet Helm chart.\ngardenClientConnection: bootstrapKubeconfig: name: gardenlet-kubeconfig-bootstrap namespace: garden kubeconfig: | \u003cbootstrap-kubeconfig\u003e # will be base64 encoded by helm The bootstrap kubeconfig is stored in the specified secret.\n In the gardenClientConnection.kubeconfigSecret section of your gardenlet configuration, define a name and a namespace where the gardenlet stores the real kubeconfig 
that it creates during the bootstrap process. If the secret doesn’t exist, the gardenlet creates it for you.\ngardenClientConnection: kubeconfigSecret: name: gardenlet-kubeconfig namespace: garden Updating the Garden Cluster CA The kubeconfig created by the gardenlet in step 4 will not be recreated as long as it exists, even if a new bootstrap kubeconfig is provided. To enable rotation of the garden cluster CA certificate, a new bundle can be provided via the gardenClientConnection.gardenClusterCACert field. If the provided bundle differs from the one currently in the gardenlet’s kubeconfig secret then it will be updated. To remove the CA completely (e.g. when switching to a publicly trusted endpoint), this field can be set to either none or null.\nPrepare Seed Specification When gardenlet starts, it tries to register a Seed resource in the garden cluster based on the specification provided in seedConfig in its configuration.\n This procedure doesn’t describe all the possible configurations for the Seed resource. For more information, see:\n Example Seed resource Configurable Seed settings Supply the Seed resource in the seedConfig section of your gardenlet configuration gardenlet-values.yaml.\n Add the seedConfig to your gardenlet configuration gardenlet-values.yaml. The field seedConfig.spec.provider.type specifies the infrastructure provider type (for example, aws) of the seed cluster. For all supported infrastructure providers, see Known Extension Implementations.\n# ... 
seedConfig: metadata: name: sweet-seed labels: environment: evaluation annotations: custom.gardener.cloud/option: special spec: dns: provider: type: \u003cprovider\u003e secretRef: name: ingress-secret namespace: garden ingress: # see prerequisites domain: ingress.dev.my-seed.example.com controller: kind: nginx networks: # see prerequisites nodes: 10.240.0.0/16 pods: 100.244.0.0/16 services: 100.32.0.0/13 shootDefaults: # optional: non-overlapping default CIDRs for shoot clusters of that Seed pods: 100.96.0.0/11 services: 100.64.0.0/13 provider: region: eu-west-1 type: \u003cprovider\u003e Apart from the seed’s name, seedConfig.metadata can optionally contain labels and annotations. gardenlet will set the labels of the registered Seed object to the labels given in the seedConfig plus gardener.cloud/role=seed. Any custom labels on the Seed object will be removed on the next restart of gardenlet. If a label is removed from the seedConfig it is removed from the Seed object as well. In contrast to labels, annotations in the seedConfig are added to existing annotations on the Seed object. Thus, custom annotations that are added to the Seed object during runtime are not removed by gardenlet on restarts. Furthermore, if an annotation is removed from the seedConfig, gardenlet does not remove it from the Seed object.\nOptional: Enable HA Mode You may consider running gardenlet with multiple replicas, especially if the seed cluster is configured to host HA shoot control planes. Therefore, the following Helm chart values define the degree of high availability you want to achieve for the gardenlet deployment.\nreplicaCount: 2 # or more if a higher failure tolerance is required. failureToleranceType: zone # One of `zone` or `node` - defines how replicas are spread. 
Optional: Enable Backup and Restore The seed cluster can be set up with backup and restore for the main etcds of shoot clusters.\nGardener uses etcd-backup-restore that integrates with different storage providers to store the shoot cluster’s main etcd backups. Make sure to obtain client credentials that have sufficient permissions with the chosen storage provider.\nCreate a secret in the garden cluster with client credentials for the storage provider. The format of the secret is cloud provider specific and can be found in the repository of the respective Gardener extension. For example, the secret for AWS S3 can be found in the AWS provider extension (30-etcd-backup-secret.yaml).\napiVersion: v1 kind: Secret metadata: name: sweet-seed-backup namespace: garden type: Opaque data: # client credentials format is provider specific Configure the Seed resource in the seedConfig section of your gardenlet configuration to use backup and restore:\n# ... seedConfig: metadata: name: sweet-seed spec: backup: provider: \u003cprovider\u003e secretRef: name: sweet-seed-backup namespace: garden Optional: Enable Self-Upgrades In order to take off the continuous task of deploying gardenlet’s Helm chart in case you want to upgrade its version, it supports self-upgrades. The way this works is that it pulls information (its configuration and deployment values) from a seedmanagement.gardener.cloud/v1alpha1.Gardenlet resource in the garden cluster. This resource must be in the garden namespace and must have the same name as the Seed the gardenlet is responsible for. For more information, see this section.\nIn order to make gardenlet automatically create a corresponding seedmanagement.gardener.cloud/v1alpha1.Gardenlet resource, you must provide\nselfUpgrade: deployment: helm: ociRepository: ref: \u003curl-to-oci-repository-containing-gardenlet-helm-chart\u003e in your gardenlet-values.yaml file. 
Please replace the ref placeholder with the URL to the OCI repository containing the gardenlet Helm chart you are installing.\n [!NOTE]\nIf you don’t configure this selfUpgrade section in the initial deployment, you can also do it later, or you directly create the corresponding seedmanagement.gardener.cloud/v1alpha1.Gardenlet resource in the garden cluster.\n Deploy the gardenlet The gardenlet-values.yaml looks something like this (with backup for shoot clusters enabled):\n# \u003cdefault config\u003e # ... config: gardenClientConnection: # ... bootstrapKubeconfig: name: gardenlet-bootstrap-kubeconfig namespace: garden kubeconfig: |apiVersion: v1 clusters: - cluster: certificate-authority-data: \u003cdummy\u003e server: \u003cmy-garden-cluster-endpoint\u003e name: my-kubernetes-cluster # ... kubeconfigSecret: name: gardenlet-kubeconfig namespace: garden # ... # \u003cdefault config\u003e # ... seedConfig: metadata: name: sweet-seed spec: dns: provider: type: \u003cprovider\u003e secretRef: name: ingress-secret namespace: garden ingress: # see prerequisites domain: ingress.dev.my-seed.example.com controller: kind: nginx networks: nodes: 10.240.0.0/16 pods: 100.244.0.0/16 services: 100.32.0.0/13 shootDefaults: pods: 100.96.0.0/11 services: 100.64.0.0/13 provider: region: eu-west-1 type: \u003cprovider\u003e backup: provider: \u003cprovider\u003e secretRef: name: sweet-seed-backup namespace: garden Deploy the gardenlet Helm chart to the Kubernetes cluster:\nhelm install gardenlet charts/gardener/gardenlet \\ --namespace garden \\ -f gardenlet-values.yaml \\ --wait This Helm chart creates:\n A service account gardenlet that the gardenlet can use to talk to the Seed API server. RBAC roles for the service account (full admin rights at the moment). The secret (garden/gardenlet-bootstrap-kubeconfig) containing the bootstrap kubeconfig. The gardenlet deployment in the garden namespace. 
Check that the gardenlet Is Successfully Deployed Check that the gardenlet’s certificate bootstrap was successful.\nCheck if the secret gardenlet-kubeconfig in the namespace garden in the seed cluster is created and contains a kubeconfig with a valid certificate.\n Get the kubeconfig from the created secret.\n$ kubectl -n garden get secret gardenlet-kubeconfig -o json | jq -r .data.kubeconfig | base64 -d Test against the garden cluster and verify it’s working.\n Extract the client-certificate-data from the user gardenlet.\n View the certificate:\n$ openssl x509 -in ./gardenlet-cert -noout -text Check that the bootstrap secret gardenlet-bootstrap-kubeconfig has been deleted from the seed cluster in namespace garden.\n Check that the seed cluster is registered and READY in the garden cluster.\nCheck that the seed cluster sweet-seed exists and all conditions indicate that it’s available. If so, the gardenlet is sending regular heartbeats and the seed bootstrapping was successful.\nCheck that the conditions on the Seed resource look similar to the following:\n$ kubectl get seed sweet-seed -o json | jq .status.conditions [ { \"lastTransitionTime\": \"2020-07-17T09:17:29Z\", \"lastUpdateTime\": \"2020-07-17T09:17:29Z\", \"message\": \"Gardenlet is posting ready status.\", \"reason\": \"GardenletReady\", \"status\": \"True\", \"type\": \"GardenletReady\" }, { \"lastTransitionTime\": \"2020-07-17T09:17:49Z\", \"lastUpdateTime\": \"2020-07-17T09:53:17Z\", \"message\": \"Backup Buckets are available.\", \"reason\": \"BackupBucketsAvailable\", \"status\": \"True\", \"type\": \"BackupBucketsReady\" } ] Self Upgrades In order to keep your gardenlets in such “unmanaged seeds” up-to-date (i.e., in seeds that are not shoot clusters), the gardenlet Helm chart must be redeployed regularly. This requires network connectivity to such clusters, which can be challenging if they reside behind a firewall or in restricted environments. 
It is much simpler if gardenlet could keep itself up-to-date, based on configuration read from the garden cluster. This approach greatly reduces operational complexity.\ngardenlet runs a controller which watches for seedmanagement.gardener.cloud/v1alpha1.Gardenlet resources in the garden cluster in the garden namespace having the same name as the Seed the gardenlet is responsible for. Such resources contain its component configuration and deployment values. Most notably, a URL to an OCI repository containing gardenlet’s Helm chart is included.\nAn example Gardenlet resource looks like this:\napiVersion: seedmanagement.gardener.cloud/v1alpha1 kind: Gardenlet metadata: name: local namespace: garden spec: deployment: replicaCount: 1 revisionHistoryLimit: 2 helm: ociRepository: ref: \u003curl-to-gardenlet-chart-repository\u003e:v1.97.0 config: apiVersion: gardenlet.config.gardener.cloud/v1alpha1 kind: GardenletConfiguration gardenClientConnection: kubeconfigSecret: name: gardenlet-kubeconfig namespace: garden controllers: shoot: reconcileInMaintenanceOnly: true respectSyncPeriodOverwrite: true shootState: concurrentSyncs: 0 featureGates: DefaultSeccompProfile: true HVPA: true HVPAForShootedSeed: true IPv6SingleStack: true ShootManagedIssuer: true etcdConfig: featureGates: UseEtcdWrapper: true logging: enabled: true vali: enabled: true shootNodeLogging: shootPurposes: - infrastructure - production - development - evaluation seedConfig: apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: labels: base: kind spec: backup: provider: local region: local secretRef: name: backup-local namespace: garden dns: provider: secretRef: name: internal-domain-internal-local-gardener-cloud namespace: garden type: local ingress: controller: kind: nginx domain: ingress.local.seed.local.gardener.cloud networks: nodes: 172.18.0.0/16 pods: 10.1.0.0/16 services: 10.2.0.0/16 shootDefaults: pods: 10.3.0.0/16 services: 10.4.0.0/16 provider: region: local type: local zones: - \"0\" 
settings: excessCapacityReservation: enabled: false scheduling: visible: true verticalPodAutoscaler: enabled: true On reconciliation, gardenlet downloads the Helm chart, renders it with the provided values, and then applies it to its own cluster. Hence, in order to keep a gardenlet up-to-date, it is enough to update the tag/digest of the OCI repository ref for the Helm chart:\nspec: deployment: helm: ociRepository: ref: \u003curl-to-gardenlet-chart-repository\u003e:v1.97.0 This way, network connectivity to the cluster in which gardenlet runs is not required at all (at least for deployment purposes).\nWhen you delete this resource, nothing happens: gardenlet remains running with the configuration as before. However, self-upgrades are obviously not possible anymore. In order to upgrade it, you have to either recreate the Gardenlet object, or redeploy the Helm chart.\nRelated Links Issue #1724: Harden Gardenlet RBAC privileges. Backup and Restore. ","categories":"","description":"","excerpt":"Deploy a gardenlet Manually Manually deploying a gardenlet is usually …","ref":"/docs/gardener/deployment/deploy_gardenlet_manually/","tags":"","title":"Deploy Gardenlet Manually"},{"body":"Deploy a gardenlet Via gardener-operator The gardenlet can automatically be deployed by gardener-operator into existing Kubernetes clusters in order to register them as seeds.\nPrerequisites Using this method only works when gardener-operator is managing the garden cluster. If you have used the gardener/controlplane Helm chart for the deployment of the Gardener control plane, please refer to this document.\n [!TIP] The initial seed cluster can be the garden cluster itself, but for better separation of concerns, it is recommended to only register other clusters as seeds.\n Deployment of gardenlets Using this method, gardener-operator is only taking care of the very first deployment of gardenlet. Once running, the gardenlet leverages the self upgrade strategy in order to keep itself up-to-date. 
Concretely, gardener-operator only acts when there is no respective Seed resource yet.\nIn order to request a gardenlet deployment, create following resource in the (virtual) garden cluster:\napiVersion: seedmanagement.gardener.cloud/v1alpha1 kind: Gardenlet metadata: name: local namespace: garden spec: deployment: replicaCount: 1 revisionHistoryLimit: 2 helm: ociRepository: ref: \u003curl-to-gardenlet-chart-repository\u003e:v1.97.0 config: apiVersion: gardenlet.config.gardener.cloud/v1alpha1 kind: GardenletConfiguration controllers: shoot: reconcileInMaintenanceOnly: true respectSyncPeriodOverwrite: true shootState: concurrentSyncs: 0 featureGates: ShootManagedIssuer: true etcdConfig: featureGates: UseEtcdWrapper: true logging: enabled: true vali: enabled: true shootNodeLogging: shootPurposes: - infrastructure - production - development - evaluation seedConfig: apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: labels: base: kind spec: backup: provider: local region: local secretRef: name: backup-local namespace: garden dns: provider: secretRef: name: internal-domain-internal-local-gardener-cloud namespace: garden type: local ingress: controller: kind: nginx domain: ingress.local.seed.local.gardener.cloud networks: nodes: 172.18.0.0/16 pods: 10.1.0.0/16 services: 10.2.0.0/16 shootDefaults: pods: 10.3.0.0/16 services: 10.4.0.0/16 provider: region: local type: local zones: - \"0\" settings: excessCapacityReservation: enabled: false scheduling: visible: true verticalPodAutoscaler: enabled: true This causes gardener-operator to deploy gardenlet to the same cluster where it is running. 
Once it comes up, gardenlet will create a Seed resource with the same name and use the Gardenlet resource for self-upgrades (see this document).\nRemote Clusters If you want gardener-operator to deploy gardenlet into some other cluster, create a kubeconfig Secret and reference it in the Gardenlet resource:\napiVersion: v1 kind: Secret metadata: name: remote-cluster-kubeconfig namespace: garden type: Opaque data: kubeconfig: base64(kubeconfig-to-remote-cluster) --- apiVersion: seedmanagement.gardener.cloud/v1alpha1 kind: Gardenlet metadata: name: local namespace: garden spec: kubeconfigSecretRef: name: remote-cluster-kubeconfig # ... After successful deployment of gardenlet, gardener-operator will delete the remote-cluster-kubeconfig Secret and set .spec.kubeconfigSecretRef to nil. This is because the kubeconfig is never needed again (gardener-operator is only responsible for initial deployment, and gardenlet updates itself with an in-cluster kubeconfig).\n","categories":"","description":"","excerpt":"Deploy a gardenlet Via gardener-operator The gardenlet can …","ref":"/docs/gardener/deployment/deploy_gardenlet_via_operator/","tags":"","title":"Deploy Gardenlet Via Operator"},{"body":"Deploying Registry Cache Extension in Gardener’s Local Setup with Provider Extensions Prerequisites Make sure that you have a running local Gardener setup with enabled provider extensions. The steps to complete this can be found in the Deploying Gardener Locally and Enabling Provider-Extensions guide. Setting up the Registry Cache Extension Make sure that your KUBECONFIG environment variable is targeting the local Gardener cluster.\nThe location of the Gardener project from the Gardener setup step is expected to be under the same root (e.g. ~/go/src/github.com/gardener/). 
If this is not the case, the location of the Gardener project should be specified in the GARDENER_REPO_ROOT environment variable:\nexport GARDENER_REPO_ROOT=\"\u003cpath_to_gardener_project\u003e\" Then you can run:\nmake remote-extension-up In case you have added additional Seeds, you can specify the seed name:\nmake remote-extension-up SEED_NAME=\u003cseed-name\u003e The corresponding make target will build the extension image, push it into the Seed cluster image registry, and deploy the registry-cache ControllerDeployment and ControllerRegistration resources into the kind cluster. The container image in the ControllerDeployment will be the image that was built and pushed into the Seed cluster image registry.\nThe make target will then deploy the registry-cache admission component. It will build the admission image, push it into the kind cluster image registry, and finally install the admission component charts to the kind cluster.\nCreating a Shoot Cluster Once the above step is completed, you can create a Shoot cluster. In order to create a Shoot cluster, please create your own Shoot definition depending on the providers on your Seed cluster.\nTearing Down the Development Environment To tear down the development environment, delete the Shoot cluster or disable the registry-cache extension in the Shoot’s specification. 
When the extension is not used by the Shoot anymore, you can run:\nmake remote-extension-down The make target will delete the ControllerDeployment and ControllerRegistration of the extension, and the registry-cache admission Helm deployment.\n","categories":"","description":"Learn how to set up a development environment using own Seed clusters on an existing Kubernetes cluster","excerpt":"Learn how to set up a development environment using own Seed clusters …","ref":"/docs/extensions/others/gardener-extension-registry-cache/getting-started-remotely/","tags":"","title":"Deploying Registry Cache Extension in Gardener's Local Setup with Provider Extensions"},{"body":"Deploying Registry Cache Extension Locally Prerequisites Make sure that you have a running local Gardener setup. The steps to complete this can be found in the Deploying Gardener Locally guide. Setting up the Registry Cache Extension Make sure that your KUBECONFIG environment variable is targeting the local Gardener cluster. When this is ensured, run:\nmake extension-up The corresponding make target will build the extension image, load it into the kind cluster Nodes, and deploy the registry-cache ControllerDeployment and ControllerRegistration resources. The container image in the ControllerDeployment will be the image that was built and loaded into the kind cluster Nodes.\nThe make target will then deploy the registry-cache admission component. 
It will build the admission image, load it into the kind cluster Nodes, and finally install the admission component charts to the kind cluster.\nCreating a Shoot Cluster Once the above step is completed, you can create a Shoot cluster.\nexample/shoot-registry-cache.yaml contains a Shoot specification with the registry-cache extension:\nkubectl create -f example/shoot-registry-cache.yaml example/shoot-registry-mirror.yaml contains a Shoot specification with the registry-mirror extension:\nkubectl create -f example/shoot-registry-mirror.yaml Tearing Down the Development Environment To tear down the development environment, delete the Shoot cluster or disable the registry-cache extension in the Shoot’s specification. When the extension is not used by the Shoot anymore, you can run:\nmake extension-down The make target will delete the ControllerDeployment and ControllerRegistration of the extension, and the registry-cache admission helm deployment.\n","categories":"","description":"Learn how to set up a local development environment","excerpt":"Learn how to set up a local development environment","ref":"/docs/extensions/others/gardener-extension-registry-cache/getting-started-locally/","tags":"","title":"Deploying Registry Cache Extension Locally"},{"body":"Deploying Rsyslog Relp Extension Remotely This document will walk you through running the Rsyslog Relp extension controller on a remote seed cluster and the rsyslog relp admission component in your local garden cluster for development purposes. This guide uses Gardener’s setup with provider extensions and builds on top of it.\nIf you encounter difficulties, please open an issue so that we can make this process easier.\nPrerequisites Make sure that you have a running Gardener setup with provider extensions. The steps to complete this can be found in the Deploying Gardener Locally and Enabling Provider-Extensions guide. 
Make sure you are running Gardener version \u003e= 1.95.0 or the latest version of the master branch. Setting up the Rsyslog Relp Extension Important: Make sure that your KUBECONFIG environment variable is targeting the local Gardener cluster!\nThe location of the Gardener project from the Gardener setup is expected to be under the same root as this repository (e.g. ~/go/src/github.com/gardener/). If this is not the case, the location of the Gardener project should be specified in the GARDENER_REPO_ROOT environment variable:\nexport GARDENER_REPO_ROOT=\"\u003cpath_to_gardener_project\u003e\" Then you can run:\nmake remote-extension-up In case you have added additional Seeds, you can specify the seed name:\nmake remote-extension-up SEED_NAME=\u003cseed-name\u003e Creating a Shoot Cluster Once the above step is completed, you can create a Shoot cluster. In order to create a Shoot cluster, please create your own Shoot definition depending on the providers on your Seed cluster.\nConfiguring the Shoot Cluster and Deploying the Rsyslog Relp Echo Server To properly test the rsyslog relp extension, you need a running rsyslog relp echo server to which logs from the Shoot nodes can be sent. To deploy the server and configure the rsyslog relp extension on your Shoot cluster, you can run:\nmake configure-shoot SHOOT_NAME=\u003cshoot-name\u003e SHOOT_NAMESPACE=\u003cshoot-namespace\u003e This command will deploy an rsyslog relp echo server in your Shoot cluster in the rsyslog-relp-echo-server namespace. It will also add configuration for the shoot-rsyslog-relp extension to your Shoot spec by patching it with ./example/extension/\u003cshoot-name\u003e--\u003cshoot-namespace\u003e--extension-config-patch.yaml. This file is automatically copied from extension-config-patch.yaml.tmpl in the same directory when you run make configure-shoot for the first time. The file also includes explanations of the properties you should set or change. 
The command will also deploy the rsyslog-relp-tls secret in case you wish to enable TLS.\nTearing Down the Development Environment To tear down the development environment, delete the Shoot cluster or disable the shoot-rsyslog-relp extension in the Shoot’s specification. When the extension is not used by the Shoot anymore, you can run:\nmake remote-extension-down The make target will delete the ControllerDeployment and ControllerRegistration of the extension, and the shoot-rsyslog-relp admission Helm deployment.\n","categories":"","description":"Learn how to set up a development environment using own Seed clusters on an existing Kubernetes cluster","excerpt":"Learn how to set up a development environment using own Seed clusters …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/getting-started-remotely/","tags":"","title":"Deploying Rsyslog Relp Extension Remotely"},{"body":"Deployment of the AliCloud provider extension Disclaimer: This document is NOT a step-by-step installation guide for the AliCloud provider extension and only contains some configuration specifics regarding the installation of different components via the Helm charts residing in the AliCloud provider extension repository.\ngardener-extension-admission-alicloud Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-alicloud component is to not provide a kubeconfig at all. This way, in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution is to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication is to create a service account in the target cluster and bind the respective roles to it. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution is to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator (OWA) for the target cluster. The runtime cluster should also be registered as a trusted identity provider in the target cluster. Projected service account tokens from the runtime cluster can then be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the AliCloud provider extension Disclaimer: This …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/deployment/","tags":"","title":"Deployment"},{"body":"Deployment of the AWS provider extension Disclaimer: This document is NOT a step by step installation guide for the AWS provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the AWS provider extension repository.\ngardener-extension-admission-aws Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-aws component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution is to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication is to create a service account in the target cluster and bind the respective roles to it. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution is to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the AWS provider extension Disclaimer: This document is …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/deployment/","tags":"","title":"Deployment"},{"body":"Deployment of the Azure provider extension Disclaimer: This document is NOT a step by step installation guide for the Azure provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the Azure provider extension repository.\ngardener-extension-admission-azure Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-azure component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution will be to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication will be to create a service account and bind the respective roles to this service account in the target cluster. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution will be to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the Azure provider extension Disclaimer: This document …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/deployment/","tags":"","title":"Deployment"},{"body":"Deployment of the GCP provider extension Disclaimer: This document is NOT a step-by-step installation guide for the GCP provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the GCP provider extension repository.\ngardener-extension-admission-gcp Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-gcp component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution will be to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication will be to create a service account and bind the respective roles to this service account in the target cluster. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution will be to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the GCP provider extension Disclaimer: This document is …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/deployment/","tags":"","title":"Deployment"},{"body":"Deployment of the OpenStack provider extension Disclaimer: This document is NOT a step by step installation guide for the OpenStack provider extension and only contains some configuration specifics regarding the installation of different components via the helm charts residing in the OpenStack provider extension repository.\ngardener-extension-admission-openstack Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-openstack component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution will be to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication will be to create a service account and bind the respective roles to this service account in the target cluster. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution will be to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the OpenStack provider extension Disclaimer: This …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-openstack/deployment/","tags":"","title":"Deployment"},{"body":"Deployment of the networking Calico extension Disclaimer: This document is NOT a step by step deployment guide for the networking Calico extension and only contains some configuration specifics regarding the deployment of different components via the helm charts residing in the networking Calico extension repository.\ngardener-extension-admission-calico Authentication against the Garden cluster There are several authentication possibilities depending on whether or not the concept of Virtual Garden is used.\nVirtual Garden is not used, i.e., the runtime Garden cluster is also the target Garden cluster. Automounted Service Account Token The easiest way to deploy the gardener-extension-admission-calico component will be to not provide kubeconfig at all. This way in-cluster configuration and an automounted service account token will be used. 
The drawback of this approach is that the automounted token will not be automatically rotated.\nService Account Token Volume Projection Another solution will be to use Service Account Token Volume Projection combined with a kubeconfig referencing a token file (see example below).\napiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://default.kubernetes.svc.cluster.local name: garden contexts: - context: cluster: garden user: garden name: garden current-context: garden users: - name: garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token This will allow for automatic rotation of the service account token by the kubelet. The configuration can be achieved by setting both .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.kubeconfig in the respective chart’s values.yaml file.\nVirtual Garden is used, i.e., the runtime Garden cluster is different from the target Garden cluster. Service Account The easiest way to set up the authentication will be to create a service account and bind the respective roles to this service account in the target cluster. Then use the generated service account token and craft a kubeconfig which will be used by the workload in the runtime cluster. This approach does not provide a solution for the rotation of the service account token. However, this setup can be achieved by setting .Values.global.virtualGarden.enabled: true and following these steps:\n Deploy the application part of the charts in the target cluster. Get the service account token and craft the kubeconfig. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Client Certificate Another solution will be to bind the roles in the target cluster to a User subject instead of a service account and use a client certificate for authentication. This approach does not provide a solution for the client certificate rotation. 
However, this setup can be achieved by setting both .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name, then following these steps:\n Generate a client certificate for the target cluster for the respective user. Deploy the application part of the charts in the target cluster. Craft a kubeconfig using the already generated client certificate. Set the crafted kubeconfig and deploy the runtime part of the charts in the runtime cluster. Projected Service Account Token This approach requires an already deployed and configured oidc-webhook-authenticator for the target cluster. Also, the runtime cluster should be registered as a trusted identity provider in the target cluster. Then projected service account tokens from the runtime cluster can be used to authenticate against the target cluster. The needed steps are as follows:\n Deploy OWA and establish the needed trust. Set .Values.global.virtualGarden.enabled: true and .Values.global.virtualGarden.user.name. Note: the username value will depend on the trust configuration, e.g., \u003cprefix\u003e:system:serviceaccount:\u003cnamespace\u003e:\u003cserviceaccount\u003e Set .Values.global.serviceAccountTokenVolumeProjection.enabled: true and .Values.global.serviceAccountTokenVolumeProjection.audience. Note: the audience value will depend on the trust configuration, e.g., \u003cclient-id-from-trust-config\u003e. Craft a kubeconfig (see example below). Deploy the application part of the charts in the target cluster. Deploy the runtime part of the charts in the runtime cluster. 
apiVersion: v1 kind: Config clusters: - cluster: certificate-authority-data: \u003cCA-DATA\u003e server: https://virtual-garden.api name: virtual-garden contexts: - context: cluster: virtual-garden user: virtual-garden name: virtual-garden current-context: virtual-garden users: - name: virtual-garden user: tokenFile: /var/run/secrets/projected/serviceaccount/token ","categories":"","description":"","excerpt":"Deployment of the networking Calico extension Disclaimer: This …","ref":"/docs/extensions/network-extensions/gardener-extension-networking-calico/deployment/","tags":"","title":"Deployment"},{"body":"Gardener Certificate Management Introduction Gardener comes with an extension that enables shoot owners to request X.509-compliant certificates for shoot domains.\nExtension Installation The Shoot-Cert-Service extension can be deployed and configured via Gardener’s native resource ControllerRegistration.\nPrerequisites To let the Shoot-Cert-Service operate properly, you need to have:\n a DNS service in your seed contact details and optionally a private key for a pre-existing Let’s Encrypt account ControllerRegistration An example of a ControllerRegistration for the Shoot-Cert-Service can be found at controller-registration.yaml.\nThe ControllerRegistration contains a Helm chart which eventually deploys the Shoot-Cert-Service to seed clusters. It offers some configuration options, mainly to set up a default issuer for shoot clusters. With a default issuer, pre-existing Let’s Encrypt accounts can be used and shared with shoot clusters (see “One Account or Many?” of the Integration Guide).\n Please keep the Let’s Encrypt Rate Limits in mind when using this shared account model. Depending on the number of shoots and domains, it is recommended to use an account with increased rate limits.\n apiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration ... 
values: certificateConfig: defaultIssuer: acme: email: foo@example.com privateKey: |------BEGIN RSA PRIVATE KEY----- ... -----END RSA PRIVATE KEY----- server: https://acme-v02.api.letsencrypt.org/directory name: default-issuer # restricted: true # restrict default issuer to any sub-domain of shoot.spec.dns.domain # defaultRequestsPerDayQuota: 50 # precheckNameservers: 8.8.8.8,8.8.4.4 # caCertificates: | # optional custom CA certificates when using a private ACME provider # -----BEGIN CERTIFICATE----- # ... # -----END CERTIFICATE----- # # -----BEGIN CERTIFICATE----- # ... # -----END CERTIFICATE----- shootIssuers: enabled: false # if true, allows specifying issuers in the shoot clusters Enablement If the Shoot-Cert-Service should be enabled for every shoot cluster in your Gardener-managed environment, you need to globally enable it in the ControllerRegistration:\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration ... resources: - globallyEnabled: true kind: Extension type: shoot-cert-service Alternatively, you’re given the option to only enable the service for certain shoots:\nkind: Shoot apiVersion: core.gardener.cloud/v1beta1 ... spec: extensions: - type: shoot-cert-service ... ","categories":"","description":"","excerpt":"Gardener Certificate Management Introduction Gardener comes with an …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/deployment/","tags":"","title":"Deployment"},{"body":"Gardener DNS Management for Shoots Introduction Gardener allows Shoot clusters to request DNS names for Ingresses and Services out of the box. To support this, Gardener must be installed with the shoot-dns-service extension. This extension uses the seed’s DNS management infrastructure to maintain DNS names for shoot clusters. 
So far, only the external DNS domain of a shoot (already used for the Kubernetes API server and ingress DNS names) can be used for managed DNS names.\nConfiguration To generally enable the DNS management for shoot objects, the shoot-dns-service extension must be registered by providing an appropriate extension registration in the garden cluster.\nHere it is possible to decide whether the extension should be always available for all shoots or whether the extension must be separately enabled per shoot.\nIf the extension should be used for all shoots, the registration must set the globallyEnabled flag to true.\nspec: resources: - kind: Extension type: shoot-dns-service globallyEnabled: true Deployment of DNS controller manager If you are using Gardener version \u003e= 1.54, please make sure to deploy the DNS controller manager by adding the dnsControllerManager section to the providerConfig.values section.\nFor example:\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment metadata: name: extension-shoot-dns-service type: helm providerConfig: chart: ... values: image: ... dnsControllerManager: image: repository: europe-docker.pkg.dev/gardener-project/releases/dns-controller-manager tag: v0.16.0 configuration: cacheTtl: 300 controllers: dnscontrollers,dnssources dnsPoolResyncPeriod: 30m #poolSize: 20 #providersPoolResyncPeriod: 24h serverPortHttp: 8080 createCRDs: false deploy: true replicaCount: 1 #resources: # limits: # memory: 1Gi # requests: # cpu: 50m # memory: 500Mi dnsProviderManagement: enabled: true Providing Base Domains usable for a Shoot So far, only the external DNS domain of a shoot already used for the Kubernetes API server and ingress DNS names can be used for managed DNS names. 
This is either the shoot domain as a subdomain of the default domain configured for the Gardener installation, or a dedicated domain with dedicated access credentials configured for a dedicated shoot via the shoot manifest.\nAlternatively, you can specify DNSProviders and their credentials Secrets directly in the shoot, if this feature is enabled. By default, DNSProvider replication is disabled, but it can be enabled globally in the ControllerDeployment or for a shoot cluster in the shoot manifest (see further below for details).\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment metadata: name: extension-shoot-dns-service type: helm providerConfig: chart: ... values: image: ... dnsProviderReplication: enabled: true See example files (20-* and 30-*) for details for the various provider types.\nShoot Feature Gate If the shoot DNS feature is not globally enabled by default (depends on the extension registration on the garden cluster), it must be enabled per shoot.\nTo enable the feature for a shoot, the shoot manifest must explicitly add the shoot-dns-service extension.\n... spec: extensions: - type: shoot-dns-service ... Enable/disable DNS provider replication for a shoot The DNSProvider replication setting can be overridden in the shoot manifest, e.g.\nkind: Shoot ... spec: extensions: - type: shoot-dns-service providerConfig: apiVersion: service.dns.extensions.gardener.cloud/v1alpha1 kind: DNSConfig dnsProviderReplication: enabled: true ... ","categories":"","description":"","excerpt":"Gardener DNS Management for Shoots Introduction Gardener allows Shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/deployment/","tags":"","title":"Deployment"},{"body":"Gardener Lakom Service for Shoots Introduction Gardener allows Shoot clusters to use the Lakom admission controller for cosign image signing verification. 
To support this, Gardener must be installed with the shoot-lakom-service extension.\nConfiguration To generally enable the Lakom service for shoot objects, the shoot-lakom-service extension must be registered by providing an appropriate extension registration in the garden cluster.\nHere it is possible to decide whether the extension should be always available for all shoots or whether the extension must be separately enabled per shoot.\nIf the extension should be used for all shoots, the globallyEnabled flag should be set to true.\nspec: resources: - kind: Extension type: shoot-lakom-service globallyEnabled: true Shoot Feature Gate If the shoot Lakom service is not globally enabled by default (depends on the extension registration on the garden cluster), it can be enabled per shoot. To enable the service for a shoot, the shoot manifest must explicitly add the shoot-lakom-service extension.\n... spec: extensions: - type: shoot-lakom-service ... If the shoot Lakom service is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\n... spec: extensions: - type: shoot-lakom-service disabled: true ... ","categories":"","description":"","excerpt":"Gardener Lakom Service for Shoots Introduction Gardener allows Shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-lakom-service/deployment/","tags":"","title":"Deployment"},{"body":"Gardener Networking Policy Filter for Shoots Introduction Gardener allows shoot clusters to filter egress traffic on node level. 
To support this, Gardener must be installed with the shoot-networking-filter extension.\nConfiguration To generally enable the networking filter for shoot objects, the shoot-networking-filter extension must be registered by providing an appropriate extension registration in the garden cluster.\nHere it is possible to decide whether the extension should be always available for all shoots or whether the extension must be separately enabled per shoot.\nIf the extension should be used for all shoots, the globallyEnabled flag should be set to true.\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration ... spec: resources: - kind: Extension type: shoot-networking-filter globallyEnabled: true ControllerRegistration An example of a ControllerRegistration for the shoot-networking-filter can be found at controller-registration.yaml.\nThe ControllerRegistration contains a Helm chart which eventually deploys the shoot-networking-filter to seed clusters. It offers some configuration options, mainly to set up a static filter list or provide the configuration for downloading the filter list from a service endpoint.\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment ... values: egressFilter: blackholingEnabled: true filterListProviderType: static staticFilterList: - network: 1.2.3.4/31 policy: BLOCK_ACCESS - network: 5.6.7.8/32 policy: BLOCK_ACCESS - network: ::2/128 policy: BLOCK_ACCESS #filterListProviderType: download #downloaderConfig: # endpoint: https://my.filter.list.server/lists/policy # oauth2Endpoint: https://my.auth.server/oauth2/token # refreshPeriod: 1h ## if the downloader needs an OAuth2 access token, client credentials can be provided with oauth2Secret #oauth2Secret: # clientID: 1-2-3-4 # clientSecret: secret!! ## either clientSecret or client certificate is required # client.crt.pem: | # -----BEGIN CERTIFICATE----- # ... # -----END CERTIFICATE----- # client.key.pem: | # -----BEGIN PRIVATE KEY----- # ... 
# -----END PRIVATE KEY----- Enablement for a Shoot If the shoot networking filter is not globally enabled by default (depends on the extension registration on the garden cluster), it can be enabled per shoot. To enable the service for a shoot, the shoot manifest must explicitly add the shoot-networking-filter extension.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-filter ... If the shoot networking filter is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-filter disabled: true ... ","categories":"","description":"","excerpt":"Gardener Networking Policy Filter for Shoots Introduction Gardener …","ref":"/docs/extensions/others/gardener-extension-shoot-networking-filter/deployment/","tags":"","title":"Deployment"},{"body":"Gardener Networking Policy Filter for Shoots Introduction Gardener allows shoot clusters to add network problem observability using the network problem detector. To support this the Gardener must be installed with the shoot-networking-problemdetector extension.\nConfiguration To generally enable the networking problem detector for shoot objects the shoot-networking-problemdetector extension must be registered by providing an appropriate extension registration in the garden cluster.\nHere it is possible to decide whether the extension should be always available for all shoots or whether the extension must be separately enabled per shoot.\nIf the extension should be used for all shoots the globallyEnabled flag should be set to true.\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration ... 
spec: resources: - kind: Extension type: shoot-networking-problemdetector globallyEnabled: true ControllerRegistration An example of a ControllerRegistration for the shoot-networking-problemdetector can be found at controller-registration.yaml.\nThe ControllerRegistration contains a Helm chart which eventually deploys the shoot-networking-problemdetector to seed clusters. It offers some configuration options, mainly to set up a static filter list or provide the configuration for downloading the filter list from a service endpoint.\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment ... values: #networkProblemDetector: # defaultPeriod: 30s Enablement for a Shoot If the shoot network problem detector is not globally enabled by default (depends on the extension registration on the garden cluster), it can be enabled per shoot. To enable the service for a shoot, the shoot manifest must explicitly add the shoot-networking-problemdetector extension.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-problemdetector ... If the shoot network problem detector is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-problemdetector disabled: true ... ","categories":"","description":"","excerpt":"Gardener Networking Policy Filter for Shoots Introduction Gardener …","ref":"/docs/extensions/others/gardener-extension-shoot-networking-problemdetector/deployment/","tags":"","title":"Deployment"},{"body":"Gardener OIDC Service for Shoots Introduction Gardener allows Shoot clusters to dynamically register OpenID Connect providers. 
To support this the Gardener must be installed with the shoot-oidc-service extension.\nConfiguration To generally enable the OIDC service for shoot objects the shoot-oidc-service extension must be registered by providing an appropriate extension registration in the garden cluster.\nHere it is possible to decide whether the extension should be always available for all shoots or whether the extension must be separately enabled per shoot.\nIf the extension should be used for all shoots the globallyEnabled flag should be set to true.\nspec: resources: - kind: Extension type: shoot-oidc-service globallyEnabled: true Shoot Feature Gate If the shoot OIDC service is not globally enabled by default (depends on the extension registration on the garden cluster), it can be enabled per shoot. To enable the service for a shoot, the shoot manifest must explicitly add the shoot-oidc-service extension.\n... spec: extensions: - type: shoot-oidc-service ... If the shoot OIDC service is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\n... spec: extensions: - type: shoot-oidc-service disabled: true ... ","categories":"","description":"","excerpt":"Gardener OIDC Service for Shoots Introduction Gardener allows Shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-oidc-service/deployment/","tags":"","title":"Deployment"},{"body":"Deploying the Machine Controller Manager into a Kubernetes cluster Deploying the Machine Controller Manager into a Kubernetes cluster Prepare the cluster Build the Docker image Configuring optional parameters while deploying Usage As already mentioned, the Machine Controller Manager is designed to run as controller in a Kubernetes cluster. The existing source code can be compiled and tested on a local machine as described in Setting up a local development environment. 
You can deploy the Machine Controller Manager using the steps described below.\nPrepare the cluster Connect to the remote kubernetes cluster where you plan to deploy the Machine Controller Manager using the kubectl. Set the environment variable KUBECONFIG to the path of the yaml file containing the cluster info. Now, create the required CRDs on the remote cluster using the following command, $ kubectl apply -f kubernetes/crds Build the Docker image ⚠️ Modify the Makefile to refer to your own registry.\n Run the build which generates the binary to bin/machine-controller-manager $ make build Build docker image from latest compiled binary $ make docker-image Push the last created docker image onto the online docker registry. $ make push Now you can deploy this docker image to your cluster. A sample development file is provided. By default, the deployment manages the cluster it is running in. Optionally, the kubeconfig could also be passed as a flag as described in /kubernetes/deployment/out-of-tree/deployment.yaml. This is done when you want your controller running outside the cluster to be managed from. $ kubectl apply -f kubernetes/deployment/out-of-tree/deployment.yaml Also deploy the required clusterRole and clusterRoleBindings $ kubectl apply -f kubernetes/deployment/out-of-tree/clusterrole.yaml $ kubectl apply -f kubernetes/deployment/out-of-tree/clusterrolebinding.yaml Configuring optional parameters while deploying Machine-controller-manager supports several configurable parameters while deploying. 
Refer to the following lines to learn how each parameter can be configured and what its purpose is.\nUsage To start using Machine Controller Manager, follow the links given at usage here.\n","categories":"","description":"","excerpt":"Deploying the Machine Controller Manager into a Kubernetes cluster …","ref":"/docs/other-components/machine-controller-manager/deployment/","tags":"","title":"Deployment"},{"body":"Developer Docs for Gardener Extension Registry Cache This document outlines how Shoot reconciliation and deletion works for a Shoot with the registry-cache extension enabled.\nShoot Reconciliation This section outlines how the reconciliation works for a Shoot with the registry-cache extension enabled.\nExtension Enablement / Reconciliation This section outlines how the extension enablement/reconciliation works, e.g., the extension has been added to the Shoot spec.\n As part of the Shoot reconciliation flow, the gardenlet deploys the Extension resource. The registry-cache extension reconciles the Extension resource. pkg/controller/cache/actuator.go contains the implementation of the extension.Actuator interface. The reconciliation of an Extension of type registry-cache consists of the following steps: The registry-cache extension deploys resources to the Shoot cluster via ManagedResource. For every configured upstream, it creates a StatefulSet (with PVC), Service, and other resources. It lists all Services from the kube-system namespace that have the upstream-host label. It will return an error (and retry in exponential backoff) until the Services count matches the configured registries count. When there is a Service created for each configured upstream registry, the registry-cache extension populates the Extension resource status. In the Extension status, for each upstream, it maintains an endpoint (in the format http://\u003ccluster-ip\u003e:5000) which can be used to access the registry cache from within the Shoot cluster. 
\u003ccluster-ip\u003e is the cluster IP of the registry cache Service. The cluster IP of a Service is assigned by the Kubernetes API server on Service creation. As part of the Shoot reconciliation flow, the gardenlet deploys the OperatingSystemConfig resource. The registry-cache extension serves a webhook that mutates the OperatingSystemConfig resource for Shoots having the registry-cache extension enabled (the corresponding namespace gets labeled by the gardenlet with extensions.gardener.cloud/registry-cache=true). pkg/webhook/cache/ensurer.go contains an implementation of the genericmutator.Ensurer interface. The webhook appends or updates RegistryConfig entries in the OperatingSystemConfig CRI configuration that corresponds to configured registry caches in the Shoot. The RegistryConfig readiness probe is enabled so that gardener-node-agent creates a hosts.toml containerd registry configuration file when all RegistryConfig hosts are reachable. Extension Disablement This section outlines how the extension disablement works, i.e., the extension has to be removed from the Shoot spec.\n As part of the Shoot reconciliation flow, the gardenlet destroys the Extension resource because it is no longer needed. The extension deletes the ManagedResource containing the registry cache resources. The OperatingSystemConfig resource will not be mutated and no RegistryConfig entries will be added or updated. The gardener-node-agent detects that RegistryConfig entries have been removed or changed and deletes or updates corresponding hosts.toml configuration files under /etc/containerd/certs.d folder. Shoot Deletion This section outlines how the deletion works for a Shoot with the registry-cache extension enabled.\n As part of the Shoot deletion flow, the gardenlet destroys the Extension resource. The extension deletes the ManagedResource containing the registry cache resources. 
","categories":"","description":"Learn about the inner workings","excerpt":"Learn about the inner workings","ref":"/docs/extensions/others/gardener-extension-registry-cache/extension-registry-cache/","tags":"","title":"Developer Docs for Gardener Extension Registry Cache"},{"body":"DNS Autoscaling This is a short guide describing different options how to automatically scale CoreDNS in the shoot cluster.\nBackground Currently, Gardener uses CoreDNS as DNS server. Per default, it is installed as a deployment into the shoot cluster that is auto-scaled horizontally to cover for QPS-intensive applications. However, doing so does not seem to be enough to completely circumvent DNS bottlenecks such as:\n Cloud provider limits for DNS lookups. Unreliable UDP connections that forces a period of timeout in case packets are dropped. Unnecessary node hopping since CoreDNS is not deployed on all nodes, and as a result DNS queries end-up traversing multiple nodes before reaching the destination server. Inefficient load-balancing of services (e.g., round-robin might not be enough when using IPTables mode). Overload of the CoreDNS replicas as the maximum amount of replicas is fixed. and more … As an alternative with extended configuration options, Gardener provides cluster-proportional autoscaling of CoreDNS. This guide focuses on the configuration of cluster-proportional autoscaling of CoreDNS and its advantages/disadvantages compared to the horizontal autoscaling. 
Please note that there is also the option to use a node-local DNS cache, which helps mitigate potential DNS bottlenecks (see Trade-offs in conjunction with NodeLocalDNS for considerations regarding using NodeLocalDNS together with one of the CoreDNS autoscaling approaches).\nConfiguring Cluster-Proportional DNS Autoscaling All that needs to be done to enable the usage of cluster-proportional autoscaling of CoreDNS is to set the corresponding option (spec.systemComponents.coreDNS.autoscaling.mode) in the Shoot resource to cluster-proportional:\n... spec: ... systemComponents: coreDNS: autoscaling: mode: cluster-proportional ... To switch back to horizontal DNS autoscaling, you can set the spec.systemComponents.coreDNS.autoscaling.mode to horizontal (or remove the coreDNS section).\nOnce the cluster-proportional autoscaling of CoreDNS has been enabled and the Shoot cluster has been reconciled afterwards, a ConfigMap called coredns-autoscaler will be created in the kube-system namespace with the default settings. The content will be similar to the following:\nlinear: '{\"coresPerReplica\":256,\"min\":2,\"nodesPerReplica\":16}' It is possible to adapt the ConfigMap according to your needs in case the defaults do not work as desired. The number of CoreDNS replicas is calculated according to the following formula:\nreplicas = max( ceil( cores × 1 / coresPerReplica ) , ceil( nodes × 1 / nodesPerReplica ) ) Depending on your needs, you can adjust coresPerReplica or nodesPerReplica, but it is also possible to override min if required.\nTrade-Offs of Horizontal and Cluster-Proportional DNS Autoscaling The horizontal autoscaling of CoreDNS as implemented by Gardener is fully managed, i.e., you do not need to perform any configuration changes. It scales according to the CPU usage of CoreDNS replicas, meaning that it will create new replicas if the existing ones are under heavy load. This approach scales between 2 and 5 instances, which is sufficient for most workloads. 
In case this is not enough, the cluster-proportional autoscaling approach can be used instead, with its more flexible configuration options.\nThe cluster-proportional autoscaling of CoreDNS as implemented by Gardener is fully managed, but allows more configuration options to adjust the default settings to your individual needs. It scales according to the cluster size, i.e., if your cluster grows in terms of cores/nodes so will the amount of CoreDNS replicas. However, it does not take the actual workload, e.g., CPU consumption, into account.\nExperience shows that the horizontal autoscaling of CoreDNS works for a variety of workloads. It does reach its limits if a cluster has a high amount of DNS requests, though. The cluster-proportional autoscaling approach allows to fine-tune the amount of CoreDNS replicas. It helps to scale in clusters of changing size. However, please keep in mind that you need to cater for the maximum amount of DNS requests as the replicas will not be adapted according to the workload, but only according to the cluster size (cores/nodes).\nTrade-Offs in Conjunction with NodeLocalDNS Using a node-local DNS cache can mitigate a lot of the potential DNS related problems. It works fine with a DNS workload that can be handle through the cache and reduces the inter-node DNS communication. As node-local DNS cache reduces the amount of traffic being sent to the cluster’s CoreDNS replicas, it usually works fine with horizontally scaled CoreDNS. Nevertheless, it also works with CoreDNS scaled in a cluster-proportional approach. In this mode, though, it might make sense to adapt the default settings as the CoreDNS workload is likely significantly reduced.\nOverall, you can view the DNS options on a scale. Horizontally scaled DNS provides a small amount of DNS servers. Especially for bigger clusters, a cluster-proportional approach will yield more CoreDNS instances and hence may yield a more balanced DNS solution. 
By adapting the settings you can further increase the amount of CoreDNS replicas. On the other end of the spectrum, a node-local DNS cache provides DNS on every node and allows to reduce the amount of (backend) CoreDNS instances regardless if they are horizontally or cluster-proportionally scaled.\n","categories":"","description":"","excerpt":"DNS Autoscaling This is a short guide describing different options how …","ref":"/docs/gardener/dns-autoscaling/","tags":"","title":"DNS Autoscaling"},{"body":"Request DNS Names in Shoot Clusters Introduction Within a shoot cluster, it is possible to request DNS records via the following resource types:\n Ingress Service DNSEntry It is necessary that the Gardener installation your shoot cluster runs in is equipped with a shoot-dns-service extension. This extension uses the seed’s dns management infrastructure to maintain DNS names for shoot clusters. Please ask your Gardener operator if the extension is available in your environment.\nShoot Feature Gate In some Gardener setups the shoot-dns-service extension is not enabled globally and thus must be configured per shoot cluster. Please adapt the shoot specification by the configuration shown below to activate the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-dns-service ... Before you start You should :\n Have created a shoot cluster Have created and correctly configured a DNS Provider (Please consult this page for more information) Have a basic understanding of DNS (see link under References) There are 2 types of DNS that you can use within Kubernetes :\n internal (usually managed by coreDNS) external (managed by a public DNS provider). 
This page, and the extension, exclusively works for external DNS handling.\nGardener allows 2 way of managing your external DNS:\n Manually, which means you are in charge of creating / maintaining your Kubernetes related DNS entries Via the Gardener DNS extension Gardener DNS extension The managed external DNS records feature of the Gardener clusters makes all this easier. You do not need DNS service provider specific knowledge, and in fact you do not need to leave your cluster at all to achieve that. You simply annotate the Ingress / Service that needs its DNS records managed and it will be automatically created / managed by Gardener.\nManaged external DNS records are supported with the following DNS provider types:\n aws-route53 azure-dns azure-private-dns google-clouddns openstack-designate alicloud-dns cloudflare-dns Request DNS records for Ingress resources To request a DNS name for Ingress, Service or Gateway (Istio or Gateway API) objects in the shoot cluster it must be annotated with the DNS class garden and an annotation denoting the desired DNS names.\nExample for an annotated Ingress resource:\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: # Let Gardener manage external DNS records for this Ingress. dns.gardener.cloud/dnsnames: special.example.com # Use \"*\" to collects domains names from .spec.rules[].host dns.gardener.cloud/ttl: \"600\" dns.gardener.cloud/class: garden # If you are delegating the certificate management to Gardener, uncomment the following line #cert.gardener.cloud/purpose: managed spec: rules: - host: special.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 # Uncomment the following part if you are delegating the certificate management to Gardener #tls: # - hosts: # - special.example.com # secretName: my-cert-secret-name For an Ingress, the DNS names are already declared in the specification. 
Nevertheless the dnsnames annotation must be present. Here a subset of the DNS names of the ingress can be specified. If DNS names for all names are desired, the value all can be used.\nKeep in mind that ingress resources are ignored unless an ingress controller is set up. Gardener does not provide an ingress controller by default. For more details, see Ingress Controllers and Service in the Kubernetes documentation.\nRequest DNS records for service type LoadBalancer Example for an annotated Service (it must have the type LoadBalancer) resource:\napiVersion: v1 kind: Service metadata: name: amazing-svc annotations: # Let Gardener manage external DNS records for this Service. dns.gardener.cloud/dnsnames: special.example.com dns.gardener.cloud/ttl: \"600\" dns.gardener.cloud/class: garden spec: selector: app: amazing-app ports: - protocol: TCP port: 80 targetPort: 8080 type: LoadBalancer Request DNS records for Gateway resources Please see Istio Gateways or Gateway API for details.\nCreating a DNSEntry resource explicitly It is also possible to create a DNS entry via the Kubernetes resource called DNSEntry:\napiVersion: dns.gardener.cloud/v1alpha1 kind: DNSEntry metadata: annotations: # Let Gardener manage this DNS entry. dns.gardener.cloud/class: garden name: special-dnsentry namespace: default spec: dnsName: special.example.com ttl: 600 targets: - 1.2.3.4 If one of the accepted DNS names is a direct subname of the shoot’s ingress domain, this is already handled by the standard wildcard entry for the ingress domain. Therefore this name should be excluded from the dnsnames list in the annotation. 
If only this DNS name is configured in the ingress, no explicit DNS entry is required, and the DNS annotations should be omitted entirely.\nYou can check the status of the DNSEntry with\n$ kubectl get dnsentry NAME DNS TYPE PROVIDER STATUS AGE mydnsentry special.example.com aws-route53 default/aws Ready 24s As soon as the status of the entry is Ready, the provider has accepted the new DNS record. Depending on the provider and your DNS settings and cache, it may take up to 24 hours for the new entry to be propagated across the internet.\nMore examples can be found here\nRequest DNS records for Service/Ingress resources using a DNSAnnotation resource In rare cases it may not be possible to add annotations to a Service or Ingress resource object.\nFor example, the Helm chart used to deploy the resource may not be adaptable for some reason, or some automation is used which always restores the original content of the resource object by dropping any additional annotations.\nIn these cases, it is recommended to use an additional DNSAnnotation resource in order to have more flexibility than DNSEntry resources. The DNSAnnotation resource makes the DNS shoot service behave as if annotations have been added to the referenced resource.\nFor the Ingress example shown above, you can create a DNSAnnotation resource alternatively to provide the annotations.\napiVersion: dns.gardener.cloud/v1alpha1 kind: DNSAnnotation metadata: annotations: dns.gardener.cloud/class: garden name: test-ingress-annotation namespace: default spec: resourceRef: kind: Ingress apiVersion: networking.k8s.io/v1 name: test-ingress namespace: default annotations: dns.gardener.cloud/dnsnames: '*' dns.gardener.cloud/class: garden Note that the DNSAnnotation resource itself needs the dns.gardener.cloud/class=garden annotation. 
This also only works for annotations known to the DNS shoot service (see Accepted External DNS Records Annotations).\nFor more details, see also DNSAnnotation objects\nAccepted External DNS Records Annotations Here are all of the accepted annotations related to the DNS extension:\n Annotation Description dns.gardener.cloud/dnsnames Mandatory for service and ingress resources, accepts a comma-separated list of DNS names if multiple names are required. For ingress you can use the special value '*'. In this case, the DNS names are collected from .spec.rules[].host. dns.gardener.cloud/class Mandatory, in the context of the shoot-dns-service it must always be set to garden. dns.gardener.cloud/ttl Recommended, overrides the default Time-To-Live of the DNS record. dns.gardener.cloud/cname-lookup-interval Only relevant if multiple domain name targets are specified. It specifies the lookup interval for CNAMEs to map them to IP addresses (in seconds) dns.gardener.cloud/realms Internal, for restricting provider access for shoot DNS entries. Typically not set by users of the shoot-dns-service. dns.gardener.cloud/ip-stack Only relevant for provider type aws-route53 if target is an AWS load balancer domain name. Can be set for service, ingress and DNSEntry resources. It specifies which DNS records with alias targets are created instead of the usual CNAME records. If the annotation is not set (or has the value ipv4), only an A record is created. With value dual-stack, both A and AAAA records are created. With value ipv6 only an AAAA record is created. service.beta.kubernetes.io/aws-load-balancer-ip-address-type=dualstack For services, behaves similarly to dns.gardener.cloud/ip-stack=dual-stack. loadbalancer.openstack.org/load-balancer-address Internal, for services only: support for PROXY protocol on OpenStack (which needs a hostname as ingress). Typically not set by users of the shoot-dns-service. 
If one of the accepted DNS names is a direct subdomain of the shoot’s ingress domain, this is already handled by the standard wildcard entry for the ingress domain. Therefore, this name should be excluded from the dnsnames list in the annotation. If only this DNS name is configured in the ingress, no explicit DNS entry is required, and the DNS annotations should be omitted at all.\nTroubleshooting General DNS tools To check the DNS resolution, use the nslookup or dig command.\n$ nslookup special.your-domain.com or with dig\n$ dig +short special.example.com Depending on your network settings, you may get a successful response faster using a public DNS server (e.g. 8.8.8.8, 8.8.4.4, or 1.1.1.1) dig @8.8.8.8 +short special.example.com DNS record events The DNS controller publishes Kubernetes events for the resource which requested the DNS record (Ingress, Service, DNSEntry). These events reveal more information about the DNS requests being processed and are especially useful to check any kind of misconfiguration, e.g. 
requests for a domain you don’t own.\nEvents for a successfully created DNS record:\n$ kubectl describe service my-service Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal dns-annotation 19s dns-controller-manager special.example.com: dns entry is pending Normal dns-annotation 19s (x3 over 19s) dns-controller-manager special.example.com: dns entry pending: waiting for dns reconciliation Normal dns-annotation 9s (x3 over 10s) dns-controller-manager special.example.com: dns entry active Please note, events vanish after their retention period (usually 1h).\nDNSEntry status DNSEntry resources offer a .status sub-resource which can be used to check the current state of the object.\nStatus of an erroneous DNSEntry:\n status: message: No responsible provider found observedGeneration: 3 provider: remote state: Error References Understanding DNS Kubernetes Internal DNS DNSEntry API (Golang) Managing Certificates with Gardener ","categories":"","description":"","excerpt":"Request DNS Names in Shoot Clusters Introduction Within a shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/dns_names/","tags":"","title":"DNS Names"},{"body":"DNS Providers Introduction Gardener can manage DNS records on your behalf, so that you can request them via different resource types (see here) within the shoot cluster. The domains for which you are permitted to request records are, however, restricted and depend on the DNS provider configuration.\nShoot provider By default, every shoot cluster is equipped with a default provider. It is the very same provider that manages the shoot cluster’s kube-apiserver public DNS record (DNS address in your Kubeconfig).\nkind: Shoot ... dns: domain: shoot.project.default-domain.gardener.cloud You are permitted to request any sub-domain of .dns.domain that is not already taken (e.g. 
api.shoot.project.default-domain.gardener.cloud, *.ingress.shoot.project.default-domain.gardener.cloud) with this provider.\nAdditional providers If you need to request DNS records for domains not managed by the default provider, additional providers can be configured in the shoot specification. Alternatively, if the feature is enabled, they can be added as DNSProvider resources to the shoot cluster.\nAdditional providers in the shoot specification To add providers in the shoot spec, you need to set them in the spec.dns.providers list.\nFor example:\nkind: Shoot ... spec: dns: domain: shoot.project.default-domain.gardener.cloud providers: - secretName: my-aws-account type: aws-route53 - secretName: my-gcp-account type: google-clouddns Please consult the API-Reference to get a complete list of supported fields and configuration options.\n Referenced secrets should exist in the project namespace in the Garden cluster and must comply with the provider specific credentials format. The External-DNS-Management project provides corresponding examples (20-secret-\u003cprovider-name\u003e-credentials.yaml) for known providers.\nAdditional providers as resources in the shoot cluster If it is not enabled globally, you have to enable the feature in the shoot manifest:\nkind: Shoot ... spec: extensions: - type: shoot-dns-service providerConfig: apiVersion: service.dns.extensions.gardener.cloud/v1alpha1 kind: DNSConfig dnsProviderReplication: enabled: true ... 
To add a provider directly in the shoot cluster, provide a DNSProvider in any namespace together with Secret containing the credentials.\nFor example if the domain is hosted with AWS Route 53 (provider type aws-route53):\napiVersion: dns.gardener.cloud/v1alpha1 kind: DNSProvider metadata: annotations: dns.gardener.cloud/class: garden name: my-own-domain namespace: my-namespace spec: type: aws-route53 secretRef: name: my-own-domain-credentials domains: include: - my.own.domain.com --- apiVersion: v1 kind: Secret metadata: name: my-own-domain-credentials namespace: my-namespace type: Opaque data: # replace '...' with values encoded as base64 AWS_ACCESS_KEY_ID: ... AWS_SECRET_ACCESS_KEY: ... The External-DNS-Management project provides examples with more details for DNSProviders (30-provider-\u003cprovider-name\u003e.yaml) and credential Secrets (20-secret-\u003cprovider-name\u003e.yaml) at https://github.com/gardener/external-dns-management//examples for all supported provider types.\n","categories":"","description":"","excerpt":"DNS Providers Introduction Gardener can manage DNS records on your …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/dns_providers/","tags":"","title":"DNS Providers"},{"body":"Contract: DNSRecord Resources Every shoot cluster requires external DNS records that are publicly resolvable. The management of these DNS records requires provider-specific knowledge which is to be developed outside the Gardener’s core repository.\nCurrently, Gardener uses DNSProvider and DNSEntry resources. However, this introduces undesired coupling of Gardener to a controller that does not adhere to the Gardener extension contracts. Because of this, we plan to stop using DNSProvider and DNSEntry resources for Gardener DNS records in the future and use the DNSRecord resources described here instead.\nWhat does Gardener create DNS records for? 
Internal Domain Name Every shoot cluster’s kube-apiserver running in the seed is exposed via a load balancer that has a public endpoint (IP or hostname). This endpoint is used by end-users and also by system components (that are running in another network, e.g., the kubelet or kube-proxy) to talk to the cluster. In order to be robust against changes of this endpoint (e.g., caused by re-creation of the load balancer or a move of the DNS record to another seed cluster), Gardener creates a so-called internal domain name for every shoot cluster. The internal domain name is a publicly resolvable DNS record that points to the load balancer of the kube-apiserver. Gardener uses this domain name in the kubeconfigs of all system components, instead of using the load balancer endpoint directly. This way Gardener does not need to recreate all kubeconfigs if the endpoint changes - it just needs to update the DNS record.\nExternal Domain Name The internal domain name is not configurable by end-users directly but configured by the Gardener administrator. However, end-users usually prefer to have another DNS name, maybe even using their own domain sometimes, to access their Kubernetes clusters. Gardener supports that by creating another DNS record, named external domain name, that actually points to the internal domain name. The kubeconfig handed out to end-users does contain this external domain name, i.e., users can access their clusters with the DNS name they prefer.\nAs not every end-user has their own domain, it is possible for Gardener administrators to configure so-called default domains. If configured, shoots that do not specify a domain explicitly get an external domain name based on a default domain (unless it is explicitly stated that this shoot should not get an external domain name via .spec.dns.provider=unmanaged).\nIngress Domain Name (Deprecated) Gardener allows deploying an nginx-ingress-controller into a shoot cluster (deprecated). 
This controller is exposed via a public load balancer (again, either IP or hostname). Gardener creates a wildcard DNS record pointing to this load balancer. Ingress resources can later use this wildcard DNS record to expose underlying applications.\nSeed Ingress If .spec.ingress is configured in the Seed, Gardener deploys the ingress controller mentioned in .spec.ingress.controller.kind to the seed cluster. Currently, the only supported kind is “nginx”. If the ingress field is set, then .spec.dns.provider must also be set. Gardener creates a wildcard DNS record pointing to the load balancer of the ingress controller. The Ingress resources of components like Plutono and Prometheus in the garden namespace and the shoot namespaces use this wildcard DNS record to expose their underlying applications.\nWhat needs to be implemented to support a new DNS provider? As part of the shoot flow, Gardener will create a number of DNSRecord resources in the seed cluster (one for each of the DNS records mentioned above) that need to be reconciled by an extension controller. These resources contain the following information:\n The DNS provider type (e.g., aws-route53, google-clouddns, …) A reference to a Secret object that contains the provider-specific credentials used to communicate with the provider’s API. The fully qualified domain name (FQDN) of the DNS record, e.g. “api.\u003cshoot domain\u003e”. The DNS record type, one of A, AAAA, CNAME, or TXT. The DNS record values, that is a list of IP addresses for A records, a single hostname for CNAME records, or a list of texts for TXT records. Optionally, the DNSRecord resource may contain also the following information:\n The region of the DNS record. If not specified, the region specified in the referenced Secret shall be used. If that is also not specified, the extension controller shall use a certain default region. The DNS hosted zone of the DNS record. 
If not specified, it shall be determined automatically by the extension controller by getting all hosted zones of the account and searching for the longest zone name that is a suffix of the fully qualified domain name (FQDN) mentioned above. The TTL of the DNS record in seconds. If not specified, it shall be set by the extension controller to 120. Example DNSRecord:\n--- apiVersion: v1 kind: Secret metadata: name: dnsrecord-bar-external namespace: shoot--foo--bar type: Opaque data: # aws-route53 specific credentials here --- apiVersion: extensions.gardener.cloud/v1alpha1 kind: DNSRecord metadata: name: dnsrecord-external namespace: default spec: type: aws-route53 secretRef: name: dnsrecord-bar-external namespace: shoot--foo--bar # region: eu-west-1 # zone: ZFOO name: api.bar.foo.my-fancy-domain.com recordType: A values: - 1.2.3.4 # ttl: 600 In order to support a new DNS record provider, you need to write a controller that watches all DNSRecords with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the AWS route53 provider.\nKey Names in Secrets Containing Provider-Specific Credentials For compatibility with existing setups, extension controllers shall support two different namings of keys in secrets containing provider-specific credentials:\n The naming used by the external-dns-management DNS controller. For example, on AWS the key names are AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION. The naming used by other provider-specific extension controllers, e.g. for infrastructure. For example, on AWS the key names are accessKeyId, secretAccessKey, and region. 
Avoiding Reading the DNS Hosted Zones If the DNS hosted zone is not specified in the DNSRecord resource, during the first reconciliation the extension controller shall determine the correct DNS hosted zone for the specified FQDN and write it to the status of the resource:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: DNSRecord metadata: name: dnsrecord-external namespace: shoot--foo--bar spec: ... status: lastOperation: ... zone: ZFOO On subsequent reconciliations, the extension controller shall use the zone from the status and avoid reading the DNS hosted zones from the provider. If the DNSRecord resource specifies a zone in .spec.zone and the extension controller has written a value to .status.zone, the first one shall be considered with higher priority by the extension controller.\nNon-Provider Specific Information Required for DNS Record Creation Some providers might require further information that is not provider specific but already part of the shoot resource. As Gardener cannot know which information is required by providers, it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information that is not part of the DNSRecord resource itself.\nUsing DNSRecord Resources gardenlet manages DNSRecord resources for all three DNS records mentioned above (internal, external, and ingress). In order to successfully reconcile a shoot with the feature gate enabled, extension controllers for DNSRecord resources for types used in the default, internal, and custom domain secrets should be registered via ControllerRegistration resources.\n Note: For compatibility reasons, the spec.dns.providers section is still used to specify additional providers. Only the one marked as primary: true will be used for DNSRecord. 
All others are considered by the shoot-dns-service extension only (if deployed).\n Support for DNSRecord Resources in the Provider Extensions The following table contains information about the provider extension version that adds support for DNSRecord resources:\n Extension Version provider-alicloud v1.26.0 provider-aws v1.27.0 provider-azure v1.21.0 provider-gcp v1.18.0 provider-openstack v1.21.0 provider-vsphere N/A provider-equinix-metal N/A provider-kubevirt N/A provider-openshift N/A Support for DNSRecord IPv6 recordType: AAAA in the Provider Extensions The following table contains information about the provider extension version that adds support for DNSRecord IPv6 recordType: AAAA:\n Extension Version provider-alicloud N/A provider-aws N/A provider-azure N/A provider-gcp N/A provider-openstack N/A provider-vsphere N/A provider-equinix-metal N/A provider-kubevirt N/A provider-openshift N/A provider-local v1.63.0 References and Additional Resources DNSRecord API (Golang specification) Sample Implementation for the AWS Route53 Provider ","categories":"","description":"","excerpt":"Contract: DNSRecord Resources Every shoot cluster requires external …","ref":"/docs/gardener/extensions/dnsrecord/","tags":"","title":"DNS Record"},{"body":"DNS Search Path Optimization DNS Search Path Using fully qualified names has some downsides, e.g., it may become harder to move deployments from one landscape to the next. It is far easier and simple to rely on short/local names, which may have different meaning depending on the context they are used in.\nThe DNS search path allows for the usage of short/local names. It is an ordered list of DNS suffixes to append to short/local names to create a fully qualified name.\nIf a short/local name should be resolved, each entry is appended to it one by one to check whether it can be resolved. The process stops when either the name could be resolved or the DNS search path ends. 
As the last step after trying the search path, the resolver attempts to resolve the short/local name on its own.\nDNS Option ndots As explained in the section above, the DNS search path is used for short/local names to create fully qualified names. The DNS option ndots specifies how many dots (.) a name needs to have to be considered fully qualified. For names with fewer than ndots dots (.), the DNS search path will be applied.\nDNS Search Path, ndots, and Kubernetes Kubernetes tries to make it easy/convenient for developers to use name resolution. It provides several means to address a service, most notably by its name directly, using the namespace as suffix, utilizing \u003cnamespace\u003e.svc as suffix or as a fully qualified name as \u003cservice\u003e.\u003cnamespace\u003e.svc.cluster.local (assuming cluster.local to be the cluster domain).\nThis is why the DNS search path is fairly long in Kubernetes, usually consisting of \u003cnamespace\u003e.svc.cluster.local, svc.cluster.local, cluster.local, and potentially some additional entries coming from the local network of the cluster. For various reasons, the default ndots value in the context of Kubernetes is 5, which is also fairly large. See this comment for a more detailed description.\nDNS Search Path/ndots Problem in Kubernetes As the DNS search path is long and ndots is large, a lot of DNS queries might traverse the DNS search path. This results in an explosion of DNS requests.\nFor example, consider the name resolution of the default kubernetes service kubernetes.default.svc.cluster.local. As this name has only four dots, it is not considered a fully qualified name according to the default ndots=5 setting. 
Therefore, the DNS search path is applied, resulting in the following queries being created:\n kubernetes.default.svc.cluster.local.some-namespace.svc.cluster.local kubernetes.default.svc.cluster.local.svc.cluster.local kubernetes.default.svc.cluster.local.cluster.local kubernetes.default.svc.cluster.local.network-domain … In IPv4/IPv6 dual stack systems, the number of DNS requests may even double as each name is resolved for IPv4 and IPv6.\nGeneral Workarounds/Mitigations Kubernetes provides the capability to set the DNS options for each pod (see Pod DNS config for details). However, this has to be applied for every pod (doing name resolution) to resolve the problem. A mutating webhook may be useful in this regard. Unfortunately, the DNS requirements may be different depending on the workload. Therefore, a general solution may be difficult or even impossible.\nAnother approach is to always use fully qualified names and append a dot (.) to the name to prevent the name resolution system from using the DNS search path. This might be somewhat counterintuitive as most developers are not used to the trailing dot (.). Furthermore, it makes moving to different landscapes more difficult/error-prone.\nGardener Specific Workarounds/Mitigations Gardener allows users to customize their DNS configuration. CoreDNS allows several approaches to deal with the requests generated by the DNS search path. Caching is possible as well as query rewriting. There are also several other plugins available, which may mitigate the situation.\nGardener DNS Query Rewriting As explained above, the application of the DNS search path may lead to the undesired creation of DNS requests. Especially with the default setting of ndots=5, seemingly fully qualified names pointing to services in the cluster may trigger the DNS search path application.\nGardener can automatically rewrite some obviously incorrect DNS names, which stem from an application of the DNS search path, to the most likely desired name. 
This will automatically rewrite requests like service.namespace.svc.cluster.local.svc.cluster.local to service.namespace.svc.cluster.local.\nIn case the applications also target services for name resolution, which are outside of the cluster and have less than ndots dots, it might be helpful to prevent search path application for them as well. One way to achieve it is by adding them to the commonSuffixes:\n... spec: ... systemComponents: coreDNS: rewriting: commonSuffixes: - gardener.cloud - example.com ... DNS requests containing a common suffix and ending in .svc.cluster.local are assumed to be incorrect application of the DNS search path. Therefore, they are rewritten to everything ending in the common suffix. For example, www.gardener.cloud.svc.cluster.local would be rewritten to www.gardener.cloud.\nPlease note that the common suffixes should be long enough and include enough dots (.) to prevent random overlap with other DNS queries. For example, it would be a bad idea to simply put com on the list of common suffixes, as there may be services/namespaces which have com as part of their name. The effect would be seemingly random DNS requests. Gardener requires that common suffixes contain at least one dot (.) and adds a second dot at the beginning. For instance, a common suffix of example.com in the configuration would match *.example.com.\nSince some clients verify the host in the response of a DNS query, the host must also be rewritten. 
For that reason, we can’t rewrite a query for service.dst-namespace.svc.cluster.local.src-namespace.svc.cluster.local or www.example.com.src-namespace.svc.cluster.local, as for an answer rewrite src-namespace would not be known.\n","categories":"","description":"","excerpt":"DNS Search Path Optimization DNS Search Path Using fully qualified …","ref":"/docs/gardener/dns-search-path-optimization/","tags":"","title":"DNS Search Path Optimization"},{"body":"Gardener Extension for DNS services \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\nExtension-Resources Example extension resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: \"extension-dns-service\" namespace: shoot--project--abc spec: type: shoot-dns-service How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for DNS services for shoot clusters","excerpt":"Gardener extension controller for DNS services for shoot clusters","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/","tags":"","title":"DNS services"},{"body":"Using the latest Tag for an Image Many Dockerfiles use the FROM package:latest pattern at the top of their Dockerfiles to pull the latest image from a Docker registry.\nBad Dockerfile FROM alpine While simple, using the latest tag for an image means that your build can suddenly break if that image gets updated. This can lead to problems where everything builds fine locally (because your local cache thinks it is the latest), while a build server may fail, because some pipelines make a clean pull on every build. Additionally, troubleshooting can prove to be difficult, since the maintainer of the Dockerfile didn’t actually make any changes.\nGood Dockerfile A digest takes the place of the tag when pulling an image. This will ensure that your Dockerfile remains immutable.\nFROM alpine@sha256:7043076348bf5040220df6ad703798fd8593a0918d06d3ce30c6c93be117e430 Running apt/apk/yum update Running apt-get install is one of those things virtually every Debian-based Dockerfile will have to do in order to satisfy some external package requirements your code needs to run. 
However, using apt-get as an example, this comes with its own problems.\napt-get upgrade\nThis will update all your packages to their latest versions, which can be bad because it prevents your Dockerfile from creating consistent, immutable builds.\napt-get update (in a different line than the one running your apt-get install command)\nRunning apt-get update as a single line entry will get cached by the build and won’t actually run every time you need to run apt-get install. Instead, make sure you run apt-get update in the same line with all the packages to ensure that all are updated correctly.\nAvoid Big Container Images Building a small container image will reduce the time needed to start or restart pods. An image based on the popular Alpine Linux project (~5 MB) is much smaller than most distribution-based images. For most popular languages and products, there is usually an official Alpine Linux image, e.g., golang, nodejs, and postgres.\n$ docker images REPOSITORY TAG IMAGE ID CREATED SIZE postgres 9.6.9-alpine 6583932564f8 13 days ago 39.26 MB postgres 9.6 d92dad241eff 13 days ago 235.4 MB postgres 10.4-alpine 93797b0f31f4 13 days ago 39.56 MB In addition, for compiled languages such as Go or C++ that do not require build time tooling during runtime, it is recommended to avoid build time tooling in the final images. With Docker’s support for multi-stage builds, this can be easily achieved with minimal effort. Such an example can be found at Multi-stage builds.\nGoogle’s distroless image is also a good base image.\n","categories":"","description":"Common Dockerfile pitfalls","excerpt":"Common Dockerfile pitfalls","ref":"/docs/guides/applications/dockerfile-pitfall/","tags":"","title":"Dockerfile Pitfalls"},{"body":"Using IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster Motivation IPv6 adoption is continuously growing, already overtaking IPv4 in certain regions, e.g. India, or scenarios, e.g. mobile. 
Even though most IPv6 installations deploy means to reach IPv4, it might still be beneficial to expose services natively via IPv4 and IPv6 instead of just relying on IPv4.\nDisadvantages of full IPv4/IPv6 (dual-stack) Deployments Enabling full IPv4/IPv6 (dual-stack) support in a kubernetes cluster is a major endeavor. It requires a lot of changes and restarts of all pods so that all pods get addresses for both IP families. A side-effect of dual-stack networking is that failures may be hidden as network traffic may take the other protocol to reach the target. For this reason and also due to reduced operational complexity, service teams might lean towards staying in a single-stack environment as much as possible. Luckily, this is possible with Gardener and IPv4/IPv6 (dual-stack) ingress on AWS.\nSimplifying IPv4/IPv6 (dual-stack) Ingress with Protocol Translation on AWS Fortunately, the network load balancer on AWS supports automatic protocol translation, i.e. it can expose both IPv4 and IPv6 endpoints while communicating with just one protocol to the backends. Under the hood, automatic protocol translation takes place. Client IP address preservation can be achieved by using proxy protocol.\nThis approach enables users to expose IPv4 workload to IPv6-only clients without having to change the workload/service. Without requiring invasive changes, it allows a fairly simple first step into the IPv6 world for services just requiring ingress (incoming) communication.\nNecessary Shoot Cluster Configuration Changes for IPv4/IPv6 (dual-stack) Ingress To be able to utilize IPv4/IPv6 (dual-stack) Ingress in an IPv4 shoot cluster, the cluster needs to meet two preconditions:\n dualStack.enabled needs to be set to true to configure VPC/subnet for IPv6 and add a routing rule for IPv6. (This does not add IPv6 addresses to kubernetes nodes.) loadBalancerController.enabled needs to be set to true as well to use the load balancer controller, which supports dual-stack ingress. 
apiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig dualStack: enabled: true controlPlaneConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerController: enabled: true ... When infrastructureConfig.networks.vpc.id is set to the ID of an existing VPC, please make sure that your VPC has an Amazon-provided IPv6 CIDR block added.\nAfter adapting the shoot specification and reconciling the cluster, dual-stack load balancers can be created using kubernetes services objects.\nCreating an IPv4/IPv6 (dual-stack) Ingress With the preconditions set, creating an IPv4/IPv6 load balancer is as easy as annotating a service with the correct annotations:\napiVersion: v1 kind: Service metadata: annotations: service.beta.kubernetes.io/aws-load-balancer-ip-address-type: dualstack service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance service.beta.kubernetes.io/aws-load-balancer-type: external name: ... namespace: ... spec: ... type: LoadBalancer In case the client IP address should be preserved, the following annotation can be used to enable proxy protocol. (The pod receiving the traffic needs to be configured for proxy protocol as well.)\n service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: \"*\" Please note that changing an existing Service to dual-stack may cause the creation of a new load balancer without deletion of the old AWS load balancer resource. While this helps in a seamless migration by not cutting existing connections it may lead to wasted/forgotten resources. 
Therefore, the (manual) cleanup needs to be taken into account when migrating an existing Service instance.\nFor more details see AWS Load Balancer Documentation - Network Load Balancer.\nDNS Considerations to Prevent Downtime During a Dual-Stack Migration In case the migration of an existing service is desired, please check if there are DNS entries directly linked to the corresponding load balancer. The migrated load balancer will have a new domain name immediately, which will not be ready in the beginning. Therefore, a direct migration of the domain name entries is not desired as it may cause a short downtime, i.e. domain name entries without backing IP addresses.\nIf there are DNS entries directly linked to the corresponding load balancer and they are managed by the shoot-dns-service, you can identify this via annotations with the prefix dns.gardener.cloud/. Those annotations can be linked to a Service, Ingress or Gateway resources. Alternatively, they may also use DNSEntry or DNSAnnotation resources.\nFor a seamless migration without downtime use the following three step approach:\n Temporarily prevent direct DNS updates Migrate the load balancer and wait until it is operational Allow DNS updates again To prevent direct updates of the DNS entries when the load balancer is migrated add the annotation dns.gardener.cloud/ignore: 'true' to all affected resources next to the other dns.gardener.cloud/... annotations before starting the migration. For example, in case of a Service ensure that the service looks like the following:\nkind: Service metadata: annotations: dns.gardener.cloud/ignore: 'true' dns.gardener.cloud/class: garden dns.gardener.cloud/dnsnames: '...' ... Next, migrate the load balancer to be dual-stack enabled by adding/changing the corresponding annotations.\nYou have multiple options how to check that the load balancer has been provisioned successfully. 
It might be useful to peek into status.loadBalancer.ingress of the corresponding Service to identify the load balancer:\n Check in the AWS console for the corresponding load balancer provisioning state Perform domain name lookups with nslookup/dig to check whether the name resolves to an IP address. Call your workload via the new load balancer, e.g. using curl --resolve \u003cmy-domain-name\u003e:\u003cport\u003e:\u003cIP-address\u003e https://\u003cmy-domain-name\u003e:\u003cport\u003e, which allows you to call your service with the “correct” domain name without using actual name resolution. Wait a fixed period of time as load balancer creation is usually finished within 15 minutes Once the load balancer has been provisioned, you can remove the annotation dns.gardener.cloud/ignore: 'true' again from the affected resources. It may take some additional time until the domain name change finally propagates (up to one hour).\n","categories":"","description":"","excerpt":"Using IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/dual-stack-ingress/","tags":"","title":"Dual Stack Ingress"},{"body":"Dependency Watchdog with Local Garden Cluster Setting up Local Garden cluster A convenient way to test local dependency-watchdog changes is to use a local garden cluster. 
To setup a local garden cluster you can follow the setup-guide.\nDependency Watchdog resources As part of the local garden installation, a local seed will be available.\nDependency Watchdog resources created in the seed Namespaced resources In the garden namespace of the seed cluster, following resources will be created:\n Resource (GVK) Name {apiVersion: v1, Kind: ServiceAccount} dependency-watchdog-prober {apiVersion: v1, Kind: ServiceAccount} dependency-watchdog-weeder {apiVersion: apps/v1, Kind: Deployment} dependency-watchdog-prober {apiVersion: apps/v1, Kind: Deployment} dependency-watchdog-weeder {apiVersion: v1, Kind: ConfigMap} dependency-watchdog-prober-* {apiVersion: v1, Kind: ConfigMap} dependency-watchdog-weeder-* {apiVersion: rbac.authorization.k8s.io/v1, Kind: Role} gardener.cloud:dependency-watchdog-prober:role {apiVersion: rbac.authorization.k8s.io/v1, Kind: Role} gardener.cloud:dependency-watchdog-weeder:role {apiVersion: rbac.authorization.k8s.io/v1, Kind: RoleBinding} gardener.cloud:dependency-watchdog-prober:role-binding {apiVersion: rbac.authorization.k8s.io/v1, Kind: RoleBinding} gardener.cloud:dependency-watchdog-weeder:role-binding {apiVersion: resources.gardener.cloud/v1alpha1, Kind: ManagedResource} dependency-watchdog-prober {apiVersion: resources.gardener.cloud/v1alpha1, Kind: ManagedResource} dependency-watchdog-weeder {apiVersion: v1, Kind: Secret} managedresource-dependency-watchdog-weeder {apiVersion: v1, Kind: Secret} managedresource-dependency-watchdog-prober Cluster resources Resource (GVK) Name {apiVersion: rbac.authorization.k8s.io/v1, Kind: ClusterRole} gardener.cloud:dependency-watchdog-prober:cluster-role {apiVersion: rbac.authorization.k8s.io/v1, Kind: ClusterRole} gardener.cloud:dependency-watchdog-weeder:cluster-role {apiVersion: rbac.authorization.k8s.io/v1, Kind: ClusterRoleBinding} gardener.cloud:dependency-watchdog-prober:cluster-role-binding {apiVersion: rbac.authorization.k8s.io/v1, Kind: ClusterRoleBinding} 
gardener.cloud:dependency-watchdog-weeder:cluster-role-binding Dependency Watchdog resources created in Shoot control namespace Resource (GVK) Name {apiVersion: v1, Kind: Secret} dependency-watchdog-prober {apiVersion: resources.gardener.cloud/v1alpha1, Kind: ManagedResource} shoot-core-dependency-watchdog Dependency Watchdog resources created in the kube-node-lease namespace of the shoot Resource (GVK) Name {apiVersion: rbac.authorization.k8s.io/v1, Kind: Role} gardener.cloud:target:dependency-watchdog {apiVersion: rbac.authorization.k8s.io/v1, Kind: RoleBinding} gardener.cloud:target:dependency-watchdog These will be created by the GRM and will have a managed resource named shoot-core-dependency-watchdog in the shoot namespace in the seed.\nUpdate Gardener with custom Dependency Watchdog Docker images Build, Tag and Push docker images To build dependency watchdog docker images run the following make target:\n\u003e make docker-build Local gardener hosts a docker registry which can be accessed at localhost:5001. To enable local gardener to access the custom docker images you need to tag and push these images to the embedded docker registry. To do that execute the following commands:\n\u003e docker images # Get the IMAGE ID of the dependency watchdog images that were built using docker-build make target. \u003e docker tag \u003cIMAGE-ID\u003e localhost:5001/europe-docker.pkg.dev/gardener-project/public/gardener/dependency-watchdog-prober:\u003cTAGNAME\u003e \u003e docker push localhost:5001/europe-docker.pkg.dev/gardener-project/public/gardener/dependency-watchdog-prober:\u003cTAGNAME\u003e Update ManagedResource Garden resource manager will revert any changes that are made to the kubernetes deployment for dependency watchdog. This is quite useful in live landscapes where only tested and qualified images are used for all gardener managed components. 
Any change therefore is automatically reverted.\nHowever, during development and testing you will need to use custom docker images. To prevent garden resource manager from reverting the changes done to the kubernetes deployment for dependency watchdog components you must update the respective managed resources first.\n# List the managed resources \u003e kubectl get mr -n garden | grep dependency # Sample response dependency-watchdog-weeder seed True True False 26h dependency-watchdog-prober seed True True False 26h # Lets assume that you are currently testing prober and would like to use a custom docker image \u003e kubectl edit mr dependency-watchdog-prober -n garden # This will open the resource YAML for editing. Add the annotation resources.gardener.cloud/ignore=true # Reference: https://github.com/gardener/gardener/blob/master/docs/concepts/resource-manager.md # Save the YAML file. When you are done with your testing then you can again edit the ManagedResource and remove the annotation. Garden resource manager will revert back to the image with which gardener was initially built and started.\nUpdate Kubernetes Deployment Find and update the kubernetes deployment for dependency watchdog.\n\u003e kubectl get deploy -n garden | grep dependency # Sample response dependency-watchdog-weeder 1/1 1 1 26h dependency-watchdog-prober 1/1 1 1 26h # Lets assume that you are currently testing prober and would like to use a custom docker image \u003e kubectl edit deploy dependency-watchdog-prober -n garden # This will open the resource YAML for editing. Change the image or any other changes and save. 
","categories":"","description":"","excerpt":"Dependency Watchdog with Local Garden Cluster Setting up Local Garden …","ref":"/docs/other-components/dependency-watchdog/setup/dwd-using-local-garden/","tags":"","title":"Dwd Using Local Garden"},{"body":"Overview The example shows how to run a Postgres database on Kubernetes and how to dynamically provision and mount the storage volumes needed by the database\nRun Postgres Database Define the following Kubernetes resources in a yaml file:\n PersistentVolumeClaim (PVC) Deployment PersistentVolumeClaim apiVersion: v1 kind: PersistentVolumeClaim metadata: name: postgresdb-pvc spec: accessModes: - ReadWriteOnce resources: requests: storage: 9Gi storageClassName: 'default' This defines a PVC using the storage class default. Storage classes abstract from the underlying storage provider as well as other parameters, like disk-type (e.g., solid-state vs standard disks).\nThe default storage class has the annotation {“storageclass.kubernetes.io/is-default-class”:“true”}.\n $ kubectl describe sc default Name: default IsDefaultClass: Yes Annotations: kubectl.kubernetes.io/last-applied-configuration={\"apiVersion\":\"storage.k8s.io/v1beta1\",\"kind\":\"StorageClass\",\"metadata\":{\"annotations\":{\"storageclass.kubernetes.io/is-default-class\":\"true\"},\"labels\":{\"addonmanager.kubernetes.io/mode\":\"Exists\"},\"name\":\"default\",\"namespace\":\"\"},\"parameters\":{\"type\":\"gp2\"},\"provisioner\":\"kubernetes.io/aws-ebs\"} ,storageclass.kubernetes.io/is-default-class=true Provisioner: kubernetes.io/aws-ebs Parameters: type=gp2 AllowVolumeExpansion: \u003cunset\u003e MountOptions: \u003cnone\u003e ReclaimPolicy: Delete VolumeBindingMode: Immediate Events: \u003cnone\u003e A Persistent Volume is automatically created when it is dynamically provisioned. 
In the following example, the PVC is defined as “postgresdb-pvc”, and a corresponding PV “pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb” is created and associated with the PVC automatically.\n$ kubectl create -f .\\postgres_deployment.yaml persistentvolumeclaim \"postgresdb-pvc\" created $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO Delete Bound default/postgresdb-pvc default 3s $ kubectl get pvc NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE postgresdb-pvc Bound pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO default 8s Notice that the RECLAIM POLICY is Delete (default value), which is one of the two reclaim policies; the other one is Retain. (A third policy Recycle has been deprecated). In the case of Delete, the PV is deleted automatically when the PVC is removed, and the data on the PVC will also be lost.\nOn the other hand, a PV with Retain policy will not be deleted when the PVC is removed, but is moved to Released status, so that data can be recovered by administrators later.\nYou can use the kubectl patch command to change the reclaim policy as described in Change the Reclaim Policy of a PersistentVolume or use kubectl edit pv \u003cpv-name\u003e to edit it online as shown below:\n$ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO Delete Bound default/postgresdb-pvc default 44m # change the reclaim policy from \"Delete\" to \"Retain\" $ kubectl edit pv pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb persistentvolume \"pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb\" edited # check the reclaim policy afterwards $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO Retain Bound default/postgresdb-pvc default 45m Deployment Once a PVC is created, you can use it in your container via 
volumes.persistentVolumeClaim.claimName. In the below example, the PVC postgresdb-pvc is mounted as readable and writable, and in volumeMounts two paths in the container are mounted to subfolders in the volume.\napiVersion: apps/v1 kind: Deployment metadata: name: postgres namespace: default labels: app: postgres annotations: deployment.kubernetes.io/revision: \"1\" spec: replicas: 1 strategy: type: RollingUpdate rollingUpdate: maxUnavailable: 1 maxSurge: 1 selector: matchLabels: app: postgres template: metadata: name: postgres labels: app: postgres spec: containers: - name: postgres image: \"cpettech.docker.repositories.sap.ondemand.com/jtrack_postgres:howto\" env: - name: POSTGRES_USER value: postgres - name: POSTGRES_PASSWORD value: p5FVqfuJFrM42cVX9muQXxrC3r8S9yn0zqWnFR6xCoPqxqVQ - name: POSTGRES_INITDB_XLOGDIR value: \"/var/log/postgresql/logs\" ports: - containerPort: 5432 volumeMounts: - mountPath: /var/lib/postgresql/data name: postgre-db subPath: data # https://github.com/kubernetes/website/pull/2292. Solve the issue of crashing initdb due to non-empty directory (i.e. 
lost+found) - mountPath: /var/log/postgresql/logs name: postgre-db subPath: logs volumes: - name: postgre-db persistentVolumeClaim: claimName: postgresdb-pvc readOnly: false imagePullSecrets: - name: cpettechregistry To check the mount points in the container:\n$ kubectl get po NAME READY STATUS RESTARTS AGE postgres-7f485fd768-c5jf9 1/1 Running 0 32m $ kubectl exec -it postgres-7f485fd768-c5jf9 bash root@postgres-7f485fd768-c5jf9:/# ls /var/lib/postgresql/data/ base pg_clog pg_dynshmem pg_ident.conf pg_multixact pg_replslot pg_snapshots pg_stat_tmp pg_tblspc PG_VERSION postgresql.auto.conf postmaster.opts global pg_commit_ts pg_hba.conf pg_logical pg_notify pg_serial pg_stat pg_subtrans pg_twophase pg_xlog postgresql.conf postmaster.pid root@postgres-7f485fd768-c5jf9:/# ls /var/log/postgresql/logs/ 000000010000000000000001 archive_status Deleting a PersistentVolumeClaim In case of a Delete policy, deleting a PVC will also delete its associated PV. If Retain is the reclaim policy, the PV will change status from Bound to Released when the PVC is deleted.\n# Check pvc and pv before deletion $ kubectl get pvc NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE postgresdb-pvc Bound pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO default 50m $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO Retain Bound default/postgresdb-pvc default 50m # delete pvc $ kubectl delete pvc postgresdb-pvc persistentvolumeclaim \"postgresdb-pvc\" deleted # pv changed to status \"Released\" $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE pvc-06c81c30-72ea-11e8-ada2-aa3b2329c8bb 9Gi RWO Retain Released default/postgresdb-pvc default 51m ","categories":"","description":"Running a Postgres database on Kubernetes","excerpt":"Running a Postgres database on Kubernetes","ref":"/docs/guides/applications/dynamic-pvc/","tags":"","title":"Dynamic Volume 
Provisioning"},{"body":"Gardener Extension for Networking Filter \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the shoot-networking-filter extension.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nExtension Resources Currently there is nothing to specify in the extension spec.\nExample extension resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: extension-shoot-networking-filter namespace: shoot--project--abc spec: When an extension resource is reconciled, the extension controller will create a daemonset egress-filter-applier on the shoot containing a Dockerfile container.\nPlease note, this extension controller relies on the Gardener-Resource-Manager to deploy k8s resources to seed and shoot clusters.\nHow to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nWe are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for egress filtering for shoot clusters","excerpt":"Gardener extension controller for egress filtering for shoot clusters","ref":"/docs/extensions/others/gardener-extension-shoot-networking-filter/","tags":"","title":"Egress filtering"},{"body":"Using IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster Motivation IPv6 adoption is continuously growing, already overtaking IPv4 in certain regions, e.g. India, or scenarios, e.g. mobile. Even though most IPv6 installations deploy means to reach IPv4, it might still be beneficial to expose services natively via IPv4 and IPv6 instead of just relying on IPv4.\nDisadvantages of full IPv4/IPv6 (dual-stack) Deployments Enabling full IPv4/IPv6 (dual-stack) support in a kubernetes cluster is a major endeavor. It requires a lot of changes and restarts of all pods so that all pods get addresses for both IP families. A side-effect of dual-stack networking is that failures may be hidden as network traffic may take the other protocol to reach the target. For this reason and also due to reduced operational complexity, service teams might lean towards staying in a single-stack environment as much as possible. Luckily, this is possible with Gardener and IPv4/IPv6 (dual-stack) ingress on AWS.\nSimplifying IPv4/IPv6 (dual-stack) Ingress with Protocol Translation on AWS Fortunately, the network load balancer on AWS supports automatic protocol translation, i.e. it can expose both IPv4 and IPv6 endpoints while communicating with just one protocol to the backends. Under the hood, automatic protocol translation takes place. 
Client IP address preservation can be achieved by using proxy protocol.\nThis approach enables users to expose IPv4 workload to IPv6-only clients without having to change the workload/service. Without requiring invasive changes, it allows a fairly simple first step into the IPv6 world for services just requiring ingress (incoming) communication.\nNecessary Shoot Cluster Configuration Changes for IPv4/IPv6 (dual-stack) Ingress To be able to utilize IPv4/IPv6 (dual-stack) Ingress in an IPv4 shoot cluster, the cluster needs to meet two preconditions:\n dualStack.enabled needs to be set to true to configure VPC/subnet for IPv6 and add a routing rule for IPv6. (This does not add IPv6 addresses to kubernetes nodes.) loadBalancerController.enabled needs to be set to true as well to use the load balancer controller, which supports dual-stack ingress. apiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig dualStack: enabled: true controlPlaneConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerController: enabled: true ... 
When infrastructureConfig.networks.vpc.id is set to the ID of an existing VPC, please make sure that your VPC has an Amazon-provided IPv6 CIDR block added.\nAfter adapting the shoot specification and reconciling the cluster, dual-stack load balancers can be created using kubernetes services objects.\nCreating an IPv4/IPv6 (dual-stack) Ingress With the preconditions set, creating an IPv4/IPv6 load balancer is as easy as annotating a service with the correct annotations:\napiVersion: v1 kind: Service metadata: annotations: service.beta.kubernetes.io/aws-load-balancer-ip-address-type: dualstack service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance service.beta.kubernetes.io/aws-load-balancer-type: external name: ... namespace: ... spec: ... type: LoadBalancer In case the client IP address should be preserved, the following annotation can be used to enable proxy protocol. (The pod receiving the traffic needs to be configured for proxy protocol as well.)\n service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: \"*\" Please note that changing an existing Service to dual-stack may cause the creation of a new load balancer without deletion of the old AWS load balancer resource. While this helps in a seamless migration by not cutting existing connections it may lead to wasted/forgotten resources. Therefore, the (manual) cleanup needs to be taken into account when migrating an existing Service instance.\nFor more details see AWS Load Balancer Documentation - Network Load Balancer.\nDNS Considerations to Prevent Downtime During a Dual-Stack Migration In case the migration of an existing service is desired, please check if there are DNS entries directly linked to the corresponding load balancer. The migrated load balancer will have a new domain name immediately, which will not be ready in the beginning. 
Therefore, a direct migration of the domain name entries is not desired as it may cause a short downtime, i.e. domain name entries without backing IP addresses.\nIf there are DNS entries directly linked to the corresponding load balancer and they are managed by the shoot-dns-service, you can identify this via annotations with the prefix dns.gardener.cloud/. Those annotations can be linked to a Service, Ingress or Gateway resources. Alternatively, they may also use DNSEntry or DNSAnnotation resources.\nFor a seamless migration without downtime use the following three step approach:\n Temporarily prevent direct DNS updates Migrate the load balancer and wait until it is operational Allow DNS updates again To prevent direct updates of the DNS entries when the load balancer is migrated add the annotation dns.gardener.cloud/ignore: 'true' to all affected resources next to the other dns.gardener.cloud/... annotations before starting the migration. For example, in case of a Service ensure that the service looks like the following:\nkind: Service metadata: annotations: dns.gardener.cloud/ignore: 'true' dns.gardener.cloud/class: garden dns.gardener.cloud/dnsnames: '...' ... Next, migrate the load balancer to be dual-stack enabled by adding/changing the corresponding annotations.\nYou have multiple options how to check that the load balancer has been provisioned successfully. It might be useful to peek into status.loadBalancer.ingress of the corresponding Service to identify the load balancer:\n Check in the AWS console for the corresponding load balancer provisioning state Perform domain name lookups with nslookup/dig to check whether the name resolves to an IP address. Call your workload via the new load balancer, e.g. using curl --resolve \u003cmy-domain-name\u003e:\u003cport\u003e:\u003cIP-address\u003e https://\u003cmy-domain-name\u003e:\u003cport\u003e, which allows you to call your service with the “correct” domain name without using actual name resolution. 
Wait a fixed period of time as load balancer creation is usually finished within 15 minutes. Once the load balancer has been provisioned, you can remove the annotation dns.gardener.cloud/ignore: 'true' again from the affected resources. It may take some additional time until the domain name change finally propagates (up to one hour).\n","categories":"","description":"Use IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster on AWS","excerpt":"Use IPv4/IPv6 (dual-stack) Ingress in an IPv4 single-stack cluster on …","ref":"/docs/guides/networking/dual-stack-ipv4-ipv6-ingress-aws/","tags":"","title":"Enable IPv4/IPv6 (dual-stack) Ingress on AWS"},{"body":"etcd - Key-Value Store for Kubernetes etcd is a strongly consistent key-value store and the most prevalent choice for the Kubernetes persistence layer. All API cluster objects like Pods, Deployments, Secrets, etc., are stored in etcd, which makes it an essential part of a Kubernetes control plane.\nGarden or Shoot Cluster Persistence Each garden or shoot cluster gets its very own persistence for the control plane. It runs in the shoot namespace on the respective seed cluster (or in the garden namespace in the garden cluster, respectively). Concretely, there are two etcd instances per shoot cluster, which the kube-apiserver is configured to use in the following way:\n etcd-main A store that contains all “cluster critical” or “long-term” objects. These object kinds are typically considered for a backup to prevent any data loss.\n etcd-events A store that contains all Event objects (events.k8s.io) of a cluster. Events usually have a short retention period and occur frequently, but are not essential for a disaster recovery.\nThe setup above ensures both that the critical etcd-main is not flooded by Kubernetes Events and that backup space is not occupied by non-critical data. 
This separation saves time and resources.\netcd Operator Configuring, maintaining, and health-checking etcd is outsourced to a dedicated operator called etcd Druid. When a gardenlet reconciles a Shoot resource or a gardener-operator reconciles a Garden resource, they manage an Etcd resource in the seed or garden cluster, containing necessary information (backup information, defragmentation schedule, resources, etc.). etcd-druid needs to manage the lifecycle of the desired etcd instance (today main or events). Likewise, when the Shoot or Garden is deleted, gardenlet or gardener-operator deletes the Etcd resources and etcd Druid takes care of cleaning up all related objects, e.g. the backing StatefulSets.\nBackup If Seeds specify backups for etcd (example), then Gardener and the respective provider extensions are responsible for creating a bucket on the cloud provider’s side (modelled through a BackupBucket resource). The bucket stores backups of Shoots scheduled on that Seed. Furthermore, Gardener creates a BackupEntry, which subdivides the bucket and thus makes it possible to store backups of multiple shoot clusters.\nHow long backups are stored in the bucket after a shoot has been deleted depends on the configured retention period in the Seed resource. Please see this example configuration for more information.\nFor Gardens specifying backups for etcd (example), the bucket must be pre-created externally and provided via the Garden specification.\nBoth etcd instances are configured to run with a special backup-restore sidecar. It takes care of regularly backing up etcd data and restoring it in case of data loss (in the main etcd only). The sidecar also performs defragmentation and other house-keeping tasks. More information can be found in the component’s GitHub repository.\nHousekeeping etcd maintenance tasks must be performed from time to time in order to regain database storage and to ensure the system’s reliability. 
The backup-restore sidecar takes care of this job as well.\nFor both Shoots and Gardens, a random time within the shoot’s maintenance time is chosen for scheduling these tasks.\n","categories":"","description":"How Gardener uses the etcd key-value store","excerpt":"How Gardener uses the etcd key-value store","ref":"/docs/gardener/concepts/etcd/","tags":"","title":"etcd"},{"body":"etcd-druid \n \netcd-druid is an etcd operator which makes it easy to configure, provision, reconcile and monitor etcd clusters. It enables management of an etcd cluster through a declarative Kubernetes API model.\nIn every etcd cluster managed by etcd-druid, each etcd member is a two container Pod which consists of:\n etcd-wrapper which manages the lifecycle (validation \u0026 initialization) of an etcd. etcd-backup-restore sidecar which currently provides the following capabilities (the list is not comprehensive): etcd DB validation. Scheduled etcd DB defragmentation. Backup - etcd DB snapshots are taken regularly and backed up in an object store if one is configured. Restoration - In case of a DB corruption for a single-member cluster it helps in restoring from the latest set of snapshots (full \u0026 delta). Member control operations. etcd-druid additionally provides the following capabilities:\n Facilitates declarative scale-out of etcd clusters.\n Provides protection against accidental deletion/mutation of resources provisioned as part of an etcd cluster.\n Offers an asynchronous and threshold based capability to process backed up snapshots to:\n Potentially minimize the recovery time by leveraging restoration from backups followed by etcd’s compaction and defragmentation. Indirectly assert integrity of the backed up snapshots. 
Allows seamless copy of backups between any two object store buckets.\n Start using or developing etcd-druid locally If you are looking to try out druid then you can use a Kind cluster based setup.\nhttps://github.com/user-attachments/assets/cfe0d891-f709-4d7f-b975-4300c6de67e4\nFor detailed documentation, see our /docs folder. Please find the index here.\nContributions If you wish to contribute then please see our guidelines.\nFeedback and Support We always look forward to active community engagement. Please report bugs or suggestions on how we can enhance etcd-druid on GitHub Issues.\nLicense Released under the Apache-2.0 license.\n","categories":"","description":"A druid for etcd management in Gardener","excerpt":"A druid for etcd management in Gardener","ref":"/docs/other-components/etcd-druid/","tags":"","title":"Etcd Druid"},{"body":"Documentation Index Concepts Controllers Webhooks Development Testing(Unit, Integration and E2E Tests) etcd Network Latency Getting started locally using azurite emulator Getting started locally using localstack emulator Getting started locally Local End-To-End Tests Deployment etcd-druid CLI Flags Feature Gates Operations Metrics Recovery from Permanent Quorum Loss in etcd cluster Restoring single member in a Multi-Node etcd cluster Proposals DEP: Template DEP-1: Multi-Node etcd clusters DEP-2: Snapshot compaction DEP-3: Scaling up an Etcd cluster DEP-4: Etcd Member custom resource DEP-5: Etcd Operator Tasks Usage Supported K8S versions ","categories":"","description":"","excerpt":"Documentation Index Concepts Controllers Webhooks Development …","ref":"/docs/other-components/etcd-druid/readme/","tags":"","title":"Etcd Druid"},{"body":"ETCD Encryption Config The spec.kubernetes.kubeAPIServer.encryptionConfig field in the Shoot API allows users to customize encryption configurations for the API server. 
It provides options to specify additional resources for encryption beyond secrets.\nUsage Guidelines The resources field can be used to specify resources that should be encrypted in addition to secrets. Secrets are always encrypted. Each item is a Kubernetes resource name in plural (resource or resource.group). Wild cards are not supported. Adding an item to this list will cause patch requests for all the resources of that kind to encrypt them in the etcd. See Encrypting Confidential Data at Rest for more details. Removing an item from this list will cause patch requests for all the resources of that type to decrypt and rewrite the resource as plain text. See Decrypt Confidential Data that is Already Encrypted at Rest for more details. ℹ️ Note that configuring encryption for a custom resource is only supported for Kubernetes versions \u003e= 1.26.\n Example Usage in a Shoot spec: kubernetes: kubeAPIServer: encryptionConfig: resources: - configmaps - statefulsets.apps - customresource.fancyoperator.io ","categories":"","description":"Specifying resource types for encryption with `spec.kubernetes.kubeAPIServer.encryptionConfig`","excerpt":"Specifying resource types for encryption with …","ref":"/docs/gardener/etcd_encryption_config/","tags":"","title":"ETCD Encryption Config"},{"body":"Network Latency analysis: sn-etcd-sz vs mn-etcd-sz vs mn-etcd-mz This page captures the etcd cluster latency analysis for the scenarios below using the benchmark tool (built from the etcd benchmark tool).\nsn-etcd-sz -\u003e single-node etcd single zone (Only single replica of etcd will be running)\nmn-etcd-sz -\u003e multi-node etcd single zone (Multiple replicas of etcd pods will be running across nodes in a single zone)\nmn-etcd-mz -\u003e multi-node etcd multi zone (Multiple replicas of etcd pods will be running across nodes in multiple zones)\nPUT Analysis Summary sn-etcd-sz latency is ~20% less than mn-etcd-sz when the benchmark tool runs with a single client. 
mn-etcd-sz latency is less than mn-etcd-mz but the difference is ~+/-5%. Compared to mn-etcd-sz, sn-etcd-sz latency is higher and gradually grows with more clients and larger value size. Compared to mn-etcd-mz, mn-etcd-sz latency is higher and gradually grows with more clients and larger value size. Compared to follower, leader latency is less, when benchmark tool with single client for all cases. Compared to follower, leader latency is high, when benchmark tool with multiple clients for all cases. Sample commands:\n# write to leader benchmark put --target-leader --conns=1 --clients=1 --precise \\ --sequential-keys --key-starts 0 --val-size=256 --total=10000 \\ --endpoints=$ETCD_HOST # write to follower benchmark put --conns=1 --clients=1 --precise \\ --sequential-keys --key-starts 0 --val-size=256 --total=10000 \\ --endpoints=$ETCD_FOLLOWER_HOST Latency analysis during PUT requests to etcd In this case benchmark tool tries to put key with random 256 bytes value. Benchmark tool loads key/value to leader with single client .\n sn-etcd-sz latency (~0.815ms) is ~50% lesser than mn-etcd-sz (~1.74ms ). mn-etcd-sz latency (~1.74ms ) is slightly lesser than mn-etcd-mz (~1.8ms) but the difference is negligible (within same ms). Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 leader 1220.0520 0.815ms eu-west-1c etcd-main-0 sn-etcd-sz 10000 256 1 1 leader 586.545 1.74ms eu-west-1a etcd-main-1 mn-etcd-sz 10000 256 1 1 leader 554.0155654442634 1.8ms eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool loads key/value to follower with single client.\n mn-etcd-sz latency(~2.2ms) is 20% to 30% lesser than mn-etcd-mz(~2.7ms). Compare to follower, leader has lower latency. 
Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 follower-1 445.743 2.23ms eu-west-1a etcd-main-0 mn-etcd-sz 10000 256 1 1 follower-1 378.9366747610789 2.63ms eu-west-1c etcd-main-0 mn-etcd-mz Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 follower-2 457.967 2.17ms eu-west-1a etcd-main-2 mn-etcd-sz 10000 256 1 1 follower-2 345.6586129825796 2.89ms eu-west-1b etcd-main-2 mn-etcd-mz Benchmark tool loads key/value to leader with multiple clients.\n sn-etcd-sz latency(~78.3ms) is ~10% greater than mn-etcd-sz(~71.81ms). mn-etcd-sz latency(~71.81ms) is less than mn-etcd-mz(~72.5ms) but the difference is negligible. Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 leader 12638.905 78.32ms eu-west-1c etcd-main-0 sn-etcd-sz 100000 256 100 1000 leader 13789.248 71.81ms eu-west-1a etcd-main-1 mn-etcd-sz 100000 256 100 1000 leader 13728.446436395223 72.5ms eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool loads key/value to follower with multiple clients.\n mn-etcd-sz latency(~69.8ms) is ~5% less than mn-etcd-mz(~72.6ms). Compared to the leader, the follower has lower latency. 
Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 follower-1 14271.983 69.80ms eu-west-1a etcd-main-0 mn-etcd-sz 100000 256 100 1000 follower-1 13695.98 72.62ms eu-west-1a etcd-main-1 mn-etcd-mz Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 follower-2 14325.436 69.47ms eu-west-1a etcd-main-2 mn-etcd-sz 100000 256 100 1000 follower-2 15750.409490407475 63.3ms eu-west-1b etcd-main-2 mn-etcd-mz In this case the benchmark tool tries to put keys with a random 1 MB value. Benchmark tool loads key/value to leader with single client.\n sn-etcd-sz latency(~16.35ms) is ~20% lesser than mn-etcd-sz(~20.64ms). mn-etcd-sz latency(~20.64ms) is less than mn-etcd-mz(~21.08ms) but the difference is negligible. Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 leader 61.117 16.35ms eu-west-1c etcd-main-0 sn-etcd-sz 1000 1000000 1 1 leader 48.416 20.64ms eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 1 1 leader 45.7517341664802 21.08ms eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool loads key/value to follower with single client.\n mn-etcd-sz latency(~23.10ms) is ~10% greater than mn-etcd-mz(~21.8ms). Compared to the follower, the leader has lower latency. 
Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 follower-1 43.261 23.10ms eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 1 1 follower-1 45.7517341664802 21.8ms eu-west-1c etcd-main-0 mn-etcd-mz 1000 1000000 1 1 follower-1 45.33 22.05ms eu-west-1c etcd-main-0 mn-etcd-mz Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 follower-2 40.0518 24.95ms eu-west-1a etcd-main-2 mn-etcd-sz 1000 1000000 1 1 follower-2 43.28573155709838 23.09ms eu-west-1b etcd-main-2 mn-etcd-mz 1000 1000000 1 1 follower-2 45.92 21.76ms eu-west-1a etcd-main-1 mn-etcd-mz 1000 1000000 1 1 follower-2 35.5705 28.1ms eu-west-1b etcd-main-2 mn-etcd-mz Benchmark tool loads key/value to leader with multiple clients.\n sn-etcd-sz latency(~6.0375secs) is ~30% greater than mn-etcd-sz(~4.000secs). mn-etcd-sz latency(~4.000secs) is less than mn-etcd-mz(~ 4.09secs) but the difference is negligible. Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 300 leader 55.373 6.0375secs eu-west-1c etcd-main-0 sn-etcd-sz 1000 1000000 100 300 leader 67.319 4.000secs eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 100 300 leader 65.91914167957594 4.09secs eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool loads key/value to follower with multiple clients.\n mn-etcd-sz latency(~4.04secs) is ~5% greater than mn-etcd-mz(~ 3.90secs). Compared to the leader, the follower has lower latency. 
Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 300 follower-1 66.528 4.0417secs eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 100 300 follower-1 70.6493461856332 3.90secs eu-west-1c etcd-main-0 mn-etcd-mz 1000 1000000 100 300 follower-1 71.95 3.84secs eu-west-1c etcd-main-0 mn-etcd-mz Number of keys Value size Number of connections Number of clients Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 300 follower-2 66.447 4.0164secs eu-west-1a etcd-main-2 mn-etcd-sz 1000 1000000 100 300 follower-2 67.53038086369484 3.87secs eu-west-1b etcd-main-2 mn-etcd-mz 1000 1000000 100 300 follower-2 68.46 3.92secs eu-west-1a etcd-main-1 mn-etcd-mz Range Analysis Sample commands are:\n# Single connection read request with sequential keys benchmark range 0 --target-leader --conns=1 --clients=1 --precise \\ --sequential-keys --key-starts 0 --total=10000 \\ --consistency=l \\ --endpoints=$ETCD_HOST # --consistency=s [Serializable] benchmark range 0 --target-leader --conns=1 --clients=1 --precise \\ --sequential-keys --key-starts 0 --total=10000 \\ --consistency=s \\ --endpoints=$ETCD_HOST # Each read request with range query matches key 0 9999 and repeats for total number of requests. 
benchmark range 0 9999 --target-leader --conns=1 --clients=1 --precise \\ --total=10 \\ --consistency=s \\ --endpoints=https://etcd-main-client:2379 # Read requests with multiple connections benchmark range 0 --target-leader --conns=100 --clients=1000 --precise \\ --sequential-keys --key-starts 0 --total=100000 \\ --consistency=l \\ --endpoints=$ETCD_HOST benchmark range 0 --target-leader --conns=100 --clients=1000 --precise \\ --sequential-keys --key-starts 0 --total=100000 \\ --consistency=s \\ --endpoints=$ETCD_HOST Latency analysis during Range requests to etcd In this case benchmark tool tries to get specific key with random 256 bytes value. Benchmark tool range requests to leader with single client.\n sn-etcd-sz latency(~1.24ms) is ~40% greater than mn-etcd-sz(~0.67ms).\n mn-etcd-sz latency(~0.67ms) is ~20% lesser than mn-etcd-mz(~0.85ms).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 true l leader 800.272 1.24ms eu-west-1c etcd-main-0 sn-etcd-sz 10000 256 1 1 true l leader 1173.9081 0.67ms eu-west-1a etcd-main-1 mn-etcd-sz 10000 256 1 1 true l leader 999.3020189178693 0.85ms eu-west-1a etcd-main-1 mn-etcd-mz Compare to consistency Linearizable, Serializable is ~40% less for all cases\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 true s leader 1411.229 0.70ms eu-west-1c etcd-main-0 sn-etcd-sz 10000 256 1 1 true s leader 2033.131 0.35ms eu-west-1a etcd-main-1 mn-etcd-sz 10000 256 1 1 true s leader 2100.2426362012025 0.47ms eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to follower with single client .\n mn-etcd-sz latency(~1.3ms) is ~20% lesser than mn-etcd-mz(~1.6ms). 
Compare to follower, leader read request latency is ~50% less for both mn-etcd-sz, mn-etcd-mz Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 true l follower-1 765.325 1.3ms eu-west-1a etcd-main-0 mn-etcd-sz 10000 256 1 1 true l follower-1 596.1 1.6ms eu-west-1c etcd-main-0 mn-etcd-mz Compare to consistency Linearizable, Serializable is ~50% less for all cases Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 10000 256 1 1 true s follower-1 1823.631 0.54ms eu-west-1a etcd-main-0 mn-etcd-sz 10000 256 1 1 true s follower-1 1442.6 0.69ms eu-west-1c etcd-main-0 mn-etcd-mz 10000 256 1 1 true s follower-1 1416.39 0.70ms eu-west-1c etcd-main-0 mn-etcd-mz 10000 256 1 1 true s follower-1 2077.449 0.47ms eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to leader with multiple client.\n sn-etcd-sz latency(~84.66ms) is ~20% greater than mn-etcd-sz(~73.95ms).\n mn-etcd-sz latency(~73.95ms) is more or less equal to mn-etcd-mz(~ 73.8ms).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 true l leader 11775.721 84.66ms eu-west-1c etcd-main-0 sn-etcd-sz 100000 256 100 1000 true l leader 13446.9598 73.95ms eu-west-1a etcd-main-1 mn-etcd-sz 100000 256 100 1000 true l leader 13527.19810605353 73.8ms eu-west-1a etcd-main-1 mn-etcd-mz Compare to consistency Linearizable, Serializable is ~20% lesser for all cases\n sn-etcd-sz latency(~69.37ms) is more or less equal to mn-etcd-sz(~69.89ms).\n mn-etcd-sz latency(~69.89ms) is slightly higher than mn-etcd-mz(~67.63ms).\n Number of requests Value size Number of connections 
Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 true s leader 14334.9027 69.37ms eu-west-1c etcd-main-0 sn-etcd-sz 100000 256 100 1000 true s leader 14270.008 69.89ms eu-west-1a etcd-main-1 mn-etcd-sz 100000 256 100 1000 true s leader 14715.287354023869 67.63ms eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to follower with multiple clients.\n mn-etcd-sz latency(~60.69ms) is ~15% lower than mn-etcd-mz(~70.76ms).\n Compare to leader, follower has lower read request latency.\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 true l follower-1 11586.032 60.69ms eu-west-1a etcd-main-0 mn-etcd-sz 100000 256 100 1000 true l follower-1 14050.5 70.76ms eu-west-1c etcd-main-0 mn-etcd-mz mn-etcd-sz latency(~86.09ms) is ~33% higher than mn-etcd-mz(~64.6ms).\n Compare to mn-etcd-sz consistency Linearizable, Serializable is ~40% higher.\n Compare to mn-etcd-mz consistency Linearizable, Serializable is slightly less.\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 100000 256 100 1000 true s follower-1 11582.438 86.09ms eu-west-1a etcd-main-0 mn-etcd-sz 100000 256 100 1000 true s follower-1 15422.2 64.6ms eu-west-1c etcd-main-0 mn-etcd-mz Benchmark tool range requests to leader all keys.\n sn-etcd-sz latency(~678.77ms) is slightly lower (~3%) than mn-etcd-sz(~697.29ms).\n mn-etcd-sz latency(~697.29ms) is less than mn-etcd-mz(~701ms) but the difference is negligible.\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 
256 2 5 false l leader 6.8875 678.77ms eu-west-1c etcd-main-0 sn-etcd-sz 20 256 2 5 false l leader 6.720 697.29ms eu-west-1a etcd-main-1 mn-etcd-sz 20 256 2 5 false l leader 6.7 701ms eu-west-1a etcd-main-1 mn-etcd-mz Compare to consistency Linearizable, Serializable is ~5% slightly higher for all cases sn-etcd-sz latency(~687.36ms) is less than mn-etcd-sz(~692.68ms) but the difference is negligible.\n mn-etcd-sz latency(~692.68ms) is ~5% slightly lesser than mn-etcd-mz(~735.7ms).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 256 2 5 false s leader 6.76 687.36ms eu-west-1c etcd-main-0 sn-etcd-sz 20 256 2 5 false s leader 6.635 692.68ms eu-west-1a etcd-main-1 mn-etcd-sz 20 256 2 5 false s leader 6.3 735.7ms eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to follower all keys\n mn-etcd-sz(~737.68ms) latency is ~5% slightly higher than mn-etcd-mz(~713.7ms).\n Compare to leader consistency Linearizableread request, follower is ~5% slightly higher.\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 256 2 5 false l follower-1 6.163 737.68ms eu-west-1a etcd-main-0 mn-etcd-sz 20 256 2 5 false l follower-1 6.52 713.7ms eu-west-1c etcd-main-0 mn-etcd-mz mn-etcd-sz latency(~757.73ms) is ~10% higher than mn-etcd-mz(~690.4ms).\n Compare to follower consistency Linearizableread request, follower consistency Serializable is ~3% slightly higher for mn-etcd-sz.\n Compare to follower consistency Linearizableread request, follower consistency Serializable is ~5% less for mn-etcd-mz.\n *Compare to leader consistency Serializableread request, follower consistency Serializable is ~5% less for mn-etcd-mz. 
*\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 256 2 5 false s follower-1 6.0295 757.73ms eu-west-1a etcd-main-0 mn-etcd-sz 20 256 2 5 false s follower-1 6.87 690.4ms eu-west-1c etcd-main-0 mn-etcd-mz In this case benchmark tool tries to get specific key with random `1MB` value. Benchmark tool range requests to leader with single client.\n sn-etcd-sz latency(~5.96ms) is ~5% lesser than mn-etcd-sz(~6.28ms).\n mn-etcd-sz latency(~6.28ms) is ~10% higher than mn-etcd-mz(~5.3ms).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 true l leader 167.381 5.96ms eu-west-1c etcd-main-0 sn-etcd-sz 1000 1000000 1 1 true l leader 158.822 6.28ms eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 1 1 true l leader 187.94 5.3ms eu-west-1a etcd-main-1 mn-etcd-mz Compare to consistency Linearizable, Serializable is ~15% less for sn-etcd-sz, mn-etcd-sz, mn-etcd-mz\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 true s leader 184.95 5.398ms eu-west-1c etcd-main-0 sn-etcd-sz 1000 1000000 1 1 true s leader 176.901 5.64ms eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 1 1 true s leader 209.99 4.7ms eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to follower with single client.\n mn-etcd-sz latency(~6.66ms) is ~10% higher than mn-etcd-mz(~6.16ms).\n Compare to leader, follower read request latency is ~10% high for mn-etcd-sz\n Compare to leader, follower read request latency is ~20% high for mn-etcd-mz\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd 
server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 true l follower-1 150.680 6.66ms eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 1 1 true l follower-1 162.072 6.16ms eu-west-1c etcd-main-0 mn-etcd-mz Compare to consistency Linearizable, Serializable is ~15% less for mn-etcd-sz(~5.84ms), mn-etcd-mz(~5.01ms).\n Compare to leader, follower read request latency is ~5% slightly high for mn-etcd-sz, mn-etcd-mz\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 1 1 true s follower-1 170.918 5.84ms eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 1 1 true s follower-1 199.01 5.01ms eu-west-1c etcd-main-0 mn-etcd-mz Benchmark tool range requests to leader with multiple clients.\n sn-etcd-sz latency(~1.593secs) is ~20% lesser than mn-etcd-sz(~1.974secs).\n mn-etcd-sz latency(~1.974secs) is ~5% greater than mn-etcd-mz(~1.81secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true l leader 252.149 1.593secs eu-west-1c etcd-main-0 sn-etcd-sz 1000 1000000 100 500 true l leader 205.589 1.974secs eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 100 500 true l leader 230.42 1.81secs eu-west-1a etcd-main-1 mn-etcd-mz Compare to consistency Linearizable, Serializable is more or less same for sn-etcd-sz(~1.57961secs), mn-etcd-mz(~1.8secs) not a big difference\n Compare to consistency Linearizable, Serializable is ~10% high for mn-etcd-sz(~ 2.277secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true s leader 252.406 1.57961secs eu-west-1c etcd-main-0 sn-etcd-sz 
1000 1000000 100 500 true s leader 181.905 2.277secs eu-west-1a etcd-main-1 mn-etcd-sz 1000 1000000 100 500 true s leader 227.64 1.8secs eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to follower with multiple clients.\n mn-etcd-sz latency is ~15% less than mn-etcd-mz.\n Compare to leader consistency Linearizable, follower read request latency is ~15% less for mn-etcd-sz(~1.694secs).\n Compare to leader consistency Linearizable, follower read request latency is ~10% higher for mn-etcd-mz(~1.977secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true l follower-1 248.489 1.694secs eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 100 500 true l follower-1 210.22 1.977secs eu-west-1c etcd-main-0 mn-etcd-mz Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true l follower-2 205.765 1.967secs eu-west-1a etcd-main-2 mn-etcd-sz 1000 1000000 100 500 true l follower-2 195.2 2.159secs eu-west-1b etcd-main-2 mn-etcd-mz Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true s follower-1 231.458 1.7413secs eu-west-1a etcd-main-0 mn-etcd-sz 1000 1000000 100 500 true s follower-1 214.80 1.907secs eu-west-1c etcd-main-0 mn-etcd-mz Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 1000 1000000 100 500 true s follower-2 183.320 2.2810secs eu-west-1a etcd-main-2 mn-etcd-sz 1000 1000000 100 500 true s follower-2 195.40 2.164secs eu-west-1b etcd-main-2 
mn-etcd-mz Benchmark tool range requests to leader all keys.\n sn-etcd-sz latency(~8.993secs) is ~3% slightly lower than mn-etcd-sz(~9.236secs).\n mn-etcd-sz latency(~9.236secs) is ~2% slightly higher than mn-etcd-mz(~9.100secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 1000000 2 5 false l leader 0.5139 8.993secs eu-west-1c etcd-main-0 sn-etcd-sz 20 1000000 2 5 false l leader 0.506 9.236secs eu-west-1a etcd-main-1 mn-etcd-sz 20 1000000 2 5 false l leader 0.508 9.100secs eu-west-1a etcd-main-1 mn-etcd-mz Compare to consistency Linearizable read request, Serializable for sn-etcd-sz(~9.0003secs) differs only slightly (under 10ms).\n Compare to consistency Linearizable read request, Serializable for mn-etcd-sz(~9.113secs) is ~1% less, not a big difference.\n Compare to consistency Linearizable read request, Serializable for mn-etcd-mz(~8.799secs) is ~3% less, not a big difference.\n sn-etcd-sz latency(~9.0003secs) is ~1% slightly less than mn-etcd-sz(~9.113secs).\n mn-etcd-sz latency(~9.113secs) is ~3% slightly higher than mn-etcd-mz(~8.799secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 1000000 2 5 false s leader 0.51125 9.0003secs eu-west-1c etcd-main-0 sn-etcd-sz 20 1000000 2 5 false s leader 0.4993 9.113secs eu-west-1a etcd-main-1 mn-etcd-sz 20 1000000 2 5 false s leader 0.522 8.799secs eu-west-1a etcd-main-1 mn-etcd-mz Benchmark tool range requests to follower all keys\n mn-etcd-sz latency(~9.065secs) is ~1% slightly higher than mn-etcd-mz(~9.007secs).\n Compare to leader consistency Linearizable read request, follower is ~1-2% lower for both cases mn-etcd-sz, mn-etcd-mz.\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd 
server Average write QPS Average latency per request zone server name Test name 20 1000000 2 5 false l follower-1 0.512 9.065secs eu-west-1a etcd-main-0 mn-etcd-sz 20 1000000 2 5 false l follower-1 0.533 9.007secs eu-west-1c etcd-main-0 mn-etcd-mz Compare to consistency Linearizableread request, follower for mn-etcd-sz(~9.553secs) is ~5% high.\n Compare to consistency Linearizableread request, follower for mn-etcd-mz(~7.7433secs) is ~15% less.\n mn-etcd-sz(~9.553secs) latency is ~20% higher than mn-etcd-mz(~7.7433secs).\n Number of requests Value size Number of connections Number of clients sequential-keys Consistency Target etcd server Average write QPS Average latency per request zone server name Test name 20 1000000 2 5 false s follower-1 0.4743 9.553secs eu-west-1a etcd-main-0 mn-etcd-sz 20 1000000 2 5 false s follower-1 0.5500 7.7433secs eu-west-1c etcd-main-0 mn-etcd-mz NOTE: This Network latency analysis is inspired by etcd performance.\n ","categories":"","description":"","excerpt":"Network Latency analysis: sn-etcd-sz vs mn-etcd-sz vs mn-etcd-mz This …","ref":"/docs/other-components/etcd-druid/etcd-network-latency/","tags":"","title":"etcd Network Latency"},{"body":"DEP-04: EtcdMember Custom Resource Table of Contents DEP-04: EtcdMember Custom Resource Table of Contents Summary Terminology Motivation Goals Non-Goals Proposal Etcd Member Metadata Etcd Member State Transitions States and Sub-States Top Level State Transitions Starting an Etcd-Member in a Single-Node Etcd Cluster Addition of a New Etcd-Member in a Multi-Node Etcd Cluster Restart of a Voting Etcd-Member in a Multi-Node Etcd Cluster Deterministic Etcd Member Creation/Restart During Scale-Up TLS Enablement for Peer Communication Monitoring Backup Health Enhanced Snapshot Compaction Enhanced Defragmentation Monitoring Defragmentations Monitoring Restorations Monitoring Volume Mismatches Custom Resource API Spec vs Status Representing State Transitions Reason Codes API EtcdMember Etcd Lifecycle of 
an EtcdMember Creation Updation Deletion Reconciliation Stale EtcdMember Status Handling Reference Summary Today, etcd-druid mainly acts as an etcd cluster provisioner, and seldom takes remediatory actions if the etcd cluster goes into an undesired state that needs to be resolved by a human operator. In other words, etcd-druid cannot perform day-2 operations on etcd clusters in its current form, and hence cannot carry out its full set of responsibilities as a true “operator” of etcd clusters. For etcd-druid to be fully capable of its responsibilities, it must know the latest state of the etcd clusters and their individual members at all times.\nThis proposal aims to bridge that gap by introducing the EtcdMember custom resource, which allows individual etcd cluster members to publish information/state (previously unknown to etcd-druid). This provides etcd-druid a handle to potentially take cluster-scoped remediatory actions.\nTerminology druid: etcd-druid - an operator for etcd clusters.\n etcd-member: A single etcd pod in an etcd cluster that is realised as a StatefulSet.\n backup-sidecar: The etcd-backup-restore sidecar container in each etcd-member pod.\n NOTE: The term sidecar can now be confused with the latest definition in KEP-73. The etcd-backup-restore container is currently not set as an init-container as proposed in the KEP but as a regular container in a multi-container Pod.\n leading-backup-sidecar: A backup-sidecar that is associated with an etcd leader.\n restoration: It refers to an individual etcd-member restoring etcd data from an existing backup (comprising full and delta snapshots). The authors have deliberately chosen to distinguish between restoration and learning. Learning refers to a process where a learner “learns” from an etcd-cluster leader.\n Motivation Sharing state of an individual etcd-member with druid is essential for diagnostics, monitoring, cluster-wide-operations and potential remediation. 
At present, only a subset of etcd-member state is shared with druid using leases. It was always meant as a stopgap arrangement as mentioned in the corresponding issue and is not the best use of leases.\nThere is a need to have a clear distinction between an etcd-member state and etcd cluster state since most of an etcd cluster state is often derived by looking at individual etcd-member states. In addition, actors which update each of these states should be clearly identified so as to prevent multiple actors updating a single resource holding the state of either an etcd cluster or an etcd-member. As a consequence, etcd-members should not directly update the Etcd resource status and would therefore need a new custom resource allowing each member to publish detailed information about its latest state.\nGoals Introduce the EtcdMember custom resource via which each etcd-member can publish information about its state. This enables druid to deterministically orchestrate out-of-turn operations like compaction, defragmentation, volume management etc. Define and capture states, sub-states and deterministic transitions amongst states of an etcd-member. Today leases are misused to share member-specific information with druid. Their usage to share member state [leader, follower, learner], member-id, snapshot revisions etc. should be removed. Non-Goals Auto-recovery from quorum loss or cluster-split due to network partitioning. Auto-recovery of an etcd-member due to volume mismatch. Revisiting the segregation of responsibilities between etcd and backup-sidecar containers. Proposal This proposal introduces a new custom resource EtcdMember, and in the following sections describes different sets of information that should be captured as part of the new resource.\nEtcd Member Metadata Every etcd-member has a unique memberID and it is part of an etcd cluster which has a unique clusterID. In a well-formed etcd cluster every member must have the same clusterID. 
Publishing this information to druid helps in identifying issues when one or more etcd-members form their own individual clusters, thus resulting in multiple clusters where only one was expected. Issues Issue#419, Canary#4027, Canary#3973 are some such occurrences.\nToday, this information is published by using a member lease. Both these fields are populated in the leases’ Spec.HolderIdentity by the backup-sidecar container.\nThe authors propose to publish member metadata information in the EtcdMember resource.\nid: \u003cetcd-member id\u003e clusterID: \u003cetcd cluster id\u003e NOTE: Druid would not do any auto-recovery when it finds out that more than one cluster is being formed. Instead, this information will be used for diagnostics and alerting.\n Etcd Member State Transitions Each etcd-member goes through different States during its lifetime. State is a derived high-level summary of where an etcd-member is in its lifecycle. A SubState gives additional information about the state. This proposal extends the concept of states with the notion of a SubState, since State indicates a top-level state of an EtcdMember resource, which can have one or more SubStates.\nWhile State is sufficient for many human operators, the notion of a SubState provides operators with an insight about the discrete stage of an etcd-member in its lifecycle. For example, consider a top-level State: Starting, which indicates that an etcd-member is starting. Starting is meant to be a transient state for an etcd-member. 
If an etcd-member remains in this State longer than expected, then an operator would require additional insight, which the authors propose to provide via SubState (in this case, the possible SubStates could be PendingLearner and Learner, which are detailed in the following sections).\nAt present, these states are not captured and only the final state is known - i.e., the etcd-member either fails to come up (all re-attempts to bring up the pod via the StatefulSet controller have been exhausted) or it comes up. Getting an insight into all its state transitions would help in diagnostics.\nThe status of an etcd-member at any given point in time can be best categorized as a combination of a top-level State and a SubState. The authors propose to introduce the following states and sub-states:\nStates and Sub-States NOTE: Abbreviations have been used wherever possible, only to represent sub-states. These representations are chosen only for brevity and will have proper longer names.\n States Sub-States Description New - Every newly created etcd-member will start in this state and is termed as the initial state or the start state. Initializing DBV-S (DBValidationSanity) This state denotes that backup-restore container in etcd-member pod has started initialization. Sub-State DBV-S which is an abbreviation for DBValidationSanity denotes that currently sanity etcd DB validation is in progress. Initializing DBV-F (DBValidationFull) This state denotes that backup-restore container in etcd-member pod has started initialization. Sub-State DBV-F which is an abbreviation for DBValidationFull denotes that currently full etcd DB validation is in progress. Initializing R (Restoration) This state denotes that backup-restore container in etcd-member pod has started initialization. Sub-State R which is an abbreviation for Restoration denotes that DB validation failed and now backup-restore has commenced restoration of etcd DB from the backup (comprising a full snapshot and delta snapshots). 
An etcd-member will transition to this sub-state only when it is part of a single-node etcd-cluster. Starting (SI) PL (PendingLearner) An etcd-member can transition from Initializing state to PendingLearner state. In this state backup-restore container will optionally delete any existing etcd data directory and then attempts to add its peer etcd-member process as a learner. Since there can be only one learner at a time in an etcd cluster, an etcd-member could be in this state for some time till its request to get added as a learner is accepted. Starting (SI) Learner When backup-restore is successfully able to add its peer etcd-member process as a Learner. In this state the etcd-member process will start its DB sync from an etcd leader. Started (Sd) Follower A follower is a voting raft member. A Learner etcd-member will get promoted to a Follower once its DB is in sync with the leader. It could also become a follower if during a re-election it loses leadership and transitions from being a Leader to Follower. Started (Sd) Leader A leader is an etcd-member which will handle all client write requests and linearizable read requests. A member could transition to being a Leader from an existing Follower role due to winning a leader election or for a single node etcd cluster it directly transitions from Initializing state to Leader state as there is no other member. In the following sub-sections, the state transitions are categorized into several flows making it easier to grasp the different transitions.\nTop Level State Transitions Following DFA represents top level state transitions (without any representation of sub-states). As described in the table above there are 4 top level states:\n New- this is a start state for all newly created etcd-members\n Initializing - In this state backup-restore will perform pre-requisite actions before it triggers the start of an etcd process. DB validation and optionally restoration is done in this state. 
Possible sub-states are: DBValidationSanity, DBValidationFull and Restoration\n Starting - Once the optional initialization is done backup-restore will trigger the start of an etcd process. It can either directly go to Learner sub-state or wait for getting added as a learner and therefore be in PendingLearner sub-state.\n Started - In this state the etcd-member is a full voting member. It can either be in Leader or Follower sub-states.\n Starting an Etcd-Member in a Single-Node Etcd Cluster Following DFA represents the states, sub-states and transitions of a single etcd-member for a cluster that is bootstrapped from cluster size of 0 -\u003e 1.\nAddition of a New Etcd-Member in a Multi-Node Etcd Cluster Following DFA represents the states, sub-states and transitions of an etcd cluster which starts with having a single member (Leader) and then one or more new members are added which represents a scale-up of an etcd cluster from 1 -\u003e n, where n is odd.\nRestart of a Voting Etcd-Member in a Multi-Node Etcd Cluster Following DFA represents the states, sub-states and transitions when a voting etcd-member in a multi-node etcd cluster restarts.\n NOTE: If the DB validation fails then data directory of the etcd-member is removed and etcd-member is removed from cluster membership, thus transitioning it to New state. The state transitions from New state are depicted by this section.\n Deterministic Etcd Member Creation/Restart During Scale-Up Bootstrap information:\nWhen an etcd-member starts, then it needs to find out:\n If it should join an existing cluster or start a new cluster.\n If it should add itself as a Learner or directly start as a voting member.\n Issue with the current approach:\nAt present, this is facilitated by three things:\n During scale-up, druid adds an annotation gardener.cloud/scaled-to-multi-node to the StatefulSet. 
Each etcd-member looks up this annotation.\n backup-sidecar attempts to fetch etcd cluster member-list and checks if this etcd-member is already part of the cluster.\n Size of the cluster, determined by checking initial-cluster in the etcd config.\n Druid adds an annotation gardener.cloud/scaled-to-multi-node on the StatefulSet which is then shared by all etcd-members irrespective of the starting state of an etcd-member (as Learner or Voting-Member). This especially creates an issue for the current leader (often pod with index 0) during the scale-up of an etcd cluster as described in this issue.\nIt has been agreed that the current solution to this issue is a quick and dirty fix and needs to be revisited to be uniformly applied to all etcd-members. The authors propose to provide a more deterministic approach to scale-up using the EtcdMember resource.\nNew approach\nInstead of adding an annotation gardener.cloud/scaled-to-multi-node on the StatefulSet, a new annotation druid.gardener.cloud/create-as-learner should be added by druid on an EtcdMember resource. This annotation will only be added to newly created members during scale-up.\nEach etcd-member should look at the following to deterministically compute the bootstrap information specified above:\n druid.gardener.cloud/create-as-learner annotation on its respective EtcdMember resource. This new annotation will be honored in the following cases:\n When an etcd-member is created for the very first time.\n An etcd-member is restarted while it is in Starting state (PendingLearner and Learner sub-states).\n Etcd-cluster member list, to check if it is already part of the cluster.\n Existing etcd data directory and its validity.\n NOTE: When the etcd-member gets promoted to a voting-member, then it should remove the annotation on its respective EtcdMember resource.\n TLS Enablement for Peer Communication Etcd-members in a cluster use peer URL(s) to communicate amongst each other. 
If the advertised peer URL(s) for an etcd-member are updated then etcd mandates a restart of the etcd-member.\nDruid only supports toggling the transport level security for the advertised peer URL(s). To indicate that the etcd process within the etcd-member has the updated advertised peer URL(s), an annotation member.etcd.gardener.cloud/tls-enabled is added by the backup-sidecar container to the member lease object.\nDuring the reconciliation run for an Etcd resource in druid, if the reconciler detects a change in the TLS configuration of the advertised peer URL(s) then it will watch for the above mentioned annotation on the member lease. If the annotation has a value of false then it will trigger a restart of the etcd-member pod.\nThe authors propose to publish this information in the EtcdMember resource and not misuse member leases.\npeerTLSEnabled: \u003cbool\u003e Monitoring Backup Health Backup-sidecar takes delta and full snapshots, both periodically and when a threshold is reached. These backed-up snapshots are essential for restoration operations for bootstrapping an etcd cluster from 0 -\u003e 1 replicas. It is essential that the leading-backup-sidecar container, which is responsible for taking delta/full snapshots and uploading these snapshots to the configured backup store, publishes this information for druid to consume.\nAt present, information about backed-up snapshots (only latest-revision-number) is published by the leading-backup-sidecar container by updating Spec.HolderIdentity of the delta-snapshot and full-snapshot leases.\nDruid maintains conditions in the Etcd resource status, which include but are not limited to maintaining information on whether backups being taken for an etcd cluster are healthy (up-to-date) or stale (outdated with respect to a configured schedule). 
Druid computes these conditions using information from full/delta snapshot leases.\nIn order to provide a holistic view of the health of backups to human operators, druid requires additional information about the snapshots that are being backed-up. The authors propose to not misuse leases and instead publish the following snapshot information as part of the EtcdMember custom resource:\nsnapshots: lastFull: timestamp: \u003ctime of full snapshot\u003e name: \u003cname of the file that is uploaded\u003e size: \u003csize of the un-compressed snapshot file uploaded\u003e startRevision: \u003cstart revision of etcd db captured in the snapshot\u003e endRevision: \u003cend revision of etcd db captured in the snapshot\u003e lastDelta: timestamp: \u003ctime of delta snapshot\u003e name: \u003cname of the file that is uploaded\u003e size: \u003csize of the un-compressed snapshot file uploaded\u003e startRevision: \u003cstart revision of etcd db captured in the snapshot\u003e endRevision: \u003cend revision of etcd db captured in the snapshot\u003e While this information will primarily help druid compute accurate conditions regarding backup health from snapshot information and publish this to human operators, it could be further utilised by human operators to take remediatory actions (e.g. manually triggering a full or delta snapshot or further restarting the leader if the issue is still not resolved) if the backup is unhealthy.\nEnhanced Snapshot Compaction Druid can be configured to perform regular snapshot compactions for etcd clusters, to reduce the total number of delta snapshots to be restored if and when a DB restoration for an etcd cluster is required. Druid triggers a snapshot compaction job when the accumulated etcd events in the latest set of delta snapshots (taken after the last full snapshot) cross a specified threshold.\nAs described in Issue#591, scheduling compaction based only on the number of accumulated etcd events is not sufficient to ensure a successful compaction. 
This is specifically targeted for kubernetes clusters where each etcd event is larger in size owing to large spec or status fields or respective resources.\nDruid will now need information regarding snapshot sizes, and more importantly the total size of accumulated delta snapshots since the last full snapshot.\nThe authors propose to enhance the proposed snapshots field described in Use Case #3 with the following additional field:\nsnapshots: accumulatedDeltaSize: \u003ctotal size of delta snapshots since last full snapshot\u003e Druid can then use this information in addition to the existing revision information to decide to trigger an early snapshot compaction job. This effectively allows druid to be proactive in performing regular compactions for etcds receiving large events, reducing the probability of a failed snapshot compaction or restoration.\nEnhanced Defragmentation Reader is recommended to read Etcd Compaction \u0026 Defragmentation in order to understand the following terminology:\ndbSize - total storage space used by the etcd database\ndbSizeInUse - logical storage space used by the etcd database, not accounting for free pages in the DB due to etcd history compaction\nThe leading-backup-sidecar performs periodic defragmentations of the DBs of all the etcd-members in the cluster, controlled via a defragmentation cron schedule provided to each backup-sidecar. Defragmentation is a costly maintenance operation and causes a brief downtime to the etcd-member being defragmented, due to which the leading-backup-sidecar defragments each etcd-member sequentially. 
This ensures that only one etcd-member would be unavailable at any given time, thus avoiding an accidental quorum loss in the etcd cluster.\nThe authors propose to move the responsibility of orchestrating these individual defragmentations to druid due to the following reasons:\n Since each backup-sidecar only has knowledge of the health of its own etcd, it can only determine whether its own etcd can be defragmented or not, based on etcd-member health. Trying to defragment a different healthy etcd-member while another etcd-member is unhealthy would lead to a transient quorum loss. Each backup-sidecar is only a sidecar to its own etcd-member, and by good design principles, it must not be performing any cluster-wide maintenance operations, and this responsibility should remain with the etcd cluster operator. Additionally, defragmentation of an etcd DB becomes inevitable if the DB size exceeds the specified DB space quota, since the etcd DB then becomes read-only, ie no write operations on the etcd would be possible unless the etcd DB is defragmented and storage space is freed up. In order to automate this, druid will now need information about the etcd DB size from each member, specifically the leading etcd-member, so that a cluster-wide defragmentation can be triggered if the DB size reaches a certain threshold, as already described by this issue.\nThe authors propose to enhance each etcd-member to regularly publish information about the dbSize and dbSizeInUse so that druid may trigger defragmentation for the etcd cluster.\ndbSize: \u003cdb-size\u003e # e.g 6Gi dbSizeInUse: \u003cdb-size-in-use\u003e # e.g 3.5Gi Difference between dbSize and dbSizeInUse gives a clear indication of how much storage space would be freed up if a defragmentation is performed. If the difference is not significant (based on a configurable threshold provided to druid), then no defragmentation should be performed. 
This will ensure that druid does not perform frequent defragmentations that do not yield much benefit. The aim is to maximise the benefit of defragmentation, since this operation involves transient downtime for each etcd-member.\nMonitoring Defragmentations As discussed in the previous section, every etcd-member is defragmented periodically, and can also be defragmented based on the DB size reaching a certain threshold. It is beneficial for druid to have knowledge of this data from each etcd-member for the following reasons:\n [Diagnostics] It is expected that backup-sidecar will push relevant metrics and configure alerts on these metrics.\n [Operational] Derive status of defragmentation at etcd cluster level. In case of partial failures for a subset of etcd-members druid can potentially re-trigger defragmentation only for those etcd-members.\n The authors propose to capture this information as part of lastDefragmentation section in the EtcdMember resource.\nlastDefragmentation: startTime: \u003cstart time of defragmentation\u003e endTime: \u003cend time of defragmentation\u003e status: \u003cSucceeded | Failed\u003e message: \u003csuccess or failure message\u003e initialDBSize: \u003csize of etcd DB prior to defragmentation\u003e finalDBSize: \u003csize of etcd DB post defragmentation\u003e NOTE: Defragmentation is a cluster-wide operation, and insights derived from aggregating defragmentation data from individual etcd-members would be captured in the Etcd resource status\n Monitoring Restorations Each etcd-member may perform restoration of data multiple times throughout its lifecycle, possibly owing to data corruptions. It would be useful to capture this information as part of an EtcdMember resource, for the following use cases:\n [Diagnostics] It is expected that backup-sidecar will push a metric indicating failure to restore.\n [Operational] Restoration from backup-bucket only happens for a single node etcd cluster. 
If restoration is failing then druid cannot take any remediatory actions since there is no etcd quorum.\n The authors propose to capture this information under lastRestoration section in the EtcdMember resource.\nlastRestoration: status: \u003cFailed | Success | In-Progress\u003e reason: \u003creason-code for status\u003e message: \u003chuman readable message for status\u003e startTime: \u003cstart time of restoration\u003e endTime: \u003cend time of restoration\u003e Authors have considered the following cases to better understand how errors during restoration will be handled:\nCase #1 - Failure to connect to Provider Object Store\nAt present full and delta snapshots are downloaded during restoration. If there is a failure then initialization status transitions to Failed followed by New which forces etcd-wrapper to trigger the initialization again. This in a way forces a retry and currently there is no limit on the number of attempts.\nAuthors propose to improve the retry logic but keep the overall behavior of not forcing a container restart the same.\nCase #2 - Read-Only Mounted volume\nIf a mounted volume which is used to create the etcd data directory turns read-only then authors propose to capture this state via EtcdMember.\nAuthors propose that druid should initiate recovery by deleting the PVC for this etcd-member and letting StatefulSet controller re-create the Pod and the PVC. Removing PVC and deleting the pod is considered safe because:\n Data directory is present and the DB is corrupt, resulting in an unusable etcd. Data directory is not present but any attempt to create a directory structure fails due to read-only FS. In both these cases there is no side-effect of deleting the PVC and the Pod.\nCase #3 - Revision mismatch\nThere is currently an issue in backup-sidecar which results in a revision mismatch in the snapshots (full/delta) taken by the leading backup-sidecar container. This results in a restoration failure. 
One occurrence of such an issue has been captured in Issue#583. This occurrence points to a bug which should be fixed; however, there is a rare possibility that these snapshots (full/delta) get corrupted. In this rare situation, backup-sidecar should only raise an alert.\nAuthors propose that druid should not take any remediatory actions as this involves:\n Inspecting snapshots If the full snapshot is corrupt then a decision needs to be taken to recover from the last full snapshot as the base snapshot. This can result in data loss and therefore needs manual intervention. If a delta snapshot is corrupt, then recovery can be done till the corrupt revision in the delta snapshot. Since this will also result in a loss of data, this decision needs to be taken by an operator. Monitoring Volume Mismatches Each etcd-member checks for possible etcd data volume mismatches, based on which it decides whether to start the etcd process or not, but this information is not captured anywhere today. It would be beneficial to capture this information as part of the EtcdMember resource so that a human operator may check this and manually fix the underlying problem with the wrong volume being attached or mounted to an etcd-member pod.\nThe authors propose to capture this information under volumeMismatches section in the EtcdMember resource.\nvolumeMismatches: - identifiedAt: \u003ctime at which wrong volume mount was identified\u003e fixedAt: \u003ctime at which correct volume was mounted\u003e volumeID: \u003cvolume ID of wrong volume that got mounted\u003e numRestarts: \u003cnum of etcd-member restarts that were attempted\u003e Each entry under volumeMismatches will be for a unique volumeID. If there is a pod restart and it results in yet another unexpected volumeID (different from the already captured volumeIDs) then a new entry will get created. 
numRestarts denotes the number of restarts seen by the etcd-member for a specific volumeID.\nBased on information from the volumeMismatches section, druid may choose to perform rudimentary remediatory actions as simple as restarting the member pod to force a possible rescheduling of the pod to a different node which could potentially force the correct volume to be mounted to the member.\nCustom Resource API Spec vs Status Information that is captured in the etcd-member custom resource could be represented either as EtcdMember.Status or EtcdMemberState.Spec.\nGardener has a similar need to capture a shoot state and they have taken the decision to represent it via ShootState resource where the state or status of a shoot is captured as part of the Spec field in the ShootState custom resource.\nThe authors wish to instead align themselves with the K8S API conventions and choose to use EtcdMember custom resource and capture the status of each member in Status field of this resource. This has the following advantages:\n Spec represents a desired state of a resource and what is intended to be captured is the As-Is state of a resource which Status is meant to capture. 
Therefore, semantically using Status is the correct choice.\n Not mis-using Spec now to represent As-Is state provides us with a choice to extend the custom resource with any future need for a Spec a.k.a desired state.\n Representing State Transitions The authors propose to use a custom representation for states, sub-states and transitions.\nConsider the following representation:\ntransitions: - state: \u003cname of the state that the etcd-member has transitioned to\u003e subState: \u003cname of the sub-state if any\u003e reason: \u003creason code for the transition\u003e transitionTime: \u003ctime of transition to this state\u003e message: \u003cdetailed message if any\u003e As an example, consider the following transitions which represent addition of an etcd-member during scale-up of an etcd cluster, followed by a restart of the etcd-member which detects a corrupt DB:\nstatus: transitions: - state: New subState: New reason: ClusterScaledUp transitionTime: \"2023-07-17T05:00:00Z\" message: \"New member added due to etcd cluster scale-up\" - state: Starting subState: PendingLearner reason: WaitingToJoinAsLearner transitionTime: \"2023-07-17T05:00:30Z\" message: \"Waiting to join the cluster as a learner\" - state: Starting subState: Learner reason: JoinedAsLearner transitionTime: \"2023-07-17T05:01:20Z\" message: \"Joined the cluster as a learner\" - state: Started subState: Follower reason: PromotedAsVotingMember transitionTime: \"2023-07-17T05:02:00Z\" message: \"Now in sync with leader, promoted as voting member\" - state: Initializing subState: DBValidationFull reason: DetectedPreviousUncleanExit transitionTime: \"2023-07-17T08:00:00Z\" message: \"Detected previous unclean exit, requires full DB validation\" - state: New subState: New reason: DBCorruptionDetected transitionTime: \"2023-07-17T08:01:30Z\" message: \"Detected DB corruption during initialization, removing member from cluster\" - state: Starting subState: PendingLearner reason: 
WaitingToJoinAsLearner transitionTime: \"2023-07-17T08:02:10Z\" message: \"Waiting to join the cluster as a learner\" - state: Starting subState: Learner reason: JoinedAsLearner transitionTime: \"2023-07-17T08:02:20Z\" message: \"Joined the cluster as a learner\" - state: Started subState: Follower reason: PromotedAsVotingMember transitionTime: \"2023-07-17T08:04:00Z\" message: \"Now in sync with leader, promoted as voting member\" Reason Codes The authors propose the following list of possible reason codes for transitions. This list is not exhaustive, and can be further enhanced to capture any new transitions in the future.\n Reason Transition From State (SubState) Transition To State (SubState) ClusterScaledUp | NewSingleNodeClusterCreated nil New DetectedPreviousCleanExit New | Started (Leader) | Started (Follower) Initializing (DBValidationSanity) DetectedPreviousUncleanExit New | Started (Leader) | Started (Follower) Initializing (DBValidationFull) DBValidationFailed Initializing (DBValidationSanity) | Initializing (DBValidationFull) Initializing (Restoration) | New DBValidationSucceeded Initializing (DBValidationSanity) | Initializing (DBValidationFull) Started (Leader) | Started (Follower) Initializing (Restoration)Succeeded Initializing (Restoration) Started (Leader) WaitingToJoinAsLearner New Starting (PendingLearner) JoinedAsLearner Starting (PendingLearner) Starting (Learner) PromotedAsVotingMember Starting (Learner) Started (Follower) GainedClusterLeadership Started (Follower) Started (Leader) LostClusterLeadership Started (Leader) Started (Follower) API EtcdMember The authors propose to add the EtcdMember custom resource API to etcd-druid APIs and initially introduce it with v1alpha1 version.\napiVersion: druid.gardener.cloud/v1alpha1 kind: EtcdMember metadata: labels: gardener.cloud/owned-by: \u003cname of parent Etcd resource\u003e name: \u003cname of the etcd-member\u003e namespace: \u003cnamespace | will be the same as that of parent Etcd 
resource\u003e ownerReferences: - apiVersion: druid.gardener.cloud/v1alpha1 blockOwnerDeletion: true controller: true kind: Etcd name: \u003cname of the parent Etcd resource\u003e uid: \u003cUID of the parent Etcd resource\u003e status: id: \u003cetcd-member id\u003e clusterID: \u003cetcd cluster id\u003e peerTLSEnabled: \u003cbool\u003e dbSize: \u003cdb-size\u003e dbSizeInUse: \u003cdb-size-in-use\u003e snapshots: lastFull: timestamp: \u003ctime of full snapshot\u003e name: \u003cname of the file that is uploaded\u003e size: \u003csize of the un-compressed snapshot file uploaded\u003e startRevision: \u003cstart revision of etcd db captured in the snapshot\u003e endRevision: \u003cend revision of etcd db captured in the snapshot\u003e lastDelta: timestamp: \u003ctime of delta snapshot\u003e name: \u003cname of the file that is uploaded\u003e size: \u003csize of the un-compressed snapshot file uploaded\u003e startRevision: \u003cstart revision of etcd db captured in the snapshot\u003e endRevision: \u003cend revision of etcd db captured in the snapshot\u003e accumulatedDeltaSize: \u003ctotal size of delta snapshots since last full snapshot\u003e lastRestoration: type: \u003cFromSnapshot | FromLeader\u003e status: \u003cFailed | Success | In-Progress\u003e startTime: \u003cstart time of restoration\u003e endTime: \u003cend time of restoration\u003e lastDefragmentation: startTime: \u003cstart time of defragmentation\u003e endTime: \u003cend time of defragmentation\u003e reason: message: initialDBSize: \u003csize of etcd DB prior to defragmentation\u003e finalDBSize: \u003csize of etcd DB post defragmentation\u003e volumeMismatches: - identifiedAt: \u003ctime at which wrong volume mount was identified\u003e fixedAt: \u003ctime at which correct volume was mounted\u003e volumeID: \u003cvolume ID of wrong volume that got mounted\u003e numRestarts: \u003cnum of pod restarts that were attempted\u003e transitions: - state: \u003cname of the state that the etcd-member has 
transitioned to\u003e subState: \u003cname of the sub-state if any\u003e reason: \u003creason code for the transition\u003e transitionTime: \u003ctime of transition to this state\u003e message: \u003cdetailed message if any\u003e Etcd Authors propose the following changes to the Etcd API:\n In the Etcd.Status resource API, member status is computed and stored. This field will be marked as deprecated and in a later version of druid it will be removed. In its place, the authors propose to introduce the following: type EtcdStatus struct { // MemberRefs contains references to all existing EtcdMember resources MemberRefs []CrossVersionObjectReference } In Etcd.Status resource API, PeerUrlTLSEnabled reflects the status of enabling TLS for peer communication across all etcd-members. Currently this field is not being used anywhere. In this proposal, the authors have also proposed that each EtcdMember resource should capture the status of TLS enablement of peer URL. The authors propose to relook at the need to have this field under EtcdStatus. Lifecycle of an EtcdMember Creation Druid creates an EtcdMember resource for every replica in etcd.Spec.Replicas during reconciliation of an etcd resource. For a fresh etcd cluster this is done prior to creation of the StatefulSet resource and for an existing cluster which has now been scaled-up, it is done prior to updating the StatefulSet resource.\nUpdation All fields in EtcdMember.Status are only updated by the corresponding etcd-member. Druid only consumes the information published via EtcdMember resources.\nDeletion Druid is responsible for deletion of all existing EtcdMember resources for an etcd cluster. 
There are three scenarios where an EtcdMember resource will be deleted:\n Deletion of etcd resource.\n Scale down of an etcd cluster to 0 replicas due to hibernation of the k8s control plane.\n Transient scale down of an etcd cluster to 0 replicas to recover from a quorum loss.\n Authors found no reason to retain EtcdMember resources when the etcd cluster is scaled down to 0 replicas since the information contained in each EtcdMember resource would no longer represent the current state of each member and would thus be stale. Any controller in druid which acts upon the EtcdMember.Status could potentially take incorrect actions.\nReconciliation Authors propose to introduce a new controller (let’s call it etcd-member-controller) which watches for changes to the EtcdMember resource(s). If a reconciliation of an Etcd resource is required as a result of change in EtcdMember status then this controller should enqueue an event and force a reconciliation via existing etcd-controller, thus preserving the single-actor-principle constraint which ensures deterministic changes to etcd cluster resources.\n NOTE: Further decisions w.r.t responsibility segregation will be taken during implementation and will not be documented in this proposal.\n Stale EtcdMember Status Handling It is possible that an etcd-member is unable to update its respective EtcdMember resource. The following are some of the implications which should be kept in mind while reconciling EtcdMember resource in druid:\n Druid sees stale state transitions (this assumes that the backup-sidecar attempts to update the state/sub-state in etcdMember.status.transitions with best attempt). There is currently no implication other than an operator seeing a stale state. dbSize and dbSizeInUse could not be updated. A consequence could be that druid continues to see high value for dbSize - dbSizeInUse for an extended amount of time. Druid should ensure that it does not trigger repeated defragmentations. 
If VolumeMismatches is stale, then druid should no longer attempt to recover by repeatedly restarting the pod. Failed restoration was recorded last and further updates to this array failed. Druid should not repeatedly take full-snapshots. If snapshots.accumulatedDeltaSize could not be updated, then druid should not schedule repeated compaction Jobs. Reference Disaster recovery | etcd\n etcd API Reference | etcd\n Raft Consensus Algorithm\n ","categories":"","description":"","excerpt":"DEP-04: EtcdMember Custom Resource Table of Contents DEP-04: …","ref":"/docs/other-components/etcd-druid/proposals/04-etcd-member-custom-resource/","tags":"","title":"EtcdMember Custom Resource"},{"body":"Excess Reserve Capacity Excess Reserve Capacity Goal Note Possible Approaches Approach 1: Enhance Machine-controller-manager to also entertain the excess machines Approach 2: Enhance Cluster-autoscaler by simulating fake pods in it Approach 3: Enhance cluster-autoscaler to support pluggable scaling-events Approach 4: Make intelligent use of Low-priority pods Goal Currently, autoscaler optimizes the number of machines for a given application-workload. Along with effective resource utilization, this feature brings concern where, many times, when new application instances are created - they don’t find space in existing cluster. This leads the cluster-autoscaler to create new machines via MachineDeployment, which can take from 3-4 minutes to ~10 minutes, for the machine to really come-up and join the cluster. In turn, application-instances have to wait till new machines join the cluster.\nOne of the promising solutions to this issue is Excess Reserve Capacity. Idea is to keep a certain number of machines or percent of resources[cpu/memory] always available, so that new workload, in general, can be scheduled immediately unless huge spike in the workload. 
Also, the user should be given enough flexibility to choose how many resources or how many machines should be kept alive and non-utilized as this affects the Cost directly.\nNote We decided to go with Approach-4 which is based on low priority pods. Please find more details here: https://github.com/gardener/gardener/issues/254 Approach-3 looks more promising in long term, we may decide to adopt that in future based on developments/contributions in autoscaler-community. Possible Approaches Following are the possible approaches, we could think of so far.\nApproach 1: Enhance Machine-controller-manager to also entertain the excess machines Machine-controller-manager currently takes care of the machines in the shoot cluster starting from creation-deletion-health check to efficient rolling-update of the machines. From the architecture point of view, MachineSet makes sure that X number of machines are always running and healthy. MachineDeployment controller smartly uses this facility to perform rolling-updates.\n We can expand the scope of MachineDeployment controller to maintain excess number of machines by introducing new parallel independent controller named MachineTaint controller. This will result in MCM to include Machine, MachineSet, MachineDeployment, MachineSafety, MachineTaint controllers. MachineTaint controller does not need to introduce any new CRD - analogy fits where taint-controller also resides into kube-controller-manager.\n Only Job of MachineTaint controller will be:\n List all the Machines under each MachineDeployment. Maintain taints of noSchedule and noExecute on X latest MachineObjects. There should be an event-based informer mechanism where MachineTaintController gets to know about any Update/Delete/Create event of MachineObjects - in turn, maintains the noSchedule and noExecute taints on all the latest machines. - Why latest machines? 
- Whenever autoscaler decides to add new machines - essentially ScaleUp event - taints from the older machines are removed and newer machines get the taints. This way X number of Machines immediately becomes free for new pods to be scheduled. - During a ScaleDown event, autoscaler specifically mentions which machines should be deleted, and that should not bring any concerns. Though we will have to put proper label/annotation defined by autoscaler on taintedMachines, so that autoscaler does not consider the taintedMachines for deletion while scale-down. * Annotation on tainted node: \"cluster-autoscaler.kubernetes.io/scale-down-disabled\": \"true\" Implementation Details:\n Expect new optional field ExcessReplicas in MachineDeployment.Spec. MachineDeployment controller now adds both Spec.Replicas and Spec.ExcessReplicas[if provided], and considers that as a standard desiredReplicas. - Current working of MCM will not be affected if ExcessReplicas field is kept nil. MachineController currently reads the NodeObject and sets the MachineConditions in MachineObject. Machine-controller will now also read the taints/labels from the MachineObject - and maintains it on the NodeObject. We expect cluster-autoscaler to intelligently make use of the provided feature from MCM.\n CA gets the input of min:max:excess from Gardener. CA continues to set the MachineDeployment.Spec.Replicas as usual based on the application-workload. In addition, CA also sets the MachineDeployment.Spec.ExcessReplicas. Corner-case: * CA should decrement the excessReplicas field accordingly when desiredReplicas+excessReplicas on MachineDeployment goes beyond max. Approach 2: Enhance Cluster-autoscaler by simulating fake pods in it There was already an attempt by community to support this feature. 
Refer for details to: https://github.com/kubernetes/autoscaler/pull/77/files Approach 3: Enhance cluster-autoscaler to support pluggable scaling-events Forked version of cluster-autoscaler could be improved to plug-in the algorithm for excess-reserve capacity. Needs further discussion around upstream support. Create golang channel to separate the algorithms to trigger scaling (hard-coded in cluster-autoscaler, currently) from the algorithms about how to achieve the scaling (already pluggable in cluster-autoscaler). This kind of separation can help us introduce/plug-in new algorithms (such as those based on node resource utilisation) without affecting existing code-base too much while almost completely re-using the code-base for the actual scaling. Also this approach is not specific to our fork of cluster-autoscaler. It can be made upstream eventually as well. Approach 4: Make intelligent use of Low-priority pods Refer to: pod-priority-preemption TL; DR: High priority pods can preempt the low-priority pods which are already scheduled. Pre-create bunch[equivalent of X shoot-control-planes] of low-priority pods with priority of zero, then start creating the workload pods with better priority which will reschedule the low-priority pods or otherwise keep them in pending state if the limit for max-machines has been reached. This is still an alpha feature. ","categories":"","description":"","excerpt":"Excess Reserve Capacity Excess Reserve Capacity Goal Note Possible …","ref":"/docs/other-components/machine-controller-manager/proposals/excess_reserve_capacity/","tags":"","title":"Excess Reserve Capacity"},{"body":"ExposureClasses The Gardener API server provides a cluster-scoped ExposureClass resource. 
This resource is used to allow exposing the control plane of a Shoot cluster in various network environments like restricted corporate networks, DMZ, etc.\nBackground The ExposureClass resource is based on the concept for the RuntimeClass resource in Kubernetes.\nA RuntimeClass abstracts the installation of a certain container runtime (e.g., gVisor, Kata Containers) on all nodes or a subset of the nodes in a Kubernetes cluster. See Runtime Class for more information.\nIn contrast, an ExposureClass abstracts the ability to expose a Shoot cluster’s control plane in certain network environments (e.g., corporate networks, DMZ, internet) on all Seeds or a subset of the Seeds.\nExample: RuntimeClass and ExposureClass\napiVersion: node.k8s.io/v1 kind: RuntimeClass metadata: name: gvisor handler: gvisorconfig # scheduling: # nodeSelector: # env: prod --- kind: ExposureClass metadata: name: internet handler: internet-config # scheduling: # seedSelector: # matchLabels: # network/env: internet Similar to RuntimeClasses, ExposureClasses also define a .handler field reflecting the name reference for the corresponding CRI configuration of the RuntimeClass and the control plane exposure configuration for the ExposureClass.\nThe CRI handler for RuntimeClasses is usually installed by an administrator (e.g., via a DaemonSet which installs the corresponding container runtime on the nodes). The control plane exposure configuration for ExposureClasses will also be provided by an administrator. This exposure configuration is part of the gardenlet configuration, as this component is responsible for configuring the control plane accordingly. See the gardenlet Configuration ExposureClass Handlers section for more information.\nThe RuntimeClass also supports the selection of a node subset (which have the respective controller runtime binaries installed) for pod scheduling via its .scheduling section. 
The ExposureClass also supports the selection of a subset of available Seed clusters whose gardenlet is capable of applying the exposure configuration for the Shoot control plane accordingly via its .scheduling section.\nUsage by a Shoot A Shoot can reference an ExposureClass via the .spec.exposureClassName field.\n ⚠️ When creating a Shoot resource, the Gardener scheduler will try to assign the Shoot to a Seed which will host its control plane.\n The scheduling behaviour can be influenced via the .spec.seedSelectors and/or .spec.tolerations fields in the Shoot. ExposureClasses can also contain scheduling instructions. If a Shoot is referencing an ExposureClass, then the scheduling instructions of both will be merged into the Shoot. Those unions of scheduling instructions might lead to a selection of a Seed which is not able to deal with the handler of the ExposureClass and the Shoot creation might end up in an error. In such case, the Shoot scheduling instructions should be revisited to check that they are not interfering with the ones from the ExposureClass. If this is not feasible, then the combination with the ExposureClass might not be possible and you need to contact your Gardener administrator.\n Example: Shoot and ExposureClass scheduling instructions merge flow Assuming there is the following Shoot which is referencing the ExposureClass below: apiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: abc namespace: garden-dev spec: exposureClassName: abc seedSelectors: matchLabels: env: prod --- apiVersion: core.gardener.cloud/v1beta1 kind: ExposureClass metadata: name: abc handler: abc scheduling: seedSelector: matchLabels: network: internal Both seedSelectors would be merged into the Shoot. 
The result would be the following: apiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: abc namespace: garden-dev spec: exposureClassName: abc seedSelectors: matchLabels: env: prod network: internal Now the Gardener Scheduler would try to find a Seed with those labels. If there are no Seeds with matching labels for the seed selector, then the Shoot will be unschedulable. If there are Seeds with matching labels for the seed selector, then the Shoot will be assigned to the best candidate after the scheduling strategy is applied, see Gardener Scheduler. If the Seed is not able to serve the ExposureClass handler abc, then the Shoot will end up in error state. If the Seed is able to serve the ExposureClass handler abc, then the Shoot will be created. gardenlet Configuration ExposureClass Handlers The gardenlet is responsible to realize the control plane exposure strategy defined in the referenced ExposureClass of a Shoot.\nTherefore, the GardenletConfiguration can contain an .exposureClassHandlers list with the respective configuration.\nExample of the GardenletConfiguration:\nexposureClassHandlers: - name: internet-config loadBalancerService: annotations: loadbalancer/network: internet - name: internal-config loadBalancerService: annotations: loadbalancer/network: internal sni: ingress: namespace: ingress-internal labels: network: internal Each gardenlet can define how the handler of a certain ExposureClass needs to be implemented for the Seed(s) where it is responsible for.\nThe .name is the name of the handler config and it must match to the .handler in the ExposureClass.\nAll control planes on a Seed are exposed via a load balancer, either a dedicated one or a central shared one. The load balancer service needs to be configured in a way that it is reachable from the target network environment. Therefore, the configuration of load balancer service need to be specified, which can be done via the .loadBalancerService section. 
The common way to influence load balancer service behaviour is via annotations, to which the respective cloud-controller-manager reacts and configures the infrastructure load balancer accordingly.\nThe control planes on a Seed will be exposed via a central load balancer and with Envoy via TLS SNI passthrough proxy. In this case, the gardenlet will install a dedicated ingress gateway (Envoy + load balancer + respective configuration) for each handler on the Seed. The configuration of the ingress gateways can be controlled via the .sni section in the same way as for the default ingress gateways.\n","categories":"","description":"","excerpt":"ExposureClasses The Gardener API server provides a cluster-scoped …","ref":"/docs/gardener/exposureclasses/","tags":"","title":"ExposureClasses"},{"body":"Contract: Extension Resource Gardener defines common procedures which must be completed to create a functioning shoot cluster. Well-known steps are represented by special resources like Infrastructure, OperatingSystemConfig or DNS. These resources are typically reconciled by dedicated controllers setting up the infrastructure on the hyperscaler or managing DNS entries, etc.\nBut some requirements don’t match those special resources or don’t depend on being processed at a specific step in the creation / deletion flow of the shoot. They require a more generic hook. Therefore, Gardener offers the Extension resource.\nWhat is required to register and support an Extension type? Gardener creates one Extension resource per registered extension type in ControllerRegistration per shoot.\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerRegistration metadata: name: extension-example spec: resources: - kind: Extension type: example globallyEnabled: true workerlessSupported: true If spec.resources[].globallyEnabled is true, then an Extension resource of the given type is created for every shoot cluster. 
If set to false, the Extension resource is only created if configured in the Shoot manifest. In the case of a workerless Shoot, a globally enabled Extension resource is created only if spec.resources[].workerlessSupported is also set to true. If an extension configured in the spec of a workerless Shoot is not supported yet, the admission request will be rejected.\nThe Extension resources are created in the shoot namespace of the seed cluster.\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: example namespace: shoot--foo--bar spec: type: example providerConfig: {} Your controller needs to reconcile extensions.extensions.gardener.cloud. Since multiple Extension resources can exist per shoot, each one holds a spec.type field to let controllers check their responsibility (similar to all other extension resources of Gardener).\nProviderConfig It is possible to provide data in the Shoot resource which is copied to spec.providerConfig of the Extension resource.\n--- apiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: bar namespace: garden-foo spec: extensions: - type: example providerConfig: foo: bar ... results in\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: example namespace: shoot--foo--bar spec: type: example providerConfig: foo: bar Shoot Reconciliation Flow and Extension Status Gardener creates Extension resources as part of the Shoot reconciliation. Moreover, it is guaranteed that the Cluster resource exists before the Extension resource is created. Extensions can be reconciled at different stages during Shoot reconciliation depending on the defined extension lifecycle strategy in the respective ControllerRegistration resource. Please consult the Extension Lifecycle section for more information.\nFor an Extension controller it is crucial to maintain the Extension’s status correctly. 
At the end Gardener checks the status of each Extension and only reports a successful shoot reconciliation if the state of the last operation is Succeeded.\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: generation: 1 name: example namespace: shoot--foo--bar spec: type: example status: lastOperation: state: Succeeded observedGeneration: 1 ","categories":"","description":"","excerpt":"Contract: Extension Resource Gardener defines common procedures which …","ref":"/docs/gardener/extensions/extension/","tags":"","title":"Extension"},{"body":"Packages:\n extensions.gardener.cloud/v1alpha1 extensions.gardener.cloud/v1alpha1 Package v1alpha1 is the v1alpha1 version of the API.\nResource Types: BackupBucket BackupEntry Bastion Cluster ContainerRuntime ControlPlane DNSRecord Extension Infrastructure Network OperatingSystemConfig Worker BackupBucket BackupBucket is a specification for backup bucket.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string BackupBucket metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec BackupBucketSpec Specification of the BackupBucket. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n region string Region is the region of this bucket. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n status BackupBucketStatus (Optional) BackupEntry BackupEntry is a specification for backup Entry.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string BackupEntry metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. 
spec BackupEntrySpec Specification of the BackupEntry. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n backupBucketProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) BackupBucketProviderStatus contains the provider status that has been generated by the controller responsible for the BackupBucket resource.\n region string Region is the region of this Entry. This field is immutable.\n bucketName string BucketName is the name of backup bucket for this Backup Entry.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n status BackupEntryStatus (Optional) Bastion Bastion is a bastion or jump host that is dynamically created to provide SSH access to shoot nodes.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Bastion metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec BastionSpec Spec is the specification of this Bastion. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n userData []byte UserData is the base64-encoded user data for the bastion instance. This should contain code to provision the SSH key on the bastion instance. 
This field is immutable.\n ingress []BastionIngressPolicy Ingress controls from where the created bastion host should be reachable.\n status BastionStatus (Optional) Status is the bastion’s status.\n Cluster Cluster is a specification for a Cluster resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Cluster metadata Kubernetes meta/v1.ObjectMeta Refer to the Kubernetes API documentation for the fields of the metadata field. spec ClusterSpec cloudProfile k8s.io/apimachinery/pkg/runtime.RawExtension CloudProfile is a raw extension field that contains the cloudprofile resource referenced by the shoot that has to be reconciled.\n seed k8s.io/apimachinery/pkg/runtime.RawExtension Seed is a raw extension field that contains the seed resource referenced by the shoot that has to be reconciled.\n shoot k8s.io/apimachinery/pkg/runtime.RawExtension Shoot is a raw extension field that contains the shoot resource that has to be reconciled.\n ContainerRuntime ContainerRuntime is a specification for a container runtime resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string ContainerRuntime metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec ContainerRuntimeSpec Specification of the ContainerRuntime. If the object’s deletion timestamp is set, this field is immutable.\n binaryPath string BinaryPath is the Worker’s machine path where container runtime extensions should copy the binaries to.\n workerPool ContainerRuntimeWorkerPool WorkerPool identifies the worker pool of the Shoot. For each worker pool and type, Gardener deploys a ContainerRuntime CRD.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n status ContainerRuntimeStatus (Optional) ControlPlane ControlPlane is a specification for a ControlPlane resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string ControlPlane metadata Kubernetes meta/v1.ObjectMeta Refer to the Kubernetes API documentation for the fields of the metadata field. spec ControlPlaneSpec Specification of the ControlPlane. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n purpose Purpose (Optional) Purpose contains the data if a cloud provider needs additional components in order to expose the control plane. This field is immutable.\n infrastructureProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) InfrastructureProviderStatus contains the provider status that has been generated by the controller responsible for the Infrastructure resource.\n region string Region is the region of this control plane. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n status ControlPlaneStatus (Optional) DNSRecord DNSRecord is a specification for a DNSRecord resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string DNSRecord metadata Kubernetes meta/v1.ObjectMeta Refer to the Kubernetes API documentation for the fields of the metadata field. spec DNSRecordSpec Specification of the DNSRecord. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n region string (Optional) Region is the region of this DNS record. If not specified, the region specified in SecretRef will be used. If that is also not specified, the extension controller will use its default region.\n zone string (Optional) Zone is the DNS hosted zone of this DNS record. If not specified, it will be determined automatically by getting all hosted zones of the account and searching for the longest zone name that is a suffix of Name.\n name string Name is the fully qualified domain name, e.g. “api.”. This field is immutable.\n recordType DNSRecordType RecordType is the DNS record type. Only A, CNAME, and TXT records are currently supported. This field is immutable.\n values []string Values is a list of IP addresses for A records, a single hostname for CNAME records, or a list of texts for TXT records.\n ttl int64 (Optional) TTL is the time to live in seconds. Defaults to 120.\n status DNSRecordStatus (Optional) Extension Extension is a specification for an Extension resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Extension metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec ExtensionSpec Specification of the Extension. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n status ExtensionStatus (Optional) Infrastructure Infrastructure is a specification for cloud provider infrastructure.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Infrastructure metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec InfrastructureSpec Specification of the Infrastructure. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n region string Region is the region of this infrastructure. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider credentials.\n sshPublicKey []byte (Optional) SSHPublicKey is the public SSH key that should be used with this infrastructure.\n status InfrastructureStatus (Optional) Network Network is the specification for cluster networking.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Network metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec NetworkSpec Specification of the Network. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n podCIDR string PodCIDR defines the CIDR that will be used for pods. This field is immutable.\n serviceCIDR string ServiceCIDR defines the CIDR that will be used for services. This field is immutable.\n ipFamilies []IPFamily (Optional) IPFamilies specifies the IP protocol versions to use for shoot networking. 
This field is immutable. See https://github.com/gardener/gardener/blob/master/docs/usage/ipv6.md\n status NetworkStatus (Optional) OperatingSystemConfig OperatingSystemConfig is a specification for an OperatingSystemConfig resource\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string OperatingSystemConfig metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec OperatingSystemConfigSpec Specification of the OperatingSystemConfig. If the object’s deletion timestamp is set, this field is immutable.\n criConfig CRIConfig (Optional) CRIConfig is a structure that contains configurations of the CRI library\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n purpose OperatingSystemConfigPurpose Purpose describes how the result of this OperatingSystemConfig is used by Gardener. Either it gets sent to the Worker extension controller to bootstrap a VM, or it is downloaded by the gardener-node-agent already running on a bootstrapped VM. This field is immutable.\n units []Unit (Optional) Units is a list of units for the operating system configuration (usually, systemd units).\n files []File (Optional) Files is a list of files that should get written to the host’s file system.\n status OperatingSystemConfigStatus (Optional) Worker Worker is a specification for a Worker resource.\n Field Description apiVersion string extensions.gardener.cloud/v1alpha1 kind string Worker metadata Kubernetes meta/v1.ObjectMeta (Optional) Refer to the Kubernetes API documentation for the fields of the metadata field. spec WorkerSpec Specification of the Worker. If the object’s deletion timestamp is set, this field is immutable.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n infrastructureProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) InfrastructureProviderStatus is a raw extension field that contains the provider status that has been generated by the controller responsible for the Infrastructure resource.\n region string Region is the name of the region where the worker pool should be deployed to. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n sshPublicKey []byte (Optional) SSHPublicKey is the public SSH key that should be used with these workers.\n pools []WorkerPool Pools is a list of worker pools.\n status WorkerStatus (Optional) BackupBucketSpec (Appears on: BackupBucket) BackupBucketSpec is the spec for a BackupBucket resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n region string Region is the region of this bucket. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n BackupBucketStatus (Appears on: BackupBucket) BackupBucketStatus is the status for a BackupBucket resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n generatedSecretRef Kubernetes core/v1.SecretReference (Optional) GeneratedSecretRef is a reference to the secret generated by the backup bucket, which will have object store specific credentials.\n BackupEntrySpec (Appears on: BackupEntry) BackupEntrySpec is the spec for a BackupEntry resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n backupBucketProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) BackupBucketProviderStatus contains the provider status that has been generated by the controller responsible for the BackupBucket resource.\n region string Region is the region of this Entry. This field is immutable.\n bucketName string BucketName is the name of backup bucket for this Backup Entry.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the credentials to access object store.\n BackupEntryStatus (Appears on: BackupEntry) BackupEntryStatus is the status for a BackupEntry resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n BastionIngressPolicy (Appears on: BastionSpec) BastionIngressPolicy represents an ingress policy for SSH bastion hosts.\n Field Description ipBlock Kubernetes networking/v1.IPBlock IPBlock defines an IP block that is allowed to access the bastion.\n BastionSpec (Appears on: Bastion) BastionSpec contains the specification for an SSH bastion host.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n userData []byte UserData is the base64-encoded user data for the bastion instance. This should contain code to provision the SSH key on the bastion instance. This field is immutable.\n ingress []BastionIngressPolicy Ingress controls from where the created bastion host should be reachable.\n BastionStatus (Appears on: Bastion) BastionStatus holds the most recently observed status of the Bastion.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) 
DefaultStatus is a structure containing common fields used by all extension resources.\n ingress Kubernetes core/v1.LoadBalancerIngress (Optional) Ingress is the external IP and/or hostname of the bastion host.\n CRIConfig (Appears on: OperatingSystemConfigSpec) CRIConfig contains configurations of the CRI library.\n Field Description name CRIName Name is a mandatory string containing the name of the CRI library. Supported values are containerd.\n cgroupDriver CgroupDriverName (Optional) CgroupDriver configures the CRI’s cgroup driver. Supported values are cgroupfs or systemd.\n containerd ContainerdConfig (Optional) ContainerdConfig is the containerd configuration. Only to be set for OperatingSystemConfigs with purpose ‘reconcile’.\n CRIName (string alias)\n (Appears on: CRIConfig) CRIName is a type alias for the CRI name string.\nCgroupDriverName (string alias)\n (Appears on: CRIConfig) CgroupDriverName is a string denoting the CRI cgroup driver.\nCloudConfig (Appears on: OperatingSystemConfigStatus) CloudConfig contains the generated output for the given operating system config spec. 
It contains a reference to a secret as the result may contain confidential data.\n Field Description secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the actual result of the generated cloud config.\n ClusterAutoscalerOptions (Appears on: WorkerPool) ClusterAutoscalerOptions contains the cluster autoscaler configurations for a worker pool.\n Field Description scaleDownUtilizationThreshold string (Optional) ScaleDownUtilizationThreshold defines the threshold in fraction (0.0 - 1.0) under which a node is being removed.\n scaleDownGpuUtilizationThreshold string (Optional) ScaleDownGpuUtilizationThreshold defines the threshold in fraction (0.0 - 1.0) of gpu resources under which a node is being removed.\n scaleDownUnneededTime Kubernetes meta/v1.Duration (Optional) ScaleDownUnneededTime defines how long a node should be unneeded before it is eligible for scale down.\n scaleDownUnreadyTime Kubernetes meta/v1.Duration (Optional) ScaleDownUnreadyTime defines how long an unready node should be unneeded before it is eligible for scale down.\n maxNodeProvisionTime Kubernetes meta/v1.Duration (Optional) MaxNodeProvisionTime defines how long cluster autoscaler should wait for a node to be provisioned.\n ClusterSpec (Appears on: Cluster) ClusterSpec is the spec for a Cluster resource.\n Field Description cloudProfile k8s.io/apimachinery/pkg/runtime.RawExtension CloudProfile is a raw extension field that contains the cloudprofile resource referenced by the shoot that has to be reconciled.\n seed k8s.io/apimachinery/pkg/runtime.RawExtension Seed is a raw extension field that contains the seed resource referenced by the shoot that has to be reconciled.\n shoot k8s.io/apimachinery/pkg/runtime.RawExtension Shoot is a raw extension field that contains the shoot resource that has to be reconciled.\n ContainerRuntimeSpec (Appears on: ContainerRuntime) ContainerRuntimeSpec is the spec for a ContainerRuntime resource.\n Field Description 
binaryPath string BinaryPath is the Worker’s machine path where container runtime extensions should copy the binaries to.\n workerPool ContainerRuntimeWorkerPool WorkerPool identifies the worker pool of the Shoot. For each worker pool and type, Gardener deploys a ContainerRuntime CRD.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n ContainerRuntimeStatus (Appears on: ContainerRuntime) ContainerRuntimeStatus is the status for a ContainerRuntime resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n ContainerRuntimeWorkerPool (Appears on: ContainerRuntimeSpec) ContainerRuntimeWorkerPool identifies a Shoot worker pool by its name and selector.\n Field Description name string Name specifies the name of the worker pool the container runtime should be available for. This field is immutable.\n selector Kubernetes meta/v1.LabelSelector Selector is the label selector used by the extension to match the nodes belonging to the worker pool.\n ContainerdConfig (Appears on: CRIConfig) ContainerdConfig contains configuration options for containerd.\n Field Description registries []RegistryConfig (Optional) Registries configures the registry hosts for containerd.\n sandboxImage string SandboxImage configures the sandbox image for containerd.\n plugins []PluginConfig (Optional) Plugins configures the plugins section in containerd’s config.toml.\n ControlPlaneSpec (Appears on: ControlPlane) ControlPlaneSpec is the spec of a ControlPlane resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n purpose Purpose (Optional) Purpose contains the data if a cloud provider needs additional components in order to expose the control plane. This field is immutable.\n infrastructureProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) InfrastructureProviderStatus contains the provider status that has been generated by the controller responsible for the Infrastructure resource.\n region string Region is the region of this control plane. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n ControlPlaneStatus (Appears on: ControlPlane) ControlPlaneStatus is the status of a ControlPlane resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n DNSRecordSpec (Appears on: DNSRecord) DNSRecordSpec is the spec of a DNSRecord resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n region string (Optional) Region is the region of this DNS record. If not specified, the region specified in SecretRef will be used. If that is also not specified, the extension controller will use its default region.\n zone string (Optional) Zone is the DNS hosted zone of this DNS record. If not specified, it will be determined automatically by getting all hosted zones of the account and searching for the longest zone name that is a suffix of Name.\n name string Name is the fully qualified domain name, e.g. “api.”. 
This field is immutable.\n recordType DNSRecordType RecordType is the DNS record type. Only A, CNAME, and TXT records are currently supported. This field is immutable.\n values []string Values is a list of IP addresses for A records, a single hostname for CNAME records, or a list of texts for TXT records.\n ttl int64 (Optional) TTL is the time to live in seconds. Defaults to 120.\n DNSRecordStatus (Appears on: DNSRecord) DNSRecordStatus is the status of a DNSRecord resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n zone string (Optional) Zone is the DNS hosted zone of this DNS record.\n DNSRecordType (string alias)\n (Appears on: DNSRecordSpec) DNSRecordType is a string alias.\nDataVolume (Appears on: WorkerPool) DataVolume contains information about a data volume.\n Field Description name string Name of the volume to make it referenceable.\n type string (Optional) Type is the type of the volume.\n size string Size is the size of the volume.\n encrypted bool (Optional) Encrypted determines if the volume should be encrypted.\n DefaultSpec (Appears on: BackupBucketSpec, BackupEntrySpec, BastionSpec, ContainerRuntimeSpec, ControlPlaneSpec, DNSRecordSpec, ExtensionSpec, InfrastructureSpec, NetworkSpec, OperatingSystemConfigSpec, WorkerSpec) DefaultSpec contains common spec fields for every extension resource.\n Field Description type string Type contains the instance of the resource’s kind.\n class ExtensionClass (Optional) Class holds the extension class used to control the responsibility for multiple provider extensions.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the provider specific configuration.\n DefaultStatus (Appears on: BackupBucketStatus, BackupEntryStatus, BastionStatus, ContainerRuntimeStatus, ControlPlaneStatus, DNSRecordStatus, ExtensionStatus, 
InfrastructureStatus, NetworkStatus, OperatingSystemConfigStatus, WorkerStatus) DefaultStatus contains common status fields for every extension resource.\n Field Description providerStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderStatus contains provider-specific status.\n conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of a Seed’s current state.\n lastError github.com/gardener/gardener/pkg/apis/core/v1beta1.LastError (Optional) LastError holds information about the last occurred error during an operation.\n lastOperation github.com/gardener/gardener/pkg/apis/core/v1beta1.LastOperation (Optional) LastOperation holds information about the last operation on the resource.\n observedGeneration int64 ObservedGeneration is the most recent generation observed for this resource.\n state k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) State can be filled by the operating controller with whatever data it needs.\n resources []github.com/gardener/gardener/pkg/apis/core/v1beta1.NamedResourceReference (Optional) Resources holds a list of named resource references that can be referred to in the state by their names.\n DropIn (Appears on: Unit) DropIn is a drop-in configuration for a systemd unit.\n Field Description name string Name is the name of the drop-in.\n content string Content is the content of the drop-in.\n ExtensionClass (string alias)\n (Appears on: DefaultSpec) ExtensionClass is a string alias for an extension class.\nExtensionSpec (Appears on: Extension) ExtensionSpec is the spec for an Extension resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) 
DefaultSpec is a structure containing common fields used by all extension resources.\n ExtensionStatus (Appears on: Extension) ExtensionStatus is the status for an Extension resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n File (Appears on: OperatingSystemConfigSpec, OperatingSystemConfigStatus) File is a file that should get written to the host’s file system. The content can either be inlined or referenced from a secret in the same namespace.\n Field Description path string Path is the path of the file system where the file should get written to.\n permissions int32 (Optional) Permissions describes with which permissions the file should get written to the file system. If no permissions are set, the operating system’s defaults are used.\n content FileContent Content describes the file’s content.\n FileCodecID (string alias)\n FileCodecID is the id of a FileCodec for cloud-init scripts.\nFileContent (Appears on: File) FileContent can either reference a secret or contain inline configuration.\n Field Description secretRef FileContentSecretRef (Optional) SecretRef is a struct that contains information about the referenced secret.\n inline FileContentInline (Optional) Inline is a struct that contains information about the inlined data.\n transmitUnencoded bool (Optional) TransmitUnencoded set to true will ensure that the os-extension does not encode the file content when sent to the node. 
This for example can be used to manipulate the clear-text content before it reaches the node.\n imageRef FileContentImageRef (Optional) ImageRef describes a container image which contains a file.\n FileContentImageRef (Appears on: FileContent) FileContentImageRef describes a container image which contains a file\n Field Description image string Image contains the container image repository with tag.\n filePathInImage string FilePathInImage contains the path in the image to the file that should be extracted.\n FileContentInline (Appears on: FileContent) FileContentInline contains keys for inlining a file content’s data and encoding.\n Field Description encoding string Encoding is the file’s encoding (e.g. base64).\n data string Data is the file’s data.\n FileContentSecretRef (Appears on: FileContent) FileContentSecretRef contains keys for referencing a file content’s data from a secret in the same namespace.\n Field Description name string Name is the name of the secret.\n dataKey string DataKey is the key in the secret’s .data field that should be read.\n IPFamily (string alias)\n (Appears on: NetworkSpec) IPFamily is a type for specifying an IP protocol version to use in Gardener clusters.\nInfrastructureSpec (Appears on: Infrastructure) InfrastructureSpec is the spec for an Infrastructure resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n region string Region is the region of this infrastructure. 
This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider credentials.\n sshPublicKey []byte (Optional) SSHPublicKey is the public SSH key that should be used with this infrastructure.\n InfrastructureStatus (Appears on: Infrastructure) InfrastructureStatus is the status for an Infrastructure resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n nodesCIDR string (Optional) NodesCIDR is the CIDR of the node network that was optionally created by the acting extension controller. This might be needed in environments in which the CIDR for the network for the shoot worker node cannot be statically defined in the Shoot resource but must be computed dynamically.\n egressCIDRs []string (Optional) EgressCIDRs is a list of CIDRs used by the shoot as the source IP for egress traffic. 
For certain environments the egress IPs may not be stable, in which case the extension controller may opt to not populate this field.\n networking InfrastructureStatusNetworking (Optional) Networking contains information about cluster networking such as CIDRs.\n InfrastructureStatusNetworking (Appears on: InfrastructureStatus) InfrastructureStatusNetworking is a structure containing information about the node, service and pod network ranges.\n Field Description pods []string (Optional) Pods are the CIDRs of the pod network.\n nodes []string (Optional) Nodes are the CIDRs of the node network.\n services []string (Optional) Services are the CIDRs of the service network.\n MachineDeployment (Appears on: WorkerStatus) MachineDeployment is a created machine deployment.\n Field Description name string Name is the name of the MachineDeployment resource.\n minimum int32 Minimum is the minimum number for this machine deployment.\n maximum int32 Maximum is the maximum number for this machine deployment.\n MachineImage (Appears on: WorkerPool) MachineImage contains logical information about the name and the version of the machine image that should be used. The logical information must be mapped to the provider-specific information (e.g., AMIs, …) by the provider itself.\n Field Description name string Name is the logical name of the machine image.\n version string Version is the version of the machine image.\n NetworkSpec (Appears on: Network) NetworkSpec is the spec for a Network resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n podCIDR string PodCIDR defines the CIDR that will be used for pods. This field is immutable.\n serviceCIDR string ServiceCIDR defines the CIDR that will be used for services. This field is immutable.\n ipFamilies []IPFamily (Optional) IPFamilies specifies the IP protocol versions to use for shoot networking. 
This field is immutable. See https://github.com/gardener/gardener/blob/master/docs/usage/ipv6.md\n NetworkStatus (Appears on: Network) NetworkStatus is the status for a Network resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n NodeTemplate (Appears on: WorkerPool) NodeTemplate contains information about the expected node properties.\n Field Description capacity Kubernetes core/v1.ResourceList Capacity represents the expected Node capacity.\n Object Object is an extension object resource.\nOperatingSystemConfigPurpose (string alias)\n (Appears on: OperatingSystemConfigSpec) OperatingSystemConfigPurpose is a string alias.\nOperatingSystemConfigSpec (Appears on: OperatingSystemConfig) OperatingSystemConfigSpec is the spec for an OperatingSystemConfig resource.\n Field Description criConfig CRIConfig (Optional) CRIConfig is a structure that contains configurations of the CRI library.\n DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n purpose OperatingSystemConfigPurpose Purpose describes how the result of this OperatingSystemConfig is used by Gardener. Either it gets sent to the Worker extension controller to bootstrap a VM, or it is downloaded by the gardener-node-agent already running on a bootstrapped VM. This field is immutable.\n units []Unit (Optional) Units is a list of units for the operating system configuration (usually, systemd units).\n files []File (Optional) Files is a list of files that should get written to the host’s file system.\n OperatingSystemConfigStatus (Appears on: OperatingSystemConfig) OperatingSystemConfigStatus is the status for an OperatingSystemConfig resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) 
DefaultStatus is a structure containing common fields used by all extension resources.\n extensionUnits []Unit (Optional) ExtensionUnits is a list of additional systemd units provided by the extension.\n extensionFiles []File (Optional) ExtensionFiles is a list of additional files provided by the extension.\n cloudConfig CloudConfig (Optional) CloudConfig is a structure for containing the generated output for the given operating system config spec. It contains a reference to a secret as the result may contain confidential data.\n PluginConfig (Appears on: ContainerdConfig) PluginConfig contains configuration values for the containerd plugins section.\n Field Description op PluginPathOperation (Optional) Op is the operation for the given path. Possible values are ‘add’ and ‘remove’, defaults to ‘add’.\n path []string Path is a list of elements that construct the path in the plugins section.\n values k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON (Optional) Values are the values configured at the given path. If defined, it is expected as json format: - A given json object will be put to the given path. - If not configured, only the table entry to be created.\n PluginPathOperation (string alias)\n (Appears on: PluginConfig) PluginPathOperation is a type alias for operations at containerd’s plugin configuration.\nPurpose (string alias)\n (Appears on: ControlPlaneSpec) Purpose is a string alias.\nRegistryCapability (string alias)\n (Appears on: RegistryHost) RegistryCapability specifies an action a client can perform against a registry.\nRegistryConfig (Appears on: ContainerdConfig) RegistryConfig contains registry configuration options.\n Field Description upstream string Upstream is the upstream name of the registry.\n server string (Optional) Server is the URL to registry server of this upstream. 
It corresponds to the server field in the hosts.toml file, see https://github.com/containerd/containerd/blob/c51463010e0682f76dfdc10edc095e6596e2764b/docs/hosts.md#server-field for more information.\n hosts []RegistryHost Hosts are the registry hosts. It corresponds to the host fields in the hosts.toml file, see https://github.com/containerd/containerd/blob/c51463010e0682f76dfdc10edc095e6596e2764b/docs/hosts.md#host-fields-in-the-toml-table-format for more information.\n readinessProbe bool (Optional) ReadinessProbe determines if host registry endpoints should be probed before they are added to the containerd config.\n RegistryHost (Appears on: RegistryConfig) RegistryHost contains configuration values for a registry host.\n Field Description url string URL is the endpoint address of the registry mirror.\n capabilities []RegistryCapability Capabilities determine what operations a host is capable of performing. Defaults to - pull - resolve\n caCerts []string CACerts are paths to public key certificates used for TLS.\n Spec Spec is the spec section of an Object.\nStatus Status is the status of an Object.\nUnit (Appears on: OperatingSystemConfigSpec, OperatingSystemConfigStatus) Unit is a unit for the operating system configuration (usually, a systemd unit).\n Field Description name string Name is the name of a unit.\n command UnitCommand (Optional) Command is the unit’s command.\n enable bool (Optional) Enable describes whether the unit is enabled or not.\n content string (Optional) Content is the unit’s content.\n dropIns []DropIn (Optional) DropIns is a list of drop-ins for this unit.\n filePaths []string FilePaths is a list of files the unit depends on. If any file changes a restart of the dependent unit will be triggered. 
For each FilePath there must exist a File with matching Path in OperatingSystemConfig.Spec.Files.\n UnitCommand (string alias)\n (Appears on: Unit) UnitCommand is a string alias.\nVolume (Appears on: WorkerPool) Volume contains information about the root disks that should be used for worker pools.\n Field Description name string (Optional) Name of the volume to make it referenceable.\n type string (Optional) Type is the type of the volume.\n size string Size is the size of the root volume.\n encrypted bool (Optional) Encrypted determines if the volume should be encrypted.\n WorkerPool (Appears on: WorkerSpec) WorkerPool is the definition of a specific worker pool.\n Field Description machineType string MachineType contains information about the machine type that should be used for this worker pool.\n maximum int32 Maximum is the maximum size of the worker pool.\n maxSurge k8s.io/apimachinery/pkg/util/intstr.IntOrString MaxSurge is the maximum number of VMs that are created during an update.\n maxUnavailable k8s.io/apimachinery/pkg/util/intstr.IntOrString MaxUnavailable is the maximum number of VMs that can be unavailable during an update.\n annotations map[string]string (Optional) Annotations is a map of key/value pairs for annotations for all the Node objects in this worker pool.\n labels map[string]string (Optional) Labels is a map of key/value pairs for labels for all the Node objects in this worker pool.\n taints []Kubernetes core/v1.Taint (Optional) Taints is a list of taints for all the Node objects in this worker pool.\n machineImage MachineImage MachineImage contains logical information about the name and the version of the machine image that should be used. 
The logical information must be mapped to the provider-specific information (e.g., AMIs, …) by the provider itself.\n minimum int32 Minimum is the minimum size of the worker pool.\n name string Name is the name of this worker pool.\n nodeAgentSecretName string (Optional) NodeAgentSecretName is uniquely identifying selected aspects of the OperatingSystemConfig. If it changes, then the worker pool must be rolled.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is a provider specific configuration for the worker pool.\n userDataSecretRef Kubernetes core/v1.SecretKeySelector UserDataSecretRef references a Secret and a data key containing the data that is sent to the provider’s APIs when a new machine/VM that is part of this worker pool shall be spawned.\n volume Volume (Optional) Volume contains information about the root disks that should be used for this worker pool.\n dataVolumes []DataVolume (Optional) DataVolumes contains a list of additional worker volumes.\n kubeletDataVolumeName string (Optional) KubeletDataVolumeName contains the name of a dataVolume that should be used for storing kubelet state.\n zones []string (Optional) Zones contains information about availability zones for this worker pool.\n machineControllerManager github.com/gardener/gardener/pkg/apis/core/v1beta1.MachineControllerManagerSettings (Optional) MachineControllerManagerSettings contains configurations for different worker-pools. Eg. 
MachineDrainTimeout, MachineHealthTimeout.\n kubernetesVersion string (Optional) KubernetesVersion is the kubernetes version in this worker pool\n nodeTemplate NodeTemplate (Optional) NodeTemplate contains resource information of the machine which is used by Cluster Autoscaler to generate nodeTemplate during scaling a nodeGroup from zero\n architecture string (Optional) Architecture is the CPU architecture of the worker pool machines and machine image.\n clusterAutoscaler ClusterAutoscalerOptions (Optional) ClusterAutoscaler contains the cluster autoscaler configurations for the worker pool.\n WorkerSpec (Appears on: Worker) WorkerSpec is the spec for a Worker resource.\n Field Description DefaultSpec DefaultSpec (Members of DefaultSpec are embedded into this type.) DefaultSpec is a structure containing common fields used by all extension resources.\n infrastructureProviderStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) InfrastructureProviderStatus is a raw extension field that contains the provider status that has been generated by the controller responsible for the Infrastructure resource.\n region string Region is the name of the region where the worker pool should be deployed to. This field is immutable.\n secretRef Kubernetes core/v1.SecretReference SecretRef is a reference to a secret that contains the cloud provider specific credentials.\n sshPublicKey []byte (Optional) SSHPublicKey is the public SSH key that should be used with these workers.\n pools []WorkerPool Pools is a list of worker pools.\n WorkerStatus (Appears on: Worker) WorkerStatus is the status for a Worker resource.\n Field Description DefaultStatus DefaultStatus (Members of DefaultStatus are embedded into this type.) DefaultStatus is a structure containing common fields used by all extension resources.\n machineDeployments []MachineDeployment MachineDeployments is a list of created machine deployments. It will be used to e.g. 
configure the cluster-autoscaler properly.\n machineDeploymentsLastUpdateTime Kubernetes meta/v1.Time (Optional) MachineDeploymentsLastUpdateTime is the timestamp when the status.MachineDeployments slice was last updated.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n extensions.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/extensions/","tags":"","title":"Extensions"},{"body":"Feature Gates in Gardener This page contains an overview of the various feature gates an administrator can specify on different Gardener components.\nOverview Feature gates are a set of key=value pairs that describe Gardener features. You can turn these features on or off using the component configuration file for a specific component.\nEach Gardener component lets you enable or disable a set of feature gates that are relevant to that component. For example, this is the configuration of the gardenlet component.\nThe following tables are a summary of the feature gates that you can set on different Gardener components.\n The “Since” column contains the Gardener release when a feature is introduced or its release stage is changed. The “Until” column, if not empty, contains the last Gardener release in which you can still use a feature gate. If a feature is in the Alpha or Beta state, you can find the feature listed in the Alpha/Beta feature gate table. If a feature is stable you can find all stages for that feature listed in the Graduated/Deprecated feature gate table. The Graduated/Deprecated feature gate table also lists deprecated and withdrawn features. 
Feature Gates for Alpha or Beta Features Feature Default Stage Since Until HVPA false Alpha 0.31 HVPAForShootedSeed false Alpha 0.32 DefaultSeccompProfile false Alpha 1.54 IPv6SingleStack false Alpha 1.63 ShootForceDeletion false Alpha 1.81 1.90 ShootForceDeletion true Beta 1.91 UseNamespacedCloudProfile false Alpha 1.92 ShootManagedIssuer false Alpha 1.93 VPAForETCD false Alpha 1.94 1.96 VPAForETCD true Beta 1.97 VPAAndHPAForAPIServer false Alpha 1.95 1.100 VPAAndHPAForAPIServer true Beta 1.101 ShootCredentialsBinding false Alpha 1.98 NewWorkerPoolHash false Alpha 1.98 NewVPN false Alpha 1.104 Feature Gates for Graduated or Deprecated Features Feature Default Stage Since Until NodeLocalDNS false Alpha 1.7 NodeLocalDNS Removed 1.26 KonnectivityTunnel false Alpha 1.6 KonnectivityTunnel Removed 1.27 MountHostCADirectories false Alpha 1.11 1.25 MountHostCADirectories true Beta 1.26 1.27 MountHostCADirectories true GA 1.27 MountHostCADirectories Removed 1.30 DisallowKubeconfigRotationForShootInDeletion false Alpha 1.28 1.31 DisallowKubeconfigRotationForShootInDeletion true Beta 1.32 1.35 DisallowKubeconfigRotationForShootInDeletion true GA 1.36 DisallowKubeconfigRotationForShootInDeletion Removed 1.38 Logging false Alpha 0.13 1.40 Logging Removed 1.41 AdminKubeconfigRequest false Alpha 1.24 1.38 AdminKubeconfigRequest true Beta 1.39 1.41 AdminKubeconfigRequest true GA 1.42 1.49 AdminKubeconfigRequest Removed 1.50 UseDNSRecords false Alpha 1.27 1.38 UseDNSRecords true Beta 1.39 1.43 UseDNSRecords true GA 1.44 1.49 UseDNSRecords Removed 1.50 CachedRuntimeClients false Alpha 1.7 1.33 CachedRuntimeClients true Beta 1.34 1.44 CachedRuntimeClients true GA 1.45 1.49 CachedRuntimeClients Removed 1.50 DenyInvalidExtensionResources false Alpha 1.31 1.41 DenyInvalidExtensionResources true Beta 1.42 1.44 DenyInvalidExtensionResources true GA 1.45 1.49 DenyInvalidExtensionResources Removed 1.50 RotateSSHKeypairOnMaintenance false Alpha 1.28 1.44 RotateSSHKeypairOnMaintenance true 
Beta 1.45 1.47 RotateSSHKeypairOnMaintenance (deprecated) false Beta 1.48 1.50 RotateSSHKeypairOnMaintenance (deprecated) Removed 1.51 ShootMaxTokenExpirationOverwrite false Alpha 1.43 1.44 ShootMaxTokenExpirationOverwrite true Beta 1.45 1.47 ShootMaxTokenExpirationOverwrite true GA 1.48 1.50 ShootMaxTokenExpirationOverwrite Removed 1.51 ShootMaxTokenExpirationValidation false Alpha 1.43 1.45 ShootMaxTokenExpirationValidation true Beta 1.46 1.47 ShootMaxTokenExpirationValidation true GA 1.48 1.50 ShootMaxTokenExpirationValidation Removed 1.51 WorkerPoolKubernetesVersion false Alpha 1.35 1.45 WorkerPoolKubernetesVersion true Beta 1.46 1.49 WorkerPoolKubernetesVersion true GA 1.50 1.51 WorkerPoolKubernetesVersion Removed 1.52 DisableDNSProviderManagement false Alpha 1.41 1.49 DisableDNSProviderManagement true Beta 1.50 1.51 DisableDNSProviderManagement true GA 1.52 1.59 DisableDNSProviderManagement Removed 1.60 SecretBindingProviderValidation false Alpha 1.38 1.50 SecretBindingProviderValidation true Beta 1.51 1.52 SecretBindingProviderValidation true GA 1.53 1.54 SecretBindingProviderValidation Removed 1.55 SeedKubeScheduler false Alpha 1.15 1.54 SeedKubeScheduler false Deprecated 1.55 1.60 SeedKubeScheduler Removed 1.61 ShootCARotation false Alpha 1.42 1.50 ShootCARotation true Beta 1.51 1.56 ShootCARotation true GA 1.57 1.59 ShootCARotation Removed 1.60 ShootSARotation false Alpha 1.48 1.50 ShootSARotation true Beta 1.51 1.56 ShootSARotation true GA 1.57 1.59 ShootSARotation Removed 1.60 ReversedVPN false Alpha 1.22 1.41 ReversedVPN true Beta 1.42 1.62 ReversedVPN true GA 1.63 1.69 ReversedVPN Removed 1.70 ForceRestore Removed 1.66 SeedChange false Alpha 1.12 1.52 SeedChange true Beta 1.53 1.68 SeedChange true GA 1.69 1.72 SeedChange Removed 1.73 CopyEtcdBackupsDuringControlPlaneMigration false Alpha 1.37 1.52 CopyEtcdBackupsDuringControlPlaneMigration true Beta 1.53 1.68 CopyEtcdBackupsDuringControlPlaneMigration true GA 1.69 1.72 
CopyEtcdBackupsDuringControlPlaneMigration Removed 1.73 ManagedIstio false Alpha 1.5 1.18 ManagedIstio true Beta 1.19 ManagedIstio true Deprecated 1.48 1.69 ManagedIstio Removed 1.70 APIServerSNI false Alpha 1.7 1.18 APIServerSNI true Beta 1.19 APIServerSNI true Deprecated 1.48 1.72 APIServerSNI Removed 1.73 HAControlPlanes false Alpha 1.49 1.70 HAControlPlanes true Beta 1.71 1.72 HAControlPlanes true GA 1.73 1.73 HAControlPlanes Removed 1.74 FullNetworkPoliciesInRuntimeCluster false Alpha 1.66 1.70 FullNetworkPoliciesInRuntimeCluster true Beta 1.71 1.72 FullNetworkPoliciesInRuntimeCluster true GA 1.73 1.73 FullNetworkPoliciesInRuntimeCluster Removed 1.74 DisableScalingClassesForShoots false Alpha 1.73 1.78 DisableScalingClassesForShoots true Beta 1.79 1.80 DisableScalingClassesForShoots true GA 1.81 1.81 DisableScalingClassesForShoots Removed 1.82 ContainerdRegistryHostsDir false Alpha 1.77 1.85 ContainerdRegistryHostsDir true Beta 1.86 1.86 ContainerdRegistryHostsDir true GA 1.87 1.87 ContainerdRegistryHostsDir Removed 1.88 WorkerlessShoots false Alpha 1.70 1.78 WorkerlessShoots true Beta 1.79 1.85 WorkerlessShoots true GA 1.86 WorkerlessShoots Removed 1.88 MachineControllerManagerDeployment false Alpha 1.73 MachineControllerManagerDeployment true Beta 1.81 1.81 MachineControllerManagerDeployment true GA 1.82 1.91 MachineControllerManagerDeployment Removed 1.92 APIServerFastRollout true Beta 1.82 1.89 APIServerFastRollout true GA 1.90 1.91 APIServerFastRollout Removed 1.92 UseGardenerNodeAgent false Alpha 1.82 1.88 UseGardenerNodeAgent true Beta 1.89 UseGardenerNodeAgent true GA 1.90 1.91 UseGardenerNodeAgent Removed 1.92 CoreDNSQueryRewriting false Alpha 1.55 1.95 CoreDNSQueryRewriting true Beta 1.96 1.96 CoreDNSQueryRewriting true GA 1.97 1.100 CoreDNSQueryRewriting Removed 1.101 MutableShootSpecNetworkingNodes false Alpha 1.64 1.95 MutableShootSpecNetworkingNodes true Beta 1.96 1.96 MutableShootSpecNetworkingNodes true GA 1.97 1.100 
MutableShootSpecNetworkingNodes Removed 1.101 Using a Feature A feature can be in Alpha, Beta or GA stage. An Alpha feature means:\n Disabled by default. Might be buggy. Enabling the feature may expose bugs. Support for feature may be dropped at any time without notice. The API may change in incompatible ways in a later software release without notice. Recommended for use only in short-lived testing clusters, due to increased risk of bugs and lack of long-term support. A Beta feature means:\n Enabled by default. The feature is well tested. Enabling the feature is considered safe. Support for the overall feature will not be dropped, though details may change. The schema and/or semantics of objects may change in incompatible ways in a subsequent beta or stable release. When this happens, we will provide instructions for migrating to the next version. This may require deleting, editing, and re-creating API objects. The editing process may require some thought. This may require downtime for applications that rely on the feature. Recommended for only non-critical uses because of potential for incompatible changes in subsequent releases. Please do try Beta features and give feedback on them! After they exit beta, it may not be practical for us to make more changes.\n A General Availability (GA) feature is also referred to as a stable feature. It means:\n The feature is always enabled; you cannot disable it. The corresponding feature gate is no longer needed. Stable versions of features will appear in released software for many subsequent versions. List of Feature Gates Feature Relevant Components Description HVPA gardenlet, gardener-operator Enables simultaneous horizontal and vertical scaling in garden or seed clusters. HVPAForShootedSeed gardenlet Enables simultaneous horizontal and vertical scaling in managed seed (aka “shooted seed”) clusters. 
DefaultSeccompProfile gardenlet, gardener-operator Enables the defaulting of the seccomp profile for Gardener managed workload in the garden or seed to RuntimeDefault. IPv6SingleStack gardener-apiserver, gardenlet Allows creating seed and shoot clusters with IPv6 single-stack networking enabled in their spec (GEP-21). If enabled in gardenlet, the default behavior is unchanged, but setting ipFamilies=[IPv6] in the seedConfig is allowed. Only if the ipFamilies setting is changed, gardenlet behaves differently. ShootForceDeletion gardener-apiserver Allows forceful deletion of Shoots by annotating them with the confirmation.gardener.cloud/force-deletion annotation. UseNamespacedCloudProfile gardener-apiserver Enables usage of NamespacedCloudProfiles in Shoots. ShootManagedIssuer gardenlet Enables the shoot managed issuer functionality described in GEP 24. VPAForETCD gardenlet, gardener-operator Enables VPA for etcd-main and etcd-events, regardless of HVPA enablement. VPAAndHPAForAPIServer gardenlet, gardener-operator Enables an autoscaling mechanism for kube-apiserver of shoot or virtual garden clusters, and the gardener-apiserver. They are scaled simultaneously by VPA and HPA on the same metric (CPU and memory usage). The pod-thrashing cycle between VPA and HPA scaling on the same metric is avoided by configuring the HPA to scale on average usage (not on average utilization) and by picking the target average utilization values in sync with VPA’s allowed maximums. The feature gate takes precedence over the HVPA feature gate when they are both enabled. ShootCredentialsBinding gardener-apiserver Enables usage of CredentialsBindingName in Shoots. NewWorkerPoolHash gardenlet Enables usage of the new worker pool hash calculation. The new calculation supports rolling worker pools if kubeReserved, systemReserved, evictionHard or cpuManagerPolicy in the kubelet configuration are changed. All provider extensions must be upgraded to support this feature first. 
Existing worker pools are not immediately migrated to the new hash variant, since this would trigger the replacement of all nodes. The migration happens when a rolling update is triggered according to the old or new hash version calculation. NewVPN gardenlet Enables usage of the new implementation of the VPN (go rewrite) using an IPv6 transfer network. ","categories":"","description":"","excerpt":"Feature Gates in Gardener This page contains an overview of the …","ref":"/docs/gardener/deployment/feature_gates/","tags":"","title":"Feature Gates"},{"body":"Feature Gates in Etcd-Druid This page contains an overview of the various feature gates an administrator can specify on etcd-druid.\nOverview Feature gates are a set of key=value pairs that describe etcd-druid features. You can turn these features on or off by passing them to the --feature-gates CLI flag in the etcd-druid command.\nThe following tables are a summary of the feature gates that you can set on etcd-druid.\n The “Since” column contains the etcd-druid release when a feature is introduced or its release stage is changed. The “Until” column, if not empty, contains the last etcd-druid release in which you can still use a feature gate. If a feature is in the Alpha or Beta state, you can find the feature listed in the Alpha/Beta feature gate table. If a feature is stable you can find all stages for that feature listed in the Graduated/Deprecated feature gate table. The Graduated/Deprecated feature gate table also lists deprecated and withdrawn features. Feature Gates for Alpha or Beta Features Feature Default Stage Since Until UseEtcdWrapper false Alpha 0.19 0.21 UseEtcdWrapper true Beta 0.22 Feature Gates for Graduated or Deprecated Features Feature Default Stage Since Until Using a Feature A feature can be in Alpha, Beta or GA stage. An Alpha feature means:\n Disabled by default. Might be buggy. Enabling the feature may expose bugs. Support for feature may be dropped at any time without notice. 
The API may change in incompatible ways in a later software release without notice. Recommended for use only in short-lived testing clusters, due to increased risk of bugs and lack of long-term support. A Beta feature means:\n Enabled by default. The feature is well tested. Enabling the feature is considered safe. Support for the overall feature will not be dropped, though details may change. The schema and/or semantics of objects may change in incompatible ways in a subsequent beta or stable release. When this happens, we will provide instructions for migrating to the next version. This may require deleting, editing, and re-creating API objects. The editing process may require some thought. This may require downtime for applications that rely on the feature. Recommended for only non-critical uses because of potential for incompatible changes in subsequent releases. Please do try Beta features and give feedback on them! After they exit beta, it may not be practical for us to make more changes.\n A General Availability (GA) feature is also referred to as a stable feature. It means:\n The feature is always enabled; you cannot disable it. The corresponding feature gate is no longer needed. Stable versions of features will appear in released software for many subsequent versions. List of Feature Gates Feature Description UseEtcdWrapper Enables the use of etcd-wrapper image and a compatible version of etcd-backup-restore, along with component-specific configuration changes necessary for the usage of the etcd-wrapper image. ","categories":"","description":"","excerpt":"Feature Gates in Etcd-Druid This page contains an overview of the …","ref":"/docs/other-components/etcd-druid/deployment/feature-gates/","tags":"","title":"Feature Gates in Etcd-Druid"},{"body":"Reasoning Custom Resource Definition (CRD) is what you use to define a Custom Resource. 
This is a powerful way to extend Kubernetes capabilities beyond the default installation, adding any kind of API objects useful for your application.\nThe CustomResourceDefinition API provides a workflow for introducing and upgrading to new versions of a CustomResourceDefinition. In a scenario where a CRD adds support for a new version and switches its spec.versions.storage field to it (i.e., from v1beta1 to v1), existing objects are not migrated in etcd. For more information, see Versions in CustomResourceDefinitions.\nThis creates a mismatch between the requested and stored version for all clients (kubectl, KCM, etc.). When the CRD also declares the usage of a conversion webhook, it gets called whenever a client requests information about a resource that still exists in the old version. If the CRD is created by the end-user, the webhook runs on the shoot side, whereas controllers / kapi-servers run separated, as part of the control-plane. For the webhook to be reachable, a working VPN connection seed -\u003e shoot is essential. In scenarios where the VPN connection is broken, the kube-controller-manager eventually stops its garbage collection, as that requires it to list v1.PartialObjectMetadata for everything to build a dependency graph. Without the kube-controller-manager’s garbage collector, managed resources get stuck during update/rollout.\nBreaking Situations When a user upgrades to failureTolerance: node|zone, that will cause the VPN deployments to be replaced by statefulsets. However, as the VPN connection is broken upon teardown of the deployment, garbage collection will fail, leading to a situation that is stuck until an operator manually tackles it.\nSuch a situation can be avoided if the end-user has correctly configured CRDs containing conversion webhooks.\nChecking Problematic CRDs In order to make sure there are no version problematic CRDs, please run the script below in your shoot. 
It will return the name of the CRDs in case they have one of the following two problems:\n the returned version of the CR is different than what is maintained in the status.storedVersions field of the CRD. the status.storedVersions field of the CRD has more than 1 version defined. #!/bin/bash set -e -o pipefail echo \"Checking all CRDs in the cluster...\" for p in $(kubectl get crd | awk 'NR\u003e1' | awk '{print $1}'); do strategy=$(kubectl get crd \"$p\" -o json | jq -r .spec.conversion.strategy) if [ \"$strategy\" == \"Webhook\" ]; then crd_name=$(kubectl get crd \"$p\" -o json | jq -r .metadata.name) number_of_stored_versions=$(kubectl get crd \"$crd_name\" -o json | jq '.status.storedVersions | length') if [[ \"$number_of_stored_versions\" == 1 ]]; then returned_cr_version=$(kubectl get \"$crd_name\" -A -o json | jq -r '.items[] | .apiVersion' | sed 's:.*/::') if [ -z \"$returned_cr_version\" ]; then continue else variable=$(echo \"$returned_cr_version\" | xargs -n1 | sort -u | xargs) present_version=$(kubectl get crd \"$crd_name\" -o json | jq -cr '.status.storedVersions |.[]') if [[ $variable != \"$present_version\" ]]; then echo \"ERROR: Stored version differs from the version that CRs are being returned. $crd_name with conversion webhook needs to be fixed\" fi fi fi if [[ \"$number_of_stored_versions\" -gt 1 ]]; then returned_cr_version=$(kubectl get \"$crd_name\" -A -o json | jq -r '.items[] | .apiVersion' | sed 's:.*/::') if [ -z \"$returned_cr_version\" ]; then continue else echo \"ERROR: Too many stored versions defined. $crd_name with conversion webhook needs to be fixed\" fi fi fi done echo \"Problematic CRDs are reported above.\" Resolve CRDs Below are the steps needed to fix the CRDs reported by the script above.\nInspect all your CRDs that have conversion webhooks in place. 
If any of them has more than 1 version defined in its status.storedVersions field, then initiate migration as described in Option 2 in the Upgrade existing objects to a new stored version guide.\nFor convenience, we have provided the necessary steps below.\nNote Please test the following steps on a non-productive landscape to make sure that the new CR version doesn’t break any of your existing workloads. Please check/set the old CR version to storage:false and set the new CR version to storage:true.\nFor the sake of an example, let’s consider the two versions v1beta1 (old) and v1 (new).\nBefore:\nspec: versions: - name: v1beta1 ...... storage: true - name: v1 ...... storage: false After:\nspec: versions: - name: v1beta1 ...... storage: false - name: v1 ...... storage: true Convert custom-resources to the newest version.\nkubectl get \u003ccustom-resource-name\u003e -A -ojson | kubectl apply -f - Patch the CRD to keep only the latest version under storedVersions.\nkubectl patch customresourcedefinitions \u003ccrd-name\u003e --subresource='status' --type='merge' -p '{\"status\":{\"storedVersions\":[\"your-latest-cr-version\"]}}' ","categories":"","description":"","excerpt":"Reasoning Custom Resource Definition (CRD) is what you use to define a …","ref":"/docs/guides/administer-shoots/conversion-webhook/","tags":"","title":"Fix Problematic Conversion Webhooks"},{"body":"Force Deletion From v1.81, Gardener supports Shoot Force Deletion. All extension controllers should also properly support it. This document outlines some important points that extension maintainers should keep in mind to support force deletion in their extensions.\nOverall Principles The following principles should always be upheld:\n All resources pertaining to the extension and managed by it should be appropriately handled and cleaned up by the extension when force deletion is initiated. 
Implementation Details ForceDelete Actuator Methods Most extension controller implementations follow a common pattern where a generic Reconciler implementation delegates to an Actuator interface that contains the methods Reconcile, Delete, Migrate and Restore provided by the extension. A new method, ForceDelete has been added to all such Actuator interfaces; see the infrastructure Actuator interface as an example. The generic reconcilers call this method if the Shoot has annotation confirmation.gardener.cloud/force-deletion=true. Thus, it should be implemented by the extension controller to forcefully delete resources if not possible to delete them gracefully. If graceful deletion is possible, then in the ForceDelete, they can simply call the Delete method.\nExtension Controllers Based on Generic Actuators In practice, the implementation of many extension controllers (for example, the controlplane and worker controllers in most provider extensions) are based on a generic Actuator implementation that only delegates to extension methods for behavior that is truly provider-specific. In all such cases, the ForceDelete method has already been implemented with a method that should suit most of the extensions. If it doesn’t suit your extension, then the ForceDelete method needs to be overridden; see the Azure controlplane controller as an example.\nExtension Controllers Not Based on Generic Actuators The implementation of some extension controllers (for example, the infrastructure controllers in all provider extensions) are not based on a generic Actuator implementation. Such extension controllers must always provide a proper implementation of the ForceDelete method according to the above guidelines; see the AWS infrastructure controller as an example. 
In practice, this might result in code duplication between the different extensions, since the ForceDelete code is usually not OS-specific.\nSome General Implementation Examples If the extension deploys only resources in the shoot cluster not backed by infrastructure in third-party systems, then performing the regular deletion code (actuator.Delete) will suffice in the majority of cases. (e.g - https://github.com/gardener/gardener-extension-shoot-networking-filter/blob/1d95a483d803874e8aa3b1de89431e221a7d574e/pkg/controller/lifecycle/actuator.go#L175-L178) If the extension deploys resources which are backed by infrastructure in third-party systems: If the resource is in the Seed cluster, the extension should remove the finalizers and delete the resource. This is needed especially if the resource is a custom resource since gardenlet will not be aware of this resource and cannot take action. If the resource is in the Shoot and if it’s deployed by a ManagedResource, then gardenlet will take care to forcefully delete it in a later step of force-deletion. If the resource is not deployed via a ManagedResource, then it wouldn’t block the deletion flow anyway since it is in the Shoot cluster. In both cases, the extension controller can ignore the resource and return nil. ","categories":"","description":"","excerpt":"Force Deletion From v1.81, Gardener supports Shoot Force Deletion. All …","ref":"/docs/gardener/extensions/force-deletion/","tags":"","title":"Force Deletion"},{"body":"This page gives writing formatting guidelines for the Gardener documentation. For style guidelines, see the Style Guide.\nThese are guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a pull request.\n Formatting of Inline Elements Code Snippet Formatting Related Links Formatting of Inline Elements Type of Text Formatting Markdown Syntax API Objects and Technical Components code Deploy a `Pod`. New Terms and Emphasis bold Do **not** stop it. 
Technical Names code Open file `root.yaml`. User Interface Elements italics Choose *CLUSTERS*. Inline Code and Inline Commands code For declarative management, use `kubectl apply`. Object Field Names and Field Values code Set the value of `image` to `nginx:1.8`. Links and References link Visit the [Gardener website](https://gardener.cloud/) Headers various # API Server API Objects and Technical Components When you refer to an API object, use the same uppercase and lowercase letters that are used in the actual object name, and use backticks (`) to format them. Typically, the names of API objects use camel case.\nDon’t split the API object name into separate words. For example, use PodTemplateList, not Pod Template List.\nRefer to API objects without saying “object,” unless omitting “object” leads to an awkward construction.\n Do Don’t The Pod has two containers. The pod has two containers. The Deployment is responsible for… The Deployment object is responsible for… A PodList is a list of Pods. A Pod List is a list of pods. The gardener-control-manager has control loops… The gardener-control-manager has control loops… The gardenlet starts up with a bootstrap kubeconfig having a bootstrap token that allows to create CertificateSigningRequest (CSR) resources. The gardenlet starts up with a bootstrap kubeconfig having a bootstrap token that allows to create CertificateSigningRequest (CSR) resources. Note Due to the way the website is built from content taken from different repositories, when editing or updating already existing documentation, you should follow the style used in the topic. When contributing new documentation, follow the guidelines outlined in this guide. New Terms and Emphasis Use bold to emphasize something or to introduce a new term.\n Do Don’t A cluster is a set of nodes … A “cluster” is a set of nodes … The system does not delete your objects. The system does not(!) delete your objects. 
Technical Names Use backticks (`) for filenames, technical components, directories, and paths.\n Do Don’t Open file envars.yaml. Open the envars.yaml file. Go to directory /docs/tutorials. Go to the /docs/tutorials directory. Open file /_data/concepts.yaml. Open the /_data/concepts.yaml file. User Interface Elements When referring to UI elements, refrain from using verbs like “Click” or “Select with right mouse button”. This level of detail is hardly ever needed and also invalidates a procedure if other devices are used. For example, for a tablet you’d say “Tap on”.\nUse italics when you refer to UI elements.\n UI Element Standard Formulation Markdown Syntax Button, Menu path Choose UI Element. Choose *UI Element*. Menu path, context menu, navigation path Choose System \u003e User Profile \u003e Own Data. Choose *System* \\\u003e *User Profile* \\\u003e *Own Data*. Entry fields Enter your password. Enter your password. Checkbox, radio button Select Filter. Select *Filter*. Expandable screen elements Expand User Settings.\nCollapse User Settings. Expand *User Settings*.\nCollapse *User Settings*. Inline Code and Inline Commands Use backticks (`) for inline code.\n Do Don’t The kubectl run command creates a Deployment. The “kubectl run” command creates a Deployment. For declarative management, use kubectl apply. For declarative management, use “kubectl apply”. Object Field Names and Field Values Use backticks (`) for field names and field values.\n Do Don’t Set the value of the replicas field in the configuration file. Set the value of the “replicas” field in the configuration file. The value of the exec field is an ExecAction object. The value of the “exec” field is an ExecAction object. Set the value of imagePullPolicy to Always. Set the value of imagePullPolicy to “Always”. Set the value of image to nginx:1.8. Set the value of image to nginx:1.8. 
Links and References Do Don’t Use a descriptor of the link’s destination: “For more information, visit Gardener’s website.” Use a generic placeholder: “For more information, go here.” Use relative links when linking to content in the same repository: [Style Guide](../style-guide/_index.md) Use absolute links when linking to content in the same repository: [Style Guide](https://github.com/gardener/documentation/blob/master/website/documentation/contribute/documentation/style-guide/_index.md) Another thing to keep in mind is that markdown links do not work in certain shortcodes (e.g., mermaid). To circumvent this problem, you can use HTML links.\nHeaders Use H1 for the title of the topic. (# H1 Title) Use H2 for each main section. (## H2 Title) Use H3 for any sub-section in the main sections. (### H3 Title) Avoid using H4-H6. Try moving the additional information to a new topic instead. Code Snippet Formatting Don’t Include the Command Prompt Do Don’t kubectl get pods $ kubectl get pods Separate Commands from Output Verify that the pod is running on your chosen node: kubectl get pods --output=wide The output is similar to:\nNAME READY STATUS RESTARTS AGE IP NODE nginx 1/1 Running 0 13s 10.200.0.4 worker0 Placeholders Use angle brackets for placeholders. Tell the reader what a placeholder represents, for example:\n Display information about a pod:\nkubectl describe pod \u003cpod-name\u003e \u003cpod-name\u003e is the name of one of your pods.\n Versioning Kubernetes Examples Make code examples and configuration examples that include version information consistent with the accompanying text. 
Identify the Kubernetes version in the Prerequisites section.\nRelated Links Style Guide Contributors Guide ","categories":"","description":"","excerpt":"This page gives writing formatting guidelines for the Gardener …","ref":"/docs/contribute/documentation/formatting-guide/","tags":"","title":"Formatting Guide"},{"body":"Gardener Extension for Garden Linux OS \nThis controller operates on the OperatingSystemConfig resource in the extensions.gardener.cloud/v1alpha1 API group.\nIt manages those objects that are requesting…\n Garden Linux OS configuration (.spec.type=gardenlinux):\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: gardenlinux units: ... files: ... Please find a concrete example in the example folder.\n MemoryOne on Garden Linux configuration (spec.type=memoryone-gardenlinux):\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: memoryone-gardenlinux units: ... files: ... providerConfig: apiVersion: memoryone-gardenlinux.os.extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfiguration memoryTopology: \"2\" systemMemory: \"6x\" Please find a concrete example in the example folder.\n After reconciliation the resulting data will be stored in a secret within the same namespace (as the config itself might contain confidential data). The name of the secret will be written into the resource’s .status field:\n... status: ... 
cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default command: /usr/bin/env bash \u003cpath\u003e units: - docker-monitor.service - kubelet-monitor.service - kubelet.service The secret has one data key cloud_config that stores the generation.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for the Garden Linux operating system","excerpt":"Gardener extension controller for the Garden Linux operating system","ref":"/docs/extensions/os-extensions/gardener-extension-os-gardenlinux/","tags":"","title":"Garden Linux OS"},{"body":"Overview While the Gardener API server works with admission plugins to validate and mutate resources belonging to Gardener related API groups, e.g. 
core.gardener.cloud, the same is needed for resources belonging to non-Gardener API groups as well, e.g. secrets in the core API group. Therefore, the Gardener Admission Controller runs a http(s) server with the following handlers which serve as validating/mutating endpoints for admission webhooks. It is also used to serve http(s) handlers for authorization webhooks.\nAdmission Webhook Handlers This section describes the admission webhook handlers that are currently served.\nAdmission Plugin Secret Validator In Shoot, AdmissionPlugin can have reference to other files. This validation handler validates the referred admission plugin secret and ensures that the secret always contains the required data kubeconfig.\nKubeconfig Secret Validator Malicious Kubeconfigs applied by end users may cause a leakage of sensitive data. This handler checks if the incoming request contains a Kubernetes secret with a .data.kubeconfig field and denies the request if the Kubeconfig structure violates Gardener’s security standards.\nNamespace Validator Namespaces are the backing entities of Gardener projects in which shoot cluster objects reside. This validation handler protects active namespaces against premature deletion requests. Therefore, it denies deletion requests if a namespace still contains shoot clusters or if it belongs to a non-deleting Gardener project (without .metadata.deletionTimestamp).\nResource Size Validator Since users directly apply Kubernetes native objects to the Garden cluster, it also involves the risk of being vulnerable to DoS attacks because these resources are continuously watched and read by controllers. One example is the creation of shoot resources with large annotation values (up to 256 kB per value), which can cause severe out-of-memory issues for the gardenlet component. 
Vertical autoscaling can help to mitigate such situations, but we cannot expect to scale infinitely, and thus need means to block the attack itself.\nThe Resource Size Validator checks arbitrary incoming admission requests against a configured maximum size for the resource’s group-version-kind combination. It denies the request if the object exceeds the quota.\n [!NOTE] The contents of status subresources and metadata.managedFields are not taken into account for the resource size calculation.\n Example for Gardener Admission Controller configuration:\nserver: resourceAdmissionConfiguration: limits: - apiGroups: [\"core.gardener.cloud\"] apiVersions: [\"*\"] resources: [\"shoots\"] size: 100k - apiGroups: [\"\"] apiVersions: [\"v1\"] resources: [\"secrets\"] size: 100k unrestrictedSubjects: - kind: Group name: gardener.cloud:system:seeds apiGroup: rbac.authorization.k8s.io # - kind: User # name: admin # apiGroup: rbac.authorization.k8s.io # - kind: ServiceAccount # name: \"*\" # namespace: garden # apiGroup: \"\" operationMode: block #log With the configuration above, the Resource Size Validator denies requests for shoots with Gardener’s core API group which exceed a size of 100 kB. The same is done for Kubernetes secrets.\nAs this feature is meant to protect the system from malicious requests sent by users, it is recommended to exclude trusted groups, users or service accounts from the size restriction via resourceAdmissionConfiguration.unrestrictedSubjects. For example, the backing user for the gardenlet should always be capable of changing the shoot resource instead of being blocked due to size restrictions. This is because the gardenlet itself occasionally changes the shoot specification, labels or annotations, and might violate the quota if the existing resource is already close to the quota boundary. Also, operators are supposed to be trusted users and subjecting them to a size limitation can inhibit important operational tasks. 
Wildcard (\"*\") in subject name is supported.\nSize limitations depend on the individual Gardener setup and choosing the wrong values can affect the availability of your Gardener service. resourceAdmissionConfiguration.operationMode allows to control if a violating request is actually denied (default) or only logged. It’s recommended to start with log, check the logs for exceeding requests, adjust the limits if necessary and finally switch to block.\nSeedRestriction Please refer to Scoped API Access for Gardenlets for more information.\nAuthorization Webhook Handlers This section describes the authorization webhook handlers that are currently served.\nSeedAuthorization Please refer to Scoped API Access for Gardenlets for more information.\n","categories":"","description":"Functions and list of handlers for the Gardener Admission Controller","excerpt":"Functions and list of handlers for the Gardener Admission Controller","ref":"/docs/gardener/concepts/admission-controller/","tags":"","title":"Gardener Admission Controller"},{"body":"Overview The Gardener API server is a Kubernetes-native extension based on its aggregation layer. It is registered via an APIService object and designed to run inside a Kubernetes cluster whose API it wants to extend.\nAfter registration, it exposes the following resources:\nCloudProfiles CloudProfiles are resources that describe a specific environment of an underlying infrastructure provider, e.g. AWS, Azure, etc. Each shoot has to reference a CloudProfile to declare the environment it should be created in. In a CloudProfile, the gardener operator specifies certain constraints like available machine types, regions, which Kubernetes versions they want to offer, etc. End-users can read CloudProfiles to see these values, but only operators can change the content or create/delete them. 
When a shoot is created or updated, then an admission plugin checks that only allowed values are used via the referenced CloudProfile.\nAdditionally, a CloudProfile may contain a providerConfig, which is a special configuration dedicated for the infrastructure provider. Gardener does not evaluate or understand this config, but extension controllers might need it for declaration of provider-specific constraints, or global settings.\nPlease see this example manifest and consult the documentation of your provider extension controller to get information about its providerConfig.\nNamespacedCloudProfiles In addition to CloudProfiles, NamespacedCloudProfiles exist to enable project level CloudProfiles. Please view GEP-25 for additional information. This feature is currently under development and not ready for productive use. At the moment, only the necessary APIs and validations exist to allow for extensions to adapt to the new NamespacedCloudProfile resource.\nWhen a shoot is created or updated, the cloudprofile reference can be set to point to a directly descendant NamespacedCloudProfile. Updates from one CloudProfile to another CloudProfile or from one NamespacedCloudProfile to another NamespacedCloudProfile or even to another CloudProfile are not allowed.\nProject viewers have the permission to see NamespacedCloudProfiles associated with a particular project. Project members can create, edit or delete NamespacedCloudProfiles, except for the special fields .spec.kubernetes and .spec.machineImages. In order to make changes to these special fields, a user needs to be granted the custom RBAC verbs modify-spec-kubernetes and modify-spec-machineimages respectively, which is typically only granted to landscape operators.\nInternalSecrets End-users can read and/or write Secrets in their project namespaces in the garden cluster. This prevents Gardener components from storing such “Gardener-internal” secrets in the respective project namespace. 
InternalSecrets are resources that contain shoot or project-related secrets that are “Gardener-internal”, i.e., secrets used and managed by the system that end-users don’t have access to. InternalSecrets are defined like plain Kubernetes Secrets, behave exactly like them, and can be used in the same manners. The only difference is, that the InternalSecret resource is a dedicated API resource (exposed by gardener-apiserver). This allows separating access to “normal” secrets and internal secrets by the usual RBAC means.\nGardener uses an InternalSecret per Shoot for syncing the client CA to the project namespace in the garden cluster (named \u003cshoot-name\u003e.ca-client). The shoots/adminkubeconfig subresource signs short-lived client certificates by retrieving the CA from the InternalSecret.\nOperators should configure gardener-apiserver to encrypt the internalsecrets.core.gardener.cloud resource in etcd.\nPlease see this example manifest.\nSeeds Seeds are resources that represent seed clusters. Gardener does not care about how a seed cluster got created - the only requirement is that it is of at least Kubernetes v1.25 and passes the Kubernetes conformance tests. The Gardener operator has to either deploy the gardenlet into the cluster they want to use as seed (recommended, then the gardenlet will create the Seed object itself after bootstrapping) or provide the kubeconfig to the cluster inside a secret (that is referenced by the Seed resource) and create the Seed resource themselves.\nPlease see this, this, and optionally this example manifests.\nShoot Quotas To allow end-users not having their dedicated infrastructure account to try out Gardener, the operator can register an account owned by them that they allow to be used for trial clusters. Trial clusters can be put under quota so that they don’t consume too many resources (resulting in costs) and that one user cannot consume all resources on their own. 
These clusters are automatically terminated after a specified time, but end-users may extend the lifetime manually if needed.\nPlease see this example manifest.\nProjects The first thing before creating a shoot cluster is to create a Project. A project is used to group multiple shoot clusters together. End-users can invite colleagues to the project to enable collaboration, and they can either make them admin or viewer. After an end-user has created a project, they will get a dedicated namespace in the garden cluster for all their shoots.\nPlease see this example manifest.\nSecretBindings Now that the end-user has a namespace, the next step is registering their infrastructure provider account.\nPlease see this example manifest and consult the documentation of the extension controller for the respective infrastructure provider to get information about which keys are required in this secret.\nAfter the secret has been created, the end-user has to create a special SecretBinding resource that binds this secret. Later, when creating shoot clusters, they will reference this binding.\nPlease see this example manifest.\nShoots Shoot clusters contain various settings that influence what end-user Kubernetes clusters will look like in the end. As Gardener heavily relies on extension controllers for operating system configuration, networking, and infrastructure specifics, the end-user has the possibility (and responsibility) to provide these provider-specific configurations as well. 
Such configurations are not evaluated by Gardener (because it doesn’t know/understand them), but they are only transported to the respective extension controller.\n⚠️ This means that any configuration issues/mistakes on the end-user side that relate to a provider-specific flag or setting cannot be caught during the update request itself but only later during the reconciliation (unless a validator webhook has been registered in the garden cluster by an operator).\nPlease see this example manifest and consult the documentation of the provider extension controller to get information about its .spec.provider.controlPlaneConfig, .spec.provider.infrastructureConfig, and .spec.provider.workers[].providerConfig.\n(Cluster)OpenIDConnectPresets Please see this separate documentation file.\nOverview Data Model ","categories":["Users"],"description":"Understand the Gardener API server extension and the resources it exposes","excerpt":"Understand the Gardener API server extension and the resources it …","ref":"/docs/gardener/concepts/apiserver/","tags":"","title":"Gardener API Server"},{"body":"Overview The gardener-controller-manager (often referred to as “GCM”) is a component that runs next to the Gardener API server, similar to the Kubernetes Controller Manager. It runs several controllers that do not require talking to any seed or shoot cluster. Also, as of today, it exposes an HTTP server that is serving several health check endpoints and metrics.\nThis document explains the various functionalities of the gardener-controller-manager and their purpose.\nControllers Bastion Controller Bastion resources have a limited lifetime which can be extended up to a certain amount by performing a heartbeat on them. The Bastion controller is responsible for deleting expired or rotten Bastions.\n “expired” means a Bastion has exceeded its status.expirationTimestamp. “rotten” means a Bastion is older than the configured maxLifetime. 
The maxLifetime defaults to 24 hours and is an option in the BastionControllerConfiguration which is part of gardener-controller-manager’s ControllerManagerControllerConfiguration, see the example config file for details.\nThe controller also deletes Bastions in case the referenced Shoot:\n no longer exists is marked for deletion (i.e., has a non-nil .metadata.deletionTimestamp) was migrated to another seed (i.e., Shoot.spec.seedName is different than Bastion.spec.seedName). The deletion of Bastions triggers the gardenlet to perform the necessary cleanups in the Seed cluster, so some time can pass between deletion and the Bastion actually disappearing. Clients like gardenctl are advised to not re-use Bastions whose deletion timestamp has been set already.\nRefer to GEP-15 for more information on the lifecycle of Bastion resources.\nCertificateSigningRequest Controller After the gardenlet gets deployed on the Seed cluster, it needs to establish itself as a trusted party to communicate with the Gardener API server. It runs through a bootstrap flow similar to the kubelet bootstrap process.\nOn startup, the gardenlet uses a kubeconfig with a bootstrap token which authenticates it as being part of the system:bootstrappers group. This kubeconfig is used to create a CertificateSigningRequest (CSR) against the Gardener API server.\nThe controller in gardener-controller-manager checks whether the CertificateSigningRequest has the expected organization, common name and usages which the gardenlet would request.\nIt only auto-approves the CSR if the client making the request is allowed to “create” the certificatesigningrequests/seedclient subresource. Clients with the system:bootstrappers group are bound to the gardener.cloud:system:seed-bootstrapper ClusterRole, hence, they have such privileges. 
As the bootstrap kubeconfig for the gardenlet contains a bootstrap token which is authenticated as being part of the system:bootstrappers group, its created CSR gets auto-approved.\nCloudProfile Controller CloudProfiles are essential when it comes to reconciling Shoots since they contain constraints (like valid machine types, Kubernetes versions, or machine images) and sometimes also some global configuration for the respective environment (typically via provider-specific configuration in .spec.providerConfig).\nConsequently, to ensure that CloudProfiles in-use are always present in the system until the last referring Shoot or NamespacedCloudProfile gets deleted, the controller adds a finalizer which is only released when there is no Shoot or NamespacedCloudProfile referencing the CloudProfile anymore.\nNamespacedCloudProfile Controller NamespacedCloudProfiles provide a project-scoped extension to CloudProfiles, allowing for adjustments of a parent CloudProfile (e.g. by overriding expiration dates of Kubernetes versions or machine images). This allows for modifications without global project visibility. Like CloudProfiles do in their spec, NamespacedCloudProfiles also expose the resulting Shoot constraints as a CloudProfileSpec in their status.\nThe controller ensures that NamespacedCloudProfiles in-use remain present in the system until the last referring Shoot is deleted by adding a finalizer that is only released when there is no Shoot referencing the NamespacedCloudProfile anymore.\nControllerDeployment Controller Extensions are registered in the garden cluster via ControllerRegistration, and the deployment of the respective extensions is specified via ControllerDeployment. For more info refer to Registering Extension Controllers.\nThis controller ensures that ControllerDeployments in-use always exist until the last ControllerRegistration referencing them gets deleted. 
The controller adds a finalizer which is only released when there is no ControllerRegistration referencing the ControllerDeployment anymore.\nControllerRegistration Controller The ControllerRegistration controller makes sure that the required Gardener Extensions specified by the ControllerRegistration resources are present in the seed clusters. It also takes care of the creation and deletion of ControllerInstallation objects for a given seed cluster. The controller has four reconciliation loops.\n“Main” Reconciler This reconciliation loop watches the Seed objects and determines which ControllerRegistrations are required for them and reconciles the corresponding ControllerInstallation resources to reach the determined state. To begin with, it computes the kind/type combinations of extensions required for the seed. For this, the controller examines a live list of ControllerRegistrations, ControllerInstallations, BackupBuckets, BackupEntrys, Shoots, and Secrets from the garden cluster. For example, it examines the shoots running on the seed and deduces the kind/type, like Infrastructure/gcp. The seed (seed.spec.provider.type) and DNS (seed.spec.dns.provider.type) provider types are considered when calculating the list of required ControllerRegistrations, as well. It also decides whether they should always be deployed based on the .spec.deployment.policy. For the configuration options, please see this section.\nEach of these required combinations is mapped to a ControllerRegistration object and then to its corresponding ControllerInstallation object (if existing). The controller then creates or updates the required ControllerInstallation objects for the given seed. It also deletes every existing ControllerInstallation whose referenced ControllerRegistration is not part of the required list. 
For example, if the shoots in the seed are no longer using the DNS provider aws-route53, then the controller proceeds to delete the respective ControllerInstallation object.\n\"ControllerRegistration Finalizer\" Reconciler This reconciliation loop watches the ControllerRegistration resource and adds finalizers to it when they are created. In case a deletion request comes in for the resource, i.e., if a .metadata.deletionTimestamp is set, it actively scans for a ControllerInstallation resource using this ControllerRegistration, and decides whether the deletion can be allowed. In case no related ControllerInstallation is present, it removes the finalizer and marks it for deletion.\n\"Seed Finalizer\" Reconciler This loop also watches the Seed object and adds finalizers to it at creation. If a .metadata.deletionTimestamp is set for the seed, then the controller checks for existing ControllerInstallation objects which reference this seed. If no such objects exist, then it removes the finalizer and allows the deletion.\n“Extension ClusterRole” Reconciler This reconciler watches two resources in the garden cluster:\n ClusterRoles labelled with authorization.gardener.cloud/custom-extensions-permissions=true ServiceAccounts in seed namespaces matching the selector provided via the authorization.gardener.cloud/extensions-serviceaccount-selector annotation of such ClusterRoles. Its core task is to maintain a ClusterRoleBinding resource referencing the respective ClusterRole. 
This gets bound to all ServiceAccounts in seed namespaces whose labels match the selector provided via the authorization.gardener.cloud/extensions-serviceaccount-selector annotation of such ClusterRoles.\nYou can read more about the purpose of this reconciler in this document.\nCredentialsBinding Controller CredentialsBindings reference Secrets, WorkloadIdentitys and Quotas and are themselves referenced by Shoots.\nThe controller adds finalizers to the referenced objects to ensure they don’t get deleted while still being referenced. Similarly, to ensure that CredentialsBindings in-use are always present in the system until the last referring Shoot gets deleted, the controller adds a finalizer which is only released when there is no Shoot referencing the CredentialsBinding anymore.\nReferenced Secrets and WorkloadIdentitys will also be labeled with provider.shoot.gardener.cloud/\u003ctype\u003e=true, where \u003ctype\u003e is the value of the .provider.type of the CredentialsBinding. Also, all referenced Secrets and WorkloadIdentitys, as well as Quotas, will be labeled with reference.gardener.cloud/credentialsbinding=true to allow for easily filtering for objects referenced by CredentialsBindings.\nEvent Controller With the Gardener Event Controller, you can prolong the lifespan of events related to Shoot clusters. This is an optional controller which will become active once you provide the below mentioned configuration.\nAll events in K8s are deleted after a configurable time-to-live (controlled via a kube-apiserver argument called --event-ttl (defaulting to 1 hour)). The need to prolong the time-to-live for Shoot cluster events frequently arises when debugging customer issues on live systems. This controller leaves events involving Shoots untouched, while deleting all other events after a configured time. In order to activate it, provide the following configuration:\n concurrentSyncs: The amount of goroutines scheduled for reconciling events. 
ttlNonShootEvents: When an event reaches this time-to-live it gets deleted unless it is a Shoot-related event (defaults to 1h, equivalent to the event-ttl default). ⚠️ In addition, you should also configure the --event-ttl for the kube-apiserver to define an upper limit of how long Shoot-related events should be stored. The --event-ttl should be larger than the ttlNonShootEvents or this controller will have no effect.\n ExposureClass Controller ExposureClass abstracts the ability to expose a Shoot cluster’s control plane in certain network environments (e.g. corporate networks, DMZ, internet) on all Seeds or a subset of the Seeds. For more information, see ExposureClasses.\nConsequently, to ensure that ExposureClasses in-use are always present in the system until the last referring Shoot gets deleted, the controller adds a finalizer which is only released when there is no Shoot referencing the ExposureClass anymore.\nManagedSeedSet Controller ManagedSeedSet objects maintain a stable set of replicas of ManagedSeeds, i.e. they guarantee the availability of a specified number of identical ManagedSeeds on an equal number of identical Shoots. The ManagedSeedSet controller creates and deletes ManagedSeeds and Shoots in response to changes to the replicas and selector fields. For more information, refer to the ManagedSeedSet proposal document.\n The reconciler first gets all the replicas of the given ManagedSeedSet in the ManagedSeedSet’s namespace and with the matching selector. Each replica is a struct that contains a ManagedSeed, its corresponding Seed and Shoot objects. Then the pending replica is retrieved, if it exists. Next it determines the ready, postponed, and deletable replicas. A replica is considered ready when a Seed owned by a ManagedSeed has been registered either directly or by deploying gardenlet into a Shoot, the Seed is Ready and the Shoot’s status is Healthy. If a replica is not ready and it is not pending, i.e. 
it is not specified in the ManagedSeedSet’s status.pendingReplica field, then it is added to the postponed replicas. A replica is deletable if it has no scheduled Shoots and the replica’s Shoot and ManagedSeed do not have the seedmanagement.gardener.cloud/protect-from-deletion annotation. Finally, it checks the actual and target replica counts. If the actual count is less than the target count, the controller scales up the replicas by creating new replicas to match the desired target count. If the actual count is more than the target, the controller deletes replicas to match the desired count. Before scale-out or scale-in, the controller first reconciles the pending replica (there can always only be one) and makes sure the replica is ready before moving on to the next one. Scale-out (actual count \u003c target count) During the scale-out phase, the controller first creates the Shoot object from the ManagedSeedSet’s spec.shootTemplate field and adds the replica to the status.pendingReplica of the ManagedSeedSet. For the subsequent reconciliation steps, the controller makes sure that the pending replica is ready before proceeding to the next replica. Once the Shoot is created successfully, the ManagedSeed object is created from the ManagedSeedSet’s spec.template. The ManagedSeed object is reconciled by the ManagedSeed controller and a Seed object is created for the replica. Once the replica’s Seed becomes ready and the Shoot becomes healthy, the replica also becomes ready. Scale-in (actual count \u003e target count) During the scale-in phase, the controller first determines the replica that can be deleted. From the deletable replicas, it chooses the one with the lowest priority and deletes it. Priority is determined in the following order: First, compare replica statuses. Replicas with “less advanced” status are considered lower priority. For example, a replica with StatusShootReconciling status has a lower value than a replica with StatusShootReconciled status. 
Hence, in this case, a replica with a StatusShootReconciling status will have lower priority and will be considered for deletion. Then, the replicas are compared with the readiness of their Seeds. Replicas with non-ready Seeds are considered lower priority. Then, the replicas are compared with the health statuses of their Shoots. Replicas with “worse” statuses are considered lower priority. Finally, the replica ordinals are compared. Replicas with lower ordinals are considered lower priority. Quota Controller Quota object limits the resources consumed by shoot clusters either per provider secret or per project/namespace.\nConsequently, to ensure that Quotas in-use are always present in the system until the last SecretBinding or CredentialsBinding that references them gets deleted, the controller adds a finalizer which is only released when there is no SecretBinding or CredentialsBinding referencing the Quota anymore.\nProject Controller There are multiple controllers responsible for different aspects of Project objects. Please also refer to the Project documentation.\n“Main” Reconciler This reconciler manages a dedicated Namespace for each Project. The namespace name can either be specified explicitly in .spec.namespace (must be prefixed with garden-) or it will be determined by the controller. If .spec.namespace is set, it tries to create it. If it already exists, it tries to adopt it. This will only succeed if the Namespace was previously labeled with gardener.cloud/role=project and project.gardener.cloud/name=\u003cproject-name\u003e. This is to prevent end-users from being able to adopt arbitrary namespaces and escalate their privileges, e.g. the kube-system namespace.\nAfter the namespace was created/adopted, the controller creates several ClusterRoles and ClusterRoleBindings that allow the project members to access related resources based on their roles. 
These RBAC resources are prefixed with gardener.cloud:system:project{-member,-viewer}:\u003cproject-name\u003e. Gardener administrators and extension developers can define their own roles. For more information, see Extending Project Roles.\nIn addition, operators can configure the Project controller to maintain a default ResourceQuota for project namespaces. Quotas can especially limit the creation of user facing resources, e.g. Shoots, SecretBindings, CredentialsBindings, Secrets and thus protect the garden cluster from massive resource exhaustion but also enable operators to align quotas with respective enterprise policies.\n ⚠️ Gardener itself is not exempted from configured quotas. For example, Gardener creates Secrets for every shoot cluster in the project namespace and at the same time increases the available quota count. Please mind this additional resource consumption.\n The controller configuration provides a template section controllers.project.quotas where such a ResourceQuota (see the example below) can be deposited.\ncontrollers: project: quotas: - config: apiVersion: v1 kind: ResourceQuota spec: hard: count/shoots.core.gardener.cloud: \"100\" count/secretbindings.core.gardener.cloud: \"10\" count/credentialsbindings.security.gardener.cloud: \"10\" count/secrets: \"800\" projectSelector: {} The Project controller takes the specified config and creates a ResourceQuota with the name gardener in the project namespace. If a ResourceQuota resource with the name gardener already exists, the controller will only update fields in spec.hard which are unavailable at that time. This is done to configure a default Quota in all projects but to allow manual quota increases as the projects’ demands increase. spec.hard fields in the ResourceQuota object that are not present in the configuration are removed from the object. Labels and annotations on the ResourceQuota config get merged with the respective fields on existing ResourceQuotas. 
An optional projectSelector narrows down the amount of projects that are equipped with the given config. If multiple configs match for a project, then only the first match in the list is applied to the project namespace.\nThe .status.phase of the Project resources is set to Ready or Failed by the reconciler to indicate whether the reconciliation loop was performed successfully. Also, it generates Events to provide further information about its operations.\nWhen a Project is marked for deletion, the controller ensures that there are no Shoots left in the project namespace. Once all Shoots are gone, the Namespace and Project are released.\n“Stale Projects” Reconciler As Gardener is a large-scale Kubernetes as a Service, it is designed for being used by a large amount of end-users. Over time, it is likely to happen that some of the hundreds or thousands of Project resources are no longer actively used.\nGardener offers the “stale projects” reconciler which will take care of identifying such stale projects, marking them with a “warning”, and eventually deleting them after a certain time period. This reconciler is enabled by default and works as follows:\n Projects are considered as “stale”/not actively used when all of the following conditions apply: The namespace associated with the Project does not have any… Shoot resources. BackupEntry resources. Secret resources that are referenced by a SecretBinding or a CredentialsBinding that is in use by a Shoot (not necessarily in the same namespace). Quota resources that are referenced by a SecretBinding or a CredentialsBinding that is in use by a Shoot (not necessarily in the same namespace). The time period when the project was used for the last time (status.lastActivityTimestamp) is longer than the configured minimumLifetimeDays If a project is considered “stale”, then its .status.staleSinceTimestamp will be set to the time when it was first detected to be stale. 
If it gets actively used again, this timestamp will be removed. After some time, the .status.staleAutoDeleteTimestamp will be set to a timestamp after which Gardener will auto-delete the Project resource if it still is not actively used.\nThe component configuration of the gardener-controller-manager offers to configure the following options:\n minimumLifetimeDays: Don’t consider newly created Projects as “stale” too early to give people/end-users some time to onboard and get familiar with the system. The “stale project” reconciler won’t set any timestamp for Projects younger than minimumLifetimeDays. When you change this value, then projects marked as “stale” may be no longer marked as “stale” in case they are young enough, or vice versa. staleGracePeriodDays: Don’t compute auto-delete timestamps for stale Projects that are unused for less than staleGracePeriodDays. This is to not unnecessarily make people/end-users nervous “just because” they haven’t actively used their Project for a given amount of time. When you change this value, then already assigned auto-delete timestamps may be removed if the new grace period is not yet exceeded. staleExpirationTimeDays: Expiration time after which stale Projects are finally auto-deleted (after .status.staleSinceTimestamp). If this value is changed and an auto-delete timestamp got already assigned to the projects, then the new value will only take effect if it’s increased. Hence, decreasing the staleExpirationTimeDays will not decrease already assigned auto-delete timestamps. Gardener administrators/operators can exclude specific Projects from the stale check by annotating the related Namespace resource with project.gardener.cloud/skip-stale-check=true.\n “Activity” Reconciler Since the other two reconcilers are unable to actively monitor the relevant objects that are used in a Project (Shoot, Secret, etc.), there could be a situation where the user creates and deletes objects in a short period of time. 
In that case, the Stale Project Reconciler would not see that there was any activity on that project and would still mark it as stale, even though it is actively used.\nThe Project Activity Reconciler is implemented to take care of such cases. An event handler will notify the reconciler for any activity and then it will update the status.lastActivityTimestamp. This update will also trigger the Stale Project Reconciler.\nSecretBinding Controller SecretBindings reference Secrets and Quotas and are themselves referenced by Shoots. The controller adds finalizers to the referenced objects to ensure they don’t get deleted while still being referenced. Similarly, to ensure that SecretBindings in-use are always present in the system until the last referring Shoot gets deleted, the controller adds a finalizer which is only released when there is no Shoot referencing the SecretBinding anymore.\nReferenced Secrets will also be labeled with provider.shoot.gardener.cloud/\u003ctype\u003e=true, where \u003ctype\u003e is the value of the .provider.type of the SecretBinding. Also, all referenced Secrets, as well as Quotas, will be labeled with reference.gardener.cloud/secretbinding=true to allow for easily filtering for objects referenced by SecretBindings.\nSeed Controller The Seed controller in the gardener-controller-manager reconciles Seed objects with the help of the following reconcilers.\n“Main” Reconciler This reconciliation loop takes care of seed related operations in the garden cluster. When a new Seed object is created, the reconciler creates a new Namespace in the garden cluster seed-\u003cseed-name\u003e. Namespaces dedicated to single seed clusters allow us to segregate access permissions, i.e., a gardenlet must not have permissions to access objects in all Namespaces in the garden cluster. 
There are objects in a Garden environment which are created once by the operator, e.g., default domain secret, alerting credentials, and are required for operations happening in the gardenlet. Therefore, we not only need a seed specific Namespace but also a copy of these “shared” objects.\nThe “main” reconciler takes care of this replication:\n Kind Namespace Label Selector Secret garden gardener.cloud/role “Backup Buckets Check” Reconciler Every time a BackupBucket object is created or updated, the referenced Seed object is enqueued for reconciliation. It’s the reconciler’s task to check the status subresource of all existing BackupBuckets that reference this Seed. If at least one BackupBucket has .status.lastError != nil, the BackupBucketsReady condition on the Seed will be set to False, and consequently the Seed is considered as NotReady. If the SeedBackupBucketsCheckControllerConfiguration (which is part of gardener-controller-manager’s component configuration) contains a conditionThreshold for the BackupBucketsReady, the condition will instead first be set to Progressing and eventually to False once the conditionThreshold expires. See the example config file for details. Once the BackupBucket is healthy again, the seed will be re-queued and the condition will turn true.\n“Extensions Check” Reconciler This reconciler reconciles Seed objects and checks whether all ControllerInstallations referencing them are in a healthy state. Concretely, all three conditions Valid, Installed, and Healthy must have status True and the Progressing condition must have status False. Based on this check, it maintains the ExtensionsReady condition in the respective Seed’s .status.conditions list.\n“Lifecycle” Reconciler The “Lifecycle” reconciler processes Seed objects which are enqueued every 10 seconds in order to check if the responsible gardenlet is still responding and operable. 
Therefore, it checks renewals via Lease objects of the seed in the garden cluster which are renewed regularly by the gardenlet.\nIn case a Lease is not renewed for the configured amount in config.controllers.seed.monitorPeriod.duration:\n The reconciler assumes that the gardenlet stopped operating and updates the GardenletReady condition to Unknown. Additionally, the conditions and constraints of all Shoot resources scheduled on the affected seed are set to Unknown as well, because an unresponsive gardenlet won’t be able to maintain these conditions any more. If the gardenlet’s client certificate has expired (identified based on the .status.clientCertificateExpirationTimestamp field in the Seed resource) and if it is managed by a ManagedSeed, then the ManagedSeed will be triggered for a reconciliation. This will trigger the bootstrapping process again and allows gardenlets to obtain a fresh client certificate. Shoot Controller “Conditions” Reconciler In case the reconciled Shoot is registered via a ManagedSeed as a seed cluster, this reconciler merges the conditions in the respective Seed’s .status.conditions into the .status.conditions of the Shoot. This is to provide a holistic view on the status of the registered seed cluster by just looking at the Shoot resource.\n“Hibernation” Reconciler This reconciler is responsible for hibernating or awakening shoot clusters based on the schedules defined in their .spec.hibernation.schedules. It ignores failed Shoots and those marked for deletion.\n“Maintenance” Reconciler This reconciler is responsible for maintaining shoot clusters based on the time window defined in their .spec.maintenance.timeWindow. It might auto-update the Kubernetes version or the operating system versions specified in the worker pools (.spec.provider.workers). It could also add some operation or task annotations. 
For more information, see Shoot Maintenance.\n“Quota” Reconciler This reconciler might auto-delete shoot clusters in case their referenced SecretBinding or CredentialsBinding is itself referencing a Quota with .spec.clusterLifetimeDays != nil. If the shoot cluster is older than the configured lifetime, then it gets deleted. It maintains the expiration time of the Shoot in the value of the shoot.gardener.cloud/expiration-timestamp annotation. This annotation might be overridden, however only by at most twice the value of the .spec.clusterLifetimeDays.\n“Reference” Reconciler Shoot objects may specify references to other objects in the garden cluster which are required for certain features. For example, users can configure various DNS providers via .spec.dns.providers and usually need to refer to a corresponding Secret with valid DNS provider credentials inside. Such objects need a special protection against deletion requests as long as they are still being referenced by one or multiple shoots.\nTherefore, this reconciler checks Shoots for referenced objects and adds the finalizer gardener.cloud/reference-protection to their .metadata.finalizers list. The reconciled Shoot also gets this finalizer to enable a proper garbage collection in case the gardener-controller-manager is offline at the moment of an incoming deletion request. When an object is not actively referenced anymore because the Shoot specification has changed or all related shoots were deleted (or are in deletion), the controller will remove the added finalizer again so that the object can safely be deleted or garbage collected.\nThis reconciler inspects the following references:\n DNS provider secrets (.spec.dns.providers) Audit policy configmaps (.spec.kubernetes.kubeAPIServer.auditConfig.auditPolicy.configMapRef) Further checks might be added in the future.\n“Retry” Reconciler This reconciler is responsible for retrying certain failed Shoots. 
Currently, the reconciler retries only failed Shoots with an error code ERR_INFRA_RATE_LIMITS_EXCEEDED. See Shoot Status for more details.\n“Status Label” Reconciler This reconciler is responsible for maintaining the shoot.gardener.cloud/status label on Shoots. See Shoot Status for more details.\n","categories":"","description":"Understand where the gardener-controller-manager runs and its functionalities","excerpt":"Understand where the gardener-controller-manager runs and its …","ref":"/docs/gardener/concepts/controller-manager/","tags":"","title":"Gardener Controller Manager"},{"body":"Overview The goal of the gardener-node-agent is to bootstrap a machine into a worker node and maintain node-specific components, which run on the node and are unmanaged by Kubernetes (e.g. the kubelet service, systemd units, …).\nIt effectively is a Kubernetes controller deployed onto the worker node.\nArchitecture and Basic Design This figure visualizes the overall architecture of the gardener-node-agent. On the left side, it starts with an OperatingSystemConfig resource (OSC) with a corresponding worker pool specific cloud-config-\u003cworker-pool\u003e secret being passed by reference through the userdata to a machine by the machine-controller-manager (MCM).\nOn the right side, the cloud-config secret will be extracted and used by the gardener-node-agent after being installed. Details on this can be found in the next section.\nFinally, the gardener-node-agent runs a systemd service watching on secret resources located in the kube-system namespace like our cloud-config secret that contains the OperatingSystemConfig. 
When gardener-node-agent applies the OSC, it installs the kubelet + configuration on the worker node.\nInstallation and Bootstrapping This section describes how the gardener-node-agent is initially installed onto the worker node.\nIn the beginning, there is a very small bash script called gardener-node-init.sh, which will be copied to /var/lib/gardener-node-agent/init.sh on the node with cloud-init data. This script’s sole purpose is downloading and starting the gardener-node-agent. The binary artifact is extracted from an OCI artifact and lives at /opt/bin/gardener-node-agent.\nAlong with the init script, a configuration for the gardener-node-agent is carried over to the worker node at /var/lib/gardener-node-agent/config.yaml. This configuration contains things like the shoot’s kube-apiserver endpoint, the according certificates to communicate with it, and controller configuration.\nIn a bootstrapping phase, the gardener-node-agent sets itself up as a systemd service. It also executes tasks that need to be executed before any other components are installed, e.g. formatting the data device for the kubelet.\nControllers This section describes the controllers in more detail.\nLease Controller This controller creates a Lease for gardener-node-agent in the kube-system namespace of the shoot cluster. Each instance of gardener-node-agent creates its own Lease when its corresponding Node was created. It renews the Lease resource every 10 seconds. This indicates a heartbeat to the external world.\nNode Controller This controller watches the Node object for the machine it runs on. The correct Node is identified based on the hostname of the machine (Nodes have the kubernetes.io/hostname label). Whenever the worker.gardener.cloud/restart-systemd-services annotation changes, the controller performs the desired changes by restarting the specified systemd unit files. See also this document for more information. 
After restarting all units, the annotation is removed.\n ℹ️ When the gardener-node-agent systemd service itself is requested to be restarted, the annotation is removed first to ensure it does not restart itself indefinitely.\n Operating System Config Controller This controller contains the main logic of gardener-node-agent. It watches Secrets whose data map contains the OperatingSystemConfig which consists of all systemd units and files that are relevant for the node configuration. Amongst others, a prominent example is the configuration file for kubelet and its unit file for the kubelet.service.\nThe controller decodes the configuration and computes the files and units that have changed since its last reconciliation. It writes or updates the files and units to the file system, removes no longer needed files and units, reloads the systemd daemon, and starts or stops the units accordingly.\nAfter successful reconciliation, it persists the just applied OperatingSystemConfig into a file on the host. This file will be used for future reconciliations to compute file/unit changes.\nThe controller also maintains two annotations on the Node:\n worker.gardener.cloud/kubernetes-version, describing the version of the installed kubelet. checksum/cloud-config-data, describing the checksum of the applied OperatingSystemConfig (used in future reconciliations to determine whether it needs to reconcile, and to report that this node is up-to-date). Token Controller This controller watches the access token Secrets in the kube-system namespace configured via the gardener-node-agent’s component configuration (.controllers.token.syncConfigs[] field). Whenever the .data.token field changes, it writes the new content to a file on the configured path on the host file system. This mechanism is used to download its own access token for the shoot cluster, but also the access tokens of other systemd components (e.g., valitail). 
Since the underlying client is based on k8s.io/client-go and the kubeconfig points to this token file, it is dynamically reloaded without the necessity of explicit configuration or code changes. This procedure ensures that the most up-to-date tokens are always present on the host and used by the gardener-node-agent and the other systemd components.\nReasoning The gardener-node-agent is a replacement for what was called the cloud-config-downloader and the cloud-config-executor, both written in bash. The gardener-node-agent implements this functionality as a regular controller and feels more uniform in terms of maintenance.\nWith the new architecture we gain a lot. The most important gains are described here.\nDeveloper Productivity Since the Gardener community develops in Go day by day, writing business logic in bash is difficult, hard to maintain, and almost impossible to test. Getting rid of almost all bash scripts which are currently in use for this very important part of the cluster creation process will enhance the speed of adding new features and removing bugs.\nSpeed Until now, the cloud-config-downloader runs in a loop every 60s to check if something changed on the shoot which requires modifications on the worker node. This produces a lot of unneeded traffic on the API server and wastes time: it will sometimes take up to 60s until a desired modification is started on the worker node. By writing a “real” Kubernetes controller, we can watch for the Node, the OSC in the Secret, and the shoot-access token in the secret. If any of these objects changes, and only then, the required action will take effect immediately. This will speed up operations and will reduce the load on the API server of the shoot especially for large clusters.\nScalability The cloud-config-downloader adds a random wait time before restarting the kubelet in case the kubelet was updated or a configuration change was made to it. 
This is required to reduce the load on the API server and the traffic on the internet uplink. It also reduces the overall downtime of the services in the cluster because every kubelet restart transforms a node for several seconds into NotReady state which potentially interrupts service availability.\nThe decision was made to keep the existing jitter mechanism which calculates the kubelet-download-and-restart-delay-seconds on the controller itself.\nCorrectness The configuration of the cloud-config-downloader is actually done by placing a file for every configuration item on the disk on the worker node. This was done because parsing the content of a single file and using this as a value in bash reduces to something like VALUE=$(cat /the/path/to/the/file). Simple, but it lacks validation, type safety and whatnot. With the gardener-node-agent we introduce a new API which is then stored in the gardener-node-agent secret and stored on disk in a single YAML file for comparison with the previous known state. This brings all benefits of type safe configuration. Because actual and previous configuration are compared, removed files and units are also removed and stopped on the worker if removed from the OSC.\nAvailability Previously, the cloud-config-downloader simply restarted the systemd units on every change to the OSC, regardless of which of the services changed. The gardener-node-agent first checks which systemd units were changed, and only restarts these. This will prevent unneeded kubelet restarts.\n","categories":"","description":"How Gardener bootstraps machines into worker nodes and how it installs and maintains gardener-managed node-specific components","excerpt":"How Gardener bootstraps machines into worker nodes and how it installs …","ref":"/docs/gardener/concepts/node-agent/","tags":"","title":"Gardener Node Agent"},{"body":"Overview The gardener-operator is responsible for the garden cluster environment. 
Without this component, users must deploy ETCD, the Gardener control plane, etc., manually and with separate mechanisms (not maintained in this repository). This is quite unfortunate since this requires separate tooling, processes, etc. A lot of production- and enterprise-grade features were built into Gardener for managing the seed and shoot clusters, so it makes sense to re-use them as much as possible also for the garden cluster.\nDeployment There is a Helm chart which can be used to deploy the gardener-operator. Once deployed and ready, you can create a Garden resource. Note that there can only be one Garden resource per system at a time.\n ℹ️ Similar to seed clusters, garden runtime clusters require a VPA, see this section. By default, gardener-operator deploys the VPA components. However, when there already is a VPA available, then set .spec.runtimeCluster.settings.verticalPodAutoscaler.enabled=false in the Garden resource.\n Garden Resources Please find an exemplary Garden resource here.\nConfiguration For Runtime Cluster Settings The Garden resource offers a few settings that are used to control the behaviour of gardener-operator in the runtime cluster. This section provides an overview over the available settings in .spec.runtimeCluster.settings:\nLoad Balancer Services gardener-operator deploys Istio and relevant resources to the runtime cluster in order to expose the virtual-garden-kube-apiserver service (similar to how the kube-apiservers of shoot clusters are exposed). In most cases, the cloud-controller-manager (responsible for managing these load balancers on the respective underlying infrastructure) supports certain customization and settings via annotations. 
This document provides a good overview and many examples.\nBy setting the .spec.runtimeCluster.settings.loadBalancerServices.annotations field, the Gardener administrator can specify a list of annotations which will be injected into the Services of type LoadBalancer.\nVertical Pod Autoscaler gardener-operator heavily relies on the Kubernetes vertical-pod-autoscaler component. By default, the Garden controller deploys the VPA components into the garden namespace of the respective runtime cluster. If you want to manage the VPA deployment on your own or have a custom one, you might want to disable the automatic deployment by gardener-operator. Otherwise, you might end up with two VPAs which will cause erratic behaviour. By setting .spec.runtimeCluster.settings.verticalPodAutoscaler.enabled=false, you can disable the automatic deployment.\n⚠️ In any case, there must be a VPA available for your runtime cluster. Using a runtime cluster without VPA is not supported.\nTopology-Aware Traffic Routing Refer to the Topology-Aware Traffic Routing documentation, which covers the topology-aware routing setting for the garden runtime cluster.\nVolumes It is possible to define the minimum size for PersistentVolumeClaims in the runtime cluster created by gardener-operator via the .spec.runtimeCluster.volume.minimumSize field. This can be relevant in case the runtime cluster runs on an infrastructure that only supports disks of at least a certain size.\nCert-Management The operator can optionally deploy the Gardener cert-management component. A default issuer has to be specified and will be deployed, too. Please note that the cert-controller-manager is configured to use DNSRecords for ACME DNS challenges on certificate requests. A suitable provider extension must be deployed in this case, e.g. using an operator Extension resource. 
The default issuer must be set at .spec.runtimeCluster.certManagement.defaultIssuer either specifying an ACME or CA issuer.\nIf the cert-controller-manager should make requests to any ACME servers running with self-signed TLS certificates, the related CA can be provided using a secret with data field bundle.crt referenced with .spec.runtimeCluster.certManagement.config.caCertificatesSecretRef.\nDefault Issuer using an ACME server Please provide at least server and e-mail address.\nspec: runtimeCluster: certManagement: defaultIssuer: ACME: server: https://acme-v02.api.letsencrypt.org/directory email: some.name@my-email-domain.com # secretRef: # name: defaultIssuerPrivateKey # precheckNameservers: # - 1.2.3.4 # - 5.6.7.8 If needed, an existing ACME account can be specified with the secretRef. The referenced secret must contain a field privateKey. Otherwise, an account is auto-registered if supported by the ACME server. If you are using a private DNS server, you may need to set the precheckNameservers used to check the propagation of the DNS challenges.\nDefault Issuer using a root or intermediate CA If you want to use a root or intermediate CA for signing the certificates, provide a TLS secret containing the CA and reference it as shown in the example below.\nspec: runtimeCluster: certManagement: defaultIssuer: CA: secretRef: name: my-ca-tls-secret Configuration For Virtual Cluster ETCD Encryption Config The spec.virtualCluster.kubernetes.kubeAPIServer.encryptionConfig field in the Garden API allows operators to customize encryption configurations for the kube-apiserver of the virtual cluster. It provides options to specify additional resources for encryption. Similarly spec.virtualCluster.gardener.gardenerAPIServer.encryptionConfig field allows operators to customize encryption configurations for the gardener-apiserver.\n The resources field can be used to specify resources that should be encrypted in addition to secrets. 
Secrets are always encrypted for the kube-apiserver. For the gardener-apiserver, the following resources are always encrypted: controllerdeployments.core.gardener.cloud controllerregistrations.core.gardener.cloud internalsecrets.core.gardener.cloud shootstates.core.gardener.cloud Adding an item to any of the lists will cause patch requests for all the resources of that kind to encrypt them in the etcd. See Encrypting Confidential Data at Rest for more details. Removing an item from any of these lists will cause patch requests for all the resources of that type to decrypt and rewrite the resource as plain text. See Decrypt Confidential Data that is Already Encrypted at Rest for more details. ℹ️ Note that configuring encryption for a custom resource for the kube-apiserver is only supported for Kubernetes versions \u003e= 1.26.\n Extension Resource A Gardener installation relies on extensions to provide support for new cloud providers or to add new capabilities. You can find out more about Gardener extensions and how they can be used here.\nThe Extension resource is intended to automate the installation and management of extensions in a Gardener landscape. It contains configuration for the following scenarios:\n The deployment of the extension chart in the garden runtime cluster. The deployment of ControllerRegistration and ControllerDeployment resources in the (virtual) garden cluster. The deployment of extension admissions charts in runtime and virtual clusters. In the near future, the Extension will be used by the gardener-operator to automate the management of the backup bucket for ETCD and DNS records required by the garden cluster. To do that, gardener-operator will leverage extensions that support DNSRecord and BackupBucket resources. As of today, the support for managed DNSRecords and BackupBuckets in the gardener-operator is still being built. 
However, the Extension’s specification already reflects the target picture.\nPlease find an exemplary Extension resource here.\nExtension Deployment The .spec.deployment specifies how an extension can be installed for a Gardener landscape and consists of the following parts:\n .spec.deployment.extension contains the deployment specification of an extension. .spec.deployment.admission contains the deployment specification of an extension admission. Each one is described in more detail below.\nConfiguration for Extension Deployment .spec.deployment.extension contains configuration for the registration of an extension in the garden cluster. gardener-operator follows the same principles described by this document:\n .spec.deployment.extension.helm and .spec.deployment.extension.values are used when creating the ControllerDeployment in the garden cluster. .spec.deployment.extension.policy and .spec.deployment.extension.seedSelector define the extension’s installation policy as per the ControllerDeployment's respective fields. Runtime Extensions can manage resources required by the Garden resource (e.g. BackupBucket, DNSRecord, Extension) in the runtime cluster. Since the environment in the runtime cluster may differ from that of a Seed, the extension is installed in the runtime cluster with a distinct set of Helm chart values specified in .spec.deployment.extension.runtimeValues. If no runtimeValues are provided, the extension deployment for the runtime garden is considered superfluous and the deployment is uninstalled. The configuration allows for precise control over various extension parameters, such as requested resources, priority classes, and more.\nBesides the values configured in .spec.deployment.extension.runtimeValues, a runtime deployment flag and a priority class are merged into the values:\ngardener: runtimeCluster: enabled: true # indicates the extension is enabled for the Garden cluster, e.g. 
for handling `BackupBucket`, `DNSRecord` and `Extension` objects. priorityClassName: gardener-garden-system-200 As soon as a Garden object is created and runtimeValues are configured, the extension is deployed in the runtime cluster.\nExtension Registration When the virtual garden cluster is available, the Extension controller manages ControllerRegistration/ControllerDeployment resources to register extensions for shoots. The fields of .spec.deployment.extension include their configuration options.\nConfiguration for Admission Deployment The .spec.deployment.admission defines how an extension admission may be deployed by the gardener-operator. This deployment is optional and may be omitted. Typically, the admission is split into two parts:\nRuntime The runtime part contains deployment-relevant manifests required to run the admission service in the runtime cluster. The following values are passed to the chart during reconciliation:\ngardener: runtimeCluster: priorityClassName: \u003cClass to be used for extension admission\u003e Virtual The virtual part includes the webhook registration (MutatingWebhookConfiguration/ValidatingWebhookConfiguration) and RBAC configuration. The following values are passed to the chart during reconciliation:\ngardener: virtualCluster: serviceAccount: name: \u003cName of the service account used to connect to the garden cluster\u003e namespace: \u003cNamespace of the service account\u003e Extension admissions often need to retrieve additional context from the garden cluster in order to process validating or mutating requests.\nFor example, the corresponding CloudProfile might be needed to perform a provider-specific shoot validation. 
Therefore, Gardener automatically injects a kubeconfig into the admission deployment to interact with the (virtual) garden cluster (see this document for more information).\nConfiguration for Extension Resources The .spec.resources field refers to the extension resources as defined by Gardener in the extensions.gardener.cloud/v1alpha1 API. These include both well-known types such as Infrastructure, Worker, etc. and generic resources. The field will be used to populate the respective field in the resulting ControllerRegistration in the garden cluster.\nControllers The gardener-operator controllers are now described in more detail.\nGarden Controller The Garden controller in the operator reconciles Garden objects with the help of the following reconcilers.\nMain Reconciler The reconciler first generates a general CA certificate which is valid for ~30d and auto-rotated when 80% of its lifetime is reached. Afterwards, it brings up the so-called “garden system components”. The gardener-resource-manager is deployed first since its ManagedResource controller will be used to bring up the remaining components.\nOther system components are:\n runtime garden system resources (PriorityClasses for the workload resources) virtual garden system resources (RBAC rules) Vertical Pod Autoscaler (if enabled via .spec.runtimeCluster.settings.verticalPodAutoscaler.enabled=true in the Garden) HVPA Controller (when HVPA feature gate is enabled) ETCD Druid Istio As soon as all system components are up, the reconciler deploys the virtual garden cluster. It consists of two ETCDs (one “main” etcd, one “events” etcd) which are managed by ETCD Druid via druid.gardener.cloud/v1alpha1.Etcd custom resources. 
The whole management works similarly to how it works for Shoots, so you can take a look at this document for more information in general.\nThe virtual garden control plane components are:\n virtual-garden-etcd-main virtual-garden-etcd-events virtual-garden-kube-apiserver virtual-garden-kube-controller-manager virtual-garden-gardener-resource-manager If .spec.virtualCluster.controlPlane.highAvailability={} is set, then these components will be deployed in a “highly available” mode. For ETCD, this means that there will be 3 replicas each. This works similarly to Shoots (see this document), except that the failure tolerance type is not configurable. The gardener-resource-manager’s HighAvailabilityConfig webhook makes sure that all pods with multiple replicas are spread across nodes, and if there are at least two zones in .spec.runtimeCluster.provider.zones, then they also get spread across availability zones.\n Once set, removing .spec.virtualCluster.controlPlane.highAvailability again is not supported.\n The virtual-garden-kube-apiserver Deployment is exposed via Istio, similar to how the kube-apiservers of shoot clusters are exposed.\nSimilar to the Shoot API, the version of the virtual garden cluster is controlled via .spec.virtualCluster.kubernetes.version. Likewise, specific configuration for the control plane components can be provided in the same section, e.g. via .spec.virtualCluster.kubernetes.kubeAPIServer for the kube-apiserver or .spec.virtualCluster.kubernetes.kubeControllerManager for the kube-controller-manager.\nThe kube-controller-manager only runs a few controllers that are necessary in the scenario of the virtual garden. Most prominently, the serviceaccount-token controller is unconditionally disabled. Hence, the usage of static ServiceAccount secrets is generally not supported. Instead, the TokenRequest API should be used. 
Third-party components that need to communicate with the virtual cluster can leverage the gardener-resource-manager’s TokenRequestor controller and the generic kubeconfig, just like it works for Shoots. Please note that this functionality is restricted to the garden namespace. The current Secret name of the generic kubeconfig can be found in the annotations (key: generic-token-kubeconfig.secret.gardener.cloud/name) of the Garden resource.\nFor the virtual cluster, it is essential to provide at least one DNS domain via .spec.virtualCluster.dns.domains. The respective DNS records are not managed by gardener-operator and should be created manually. They should point to the load balancer IP of the istio-ingressgateway Service in namespace virtual-garden-istio-ingress. The DNS records must be prefixed with both gardener. and api. for all domains in .spec.virtualCluster.dns.domains.\nThe first DNS domain in this list is used for the server in the kubeconfig, and for configuring the --external-hostname flag of the API server.\nApart from the control plane components of the virtual cluster, the reconciler also deploys the control plane components of Gardener. gardener-apiserver reuses the same ETCDs as the virtual-garden-kube-apiserver, so all data related to “the garden cluster” is stored together and “isolated” from ETCD data related to the runtime cluster. This drastically simplifies backup and restore capabilities (e.g., moving the virtual garden cluster from one runtime cluster to another).\nThe Gardener control plane components are:\n gardener-apiserver gardener-admission-controller gardener-controller-manager gardener-scheduler Besides those, the gardener-operator is able to deploy the following optional components:\n Gardener Dashboard (and the controller for web terminals) when .spec.virtualCluster.gardener.gardenerDashboard (or .spec.virtualCluster.gardener.gardenerDashboard.terminal, respectively) is set. 
You can read more about it and its configuration in this section. Gardener Discovery Server when .spec.virtualCluster.gardener.gardenerDiscoveryServer is set. The service account issuer of shoots will be calculated in the format https://discovery.\u003c.spec.runtimeCluster.ingress.domains[0]\u003e/projects/\u003cproject-name\u003e/shoots/\u003cshoot-uid\u003e/issuer. This configuration applies for all seeds registered with the Garden cluster. Once set it should not be modified. The reconciler also manages a few observability-related components (more planned as part of GEP-19):\n fluent-operator fluent-bit gardener-metrics-exporter kube-state-metrics plutono vali prometheus-operator alertmanager-garden (read more here) prometheus-garden (read more here) prometheus-longterm (read more here) blackbox-exporter It is also mandatory to provide an IPv4 CIDR for the service network of the virtual cluster via .spec.virtualCluster.networking.services. This range is used by the API server to compute the cluster IPs of Services.\nThe controller maintains the .status.lastOperation which indicates the status of an operation.\nGardener Dashboard .spec.virtualCluster.gardener.gardenerDashboard serves a few configuration options for the dashboard. This section highlights the most prominent fields:\n oidcConfig: The general OIDC configuration is part of .spec.virtualCluster.kubernetes.kubeAPIServer.oidcConfig. This section allows you to define a few specific settings for the dashboard. sessionLifetime is the duration after which a session is terminated (i.e., after which a user is automatically logged out). additionalScopes allows to extend the list of scopes of the JWT token that are to be recognized. 
You must reference a Secret in the garden namespace containing the client ID/secret for the dashboard: apiVersion: v1 kind: Secret metadata: name: gardener-dashboard-oidc namespace: garden type: Opaque stringData: client_id: \u003csecret\u003e client_secret: \u003csecret\u003e enableTokenLogin: This is enabled by default and allows logging into the dashboard with a JWT token. You can disable it in case you want to only allow OIDC-based login. However, at least one of the both login methods must be enabled. frontendConfigMapRef: Reference a ConfigMap in the garden namespace containing the frontend configuration in the data with key frontend-config.yaml, for example apiVersion: v1 kind: ConfigMap metadata: name: gardener-dashboard-frontend namespace: garden data: frontend-config.yaml: |helpMenuItems: - title: Homepage icon: mdi-file-document url: https://gardener.cloud Please take a look at this file to get an idea of which values are configurable. This configuration can also include branding, themes, and colors. Read more about it here. Assets (logos/icons) are configured in a separate ConfigMap, see below. assetsConfigMapRef: Reference a ConfigMap in the garden namespace containing the assets, for example apiVersion: v1 kind: ConfigMap metadata: name: gardener-dashboard-assets namespace: garden binaryData: favicon-16x16.png: base64(favicon-16x16.png) favicon-32x32.png: base64(favicon-32x32.png) favicon-96x96.png: base64(favicon-96x96.png) favicon.ico: base64(favicon.ico) logo.svg: base64(logo.svg) Note that the assets must be provided base64-encoded, hence binaryData (instead of data) must be used. Please take a look at this file to get more information. gitHub: You can connect a GitHub repository that can be used to create issues for shoot clusters in the cluster details page. 
You have to reference a Secret in the garden namespace that contains the GitHub credentials, for example: apiVersion: v1 kind: Secret metadata: name: gardener-dashboard-github namespace: garden type: Opaque stringData: # This is for GitHub token authentication: authentication.token: \u003csecret\u003e # Alternatively, this is for GitHub app authentication: authentication.appId: \u003csecret\u003e authentication.clientId: \u003csecret\u003e authentication.clientSecret: \u003csecret\u003e authentication.installationId: \u003csecret\u003e authentication.privateKey: \u003csecret\u003e # This is the webhook secret, see explanation below webhookSecret: \u003csecret\u003e Note that you can also set up a GitHub webhook to the dashboard such that it receives updates when somebody changes the GitHub issue. The webhookSecret field is the secret that you enter in GitHub in the webhook configuration. The dashboard uses it to verify that received traffic is indeed originated from GitHub. If you don’t want to set up such webhook, or if the dashboard is not reachable by the GitHub webhook (e.g., in restricted environments) you can also configure gitHub.pollInterval. It is the interval of how often the GitHub API is polled for issue updates. This field is used as a fallback mechanism to ensure state synchronization, even when there is a GitHub webhook configuration. If a webhook event is missed or not successfully delivered, the polling will help catch up on any missed updates. If this field is not provided and there is no webhookSecret key in the referenced secret, it will be implicitly defaulted to 15m. The dashboard will use this to regularly poll the GitHub API for updates on issues. terminal: This enables the web terminal feature, read more about it here. When set, the terminal-controller-manager will be deployed to the runtime cluster. The allowedHosts field is explained here. 
The container section allows you to specify a container image and a description that should be used for the web terminals. Observability Garden Prometheus gardener-operator deploys a Prometheus instance in the garden namespace (called “Garden Prometheus”) which fetches metrics and data from garden system components, cAdvisors, the virtual cluster control plane, and the Seeds’ aggregate Prometheus instances. Its purpose is to provide an entrypoint for operators when debugging issues with components running in the garden cluster. It also serves as the top-level aggregator of metering across a Gardener landscape.\nTo extend the configuration of the Garden Prometheus, you can create the prometheus-operator’s custom resources and label them with prometheus=garden, for example:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: garden name: garden-my-component namespace: garden spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics Long-Term Prometheus gardener-operator deploys another Prometheus instance in the garden namespace (called “Long-Term Prometheus”) which federates metrics from Garden Prometheus. Its purpose is to store those with a longer retention than Garden Prometheus would. It is not possible to define different retention periods for different metrics in Prometheus, hence, using another Prometheus instance is the only option. 
This Long-term Prometheus also has an additional Cortex sidecar container for caching some queries to achieve faster processing times.\nTo extend the configuration of the Long-term Prometheus, you can create the prometheus-operator’s custom resources and label them with prometheus=longterm, for example:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: longterm name: longterm-my-component namespace: garden spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics Alertmanager By default, the alertmanager-garden deployed by gardener-operator does not come with any configuration. It is the responsibility of the human operators to design and provide it. This can be done by creating monitoring.coreos.com/v1alpha1.AlertmanagerConfig resources labeled with alertmanager=garden (read more about them here), for example:\napiVersion: monitoring.coreos.com/v1alpha1 kind: AlertmanagerConfig metadata: name: config namespace: garden labels: alertmanager: garden spec: route: receiver: dev-null groupBy: - alertname - landscape routes: - continue: true groupWait: 3m groupInterval: 5m repeatInterval: 12h routes: - receiver: ops matchers: - name: severity value: warning matchType: = - name: topology value: garden matchType: = receivers: - name: dev-null - name: ops slackConfigs: - apiURL: https://\u003cslack-api-url\u003e channel: \u003cchannel-name\u003e username: Gardener-Alertmanager iconEmoji: \":alert:\" title: \"[{{ .Status | toUpper }}] Gardener Alert(s)\" text: \"{{ range .Alerts }}*{{ .Annotations.summary }} ({{ .Status }})*\\n{{ .Annotations.description }}\\n\\n{{ end }}\" sendResolved: true Plutono A Plutono instance is deployed by gardener-operator into the garden namespace for visualizing monitoring metrics and logs via dashboards. 
In order to provide custom dashboards, create a ConfigMap in the garden namespace labelled with dashboard.monitoring.gardener.cloud/garden=true that contains the respective JSON documents, for example:\napiVersion: v1 kind: ConfigMap metadata: labels: dashboard.monitoring.gardener.cloud/garden: \"true\" name: my-custom-dashboard namespace: garden data: my-custom-dashboard.json: \u003cdashboard-JSON-document\u003e Care Reconciler This reconciler performs four “care” actions related to Gardens.\nIt maintains the following conditions:\n VirtualGardenAPIServerAvailable: The /healthz endpoint of the garden’s virtual-garden-kube-apiserver is called and considered healthy when it responds with 200 OK. RuntimeComponentsHealthy: The conditions of the ManagedResources applied to the runtime cluster are checked (e.g., ResourcesApplied). VirtualComponentsHealthy: The virtual components are considered healthy when the respective Deployments (for example virtual-garden-kube-apiserver, virtual-garden-kube-controller-manager), and Etcds (for example virtual-garden-etcd-main) exist and are healthy. Additionally, the conditions of the ManagedResources applied to the virtual cluster are checked (e.g., ResourcesApplied). ObservabilityComponentsHealthy: This condition is considered healthy when the respective Deployments (for example plutono) and StatefulSets (for example prometheus, vali) exist and are healthy. If all checks for a certain condition have succeeded, then its status will be set to True. Otherwise, it will be set to False or Progressing.\nIf at least one check fails and there is threshold configuration for the conditions (in .controllers.gardenCare.conditionThresholds), then the status will be set:\n to Progressing if it was True before. to Progressing if it was Progressing before and the lastUpdateTime of the condition does not exceed the configured threshold duration yet. 
to False if it was Progressing before and the lastUpdateTime of the condition exceeds the configured threshold duration. The condition thresholds can be used to prevent reporting issues too early just because there is a rollout or a short disruption. Only if the unhealthiness persists for at least the configured threshold duration, then the issues will be reported (by setting the status to False).\nIn order to compute the condition statuses, this reconciler considers ManagedResources (in the garden and istio-system namespace) and their status, see this document for more information. The following table explains which ManagedResources are considered for which condition type:\n Condition Type ManagedResources are considered when RuntimeComponentsHealthy .spec.class=seed and care.gardener.cloud/condition-type label either unset, or set to RuntimeComponentsHealthy VirtualComponentsHealthy .spec.class unset or care.gardener.cloud/condition-type label set to VirtualComponentsHealthy ObservabilityComponentsHealthy care.gardener.cloud/condition-type label set to ObservabilityComponentsHealthy Reference Reconciler Garden objects may specify references to other objects in the Garden cluster which are required for certain features. For example, operators can configure a secret for ETCD backup via .spec.virtualCluster.etcd.main.backup.secretRef.name or an audit policy ConfigMap via .spec.virtualCluster.kubernetes.kubeAPIServer.auditConfig.auditPolicy.configMapRef.name. Such objects need a special protection against deletion requests as long as they are still being referenced by the Garden.\nTherefore, this reconciler checks Gardens for referenced objects and adds the finalizer gardener.cloud/reference-protection to their .metadata.finalizers list. The reconciled Garden also gets this finalizer to enable a proper garbage collection in case the gardener-operator is offline at the moment of an incoming deletion request. 
When an object is not actively referenced anymore because the Garden specification has changed or the Garden is in deletion, the controller will remove the added finalizer again so that the object can safely be deleted or garbage collected.\nThis reconciler inspects the following references:\n ETCD backup Secrets (.spec.virtualCluster.etcd.main.backup.secretRef) Admission plugin kubeconfig Secrets (.spec.virtualCluster.kubernetes.kubeAPIServer.admissionPlugins[].kubeconfigSecretName and .spec.virtualCluster.gardener.gardenerAPIServer.admissionPlugins[].kubeconfigSecretName) Authentication webhook kubeconfig Secrets (.spec.virtualCluster.kubernetes.kubeAPIServer.authentication.webhook.kubeconfigSecretName) Audit webhook kubeconfig Secrets (.spec.virtualCluster.kubernetes.kubeAPIServer.auditWebhook.kubeconfigSecretName and .spec.virtualCluster.gardener.gardenerAPIServer.auditWebhook.kubeconfigSecretName) SNI Secrets (.spec.virtualCluster.kubernetes.kubeAPIServer.sni.secretName) Audit policy ConfigMaps (.spec.virtualCluster.kubernetes.kubeAPIServer.auditConfig.auditPolicy.configMapRef.name and .spec.virtualCluster.gardener.gardenerAPIServer.auditConfig.auditPolicy.configMapRef.name) Further checks might be added in the future.\nController Registrar controller This controller registers controllers, which need to be installed in two contexts. If the Garden cluster is at the same time used as a Seed cluster, the gardener-operator will start these controllers. If the Garden cluster is separate from the Seed cluster, the controllers will be started by gardenlet.\nCurrently, this applies to two controllers:\n NetworkPolicy controller VPA EvictionRequirements controller The registration happens as soon as the Garden resource is created. It contains the networking information of the garden runtime cluster which is required configuration for the NetworkPolicy controller.\nExtension Controller Gardener relies on extensions to provide various capabilities, such as supporting cloud providers. 
This controller automates the management of extensions by managing all necessary resources in the runtime and virtual garden clusters.\nCurrently, this controller handles the following scenarios:\n Extension deployment in the runtime cluster Extension admission deployment for the virtual garden cluster. ControllerDeployment and ControllerRegistration reconciliation in the virtual garden cluster. Gardenlet Controller The Gardenlet controller reconciles a seedmanagement.gardener.cloud/v1alpha1.Gardenlet resource in case there is no Seed yet with the same name. This is used to allow easy deployments of gardenlets into unmanaged seed clusters. For a general overview, see this document.\nOn Gardenlet reconciliation, the controller deploys the gardenlet to the cluster (either its own, or the one provided via the .spec.kubeconfigSecretRef) after downloading the Helm chart specified in .spec.deployment.helm.ociRepository and rendering it with the provided values/configuration.\nOn Gardenlet deletion, nothing happens: gardenlets must always be deleted manually (by deleting the Seed and, once gone, then the gardenlet Deployment).\n [!NOTE] This controller only takes care of the very first gardenlet deployment (since it only reacts when there is no Seed resource yet). After the gardenlet is running, it uses the self-upgrade mechanism by watching the seedmanagement.gardener.cloud/v1alpha1.Gardenlet (see this for more details.)\nAfter a successful Garden reconciliation, gardener-operator also updates the .spec.deployment.helm.ociRepository.ref to its own version in all Gardenlet resources labeled with operator.gardener.cloud/auto-update-gardenlet-helm-chart-ref=true. 
The gardenlets then update themselves.\n⚠️ If you prefer to manage the Gardenlet resources via GitOps, Flux, or similar tools, then you should manage the .spec.deployment.helm.ociRepository.ref field yourself and not label the resources as mentioned above (to prevent gardener-operator from interfering with your desired state). Make sure to apply your Gardenlet resources (potentially containing a new version) after the Garden resource was successfully reconciled (i.e., after the Gardener control plane was successfully rolled out, see this for more information.)\n Webhooks As of today, the gardener-operator only has one webhook handler which is now described in more detail.\nValidation This webhook handler validates CREATE/UPDATE/DELETE operations on Garden resources. Simple validation is performed via standard CRD validation. However, more advanced validation is hard to express via these means and is performed by this webhook handler.\nFurthermore, for deletion requests, it is validated that the Garden is annotated with a deletion confirmation annotation, namely confirmation.gardener.cloud/deletion=true. The DELETE operation is allowed to pass only if this annotation is present. This protects users from accidental or undesired deletions.\nAnother validation is to check that there is only one Garden resource at a time. It prevents creating a second Garden when there is already one in the system.\nDefaulting This webhook handler mutates the Garden resource on CREATE/UPDATE/DELETE operations. Simple defaulting is performed via standard CRD defaulting. However, more advanced defaulting is hard to express via these means and is performed by this webhook handler.\nUsing Garden Runtime Cluster As Seed Cluster In production scenarios, you probably wouldn’t use the Kubernetes cluster running gardener-operator and the Gardener control plane (called “runtime cluster”) as seed cluster at the same time. 
However, such a setup is technically possible and might simplify certain situations (e.g., development, evaluation, …).\nIf the runtime cluster is a seed cluster at the same time, gardenlet’s Seed controller will not manage the components which were already deployed (and reconciled) by gardener-operator. As of today, this applies to:\n gardener-resource-manager vpa-{admission-controller,recommender,updater} hvpa-controller (when HVPA feature gate is enabled) etcd-druid istio control-plane nginx-ingress-controller Those components are so-called “seed system components”. In addition, there are a few observability components:\n fluent-operator fluent-bit vali plutono kube-state-metrics prometheus-operator As all of these components are managed by gardener-operator in this scenario, the gardenlet just skips them.\n ℹ️ There is no need to configure anything - the gardenlet will automatically detect when its seed cluster is the garden runtime cluster at the same time.\n ⚠️ Note that such a setup requires that you upgrade the versions of gardener-operator and gardenlet in lock-step. Otherwise, you might experience unexpected behaviour or issues with your seed or shoot clusters.\nCredentials Rotation The credentials rotation works in the same way as it does for Shoot resources, i.e. there are gardener.cloud/operation annotation values for starting or completing the rotation procedures.\nFor certificate authorities, gardener-operator generates one which is automatically rotated roughly each month (ca-garden-runtime) and several CAs which are NOT automatically rotated but only on demand.\n🚨 Hence, it is the responsibility of the (human) operator to regularly perform the credentials rotation.\nPlease refer to this document for more details. 
As of today, gardener-operator only creates the following types of credentials (i.e., some sections of the document don’t apply for Gardens and can be ignored):\n certificate authorities (and related server and client certificates) ETCD encryption key observability password for Plutono ServiceAccount token signing key WorkloadIdentity token signing key ⚠️ Rotation of static ServiceAccount secrets is not supported since the kube-controller-manager does not enable the serviceaccount-token controller.\nWhen the ServiceAccount token signing key rotation is in Preparing phase, then gardener-operator annotates all Seeds with gardener.cloud/operation=renew-garden-access-secrets. This causes gardenlet to populate new ServiceAccount tokens for the garden cluster to all extensions, which are now signed with the new signing key. Read more about it here.\nSimilarly, when the CA certificate rotation is in Preparing phase, then gardener-operator annotates all Seeds with gardener.cloud/operation=renew-kubeconfig. This causes gardenlet to request a new client certificate for its garden cluster kubeconfig, which is now signed with the new client CA, and which also contains the new CA bundle for the server certificate verification. Read more about it here.\nAlso, when the WorkloadIdentity token signing key rotation is in Preparing phase, then gardener-operator annotates all Seeds with gardener.cloud/operation=renew-workload-identity-tokens. This causes gardenlet to renew all workload identity tokens in the seed cluster with new tokens now signed with the new signing key.\nMigrating an Existing Gardener Landscape to gardener-operator Since gardener-operator was only developed in 2023, six years after the Gardener project initiation, most users probably already have an existing Gardener landscape. 
The most prominent installation procedure is garden-setup, however experience shows that most community members have developed their own tooling for managing the garden cluster and the Gardener control plane components.\n Consequently, providing a general migration guide is not possible since the detailed steps vary heavily based on how the components were set up previously. As a result, this section can only highlight the most important caveats and things to know, while the concrete migration steps must be figured out individually based on the existing installation.\nPlease test your migration procedure thoroughly. Note that in some cases it can be easier to set up a fresh landscape with gardener-operator, restore the ETCD data, switch the DNS records, and issue new credentials for all clients.\n Please make sure that you configure all your desired fields in the Garden resource.\nETCD gardener-operator leverages etcd-druid for managing the virtual-garden-etcd-main and virtual-garden-etcd-events, similar to how shoot cluster control planes are handled. The PersistentVolumeClaim names differ slightly - for virtual-garden-etcd-events it’s virtual-garden-etcd-events-virtual-garden-etcd-events-0, while for virtual-garden-etcd-main it’s main-virtual-garden-etcd-virtual-garden-etcd-main-0. The easiest approach for the migration is to make your existing ETCD volumes follow the same naming scheme. Alternatively, backup your data, let gardener-operator take over ETCD, and then restore your data to the new volume.\nThe backup bucket must be created separately, and its name as well as the respective credentials must be provided via the Garden resource in .spec.virtualCluster.etcd.main.backup.\nvirtual-garden-kube-apiserver Deployment gardener-operator deploys a virtual-garden-kube-apiserver into the runtime cluster. This virtual-garden-kube-apiserver spans a new cluster, called the virtual cluster. 
There are a few certificates and other credentials that should not change during the migration. You have to prepare the environment accordingly by leveraging the secrets manager’s capabilities.\n The existing Cluster CA Secret should be labeled with secrets-manager-use-data-for-name=ca. The existing Client CA Secret should be labeled with secrets-manager-use-data-for-name=ca-client. The existing Front Proxy CA Secret should be labeled with secrets-manager-use-data-for-name=ca-front-proxy. The existing Service Account Signing Key Secret should be labeled with secrets-manager-use-data-for-name=service-account-key. The existing ETCD Encryption Key Secret should be labeled with secrets-manager-use-data-for-name=kube-apiserver-etcd-encryption-key. virtual-garden-kube-apiserver Exposure The virtual-garden-kube-apiserver is exposed via a dedicated istio-ingressgateway deployed to namespace virtual-garden-istio-ingress. The virtual-garden-kube-apiserver Service in the garden namespace is only of type ClusterIP. Consequently, DNS records for this API server must target the load balancer IP of the istio-ingressgateway.\nVirtual Garden Kubeconfig gardener-operator does not generate any static token or the like for access to the virtual cluster. Ideally, human users access it via OIDC only. 
Alternatively, you can create an auto-rotated token that you can use for automation like CI/CD pipelines:\napiVersion: v1 kind: Secret type: Opaque metadata: name: shoot-access-virtual-garden namespace: garden labels: resources.gardener.cloud/purpose: token-requestor resources.gardener.cloud/class: shoot annotations: serviceaccount.resources.gardener.cloud/name: virtual-garden-user serviceaccount.resources.gardener.cloud/namespace: kube-system serviceaccount.resources.gardener.cloud/token-expiration-duration: 3h --- apiVersion: v1 kind: Secret metadata: name: managedresource-virtual-garden-access namespace: garden type: Opaque stringData: clusterrolebinding____gardener.cloud.virtual-garden-access.yaml: |apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: gardener.cloud.sap:virtual-garden roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: cluster-admin subjects: - kind: ServiceAccount name: virtual-garden-user namespace: kube-system --- apiVersion: resources.gardener.cloud/v1alpha1 kind: ManagedResource metadata: name: virtual-garden-access namespace: garden spec: secretRefs: - name: managedresource-virtual-garden-access The shoot-access-virtual-garden Secret will get a .data.token field which can be used to authenticate against the virtual garden cluster. See also this document for more information about the TokenRequestor.\ngardener-apiserver Similar to the virtual-garden-kube-apiserver, the gardener-apiserver also uses a few certificates and other credentials that should not change during the migration. Again, you have to prepare the environment accordingly by leveraging the secret’s manager capabilities.\n The existing ETCD Encryption Key Secret should be labeled with secrets-manager-use-data-for-name=gardener-apiserver-etcd-encryption-key. 
Also note that gardener-operator manages the Service and Endpoints resources for the gardener-apiserver in the virtual cluster within the kube-system namespace (garden-setup uses the garden namespace).\nLocal Development The easiest setup is using a local KinD cluster and the Skaffold based approach to deploy and develop the gardener-operator.\nSetting Up the KinD Cluster (runtime cluster) make kind-operator-up This command sets up a new KinD cluster named gardener-local and stores the kubeconfig in the ./example/gardener-local/kind/operator/kubeconfig file.\n It might be helpful to copy this file to $HOME/.kube/config, since you will need to target this KinD cluster multiple times. Alternatively, make sure to set your KUBECONFIG environment variable to ./example/gardener-local/kind/operator/kubeconfig for all future steps via export KUBECONFIG=$PWD/example/gardener-local/kind/operator/kubeconfig.\n All the following steps assume that you are using this kubeconfig.\nSetting Up Gardener Operator make operator-up This will first build the base images (which might take a bit if you do it for the first time). Afterwards, the Gardener Operator resources will be deployed into the cluster.\nDeveloping Gardener Operator (Optional) make operator-dev This is similar to make operator-up but additionally starts a skaffold dev loop. After the initial deployment, skaffold starts watching source files. Once it has detected changes, press any key to trigger a new build and deployment of the changed components.\nDebugging Gardener Operator (Optional) make operator-debug This is similar to make gardener-debug but for Gardener Operator component. 
Please check Debugging Gardener for details.\nCreating a Garden In order to create a garden, just run:\nkubectl apply -f example/operator/20-garden.yaml You can wait for the Garden to be ready by running:\n./hack/usage/wait-for.sh garden local VirtualGardenAPIServerAvailable VirtualComponentsHealthy Alternatively, you can run kubectl get garden and wait for the RECONCILED status to reach True:\nNAME LAST OPERATION RUNTIME VIRTUAL API SERVER OBSERVABILITY AGE local Processing False False False False 1s (Optional): Instead of creating the above Garden resource manually, you could execute the e2e tests by running:\nmake test-e2e-local-operator Accessing the Virtual Garden Cluster ⚠️ Please note that in this setup, the virtual garden cluster is not accessible by default when you download the kubeconfig and try to communicate with it. The reason is that your host most probably cannot resolve the DNS name of the cluster. Hence, if you want to access the virtual garden cluster, you have to run the following command which will extend your /etc/hosts file with the required information to make the DNS names resolvable:\ncat \u003c\u003cEOF | sudo tee -a /etc/hosts # Manually created to access local Gardener virtual garden cluster. # TODO: Remove this again when the virtual garden cluster access is no longer required. 172.18.255.3 api.virtual-garden.local.gardener.cloud EOF To access the virtual garden, you can acquire a kubeconfig by\nkubectl -n garden get secret gardener -o jsonpath={.data.kubeconfig} | base64 -d \u003e /tmp/virtual-garden-kubeconfig kubectl --kubeconfig /tmp/virtual-garden-kubeconfig get namespaces Note that this kubeconfig uses a token that is valid for only 12h, so it might expire and require you to re-download the kubeconfig.\nCreating Seeds and Shoots You can also create Seeds and Shoots from your local development setup. 
Please see here for details.\nDeleting the Garden ./hack/usage/delete garden local Tear Down the Gardener Operator Environment make operator-down make kind-operator-down ","categories":"","description":"Understand the component responsible for the garden cluster environment and its various features","excerpt":"Understand the component responsible for the garden cluster …","ref":"/docs/gardener/concepts/operator/","tags":"","title":"Gardener Operator"},{"body":"Overview Initially, the gardener-resource-manager was a project similar to the kube-addon-manager. It manages Kubernetes resources in a target cluster which means that it creates, updates, and deletes them. Also, it makes sure that manual modifications to these resources are reconciled back to the desired state.\nIn the Gardener project, we had been using the kube-addon-manager for more than two years. As we progressed with our extensibility story (moving cloud providers out-of-tree), we decided that the kube-addon-manager was no longer suitable for this use case. The problem with it is that it needs to have its managed resources on its file system. This requires storing the resources in ConfigMaps or Secrets and mounting them to the kube-addon-manager pod during deployment time. The gardener-resource-manager uses CustomResourceDefinitions, which allow resources to be added, changed, and removed dynamically with immediate effect and without reconfiguring the volume mounts or restarting the pod.\nMeanwhile, the gardener-resource-manager has evolved into a more generic component comprising several controllers and webhook handlers. It is deployed by gardenlet once per seed (in the garden namespace) and once per shoot (in the respective shoot namespaces in the seed).\nComponent Configuration Similar to other Gardener components, the gardener-resource-manager uses a so-called component configuration file. 
It allows specifying certain central settings like log level and formatting, client connection configuration, server ports and bind addresses, etc. In addition, controllers and webhooks can be configured and sometimes even disabled.\nNote that the very basic ManagedResource and health controllers cannot be disabled.\nYou can find an example configuration file here.\nControllers ManagedResource Controller This controller watches custom objects called ManagedResources in the resources.gardener.cloud/v1alpha1 API group. These objects contain references to secrets, which themselves contain the resources to be managed. The reason why a Secret is used to store the resources is that they could contain confidential information like credentials.\n--- apiVersion: v1 kind: Secret metadata: name: managedresource-example1 namespace: default type: Opaque data: objects.yaml: YXBpVmVyc2lvbjogdjEKa2luZDogQ29uZmlnTWFwCm1ldGFkYXRhOgogIG5hbWU6IHRlc3QtMTIzNAogIG5hbWVzcGFjZTogZGVmYXVsdAotLS0KYXBpVmVyc2lvbjogdjEKa2luZDogQ29uZmlnTWFwCm1ldGFkYXRhOgogIG5hbWU6IHRlc3QtNTY3OAogIG5hbWVzcGFjZTogZGVmYXVsdAo= # apiVersion: v1 # kind: ConfigMap # metadata: # name: test-1234 # namespace: default # --- # apiVersion: v1 # kind: ConfigMap # metadata: # name: test-5678 # namespace: default --- apiVersion: resources.gardener.cloud/v1alpha1 kind: ManagedResource metadata: name: example namespace: default spec: secretRefs: - name: managedresource-example1 In the above example, the controller creates two ConfigMaps in the default namespace. 
When a user manually modifies them, they are reconciled back to the desired state stored in the managedresource-example1 secret.\nIt is also possible to inject labels into all the resources:\n--- apiVersion: v1 kind: Secret metadata: name: managedresource-example2 namespace: default type: Opaque data: other-objects.yaml: YXBpVmVyc2lvbjogYXBwcy92MSAjIGZvciB2ZXJzaW9ucyBiZWZvcmUgMS45LjAgdXNlIGFwcHMvdjFiZXRhMgpraW5kOiBEZXBsb3ltZW50Cm1ldGFkYXRhOgogIG5hbWU6IG5naW54LWRlcGxveW1lbnQKc3BlYzoKICBzZWxlY3RvcjoKICAgIG1hdGNoTGFiZWxzOgogICAgICBhcHA6IG5naW54CiAgcmVwbGljYXM6IDIgIyB0ZWxscyBkZXBsb3ltZW50IHRvIHJ1biAyIHBvZHMgbWF0Y2hpbmcgdGhlIHRlbXBsYXRlCiAgdGVtcGxhdGU6CiAgICBtZXRhZGF0YToKICAgICAgbGFiZWxzOgogICAgICAgIGFwcDogbmdpbngKICAgIHNwZWM6CiAgICAgIGNvbnRhaW5lcnM6CiAgICAgIC0gbmFtZTogbmdpbngKICAgICAgICBpbWFnZTogbmdpbng6MS43LjkKICAgICAgICBwb3J0czoKICAgICAgICAtIGNvbnRhaW5lclBvcnQ6IDgwCg== # apiVersion: apps/v1 # kind: Deployment # metadata: # name: nginx-deployment # spec: # selector: # matchLabels: # app: nginx # replicas: 2 # tells deployment to run 2 pods matching the template # template: # metadata: # labels: # app: nginx # spec: # containers: # - name: nginx # image: nginx:1.7.9 # ports: # - containerPort: 80 --- apiVersion: resources.gardener.cloud/v1alpha1 kind: ManagedResource metadata: name: example namespace: default spec: secretRefs: - name: managedresource-example2 injectLabels: foo: bar In this example, the label foo=bar will be injected into the Deployment, as well as into all created ReplicaSets and Pods.\nPreventing Reconciliations If a ManagedResource is annotated with resources.gardener.cloud/ignore=true, then it will be skipped entirely by the controller (no reconciliations or deletions of managed resources at all). However, when the ManagedResource itself is deleted (for example when a shoot is deleted), then the annotation is not respected and all resources will be deleted as usual. 
This feature can be helpful to temporarily patch/change resources managed as part of such a ManagedResource. Condition checks will be skipped for such ManagedResources.\nModes The gardener-resource-manager can manage a resource in the following supported modes:\n Ignore The corresponding resource is removed from the ManagedResource status (.status.resources). No action is performed on the cluster. The resource is no longer “managed” (updated or deleted). The primary use case is a migration of a resource from one ManagedResource to another one. The mode for a resource can be specified with the resources.gardener.cloud/mode annotation. The annotation should be specified in the encoded resource manifest in the Secret that is referenced by the ManagedResource.\nResource Class and Reconciliation Scope By default, the gardener-resource-manager controller watches for ManagedResources in all namespaces. The .sourceClientConnection.namespace field in the component configuration restricts the watch to ManagedResources in a single namespace only. Note that this setting also affects all other controllers and webhooks since it’s a central configuration.\nA ManagedResource has an optional .spec.class field that allows it to indicate that it belongs to a given class of resources. The .controllers.resourceClass field in the component configuration restricts the watch to ManagedResources with the given .spec.class. A default class is assumed if no class is specified.\nFor instance, the gardener-resource-manager which is deployed in the Shoot’s control plane namespace in the Seed does not specify a .spec.class and watches only for resources in the control plane namespace by specifying it in the .sourceClientConnection.namespace field.\nIf the .spec.class changes, the resources have to be handled by a different Gardener Resource Manager. 
That is achieved by:\n Cleaning all referenced resources by the Gardener Resource Manager that was responsible for the old class in its target cluster. Creating all referenced resources by the Gardener Resource Manager that is responsible for the new class in its target cluster. Conditions A ManagedResource has a ManagedResourceStatus, which has an array of Conditions. Conditions currently include:\n Condition Description ResourcesApplied True if all resources are applied to the target cluster ResourcesHealthy True if all resources are present and healthy ResourcesProgressing False if all resources have been fully rolled out ResourcesApplied may be False when:\n the resource apiVersion is not known to the target cluster the resource spec is invalid (for example the label value does not match the required regex for it) … ResourcesHealthy may be False when:\n the resource is not found the resource is a Deployment and the Deployment does not have the minimum availability. … ResourcesProgressing may be True when:\n a Deployment, StatefulSet or DaemonSet has not been fully rolled out yet, i.e. not all replicas have been updated with the latest changes to spec.template. there are still old Pods belonging to an older ReplicaSet of a Deployment which are not terminated yet. Each Kubernetes resource has a different notion of being healthy. For example, a Deployment is considered healthy if the controller observed its current revision and if the number of updated replicas is equal to the number of replicas.\nThe following status.conditions section describes a healthy ManagedResource:\nconditions: - lastTransitionTime: \"2022-05-03T10:55:39Z\" lastUpdateTime: \"2022-05-03T10:55:39Z\" message: All resources are healthy. reason: ResourcesHealthy status: \"True\" type: ResourcesHealthy - lastTransitionTime: \"2022-05-03T10:55:36Z\" lastUpdateTime: \"2022-05-03T10:55:36Z\" message: All resources have been fully rolled out. 
reason: ResourcesRolledOut status: \"False\" type: ResourcesProgressing - lastTransitionTime: \"2022-05-03T10:55:18Z\" lastUpdateTime: \"2022-05-03T10:55:18Z\" message: All resources are applied. reason: ApplySucceeded status: \"True\" type: ResourcesApplied Ignoring Updates In some cases, it is not desirable to update or re-apply some of the cluster components (for example, if customization is required or needs to be applied by the end-user). For these resources, the annotation “resources.gardener.cloud/ignore” needs to be set to “true” or a truthy value (Truthy values are “1”, “t”, “T”, “true”, “TRUE”, “True”) in the corresponding managed resource secrets. This can be done from the components that create the managed resource secrets, for example Gardener extensions or Gardener. Once this is done, the resource will be initially created and later ignored during reconciliation.\nFinalizing Deletion of Resources After Grace Period When a ManagedResource is deleted, the controller deletes all managed resources from the target cluster. In case the resources still have entries in their .metadata.finalizers[] list, they will remain stuck in the system until another entity removes the finalizers. If you want the controller to forcefully finalize the deletion after some grace period (i.e., setting .metadata.finalizers=null), you can annotate the managed resources with resources.gardener.cloud/finalize-deletion-after=\u003cduration\u003e, e.g., resources.gardener.cloud/finalize-deletion-after=1h.\nPreserving replicas or resources in Workload Resources The objects which are part of the ManagedResource can be annotated with:\n resources.gardener.cloud/preserve-replicas=true in case the .spec.replicas field of workload resources like Deployments, StatefulSets, etc., shall be preserved during updates. 
resources.gardener.cloud/preserve-resources=true in case the .spec.containers[*].resources fields of all containers of workload resources like Deployments, StatefulSets, etc., shall be preserved during updates. This can be useful if there are non-standard horizontal/vertical auto-scaling mechanisms in place. Standard mechanisms like HorizontalPodAutoscaler or VerticalPodAutoscaler will be auto-recognized by gardener-resource-manager, i.e., in such cases the annotations are not needed.\n Origin All the objects managed by the resource manager get a dedicated annotation resources.gardener.cloud/origin describing the ManagedResource object that describes this object. The default format is \u003cnamespace\u003e/\u003cobjectname\u003e.\nIn multi-cluster scenarios (the ManagedResource objects are maintained in a cluster different from the one the described objects are managed), it might be useful to include the cluster identity, as well.\nThis can be enforced by setting the .controllers.clusterID field in the component configuration. Here, several possibilities are supported:\n given a direct value: use this as id for the source cluster. \u003ccluster\u003e: read the cluster identity from a cluster-identity config map in the kube-system namespace (attribute cluster-identity). This is automatically maintained in all clusters managed or involved in a gardener landscape. \u003cdefault\u003e: try to read the cluster identity from the config map. If not found, no identity is used. empty string: no cluster identity is used (completely cluster local scenarios). By default, cluster id is not used. If cluster id is specified, the format is \u003ccluster id\u003e:\u003cnamespace\u003e/\u003cobjectname\u003e.\nIn addition to the origin annotation, all objects managed by the resource manager get a dedicated label resources.gardener.cloud/managed-by. This label can be used to describe these objects with a selector. 
By default it is set to “gardener”, but this can be overwritten by setting the .controllers.managedResources.managedByLabelValue field in the component configuration.\nCompression The number and size of manifests for a ManagedResource can accumulate to a considerable amount which leads to increased Secret data. A decent compression algorithm helps to reduce the footprint of such Secrets and the load they put on etcd, the kube-apiserver, and client caches. We found Brotli to be a suitable candidate for most use cases (see comparison table here). When the gardener-resource-manager detects a data key with the known suffix .br, it automatically decompresses the data before processing the contained manifest.\nhealth Controller This controller processes ManagedResources that were reconciled by the main ManagedResource Controller at least once. Its main job is to perform checks for maintaining the well-known conditions ResourcesHealthy and ResourcesProgressing.\nProgressing Checks In Kubernetes, applied changes must usually be rolled out first, e.g. when changing the base image in a Deployment. Progressing checks detect ongoing roll-outs and report them in the ResourcesProgressing condition of the corresponding ManagedResource.\nThe following object kinds are considered for progressing checks:\n DaemonSet Deployment StatefulSet Prometheus Alertmanager Certificate Issuer Health Checks gardener-resource-manager can evaluate the health of specific resources, often by consulting their conditions. 
Health check results are regularly updated in the ResourcesHealthy condition of the corresponding ManagedResource.\nThe following object kinds are considered for health checks:\n CustomResourceDefinition DaemonSet Deployment Job Pod ReplicaSet ReplicationController Service StatefulSet VerticalPodAutoscaler Prometheus Alertmanager Certificate Issuer Skipping Health Check If a resource owned by a ManagedResource is annotated with resources.gardener.cloud/skip-health-check=true, then the resource will be skipped during health checks by the health controller. The ManagedResource conditions will not reflect the health condition of this resource anymore. The ResourcesProgressing condition will also be set to False.\nGarbage Collector For Immutable ConfigMaps/Secrets In Kubernetes, workload resources (e.g., Pods) can mount ConfigMaps or Secrets or reference them via environment variables in containers. Typically, when the content of such a ConfigMap/Secret gets changed, then the respective workload is usually not dynamically reloading the configuration, i.e., a restart is required. The most commonly used approach is probably having the so-called checksum annotations in the pod template, which makes Kubernetes recreate the pod if the checksum changes. However, it has the downside that old, still running versions of the workload might not be able to properly work with the already updated content in the ConfigMap/Secret, potentially causing application outages.\nIn order to protect users from such outages (and also to improve the performance of the cluster), the Kubernetes community provides the “immutable ConfigMaps/Secrets feature”. Enabling immutability requires ConfigMaps/Secrets to have unique names. 
Having unique names requires the client to delete ConfigMaps/Secrets no longer in use.\nIn order to provide a similarly lightweight experience for clients (compared to the well-established checksum annotation approach), the gardener-resource-manager features an optional garbage collector controller (disabled by default). The purpose of this controller is cleaning up such immutable ConfigMaps/Secrets if they are no longer in use.\nHow Does the Garbage Collector Work? The following algorithm is implemented in the GC controller:\n List all ConfigMaps and Secrets labeled with resources.gardener.cloud/garbage-collectable-reference=true. List all Deployments, StatefulSets, DaemonSets, Jobs, CronJobs, Pods, ManagedResources and for each of them: iterate over the .metadata.annotations and for each of them: If the annotation key follows the reference.resources.gardener.cloud/{configmap,secret}-\u003chash\u003e scheme and the value equals \u003cname\u003e, then consider it as “in-use”. Delete all ConfigMaps and Secrets not considered as “in-use”. Consequently, clients need to:\n Create immutable ConfigMaps/Secrets with unique names (e.g., a checksum suffix based on the .data).\n Label such ConfigMaps/Secrets with resources.gardener.cloud/garbage-collectable-reference=true.\n Annotate their workload resources with reference.resources.gardener.cloud/{configmap,secret}-\u003chash\u003e=\u003cname\u003e for all ConfigMaps/Secrets used by the containers of the respective Pods.\n⚠️ Add such annotations to .metadata.annotations, as well as to all templates of other resources (e.g., .spec.template.metadata.annotations in Deployments or .spec.jobTemplate.metadata.annotations and .spec.jobTemplate.spec.template.metadata.annotations for CronJobs). 
This ensures that the GC controller does not unintentionally consider ConfigMaps/Secrets as “not in use” just because there isn’t a Pod referencing them anymore (e.g., they could still be used by a Deployment scaled down to 0).\n ℹ️ For the last step, there is a helper function InjectAnnotations in the pkg/controller/garbagecollector/references, which you can use for your convenience.\nExample:\n--- apiVersion: v1 kind: ConfigMap metadata: name: test-1234 namespace: default labels: resources.gardener.cloud/garbage-collectable-reference: \"true\" --- apiVersion: v1 kind: ConfigMap metadata: name: test-5678 namespace: default labels: resources.gardener.cloud/garbage-collectable-reference: \"true\" --- apiVersion: v1 kind: Pod metadata: name: example namespace: default annotations: reference.resources.gardener.cloud/configmap-82a3537f: test-5678 spec: containers: - name: nginx image: nginx:1.14.2 terminationGracePeriodSeconds: 2 The GC controller would delete the ConfigMap/test-1234 because it is considered as not “in-use”.\nℹ️ If the GC controller is activated then the ManagedResource controller will no longer delete ConfigMaps/Secrets having the above label.\nHow to Activate the Garbage Collector? The GC controller can be activated by setting the .controllers.garbageCollector.enabled field to true in the component configuration.\nTokenInvalidator Controller The Kubernetes community is slowly transitioning from static ServiceAccount token Secrets to ServiceAccount Token Volume Projection. 
Typically, when you create a ServiceAccount\napiVersion: v1 kind: ServiceAccount metadata: name: default then the serviceaccount-token controller (part of kube-controller-manager) auto-generates a Secret with a static token:\napiVersion: v1 kind: Secret metadata: annotations: kubernetes.io/service-account.name: default kubernetes.io/service-account.uid: 86e98645-2e05-11e9-863a-b2d4d086dd5a name: default-token-ntxs9 type: kubernetes.io/service-account-token data: ca.crt: base64(cluster-ca-cert) namespace: base64(namespace) token: base64(static-jwt-token) Unfortunately, when using ServiceAccount Token Volume Projection in a Pod, this static token is actually not used at all:\napiVersion: v1 kind: Pod metadata: name: nginx spec: serviceAccountName: default containers: - image: nginx name: nginx volumeMounts: - mountPath: /var/run/secrets/tokens name: token volumes: - name: token projected: sources: - serviceAccountToken: path: token expirationSeconds: 7200 While the Pod is now using an expiring and auto-rotated token, the static token is still generated and valid.\nThere is neither a way of preventing kube-controller-manager from generating such static tokens, nor a way to proactively remove or invalidate them:\n https://github.com/kubernetes/kubernetes/issues/77599 https://github.com/kubernetes/kubernetes/issues/77600 Disabling the serviceaccount-token controller is an option; however, especially in the Gardener context, it may either break end-users or it may not even be possible to control such settings. Also, even if a future Kubernetes version supports native configuration of the above behaviour, Gardener still supports older versions which won’t get such features but need a solution as well.\nThis is where the TokenInvalidator comes into play: Since it is not possible to prevent kube-controller-manager from generating static ServiceAccount Secrets, the TokenInvalidator, as its name suggests, simply invalidates these tokens. 
It considers all such Secrets belonging to ServiceAccounts with .automountServiceAccountToken=false. By default, all namespaces in the target cluster are watched; however, this can be configured by specifying the .targetClientConnection.namespace field in the component configuration. Note that this setting also affects all other controllers and webhooks since it’s a central configuration.\napiVersion: v1 kind: ServiceAccount metadata: name: my-serviceaccount automountServiceAccountToken: false This will result in a static ServiceAccount token secret whose token value is invalid:\napiVersion: v1 kind: Secret metadata: annotations: kubernetes.io/service-account.name: my-serviceaccount kubernetes.io/service-account.uid: 86e98645-2e05-11e9-863a-b2d4d086dd5a name: my-serviceaccount-token-ntxs9 type: kubernetes.io/service-account-token data: ca.crt: base64(cluster-ca-cert) namespace: base64(namespace) token: AAAA Any attempt to regenerate the token or to create a new such secret will make the component invalidate it again.\n You can opt out of this behaviour for ServiceAccounts setting .automountServiceAccountToken=false by labeling them with token-invalidator.resources.gardener.cloud/skip=true.\n In order to enable the TokenInvalidator you have to set both .controllers.tokenValidator.enabled=true and .webhooks.tokenValidator.enabled=true in the component configuration.\nThe below graphic shows an overview of the Token Invalidator for Service account secrets in the Shoot cluster. TokenRequestor Controller This controller provides the service to create and auto-renew tokens via the TokenRequest API.\nIt provides a functionality similar to the kubelet’s Service Account Token Volume Projection. 
It was created to handle the special case of issuing tokens to pods that run in a different cluster than the API server they communicate with (hence, using the native token volume projection feature is not possible).\nThe controller differentiates between source cluster and target cluster. The source cluster hosts the gardener-resource-manager pod. Secrets in this cluster are watched and modified by the controller. The target cluster can be configured to point to another cluster. The existence of ServiceAccounts is ensured and token requests are issued against the target. When the gardener-resource-manager is deployed next to the Shoot’s controlplane in the Seed, the source cluster is the Seed while the target cluster points to the Shoot.\nReconciliation Loop This controller reconciles Secrets in all namespaces in the source cluster with the label: resources.gardener.cloud/purpose=token-requestor. See this YAML file for an example of the secret.\nThe controller ensures a ServiceAccount exists in the target cluster as specified in the annotations of the Secret in the source cluster:\nserviceaccount.resources.gardener.cloud/name: \u003csa-name\u003e serviceaccount.resources.gardener.cloud/namespace: \u003csa-namespace\u003e You can optionally annotate the Secret with serviceaccount.resources.gardener.cloud/labels, e.g. serviceaccount.resources.gardener.cloud/labels={\"some\":\"labels\",\"foo\":\"bar\"}. This will cause the ServiceAccount to be labelled accordingly.\nThe requested tokens will act with the privileges which are assigned to this ServiceAccount.\nThe controller will then request a token via the TokenRequest API and populate it into the .data.token field of the Secret in the source cluster.\nAlternatively, the client can provide a raw kubeconfig (in YAML or JSON format) via the Secret’s .data.kubeconfig field. The controller will then populate the requested token in the kubeconfig for the user used in the .current-context. 
For example, if .data.kubeconfig is\napiVersion: v1 clusters: - cluster: certificate-authority-data: AAAA server: some-server-url name: shoot--foo--bar contexts: - context: cluster: shoot--foo--bar user: shoot--foo--bar-token name: shoot--foo--bar current-context: shoot--foo--bar kind: Config preferences: {} users: - name: shoot--foo--bar-token user: token: \"\" then the .users[0].user.token field of the kubeconfig will be updated accordingly.\nThe controller also adds an annotation to the Secret to keep track of when to renew the token before it expires. By default, the tokens are issued to expire after 12 hours. The expiration time can be set with the following annotation:\nserviceaccount.resources.gardener.cloud/token-expiration-duration: 6h It automatically renews once 80% of the lifetime is reached, or after 24h.\nOptionally, the controller can also populate the token into a Secret in the target cluster. This can be requested by annotating the Secret in the source cluster with:\ntoken-requestor.resources.gardener.cloud/target-secret-name: \"foo\" token-requestor.resources.gardener.cloud/target-secret-namespace: \"bar\" Overall, the TokenRequestor controller provides credentials with limited lifetime (JWT tokens) used by Shoot control plane components running in the Seed to talk to the Shoot API Server. Please see the graphic below:\n ℹ️ Generally, the controller can run with multiple instances in different components. For example, gardener-resource-manager might run the TokenRequestor controller, but gardenlet might run it, too. In order to differentiate which instance of the controller is responsible for a Secret, it can be labeled with resources.gardener.cloud/class=\u003cclass\u003e. 
The \u003cclass\u003e must be configured in the respective controller, otherwise it will be responsible for all Secrets no matter whether they have the label or not.\n Kubelet Server CertificateSigningRequest Approver Gardener configures the kubelets such that they request two certificates via the CertificateSigningRequest API:\n client certificate for communicating with the kube-apiserver server certificate for serving its HTTPS server For client certificates, the kubernetes.io/kube-apiserver-client-kubelet signer is used (see Certificate Signing Requests for more details). The kube-controller-manager’s csrapprover controller is responsible for auto-approving such CertificateSigningRequests so that the respective certificates can be issued.\nFor server certificates, the kubernetes.io/kubelet-serving signer is used. Unfortunately, the kube-controller-manager is not able to auto-approve such CertificateSigningRequests (see kubernetes/kubernetes#73356 for details).\nThat’s the motivation for having this controller as part of gardener-resource-manager. It watches CertificateSigningRequests with the kubernetes.io/kubelet-serving signer and auto-approves them when all the following conditions are met:\n The .spec.username is prefixed with system:node:. There must be at least one DNS name or IP address as part of the certificate SANs. The common name in the CSR must match the .spec.username. The organization in the CSR must only contain system:nodes. There must be a Node object with the same name in the shoot cluster. There must be exactly one Machine for the node in the seed cluster. The DNS names part of the SANs must be equal to all .status.addresses[] of type Hostname in the Node. The IP addresses part of the SANs must be equal to all .status.addresses[] of type InternalIP in the Node. If any one of these requirements is violated, the CertificateSigningRequest will be denied. 
Otherwise, once approved, the kube-controller-manager’s csrsigner controller will issue the requested certificate.\nNetworkPolicy Controller This controller reconciles Services with a non-empty .spec.selector. It creates two NetworkPolicys for each port in the .spec.ports[] list. For example:\napiVersion: v1 kind: Service metadata: name: gardener-resource-manager namespace: a spec: selector: app: gardener-resource-manager ports: - name: server port: 443 protocol: TCP targetPort: 10250 leads to\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows ingress TCP traffic to port 10250 for pods selected by the a/gardener-resource-manager service selector from pods running in namespace a labeled with map[networking.resources.gardener.cloud/to-gardener-resource-manager-tcp-10250:allowed]. name: ingress-to-gardener-resource-manager-tcp-10250 namespace: a spec: ingress: - from: - podSelector: matchLabels: networking.resources.gardener.cloud/to-gardener-resource-manager-tcp-10250: allowed ports: - port: 10250 protocol: TCP podSelector: matchLabels: app: gardener-resource-manager policyTypes: - Ingress --- apiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows egress TCP traffic to port 10250 from pods running in namespace a labeled with map[networking.resources.gardener.cloud/to-gardener-resource-manager-tcp-10250:allowed] to pods selected by the a/gardener-resource-manager service selector. 
name: egress-to-gardener-resource-manager-tcp-10250 namespace: a spec: egress: - to: - podSelector: matchLabels: app: gardener-resource-manager ports: - port: 10250 protocol: TCP podSelector: matchLabels: networking.resources.gardener.cloud/to-gardener-resource-manager-tcp-10250: allowed policyTypes: - Egress A component that initiates the connection to gardener-resource-manager’s tcp/10250 port can now be labeled with networking.resources.gardener.cloud/to-gardener-resource-manager-tcp-10250=allowed. That’s all this component needs to do - it does not need to create any NetworkPolicys itself.\nCross-Namespace Communication Apart from this “simple” case where both communicating components run in the same namespace a, there is also the cross-namespace communication case. With above example, let’s say there are components running in another namespace b, and they would like to initiate the communication with gardener-resource-manager in a. To cover this scenario, the Service can be annotated with networking.resources.gardener.cloud/namespace-selectors='[{\"matchLabels\":{\"kubernetes.io/metadata.name\":\"b\"}}]'.\n Note that you can specify multiple namespace selectors in this annotation which are OR-ed.\n This will make the controller create additional NetworkPolicys as follows:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows ingress TCP traffic to port 10250 for pods selected by the a/gardener-resource-manager service selector from pods running in namespace b labeled with map[networking.resources.gardener.cloud/to-a-gardener-resource-manager-tcp-10250:allowed]. 
name: ingress-to-gardener-resource-manager-tcp-10250-from-b namespace: a spec: ingress: - from: - namespaceSelector: matchLabels: kubernetes.io/metadata.name: b podSelector: matchLabels: networking.resources.gardener.cloud/to-a-gardener-resource-manager-tcp-10250: allowed ports: - port: 10250 protocol: TCP podSelector: matchLabels: app: gardener-resource-manager policyTypes: - Ingress --- apiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows egress TCP traffic to port 10250 from pods running in namespace b labeled with map[networking.resources.gardener.cloud/to-a-gardener-resource-manager-tcp-10250:allowed] to pods selected by the a/gardener-resource-manager service selector. name: egress-to-a-gardener-resource-manager-tcp-10250 namespace: b spec: egress: - to: - namespaceSelector: matchLabels: kubernetes.io/metadata.name: a podSelector: matchLabels: app: gardener-resource-manager ports: - port: 10250 protocol: TCP podSelector: matchLabels: networking.resources.gardener.cloud/to-a-gardener-resource-manager-tcp-10250: allowed policyTypes: - Egress The components in namespace b now need to be labeled with networking.resources.gardener.cloud/to-a-gardener-resource-manager-tcp-10250=allowed, but that’s already it.\n Obviously, this approach also works for namespace selectors different from kubernetes.io/metadata.name to cover scenarios where the namespace name is not known upfront or where multiple namespaces with a similar label are relevant. 
The controller creates two dedicated policies for each namespace matching the selectors.\n Service Targets In Multiple Namespaces Finally, let’s say there is a Service called example which exists in different namespaces whose names are not static (e.g., foo-1, foo-2), and a component in namespace bar wants to initiate connections with all of them.\nThe example Services in these namespaces can now be annotated with networking.resources.gardener.cloud/namespace-selectors='[{\"matchLabels\":{\"kubernetes.io/metadata.name\":\"bar\"}}]'. As a consequence, the component in namespace bar now needs to be labeled with networking.resources.gardener.cloud/to-foo-1-example-tcp-8080=allowed, networking.resources.gardener.cloud/to-foo-2-example-tcp-8080=allowed, etc. This approach does not work in practice, however, since the namespace names are neither static nor known upfront.\nTo overcome this, it is possible to specify an alias for the concrete namespace in the pod label selector via the networking.resources.gardener.cloud/pod-label-selector-namespace-alias annotation.\nIn above case, the example Service in the foo-* namespaces could be annotated with networking.resources.gardener.cloud/pod-label-selector-namespace-alias=all-foos. This would modify the label selector in all NetworkPolicys related to cross-namespace communication, i.e. instead of networking.resources.gardener.cloud/to-foo-{1,2,...}-example-tcp-8080=allowed, networking.resources.gardener.cloud/to-all-foos-example-tcp-8080=allowed would be used. Now the component in namespace bar only needs this single label and is able to talk to all such Services in the different namespaces.\n Real-world examples for this scenario are the kube-apiserver Service (which exists in all shoot namespaces), or the istio-ingressgateway Service (which exists in all istio-ingress* namespaces). 
In both cases, the names of the namespaces are not statically known and depend on user input.\n Overwriting The Pod Selector Label For a component which initiates the connection to many other components, it’s sometimes impractical to specify all the respective labels in its pod template. For example, let’s say a component foo talks to bar{0..9} on ports tcp/808{0..9}. foo would need to have the ten networking.resources.gardener.cloud/to-bar{0..9}-tcp-808{0..9}=allowed labels.\nAs an alternative and to simplify this, it is also possible to annotate the targeted Services with networking.resources.gardener.cloud/from-\u003csome-alias\u003e-allowed-ports. For our example, \u003csome-alias\u003e could be all-bars.\nAs a result, component foo just needs to have the label networking.resources.gardener.cloud/to-all-bars=allowed instead of all the other ten explicit labels.\n⚠️ Note that this also requires specifying the list of allowed container ports as the annotation value since the pod selector label will no longer be specific to a dedicated service/port. For our example, the Service for barX with X in {0..9} needs to be annotated with networking.resources.gardener.cloud/from-all-bars-allowed-ports=[{\"port\":808X,\"protocol\":\"TCP\"}] in addition.\n Real-world examples for this scenario are the Prometheis in seed clusters which initiate the communication to a lot of components in order to scrape their metrics. Another example is the kube-apiserver which initiates the communication to webhook servers (potentially of extension components that are not known by Gardener itself).\n Ingress From Everywhere All of the above scenarios are about components initiating connections to some targets. However, some components also receive incoming traffic from sources outside the cluster. 
This traffic requires adequate ingress policies so that it can be allowed.\nTo cover this scenario, the Service can be annotated with networking.resources.gardener.cloud/from-world-to-ports=[{\"port\":\"10250\",\"protocol\":\"TCP\"}]. As a result, the controller creates the following NetworkPolicy:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: ingress-to-gardener-resource-manager-from-world namespace: a spec: ingress: - from: - namespaceSelector: {} podSelector: {} - ipBlock: cidr: 0.0.0.0/0 - ipBlock: cidr: ::/0 ports: - port: 10250 protocol: TCP podSelector: matchLabels: app: gardener-resource-manager policyTypes: - Ingress The respective pods don’t need any additional labels. If the annotation’s value is empty ([]) then all ports are allowed.\nServices Exposed via Ingress Resources The controller can optionally be configured to watch Ingress resources by specifying the pod and namespace selectors for the Ingress controller. If this information is provided, it automatically creates NetworkPolicy resources allowing the respective ingress/egress traffic for the backends exposed by the Ingresses. 
This way, neither custom NetworkPolicys nor custom labels must be provided.\nThe needed configuration is part of the component configuration:\ncontrollers: networkPolicy: enabled: true concurrentSyncs: 5 # namespaceSelectors: # - matchLabels: # kubernetes.io/metadata.name: default ingressControllerSelector: namespace: default podSelector: matchLabels: foo: bar As an example, let’s assume that above gardener-resource-manager Service was exposed via the following Ingress resource:\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: gardener-resource-manager namespace: a spec: rules: - host: grm.foo.example.com http: paths: - backend: service: name: gardener-resource-manager port: number: 443 path: / pathType: Prefix As a result, the controller would automatically create the following NetworkPolicys:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows ingress TCP traffic to port 10250 for pods selected by the a/gardener-resource-manager service selector from ingress controller pods running in the default namespace labeled with map[foo:bar]. name: ingress-to-gardener-resource-manager-tcp-10250-from-ingress-controller namespace: a spec: ingress: - from: - podSelector: matchLabels: foo: bar namespaceSelector: matchLabels: kubernetes.io/metadata.name: default ports: - port: 10250 protocol: TCP podSelector: matchLabels: app: gardener-resource-manager policyTypes: - Ingress --- apiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: annotations: gardener.cloud/description: Allows egress TCP traffic to port 10250 from pods running in the default namespace labeled with map[foo:bar] to pods selected by the a/gardener-resource-manager service selector. 
name: egress-to-a-gardener-resource-manager-tcp-10250-from-ingress-controller namespace: default spec: egress: - to: - podSelector: matchLabels: app: gardener-resource-manager namespaceSelector: matchLabels: kubernetes.io/metadata.name: a ports: - port: 10250 protocol: TCP podSelector: matchLabels: foo: bar policyTypes: - Egress ℹ️ Note that Ingress resources reference the service port while NetworkPolicys reference the target port/container port. The controller automatically translates this when reconciling the NetworkPolicy resources.\n Node Controller Critical Components Controller Gardenlet configures kubelet of shoot worker nodes to register the Node object with the node.gardener.cloud/critical-components-not-ready taint (effect NoSchedule). This controller watches newly created Node objects in the shoot cluster and removes the taint once all node-critical components are scheduled and ready. If the controller finds node-critical components that are not scheduled or not ready yet, it checks the Node again after the duration configured in ResourceManagerConfiguration.controllers.node.backoff. Please refer to the feature documentation or proposal issue for more details.\nNode Agent Reconciliation Delay Controller This controller computes a reconciliation delay per node by using a simple linear mapping approach based on the index of the nodes in the list of all nodes in the shoot cluster. This approach ensures that the delays of all instances of gardener-node-agent are distributed evenly.\nThe minimum and maximum delays can be configured, but they are defaulted to 0s and 5m, respectively.\nThis approach works well as long as the number of nodes in the cluster is not higher than the configured maximum delay in seconds. If there are more nodes than that, the delay is still computed linearly; however, the more nodes exist in the cluster, the closer the delay times become (which might be of limited use then). 
Consider increasing the maximum delay by annotating the Shoot with shoot.gardener.cloud/cloud-config-execution-max-delay-seconds=\u003cvalue\u003e. The highest possible value is 1800.\nThe controller adds the node-agent.gardener.cloud/reconciliation-delay annotation to nodes whose value is read by the node-agents.\nWebhooks Mutating Webhooks High Availability Config This webhook is used to conveniently apply the configuration to make components deployed to seed or shoot clusters highly available. The details and scenarios are described in High Availability Of Deployed Components.\nThe webhook reacts on creation/update of Deployments, StatefulSets, HorizontalPodAutoscalers and HVPAs in namespaces labeled with high-availability-config.resources.gardener.cloud/consider=true.\nThe webhook performs the following actions:\n The .spec.replicas (or spec.minReplicas respectively) field is mutated based on the high-availability-config.resources.gardener.cloud/type label of the resource and the high-availability-config.resources.gardener.cloud/failure-tolerance-type annotation of the namespace:\n Failure Tolerance Type ➡️\n/\n⬇️ Component Type️ ️ unset empty non-empty controller 2 1 2 server 2 2 2 The replica count values can be overwritten by the high-availability-config.resources.gardener.cloud/replicas annotation. It does NOT mutate the replicas when: the replicas are already set to 0 (hibernation case), or when the resource is scaled horizontally by HorizontalPodAutoscaler or Hvpa, and the current replica count is higher than what was computed above. 
When the high-availability-config.resources.gardener.cloud/zones annotation is NOT empty and either the high-availability-config.resources.gardener.cloud/failure-tolerance-type annotation is set or the high-availability-config.resources.gardener.cloud/zone-pinning annotation is set to true, then it adds a node affinity to the pod template spec:\nspec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: topology.kubernetes.io/zone operator: In values: - \u003czone1\u003e # - ... This ensures that all pods are pinned to only nodes in exactly those concrete zones.\n Topology Spread Constraints are added to the pod template spec when the .spec.replicas are greater than 1. When the high-availability-config.resources.gardener.cloud/zones annotation …\n … contains only one zone, then the following is added:\nspec: topologySpreadConstraints: - topologyKey: kubernetes.io/hostname minDomains: 3 # lower value of max replicas or 3 maxSkew: 1 whenUnsatisfiable: ScheduleAnyway labelSelector: ... This ensures that the (multiple) pods are scheduled across nodes. minDomains is set when failure tolerance is configured or annotation high-availability-config.resources.gardener.cloud/host-spread=\"true\" is given.\n … contains at least two zones, then the following is added:\nspec: topologySpreadConstraints: - topologyKey: kubernetes.io/hostname maxSkew: 1 whenUnsatisfiable: ScheduleAnyway labelSelector: ... - topologyKey: topology.kubernetes.io/zone minDomains: 2 # lower value of max replicas or number of zones maxSkew: 1 whenUnsatisfiable: DoNotSchedule labelSelector: ... This enforces that the (multiple) pods are scheduled across zones. It circumvents a known limitation in Kubernetes for clusters \u003c 1.26 (ref kubernetes/kubernetes#109364). In case the number of replicas is larger than twice the number of zones, maxSkew=2 is set for the second spread constraint. 
The minDomains calculation is based on whatever value is lower - (maximum) replicas or number of zones. This is the number of minimum domains required to schedule pods in a highly available manner.\n Independent of the number of zones, when one of the following conditions is true, then the field whenUnsatisfiable is set to DoNotSchedule for the constraint with topologyKey=kubernetes.io/hostname (which enforces the node-spread):\n The high-availability-config.resources.gardener.cloud/host-spread annotation is set to true. The high-availability-config.resources.gardener.cloud/failure-tolerance-type annotation is set and NOT empty. Adds default tolerations for taint-based evictions:\nTolerations for taints node.kubernetes.io/not-ready and node.kubernetes.io/unreachable are added to the handled Deployment and StatefulSet if their podTemplates do not already specify them. The TolerationSeconds are taken from the respective configuration section of the webhook’s configuration (see example).\nWe consider fine-tuned values for those tolerations a matter of high-availability because they often help to reduce recovery times in case of node or zone outages, also see High-Availability Best Practices. In addition, this webhook handling helps to set defaults for many but not all workload components in a cluster. For instance, Gardener can use this webhook to set defaults for nearly every component in seed clusters but only for the system components in shoot clusters. Any customer workload remains unchanged.\n Kubernetes Service Host Injection By default, when Pods are created, Kubernetes implicitly injects the KUBERNETES_SERVICE_HOST environment variable into all containers. The value of this variable points it to the default Kubernetes service (i.e., kubernetes.default.svc.cluster.local). 
This allows pods to conveniently talk to the API server of their cluster.\nIn shoot clusters, this network path involves the apiserver-proxy DaemonSet which eventually forwards the traffic to the API server. Hence, it results in an additional network hop.\nThe purpose of this webhook is to explicitly inject the KUBERNETES_SERVICE_HOST environment variable into all containers and set its value to the FQDN of the API server. This way, the additional network hop is avoided.\nAuto-Mounting Projected ServiceAccount Tokens When this webhook is activated, it automatically injects projected ServiceAccount token volumes into Pods and all their containers if all of the following preconditions are fulfilled:\n The Pod is NOT labeled with projected-token-mount.resources.gardener.cloud/skip=true. The Pod’s .spec.serviceAccountName field is NOT empty and NOT set to default. The ServiceAccount specified in the Pod’s .spec.serviceAccountName sets .automountServiceAccountToken=false. The Pod’s .spec.volumes[] DO NOT already contain a volume with a name prefixed with kube-api-access-. The projected volume will look as follows:\nspec: volumes: - name: kube-api-access-gardener projected: defaultMode: 420 sources: - serviceAccountToken: expirationSeconds: 43200 path: token - configMap: items: - key: ca.crt path: ca.crt name: kube-root-ca.crt - downwardAPI: items: - fieldRef: apiVersion: v1 fieldPath: metadata.namespace path: namespace The expirationSeconds are defaulted to 12h and can be overwritten with the .webhooks.projectedTokenMount.expirationSeconds field in the component configuration, or with the projected-token-mount.resources.gardener.cloud/expiration-seconds annotation on a Pod resource.\n The volume will be mounted into all containers specified in the Pod to the path /var/run/secrets/kubernetes.io/serviceaccount. This is the default location where client libraries expect to find the tokens and mimics the upstream ServiceAccount admission plugin. 
See Managing Service Accounts for more information.\nOverall, this webhook is used to inject projected service account tokens into pods running in the Shoot and the Seed cluster. Hence, it is served from the Seed GRM and each Shoot GRM. Please find an overview below for pods deployed in the Shoot cluster:\nPod Topology Spread Constraints When this webhook is enabled, then it mimics the topologyKey feature for Topology Spread Constraints (TSC) on the label pod-template-hash. Concretely, when a pod is labelled with pod-template-hash, the handler of this webhook extends any topology spread constraint in the pod:\nmetadata: labels: pod-template-hash: 123abc spec: topologySpreadConstraints: - maxSkew: 1 topologyKey: topology.kubernetes.io/zone whenUnsatisfiable: DoNotSchedule labelSelector: matchLabels: pod-template-hash: 123abc # added by webhook The procedure circumvents a known limitation with TSCs which leads to imbalanced deployments after rolling updates. Gardener enables this webhook to schedule pods of deployments across nodes and zones.\nPlease note that the gardener-resource-manager itself as well as pods labelled with topology-spread-constraints.resources.gardener.cloud/skip are excluded from any mutations.\nSystem Components Webhook If enabled, this webhook handles scheduling concerns for system components Pods (except those managed by DaemonSets). The following tasks are performed by this webhook:\n Add pod.spec.nodeSelector as given in the webhook configuration. Add pod.spec.tolerations as given in the webhook configuration. Add pod.spec.tolerations for any existing nodes matching the node selector given in the webhook configuration. Known taints and tolerations used for taint based evictions are disregarded. Gardener enables this webhook for kube-system and kubernetes-dashboard namespaces in shoot clusters, selecting Pods being labelled with resources.gardener.cloud/managed-by: gardener. 
It adds a configuration, so that Pods will get the worker.gardener.cloud/system-components: true node selector (step 1) as well as tolerate any custom taint (step 2) that is added to system component worker nodes (shoot.spec.provider.workers[].systemComponents.allow: true). In addition, the webhook merges these tolerations with the ones required for the system component Nodes available in the cluster at that time (step 3). Both are required to ensure system component Pods can be scheduled or executed during an active shoot reconciliation that is happening due to any modifications to shoot.spec.provider.workers[].taints, e.g. Pods must be scheduled while there are still Nodes not having the updated taint configuration.\n You can opt out of this behaviour for Pods by labeling them with system-components-config.resources.gardener.cloud/skip=true.\n EndpointSlice Hints This webhook mutates EndpointSlices. For each endpoint in the EndpointSlice, it sets the endpoint’s hints to the endpoint’s zone.\napiVersion: discovery.k8s.io/v1 kind: EndpointSlice metadata: name: example-hints endpoints: - addresses: - \"10.1.2.3\" conditions: ready: true hostname: pod-1 zone: zone-a hints: forZones: - name: \"zone-a\" # added by webhook - addresses: - \"10.1.2.4\" conditions: ready: true hostname: pod-2 zone: zone-b hints: forZones: - name: \"zone-b\" # added by webhook The webhook aims to circumvent issues with the Kubernetes TopologyAwareHints feature that currently does not allow achieving deterministic topology-aware traffic routing. For more details, see the following issue kubernetes/kubernetes#113731 that describes drawbacks of the TopologyAwareHints feature for our use case. 
If the above-mentioned issue gets resolved and there is native support for deterministic topology-aware traffic routing in Kubernetes, then this webhook can be dropped in favor of the native Kubernetes feature.\nValidating Webhooks Unconfirmed Deletion Prevention For Custom Resources And Definitions As part of Gardener’s extensibility concepts, a lot of CustomResourceDefinitions are deployed to the seed clusters that serve as extension points for provider-specific controllers. For example, the Infrastructure CRD triggers the provider extension to prepare the IaaS infrastructure of the underlying cloud provider for a to-be-created shoot cluster. Consequently, these extension CRDs have a lot of power and control large portions of the end-user’s shoot cluster. Accidental or undesired deletions of those resources can cause tremendous and hard-to-recover-from outages and should be prevented.\nWhen this webhook is activated, it reacts for CustomResourceDefinitions and most of the custom resources in the extensions.gardener.cloud/v1alpha1 API group. It also reacts for the druid.gardener.cloud/v1alpha1.Etcd resources.\nThe webhook prevents DELETE requests for those CustomResourceDefinitions labeled with gardener.cloud/deletion-protected=true, and for all mentioned custom resources if they were not previously annotated with the confirmation.gardener.cloud/deletion=true. This prevents undesired kubectl delete \u003c...\u003e requests from being accepted.\nExtension Resource Validation When this webhook is activated, it reacts for most of the custom resources in the extensions.gardener.cloud/v1alpha1 API group. 
It also reacts for the druid.gardener.cloud/v1alpha1.Etcd resources.\nThe webhook validates the resources specifications for CREATE and UPDATE requests.\n","categories":"","description":"Set of controllers with different responsibilities running once per seed and once per shoot","excerpt":"Set of controllers with different responsibilities running once per …","ref":"/docs/gardener/concepts/resource-manager/","tags":"","title":"Gardener Resource Manager"},{"body":"Overview The Gardener Scheduler is in essence a controller that watches newly created shoots and assigns a seed cluster to them. Conceptually, the task of the Gardener Scheduler is very similar to the task of the Kubernetes Scheduler: finding a seed for a shoot instead of a node for a pod.\nEither the scheduling strategy or the shoot cluster purpose hereby determines how the scheduler is operating. The following sections explain the configuration and flow in greater detail.\nWhy Is the Gardener Scheduler Needed? 1. Decoupling Previously, an admission plugin in the Gardener API server conducted the scheduling decisions. This implies changes to the API server whenever adjustments of the scheduling are needed. Decoupling the API server and the scheduler comes with greater flexibility to develop these components independently.\n2. Extensibility It should be possible to easily extend and tweak the scheduler in the future. Possibly, similar to the Kubernetes scheduler, hooks could be provided which influence the scheduling decisions. 
It should be also possible to completely replace the standard Gardener Scheduler with a custom implementation.\nAlgorithm Overview The following sequence describes the steps involved to determine a seed candidate:\n Determine usable seeds with “usable” defined as follows: no .metadata.deletionTimestamp .spec.settings.scheduling.visible is true .status.lastOperation is not nil conditions GardenletReady, BackupBucketsReady (if available) are true Filter seeds: matching .spec.seedSelector in CloudProfile used by the Shoot matching .spec.seedSelector in Shoot having no network intersection with the Shoot’s networks (due to the VPN connectivity between seeds and shoots their networks must be disjoint) whose taints (.spec.taints) are tolerated by the Shoot (.spec.tolerations) whose capacity for shoots would not be exceeded if the shoot is scheduled onto the seed, see Ensuring seeds capacity for shoots is not exceeded which have at least three zones in .spec.provider.zones if shoot requests a high available control plane with failure tolerance type zone. Apply active strategy e.g., Minimal Distance strategy Choose least utilized seed, i.e., the one with the least number of shoot control planes, will be the winner and written to the .spec.seedName field of the Shoot. In order to put the scheduling decision into effect, the scheduler sends an update request for the Shoot resource to the API server. After validation, the gardener-apiserver updates the Shoot to have the spec.seedName field set. Subsequently, the gardenlet picks up and starts to create the cluster on the specified seed.\nConfiguration The Gardener Scheduler configuration has to be supplied on startup. It is a mandatory and also the only available flag. This yaml file holds an example scheduler configuration.\nMost of the configuration options are the same as in the Gardener Controller Manager (leader election, client connection, …). 
However, the Gardener Scheduler on the other hand does not need a TLS configuration, because there are currently no webhooks configurable.\nStrategies The scheduling strategy is defined in the candidateDeterminationStrategy of the scheduler’s configuration and can have the possible values SameRegion and MinimalDistance. The SameRegion strategy is the default strategy.\nSame Region strategy The Gardener Scheduler reads the spec.provider.type and .spec.region fields from the Shoot resource. It tries to find a seed that has the identical .spec.provider.type and .spec.provider.region fields set. If it cannot find a suitable seed, it adds an event to the shoot stating that it is unschedulable.\nMinimal Distance strategy The Gardener Scheduler tries to find a valid seed with minimal distance to the shoot’s intended region. Distances are configured via ConfigMap(s), usually per cloud provider in a Gardener landscape. The configuration is structured like this:\n It refers to one or multiple CloudProfiles via annotation scheduling.gardener.cloud/cloudprofiles. It contains the declaration as region-config via label scheduling.gardener.cloud/purpose. If a CloudProfile is referred by multiple ConfigMaps, only the first one is considered. The data fields configure actual distances, where key relates to the Shoot region and value contains distances to Seed regions. apiVersion: v1 kind: ConfigMap metadata: name: \u003cname\u003e namespace: garden annotations: scheduling.gardener.cloud/cloudprofiles: cloudprofile-name-1{,optional-cloudprofile-name-2,...} labels: scheduling.gardener.cloud/purpose: region-config data: region-1: |region-2: 10 region-3: 20 ... region-2: |region-1: 10 region-3: 10 ... Gardener provider extensions for public cloud providers usually have an example weight ConfigMap in their repositories. 
We suggest checking them out before defining your own data.\n If a valid seed candidate cannot be found after consulting the distance configuration, the scheduler will fall back to the Levenshtein distance to find the closest region. For this purpose, the region name is split into a base name and an orientation. Possible orientations are north, south, east, west and central. The distance is then twice the Levenshtein distance of the region’s base name plus a correction value based on the orientation and the provider.\nIf the orientations of shoot and seed candidate match, the correction value is 0; if they differ, it is 2; and if either the seed’s or the shoot’s region does not have an orientation, it is 1. If the provider differs, the correction value is additionally incremented by 2.\nBecause of this, a matching region with a matching provider is always preferred.\nSpecial handling based on shoot cluster purpose Every shoot cluster can have a purpose that describes what the cluster is used for, and also influences how the cluster is set up (see Shoot Cluster Purpose for more information).\nIn case the shoot has the testing purpose, then the scheduler only reads the .spec.provider.type from the Shoot resource and tries to find a Seed that has the identical .spec.provider.type. The region does not matter, i.e., testing shoots may also be scheduled on a seed in a completely different region if it is better for balancing the whole Gardener system.\nshoots/binding Subresource The shoots/binding subresource is used to bind a Shoot to a Seed. On creation of a shoot cluster, the scheduler updates the binding automatically if an appropriate seed cluster is available. Only an operator with the necessary RBAC can update this binding manually. This can be done by changing the .spec.seedName of the shoot. However, if a different seed is already assigned to the shoot, this will trigger a control-plane migration. 
For required steps, please see Triggering the Migration.\nspec.schedulerName Field in the Shoot Specification Similar to the spec.schedulerName field in Pods, the Shoot specification has an optional .spec.schedulerName field. If this field is set on creation, only the scheduler which relates to the configured name is responsible for scheduling the shoot. The default-scheduler name is reserved for the default scheduler of Gardener. Affected Shoots will remain in Pending state if the mentioned scheduler is not present in the landscape.\nspec.seedName Field in the Shoot Specification Similar to the .spec.nodeName field in Pods, the Shoot specification has an optional .spec.seedName field. If this field is set on creation, the shoot will be scheduled to this seed. However, this field can only be set by users having RBAC for the shoots/binding subresource. If this field is not set, the scheduler will assign a suitable seed automatically and populate this field with the seed name.\nseedSelector Field in the Shoot Specification Similar to the .spec.nodeSelector field in Pods, the Shoot specification has an optional .spec.seedSelector field. It allows the user to provide a label selector that must match the labels of the Seeds in order to be scheduled to one of them. The labels on the Seeds are usually controlled by Gardener administrators/operators - end users cannot add arbitrary labels themselves. If provided, the Gardener Scheduler will only consider as “suitable” those seeds whose labels match those provided in the .spec.seedSelector of the Shoot.\nBy default, only seeds with the same provider as the shoot are selected. By adding a providerTypes field to the seedSelector, a dedicated set of possible providers (* means all provider types) can be selected.\nEnsuring a Seed’s Capacity for Shoots Is Not Exceeded Seeds have a practical limit of how many shoots they can accommodate. Exceeding this limit is undesirable, as the system performance will be noticeably impacted. 
Therefore, the scheduler ensures that a seed’s capacity for shoots is not exceeded by taking into account a maximum number of shoots that can be scheduled onto a seed.\nThis mechanism works as follows:\n The gardenlet is configured with certain resources and their total capacity (and, for certain resources, the amount reserved for Gardener), see /example/20-componentconfig-gardenlet.yaml. Currently, the only such resource is the maximum number of shoots that can be scheduled onto a seed. The gardenlet seed controller updates the capacity and allocatable fields in the Seed status with the capacity of each resource and how much of it is actually available to be consumed by shoots. The allocatable value of a resource is equal to capacity minus reserved. When scheduling shoots, the scheduler filters out all candidate seeds whose allocatable capacity for shoots would be exceeded if the shoot is scheduled onto the seed. Failure to Determine a Suitable Seed In case the scheduler fails to find a suitable seed, the operation is being retried with exponential backoff. The reason for the failure will be reported in the Shoot’s .status.lastOperation field as well as a Kubernetes event (which can be retrieved via kubectl -n \u003cnamespace\u003e describe shoot \u003cshoot-name\u003e).\nCurrent Limitation / Future Plans Azure unfortunately has a geographically non-hierarchical naming pattern and does not start with the continent. This is the reason why we will exchange the implementation of the MinimalDistance strategy with a more suitable one in the future. 
","categories":"","description":"Understand the configuration and flow of the controller that assigns a seed cluster to newly created shoots","excerpt":"Understand the configuration and flow of the controller that assigns a …","ref":"/docs/gardener/concepts/scheduler/","tags":"","title":"Gardener Scheduler"},{"body":"As we ramp up more and more friends of Gardener, I thought it worthwhile to explore and write a tutorial about how to simply:\n create a Gardener managed Kubernetes Cluster (Shoot) via kubectl install Istio as a preferred, production ready Ingress/Service Mesh (instead of the Nginx Ingress addon) attach your own custom domain to be managed by Gardener combine everything with certificates from Let’s Encrypt Here are some pre-pointers that you will need to go deeper:\n CRUD Gardener Shoot DNS Management Certificate Management Tutorial Domain Names Tutorial Certificates Tip If you try my instructions and fail, then read the alternative title of this tutorial as “Shoot yourself in the foot with Gardener, custom Domains, Istio and Certificates”. First Things First Login to your Gardener landscape, setup a project with adequate infrastructure credentials and then navigate to your account. Note down the name of your secret. I chose the GCP infrastructure from the vast possible options that my Gardener provides me with, so i had named the secret as shoot-operator-gcp.\nFrom the Access widget (leave the default settings) download your personalized kubeconfig into ~/.kube/kubeconfig-garden-myproject. 
Follow the instructions to set up kubelogin:\nFor convenience, let us set an alias command with\nalias kgarden=\"kubectl --kubeconfig ~/.kube/kubeconfig-garden-myproject.yaml\" kgarden now gives you all botanical powers and connects you directly with your Gardener.\nYou should now be able to run kgarden get shoots, automatically get an OIDC token, and list already running clusters/shoots.\nPrepare your Custom Domain I am going to use Cloudflare as the programmatic DNS for my custom domain mydomain.io. Please follow the detailed instructions from Cloudflare on how to delegate your domain (the free account does not support delegating subdomains). Alternatively, AWS Route53 (and most others) support delegating subdomains.\nI followed these instructions and created the following secret:\napiVersion: v1 kind: Secret metadata: name: cloudflare-mydomain-io type: Opaque data: CLOUDFLARE_API_TOKEN: useYOURownDAMITzNDU2Nzg5MDEyMzQ1Njc4OQ== Apply this secret into your project with kgarden create -f cloudflare-mydomain-io.yaml.\nOur External DNS Manager also supports Amazon Route53, Google CloudDNS, AliCloud DNS, Azure DNS, or OpenStack Designate. Check it out.\nPrepare Gardener Extensions I now need to prepare the Gardener extensions shoot-dns-service and shoot-cert-service and set the parameters accordingly.\nPlease note that the availability of Gardener Extensions depends on how your administrator has configured the Gardener landscape. Please contact your Gardener administrator in case you experience any issues during activation. 
The following snippet allows Gardener to manage my entire custom domain, whereas with the include: attribute I restrict all dynamic entries to the subdomain gsicdc.mydomain.io:\n dns: providers: - domains: include: - gsicdc.mydomain.io primary: false secretName: cloudflare-mydomain-io type: cloudflare-dns extensions: - type: shoot-dns-service The next snippet allows Gardener to manage certificates automatically from Let’s Encrypt on mydomain.io for me:\n extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 issuers: - email: me@mail.com name: mydomain server: 'https://acme-v02.api.letsencrypt.org/directory' - email: me@mail.com name: mydomain-staging server: 'https://acme-staging-v02.api.letsencrypt.org/directory' Adjust the snippets with your parameters (don’t forget your email). And please use the mydomain-staging issuer while you are testing and learning. Otherwise, Let’s Encrypt will rate limit your frequent requests and you may have to wait a week before you can continue. References for Let’s Encrypt:\n Rate limit Staging environment Challenge Types Wildcard Certificates Create the Gardener Shoot Cluster Remember I chose to create the Shoot on GCP, so below is the simplest declarative shoot or cluster order document. 
Notice that I am referring to the infrastructure credentials with shoot-operator-gcp and I combined the above snippets into the yaml file:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: gsicdc spec: dns: providers: - domains: include: - gsicdc.mydomain.io primary: false secretName: cloudflare-mydomain-io type: cloudflare-dns extensions: - type: shoot-dns-service - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 issuers: - email: me@mail.com name: mydomain server: 'https://acme-v02.api.letsencrypt.org/directory' - email: me@mail.com name: mydomain-staging server: 'https://acme-staging-v02.api.letsencrypt.org/directory' cloudProfileName: gcp kubernetes: allowPrivilegedContainers: true version: 1.24.8 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true networking: nodes: 10.250.0.0/16 pods: 100.96.0.0/11 services: 100.64.0.0/13 type: calico provider: controlPlaneConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig zone: europe-west1-d infrastructureConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: workers: 10.250.0.0/16 type: gcp workers: - machine: image: name: gardenlinux version: 576.9.0 type: n1-standard-2 maxSurge: 1 maxUnavailable: 0 maximum: 2 minimum: 1 name: my-workerpool volume: size: 50Gi type: pd-standard zones: - europe-west1-d purpose: testing region: europe-west1 secretBindingName: shoot-operator-gcp Create your cluster and wait for it to be ready (about 5 to 7min).\n$ kgarden create -f gsicdc.yaml shoot.core.gardener.cloud/gsicdc created $ kgarden get shoot gsicdc --watch NAME CLOUDPROFILE VERSION SEED DOMAIN HIBERNATION OPERATION PROGRESS APISERVER CONTROL NODES SYSTEM AGE gsicdc gcp 1.24.8 gcp gsicdc.myproject.shoot.devgarden.cloud Awake Processing 38 Progressing Progressing Unknown Unknown 83s ... 
gsicdc gcp 1.24.8 gcp gsicdc.myproject.shoot.devgarden.cloud Awake Succeeded 100 True True True False 6m7s Get access to your freshly baked cluster and set your KUBECONFIG:\n$ kgarden get secrets gsicdc.kubeconfig -o jsonpath={.data.kubeconfig} | base64 -d \u003ekubeconfig-gsicdc.yaml $ export KUBECONFIG=$(pwd)/kubeconfig-gsicdc.yaml $ kubectl get all NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kubernetes ClusterIP 100.64.0.1 \u003cnone\u003e 443/TCP 89m Install Istio Please follow the Istio installation instructions and download istioctl. If you are on a Mac, I recommend:\nbrew install istioctl I want to install Istio with a default profile and SDS enabled. Furthermore, I pass the following annotations to the service object istio-ingressgateway in the istio-system namespace.\n annotations: cert.gardener.cloud/issuer: mydomain-staging cert.gardener.cloud/secretname: wildcard-tls dns.gardener.cloud/class: garden dns.gardener.cloud/dnsnames: \"*.gsicdc.mydomain.io\" dns.gardener.cloud/ttl: \"120\" With these annotations three things now happen automatically:\n The External DNS Manager, provided to you as a service (dns.gardener.cloud/class: garden), picks up the request and creates the wildcard DNS entry *.gsicdc.mydomain.io with a time to live of 120 seconds at your DNS provider. My provider Cloudflare is very quick (as opposed to some other services). You should be able to verify the entry with dig lovemygardener.gsicdc.mydomain.io within seconds. The Certificate Management picks up the request as well and initiates a DNS01 protocol exchange with Let’s Encrypt, using the staging environment referred to with the issuer behind mydomain-staging. After approximately 70 seconds (give or take), you will receive the wildcard certificate in the wildcard-tls secret in the namespace istio-system. Note that the namespace for the certificate secret is often the cause of many troubleshooting sessions: the secret must reside in the same namespace as the gateway. 
Here is the istio-install script:\n$ export domainname=\"*.gsicdc.mydomain.io\" $ export issuer=\"mydomain-staging\" $ cat \u003c\u003cEOF | istioctl install -y -f - apiVersion: install.istio.io/v1alpha1 kind: IstioOperator spec: profile: default components: ingressGateways: - name: istio-ingressgateway enabled: true k8s: serviceAnnotations: cert.gardener.cloud/issuer: \"${issuer}\" cert.gardener.cloud/secretname: wildcard-tls dns.gardener.cloud/class: garden dns.gardener.cloud/dnsnames: \"${domainname}\" dns.gardener.cloud/ttl: \"120\" EOF Verify that setup is working and that DNS and certificates have been created/delivered:\n$ kubectl -n istio-system describe service istio-ingressgateway \u003csnip\u003e Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal EnsuringLoadBalancer 58s service-controller Ensuring load balancer Normal reconcile 58s cert-controller-manager created certificate object istio-system/istio-ingressgateway-service-pwqdm Normal cert-annotation 58s cert-controller-manager wildcard-tls: cert request is pending Normal cert-annotation 54s cert-controller-manager wildcard-tls: certificate pending: certificate requested, preparing/waiting for successful DNS01 challenge Normal cert-annotation 28s cert-controller-manager wildcard-tls: certificate ready Normal EnsuredLoadBalancer 26s service-controller Ensured load balancer Normal reconcile 26s dns-controller-manager created dns entry object shoot--core--gsicdc/istio-ingressgateway-service-p9qqb Normal dns-annotation 26s dns-controller-manager *.gsicdc.mydomain.io: dns entry is pending Normal dns-annotation 21s (x3 over 21s) dns-controller-manager *.gsicdc.mydomain.io: dns entry active $ dig lovemygardener.gsicdc.mydomain.io ; \u003c\u003c\u003e\u003e DiG 9.10.6 \u003c\u003c\u003e\u003e lovemygardener.gsicdc.mydomain.io \u003csnip\u003e ;; ANSWER SECTION: lovemygardener.gsicdc.mydomain.io. 
120 IN A\t35.195.120.62 \u003csnip\u003e There you have it, the wildcard-tls certificate is ready and the *.gsicdc.mydomain.io DNS entry is active. Traffic will be going your way.\nHandy Tools to Install Another set of fine tools to use is kapp (formerly known as k14s), k9s and HTTPie. While we are at it, let’s install them all. If you are on a Mac, I recommend:\nbrew tap vmware-tanzu/carvel brew install ytt kbld kapp kwt imgpkg vendir brew install derailed/k9s/k9s brew install httpie Ingress at Your Service Networking is a central part of Kubernetes, but it can be challenging to understand exactly how it is expected to work. You should learn about Kubernetes networking, and first try to debug problems yourself. With a solid managed cluster from Gardener, it is always PEBCAK! Kubernetes Ingress is a subject that is evolving into a much broader standard. Please watch Evolving the Kubernetes Ingress APIs to GA and Beyond for a good introduction. In this example, I did not want to use the Kubernetes Ingress compatibility option of Istio. 
Instead, I used VirtualService and Gateway from the Istio’s API group networking.istio.io/v1 directly, and enabled istio-injection generically for the namespace.\nI use httpbin as service that I want to expose to the internet, or where my ingress should be routed to (depends on your point of view, I guess).\napiVersion: v1 kind: Namespace metadata: name: production labels: istio-injection: enabled --- apiVersion: v1 kind: Service metadata: name: httpbin namespace: production labels: app: httpbin spec: ports: - name: http port: 8000 targetPort: 80 selector: app: httpbin --- apiVersion: apps/v1 kind: Deployment metadata: name: httpbin namespace: production spec: replicas: 1 selector: matchLabels: app: httpbin template: metadata: labels: app: httpbin spec: containers: - image: docker.io/kennethreitz/httpbin imagePullPolicy: IfNotPresent name: httpbin ports: - containerPort: 80 --- apiVersion: networking.istio.io/v1 kind: Gateway metadata: name: httpbin-gw namespace: production spec: selector: istio: ingressgateway #! use istio default ingress gateway servers: - port: number: 80 name: http protocol: HTTP tls: httpsRedirect: true hosts: - \"httpbin.gsicdc.mydomain.io\" - port: number: 443 name: https protocol: HTTPS tls: mode: SIMPLE credentialName: wildcard-tls hosts: - \"httpbin.gsicdc.mydomain.io\" --- apiVersion: networking.istio.io/v1 kind: VirtualService metadata: name: httpbin-vs namespace: production spec: hosts: - \"httpbin.gsicdc.mydomain.io\" gateways: - httpbin-gw http: - match: - uri: regex: /.* route: - destination: port: number: 8000 host: httpbin --- Let us now deploy the whole package of Kubernetes primitives using kapp:\n$ kapp deploy -a httpbin -f httpbin-kapp.yaml Target cluster 'https://api.gsicdc.myproject.shoot.devgarden.cloud' (nodes: shoot--myproject--gsicdc-my-workerpool-z1-6586c8f6cb-x24kh) Changes Namespace Name Kind Conds. 
Age Op Wait to Rs Ri (cluster) production Namespace - - create reconcile - - production httpbin Deployment - - create reconcile - - ^ httpbin Service - - create reconcile - - ^ httpbin-gw Gateway - - create reconcile - - ^ httpbin-vs VirtualService - - create reconcile - - Op: 5 create, 0 delete, 0 update, 0 noop Wait to: 5 reconcile, 0 delete, 0 noop Continue? [yN]: y 5:36:31PM: ---- applying 1 changes [0/5 done] ---- \u003csnip\u003e 5:37:00PM: ok: reconcile deployment/httpbin (apps/v1) namespace: production 5:37:00PM: ---- applying complete [5/5 done] ---- 5:37:00PM: ---- waiting complete [5/5 done] ---- Succeeded Let’s finally test the service (of course, you can use the browser as well):\n$ http httpbin.gsicdc.mydomain.io HTTP/1.1 301 Moved Permanently content-length: 0 date: Wed, 13 May 2020 21:29:13 GMT location: https://httpbin.gsicdc.mydomain.io/ server: istio-envoy $ curl -k https://httpbin.gsicdc.mydomain.io/ip { \"origin\": \"10.250.0.2\" } Quod erat demonstrandum. The proof of exchanging the issuer is now left to the reader.\nTip Remember that the certificate is actually not valid because it is issued by the Let’s Encrypt staging environment. Thus, we needed “curl -k” or “http --verify no”. Hint: use the interactive k9s tool. Cleanup Remove the cloud native application:\n$ kapp ls Apps in namespace 'default' Name Namespaces Lcs Lca httpbin (cluster),production true 17m $ kapp delete -a httpbin ... Continue? [yN]: y ... 11:47:47PM: ---- waiting complete [8/8 done] ---- Succeeded Remove Istio:\n$ istioctl x uninstall --purge clusterrole.rbac.authorization.k8s.io \"prometheus-istio-system\" deleted clusterrolebinding.rbac.authorization.k8s.io \"prometheus-istio-system\" deleted ... 
Delete your Shoot:\nkgarden annotate shoot gsicdc confirmation.gardener.cloud/deletion=true --overwrite kgarden delete shoot gsicdc --wait=false ","categories":"","description":"","excerpt":"As we ramp up more and more friends of Gardener, I thought it …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/tutorials/tutorial-custom-domain-with-istio/","tags":"","title":"Gardener yourself a Shoot with Istio, custom Domains, and Certificates"},{"body":"Overview Gardener is implemented using the operator pattern: It uses custom controllers that act on our own custom resources, and apply Kubernetes principles to manage clusters instead of containers. Following this analogy, you can recognize components of the Gardener architecture as well-known Kubernetes components, for example, shoot clusters can be compared with pods, and seed clusters can be seen as worker nodes.\nThe following Gardener components play a similar role as the corresponding components in the Kubernetes architecture:\n Gardener Component Kubernetes Component gardener-apiserver kube-apiserver gardener-controller-manager kube-controller-manager gardener-scheduler kube-scheduler gardenlet kubelet Similar to how the kube-scheduler of Kubernetes finds an appropriate node for newly created pods, the gardener-scheduler of Gardener finds an appropriate seed cluster to host the control plane for newly ordered clusters. By providing multiple seed clusters for a region or provider, and distributing the workload, Gardener also reduces the blast radius of potential issues.\nKubernetes runs a primary “agent” on every node, the kubelet, which is responsible for managing pods and containers on its particular node. Decentralizing the responsibility to the kubelet has the advantage that the overall system is scalable. 
Gardener achieves the same for cluster management by using a gardenlet as a primary “agent” on every seed cluster, which is only responsible for shoot clusters located in its particular seed cluster:\nThe gardener-controller-manager has controllers to manage resources of the Gardener API. However, instead of letting the gardener-controller-manager talk directly to seed clusters or shoot clusters, the responsibility isn’t only delegated to the gardenlet, but also managed using a reversed control flow: It’s up to the gardenlet to contact the Gardener API server, for example, to share a status for its managed seed clusters.\nReversing the control flow allows placing seed clusters or shoot clusters behind firewalls without requiring direct access via VPN tunnels.\nTLS Bootstrapping Kubernetes doesn’t manage worker nodes itself, and it’s also not responsible for the lifecycle of the kubelet running on the workers. Similarly, Gardener doesn’t manage seed clusters itself, so it is also not responsible for the lifecycle of the gardenlet running on the seeds. 
As a consequence, both the gardenlet and the kubelet need to prepare a trusted connection to the Gardener API server and the Kubernetes API server correspondingly.\nTo prepare a trusted connection between the gardenlet and the Gardener API server, the gardenlet initializes a bootstrapping process after you deployed it into your seed clusters:\n The gardenlet starts up with a bootstrap kubeconfig having a bootstrap token that allows to create CertificateSigningRequest (CSR) resources.\n After the CSR is signed, the gardenlet downloads the created client certificate, creates a new kubeconfig with it, and stores it inside a Secret in the seed cluster.\n The gardenlet deletes the bootstrap kubeconfig secret, and starts up with its new kubeconfig.\n The gardenlet starts normal operation.\n The gardener-controller-manager runs a control loop that automatically signs CSRs created by gardenlets.\n The gardenlet bootstrapping process is based on the kubelet bootstrapping process. More information: Kubelet’s TLS bootstrapping.\n If you don’t want to run this bootstrap process, you can create a kubeconfig pointing to the garden cluster for the gardenlet yourself, and use the field gardenClientConnection.kubeconfig in the gardenlet configuration to share it with the gardenlet.\ngardenlet Certificate Rotation The certificate used to authenticate the gardenlet against the API server has a certain validity based on the configuration of the garden cluster (--cluster-signing-duration flag of the kube-controller-manager (default 1y)).\n You can also configure the validity for the client certificate by specifying .gardenClientConnection.kubeconfigValidity.validity in the gardenlet’s component configuration. Note that changing this value will only take effect when the kubeconfig is rotated again (it is not picked up immediately). 
The minimum validity is 10m (that’s what is enforced by the CertificateSigningRequest API in Kubernetes which is used by the gardenlet).\n By default, after about 70-90% of the validity has expired, the gardenlet tries to automatically replace the current certificate with a new one (certificate rotation).\n You can change these boundaries by specifying .gardenClientConnection.kubeconfigValidity.autoRotationJitterPercentage{Min,Max} in the gardenlet’s component configuration.\n To use a certificate rotation, you need to specify the secret to store the kubeconfig with the rotated certificate in the field .gardenClientConnection.kubeconfigSecret of the gardenlet component configuration.\nRotate Certificates Using Bootstrap kubeconfig If the gardenlet created the certificate during the initial TLS Bootstrapping using the Bootstrap kubeconfig, certificates can be rotated automatically. The same control loop in the gardener-controller-manager that signs the CSRs during the initial TLS Bootstrapping also automatically signs the CSR during a certificate rotation.\nℹ️ You can trigger an immediate renewal by annotating the Secret in the seed cluster stated in the .gardenClientConnection.kubeconfigSecret field with gardener.cloud/operation=renew. Within 10s, gardenlet detects this and terminates itself to request new credentials. After it has booted up again, gardenlet will issue a new certificate independent of the remaining validity of the existing one.\nℹ️ Alternatively, annotate the respective Seed with gardener.cloud/operation=renew-kubeconfig. 
This will make gardenlet annotate its own kubeconfig secret with gardener.cloud/operation=renew and trigger the process described in the previous paragraph.\nRotate Certificates Using Custom kubeconfig When trying to rotate a custom certificate that wasn’t created by gardenlet as part of the TLS Bootstrap, the x509 certificate’s Subject field needs to conform to the following:\n the Common Name (CN) is prefixed with gardener.cloud:system:seed: the Organization (O) equals gardener.cloud:system:seeds Otherwise, the gardener-controller-manager doesn’t automatically sign the CSR. In this case, an external component or user needs to approve the CSR manually (for example, using the command kubectl certificate approve seed-csr-\u003c...\u003e). If that doesn’t happen within 15 minutes, the gardenlet repeats the process and creates another CSR.\nConfiguring the Seed to Work with gardenlet The gardenlet works with a single seed, which must be configured in the GardenletConfiguration under .seedConfig. This must be a copy of the Seed resource, for example:\napiVersion: gardenlet.config.gardener.cloud/v1alpha1 kind: GardenletConfiguration seedConfig: metadata: name: my-seed spec: provider: type: aws # ... settings: scheduling: visible: true (see this yaml file for a more complete example)\nOn startup, gardenlet registers a Seed resource using the given template in the seedConfig if it’s not present already.\nComponent Configuration In the component configuration for the gardenlet, it’s possible to define:\n settings for the Kubernetes clients interacting with the various clusters settings for the controllers inside the gardenlet settings for leader election and log levels, feature gates, and seed selection or seed configuration. More information: Example gardenlet Component Configuration.\nHeartbeats Similar to how Kubernetes uses Lease objects for node heartbeats (see KEP), the gardenlet uses Lease objects for heartbeats of the seed cluster. 
Every two seconds, the gardenlet checks that the seed cluster’s /healthz endpoint returns HTTP status code 200. If that is the case, the gardenlet renews the lease in the Garden cluster in the gardener-system-seed-lease namespace and updates the GardenletReady condition in the status.conditions field of the Seed resource. For more information, see this section.\nSimilar to the node-lifecycle-controller inside the kube-controller-manager, the gardener-controller-manager features a seed-lifecycle-controller that sets the GardenletReady condition to Unknown in case the gardenlet fails to renew the lease. As a consequence, the gardener-scheduler doesn’t consider this seed cluster for newly created shoot clusters anymore.\n/healthz Endpoint The gardenlet includes an HTTP server that serves a /healthz endpoint. It’s used as a liveness probe in the Deployment of the gardenlet. If the gardenlet fails to renew its lease, then the endpoint returns 500 Internal Server Error, otherwise it returns 200 OK.\nPlease note that the /healthz endpoint only indicates whether the gardenlet could successfully probe the Seed’s API server and renew the lease with the Garden cluster. It does not show that the Gardener extension API server (with the Gardener resource groups) is available. However, the gardenlet is designed to withstand such connection outages and retries until the connection is reestablished.\nControllers The gardenlet consists of several controllers, which are described in more detail below.\nBackupBucket Controller The BackupBucket controller reconciles those core.gardener.cloud/v1beta1.BackupBucket resources whose .spec.seedName value is equal to the name of the Seed the respective gardenlet is responsible for. A core.gardener.cloud/v1beta1.BackupBucket resource is created by the Seed controller if .spec.backup is defined in the Seed.\nThe controller adds finalizers to the BackupBucket and the secret mentioned in the .spec.secretRef of the BackupBucket. 
The controller also copies this secret to the seed cluster. Additionally, it creates an extensions.gardener.cloud/v1alpha1.BackupBucket resource (non-namespaced) in the seed cluster and waits until the responsible extension controller reconciles it (see Contract: BackupBucket Resource for more details). The status from the reconciliation is reported in the .status.lastOperation field. Once the extension resource is ready and the .status.generatedSecretRef is set by the extension controller, the gardenlet copies the referenced secret to the garden namespace in the garden cluster. An owner reference to the core.gardener.cloud/v1beta1.BackupBucket is added to this secret.\nIf the core.gardener.cloud/v1beta1.BackupBucket is deleted, the controller deletes the generated secret in the garden cluster and the extensions.gardener.cloud/v1alpha1.BackupBucket resource in the seed cluster and it waits for the respective extension controller to remove its finalizers from the extensions.gardener.cloud/v1alpha1.BackupBucket. Then it deletes the secret in the seed cluster and finally removes the finalizers from the core.gardener.cloud/v1beta1.BackupBucket and the referred secret.\nBackupEntry Controller The BackupEntry controller reconciles those core.gardener.cloud/v1beta1.BackupEntry resources whose .spec.seedName value is equal to the name of a Seed the respective gardenlet is responsible for. Those resources are created by the Shoot controller (only if backup is enabled for the respective Seed) and there is exactly one BackupEntry per Shoot.\nThe controller creates an extensions.gardener.cloud/v1alpha1.BackupEntry resource (non-namespaced) in the seed cluster and waits until the responsible extension controller reconciled it (see Contract: BackupEntry Resource for more details). The status is populated in the .status.lastOperation field.\nThe core.gardener.cloud/v1beta1.BackupEntry resource has an owner reference pointing to the corresponding Shoot. 
Hence, if the Shoot is deleted, the BackupEntry resource also gets deleted. In this case, the controller deletes the extensions.gardener.cloud/v1alpha1.BackupEntry resource in the seed cluster and waits until the responsible extension controller has deleted it. Afterwards, the finalizer of the core.gardener.cloud/v1beta1.BackupEntry resource is released so that it finally disappears from the system.\nIf the .spec.seedName and .status.seedName of the core.gardener.cloud/v1beta1.BackupEntry are different, the controller will migrate it by annotating the extensions.gardener.cloud/v1alpha1.BackupEntry in the Source Seed with gardener.cloud/operation: migrate, waiting for it to be migrated successfully and eventually deleting it from the Source Seed cluster. Afterwards, the controller will recreate the extensions.gardener.cloud/v1alpha1.BackupEntry in the Destination Seed, annotate it with gardener.cloud/operation: restore and wait for the restore operation to finish. For more details about control plane migration, please read Shoot Control Plane Migration.\nKeep Backup for Deleted Shoots In some scenarios it might be beneficial to not immediately delete the BackupEntrys (and with them, the etcd backup) for deleted Shoots.\nIn this case you can configure the .controllers.backupEntry.deletionGracePeriodHours field in the component configuration of the gardenlet. For example, if you set it to 48, then the BackupEntrys for deleted Shoots will only be deleted 48 hours after the Shoot was deleted.\nAdditionally, you can limit the shoot purposes for which this applies by setting .controllers.backupEntry.deletionGracePeriodShootPurposes[]. For example, if you set it to [production] then only the BackupEntrys for Shoots with .spec.purpose=production will be deleted after the configured grace period. 
All others will be deleted immediately after the Shoot deletion.\nIn case a BackupEntry is scheduled for future deletion but you want to delete it immediately, add the annotation backupentry.core.gardener.cloud/force-deletion=true.\nBastion Controller The Bastion controller reconciles those operations.gardener.cloud/v1alpha1.Bastion resources whose .spec.seedName value is equal to the name of a Seed the respective gardenlet is responsible for.\nThe controller creates an extensions.gardener.cloud/v1alpha1.Bastion resource in the seed cluster in the shoot namespace with the same name as operations.gardener.cloud/v1alpha1.Bastion. Then it waits until the responsible extension controller has reconciled it (see Contract: Bastion Resource for more details). The status is populated in the .status.conditions and .status.ingress fields.\nDuring the deletion of operations.gardener.cloud/v1alpha1.Bastion resources, the controller first sets the Ready condition to False and then deletes the extensions.gardener.cloud/v1alpha1.Bastion resource in the seed cluster. Once this resource is gone, the finalizer of the operations.gardener.cloud/v1alpha1.Bastion resource is released, so it finally disappears from the system.\nControllerInstallation Controller The ControllerInstallation controller in the gardenlet reconciles ControllerInstallation objects with the help of the following reconcilers.\n“Main” Reconciler This reconciler is responsible for ControllerInstallations referencing a ControllerDeployment whose type=helm.\nFor each ControllerInstallation, it creates a namespace on the seed cluster named extension-\u003ccontroller-installation-name\u003e. Then, it creates a generic garden kubeconfig and garden access secret for the extension for accessing the garden cluster.\nAfter that, it unpacks the Helm chart tarball in the ControllerDeployments .providerConfig.chart field and deploys the rendered resources to the seed cluster. 
The Helm chart values in .providerConfig.values will be used and extended with some information about the Gardener environment and the seed cluster:\ngardener: version: \u003cgardenlet-version\u003e garden: clusterIdentity: \u003cidentity-of-garden-cluster\u003e genericKubeconfigSecretName: \u003csecret-name\u003e gardenlet: featureGates: Foo: true Bar: false # ... seed: name: \u003cseed-name\u003e clusterIdentity: \u003cidentity-of-seed-cluster\u003e annotations: \u003cseed-annotations\u003e labels: \u003cseed-labels\u003e spec: \u003cseed-specification\u003e As of today, there are a few more fields in .gardener.seed, but it is recommended to use the .gardener.seed.spec if the Helm chart needs more information about the seed configuration.\nThe rendered chart will be deployed via a ManagedResource created in the garden namespace of the seed cluster. It is labeled with controllerinstallation-name=\u003cname\u003e so that one can easily find the owning ControllerInstallation for an existing ManagedResource.\nThe reconciler maintains the Installed condition of the ControllerInstallation and sets it to False if the rendering or deployment fails.\n“Care” Reconciler This reconciler reconciles ControllerInstallation objects and checks whether they are in a healthy state. It checks the .status.conditions of the backing ManagedResource created in the garden namespace of the seed cluster.\n If the ResourcesApplied condition of the ManagedResource is True, then the Installed condition of the ControllerInstallation will be set to True. If the ResourcesHealthy condition of the ManagedResource is True, then the Healthy condition of the ControllerInstallation will be set to True. If the ResourcesProgressing condition of the ManagedResource is True, then the Progressing condition of the ControllerInstallation will be set to True. 
A ControllerInstallation is considered “healthy” if Applied=Healthy=True and Progressing=False.\n“Required” Reconciler This reconciler watches all resources in the extensions.gardener.cloud API group in the seed cluster. It is responsible for maintaining the Required condition on ControllerInstallations. Concretely, when there is at least one extension resource in the seed cluster a ControllerInstallation is responsible for, then the status of the Required condition will be True. If there are no extension resources anymore, its status will be False.\nThis condition is taken into account by the ControllerRegistration controller part of gardener-controller-manager when it computes which extensions have to be deployed to which seed cluster. See Gardener Controller Manager for more details.\nGardenlet Controller The Gardenlet controller reconciles a Gardenlet resource with the same name as the Seed the gardenlet is responsible for. This is used to implement self-upgrades of gardenlet based on information pulled from the garden cluster. For a general overview, see this document.\nOn Gardenlet reconciliation, the controller deploys the gardenlet within its own cluster after downloading the Helm chart specified in .spec.deployment.helm.ociRepository and rendering it with the provided values/configuration.\nOn Gardenlet deletion, nothing happens: The gardenlet does not terminate itself - deleting a Gardenlet object effectively means that self-upgrades are stopped.\nManagedSeed Controller The ManagedSeed controller in the gardenlet reconciles ManagedSeeds that refer to Shoots scheduled on the Seed the gardenlet is responsible for. Additionally, the controller monitors Seeds, which are owned by ManagedSeeds for which the gardenlet is responsible.\nOn ManagedSeed reconciliation, the controller first waits for the referenced Shoot to undergo a reconciliation process. 
Once the Shoot is successfully reconciled, the controller sets the ShootReconciled status of the ManagedSeed to true. Then, it creates the garden namespace within the target shoot cluster. The controller also manages secrets related to Seeds, such as the backup and kubeconfig secrets. It ensures that these secrets are created and updated according to the ManagedSeed spec. Finally, it deploys the gardenlet within the specified shoot cluster which registers the Seed cluster.\nOn ManagedSeed deletion, the controller first deletes the corresponding Seed that was originally created by the controller. Subsequently, it deletes the gardenlet instance within the shoot cluster. The controller also ensures the deletion of related Seed secrets. Finally, the dedicated garden namespace within the shoot cluster is deleted.\nNetworkPolicy Controller The NetworkPolicy controller reconciles NetworkPolicys in all relevant namespaces in the seed cluster and provides so-called “general” policies for access to the runtime cluster’s API server, DNS, public networks, etc.\nThe controller resolves the IP address of the Kubernetes service in the default namespace and creates an egress NetworkPolicy for it.\nFor more details about NetworkPolicys in Gardener, please see NetworkPolicys In Garden, Seed, Shoot Clusters.\nSeed Controller The Seed controller in the gardenlet reconciles Seed objects with the help of the following reconcilers.\n“Main” Reconciler This reconciler is responsible for managing the seed’s system components. Those comprise CA certificates, the various CustomResourceDefinitions, the logging and monitoring stacks, and a few central components like gardener-resource-manager, etcd-druid, istio, etc.\nThe reconciler also deploys a BackupBucket resource in the garden cluster in case the Seed's .spec.backup is set. 
It also checks whether the seed cluster’s Kubernetes version is at least the minimum supported version and errors in case this constraint is not met.\nThis reconciler maintains the .status.lastOperation field, i.e. it sets it:\n to state=Progressing before it executes its reconciliation flow. to state=Error in case an error occurs. to state=Succeeded in case the reconciliation succeeded. “Care” Reconciler This reconciler checks whether the seed system components (deployed by the “main” reconciler) are healthy. It checks the .status.conditions of the backing ManagedResource created in the garden namespace of the seed cluster. A ManagedResource is considered “healthy” if the conditions ResourcesApplied=ResourcesHealthy=True and ResourcesProgressing=False.\nIf all ManagedResources are healthy, then the SeedSystemComponentsHealthy condition of the Seed will be set to True. Otherwise, it will be set to False.\nIf at least one ManagedResource is unhealthy and there is threshold configuration for the conditions (in .controllers.seedCare.conditionThresholds), then the status of the SeedSystemComponentsHealthy condition will be set:\n to Progressing if it was True before. to Progressing if it was Progressing before and the lastUpdateTime of the condition does not exceed the configured threshold duration yet. to False if it was Progressing before and the lastUpdateTime of the condition exceeds the configured threshold duration. The condition thresholds can be used to prevent reporting issues too early just because there is a rollout or a short disruption. Only if the unhealthiness persists for at least the configured threshold duration, then the issues will be reported (by setting the status to False).\nIn order to compute the condition statuses, this reconciler considers ManagedResources (in the garden and istio-system namespace) and their status, see this document for more information. 
The following table explains which ManagedResources are considered for which condition type:\n Condition Type ManagedResources are considered when SeedSystemComponentsHealthy .spec.class is set “Lease” Reconciler This reconciler checks whether the connection to the seed cluster’s /healthz endpoint works. If this succeeds, then it renews a Lease resource in the garden cluster’s gardener-system-seed-lease namespace. This indicates a heartbeat to the external world, and internally the gardenlet sets its health status to true. In addition, the GardenletReady condition in the status of the Seed is set to True. The whole process is similar to what the kubelet does to report heartbeats for its Node resource and its KubeletReady condition. For more information, see this section.\nIf the connection to the /healthz endpoint or the update of the Lease fails, then the internal health status of gardenlet is set to false. Also, this internal health status is set to false automatically after some time, in case the controller gets stuck for whatever reason. This internal health status is available via the gardenlet’s /healthz endpoint and is used for the livenessProbe in the gardenlet pod.\nShoot Controller The Shoot controller in the gardenlet reconciles Shoot objects with the help of the following reconcilers.\n“Main” Reconciler This reconciler is responsible for managing all shoot cluster components and implements the core logic for creating, updating, hibernating, deleting, and migrating shoot clusters. It is also responsible for syncing the Cluster resource to the seed cluster before and after each successful shoot reconciliation.\nThe main reconciliation logic is performed in 3 different task flows dedicated to specific operation types:\n reconcile (operations: create, reconcile, restore): this is the main flow responsible for creation and regular reconciliation of shoots. Hibernating a shoot also triggers this flow. 
It is also used for restoration of the shoot control plane on the new seed (second half of a Control Plane Migration). migrate: this flow is triggered when spec.seedName specifies a different seed than status.seedName. It performs the first half of the Control Plane Migration, i.e., a backup (migrate operation) of all control plane components followed by a “shallow delete”. delete: this flow is triggered when the shoot’s deletionTimestamp is set, i.e., when it is deleted. The gardenlet takes special care to prevent unnecessary shoot reconciliations. This is important for several reasons, e.g., to not overload the seed API servers and to not exhaust infrastructure rate limits too fast. The gardenlet performs shoot reconciliations according to the following rules:\n If status.observedGeneration is less than metadata.generation: this is the case, e.g., when the spec was changed, a manual reconciliation operation was triggered, or the shoot was deleted. If the last operation was not successful. If the shoot is in a failed state, the gardenlet does not perform any reconciliation on the shoot (unless the retry operation was triggered). However, it syncs the Cluster resource to the seed in order to inform the extension controllers about the failed state. Regular reconciliations are performed with every GardenletConfiguration.controllers.shoot.syncPeriod (defaults to 1h). Shoot reconciliations are not performed if the assigned seed cluster is not healthy or has not been reconciled by the current gardenlet version yet (determined by the Seed.status.gardener section). This is done to make sure that shoots are reconciled with fully rolled out seed system components after a Gardener upgrade. Otherwise, the gardenlet might perform operations of the new version that don’t match the old version of the deployed seed system components, which might lead to unspecified behavior. 
There are a few special cases that overwrite or confine how often and under which circumstances periodic shoot reconciliations are performed:\n In case the gardenlet config allows it (controllers.shoot.respectSyncPeriodOverwrite, disabled by default), the sync period for a shoot can be increased individually by setting the shoot.gardener.cloud/sync-period annotation. This is always allowed for shoots in the garden namespace. Shoots are not reconciled with a higher frequency than specified in GardenletConfiguration.controllers.shoot.syncPeriod. In case the gardenlet config allows it (controllers.shoot.respectSyncPeriodOverwrite, disabled by default), shoots can be marked as “ignored” by setting the shoot.gardener.cloud/ignore annotation. In this case, the gardenlet does not perform any reconciliation for the shoot. In case GardenletConfiguration.controllers.shoot.reconcileInMaintenanceOnly is enabled (disabled by default), the gardenlet performs regular shoot reconciliations only once in the respective maintenance time window (GardenletConfiguration.controllers.shoot.syncPeriod is ignored). The gardenlet randomly distributes shoot reconciliations over the maintenance time window to avoid high bursts of reconciliations (see Shoot Maintenance). In case Shoot.spec.maintenance.confineSpecUpdateRollout is enabled (disabled by default), changes to the shoot specification are not rolled out immediately but only during the respective maintenance time window (see Shoot Maintenance). “Care” Reconciler This reconciler performs three “care” actions related to Shoots.\nConditions It maintains the following conditions:\n APIServerAvailable: The /healthz endpoint of the shoot’s kube-apiserver is called and considered healthy when it responds with 200 OK. ControlPlaneHealthy: The control plane is considered healthy when the respective Deployments (for example kube-apiserver,kube-controller-manager), and Etcds (for example etcd-main) exist and are healthy. 
ObservabilityComponentsHealthy: This condition is considered healthy when the respective Deployments (for example plutono) and StatefulSets (for example prometheus,vali) exist and are healthy. EveryNodeReady: The conditions of the worker nodes are checked (e.g., Ready, MemoryPressure). Also, it’s checked whether the Kubernetes version of the installed kubelet matches the desired version specified in the Shoot resource. SystemComponentsHealthy: The conditions of the ManagedResources are checked (e.g., ResourcesApplied). Also, it is verified whether the VPN tunnel connection is established (which is required for the kube-apiserver to communicate with the worker nodes). Sometimes, ManagedResources can have both Healthy and Progressing conditions set to True (e.g., when a DaemonSet rolls out one-by-one on a large cluster with many nodes) while this is not reflected in the Shoot status. In order to catch issues where the rollout gets stuck, one can set .controllers.shootCare.managedResourceProgressingThreshold in the gardenlet’s component configuration. If the Progressing condition is still True for more than the configured duration, the SystemComponentsHealthy condition in the Shoot is set to False, eventually.\nEach condition can optionally also have error codes in order to indicate which type of issue was detected (see Shoot Status for more details).\nApart from the above, extension controllers can also contribute to the status or error codes of these conditions (see Contributing to Shoot Health Status Conditions for more details).\nIf all checks for a certain condition succeed, then its status will be set to True. Otherwise, it will be set to False.\nIf at least one check fails and there is threshold configuration for the conditions (in .controllers.shootCare.conditionThresholds), then the status will be set:\n to Progressing if it was True before. 
to Progressing if it was Progressing before and the lastUpdateTime of the condition does not exceed the configured threshold duration yet. to False if it was Progressing before and the lastUpdateTime of the condition exceeds the configured threshold duration. The condition thresholds can be used to prevent reporting issues too early just because there is a rollout or a short disruption. Only if the unhealthiness persists for at least the configured threshold duration, then the issues will be reported (by setting the status to False).\nBesides directly checking the status of Deployments, Etcds, StatefulSets in the shoot namespace, this reconciler also considers ManagedResources (in the shoot namespace) and their status in order to compute the condition statuses, see this document for more information. The following table explains which ManagedResources are considered for which condition type:\n Condition Type ManagedResources are considered when ControlPlaneHealthy .spec.class=seed and care.gardener.cloud/condition-type label either unset, or set to ControlPlaneHealthy ObservabilityComponentsHealthy care.gardener.cloud/condition-type label set to ObservabilityComponentsHealthy SystemComponentsHealthy .spec.class unset or care.gardener.cloud/condition-type label set to SystemComponentsHealthy Constraints And Automatic Webhook Remediation Please see Shoot Status for more details.\nGarbage Collection Stale pods in the shoot namespace in the seed cluster and in the kube-system namespace in the shoot cluster are deleted. A pod is considered stale when:\n it was terminated with reason Evicted. it was terminated with reason starting with OutOf (e.g., OutOfCpu). it was terminated with reason NodeAffinity. it is stuck in termination (i.e., if its deletionTimestamp is more than 5m ago). 
“State” Reconciler This reconciler periodically (default: every 6h) performs backups of the state of Shoot clusters and persists them into ShootState resources in the same namespace as the Shoots in the garden cluster. It is only started in case the gardenlet is responsible for an unmanaged Seed, i.e. a Seed which is not backed by a seedmanagement.gardener.cloud/v1alpha1.ManagedSeed object. Alternatively, it can be disabled by setting concurrentSyncs=0 for the controller in the gardenlet’s component configuration.\nPlease refer to GEP-22: Improved Usage of the ShootState API for all information.\nTokenRequestor Controller For ServiceAccounts The gardenlet uses an instance of the TokenRequestor controller which initially was developed in the context of the gardener-resource-manager, please read this document for further information.\ngardenlet uses it for requesting tokens for components running in the seed cluster that need to communicate with the garden cluster. The mechanism works the same way as for shoot control plane components running in the seed which need to communicate with the shoot cluster. However, gardenlet’s instance of the TokenRequestor controller is restricted to Secrets labeled with resources.gardener.cloud/class=garden. Furthermore, it doesn’t respect the serviceaccount.resources.gardener.cloud/namespace annotation. Instead, it always uses the seed’s namespace in the garden cluster for managing ServiceAccounts and their tokens.\nTokenRequestor Controller For WorkloadIdentitys The TokenRequestorWorkloadIdentity controller in the gardenlet reconciles Secrets labeled with security.gardener.cloud/purpose=workload-identity-token-requestor. When it encounters such a Secret, it associates the Secret with a specific WorkloadIdentity using the annotations workloadidentity.security.gardener.cloud/name and workloadidentity.security.gardener.cloud/namespace. Any workload creating such Secrets is responsible for labeling and annotating the Secrets accordingly. 
After the association is made, the gardenlet requests a token for the specific WorkloadIdentity from the Gardener API Server and writes it back in the Secret’s data against the token key. The gardenlet is responsible to keep this token valid by refreshing it periodically. The token is then used by components running in the seed cluster in order to present the said WorkloadIdentity before external systems, e.g. by calling cloud provider APIs.\nPlease refer to GEP-26: Workload Identity - Trust Based Authentication for more details.\nVPAEvictionRequirements Controller The VPAEvictionRequirements controller in the gardenlet reconciles VerticalPodAutoscaler objects labeled with autoscaling.gardener.cloud/eviction-requirements: managed-by-controller. It manages the EvictionRequirements on a VPA object, which are used to restrict when and how a Pod can be evicted to apply a new resource recommendation. Specifically, the following actions will be taken for the respective label and annotation configuration:\n If the VPA has the annotation eviction-requirements.autoscaling.gardener.cloud/downscale-restriction: never, an EvictionRequirement is added to the VPA object that allows evictions for upscaling only If the VPA has the annotation eviction-requirements.autoscaling.gardener.cloud/downscale-restriction: in-maintenance-window-only, the same EvictionRequirement is added to the VPA object when the Shoot is currently outside of its maintenance window. When the Shoot is inside its maintenance window, the EvictionRequirement is removed. Information about the Shoot maintenance window times are stored in the annotation shoot.gardener.cloud/maintenance-window on the VPA Managed Seeds Gardener users can use shoot clusters as seed clusters, so-called “managed seeds” (aka “shooted seeds”), by creating ManagedSeed resources. 
By default, the gardenlet that manages this shoot cluster then automatically creates a clone of itself with the same version and the same configuration that it currently has. Then it deploys the gardenlet clone into the managed seed cluster.\nFor more information, see ManagedSeeds: Register Shoot as Seed.\nMigrating from Previous Gardener Versions If your Gardener version doesn’t support gardenlets yet, no special migration is required, but the following prerequisites must be met:\n Your Gardener version is at least 0.31 before upgrading to v1. You have to make sure that your garden cluster is exposed in a way that it’s reachable from all your seed clusters. With previous Gardener versions, you had deployed the Gardener Helm chart (incorporating the API server, controller-manager, and scheduler). With v1, this stays the same, but you now have to deploy the gardenlet Helm chart as well into all of your seeds (if they aren’t managed, as mentioned earlier).\nSee Deploy a gardenlet for all instructions.\nRelated Links Gardener Architecture #356: Implement Gardener Scheduler #2309: Add /healthz endpoint for gardenlet ","categories":"","description":"Understand how the gardenlet, the primary \"agent\" on every seed cluster, works and learn more about the different Gardener components","excerpt":"Understand how the gardenlet, the primary \"agent\" on every seed …","ref":"/docs/gardener/concepts/gardenlet/","tags":"","title":"gardenlet"},{"body":"Using annotated Gateway API Gateway and/or HTTPRoutes as Source This tutorial describes how to use annotated Gateway API resources as source for Certificate.\nInstall Istio on your cluster Follow the Istio Kubernetes Gateway API to install the Gateway API and to install Istio.\nThese are the typical commands for the Istio installation with the Kubernetes Gateway API:\nexport KUBECONFIG=... 
curl -L https://istio.io/downloadIstio | sh - kubectl get crd gateways.gateway.networking.k8s.io \u0026\u003e /dev/null || \\ { kubectl kustomize \"github.com/kubernetes-sigs/gateway-api/config/crd?ref=v1.0.0\" | kubectl apply -f -; } istioctl install --set profile=minimal -y kubectl label namespace default istio-injection=enabled Verify that Gateway Source works Install a sample service With automatic sidecar injection:\n$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/httpbin/httpbin.yaml Note: The sample service is not used in the following steps. It is deployed for illustration purposes only. To use it with certificates, you have to add an HTTPS port for it.\nUsing a Gateway as a source Deploy the Gateway API configuration including a single exposed route (i.e., /get):\nkubectl create namespace istio-ingress kubectl apply -f - \u003c\u003cEOF apiVersion: gateway.networking.k8s.io/v1beta1 kind: Gateway metadata: name: gateway namespace: istio-ingress annotations: #cert.gardener.cloud/dnsnames: \"*.example.com\" # alternative if you want to control the dns names explicitly. 
cert.gardener.cloud/purpose: managed spec: gatewayClassName: istio listeners: - name: default hostname: \"*.example.com\" # this is used by cert-controller-manager to extract DNS names port: 443 protocol: HTTPS allowedRoutes: namespaces: from: All tls: # important: tls section must be defined with exactly one certificateRefs item certificateRefs: - name: foo-example-com --- apiVersion: gateway.networking.k8s.io/v1beta1 kind: HTTPRoute metadata: name: http namespace: default spec: parentRefs: - name: gateway namespace: istio-ingress hostnames: [\"httpbin.example.com\"] # this is used by cert-controller-manager to extract DNS names too rules: - matches: - path: type: PathPrefix value: /get backendRefs: - name: httpbin port: 8000 EOF You should now see a created Certificate resource similar to:\n$ kubectl -n istio-ingress get cert -oyaml apiVersion: v1 items: - apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: generateName: gateway-gateway- name: gateway-gateway-kdw6h namespace: istio-ingress ownerReferences: - apiVersion: gateway.networking.k8s.io/v1 blockOwnerDeletion: true controller: true kind: Gateway name: gateway spec: commonName: '*.example.com' secretName: foo-example-com status: ... kind: List metadata: resourceVersion: \"\" Using a HTTPRoute as a source If the Gateway resource is annotated with cert.gardener.cloud/purpose: managed, hostnames from all referencing HTTPRoute resources are automatically extracted. 
These resources don’t need an additional annotation.\nDeploy the Gateway API configuration including a single exposed route (i.e., /get):\nkubectl create namespace istio-ingress kubectl apply -f - \u003c\u003cEOF apiVersion: gateway.networking.k8s.io/v1beta1 kind: Gateway metadata: name: gateway namespace: istio-ingress annotations: cert.gardener.cloud/purpose: managed spec: gatewayClassName: istio listeners: - name: default hostname: null # not set port: 443 protocol: HTTPS allowedRoutes: namespaces: from: All tls: # important: tls section must be defined with exactly one certificateRefs item certificateRefs: - name: foo-example-com --- apiVersion: gateway.networking.k8s.io/v1beta1 kind: HTTPRoute metadata: name: http namespace: default spec: parentRefs: - name: gateway namespace: istio-ingress hostnames: [\"httpbin.example.com\"] # this is used by cert-controller-manager to extract DNS names too rules: - matches: - path: type: PathPrefix value: /get backendRefs: - name: httpbin port: 8000 EOF This should show a similar Certificate resource as above.\n","categories":"","description":"","excerpt":"Using annotated Gateway API Gateway and/or HTTPRoutes as Source This …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/tutorials/gateway-api-gateways/","tags":"","title":"Gateway Api Gateways"},{"body":"Using annotated Gateway API Gateway and/or HTTPRoutes as Source This tutorial describes how to use annotated Gateway API resources as source for DNSEntries with the Gardener shoot-dns-service extension.\nThe dns-controller-manager supports the resources Gateway and HTTPRoute.\nInstall Istio on your cluster Using a new or existing shoot cluster, follow the Istio Kubernetes Gateway API to install the Gateway API and to install Istio.\nThese are the typical commands for the Istio installation with the Kubernetes Gateway API:\nexport KUBECONFIG=... 
curl -L https://istio.io/downloadIstio | sh - kubectl get crd gateways.gateway.networking.k8s.io \u0026\u003e /dev/null || \\ { kubectl kustomize \"github.com/kubernetes-sigs/gateway-api/config/crd?ref=v1.0.0\" | kubectl apply -f -; } istioctl install --set profile=minimal -y kubectl label namespace default istio-injection=enabled Verify that Gateway Source works Install a sample service With automatic sidecar injection:\n$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/httpbin/httpbin.yaml Using a Gateway as a source Deploy the Gateway API configuration including a single exposed route (i.e., /get):\nkubectl create namespace istio-ingress kubectl apply -f - \u003c\u003cEOF apiVersion: gateway.networking.k8s.io/v1 kind: Gateway metadata: name: gateway namespace: istio-ingress annotations: dns.gardener.cloud/dnsnames: \"*.example.com\" dns.gardener.cloud/class: garden spec: gatewayClassName: istio listeners: - name: default hostname: \"*.example.com\" # this is used by dns-controller-manager to extract DNS names port: 80 protocol: HTTP allowedRoutes: namespaces: from: All --- apiVersion: gateway.networking.k8s.io/v1 kind: HTTPRoute metadata: name: http namespace: default spec: parentRefs: - name: gateway namespace: istio-ingress hostnames: [\"httpbin.example.com\"] # this is used by dns-controller-manager to extract DNS names too rules: - matches: - path: type: PathPrefix value: /get backendRefs: - name: httpbin port: 8000 EOF You should now see events in the namespace of the gateway:\n$ kubectl -n istio-system get events --sort-by={.metadata.creationTimestamp} LAST SEEN TYPE REASON OBJECT MESSAGE ... 
38s Normal dns-annotation service/gateway-istio httpbin.example.com: created dns entry object shoot--foo--bar/gateway-istio-service-zpf8n 38s Normal dns-annotation service/gateway-istio httpbin.example.com: dns entry pending: waiting for dns reconciliation 38s Normal dns-annotation service/gateway-istio httpbin.example.com: dns entry is pending 36s Normal dns-annotation service/gateway-istio httpbin.example.com: dns entry active Using a HTTPRoute as a source If the Gateway resource is annotated with dns.gardener.cloud/dnsnames: \"*\", hostnames from all referencing HTTPRoute resources are automatically extracted. These resources don’t need an additional annotation.\nDeploy the Gateway API configuration including a single exposed route (i.e., /get):\nkubectl create namespace istio-ingress kubectl apply -f - \u003c\u003cEOF apiVersion: gateway.networking.k8s.io/v1 kind: Gateway metadata: name: gateway namespace: istio-ingress annotations: dns.gardener.cloud/dnsnames: \"*\" dns.gardener.cloud/class: garden spec: gatewayClassName: istio listeners: - name: default hostname: null # not set port: 80 protocol: HTTP allowedRoutes: namespaces: from: All --- apiVersion: gateway.networking.k8s.io/v1 kind: HTTPRoute metadata: name: http namespace: default spec: parentRefs: - name: gateway namespace: istio-ingress hostnames: [\"httpbin.example.com\"] # this is used by dns-controller-manager to extract DNS names too rules: - matches: - path: type: PathPrefix value: /get backendRefs: - name: httpbin port: 8000 EOF This should show similar events as above.\nAccess the sample service using curl $ curl -I http://httpbin.example.com/get HTTP/1.1 200 OK server: istio-envoy date: Tue, 13 Feb 2024 08:09:41 GMT content-type: application/json content-length: 701 access-control-allow-origin: * access-control-allow-credentials: true x-envoy-upstream-service-time: 19 Accessing any other URL that has not been explicitly exposed should return an HTTP 404 error:\n$ curl -I 
http://httpbin.example.com/headers HTTP/1.1 404 Not Found date: Tue, 13 Feb 2024 08:09:41 GMT server: istio-envoy transfer-encoding: chunked ","categories":"","description":"","excerpt":"Using annotated Gateway API Gateway and/or HTTPRoutes as Source This …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/tutorials/gateway-api-gateways/","tags":"","title":"Gateway Api Gateways"},{"body":"Overview To troubleshoot certain problems in a Kubernetes cluster, operators need access to the host of the Kubernetes node. This can be required if a node misbehaves or fails to join the cluster in the first place.\nWith access to the host, it is for instance possible to check the kubelet logs and interact with common tools such as systemctl and journalctl.\nThe first section of this guide explores options to get a shell to the node of a Gardener Kubernetes cluster. The options described in the second section do not rely on Kubernetes capabilities to get shell access to a node and thus can also be used if an instance failed to join the cluster.\nThis guide only covers how to get access to the host, but does not cover troubleshooting methods.\n Overview Get a Shell to an Operational Cluster Node Gardener Dashboard Result Gardener Ops Toolbelt Custom Root Pod SSH Access to a Node That Failed to Join the Cluster Identifying the Problematic Instance gardenctl ssh SSH with a Manually Created Bastion on AWS Create the Bastion Security Group Create the Bastion Instance Connecting to the Target Instance Cleanup Get a Shell to an Operational Cluster Node The following describes four different approaches to get a shell to an operational Shoot worker node. As a prerequisite to troubleshooting a Kubernetes node, the node must have joined the cluster successfully and be able to run a pod. 
All of the described approaches involve scheduling a pod with root permissions and mounting the root filesystem.\nGardener Dashboard Prerequisite: the terminal feature is configured for the Gardener dashboard.\n Navigate to the cluster overview page and find the Terminal in the Access tile. Select the target Cluster (Garden, Seed / Control Plane, Shoot cluster) depending on the requirements and access rights (only certain users have access to the Seed Control Plane).\n To open the terminal configuration, interact with the top right-hand corner of the screen. Set the Terminal Runtime to “Privileged”. Also, specify the target node from the drop-down menu. Result The Dashboard then schedules a pod and opens a shell session to the node.\nTo get access to the common binaries installed on the host, prefix the command with chroot /hostroot. Note that the path depends on where the root path is mounted in the container. In the default image used by the Dashboard, it is under /hostroot.\nGardener Ops Toolbelt Prerequisite: kubectl is available.\nThe Gardener ops-toolbelt can be used as a convenient way to deploy a root pod to a node. The pod uses an image that is bundled with a bunch of useful troubleshooting tools. This is also the same image that is used by default when using the Gardener Dashboard terminal feature as described in the previous section.\nThe easiest way to use the Gardener ops-toolbelt is to execute the ops-pod script in the hacks folder. To get root shell access to a node, execute the aforementioned script by supplying the target node name as an argument:\n\u003cpath-to-ops-toolbelt-repo\u003e/hacks/ops-pod \u003ctarget-node\u003e Custom Root Pod Alternatively, a pod can be assigned to a target node and a shell can be opened via standard Kubernetes means. 
To enable root access to the node, the pod specification requires proper securityContext and volume properties.\nFor instance, you can use the following pod manifest, after replacing \u003ctarget-node-name\u003e with the name of the node you want this pod attached to:\napiVersion: v1 kind: Pod metadata: name: privileged-pod namespace: default spec: nodeSelector: kubernetes.io/hostname: \u003ctarget-node-name\u003e containers: - name: busybox image: busybox stdin: true securityContext: privileged: true volumeMounts: - name: host-root-volume mountPath: /host readOnly: true volumes: - name: host-root-volume hostPath: path: / hostNetwork: true hostPID: true restartPolicy: Never SSH Access to a Node That Failed to Join the Cluster This section explores two options that can be used to get SSH access to a node that failed to join the cluster. As it is not possible to schedule a pod on the node, the Kubernetes-based methods explored so far cannot be used in this scenario.\nAdditionally, Gardener typically provisions worker instances in a private subnet of the VPC, hence there is no public IP address that could be used for direct SSH access.\nFor this scenario, cloud providers typically have extensive documentation (e.g., AWS \u0026 GCP) and in some cases tooling support. However, these approaches are mostly cloud provider specific, require interaction via their CLI and API or sometimes the installation of a cloud provider specific agent on the node.\nAlternatively, gardenctl can be used, providing cloud provider agnostic, out-of-the-box support for getting SSH access to an instance in a private subnet. Currently gardenctl supports AWS, GCP, OpenStack, Azure and Alibaba Cloud.\nIdentifying the Problematic Instance First, the problematic instance has to be identified. 
In Gardener, worker pools can be created in different cloud provider regions, zones, and accounts.\nThe instance would typically show up as successfully started / running in the cloud provider dashboard or API and it is not immediately obvious which one has a problem. Instead, we can use the Gardener API / CRDs to obtain the faulty instance identifier in a cloud-agnostic way.\nGardener uses the Machine Controller Manager to create the Shoot worker nodes. For each worker node, the Machine Controller Manager creates a Machine CRD in the Shoot namespace in the respective Seed cluster. Usually the problematic instance can be identified, as the respective Machine CRD has status pending.\nThe instance / node name can be obtained from the Machine .status field:\nkubectl get machine \u003cmachine-name\u003e -o json | jq -r .status.node This is all the information needed to go ahead and use gardenctl ssh to get a shell to the node. In addition, the used cloud provider, the specific identifier of the instance, and the instance region can be identified from the Machine CRD.\nGet the identifier of the instance via:\nkubectl get machine \u003cmachine-name\u003e -o json | jq -r .spec.providerID // e.g aws:///eu-north-1/i-069733c435bdb4640 The identifier shows that the instance belongs to the cloud provider aws with the ec2 instance-id i-069733c435bdb4640 in region eu-north-1.\nTo get more information about the instance, check out the MachineClass (e.g., AWSMachineClass) that is associated with each Machine CRD in the Shoot namespace of the Seed cluster.\nThe AWSMachineClass contains the machine image (ami), machine-type, iam information, network-interfaces, subnets, security groups and attached volumes.\nOf course, the information can also be used to get the instance with the cloud provider CLI / API.\ngardenctl ssh Using the node name of the problematic instance, we can use the gardenctl ssh command to get SSH access to the cloud provider instance via an automatically set up 
bastion host. gardenctl takes care of spinning up the bastion instance, setting up the SSH keys, ports and security groups and opens a root shell on the target instance. After the SSH session has ended, gardenctl deletes the created cloud provider resources.\nUse the following commands:\n First, target a Garden cluster containing all the Shoot definitions. gardenctl target garden \u003ctarget-garden\u003e Target an available Shoot by name. This sets up the context, configures the kubeconfig file of the Shoot cluster and downloads the cloud provider credentials. Subsequent commands will execute in this context. gardenctl target shoot \u003ctarget-shoot\u003e This uses the cloud provider credentials to spin up the bastion and to open a shell on the target instance. gardenctl ssh \u003ctarget-node\u003e SSH with a Manually Created Bastion on AWS In case you are not using gardenctl or want to control the bastion instance yourself, you can also manually set it up. The steps described here are generally the same as those used by gardenctl internally. Despite some cloud provider specifics, they can be generalized to the following list:\n Open port 22 on the target instance. Create an instance / VM in a public subnet (the bastion instance needs to have a public IP address). Set-up security groups and roles, and open port 22 for the bastion instance. The following diagram shows an overview of how the SSH access to the target instance works:\nThis guide demonstrates the setup of a bastion on AWS.\nPrerequisites:\n The AWS CLI is set up.\n Obtain target instance-id (see Identifying the Problematic Instance).\n Obtain the VPC ID the Shoot resources are created in. 
This can be found in the Infrastructure CRD in the Shoot namespace in the Seed.\n Make sure that port 22 on the target instance is open (default for Gardener deployed instances).\n Extract security group via: aws ec2 describe-instances --instance-ids \u003cinstance-id\u003e Check for rule that allows inbound connections on port 22: aws ec2 describe-security-groups --group-ids=\u003csecurity-group-id\u003e If not available, create the rule with the following command: aws ec2 authorize-security-group-ingress --group-id \u003csecurity-group-id\u003e --protocol tcp --port 22 --cidr 0.0.0.0/0 Create the Bastion Security Group The common name of the security group is \u003cshoot-name\u003e-bsg. Create the security group: aws ec2 create-security-group --group-name \u003cbastion-security-group-name\u003e --description ssh-access --vpc-id \u003cVPC-ID\u003e Optionally, create identifying tags for the security group: aws ec2 create-tags --resources \u003cbastion-security-group-id\u003e --tags Key=component,Value=\u003ctag\u003e Create a permission in the bastion security group that allows ssh access on port 22: aws ec2 authorize-security-group-ingress --group-id \u003cbastion-security-group-id\u003e --protocol tcp --port 22 --cidr 0.0.0.0/0 Create an IAM role for the bastion instance with the name \u003cshoot-name\u003e-bastions: aws iam create-role --role-name \u003cshoot-name\u003e-bastions The content should be:\n{ \"Version\": \"2012-10-17\", \"Statement\": [ { \"Effect\": \"Allow\", \"Action\": [ \"ec2:DescribeRegions\" ], \"Resource\": [ \"*\" ] } ] } Create the instance profile and name it \u003cshoot-name\u003e-bastions: aws iam create-instance-profile --instance-profile-name \u003cname\u003e Add the created role to the instance profile: aws iam add-role-to-instance-profile --instance-profile-name \u003cinstance-profile-name\u003e --role-name \u003crole-name\u003e Create the Bastion Instance Next, in order to be able to ssh into the bastion instance, the instance has 
to be set up with a user with a public ssh key. Create a user gardener that has the same Gardener-generated public ssh key as the target instance.\n First, we need to get the public part of the Shoot ssh-key. The ssh-key is stored in a secret in the project namespace in the Garden cluster. The name is: \u003cshoot-name\u003e-ssh-publickey. Get the key via: kubectl get secret aws-gvisor.ssh-keypair -o json | jq -r .data.\\\"id_rsa.pub\\\" A script handed over as user-data to the bastion ec2 instance can be used to create the gardener user and add the ssh-key. For your convenience, you can use the following script to generate the user-data. #!/bin/bash -eu saveUserDataFile () { ssh_key=$1 cat \u003e gardener-bastion-userdata.sh \u003c\u003cEOF #!/bin/bash -eu id gardener || useradd gardener -mU mkdir -p /home/gardener/.ssh echo \"$ssh_key\" \u003e /home/gardener/.ssh/authorized_keys chown gardener:gardener /home/gardener/.ssh/authorized_keys echo \"gardener ALL=(ALL) NOPASSWD:ALL\" \u003e/etc/sudoers.d/99-gardener-user EOF } if [ -p /dev/stdin ]; then read -r input cat | saveUserDataFile \"$input\" else pbpaste | saveUserDataFile \"$input\" fi Use the script by handing-over the public ssh-key of the Shoot cluster: kubectl get secret aws-gvisor.ssh-keypair -o json | jq -r .data.\\\"id_rsa.pub\\\" | ./generate-userdata.sh This generates a file called gardener-bastion-userdata.sh in the same directory containing the user-data.\n The following information is needed to create the bastion instance: bastion-IAM-instance-profile-name - Use the created instance profile with the name \u003cshoot-name\u003e-bastions\nimage-id - It is possible to use the same image-id as the one used for the target instance (or any other image). Has cloud provider specific format (AWS: ami).\nssh-public-key-name\n- This is the ssh key pair already created in the Shoot's cloud provider account by Gardener during the `Infrastructure` CRD reconciliation. 
- The name is usually: `\u003cshoot-name\u003e-ssh-publickey` subnet-id - Choose a subnet that is attached to an Internet Gateway and NAT Gateway (bastion instance must have a public IP). - The Gardener created public subnet with the name \u003cshoot-name\u003e-public-utility-\u003cxy\u003e can be used. Please check the created subnets with the cloud provider.\nbastion-security-group-id - Use the id of the created bastion security group.\nfile-path-to-userdata - Use the filepath to the user-data file generated in the previous step.\n bastion-instance-name Optionally, you can tag the instance. Usually \u003cshoot-name\u003e-bastions Create the bastion instance via: aws ec2 run-instances --iam-instance-profile Name=\u003cbastion-IAM-instance-profile-name\u003e --image-id \u003cimage-id\u003e --count 1 --instance-type t3.nano --key-name \u003cssh-public-key-name\u003e --security-group-ids \u003cbastion-security-group-id\u003e --subnet-id \u003csubnet-id\u003e --associate-public-ip-address --user-data \u003cfile-path-to-userdata\u003e --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=\u003cbastion-instance-name\u003e},{Key=component,Value=\u003cmytag\u003e}]' 'ResourceType=volume,Tags=[{Key=component,Value=\u003cmytag\u003e}]' Capture the instance-id from the response and wait until the ec2 instance is running and has a public IP address.\nConnecting to the Target Instance Save the private key of the ssh-key-pair in a temporary local file for later use: umask 077 kubectl get secret \u003cshoot-name\u003e.ssh-keypair -o json | jq -r .data.\\\"id_rsa\\\" | base64 -d \u003e id_rsa.key Use the private ssh key to ssh into the bastion instance: ssh -i \u003cpath-to-private-key\u003e gardener@\u003cpublic-bastion-instance-ip\u003e  If that works, connect from your local terminal to the target instance via the bastion: ssh -i \u003cpath-to-private-key\u003e -o ProxyCommand=\"ssh -W %h:%p -i \u003cprivate-key\u003e -o IdentitiesOnly=yes -o StrictHostKeyChecking=no 
gardener@\u003cpublic-ip-bastion\u003e\" gardener@\u003cprivate-ip-target-instance\u003e -o IdentitiesOnly=yes -o StrictHostKeyChecking=no Cleanup Do not forget to cleanup the created resources. Otherwise Gardener will eventually fail to delete the Shoot.\n","categories":"","description":"Describes the methods for getting shell access to worker nodes","excerpt":"Describes the methods for getting shell access to worker nodes","ref":"/docs/guides/monitoring-and-troubleshooting/shell-to-node/","tags":"","title":"Get a Shell to a Gardener Shoot Worker Node"},{"body":"Deploying Rsyslog Relp Extension Locally This document will walk you through running the Rsyslog Relp extension and a fake rsyslog relp service on your local machine for development purposes. This guide uses Gardener’s local development setup and builds on top of it.\nIf you encounter difficulties, please open an issue so that we can make this process easier.\nPrerequisites Make sure that you have a running local Gardener setup. The steps to complete this can be found here. Make sure you are running Gardener version \u003e= 1.74.0 or the latest version of the master branch. Setting up the Rsyslog Relp Extension Important: Make sure that your KUBECONFIG env variable is targeting the local Gardener cluster!\nmake extension-up This will build the shoot-rsyslog-relp, shoot-rsyslog-relp-admission, and shoot-rsyslog-relp-echo-server images and deploy the needed resources and configurations in the garden cluster. 
The shoot-rsyslog-relp-echo-server will act as development replacement of a real rsyslog relp server.\nCreating a Shoot Cluster Once the above step is completed, we can deploy and configure a Shoot cluster with default rsyslog relp settings.\nkubectl apply -f ./example/shoot.yaml Once the Shoot’s namespace is created, we can create a networkpolicy that will allow egress traffic from the rsyslog on the Shoot’s nodes to the rsyslog-relp-echo-server that serves as a fake rsyslog target server.\nkubectl apply -f ./example/local/allow-machine-to-rsyslog-relp-echo-server-netpol.yaml Currently, the Shoot’s nodes run Ubuntu, which does not have the rsyslog-relp and auditd packages installed, so the configuration done by the extension has no effect. Once the Shoot is created, we have to manually install the rsyslog-relp and auditd packages:\nkubectl -n shoot--local--local exec -it $(kubectl -n shoot--local--local get po -l app=machine,machine-provider=local -o name) -- bash -c \" apt-get update \u0026\u0026 \\ apt-get install -y rsyslog-relp auditd \u0026\u0026 \\ systemctl enable rsyslog.service \u0026\u0026 \\ systemctl start rsyslog.service\" Once that is done we can verify that log messages are forwarded to the rsyslog-relp-echo-server by checking its logs.\nkubectl -n rsyslog-relp-echo-server logs deployment/rsyslog-relp-echo-server Making Changes to the Rsyslog Relp Extension Changes to the rsyslog relp extension can be applied to the local environment by repeatedly running the make recipe.\nmake extension-up Tearing Down the Development Environment To tear down the development environment, delete the Shoot cluster or disable the shoot-rsyslog-relp extension in the Shoot’s spec. 
When the extension is not used by the Shoot anymore, you can run:\nmake extension-down This will delete the ControllerRegistration and ControllerDeployment of the extension, the shoot-rsyslog-relp-admission deployment, and the rsyslog-relp-echo-server deployment.\nMaintaining the Publicly Available Image for the rsyslog-relp Echo Server The testmachinery tests use an rsyslog-relp-echo-server image from a publicly available repository. The one which is currently used is eu.gcr.io/gardener-project/gardener/extensions/shoot-rsyslog-relp-echo-server:v0.1.0.\nSometimes it might be necessary to update the image and publish it, e.g. when updating the alpine base image version specified in the repository’s Dockerfile.\nTo do that:\n Bump the version with which the image is built in the Makefile.\n Build the shoot-rsyslog-relp-echo-server image:\nmake echo-server-docker-image Once the image is built, push it to gcr with:\nmake push-echo-server-image Finally, bump the version of the image used by the testmachinery tests here.\n Create a PR with the changes.\n ","categories":"","description":"","excerpt":"Deploying Rsyslog Relp Extension Locally This document will walk you …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/getting-started/","tags":"","title":"Getting Started"},{"body":"Deploying Gardener Locally This document will walk you through deploying Gardener on your local machine. If you encounter difficulties, please open an issue so that we can make this process easier.\nOverview Gardener runs in any Kubernetes cluster. 
In this guide, we will start a KinD cluster which is used as both garden and seed cluster (please refer to the architecture overview) for simplicity.\nBased on Skaffold, the container images for all required components will be built and deployed into the cluster (via their Helm charts).\nAlternatives When deploying Gardener on your local machine you might face several limitations:\n Your machine doesn’t have enough compute resources (see prerequisites) for hosting a second seed cluster or multiple shoot clusters. Testing Gardener’s IPv6 features requires a Linux machine and native IPv6 connectivity to the internet, but you’re on macOS or don’t have IPv6 connectivity in your office environment or via your home ISP. In these cases, you might want to check out one of the following options that run the setup described in this guide elsewhere for circumventing these limitations:\n remote local setup: deploy on a remote pod for more compute resources dev box on Google Cloud: deploy on a Google Cloud machine for more compute resource and/or simple IPv4/IPv6 dual-stack networking Prerequisites Make sure that you have followed the Local Setup guide up until the Get the sources step. Make sure your Docker daemon is up-to-date, up and running and has enough resources (at least 8 CPUs and 8Gi memory; see here how to configure the resources for Docker for Mac). Please note that 8 CPU / 8Gi memory might not be enough for more than two Shoot clusters, i.e., you might need to increase these values if you want to run additional Shoots. If you plan on following the optional steps to create a second seed cluster, the required resources will be more - at least 10 CPUs and 18Gi memory. Additionally, please configure at least 120Gi of disk size for the Docker daemon. 
Tip: You can clean up unused data with docker system df and docker system prune -a.\n Setting Up the KinD Cluster (Garden and Seed) make kind-up If you want to setup an IPv6 KinD cluster, use make kind-up IPFAMILY=ipv6 instead.\n This command sets up a new KinD cluster named gardener-local and stores the kubeconfig in the ./example/gardener-local/kind/local/kubeconfig file.\n It might be helpful to copy this file to $HOME/.kube/config, since you will need to target this KinD cluster multiple times. Alternatively, make sure to set your KUBECONFIG environment variable to ./example/gardener-local/kind/local/kubeconfig for all future steps via export KUBECONFIG=$PWD/example/gardener-local/kind/local/kubeconfig.\n All of the following steps assume that you are using this kubeconfig.\nAdditionally, this command also deploys a local container registry to the cluster, as well as a few registry mirrors, that are set up as a pull-through cache for all upstream registries Gardener uses by default. This is done to speed up image pulls across local clusters. The local registry can be accessed as localhost:5001 for pushing and pulling. The storage directories of the registries are mounted to the host machine under dev/local-registry. With this, mirrored images don’t have to be pulled again after recreating the cluster.\nThe command also deploys a default calico installation as the cluster’s CNI implementation with NetworkPolicy support (the default kindnet CNI doesn’t provide NetworkPolicy support). Furthermore, it deploys the metrics-server in order to support HPA and VPA on the seed cluster.\nSetting Up IPv6 Single-Stack Networking (optional) First, ensure that your /etc/hosts file contains an entry resolving localhost to the IPv6 loopback address:\n::1 localhost Typically, only ip6-localhost is mapped to ::1 on linux machines. 
However, we need localhost to resolve to both 127.0.0.1 and ::1 so that we can talk to our registry via a single address (localhost:5001).\nNext, we need to configure NAT for outgoing traffic from the kind network to the internet. After executing make kind-up IPFAMILY=ipv6, execute the following command to set up the corresponding iptables rules:\nip6tables -t nat -A POSTROUTING -o $(ip route show default | awk '{print $5}') -s fd00:10::/64 -j MASQUERADE Setting Up Gardener make gardener-up If you want to setup an IPv6 ready Gardener, use make gardener-up IPFAMILY=ipv6 instead.\n This will first build the base images (which might take a bit if you do it for the first time). Afterwards, the Gardener resources will be deployed into the cluster.\nDeveloping Gardener make gardener-dev This is similar to make gardener-up but additionally starts a skaffold dev loop. After the initial deployment, skaffold starts watching source files. Once it has detected changes, press any key to trigger a new build and deployment of the changed components.\nTip: you can set the SKAFFOLD_MODULE environment variable to select specific modules of the skaffold configuration (see skaffold.yaml) that skaffold should watch, build, and deploy. This significantly reduces turnaround times during development.\nFor example, if you want to develop changes to gardenlet:\n# initial deployment of all components make gardener-up # start iterating on gardenlet without deploying other components make gardener-dev SKAFFOLD_MODULE=gardenlet Debugging Gardener make gardener-debug This is using skaffold debugging features. In the Gardener case, Go debugging using Delve is the most relevant use case. Please see the skaffold debugging documentation how to setup your IDE accordingly.\nSKAFFOLD_MODULE environment variable is working the same way as described for Developing Gardener. 
However, skaffold does not watch for changes when debugging, in order to avoid interrupting your debugging session.\nFor example, if you want to debug gardenlet:\n# initial deployment of all components make gardener-up # start debugging gardenlet without deploying other components make gardener-debug SKAFFOLD_MODULE=gardenlet In the debugging flow, skaffold builds your container images, reconfigures your pods and creates port forwardings for the Delve debugging ports to your localhost. The default port is 56268. If you debug multiple pods at the same time, the port of the second pod will be forwarded to 56269 and so on. Please check your console output for the concrete port-forwarding on your machine.\n Note: Resuming or stopping only a single goroutine (Go Issue 25578, 31132) is currently not supported, so the action will cause all the goroutines to get activated or paused. (vscode-go wiki)\n This means that when a goroutine of gardenlet (or any other gardener-core component you try to debug) is paused on a breakpoint, all the other goroutines are paused as well. Hence, when the whole gardenlet process is paused, it cannot renew its lease and cannot respond to the liveness and readiness probes. Skaffold automatically increases timeoutSeconds of liveness and readiness probes to 600. 
Nevertheless, pods were still killed after a while during debugging.\nThus, leader election, health and readiness checks for gardener-admission-controller, gardener-apiserver, gardener-controller-manager, gardener-scheduler, gardenlet and operator are disabled when debugging.\nIf you have similar problems with other components which are not deployed by skaffold, you could temporarily turn off the leader election and disable liveness and readiness probes there too.\nCreating a Shoot Cluster You can wait for the Seed to be ready by running:\n./hack/usage/wait-for.sh seed local GardenletReady SeedSystemComponentsHealthy ExtensionsReady Alternatively, you can run kubectl get seed local and wait for the STATUS to indicate readiness:\nNAME STATUS PROVIDER REGION AGE VERSION K8S VERSION local Ready local local 4m42s vX.Y.Z-dev v1.25.1 In order to create a first shoot cluster, just run:\nkubectl apply -f example/provider-local/shoot.yaml You can wait for the Shoot to be ready by running:\nNAMESPACE=garden-local ./hack/usage/wait-for.sh shoot local APIServerAvailable ControlPlaneHealthy ObservabilityComponentsHealthy EveryNodeReady SystemComponentsHealthy Alternatively, you can run kubectl -n garden-local get shoot local and wait for the LAST OPERATION to reach 100%:\nNAME CLOUDPROFILE PROVIDER REGION K8S VERSION HIBERNATION LAST OPERATION STATUS AGE local local local local 1.25.1 Awake Create Processing (43%) healthy 94s If you don’t need any worker pools, you can create a workerless Shoot by running:\nkubectl apply -f example/provider-local/shoot-workerless.yaml (Optional): You could also execute a simple e2e test (creating and deleting a shoot) by running:\nmake test-e2e-local-simple KUBECONFIG=\"$PWD/example/gardener-local/kind/local/kubeconfig\" Accessing the Shoot Cluster ⚠️ Please note that in this setup, shoot clusters are not accessible by default when you download the kubeconfig and try to communicate with them. 
The reason is that your host most probably cannot resolve the DNS names of the clusters since provider-local extension runs inside the KinD cluster (for more details, see DNSRecord). Hence, if you want to access the shoot cluster, you have to run the following command which will extend your /etc/hosts file with the required information to make the DNS names resolvable:\ncat \u003c\u003cEOF | sudo tee -a /etc/hosts # Begin of Gardener local setup section # Shoot API server domains 172.18.255.1 api.local.local.external.local.gardener.cloud 172.18.255.1 api.local.local.internal.local.gardener.cloud # Ingress 172.18.255.1 p-seed.ingress.local.seed.local.gardener.cloud 172.18.255.1 g-seed.ingress.local.seed.local.gardener.cloud 172.18.255.1 gu-local--local.ingress.local.seed.local.gardener.cloud 172.18.255.1 p-local--local.ingress.local.seed.local.gardener.cloud 172.18.255.1 v-local--local.ingress.local.seed.local.gardener.cloud # E2e tests 172.18.255.1 api.e2e-managedseed.garden.external.local.gardener.cloud 172.18.255.1 api.e2e-managedseed.garden.internal.local.gardener.cloud 172.18.255.1 api.e2e-hib.local.external.local.gardener.cloud 172.18.255.1 api.e2e-hib.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-hib-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-hib-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-unpriv.local.external.local.gardener.cloud 172.18.255.1 api.e2e-unpriv.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-wake-up.local.external.local.gardener.cloud 172.18.255.1 api.e2e-wake-up.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-wake-up-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-wake-up-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-wake-up-ncp.local.external.local.gardener.cloud 172.18.255.1 api.e2e-wake-up-ncp.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-migrate.local.external.local.gardener.cloud 172.18.255.1 
api.e2e-migrate.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-migrate-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-migrate-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-mgr-hib.local.external.local.gardener.cloud 172.18.255.1 api.e2e-mgr-hib.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-rotate.local.external.local.gardener.cloud 172.18.255.1 api.e2e-rotate.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-rotate-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-rotate-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-default.local.external.local.gardener.cloud 172.18.255.1 api.e2e-default.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-default-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-default-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-force-delete.local.external.local.gardener.cloud 172.18.255.1 api.e2e-force-delete.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-fd-hib.local.external.local.gardener.cloud 172.18.255.1 api.e2e-fd-hib.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upd-node.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upd-node.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upd-node-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upd-node-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upgrade.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upgrade.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upgrade-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upgrade-wl.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upg-hib.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upg-hib.local.internal.local.gardener.cloud 172.18.255.1 api.e2e-upg-hib-wl.local.external.local.gardener.cloud 172.18.255.1 api.e2e-upg-hib-wl.local.internal.local.gardener.cloud 172.18.255.1 gu-local--e2e-rotate.ingress.local.seed.local.gardener.cloud 
172.18.255.1 gu-local--e2e-rotate-wl.ingress.local.seed.local.gardener.cloud # End of Gardener local setup section EOF To access the Shoot, you can acquire a kubeconfig by using the shoots/adminkubeconfig subresource.\nFor convenience a helper script is provided in the hack directory. By default the script will generate a kubeconfig for a Shoot named “local” in the garden-local namespace valid for one hour.\n./hack/usage/generate-admin-kubeconf.sh \u003e admin-kubeconf.yaml If you want to change the default namespace or shoot name, you can do so by passing different values as arguments.\n./hack/usage/generate-admin-kubeconf.sh --namespace \u003cnamespace\u003e --shoot-name \u003cshootname\u003e \u003e admin-kubeconf.yaml To access an Ingress resource from the Seed, use the Ingress host with port 8448 (https://\u003cingress-host\u003e:8448, for example https://gu-local--local.ingress.local.seed.local.gardener.cloud:8448).\n(Optional): Setting Up a Second Seed Cluster There are cases where you would want to create a second seed cluster in your local setup. For example, if you want to test the control plane migration feature. The following steps describe how to do that.\nIf you are on macOS, add a new IP address on your loopback device which will be necessary for the new KinD cluster that you will create. 
On macOS, the default loopback device is lo0.\nsudo ip addr add 172.18.255.2 dev lo0 # adding 172.18.255.2 ip to the loopback interface Next, set up the second KinD cluster:\nmake kind2-up This command sets up a new KinD cluster named gardener-local2 and stores its kubeconfig in the ./example/gardener-local/kind/local2/kubeconfig file.\nIn order to deploy required resources in the KinD cluster that you just created, run:\nmake gardenlet-kind2-up The following steps assume that you are using the kubeconfig that points to the gardener-local cluster (first KinD cluster): export KUBECONFIG=$PWD/example/gardener-local/kind/local/kubeconfig.\nYou can wait for the local2 Seed to be ready by running:\n./hack/usage/wait-for.sh seed local2 GardenletReady SeedSystemComponentsHealthy ExtensionsReady Alternatively, you can run kubectl get seed local2 and wait for the STATUS to indicate readiness:\nNAME STATUS PROVIDER REGION AGE VERSION K8S VERSION local2 Ready local local 4m42s vX.Y.Z-dev v1.25.1 If you want to perform control plane migration, you can follow the steps outlined in Control Plane Migration to migrate the shoot cluster to the second seed you just created.\nDeleting the Shoot Cluster ./hack/usage/delete shoot local garden-local (Optional): Tear Down the Second Seed Cluster make kind2-down Tear Down the Gardener Environment make kind-down Alternative Way to Set Up Garden and Seed Leveraging gardener-operator Instead of starting Garden and Seed via make kind-up gardener-up, you can also use gardener-operator to create your local dev landscape. In this setup, the virtual garden cluster has its own load balancer, so you have to create your own DNS entry in your /etc/hosts:\ncat \u003c\u003cEOF | sudo tee -a /etc/hosts # Manually created to access local Gardener virtual garden cluster. # TODO: Remove this again when the virtual garden cluster access is no longer required. 
172.18.255.3 api.virtual-garden.local.gardener.cloud EOF You can bring up gardener-operator with this command:\nmake kind-operator-up operator-up Afterwards, you can create your local Garden and install gardenlet into the KinD cluster with this command:\nmake operator-seed-up You can find the kubeconfig for the KinD cluster at ./example/gardener-local/kind/operator/kubeconfig. The one for the virtual garden is accessible at ./example/operator/virtual-garden/kubeconfig.\n [!IMPORTANT] When you create non-HA shoot clusters (i.e., Shoots with .spec.controlPlane.highAvailability.failureTolerance != zone), they are not exposed via 172.18.255.1 (ref). Instead, you need to find out under which Istio instance they are exposed, and put the corresponding IP address into your /etc/hosts file:\n# replace \u003cshoot-namespace\u003e with your shoot namespace (e.g., `shoot--foo--bar`): kubectl -n \"$(kubectl -n \u003cshoot-namespace\u003e get gateway kube-apiserver -o jsonpath={.spec.selector.istio} | sed 's/.*--/istio-ingress--/')\" get svc istio-ingressgateway -o jsonpath={.status.loadBalancer.ingress..ip} When the shoot cluster is HA (i.e., .spec.controlPlane.highAvailability.failureTolerance == zone), you can access it via 172.18.255.1.\n Please use this command to tear down your environment:\nmake kind-operator-down This setup supports creating shoots and managed seeds the same way as explained in the previous chapters. 
However, the development and debugging setups are not working yet.\nRemote Local Setup Just as Prow executes the KinD-based integration tests in a K8s pod, it is possible to interactively run this KinD-based Gardener development environment, aka “local setup”, in a “remote” K8s pod.\nk apply -f docs/deployment/content/remote-local-setup.yaml k exec -it remote-local-setup-0 -- sh tmux a Caveats Please refer to the TMUX documentation for working effectively inside the remote-local-setup pod.\nTo access Plutono, Prometheus or other components in a browser, two port forwards are needed:\nThe port forward from the laptop to the pod:\nk port-forward remote-local-setup-0 3000 The port forward in the remote-local-setup pod to the respective component:\nk port-forward -n shoot--local--local deployment/plutono 3000 Related Links Local Provider Extension ","categories":"","description":"","excerpt":"Deploying Gardener Locally This document will walk you through …","ref":"/docs/gardener/deployment/getting_started_locally/","tags":"","title":"Getting Started Locally"},{"body":"Developing Gardener Locally This document explains how to set up a kind-based environment for developing Gardener locally.\nFor the best development experience you should especially check the Developing Gardener section.\nIn case you plan a debugging session please check the Debugging Gardener section.\n","categories":"","description":"","excerpt":"Developing Gardener Locally This document explains how to setup a kind …","ref":"/docs/gardener/getting_started_locally/","tags":"","title":"Getting Started Locally"},{"body":"Etcd-Druid Local Setup This page aims to provide steps on how to set up Etcd-Druid locally with and without storage providers.\nClone the etcd-druid GitHub repo # clone the repo git clone https://github.com/gardener/etcd-druid.git # cd into etcd-druid folder cd etcd-druid Note:\n Etcd-druid uses kind as its local Kubernetes engine. 
The local setup is configured for kind due to its convenience but any other kubernetes setup would also work. To set up etcd-druid with backups enabled on a LocalStack provider, refer to this document. In the section Annotate Etcd CR with the reconcile annotation, the flag --enable-etcd-spec-auto-reconcile is set to false, which means a special annotation is required on the Etcd CR, for etcd-druid to reconcile it. To disable this behavior and allow auto-reconciliation of the Etcd CR for any change in the Etcd spec, set the controllers.etcd.enableEtcdSpecAutoReconcile value to true in the values.yaml located at charts/druid/values.yaml. Alternatively, if etcd-druid is run as a process, set the CLI flag --enable-etcd-spec-auto-reconcile=true when starting it. Setting up the kind cluster # Create a kind cluster make kind-up This creates a new kind cluster and stores the kubeconfig in the ./hack/e2e-test/infrastructure/kind/kubeconfig file.\nTo target this newly created cluster, set the KUBECONFIG environment variable to the kubeconfig file located at ./hack/e2e-test/infrastructure/kind/kubeconfig by using the following\nexport KUBECONFIG=$PWD/hack/e2e-test/infrastructure/kind/kubeconfig Setting up etcd-druid Either one of these commands may be used to deploy etcd-druid to the configured k8s cluster.\n The following command deploys etcd-druid to the configured k8s cluster:\nmake deploy The following command deploys etcd-druid to the configured k8s cluster using Skaffold dev mode, such that changes in the etcd-druid code are automatically picked up and applied to the deployment. This helps with local development and quick iterative changes:\nmake deploy-dev The following command deploys etcd-druid to the configured k8s cluster using Skaffold debug mode, so that a debugger can be attached to the running etcd-druid deployment. 
Please refer to this guide for more information on Skaffold-based debugging:\nmake deploy-debug This generates the Etcd and EtcdCopyBackupsTask CRDs and deploys an etcd-druid pod into the cluster.\n Note: Before calling any of the make deploy* commands, certain environment variables may be set in order to enable/disable certain functionalities of etcd-druid. These are:\n DRUID_ENABLE_ETCD_COMPONENTS_WEBHOOK=true : enables the etcdcomponents webhook DRUID_E2E_TEST=true : sets specific configuration for etcd-druid for optimal e2e test runs, like a lower sync period for the etcd controller. USE_ETCD_DRUID_FEATURE_GATES=false : enables etcd-druid feature gates. Prepare the Etcd CR Etcd CR can be configured in 2 ways. Either to take backups to the store or disable them. Follow the appropriate section below based on the requirement.\nThe Etcd CR can be found at this location $PWD/config/samples/druid_v1alpha1_etcd.yaml\n Without Backups enabled\nTo set up etcd-druid without backups enabled, make sure the spec.backup.store section of the Etcd CR is commented out.\n With Backups enabled (On Cloud Provider Object Stores)\n Prepare the secret\nCreate a secret for cloud provider access. 
Find the secret yaml templates for different cloud providers here.\nReplace the dummy values with the actual configurations and make sure to add a name and a namespace to the secret as intended.\n Note 1: The secret should be applied in the same namespace as druid.\nNote 2: All the values in the data field of secret yaml should be in base64 encoded format.\n Apply the secret\nkubectl apply -f path/to/secret Adapt Etcd resource\nUncomment the spec.backup.store section of the druid yaml and set the keys to allow backuprestore to take backups by connecting to an object store.\n# Configuration for storage provider store: secretRef: name: etcd-backup-secret-name container: object-storage-container-name provider: aws # options: aws,azure,gcp,openstack,alicloud,dell,openshift,local prefix: etcd-test Brief explanation of keys:\n secretRef.name is the name of the secret that was applied as mentioned above store.container is the object storage bucket name store.provider is the bucket provider. Pick from the options mentioned in comment store.prefix is the folder name that you want to use for your snapshots inside the bucket. 
Applying the Etcd CR Note: With backups enabled, make sure the bucket is created in the corresponding cloud provider before applying the Etcd yaml\n Create the Etcd CR (Custom Resource) by applying the Etcd yaml to the cluster\n# Apply the prepared etcd CR yaml kubectl apply -f config/samples/druid_v1alpha1_etcd.yaml Verify the Etcd cluster To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the cluster size, readiness status of its members, and various other attributes.\nkubectl get etcd -o=wide Verify Etcd Member Pods To check the etcd member pods, do the following and look out for pods starting with the name etcd-\nkubectl get pods Verify Etcd Pods’ Functionality Verify the working conditions of the etcd pods by putting data through an etcd container and accessing the db from the same or another container, depending on whether the etcd cluster is single-node or multi-node.\nIdeally, you can exec into the etcd container using kubectl exec -it \u003cetcd_pod\u003e -c etcd -- bash if it utilizes a base image containing a shell. However, note that the etcd-wrapper Docker image employs a distroless image, which lacks a shell. To interact with etcd, use an Ephemeral container as a debug container. Refer to this documentation for building and using an ephemeral container by attaching it to the etcd container.\n# Put a key-value pair into the etcd etcdctl put key1 value1 # Retrieve all key-value pairs from the etcd db etcdctl get --prefix \"\" For a multi-node etcd cluster, insert the key-value pair from the etcd container of one etcd member and retrieve it from the etcd container of another member to verify consensus among the multiple etcd members.\nView Etcd Database File The Etcd database file is located at var/etcd/data/new.etcd/snap/db inside the backup-restore container. In versions with an alpine base image, you can exec directly into the container. 
However, in recent versions where the backup-restore docker image started using a distroless image, a debug container is required to communicate with it, as mentioned in the previous section.\nUpdating the Etcd CR The Etcd spec can be updated with new changes, such as etcd cluster configuration or backup-restore configuration, and etcd-druid will reconcile these changes as expected, under certain conditions:\n If the --enable-etcd-spec-auto-reconcile flag is set to true, the spec change is automatically picked up and reconciled by etcd-druid. If the --enable-etcd-spec-auto-reconcile flag is unset, or set to false, then etcd-druid will expect an additional annotation gardener.cloud/operation: reconcile on the Etcd resource in order to pick it up for reconciliation. Upon successful reconciliation, this annotation is removed by etcd-druid. The annotation can be added as follows: # Annotate etcd-test CR to reconcile kubectl annotate etcd etcd-test gardener.cloud/operation=\"reconcile\" Cleaning the setup # Delete the cluster make kind-down This cleans up the entire setup as the kind cluster gets deleted. It deletes the created Etcd, all pods that got created along the way and also other resources such as statefulsets, services, PV’s, PVC’s, etc.\n","categories":"","description":"","excerpt":"Etcd-Druid Local Setup This page aims to provide steps on how to setup …","ref":"/docs/other-components/etcd-druid/getting-started-locally/","tags":"","title":"Getting Started Locally"},{"body":"Getting started with etcd-druid using Azurite, and kind This document is a step-by-step guide to run etcd-druid with Azurite, the Azure Blob Storage emulator, within a kind cluster. This setup is ideal for local development and testing.\nPrerequisites Docker with the daemon running, or Docker Desktop running. Azure CLI (\u003e=2.55.0) Environment setup Step 1: Provisioning the kind cluster Execute the command below to provision a kind cluster. 
This command also forwards port 10000 from the kind cluster to your local machine, enabling Azurite access:\nmake kind-up Export the KUBECONFIG file after running the above command.\nStep 2: Deploy Azurite To start up the Azurite emulator in a pod in the kind cluster, run:\nmake deploy-azurite Step 3: Set up an ABS Container To use the Azure CLI with the Azurite emulator running as a pod in the kind cluster, export the connection string for the Azure CLI. export AZURE_STORAGE_CONNECTION_STRING=\"DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;\" Create an Azure Blob Storage Container in Azurite az storage container create -n etcd-bucket Step 4: Deploy etcd-druid make deploy Step 5: Configure the Secret and the Etcd manifests Apply the Kubernetes Secret manifest through: kubectl apply -f config/samples/etcd-secret-azurite.yaml Apply the Etcd manifest through: kubectl apply -f config/samples/druid_v1alpha1_etcd_azurite.yaml Step 6: Make use of the Azurite emulator however you wish etcd-backup-restore will now use Azurite running in kind as the remote store to store snapshots if all the previous steps were followed correctly.\nCleanup make kind-down unset AZURE_STORAGE_CONNECTION_STRING KUBECONFIG ","categories":"","description":"","excerpt":"Getting started with etcd-druid using Azurite, and kind This document …","ref":"/docs/other-components/etcd-druid/getting-started-locally-azurite/","tags":"","title":"Getting Started Locally Azurite"},{"body":"Getting Started with etcd-druid, LocalStack, and Kind This guide provides step-by-step instructions on how to set up etcd-druid with LocalStack and Kind on your local machine. LocalStack emulates AWS services locally, which allows the etcd cluster to interact with AWS S3 without the need for an actual AWS connection. 
This setup is ideal for local development and testing.\nPrerequisites Docker (installed and running) AWS CLI (version \u003e=1.29.0 or \u003e=2.13.0) Environment Setup Step 1: Provision the Kind Cluster Execute the command below to provision a kind cluster. This command also forwards port 4566 from the kind cluster to your local machine, enabling LocalStack access:\nmake kind-up Step 2: Deploy LocalStack Deploy LocalStack onto the Kubernetes cluster using the command below:\nmake deploy-localstack Step 3: Set up an S3 Bucket Set up the AWS CLI to interact with LocalStack by setting the necessary environment variables. This configuration redirects S3 commands to the LocalStack endpoint and provides the required credentials for authentication: export AWS_ENDPOINT_URL_S3=\"http://localhost:4566\" export AWS_ACCESS_KEY_ID=ACCESSKEYAWSUSER export AWS_SECRET_ACCESS_KEY=sEcreTKey export AWS_DEFAULT_REGION=us-east-2 Create an S3 bucket for etcd-druid backup purposes: aws s3api create-bucket --bucket etcd-bucket --region us-east-2 --create-bucket-configuration LocationConstraint=us-east-2 --acl private Step 4: Deploy etcd-druid Deploy etcd-druid onto the Kind cluster using the command below:\nmake deploy Step 5: Configure etcd with LocalStack Store Apply the required Kubernetes manifests to create an etcd custom resource (CR) and a secret for AWS credentials, facilitating LocalStack access:\nexport KUBECONFIG=hack/e2e-test/infrastructure/kind/kubeconfig kubectl apply -f config/samples/druid_v1alpha1_etcd_localstack.yaml -f config/samples/etcd-secret-localstack.yaml Step 6: Reconcile the etcd Initiate etcd reconciliation by annotating the etcd resource with the gardener.cloud/operation=reconcile annotation:\nkubectl annotate etcd etcd-test gardener.cloud/operation=reconcile Congratulations! You have successfully configured etcd-druid, LocalStack, and kind on your local machine. 
Inspect the etcd-druid logs and LocalStack to ensure the setup operates as anticipated.\nTo validate the buckets, execute the following command:\naws s3 ls etcd-bucket/etcd-test/v2/ Cleanup To dismantle the setup, execute the following command:\nmake kind-down unset AWS_ENDPOINT_URL_S3 AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION KUBECONFIG ","categories":"","description":"","excerpt":"Getting Started with etcd-druid, LocalStack, and Kind This guide …","ref":"/docs/other-components/etcd-druid/getting-started-locally-localstack/","tags":"","title":"Getting Started Locally Localstack"},{"body":"Deploying Gardener Locally and Enabling Provider-Extensions This document will walk you through deploying Gardener on your local machine and bootstrapping your own seed clusters on an existing Kubernetes cluster. It is intended for running your local Gardener developments on real infrastructure. For running Gardener entirely locally, please check the getting started locally documentation. If you encounter difficulties, please open an issue so that we can make this process easier.\nOverview Gardener runs in any Kubernetes cluster. In this guide, we will start a KinD cluster which is used as the garden cluster. Any Kubernetes cluster could be used as seed clusters in order to support provider extensions (please refer to the architecture overview). So far, this guide has been tested with Kubernetes clusters provided by Gardener, AWS, Azure, and GCP as seeds.\nBased on Skaffold, the container images for all required components will be built and deployed into the clusters (via their Helm charts).\nPrerequisites Make sure that you have followed the Local Setup guide up until the Get the sources step. Make sure your Docker daemon is up-to-date, up and running and has enough resources (at least 8 CPUs and 8Gi memory; see the Docker documentation for how to configure the resources for Docker for Mac). 
Additionally, please configure at least 120Gi of disk size for the Docker daemon. Tip: You can inspect disk usage with docker system df and clean up unused data with docker system prune -a.\n Make sure that you have access to a Kubernetes cluster you can use as a seed cluster in this setup. The seed cluster requires at least 16 CPUs in total to run one shoot cluster. You could use any Kubernetes cluster for your seed cluster. However, using a Gardener shoot cluster for your seed simplifies some configuration steps. When bootstrapping gardenlet to the cluster, your new seed will have the same provider type as the shoot cluster you use - an AWS shoot will become an AWS seed, a GCP shoot will become a GCP seed, etc. (only relevant when using a Gardener shoot as seed). Provide Infrastructure Credentials and Configuration As this setup is running on a real infrastructure, you have to provide credentials for DNS, the infrastructure, and the kubeconfig for the Kubernetes cluster you want to use as seed.\n There are .gitignore entries for all files and directories which include credentials. Nevertheless, please double check and make sure that credentials are not committed to the version control system.\n DNS Gardener control plane requires DNS for default and internal domains. 
Thus, you have to configure a valid DNS provider for your setup.\nPlease maintain your DNS provider configuration and credentials at ./example/provider-extensions/garden/controlplane/domain-secrets.yaml.\nYou can find a template for the file at ./example/provider-extensions/garden/controlplane/domain-secrets.yaml.tmpl.\nInfrastructure Infrastructure secrets and the corresponding secret bindings should be maintained at:\n ./example/provider-extensions/garden/project/credentials/infrastructure-secrets.yaml ./example/provider-extensions/garden/project/credentials/secretbindings.yaml There are templates with .tmpl suffixes for the files in the same folder.\nProjects The projects and the namespaces associated with them should be maintained at ./example/provider-extensions/garden/project/project.yaml.\nYou can find a template for the file at ./example/provider-extensions/garden/project/project.yaml.tmpl.\nSeed Cluster Preparation The kubeconfig of your Kubernetes cluster you would like to use as seed should be placed at ./example/provider-extensions/seed/kubeconfig. Additionally, please maintain the configuration of your seed in ./example/provider-extensions/gardenlet/values.yaml. It is automatically copied from values.yaml.tmpl in the same directory when you run make gardener-extensions-up for the first time. It also includes explanations of the properties you should set.\nUsing a Gardener Shoot cluster as seed simplifies the process, because some configuration options can be taken from shoot-info and creating DNS entries and TLS certificates is automated.\nHowever, you can use different Kubernetes clusters for your seed too and configure these things manually. Please configure the options of ./example/provider-extensions/gardenlet/values.yaml upfront. 
For configuring DNS and TLS certificates, make gardener-extensions-up, which is explained later, will pause and tell you what to do.\nExternal Controllers You might plan to deploy and register external controllers for networking, operating system, providers, etc. Please put ControllerDeployments and ControllerRegistrations into the ./example/provider-extensions/garden/controllerregistrations directory. The whole content of this folder will be applied to your KinD cluster.\nCloudProfiles There are no demo CloudProfiles yet. Thus, please copy CloudProfiles from another landscape to the ./example/provider-extensions/garden/cloudprofiles directory or create your own CloudProfiles based on the gardener examples. Please check the GitHub repository of your desired provider-extension. Most of them include example CloudProfiles. All files you place in this folder will be applied to your KinD cluster.\nSetting Up the KinD Cluster make kind-extensions-up This command sets up a new KinD cluster named gardener-extensions and stores the kubeconfig in the ./example/gardener-local/kind/extensions/kubeconfig file.\n It might be helpful to copy this file to $HOME/.kube/config, since you will need to target this KinD cluster multiple times. Alternatively, make sure to set your KUBECONFIG environment variable to ./example/gardener-local/kind/extensions/kubeconfig for all future steps via export KUBECONFIG=$PWD/example/gardener-local/kind/extensions/kubeconfig.\n All of the following steps assume that you are using this kubeconfig.\nAdditionally, this command deploys a local container registry to the cluster as well as a few registry mirrors that are set up as a pull-through cache for all upstream registries Gardener uses by default. This is done to speed up image pulls across local clusters. The local registry can be accessed as localhost:5001 for pushing and pulling. The storage directories of the registries are mounted to your machine under dev/local-registry. 
With this, mirrored images don’t have to be pulled again after recreating the cluster.\nThe command also deploys a default calico installation as the cluster’s CNI implementation with NetworkPolicy support (the default kindnet CNI doesn’t provide NetworkPolicy support). Furthermore, it deploys the metrics-server in order to support HPA and VPA on the seed cluster.\nSetting Up Gardener (Garden on KinD, Seed on Gardener Cluster) make gardener-extensions-up This will first prepare the basic configuration of your KinD and Gardener clusters. Afterwards, the images for the Garden cluster are built and deployed into the KinD cluster. Finally, the images for the Seed cluster are built, pushed to a container registry on the Seed, and the gardenlet is started.\nAdding Additional Seeds Additional seed(s) can be added by running\nmake gardener-extensions-up SEED_NAME=\u003cseed-name\u003e The seed cluster preparations are similar to the first seed:\nThe kubeconfig of your Kubernetes cluster you would like to use as seed should be placed at ./example/provider-extensions/seed/kubeconfig-\u003cseed-name\u003e. Additionally, please maintain the configuration of your seed in ./example/provider-extensions/gardenlet/values-\u003cseed-name\u003e.yaml. It is automatically copied from values.yaml.tmpl in the same directory when you run make gardener-extensions-up SEED_NAME=\u003cseed-name\u003e for the first time. It also includes explanations of the properties you should set.\nRemoving a Seed If you have multiple seeds and want to remove one, just use\nmake gardener-extensions-down SEED_NAME=\u003cseed-name\u003e If it is not the last seed, this command will only remove the seed, but leave the local Gardener cluster and the other seeds untouched. 
To remove all seeds and to clean up the local Gardener cluster, you have to run the command for each seed.\nRotate credentials of container image registry in a Seed There is a container image registry in each Seed cluster where Gardener images required for the Seed and the Shoot nodes are pushed to. This registry is password protected. The password is generated when the Seed is deployed via make gardener-extensions-up. Afterward, it is not rotated automatically. Otherwise, this could break the update of gardener-node-agent, because it might not be able to pull its own new image anymore. This is not a general issue of gardener-node-agent, but a limitation of the provider-extensions setup. Gardener does not support protected container images out of the box. The function was added for this scenario only.\nHowever, if you want to rotate the credentials for any reason, there are two options for it.\n run make gardener-extensions-up (to ensure that your images are up-to-date) reconcile all shoots on the seed where you want to rotate the registry password run kubectl delete secrets -n registry registry-password on your seed cluster run make gardener-extensions-up reconcile the shoots again or\n reconcile all shoots on the seed where you want to rotate the registry password run kubectl delete secrets -n registry registry-password on your seed cluster run ./example/provider-extensions/registry-seed/deploy-registry.sh \u003cpath to seed kubeconfig\u003e \u003cseed registry hostname\u003e reconcile the shoots again Pause and Unpause the KinD Cluster The KinD cluster can be paused by stopping and keeping its docker container. 
This can be done by running:\nmake kind-extensions-down When you run make kind-extensions-up again, you will start the docker container with your previous Gardener configuration again.\nThis provides the option to switch off your local KinD cluster fast without leaving orphaned infrastructure elements behind.\nCreating a Shoot Cluster You can wait for the Seed to be ready by running:\nkubectl wait --for=condition=gardenletready seed provider-extensions --timeout=5m make kind-extensions-up already includes such a check. However, it might be useful when you wake up your Seed from hibernation or unpause your KinD cluster.\nAlternatively, you can run kubectl get seed provider-extensions and wait for the STATUS to indicate readiness:\nNAME STATUS PROVIDER REGION AGE VERSION K8S VERSION provider-extensions Ready gcp europe-west1 111m v1.61.0-dev v1.24.7 In order to create a first shoot cluster, please create your own Shoot definition and apply it to your KinD cluster. gardener-scheduler includes candidateDeterminationStrategy: MinimalDistance configuration so you are able to schedule Shoots of different providers on your Seed.\nYou can wait for your Shoots to be ready by running kubectl -n garden-local get shoots and waiting for the LAST OPERATION to reach 100%. The output depends on your Shoot definition. 
This is an example output:\nNAME CLOUDPROFILE PROVIDER REGION K8S VERSION HIBERNATION LAST OPERATION STATUS AGE aws aws aws eu-west-1 1.24.3 Awake Create Processing (43%) healthy 84s aws-arm64 aws aws eu-west-1 1.24.3 Awake Create Processing (43%) healthy 65s azure az azure westeurope 1.24.2 Awake Create Processing (43%) healthy 57s gcp gcp gcp europe-west1 1.24.3 Awake Create Processing (43%) healthy 94s Accessing the Shoot Cluster Your shoot clusters will have public DNS entries for their API servers, so that they can be reached over the Internet with kubectl after you have created their kubeconfig.\nWe encourage you to use the adminkubeconfig subresource for accessing your shoot cluster. You can find an example of how to use it in Accessing Shoot Clusters.\nDeleting the Shoot Clusters Before tearing down your environment, you have to delete your shoot clusters. This is highly recommended because otherwise you would leave orphaned items on your infrastructure accounts.\n./hack/usage/delete shoot \u003cyour-shoot\u003e garden-local Tear Down the Gardener Environment Before you delete your local KinD cluster, you should shut down your Shoots and Seed in a clean way to avoid orphaned infrastructure elements in your projects.\nPlease ensure that your KinD and Seed clusters are online (not paused or hibernated) and run:\nmake gardener-extensions-down This will delete all Shoots first (this could take a couple of minutes), then uninstall gardenlet from the Seed and the gardener components from the KinD. 
Finally, the additional components like container registry, etc., are deleted from both clusters.\nWhen this is done, you can safely delete your local KinD cluster by running:\nmake kind-extensions-clean ","categories":"","description":"","excerpt":"Deploying Gardener Locally and Enabling Provider-Extensions This …","ref":"/docs/gardener/deployment/getting_started_locally_with_extensions/","tags":"","title":"Getting Started Locally With Extensions"},{"body":"Disclaimer Be aware that the following sections might be opinionated. Kubernetes, and the GPU support in particular, are rapidly evolving, which means that this guide is likely to be outdated sometime soon. For this reason, contributions are highly appreciated to update this guide.\nCreate a Cluster First things first, let’s create a Kubernetes (K8s) cluster with GPU accelerated nodes. In this example we will use an AWS p2.xlarge EC2 instance because it’s the cheapest available option at the moment. Use such cheap instances for learning to limit your resource costs. 
This costs around 1€/hour per GPU\nInstall NVidia Driver as Daemonset apiVersion: apps/v1 kind: DaemonSet metadata: name: nvidia-driver-installer namespace: kube-system labels: k8s-app: nvidia-driver-installer spec: selector: matchLabels: name: nvidia-driver-installer k8s-app: nvidia-driver-installer template: metadata: labels: name: nvidia-driver-installer k8s-app: nvidia-driver-installer spec: hostPID: true initContainers: - image: squat/modulus:4a1799e7aa0143bcbb70d354bab3e419b1f54972 name: modulus args: - compile - nvidia - \"410.104\" securityContext: privileged: true env: - name: MODULUS_CHROOT value: \"true\" - name: MODULUS_INSTALL value: \"true\" - name: MODULUS_INSTALL_DIR value: /opt/drivers - name: MODULUS_CACHE_DIR value: /opt/modulus/cache - name: MODULUS_LD_ROOT value: /root - name: IGNORE_MISSING_MODULE_SYMVERS value: \"1\" volumeMounts: - name: etc-coreos mountPath: /etc/coreos readOnly: true - name: usr-share-coreos mountPath: /usr/share/coreos readOnly: true - name: ld-root mountPath: /root - name: module-cache mountPath: /opt/modulus/cache - name: module-install-dir-base mountPath: /opt/drivers - name: dev mountPath: /dev containers: - image: \"gcr.io/google-containers/pause:3.1\" name: pause tolerations: - key: \"nvidia.com/gpu\" effect: \"NoSchedule\" operator: \"Exists\" volumes: - name: etc-coreos hostPath: path: /etc/coreos - name: usr-share-coreos hostPath: path: /usr/share/coreos - name: ld-root hostPath: path: / - name: module-cache hostPath: path: /opt/modulus/cache - name: dev hostPath: path: /dev - name: module-install-dir-base hostPath: path: /opt/drivers Install Device Plugin apiVersion: apps/v1 kind: DaemonSet metadata: name: nvidia-gpu-device-plugin namespace: kube-system labels: k8s-app: nvidia-gpu-device-plugin #addonmanager.kubernetes.io/mode: Reconcile spec: selector: matchLabels: k8s-app: nvidia-gpu-device-plugin template: metadata: labels: k8s-app: nvidia-gpu-device-plugin annotations: 
scheduler.alpha.kubernetes.io/critical-pod: '' spec: priorityClassName: system-node-critical volumes: - name: device-plugin hostPath: path: /var/lib/kubelet/device-plugins - name: dev hostPath: path: /dev containers: - image: \"k8s.gcr.io/nvidia-gpu-device-plugin@sha256:08509a36233c5096bb273a492251a9a5ca28558ab36d74007ca2a9d3f0b61e1d\" command: [\"/usr/bin/nvidia-gpu-device-plugin\", \"-logtostderr\", \"-host-path=/opt/drivers/nvidia\"] name: nvidia-gpu-device-plugin resources: requests: cpu: 50m memory: 10Mi limits: cpu: 50m memory: 10Mi securityContext: privileged: true volumeMounts: - name: device-plugin mountPath: /device-plugin - name: dev mountPath: /dev updateStrategy: type: RollingUpdate Test To run an example training on a GPU node, first start a base image with Tensorflow with GPU support \u0026 Keras:\napiVersion: apps/v1 kind: Deployment metadata: name: deeplearning-workbench namespace: default spec: replicas: 1 selector: matchLabels: app: deeplearning-workbench template: metadata: labels: app: deeplearning-workbench spec: containers: - name: deeplearning-workbench image: afritzler/deeplearning-workbench resources: limits: nvidia.com/gpu: 1 tolerations: - key: \"nvidia.com/gpu\" effect: \"NoSchedule\" operator: \"Exists\" Note the tolerations section above is not required if you deploy the ExtendedResourceToleration admission controller to your cluster. You can do this in the kubernetes section of your Gardener cluster shoot.yaml as follows:\n kubernetes: kubeAPIServer: admissionPlugins: - name: ExtendedResourceToleration Now exec into the container and start an example Keras training:\nkubectl exec -it deeplearning-workbench-8676458f5d-p4d2v -- /bin/bash cd /keras/example python imdb_cnn.py Related Links Andreas Fritzler from the Gardener Core team for the R\u0026D, who has provided this setup. 
Build and install NVIDIA driver on CoreOS Nvidia Device Plugin ","categories":"","description":"Setting up a GPU Enabled Cluster for Deep Learning","excerpt":"Setting up a GPU Enabled Cluster for Deep Learning","ref":"/docs/guides/administer-shoots/gpu/","tags":"","title":"GPU Enabled Cluster"},{"body":"GRPC based implementation of Cloud Providers - WIP Goal: Currently the Cloud Providers’ (CP) functionalities ( Create(), Delete(), List() ) are part of the Machine Controller Manager’s (MCM) repository. Because of this, adding support for new CPs into MCM requires merging code into MCM which may not be required for core functionalities of MCM itself. Also, for various reasons it may not be feasible for all CPs to merge their code with MCM which is an Open Source project.\nBecause of these reasons, it was decided that the CPs’ code will be moved out into separate repositories so that it can be maintained separately by the respective teams. The idea is to make MCM act as a GRPC server, and CPs as GRPC clients. The CPs can register themselves with the MCM using a GRPC service exposed by the MCM. Details of this approach are discussed below.\nHow it works: MCM acts as a GRPC server and listens on a pre-defined port 5000. It implements the GRPC services below. Details of each of these services are given in the next section.\n Register() GetMachineClass() GetSecret() GRPC services exposed by MCM: Register() rpc Register(stream DriverSide) returns (stream MCMside) {}\nThe CP GRPC client calls this service to register itself with the MCM. The CP passes the kind and the APIVersion which it implements, and MCM maintains an internal map for all the registered clients. A GRPC stream is returned in response which is kept open throughout the life of both the processes. MCM uses this stream to communicate with the client for machine operations: Create(), Delete() or List(). 
The CP client is responsible for reading the incoming messages continuously, and based on the operationType parameter embedded in the message, it is supposed to take the required action. This part is already handled in the package grpc/infraclient. To add a new CP client, import the package, and implement the ExternalDriverProvider interface:\ntype ExternalDriverProvider interface { Create(machineclass *MachineClassMeta, credentials, machineID, machineName string) (string, string, error) Delete(machineclass *MachineClassMeta, credentials, machineID string) error List(machineclass *MachineClassMeta, credentials, machineID string) (map[string]string, error) } GetMachineClass() rpc GetMachineClass(MachineClassMeta) returns (MachineClass) {}\nAs part of the message from MCM for various machine operations, the name of the machine class is sent instead of the full machine class spec. The CP client is expected to use this GRPC service to get the full spec of the machine class. This optionally enables the client to cache the machine class spec, and make the call only if the machine class spec is not already cached.\nGetSecret() rpc GetSecret(SecretMeta) returns (Secret) {}\nAs part of the message from MCM for various machine operations, the Cloud Config (CC) and CP credentials are not sent. The CP client is expected to use this GRPC service to get the secret which has CC and CP’s credentials from MCM. 
This enables the client to cache the CC and credentials, and to make the call only if the data is not already cached.\nHow to add a new Cloud Provider’s support Import the package grpc/infraclient and grpc/infrapb from MCM (currently in MCM’s “grpc-driver” branch)\n Implement the interface ExternalDriverProvider Create(): Creates a new machine Delete(): Deletes a machine List(): Lists machines Use the interface MachineClassDataProvider GetMachineClass(): Makes the call to MCM to get machine class spec GetSecret(): Makes the call to MCM to get secret containing Cloud Config and CP’s credentials Example implementation: Refer GRPC based implementation for AWS client: https://github.com/ggaurav10/aws-driver-grpc\n","categories":"","description":"","excerpt":"GRPC based implementation of Cloud Providers - WIP Goal: Currently the …","ref":"/docs/other-components/machine-controller-manager/proposals/external_providers_grpc/","tags":"","title":"GRPC Based Implementation of Cloud Providers"},{"body":"Gardener Extension for the gVisor Container Runtime Sandbox \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file.\nStatic code checks and tests can be executed by running make verify. 
We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-10 (Additional Container Runtimes) Extensibility API documentation ","categories":"","description":"Gardener extension controller for the gVisor container runtime sandbox","excerpt":"Gardener extension controller for the gVisor container runtime sandbox","ref":"/docs/extensions/container-runtime-extensions/gardener-extension-runtime-gvisor/","tags":"","title":"GVisor container runtime"},{"body":"Health Check Library Goal Typically, an extension reconciles a specific resource (Custom Resource Definitions (CRDs)) and creates / modifies resources in the cluster (via helm, managed resources, kubectl, …). We call these API Objects ‘dependent objects’ - as they are bound to the lifecycle of the extension.\nThe goal of this library is to enable extensions to setup health checks for their ‘dependent objects’ with minimal effort.\nUsage The library provides a generic controller with the ability to register any resource that satisfies the extension object interface. 
An example is the Worker CRD.\nHealth check functions for commonly used dependent objects can be reused and registered with the controller, such as:\n Deployment DaemonSet StatefulSet ManagedResource (Gardener specific) See the below example taken from the provider-aws.\nhealth.DefaultRegisterExtensionForHealthCheck( aws.Type, extensionsv1alpha1.SchemeGroupVersion.WithKind(extensionsv1alpha1.WorkerResource), func() runtime.Object { return \u0026extensionsv1alpha1.Worker{} }, mgr, // controller runtime manager opts, // options for the health check controller nil, // custom predicates map[extensionshealthcheckcontroller.HealthCheck]string{ general.CheckManagedResource(genericactuator.McmShootResourceName): string(gardencorev1beta1.ShootSystemComponentsHealthy), general.CheckSeedDeployment(aws.MachineControllerManagerName): string(gardencorev1beta1.ShootEveryNodeReady), worker.SufficientNodesAvailable(): string(gardencorev1beta1.ShootEveryNodeReady), }) This creates a health check controller that reconciles the extensions.gardener.cloud/v1alpha1.Worker resource with the spec.type ‘aws’. Three health check functions are registered that are executed during reconciliation. Each health check is mapped to a single HealthConditionType that results in conditions with the same condition.type (see below). To contribute to the Shoot’s health, the following conditions can be used: SystemComponentsHealthy, EveryNodeReady, ControlPlaneHealthy, ObservabilityComponentsHealthy. In case of workerless Shoot the EveryNodeReady condition is not present, so it can’t be used.\nThe Gardener/Gardenlet checks each extension for conditions matching these types. However, extensions are free to choose any HealthConditionType. For more information, see Contributing to Shoot Health Status Conditions.\nA health check has to satisfy the below interface. 
You can find implementation examples in the healthcheck folder.\ntype HealthCheck interface { // Check is the function that executes the actual health check Check(context.Context, types.NamespacedName) (*SingleCheckResult, error) // InjectSeedClient injects the seed client InjectSeedClient(client.Client) // InjectShootClient injects the shoot client InjectShootClient(client.Client) // SetLoggerSuffix injects the logger SetLoggerSuffix(string, string) // DeepCopy clones the healthCheck DeepCopy() HealthCheck } The health check controller regularly (default: 30s) reconciles the extension resource and executes the registered health checks for the dependent objects. As a result, the controller writes condition(s) to the status of the extension containing the health check result. In our example, two checks are mapped to ShootEveryNodeReady and one to ShootSystemComponentsHealthy, leading to conditions with two distinct HealthConditionTypes (condition.type):\nstatus: conditions: - lastTransitionTime: \"20XX-10-28T08:17:21Z\" lastUpdateTime: \"20XX-11-28T08:17:21Z\" message: (1/1) Health checks successful reason: HealthCheckSuccessful status: \"True\" type: SystemComponentsHealthy - lastTransitionTime: \"20XX-10-28T08:17:21Z\" lastUpdateTime: \"20XX-11-28T08:17:21Z\" message: (2/2) Health checks successful reason: HealthCheckSuccessful status: \"True\" type: EveryNodeReady Please note that there are four statuses: True, False, Unknown, and Progressing.\n True should be used for successful health checks. False should be used for unsuccessful/failing health checks. Unknown should be used when there was an error trying to determine the health status. Progressing should be used to indicate that the health status did not succeed but for expected reasons (e.g., a cluster scale up/down could make the standard health check fail because something is wrong with the Machines, however, it’s actually an expected situation and known to be completed within a few minutes.) 
Health checks that report Progressing should also provide a timeout, after which this “progressing situation” is expected to be completed. The health check library will automatically transition the status to False if the timeout was exceeded.\nAdditional Considerations It is up to the extension to decide how to conduct health checks, though it is recommended to make use of the built-in health check functionality of managed-resources for trivial checks. By deploying the dependent resources via managed resources, the gardener resource manager conducts basic checks for different API objects out-of-the-box (e.g., Deployments, DaemonSets, …) - and writes health conditions.\nBy default, Gardener performs health checks for all the ManagedResources created in the shoot namespaces. Their status will be aggregated to the Shoot conditions according to the following rules:\n Health checks of ManagedResource with .spec.class=nil are aggregated to the SystemComponentsHealthy condition Health checks of ManagedResource with .spec.class!=nil are aggregated to the ControlPlaneHealthy condition unless the ManagedResource is labeled with care.gardener.cloud/condition-type=\u003cother-condition-type\u003e. In such a case, it is aggregated to the \u003cother-condition-type\u003e. More sophisticated health checks should be implemented by the extension controller itself (implementing the HealthCheck interface).\n","categories":"","description":"","excerpt":"Health Check Library Goal Typically, an extension reconciles a …","ref":"/docs/gardener/extensions/healthcheck-library/","tags":"","title":"Healthcheck Library"},{"body":"Heartbeat Controller The heartbeat controller renews a dedicated Lease object named gardener-extension-heartbeat at regular 30 second intervals by default. 
This Lease is used for heartbeats similar to how gardenlet uses Lease objects for seed heartbeats (see gardenlet heartbeats).\nThe gardener-extension-heartbeat Lease can be checked by other controllers to verify that the corresponding extension controller is still running. Currently, gardenlet checks this Lease when performing shoot health checks and expects to find the Lease inside the namespace where the extension controller is deployed by the corresponding ControllerInstallation. For each extension resource deployed in the Shoot control plane, gardenlet finds the corresponding gardener-extension-heartbeat Lease resource and checks whether the Lease’s .spec.renewTime is older than the allowed threshold for stale extension health checks - in this case, gardenlet considers the health check report for an extension resource as “outdated” and reflects this in the Shoot status.\n","categories":"","description":"","excerpt":"Heartbeat Controller The heartbeat controller renews a dedicated Lease …","ref":"/docs/gardener/extensions/heartbeat/","tags":"","title":"Heartbeat"},{"body":"High Availability of Deployed Components gardenlets and extension controllers are deploying components via Deployments, StatefulSets, etc., as part of the shoot control plane, or the seed or shoot system components.\nSome of the above component deployments must be further tuned to improve fault tolerance / resilience of the service. This document outlines what needs to be done to achieve this goal.\nPlease be forwarded to the Convenient Application Of These Rules section, if you want to take a shortcut to the list of actions that require developers’ attention.\nSeed Clusters The worker nodes of seed clusters can be deployed to one or multiple availability zones. 
The Seed specification allows you to provide the information which zones are available:\nspec: provider: region: europe-1 zones: - europe-1a - europe-1b - europe-1c Independent of the number of zones, seed system components like the gardenlet or the extension controllers themselves, or others like etcd-druid, dependency-watchdog, etc., should always be running with multiple replicas.\nConcretely, all seed system components should respect the following conventions:\n Replica Counts\n Component Type \u003c 3 Zones \u003e= 3 Zones Comment Observability (Monitoring, Logging) 1 1 Downtimes accepted due to cost reasons Controllers 2 2 / (Webhook) Servers 2 2 / Apart from the above, there might be special cases where these rules do not apply, for example:\n istio-ingressgateway is scaled horizontally, hence the above numbers are the minimum values. nginx-ingress-controller in the seed cluster is used to advertise all shoot observability endpoints, so due to performance reasons it runs with 2 replicas at all times. In the future, this component might disappear in favor of the istio-ingressgateway anyways. Topology Spread Constraints\nWhen the component has \u003e= 2 replicas …\n … then it should also have a topologySpreadConstraint, ensuring the replicas are spread over the nodes:\nspec: topologySpreadConstraints: - topologyKey: kubernetes.io/hostname minDomains: 3 # lower value of max replicas or 3 maxSkew: 1 whenUnsatisfiable: ScheduleAnyway matchLabels: ... minDomains is set when failure tolerance is configured or annotation high-availability-config.resources.gardener.cloud/host-spread=\"true\" is given.\n … and the seed cluster has \u003e= 2 zones, then the component should also have a second topologySpreadConstraint, ensuring the replicas are spread over the zones:\nspec: topologySpreadConstraints: - topologyKey: topology.kubernetes.io/zone minDomains: 2 # lower value of max replicas or number of zones maxSkew: 1 whenUnsatisfiable: DoNotSchedule matchLabels: ... 
According to these conventions, even seed clusters with only one availability zone try to be highly available “as good as possible” by spreading the replicas across multiple nodes. Hence, while such seed clusters obviously cannot handle zone outages, they can at least handle node failures.\n Shoot Clusters The Shoot specification allows configuring “high availability” as well as the failure tolerance type for the control plane components, see Highly Available Shoot Control Plane for details.\nRegarding the seed cluster selection, the only constraint is that shoot clusters with failure tolerance type zone are only allowed to run on seed clusters with at least three zones. All other shoot clusters (non-HA or those with failure tolerance type node) can run on seed clusters with any number of zones.\nControl Plane Components All control plane components should respect the following conventions:\n Replica Counts\n Component Type w/o HA w/ HA (node) w/ HA (zone) Comment Observability (Monitoring, Logging) 1 1 1 Downtimes accepted due to cost reasons Controllers 1 2 2 / (Webhook) Servers 2 2 2 / Apart from the above, there might be special cases where these rules do not apply, for example:\n etcd is a server, though the most critical component of a cluster requiring a quorum to survive failures. Hence, it should have 3 replicas even when the failure tolerance is node only. kube-apiserver is scaled horizontally, hence the above numbers are the minimum values (even when the shoot cluster is not HA, there might be multiple replicas). Topology Spread Constraints\nWhen the component has \u003e= 2 replicas …\n … then it should also have a topologySpreadConstraint ensuring the replicas are spread over the nodes:\nspec: topologySpreadConstraints: - maxSkew: 1 topologyKey: kubernetes.io/hostname whenUnsatisfiable: ScheduleAnyway matchLabels: ... 
Hence, the node spread is done on best-effort basis only.\nHowever, if the shoot cluster has defined a failure tolerance type, the whenUnsatisfiable field should be set to DoNotSchedule.\n … and the failure tolerance type of the shoot cluster is zone, then the component should also have a second topologySpreadConstraint ensuring the replicas are spread over the zones:\nspec: topologySpreadConstraints: - maxSkew: 1 minDomains: 2 # lower value of max replicas or number of zones topologyKey: topology.kubernetes.io/zone whenUnsatisfiable: DoNotSchedule matchLabels: ... Node Affinity\nThe gardenlet annotates the shoot namespace in the seed cluster with the high-availability-config.resources.gardener.cloud/zones annotation.\n If the shoot cluster is non-HA or has failure tolerance type node, then the value will be always exactly one zone (e.g., high-availability-config.resources.gardener.cloud/zones=europe-1b). If the shoot cluster has failure tolerance type zone, then the value will always contain exactly three zones (e.g., high-availability-config.resources.gardener.cloud/zones=europe-1a,europe-1b,europe-1c). For backwards-compatibility, this annotation might contain multiple zones for shoot clusters created before gardener/gardener@v1.60 and not having failure tolerance type zone. This is because their volumes might already exist in multiple zones, hence pinning them to only one zone would not work.\nHence, in case this annotation is present, the components should have the following node affinity:\nspec: affinity: nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution: nodeSelectorTerms: - matchExpressions: - key: topology.kubernetes.io/zone operator: In values: - europe-1a # - ... 
This is to ensure all pods are running in the same (set of) availability zone(s) such that cross-zone network traffic is avoided as much as possible (such traffic is typically charged by the underlying infrastructure provider).\n System Components The availability of system components is independent of the control plane since they run on the shoot worker nodes while the control plane components run on the seed worker nodes (for more information, see the Kubernetes architecture overview). Hence, it only depends on the number of availability zones configured in the shoot worker pools via .spec.provider.workers[].zones. Concretely, the highest number of zones of a worker pool with systemComponents.allow=true is considered.\nAll system components should respect the following conventions:\n Replica Counts\n Component Type 1 or 2 Zones \u003e= 3 Zones Controllers 2 2 (Webhook) Servers 2 2 Apart from the above, there might be special cases where these rules do not apply, for example:\n coredns is scaled horizontally (today), hence the above numbers are the minimum values (possibly, scaling these components vertically may be more appropriate, but that’s unrelated to the HA subject matter). Optional addons like nginx-ingress or kubernetes-dashboard are only provided on best-effort basis for evaluation purposes, hence they run with 1 replica at all times. Topology Spread Constraints\nWhen the component has \u003e= 2 replicas …\n … then it should also have a topologySpreadConstraint ensuring the replicas are spread over the nodes:\nspec: topologySpreadConstraints: - maxSkew: 1 topologyKey: kubernetes.io/hostname whenUnsatisfiable: ScheduleAnyway matchLabels: ... 
Hence, the node spread is done on best-effort basis only.\n … and the cluster has \u003e= 2 zones, then the component should also have a second topologySpreadConstraint ensuring the replicas are spread over the zones:\nspec: topologySpreadConstraints: - maxSkew: 1 minDomains: 2 # lower value of max replicas or number of zones topologyKey: topology.kubernetes.io/zone whenUnsatisfiable: DoNotSchedule matchLabels: ... Convenient Application of These Rules According to above scenarios and conventions, the replicas, topologySpreadConstraints or affinity settings of the deployed components might need to be adapted.\nIn order to apply those conveniently and easily for developers, Gardener installs a mutating webhook into both seed and shoot clusters which reacts on Deployments and StatefulSets deployed to namespaces with the high-availability-config.resources.gardener.cloud/consider=true label set.\nThe following actions have to be taken by developers:\n Check if components are prepared to run concurrently with multiple replicas, e.g. controllers usually use leader election to achieve this.\n All components should be generally equipped with PodDisruptionBudgets with .spec.maxUnavailable=1 and unhealthyPodEvictionPolicy=AlwaysAllow:\n spec: maxUnavailable: 1 unhealthyPodEvictionPolicy: AlwaysAllow selector: matchLabels: ... 
Add the label high-availability-config.resources.gardener.cloud/type to deployments or statefulsets, as well as optionally involved horizontalpodautoscalers or HVPAs where the following two values are possible: controller server Type server is also preferred if a component is a controller and (webhook) server at the same time.\nYou can read more about the webhook’s internals in High Availability Config.\ngardenlet Internals Make sure you have read the above document about the webhook internals before continuing reading this section.\nSeed Controller The gardenlet performs the following changes on all namespaces running seed system components:\n adds the label high-availability-config.resources.gardener.cloud/consider=true. adds the annotation high-availability-config.resources.gardener.cloud/zones=\u003czones\u003e, where \u003czones\u003e is the list provided in .spec.provider.zones[] in the Seed specification. Note that neither the high-availability-config.resources.gardener.cloud/failure-tolerance-type, nor the high-availability-config.resources.gardener.cloud/zone-pinning annotations are set, hence the node affinity would never be touched by the webhook.\nThe only exception to this rule are the istio ingress gateway namespaces. This includes the default istio ingress gateway when SNI is enabled, as well as analogous namespaces for exposure classes and zone-specific istio ingress gateways. Those namespaces will additionally be annotated with high-availability-config.resources.gardener.cloud/zone-pinning set to true, resulting in the node affinities and the topology spread constraints being set. The replicas are not touched, as the istio ingress gateways are scaled by a horizontal autoscaler instance.\nShoot Controller Control Plane The gardenlet performs the following changes on the namespace running the shoot control plane components:\n adds the label high-availability-config.resources.gardener.cloud/consider=true. 
This makes the webhook mutate the replica count and the topology spread constraints. adds the annotation high-availability-config.resources.gardener.cloud/failure-tolerance-type with value equal to .spec.controlPlane.highAvailability.failureTolerance.type (or \"\", if .spec.controlPlane.highAvailability=nil). This makes the webhook mutate the node affinity according to the specified zone(s). adds the annotation high-availability-config.resources.gardener.cloud/zones=\u003czones\u003e, where \u003czones\u003e is a … … random zone chosen from the .spec.provider.zones[] list in the Seed specification (always only one zone (even if there are multiple available in the seed cluster)) in case the Shoot has no HA setting (i.e., spec.controlPlane.highAvailability=nil) or when the Shoot has HA setting with failure tolerance type node. … list of three randomly chosen zones from the .spec.provider.zones[] list in the Seed specification in case the Shoot has HA setting with failure tolerance type zone. System Components The gardenlet performs the following changes on all namespaces running shoot system components:\n adds the label high-availability-config.resources.gardener.cloud/consider=true. This makes the webhook mutate the replica count and the topology spread constraints. adds the annotation high-availability-config.resources.gardener.cloud/zones=\u003czones\u003e where \u003czones\u003e is the merged list of zones provided in .zones[] with systemComponents.allow=true for all worker pools in .spec.provider.workers[] in the Shoot specification. 
Note that neither the high-availability-config.resources.gardener.cloud/failure-tolerance-type, nor the high-availability-config.resources.gardener.cloud/zone-pinning annotations are set, hence the node affinity would never be touched by the webhook.\n","categories":"","description":"","excerpt":"High Availability of Deployed Components gardenlets and extension …","ref":"/docs/gardener/high-availability/","tags":"","title":"High Availability"},{"body":"Hot-Update VirtualMachine tags without triggering a rolling-update Hot-Update VirtualMachine tags without triggering a rolling-update Motivation Boundary Condition What is available today? What are the problems with the current approach? MachineClass Update and its impact Proposal Shoot YAML changes Provider specific WorkerConfig API changes Gardener provider extension changes Driver interface changes Machine Class reconciliation Reconciliation Changes Motivation MCM Issue#750 There is a requirement to provide a way for consumers to add tags which can be hot-updated onto VMs. This requirement can be generalized to also offer a convenient way to specify tags which can be applied to VMs, NICs, Devices etc.\n MCM Issue#635 which in turn points to MCM-Provider-AWS Issue#36 - The issue hints at other fields like enable/disable source/destination checks for NAT instances which needs to be hot-updated on network interfaces.\n In GCP provider - instance.ServiceAccounts can be updated without the need to roll-over the instance. See\n Boundary Condition All tags that are added via means other than MachineClass.ProviderSpec should be preserved as-is. Only updates done to tags in MachineClass.ProviderSpec should be applied to the infra resources (VM/NIC/Disk).\nWhat is available today? WorkerPool configuration inside shootYaml provides a way to set labels. As per the definition these labels will be applied on Node resources. Currently these labels are also passed to the VMs as tags. 
There is no distinction made between Node labels and VM tags.\nMachineClass has a field which holds provider specific configuration and one such configuration is tags. Gardener provider extensions update the tags in MachineClass.\n AWS provider extension directly passes the labels to the tag section of machineClass. Azure provider extension sanitizes the worker pool labels and adds them as tags in MachineClass. GCP provider extension sanitizes them, and then sets them as labels in the MachineClass. In GCP tags only have keys and are currently hard coded. Let us look at an example of MachineClass.ProviderSpec in AWS:\nproviderSpec: ami: ami-02fe00c0afb75bbd3 tags: #[section-1] pool labels added by gardener extension ######################################################### kubernetes.io/arch: amd64 networking.gardener.cloud/node-local-dns-enabled: \"true\" node.kubernetes.io/role: node worker.garden.sapcloud.io/group: worker-ser234 worker.gardener.cloud/cri-name: containerd worker.gardener.cloud/pool: worker-ser234 worker.gardener.cloud/system-components: \"true\" #[section-2] Tags defined in the gardener-extension-provider-aws ########################################################### kubernetes.io/cluster/cluster-full-name: \"1\" kubernetes.io/role/node: \"1\" #[section-3] ########################################################### user-defined-key1: user-defined-val1 user-defined-key2: user-defined-val2 Refer src for tags defined in section-1. Refer src for tags defined in section-2. 
Tags in section-3 are defined by the user.\n Out of the above three tag categories, MCM depends on section-2 tags (mandatory-tags) for its orphan collection and for the Driver’s DeleteMachine and GetMachineStatus to work.\nProviderSpec.Tags are transported to the provider specific resources as follows:\n Provider Resources Tags are set on Code Reference Comment AWS Instance(VM), Volume, Network-Interface aws-VM-Vol-NIC No distinction is made between tags set on VM, NIC or Volume Azure Instance(VM), Network-Interface azure-VM-parameters \u0026 azureNIC-Parameters GCP Instance(VM), 1 tag: name (denoting the name of the worker) is added to Disk gcp-VM \u0026 gcp-Disk In GCP key-value pairs are called labels while network tags have only keys AliCloud Instance(VM) aliCloud-VM What are the problems with the current approach? There are a few shortcomings in the way tags/labels are handled:\n Tags can only be set at the time a machine is created. There is no distinction made amongst tags/labels that are added to VMs, disks or network interfaces. As stated above, for AWS the same set of tags is added to all. There is a limit defined on the number of tags/labels that can be associated to the devices (disks, VMs, NICs etc). Example: In AWS a max of 50 user created tags are allowed. Similar restrictions are applied on different resources across providers. Therefore adding all tags to all devices, even if a subset of the tags is not meant for that resource, exhausts the total allowed tags/labels for that resource. The only placeholder in shoot yaml as mentioned above is meant to only hold labels that should be applied primarily on the Node objects. So while you could use the node labels for extended resources, using it also for tags is not clean. There is no provision in the shoot YAML today to add tags only to a subset of resources. MachineClass Update and its impact When Worker.ProviderConfig is changed then a worker-hash is computed which includes the raw ProviderConfig. 
This hash value is then used as a suffix when constructing the name for a MachineClass. See aws-extension-provider as an example. A change in the name of the MachineClass will then in-turn trigger a rolling update of machines. Since tags are provider specific and therefore will be part of ProviderConfig, any update to them will result in a rolling-update of machines.\nProposal Shoot YAML changes Provider specific configuration is set via providerConfig section for each worker pool.\nExample worker provider config (current):\nproviderConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: iops: 10000 dataVolumes: - name: kubelet-dir snapshotID: snap-13234 iamInstanceProfile: # (specify either ARN or name) name: my-profile arn: my-instance-profile-arn It is proposed that an additional field be added for tags under providerConfig. Proposed changed YAML:\nproviderConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: iops: 10000 dataVolumes: - name: kubelet-dir snapshotID: snap-13234 iamInstanceProfile: # (specify either ARN or name) name: my-profile arn: my-instance-profile-arn tags: vm: key1: val1 key2: val2 .. # for GCP network tags are just keys (there is no value associated to them). # What is shown below will work for AWS provider. network: key3: val3 key4: val4 Under tags clear distinction is made between tags for VMs, Disks, network interface etc. Each provider has a different allowed-set of characters that it accepts as key names, has different limits on the tags that can be set on a resource (disk, NIC, VM etc.) and also has a different format (GCP network tags are only keys).\n TODO:\n Check if worker.labels are getting added as tags on infra resources. 
We should continue to support it and double check that these should only be added to VMs and not to other resources.\n Should we support users adding VM tags as node labels?\n Provider specific WorkerConfig API changes Taking AWS provider extension as an example to show the changes.\n WorkerConfig will now have the following changes:\n A new field for tags will be introduced. Additional metadata for struct fields will now be added via struct tags. type WorkerConfig struct { metav1.TypeMeta Volume *Volume // .. all fields are not mentioned here. // Tags are a collection of tags to be set on provider resources (e.g. VMs, Disks, Network Interfaces etc.) Tags *Tags `hotupdatable:\"true\"` } // Tags is a placeholder for all tags that can be set/updated on VMs, Disks and Network Interfaces. type Tags struct { // VM tags set on the VM instances. VM map[string]string // Network tags set on the network interfaces. Network map[string]string // Disk tags set on the volumes/disks. Disk map[string]string } There is a need to distinguish fields within ProviderSpec (which is then mapped to the above WorkerConfig) which can be updated without the need to change the hash suffix for MachineClass and thus trigger a rolling update on machines.\nTo achieve that we propose to use struct tag hotupdatable whose value indicates if the field can be updated without the need to do a rolling update. To ensure backward compatibility, all fields which do not have this tag or have hotupdatable set to false will be considered as immutable and will require a rolling update to take effect.\nGardener provider extension changes Taking AWS provider extension as an example. The following changes should be made to all gardener provider extensions\n AWS Gardener Extension generates machine config using worker pool configuration. 
As part of that it also computes the workerPoolHash which is then used to create the name of the MachineClass.\nCurrently WorkerPoolHash function uses the entire providerConfig to compute the hash. Proposal is to do the following:\n Remove the code from function WorkerPoolHash. Add another function to compute hash using all immutable fields in the provider config struct and then pass that to worker.WorkerPoolHash as additionalData. The above will ensure that tags and any other fields in WorkerConfig which are marked with hotupdatable:true are not considered for hash computation and will therefore not contribute to changing the name of the MachineClass object, thus preventing a rolling update.\nWorkerConfig and therefore the contained tags will be set as ProviderSpec in MachineClass.\nIf only fields which have hotupdatable:true are changed then it should result in update/patch of MachineClass and not creation.\nDriver interface changes Driver interface which is a facade to provider specific API implementations will have one additional method.\ntype Driver interface { // .. existing methods are not mentioned here for brevity. UpdateMachine(context.Context, *UpdateMachineRequest) error } // UpdateMachineRequest is the request to update machine tags. type UpdateMachineRequest struct { ProviderID string LastAppliedProviderSpec raw.Extension MachineClass *v1alpha1.MachineClass Secret *corev1.Secret } If any machine-controller-manager-provider-\u003cprovidername\u003e has not implemented UpdateMachine then updates of tags on Instances/NICs/Disks will not be done. An error message will be logged instead.\n Machine Class reconciliation Current MachineClass reconciliation does not reconcile MachineClass resource updates but it only enqueues associated machines. The reason is that it is assumed that anything that is changed in a MachineClass will result in a creation of a new MachineClass with a different name. 
This will result in a rolling update of all machines using the MachineClass as a template.\nHowever, it is possible that there is data that all machines in a MachineSet share which does not require a rolling update (e.g. tags), therefore there is a need to reconcile the MachineClass as well.\nReconciliation Changes In order to ensure that machines get updated eventually with changes to the hot-updatable fields defined in the MachineClass.ProviderConfig as raw.Extension, we should only fix MCM Issue#751 in the MachineClass reconciliation and let it enqueue the machines as it does today. We additionally propose the following two things:\n Introduce a new annotation last-applied-providerspec on every machine resource. This will capture the last successfully applied MachineClass.ProviderSpec on this instance.\n Enhance the machine reconciliation to include code to hot-update machine.\n In machine-reconciliation there are currently two flows triggerDeletionFlow and triggerCreationFlow. When a machine gets enqueued due to changes in MachineClass then in this method the following changes need to be introduced:\nCheck if the machine has last-applied-providerspec annotation.\nCase 1.1\nIf the annotation is not present then there can be just 2 possibilities:\n It is a fresh/new machine and no backing resources (VM/NIC/Disk) exist yet. The current flow checks if the providerID is empty and Status.CurrentStatus.Phase is empty then it enters into the triggerCreationFlow.\n It is an existing machine which does not yet have this annotation. In this case call Driver.UpdateMachine. If the driver returns no error then add last-applied-providerspec annotation with the value of MachineClass.ProviderSpec to this machine.\n Case 1.2\nIf the annotation is present then compare the last applied provider-spec with the current provider-spec. If there are changes (check their hash values) then call Driver.UpdateMachine. 
If the driver returns no error then add last-applied-providerspec annotation with the value of MachineClass.ProviderSpec to this machine.\n NOTE: It is assumed that if there are changes to the fields which are not marked as hotupdatable then it will result in the change of name for MachineClass resulting in a rolling update of machines. If the name has not changed + machine is enqueued + there is a change in machine-class then it must be a change to hotupdatable fields in the spec.\n Trigger update flow can be done after reconcileMachineHealth and syncMachineNodeTemplates in machine-reconciliation.\nThere are 2 edge cases that need attention and special handling:\n Premise: It is identified that there is an update done to one or more hotupdatable fields in the MachineClass.ProviderSpec.\n Edge-Case-1\nIn the machine reconciliation, an update-machine-flow is triggered which in-turn calls Driver.UpdateMachine. Consider the case where the hot update needs to be done to all VM, NIC and Disk resources. The driver returns an error which indicates a partial-failure. As we have mentioned above only when Driver.UpdateMachine returns no error will last-applied-providerspec be updated. In case of partial failure the annotation will not be updated. This event will be re-queued for a re-attempt. However consider a case where before the item is re-queued, another update is done to MachineClass reverting back the changes to the original spec.\n At T1: last-applied-providerspec=S1; MachineClass.ProviderSpec = S1. At T2 (T2 \u003e T1): last-applied-providerspec=S1; MachineClass.ProviderSpec = S2; another update to MachineClass.ProviderConfig = S3 is enqueued (S3 == S1). At T3 (T3 \u003e T2): last-applied-providerspec=S1; Driver.UpdateMachine for the S1-S2 update returns a partial failure; Machine-Key is requeued. At T4 (T4 \u003e T3) when a machine is reconciled then it checks that last-applied-providerspec is S1 and current MachineClass.ProviderSpec = S3 and since S3 is same as S1, no update is done. 
At T2 Driver.UpdateMachine was called to update the machine with S2 but it partially failed. So now you will have resources which are partially updated with S2 and no further updates will be attempted.\nEdge-Case-2\nThe above situation can also happen when Driver.UpdateMachine is in the process of updating resources. It has hot-updated, let's say, 1 resource. But now MCM crashes. By the time it comes up, another update to MachineClass.ProviderSpec is done, essentially reverting the previous change (same case as above). In this case the reconciliation loop never got a chance to get any response from the driver.\nTo handle the above edge cases there are 2 options:\nOption #1\nIntroduce a new annotation inflight-providerspec-hash. The value of this annotation will be the hash value of the MachineClass.ProviderSpec that is in the process of getting applied on this machine. The machine will be updated with this annotation just before calling Driver.UpdateMachine (in the trigger-update-machine-flow). If the driver returns no error then (in a single update):\n last-applied-providerspec will be updated\n inflight-providerspec-hash annotation will be removed.\n Option #2 - Preferred\nLeverage Machine.Status.LastOperation with Type set to MachineOperationUpdate and State set to MachineStateProcessing. This status will be updated just before calling Driver.UpdateMachine.\nSemantically LastOperation captures the details of the operation post-operation and not pre-operation. 
So this solution would be a divergence from the norm.\n","categories":"","description":"","excerpt":"Hot-Update VirtualMachine tags without triggering a rolling-update …","ref":"/docs/other-components/machine-controller-manager/proposals/hotupdate-instances/","tags":"","title":"Hotupdate Instances"},{"body":"There are two ways to get the health information of a shoot API server.\n Try to reach the public endpoint of the shoot API server via \"https://api.\u003cshoot-name\u003e.\u003cproject-name\u003e.shoot.\u003ccanary|office|live\u003e.k8s-hana.ondemand.com/healthz\" The endpoint is secured, therefore you need to authenticate via basic auth or client cert. Both are available in the admin kubeconfig of the shoot cluster. Note that with those credentials you have full (admin) access to the cluster, therefore it is highly recommended to create custom credentials with some RBAC rules and bindings which only allow access to the /healthz endpoint.\n Fetch the shoot resource of your cluster via the programmatic API of the Gardener and get the availability information from the status. You need a kubeconfig for the Garden cluster, which you can get via the Gardener dashboard. Then you could fetch your shoot resource and query for the availability information via: kubectl get shoot \u003cshoot-name\u003e -o json | jq -r '.status.conditions[] | select(.type==\"APIServerAvailable\")' The availability information in the second scenario is collected by the Gardener. 
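The jq filter in the second scenario can also be expressed in Go when the shoot resource is consumed programmatically. This is a minimal sketch under stated assumptions: it takes the JSON printed by kubectl get shoot as input, the type and function names are hypothetical, and only the type and status fields of a condition are decoded (the status field is assumed to carry the usual True/False/Unknown style value).

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// condition holds the subset of a shoot condition this sketch cares about.
type condition struct {
	Type   string `json:"type"`
	Status string `json:"status"`
}

// shoot decodes only .status.conditions[] of a Shoot resource.
type shoot struct {
	Status struct {
		Conditions []condition `json:"conditions"`
	} `json:"status"`
}

// apiServerAvailable returns the APIServerAvailable condition found in the
// given shoot JSON, mirroring the jq select(.type==...) filter.
func apiServerAvailable(shootJSON []byte) (condition, error) {
	var s shoot
	if err := json.Unmarshal(shootJSON, &s); err != nil {
		return condition{}, fmt.Errorf("decoding shoot: %w", err)
	}
	for _, c := range s.Status.Conditions {
		if c.Type == "APIServerAvailable" {
			return c, nil
		}
	}
	return condition{}, errors.New("condition APIServerAvailable not found")
}

func main() {
	// Hand-written sample input, not real cluster output.
	sample := []byte(`{"status":{"conditions":[{"type":"APIServerAvailable","status":"True"}]}}`)
	c, err := apiServerAvailable(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.Type, c.Status)
}
```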
If you want to collect the information independently from Gardener, you should choose the first scenario.\nIf you want to set up a simple pull monitor in the AvS for a shoot cluster, you also need to use the first scenario, because with it you have a stable endpoint for the API server which you can query.\n","categories":"","description":"","excerpt":"There are two ways to get the health information of a shoot API …","ref":"/docs/faq/clusterhealthz/","tags":"","title":"How can you get the status of a shoot API server?"},{"body":"Configuration of Multi-AZ worker pools depends on the infrastructure.\nThe zone distribution for the worker pools can be configured generically across all infrastructures. You can find provider-specific details in the InfrastructureConfig section of each extension provider repository:\n AWS (a VPC with a subnet is required in each zone you want to support) GCP Azure AliCloud OpenStack ","categories":"","description":"","excerpt":"Configuration of Multi-AZ worker pools depends on the infrastructure. …","ref":"/docs/faq/configure-worker-pools/","tags":"","title":"How do you configure Multi-AZ worker pools for different extensions?"},{"body":"End-users must provide credentials such that Gardener and Kubernetes controllers can communicate with the respective cloud provider APIs in order to perform infrastructure operations. 
These credentials should be regularly rotated.\nHow to do so is explained in Shoot Credentials Rotation.\n","categories":"","description":"","excerpt":"End-users must provide credentials such that Gardener and Kubernetes …","ref":"/docs/faq/rotate-iaas-keys/","tags":"","title":"How do you rotate IaaS keys for a running cluster?"},{"body":"Adding a Feature Gate In order to add a feature gate, add it as enabled to the appropriate section of the shoot.yaml file:\nSectionName: featureGates: SomeKubernetesFeature: true The available sections are kubelet, kubernetes, kubeAPIServer, kubeControllerManager, kubeScheduler, and kubeProxy.\nFor more details, see the example shoot.yaml file.\nWhat is the expected downtime when updating the shoot.yaml? No downtime is expected after executing a shoot.yaml update.\n","categories":"","description":"","excerpt":"Adding a Feature Gate In order to add a feature gate, add it as …","ref":"/docs/faq/add-feature-gates/","tags":"","title":"How to add K8S feature gates to my shoot cluster?"},{"body":"Introduction Kubernetes offers powerful options to get more details about startup or runtime failures of pods as e.g. described in Application Introspection and Debugging or Debug Pods and Replication Controllers.\nIn order to identify pods with potential issues, you could, e.g., run kubectl get pods --all-namespaces | grep -iv Running to filter out the pods which are not in the state Running. One frequent error state is CrashLoopBackOff, which indicates that a pod crashes right after the start. Kubernetes then tries to restart the pod again, but often the pod startup fails again.\nHere is a short list of possible reasons which might lead to a pod crash:\n Error during image pull caused by e.g. wrong/missing secrets or wrong/missing image The app runs in an error state caused e.g. 
by missing environmental variables (ConfigMaps) or secrets Liveness probe failed Too high resource consumption (memory and/or CPU) or too strict quota settings Persistent volumes can’t be created/mounted The container image is not updated Basically, the commands kubectl logs ... and kubectl describe ... with different parameters are used to get more detailed information. By calling e.g. kubectl logs --help you can get more detailed information about the command and its parameters.\nIn the next sections you’ll find some basic approaches to get an idea of what went wrong.\nRemarks:\n Even if the pods seem to be running, as the status Running indicates, a high Restarts counter shows potential problems You can get a good overview of the troubleshooting process with the interactive tutorial Troubleshooting with Kubectl, which explains basic debugging activities The examples below are deployed into the namespace default. In case you want to change it, use the optional parameter --namespace \u003cyour-namespace\u003e to select the target namespace. The examples require a Kubernetes release ≥ 1.8. Prerequisites Your deployment was successful (no logical/syntactical errors in the manifest files), but the pod(s) aren’t running.\nError Caused by Wrong Image Name Start by running kubectl describe pod \u003cyour-pod\u003e --namespace \u003cyour-namespace\u003e to get detailed information about the pod startup.\nIn the Events section, you should get an error message like Failed to pull image ... and Reason: Failed. The pod is in state ImagePullBackOff.\nThe example below is based on a demo in the Kubernetes documentation. 
In all examples, the default namespace is used.\nFirst, perform a cleanup with:\nkubectl delete pod termination-demo\nNext, create a resource based on the yaml content below:\napiVersion: v1 kind: Pod metadata: name: termination-demo spec: containers: - name: termination-demo-container image: debiann command: [\"/bin/sh\"] args: [\"-c\", \"sleep 10 \u0026\u0026 echo Sleep expired \u003e /dev/termination-log\"] kubectl describe pod termination-demo lists in the Event section the content\nEvents: FirstSeen\tLastSeen\tCount\tFrom\tSubObjectPath\tType\tReason\tMessage ---------\t--------\t-----\t----\t-------------\t--------\t------\t------- 2m\t2m\t1\tdefault-scheduler\tNormal\tScheduled\tSuccessfully assigned termination-demo to ip-10-250-17-112.eu-west-1.compute.internal 2m\t2m\t1\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tNormal\tSuccessfulMountVolume\tMountVolume.SetUp succeeded for volume \"default-token-sgccm\" 2m\t1m\t4\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tPulling\tpulling image \"debiann\" 2m\t1m\t4\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tWarning\tFailed\tFailed to pull image \"debiann\": rpc error: code = Unknown desc = Error: image library/debiann:latest not found 2m\t54s\t10\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tWarning\tFailedSync\tError syncing pod 2m\t54s\t6\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tBackOff\tBack-off pulling image \"debiann\" The error message with Reason: Failed tells you that there is an error during pulling the image. 
A closer look at the image name indicates a misspelling.\nThe App Runs in an Error State Caused, e.g., by Missing Environmental Variables (ConfigMaps) or Secrets This example illustrates the behavior in the case when the app expects environment variables but the corresponding Kubernetes artifacts are missing.\nFirst, perform a cleanup with:\nkubectl delete deployment termination-demo kubectl delete configmaps app-env Next, deploy the following manifest:\napiVersion: apps/v1beta2 kind: Deployment metadata: name: termination-demo labels: app: termination-demo spec: replicas: 1 selector: matchLabels: app: termination-demo template: metadata: labels: app: termination-demo spec: containers: - name: termination-demo-container image: debian command: [\"/bin/sh\"] args: [\"-c\", \"sed \\\"s/foo/bar/\\\" \u003c $MYFILE\"] Now, the command kubectl get pods lists the pod termination-demo-xxx in the state Error or CrashLoopBackOff. The command kubectl describe pod termination-demo-xxx tells you that there is no error during startup but gives no clue about what caused the crash.\nEvents: FirstSeen\tLastSeen\tCount\tFrom\tSubObjectPath\tType\tReason\tMessage ---------\t--------\t-----\t----\t-------------\t--------\t------\t------- 19m\t19m\t1\tdefault-scheduler\tNormal\tScheduled\tSuccessfully assigned termination-demo-5fb484867d-xz2x9 to ip-10-250-17-112.eu-west-1.compute.internal 19m\t19m\t1\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tNormal\tSuccessfulMountVolume\tMountVolume.SetUp succeeded for volume \"default-token-sgccm\" 19m\t19m\t4\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tPulling\tpulling image \"debian\" 19m\t19m\t4\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tPulled\tSuccessfully pulled image \"debian\" 19m\t19m\t4\tkubelet, 
ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tCreated\tCreated container 19m\t19m\t4\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tNormal\tStarted\tStarted container 19m\t14m\t24\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tspec.containers{termination-demo-container}\tWarning\tBackOff\tBack-off restarting failed container 19m\t4m\t69\tkubelet, ip-10-250-17-112.eu-west-1.compute.internal\tWarning\tFailedSync\tError syncing pod The command kubectl logs termination-demo-xxx gives access to the output the application writes on stderr and stdout. In this case, you should get an output similar to:\n/bin/sh: 1: cannot open : No such file So you need to have a closer look at the application. In this case, the environment variable MYFILE is missing. To fix this issue, you could e.g. add a ConfigMap to your deployment as is shown in the manifest listed below:\napiVersion: v1 kind: ConfigMap metadata: name: app-env data: MYFILE: \"/etc/profile\" --- apiVersion: apps/v1beta2 kind: Deployment metadata: name: termination-demo labels: app: termination-demo spec: replicas: 1 selector: matchLabels: app: termination-demo template: metadata: labels: app: termination-demo spec: containers: - name: termination-demo-container image: debian command: [\"/bin/sh\"] args: [\"-c\", \"sed \\\"s/foo/bar/\\\" \u003c $MYFILE\"] envFrom: - configMapRef: name: app-env Note that once you fix the error and re-run the scenario, you might still see the pod in a CrashLoopBackOff status. It is because the container finishes the command sed ... and runs to completion. 
In order to keep the container in a Running status, a long-running task is required, e.g.:\napiVersion: v1 kind: ConfigMap metadata: name: app-env data: MYFILE: \"/etc/profile\" SLEEP: \"5\" --- apiVersion: apps/v1beta2 kind: Deployment metadata: name: termination-demo labels: app: termination-demo spec: replicas: 1 selector: matchLabels: app: termination-demo template: metadata: labels: app: termination-demo spec: containers: - name: termination-demo-container image: debian command: [\"/bin/sh\"] # args: [\"-c\", \"sed \\\"s/foo/bar/\\\" \u003c $MYFILE\"] args: [\"-c\", \"while true; do sleep $SLEEP; echo sleeping; done;\"] envFrom: - configMapRef: name: app-env Too High Resource Consumption (Memory and/or CPU) or Too Strict Quota Settings You can optionally specify the amount of memory and/or CPU your container gets during runtime. In case these settings are missing, the default requests settings are taken: CPU: 0m (in Milli CPU) and RAM: 0Gi, which means there are no limits other than those of the node(s) themselves. For more details, e.g. about how to configure limits, see Configure Default Memory Requests and Limits for a Namespace.\nIn case your application needs more resources, Kubernetes distinguishes between requests and limit settings: requests specify the guaranteed amount of resource, whereas limit tells Kubernetes the maximum amount of resource the container is allowed to use. Mathematically, both settings could be described by the relation 0 \u003c= requests \u003c= limit. For both settings you need to consider the total amount of resources your nodes provide. For a detailed description of the concept, see Resource Quality of Service in Kubernetes.\nUse kubectl describe nodes to get a first overview of the resource consumption in your cluster. 
Of special interest are the figures indicating the amount of CPU and Memory Requests at the bottom of the output.\nThe next example demonstrates what happens when the CPU request is too high to be satisfied by your cluster.\nFirst, perform a cleanup with:\nkubectl delete deployment termination-demo kubectl delete configmaps app-env Next, adapt the cpu value in the YAML below to be slightly higher than the remaining CPU resources in your cluster and deploy this manifest. In this example, 600m (milli CPUs) are requested in a Kubernetes system with a single 2 core worker node which results in an error message.\napiVersion: apps/v1beta2 kind: Deployment metadata: name: termination-demo labels: app: termination-demo spec: replicas: 1 selector: matchLabels: app: termination-demo template: metadata: labels: app: termination-demo spec: containers: - name: termination-demo-container image: debian command: [\"/bin/sh\"] args: [\"-c\", \"sleep 10 \u0026\u0026 echo Sleep expired \u003e /dev/termination-log\"] resources: requests: cpu: \"600m\" The command kubectl get pods lists the pod termination-demo-xxx in the state Pending. More details on why this happens can be found by using the command kubectl describe pod termination-demo-xxx:\n$ kubectl describe po termination-demo-fdb7bb7d9-mzvfw Name: termination-demo-fdb7bb7d9-mzvfw Namespace: default ... Containers: termination-demo-container: Image: debian Port: \u003cnone\u003e Host Port: \u003cnone\u003e Command: /bin/sh Args: -c sleep 10 \u0026\u0026 echo Sleep expired \u003e /dev/termination-log Requests: cpu: 6 Environment: \u003cnone\u003e Mounts: /var/run/secrets/kubernetes.io/serviceaccount from default-token-t549m (ro) Conditions: Type Status PodScheduled False Events: Type Reason Age From Message ---- ------ ---- ---- ------- Warning FailedScheduling 9s (x7 over 40s) default-scheduler 0/2 nodes are available: 2 Insufficient cpu. 
You can find more details in:\n Managing Compute Resources for Containers Resource Quality of Service in Kubernetes Remarks:\n This example works similarly when specifying too high a request for memory In case you configured an autoscaler range when creating your Kubernetes cluster, another worker node will be spun up automatically if you didn’t reach the maximum number of worker nodes In case your app is running out of memory (the memory settings are too small), you will typically find an OOMKilled (Out Of Memory) message in the Events section of the kubectl describe pod ... output The Container Image Is Not Updated You applied a fix in your app, created a new container image and pushed it into your container repository. After redeploying your Kubernetes manifests, you expected to get the updated app, but the same bug is still present in the new deployment.\nThis behavior is related to how Kubernetes decides whether to pull a new docker image or to use the cached one.\nIn case you didn’t change the image tag, the default image policy IfNotPresent tells Kubernetes to use the cached image (see Images).\nAs a best practice, you should avoid the tag latest and instead change the image tag whenever you change anything in your image (see Configuration Best Practices).\nFor more information, see Container Image Not Updating.\nRelated Links Application Introspection and Debugging Debug Pods and Replication Controllers Logging Architecture Configure Default Memory Requests and Limits for a Namespace Managing Compute Resources for Containers Resource Quality of Service in Kubernetes Interactive Tutorial Troubleshooting with Kubectl Images Kubernetes Best Practices ","categories":"","description":"Your pod doesn't run as expected. Are there any log files? Where? How could I debug a pod?","excerpt":"Your pod doesn't run as expected. Are there any log files? Where? 
How …","ref":"/docs/guides/monitoring-and-troubleshooting/debug-a-pod/","tags":"","title":"How to Debug a Pod"},{"body":"How to provide credentials for upstream registry? In Kubernetes, to pull images from private container image registries you either have to specify an image pull Secret (see Pull an Image from a Private Registry) or you have to configure the kubelet to dynamically retrieve credentials using a credential provider plugin (see Configure a kubelet image credential provider). When pulling an image, the kubelet provides the credentials to the CRI implementation. The CRI implementation uses the provided credentials against the upstream registry to pull the image.\nThe registry-cache extension uses the Distribution project as its pull through cache implementation. The Distribution project does not use the provided credentials from the CRI implementation while fetching an image from the upstream. Hence, the above-described scenarios such as configuring an image pull Secret for a Pod or configuring kubelet credential provider plugins don’t work out of the box with the pull through cache provided by the registry-cache extension. Instead, the Distribution project supports configuring only one set of credentials for a given pull through cache instance (for a given upstream).\nThis document describes how to supply credentials for the private upstream registry in order to pull private images with the registry cache.\nHow to configure the registry cache to use upstream registry credentials? Create an immutable Secret with the upstream registry credentials in the Garden cluster:\nkubectl create -f - \u003c\u003cEOF apiVersion: v1 kind: Secret metadata: name: ro-docker-secret-v1 namespace: garden-dev type: Opaque immutable: true data: username: $(echo -n $USERNAME | base64 -w0) password: $(echo -n $PASSWORD | base64 -w0) EOF For Artifact Registry, the username is _json_key and the password is the service account key in JSON format. 
To base64 encode the service account key, copy it and run:\necho -n $SERVICE_ACCOUNT_KEY_JSON | base64 -w0 Add the newly created Secret as a reference to the Shoot spec, and then to the registry-cache extension configuration.\nIn the registry-cache configuration, set the secretReferenceName field. It should point to a resource reference under spec.resources. The resource reference itself points to the Secret in project namespace.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot # ... spec: extensions: - type: registry-cache providerConfig: apiVersion: registry.extensions.gardener.cloud/v1alpha3 kind: RegistryConfig caches: - upstream: docker.io secretReferenceName: docker-secret # ... resources: - name: docker-secret resourceRef: apiVersion: v1 kind: Secret name: ro-docker-secret-v1 # ... [!WARNING] Do not delete the referenced Secret when there is a Shoot still using it.\n How to rotate the registry credentials? To rotate registry credentials perform the following steps:\n Generate a new pair of credentials in the cloud provider account. Do not invalidate the old ones. Create a new Secret (e.g., ro-docker-secret-v2) with the newly generated credentials as described in step 1. in How to configure the registry cache to use upstream registry credentials?. Update the Shoot spec with newly created Secret as described in step 2. in How to configure the registry cache to use upstream registry credentials?. The above step will trigger a Shoot reconciliation. Wait for it to complete. Make sure that the old Secret is no longer referenced by any Shoot cluster. Finally, delete the Secret containing the old credentials (e.g., ro-docker-secret-v1). Delete the corresponding old credentials from the cloud provider account. Possible Pitfalls The registry cache is not protected by any authentication/authorization mechanism. The cached images (incl. private images) can be fetched from the registry cache without authentication/authorization. 
Note that the registry cache itself is not exposed publicly. The registry cache provides the credentials for every request against the corresponding upstream. In some cases, misconfigured credentials can prevent the registry cache from pulling even public images from the upstream (for example: invalid service account key for Artifact Registry). However, this behaviour is controlled by the server-side logic of the upstream registry. Do not remove the image pull Secrets when configuring credentials for the registry cache. When the registry-cache is not available, containerd falls back to the upstream registry. containerd still needs the image pull Secret to pull the image so that the fallback mechanism works. ","categories":"","description":"","excerpt":"How to provide credentials for upstream registry? In Kubernetes, to …","ref":"/docs/extensions/others/gardener-extension-registry-cache/registry-cache/upstream-credentials/","tags":"","title":"How to provide credentials for upstream registry?"},{"body":"Image Vector The Gardener components are deploying several different container images into the garden, seed, and the shoot clusters. The image repositories and tags are defined in a central image vector file. 
Obviously, the image versions defined there must fit together with the deployment manifests (e.g., some command-line flags do only exist in certain versions).\nExample images: - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile repository: registry.k8s.io/pause tag: \"3.4\" targetVersion: \"1.20.x\" architectures: - amd64 - arm64 - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile ref: registry.k8s.io/pause:3.5 targetVersion: \"\u003e= 1.21\" architectures: - amd64 - arm64 That means that Gardener will use the pause-container with tag 3.4 for all clusters with Kubernetes version 1.20.x, and the image with ref registry.k8s.io/pause:3.5 for all clusters with Kubernetes \u003e= 1.21.\n [!NOTE] As you can see, it is possible to provide the full image reference via the ref field. Another option is to use the repository and tag fields. tag may also be a digest only (starting with sha256:...), or it can contain both tag and digest (v1.2.3@sha256:...).\n Architectures images: - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile repository: registry.k8s.io/pause tag: \"3.5\" architectures: - amd64 - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile ref: registry.k8s.io/pause:3.5 architectures: - arm64 - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile ref: registry.k8s.io/pause:3.5 architectures: - amd64 - arm64 architectures is an optional field of image. It is a list of strings specifying CPU architecture of machines on which this image can be used. The valid options for the architectures field are as follows:\n amd64 : This specifies that the image can run only on machines having CPU architecture amd64. 
arm64 : This specifies that the image can run only on machines having CPU architecture arm64. If an image doesn’t specify any architectures, then by default it is considered to support both amd64 and arm64 architectures.\nOverwriting Image Vector In some environments it is not possible to use these “pre-defined” images that come with a Gardener release. A prominent example for that is Alicloud in China, which does not allow access to Google’s GCR. In these cases, you might want to overwrite certain images, e.g., point the pause-container to a different registry.\n⚠️ If you specify an image that does not fit to the resource manifest, then the reconciliations might fail.\nIn order to overwrite the images, you must provide a similar file to the Gardener component:\nimages: - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile repository: my-custom-image-registry/pause tag: \"3.4\" version: \"1.20.x\" - name: pause-container sourceRepository: github.com/kubernetes/kubernetes/blob/master/build/pause/Dockerfile ref: my-custom-image-registry/pause:3.5 version: \"\u003e= 1.21\" [!IMPORTANT] When the overwriting file contains ref for an image but the source file doesn’t, then this invalidates both repository and tag of the source. When it contains repository for an image but the source file uses ref, then this invalidates ref of the source.\n For gardenlet, you can create a ConfigMap containing the above content and mount it as a volume into the gardenlet pod. Next, specify the environment variable IMAGEVECTOR_OVERWRITE, whose value must be the path to the file you just mounted. The approach works similarly for gardener-operator.\napiVersion: v1 kind: ConfigMap metadata: name: gardenlet-images-overwrite namespace: garden data: images_overwrite.yaml: |images: - ... 
--- apiVersion: apps/v1 kind: Deployment metadata: name: gardenlet namespace: garden spec: template: spec: containers: - name: gardenlet env: - name: IMAGEVECTOR_OVERWRITE value: /imagevector-overwrite/images_overwrite.yaml volumeMounts: - name: gardenlet-images-overwrite mountPath: /imagevector-overwrite volumes: - name: gardenlet-images-overwrite configMap: name: gardenlet-images-overwrite Image Vectors for Dependent Components Gardener is deploying a lot of different components that might deploy other images themselves. These components might use an image vector as well. Operators might want to customize the image locations for these transitive images as well, hence, they might need to specify an image vector overwrite for the components directly deployed by Gardener.\nIt is possible to specify the IMAGEVECTOR_OVERWRITE_COMPONENTS environment variable to Gardener that points to a file with the following content:\ncomponents: - name: etcd-druid imageVectorOverwrite: |images: - name: etcd tag: v1.2.3 repository: etcd/etcd Gardener will, if supported by the directly deployed component (etcd-druid in this example), inject the given imageVectorOverwrite into the Deployment manifest. The respective component is responsible for using the overwritten images instead of its defaults.\nHelm Chart Image Vector Some Gardener components might also deploy packaged Helm charts which are pulled from an OCI repository. The concepts are the very same as for the container images. The only difference is that the environment variable for overwriting this chart image vector is called IMAGEVECTOR_OVERWRITE_CHARTS.\n","categories":"","description":"","excerpt":"Image Vector The Gardener components are deploying several different …","ref":"/docs/gardener/deployment/image_vector/","tags":"","title":"Image Vector"},{"body":"Contract: Infrastructure Resource Every Kubernetes cluster requires some low-level infrastructure to be setup in order to work properly. 
Examples for that are networks, routing entries, security groups, IAM roles, etc. Before introducing the Infrastructure extension resource, Gardener was using Terraform in order to create and manage these provider-specific resources (e.g., see here). Now, Gardener commissions an external, provider-specific controller to take over this task.\nWhich infrastructure resources are required? Unfortunately, there is no general answer to this question as it is highly provider specific. Consider the above mentioned resources, i.e. VPC, subnets, route tables, security groups, IAM roles, SSH key pairs. Most of the resources are required in order to create VMs (the shoot cluster worker nodes), load balancers, and volumes.\nWhat needs to be implemented to support a new infrastructure provider? As part of the shoot flow, Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: type: azure region: eu-west-1 secretRef: name: cloudprovider namespace: shoot--foo--bar providerConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig resourceGroup: name: mygroup networks: vnet: # specify either 'name' or 'cidr' # name: my-vnet cidr: 10.250.0.0/16 workers: 10.250.0.0/19 The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used to create the needed resources. However, the most important section is the .spec.providerConfig. It contains an embedded declaration of the provider specific configuration for the infrastructure (that cannot be known by Gardener itself). You are responsible for designing what this configuration looks like. 
Gardener does not evaluate it but just copies this part from what has been provided by the end-user in the Shoot resource.\nAfter your controller has created the required resources in your provider’s infrastructure it needs to generate an output that can be used by other controllers in subsequent steps. An example for that is the Worker extension resource controller. It is responsible for creating virtual machines (shoot worker nodes) in this prepared infrastructure. Everything that it needs to know in order to do that (e.g. the network IDs, security group names, etc. (again: provider-specific)) needs to be provided as output in the Infrastructure resource:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: ... status: lastOperation: ... providerStatus: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus resourceGroup: name: mygroup networks: vnet: name: my-vnet subnets: - purpose: nodes name: my-subnet availabilitySets: - purpose: nodes id: av-set-id name: av-set-name routeTables: - purpose: nodes name: route-table-name securityGroups: - purpose: nodes name: sec-group-name In order to support a new infrastructure provider you need to write a controller that watches all Infrastructures with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the Azure provider.\nDynamic nodes network for shoot clusters Some environments do not allow end-users to statically define a CIDR for the network that shall be used for the shoot worker nodes. In these cases it is possible for the extension controllers to dynamically provision a network for the nodes (as part of their reconciliation loops), and to provide the CIDR in the status of the Infrastructure resource:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: ... 
status: lastOperation: ... providerStatus: ... nodesCIDR: 10.250.0.0/16 Gardener will pick this nodesCIDR and use it to configure the VPN components to establish network connectivity between the control plane and the worker nodes. If the Shoot resource already specifies a nodes CIDR in .spec.networking.nodes and the extension controller provides also a value in .status.nodesCIDR in the Infrastructure resource then the latter one will always be considered with higher priority by Gardener.\nNon-provider specific information required for infrastructure creation Some providers might require further information that is not provider specific but already part of the shoot resource. One example for this is the GCP infrastructure controller which needs the pod and the service network of the cluster in order to prepare and configure the infrastructure correctly. As Gardener cannot know which information is required by providers it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information that is not part of the Infrastructure resource itself.\nImplementation details Actuator interface Most existing infrastructure controller implementations follow a common pattern where a generic Reconciler delegates to an Actuator interface that contains the methods Reconcile, Delete, Migrate, and Restore. These methods are called by the generic Reconciler for the respective operations, and should be implemented by the extension according to the contract described here and the migration guidelines.\nConfigValidator interface For infrastructure controllers, the generic Reconciler also delegates to a ConfigValidator interface that contains a single Validate method. 
This method is called by the generic Reconciler at the beginning of every reconciliation, and can be implemented by the extension to validate the .spec.providerConfig part of the Infrastructure resource with the respective cloud provider, typically the existence and validity of cloud provider resources such as AWS VPCs or GCP Cloud NAT IPs.\nThe Validate method returns a list of errors. If this list is non-empty, the generic Reconciler will fail with an error. This error will have the error code ERR_CONFIGURATION_PROBLEM, unless there is at least one error in the list that has its ErrorType field set to field.ErrorTypeInternal.\nReferences and additional resources Infrastructure API (Golang specification) Sample implementation for the Azure provider Sample ConfigValidator implementation ","categories":"","description":"","excerpt":"Contract: Infrastructure Resource Every Kubernetes cluster requires …","ref":"/docs/gardener/extensions/infrastructure/","tags":"","title":"Infrastructure"},{"body":"Post-Create Initialization of Machine Instance Background Today the driver.Driver facade represents the boundary between the machine-controller and its various provider specific implementations.\nWe have abstract operations for creation/deletion and listing of machines (actually compute instances) but we do not correctly handle post-creation initialization logic. Nor do we provide an abstract operation to represent the hot update of an instance after creation.\nWe have found this to be necessary for several use cases. Today in the MCM AWS Provider, we already misuse driver.GetMachineStatus which is supposed to be a read-only operation obtaining the status of an instance.\n Each AWS EC2 instance performs source/destination checks by default. For EC2 NAT instances these should be disabled. This is done by issuing a ModifyInstanceAttribute request with the SourceDestCheck set to false. 
The MCM AWS Provider decodes the AWSProviderSpec, reads providerSpec.SrcAndDstChecksEnabled and correspondingly issues the call to modify the already launched instance. However, this should be done as an action after creating the instance and should not be part of the VM status retrieval.\n Similarly, there is a pending PR to add the Ipv6AddressCount and Ipv6PrefixCount to enable the assignment of an IPv6 address and an IPv6 prefix to instances. This requires constructing and issuing an AssignIpv6Addresses request after the EC2 instance is available.\n We have other use cases such as MCM Issue#750 where there is a requirement to provide a way for consumers to add tags which can be hot-updated onto instances. This requirement can be generalized to also offer a convenient way to specify tags which can be applied to VMs, NICs, Devices etc.\n We have a need for “machine-instance-not-ready” taint as described in MCM#740 which should only get removed once the post creation updates are finished.\n Objectives We will split the fulfilment of this overall need into 2 stages of implementation.\n Stage-A: Support post-VM creation initialization logic of the instance using a proposed Driver.InitializeMachine by permitting provider implementors to add initialization logic after VM creation, return with special new error code codes.Initialization for initialization errors and correspondingly support a new machine operation stage InstanceInitialization which will be updated in the machine LastOperation. The triggerCreationFlow - a reconciliation sub-flow of the MCM responsible for orchestrating instance creation and updating machine status will be changed to support this behaviour.\n Stage-B: Introduction of Driver.UpdateMachine and enhancing the MCM, MCM providers and gardener extension providers to support hot update of instances through Driver.UpdateMachine. 
The MCM triggerUpdationFlow - a reconciliation sub-flow of the MCM which is supposed to be responsible for orchestrating instance update - but currently not used, will be updated to invoke the provider Driver.UpdateMachine on hot-updates to the Machine object\n Stage-A Proposal Current MCM triggerCreationFlow Today, reconcileClusterMachine which is the main routine for the Machine object reconciliation invokes triggerCreationFlow at the end when the machine.Spec.ProviderID is empty or if the machine.Status.CurrentStatus.Phase is empty or in CrashLoopBackOff\n%%{ init: { 'themeVariables': { 'fontSize': '12px'} } }%% flowchart LR other[\"...\"] --\u003echk{\"machine ProviderID empty OR Phase empty or CrashLoopBackOff ? \"}--yes--\u003etriggerCreationFlow chk--no--\u003eLongRetry[\"return machineutils.LongRetry\"] Today, the triggerCreationFlow is illustrated below with some minor details omitted/compressed for brevity\nNOTES\n The lastop below is an abbreviation for machine.Status.LastOperation. This, along with the machine phase is generally updated on the Machine object just before returning from the method. Regarding phase=CrashLoopBackOff|Failed: the machine phase may either be CrashLoopBackOff or move to Failed if the difference between current time and the machine.CreationTimestamp has exceeded the configured MachineCreationTimeout. 
%%{ init: { 'themeVariables': { 'fontSize': '12px'} } }%% flowchart TD end1((\"end\")) begin((\" \")) medretry[\"return MediumRetry, err\"] shortretry[\"return ShortRetry, err\"] medretry--\u003eend1 shortretry--\u003eend1 begin--\u003eAddBootstrapTokenToUserData --\u003egms[\"statusResp,statusErr=driver.GetMachineStatus(...)\"] --\u003echkstatuserr{\"Check statusErr\"} chkstatuserr--notFound--\u003echknodelbl{\"Chk Node Label\"} chkstatuserr--else--\u003ecreateFailed[\"lastop.Type=Create,lastop.state=Failed,phase=CrashLoopBackOff|Failed\"]--\u003emedretry chkstatuserr--nil--\u003einitnodename[\"nodeName = statusResp.NodeName\"]--\u003esetnodename chknodelbl--notset--\u003ecreatemachine[\"createResp, createErr=driver.CreateMachine(...)\"]--\u003echkCreateErr{\"Check createErr\"} chkCreateErr--notnil--\u003ecreateFailed chkCreateErr--nil--\u003egetnodename[\"nodeName = createResp.NodeName\"] --\u003echkstalenode{\"nodeName != machine.Name\\n//chk stale node\"} chkstalenode--false--\u003esetnodename[\"if unset machine.Labels['node']= nodeName\"] --\u003emachinepending[\"if empty/crashloopbackoff lastop.type=Create,lastop.State=Processing,phase=Pending\"] --\u003eshortretry chkstalenode--true--\u003edelmachine[\"driver.DeleteMachine(...)\"] --\u003epermafail[\"lastop.type=Create,lastop.state=Failed,Phase=Failed\"] --\u003eshortretry subgraph noteA [\" \"] permafail -.- note1([\"VM was referring to stale node obj\"]) end style noteA opacity:0 subgraph noteB [\" \"] setnodename-.- note2([\"Proposal: Introduce Driver.InitializeMachine after this\"]) end Enhancement of MCM triggerCreationFlow Relevant Observations on Current Flow Observe that we always perform a call to Driver.GetMachineStatus and only then conditionally perform a call to Driver.CreateMachine if no machine was found. 
Observe that after the call to a successful Driver.CreateMachine, the machine phase is set to Pending, the LastOperation.Type is currently set to Create and the LastOperation.State set to Processing before returning with a ShortRetry. The LastOperation.Description is (unfortunately) set to the fixed message: Creating machine on cloud provider. Observe that after an erroneous call to Driver.CreateMachine, the machine phase is set to CrashLoopBackOff or Failed (in case of creation timeout). The following changes are proposed with a view towards minimal impact on current code and no introduction of a new Machine Phase.\nMCM Changes We propose introducing a new machine operation Driver.InitializeMachine with the following signature type Driver interface { // .. existing methods are omitted for brevity. // InitializeMachine call is responsible for post-create initialization of the provider instance. InitializeMachine(context.Context, *InitializeMachineRequest) error } // InitializeMachineRequest is the initialization request for machine instance initialization type InitializeMachineRequest struct { // Machine object whose VM instance should be initialized Machine *v1alpha1.Machine // MachineClass backing the machine object MachineClass *v1alpha1.MachineClass // Secret backing the machineClass object Secret *corev1.Secret } We propose introducing a new MC error code codes.Initialization indicating that the VM Instance was created but there was an error in initialization after VM creation. The implementor of Driver.InitializeMachine can return this error code, indicating that InitializeMachine needs to be called again. The Machine Controller will change the phase to CrashLoopBackOff as usual when encountering a codes.Initialization error. We will introduce a new machine operation stage InstanceInitialization. 
In case of a codes.Initialization error, the machine.Status.LastOperation.Description will be set to InstanceInitialization, machine.Status.LastOperation.ErrorCode will be set to codes.Initialization, the LastOperation.Type will be set to Create, and the LastOperation.State will be set to Failed before returning with a ShortRetry. The semantics of Driver.GetMachineStatus will be changed. If the instance associated with machine exists, but the instance was not initialized as expected, the provider implementations of GetMachineStatus should return an error: status.Error(codes.Initialization). If Driver.GetMachineStatus returned an error encapsulating codes.Initialization then Driver.InitializeMachine will be invoked again in the triggerCreationFlow. As per the usual logic, the main machine controller reconciliation loop will now re-invoke the triggerCreationFlow if the machine phase is CrashLoopBackOff. Illustration AWS Provider Changes Driver.InitializeMachine The implementation for the AWS Provider will look something like:\n After the VM instance is available, check providerSpec.SrcAndDstChecksEnabled, construct ModifyInstanceAttributeInput and call ModifyInstanceAttribute. In case of an error return codes.Initialization instead of the current codes.Internal. Check providerSpec.NetworkInterfaces and if Ipv6PrefixCount is not nil, then construct AssignIpv6AddressesInput and call AssignIpv6Addresses. In case of an error return codes.Initialization. Don’t use the generic codes.Internal. The existing Ipv6 PR will need modifications.\nDriver.GetMachineStatus If providerSpec.SrcAndDstChecksEnabled is false, check ec2.Instance.SourceDestCheck. If it does not match then return status.Error(codes.Initialization). Check providerSpec.NetworkInterfaces and if Ipv6PrefixCount is not nil, check ec2.Instance.NetworkInterfaces and check if InstanceNetworkInterface.Ipv6Addresses has a non-nil slice. 
If this is not the case, then return status.Error(codes.Initialization). Instance Not Ready Taint Because the creation flow for machines will now be enhanced to correctly support post-creation startup logic, we should not schedule workload until this startup logic is complete. Even without this feature, we have a need for such a taint, as described in MCM#740. We propose a new taint node.machine.sapcloud.io/instance-not-ready, which will be added as a node startup taint in gardener core KubeletConfiguration.RegisterWithTaints. The taint will then be removed by MCM in health check reconciliation, once the machine becomes fully ready (when moving to the Running phase). We will add this taint as part of --ignore-taint in CA. We will introduce a disclaimer / prerequisite in the MCM FAQ to add this taint as part of the kubelet config under --register-with-taints; otherwise, workload could get scheduled before the machine becomes Running. Stage-B Proposal Enhancement of Driver Interface for Hot Updation Kindly refer to the Hot-Update Instances design which provides elaborate detail.\n","categories":"","description":"","excerpt":"Post-Create Initialization of Machine Instance Background Today the …","ref":"/docs/other-components/machine-controller-manager/proposals/initialize-machine/","tags":"","title":"Initialize Machine"},{"body":"Overview This guide walks you through the installation of the latest version of Knative using pre-built images on a Gardener created cluster environment. To set up your own Gardener, see the documentation or have a look at the landscape-setup-template project. To learn more about this open source project, read the blog on kubernetes.io.\nPrerequisites Knative requires a Kubernetes cluster v1.15 or newer.\nSteps Install and Configure kubectl If you already have kubectl CLI, run kubectl version --short to check the version. You need v1.10 or newer. 
If your kubectl is older, follow the next step to install a newer version.\n Install the kubectl CLI.\n Access Gardener Create a project in the Gardener dashboard. This will essentially create a Kubernetes namespace with the name garden-\u003cmy-project\u003e.\n Configure access to your Gardener project using a kubeconfig.\nIf you are not the Gardener Administrator already, you can create a technical user in the Gardener dashboard. Go to the “Members” section and add a service account. You can then download the kubeconfig for your project. You can skip this step if you create your cluster using the user interface; it is only needed for programmatic access. If you use programmatic access, make sure you set export KUBECONFIG=garden-my-project.yaml in your shell.\n Creating a Kubernetes Cluster You can create your cluster using kubectl CLI by providing a cluster specification yaml file. You can find an example for GCP in the gardener/gardener repository. Make sure the namespace matches that of your project. Then just apply the prepared so-called “shoot” cluster CRD with kubectl:\nkubectl apply --filename my-cluster.yaml The easier alternative is to create the cluster following the cluster creation wizard in the Gardener dashboard. Configure kubectl for Your Cluster You can now download the kubeconfig for your freshly created cluster in the Gardener dashboard or via the CLI as follows:\nkubectl --namespace shoot--my-project--my-cluster get secret kubecfg --output jsonpath={.data.kubeconfig} | base64 --decode \u003e my-cluster.yaml This kubeconfig file has full administrator access to your cluster. For the rest of this guide, be sure you have export KUBECONFIG=my-cluster.yaml set.\nInstalling Istio Knative depends on Istio. 
If your cloud platform offers a managed Istio installation, we recommend installing Istio that way, unless you need the ability to customize your installation.\nOtherwise, see the Installing Istio for Knative guide to install Istio.\nYou must install Istio on your Kubernetes cluster before continuing with these instructions to install Knative.\nInstalling cluster-local-gateway for Serving Cluster-Internal Traffic If you installed Istio, you can install a cluster-local-gateway within your Knative cluster so that you can serve cluster-internal traffic. If you want to configure your revisions to use routes that are visible only within your cluster, install and use the cluster-local-gateway.\nInstalling Knative The following commands install all available Knative components as well as the standard set of observability plugins. Knative’s installation guide - Installing Knative.\n If you are upgrading from Knative 0.3.x: Update your domain and static IP address to be associated with the LoadBalancer istio-ingressgateway instead of knative-ingressgateway. Then run the following to clean up leftover resources:\nkubectl delete svc knative-ingressgateway -n istio-system kubectl delete deploy knative-ingressgateway -n istio-system If you have the Knative Eventing Sources component installed, you will also need to delete the following resource before upgrading:\nkubectl delete statefulset/controller-manager -n knative-sources While the deletion of this resource during the upgrade process will not prevent modifications to Eventing Source resources, those changes will not be completed until the upgrade process finishes.\n To install Knative, first install the CRDs by running the kubectl apply command once with the -l knative.dev/crd-install=true flag. 
This prevents race conditions during the install, which cause intermittent errors:\nkubectl apply --selector knative.dev/crd-install=true \\ --filename https://github.com/knative/serving/releases/download/v0.12.1/serving.yaml \\ --filename https://github.com/knative/eventing/releases/download/v0.12.1/eventing.yaml \\ --filename https://github.com/knative/serving/releases/download/v0.12.1/monitoring.yaml To complete the installation of Knative and its dependencies, run the kubectl apply command again, this time without the --selector flag:\nkubectl apply --filename https://github.com/knative/serving/releases/download/v0.12.1/serving.yaml \\ --filename https://github.com/knative/eventing/releases/download/v0.12.1/eventing.yaml \\ --filename https://github.com/knative/serving/releases/download/v0.12.1/monitoring.yaml Monitor the Knative components until all of the components show a STATUS of Running:\nkubectl get pods --namespace knative-serving kubectl get pods --namespace knative-eventing kubectl get pods --namespace knative-monitoring Set Your Custom Domain Fetch the external IP or CNAME of the knative-ingressgateway: kubectl --namespace istio-system get service knative-ingressgateway NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE knative-ingressgateway LoadBalancer 100.70.219.81 35.233.41.212 80:32380/TCP,443:32390/TCP,32400:32400/TCP 4d Create a wildcard DNS entry in your custom domain to point to the above IP or CNAME: *.knative.\u003cmy domain\u003e == A 35.233.41.212 # or CNAME if you are on AWS *.knative.\u003cmy domain\u003e == CNAME a317a278525d111e89f272a164fd35fb-1510370581.eu-central-1.elb.amazonaws.com Adapt your Knative config-domain (set your domain in the data field): kubectl --namespace knative-serving get configmaps config-domain --output yaml apiVersion: v1 data: knative.\u003cmy domain\u003e: \"\" kind: ConfigMap name: config-domain namespace: knative-serving What’s Next Now that your cluster has Knative installed, you can see what Knative has to 
offer.\nDeploy your first app with the Getting Started with Knative App Deployment guide.\nGet started with Knative Eventing by walking through one of the Eventing Samples.\nInstall Cert-Manager if you want to use the automatic TLS cert provisioning feature.\nCleaning Up Use the Gardener dashboard to delete your cluster, or execute the following with kubectl pointing to your garden-my-project.yaml kubeconfig:\nkubectl --kubeconfig garden-my-project.yaml --namespace garden--my-project annotate shoot my-cluster confirmation.gardener.cloud/deletion=true kubectl --kubeconfig garden-my-project.yaml --namespace garden--my-project delete shoot my-cluster ","categories":"","description":"A walkthrough of the steps for installing Knative in Gardener shoot clusters.","excerpt":"A walkthrough of the steps for installing Knative in Gardener shoot …","ref":"/docs/guides/applications/knative-install/","tags":"","title":"Install Knative in Gardener Clusters"},{"body":"Integration tests Usage General setup \u0026 configurations Integration tests for machine-controller-manager-provider-{provider-name} can be executed manually by following the steps below.\n Clone the repository machine-controller-manager-provider-{provider-name} on the local system. Navigate to the machine-controller-manager-provider-{provider-name} directory and create a dev sub-directory in it. If the tags on instances \u0026 associated resources on the provider are of String type (for example, GCP tags on its instances are of type String and not key-value pairs), then add TAGS_ARE_STRINGS := true in the Makefile and export it. For GCP this has already been hard coded in the Makefile. Running the tests There is a rule test-integration in the Makefile of the provider repository, which can be used to start the integration test: $ make test-integration This will ask for additional inputs. Most of them are self-explanatory except: The script assumes that both the control and target clusters have already been created. 
In case of a non-gardener setup (control cluster is not a gardener seed), the name of the machineclass must be test-mc-v1 and the value of providerSpec.secretRef.name should be test-mc-secret. In case of Azure, TARGET_CLUSTER_NAME must be the same as the name of the Azure ResourceGroup for the cluster. If you are deploying the secret manually, a Secret named test-mc-secret (that contains the provider secret and cloud-config) in the default namespace of the Control Cluster should be created. The controllers log files (mcm_process.log and mc_process.log) are stored in .ci/controllers-test/logs repo and can be used later. Adding Integration Tests for new providers For a new provider, running the integration tests works with no changes. But for the orphan resource test cases to work correctly, the provider-specific API calls and the Resource Tracker Interface (RTI) should be implemented. Please check machine-controller-manager-provider-aws for reference.\nExtending integration tests Update ControllerTests to extend the test cases for all providers. Common test cases for machine|machineDeployment creation|deletion|scaling are packaged into ControllerTests. To extend the provider-specific test cases, the changes should be done in the machine-controller-manager-provider-{provider-name} repository. For example, to extend the test cases for machine-controller-manager-provider-aws, make changes to test/integration/controller/controller_test.go inside the machine-controller-manager-provider-aws repository. commons contains the Cluster and Clientset objects that make it easy to extend the tests. ","categories":"","description":"","excerpt":"Integration tests Usage General setup \u0026 configurations Integration …","ref":"/docs/other-components/machine-controller-manager/integration_tests/","tags":"","title":"Integration Tests"},{"body":"Introduction When transferring data among networked systems, trust is a central concern. 
In particular, when communicating over an untrusted medium such as the internet, it is critical to ensure the integrity and immutability of all the data a system operates on. Especially if you use Docker Engine to push and pull images (data) to a public registry.\nThis immutability offers you a guarantee that any and all containers that you instantiate will be absolutely identical at inception. Surprise surprise, deterministic operations.\nA Lesson in Deterministic Ops Docker Tags are about as reliable and disposable as this guy down here.\nSeems simple enough. You have probably already deployed hundreds of YAML’s or started endless counts of Docker containers.\ndocker run --name mynginx1 -P -d nginx:1.13.9 or\napiVersion: apps/v1 kind: Deployment metadata: name: rss-site spec: replicas: 1 selector: matchLabels: app: web template: metadata: labels: app: web spec: containers: - name: front-end image: nginx:1.13.9 ports: - containerPort: 80 But Tags are mutable and humans are prone to error. Not a good combination. Here, we’ll dig into why the use of tags can be dangerous and how to deploy your containers across a pipeline and across environments with determinism in mind.\nLet’s say that you want to ensure that whether it’s today or 5 years from now, that specific deployment uses the very same image that you have defined. Any updates or newer versions of an image should be executed as a new deployment. The solution: digest\nA digest takes the place of the tag when pulling an image. For example, to pull the above image by digest, run the following command:\ndocker run --name mynginx1 -P -d nginx@sha256:4771d09578c7c6a65299e110b3ee1c0a2592f5ea2618d23e4ffe7a4cab1ce5de You can now make sure that the same image is always loaded at every deployment. It doesn’t matter if the TAG of the image has been changed or not. This solves the problem of repeatability.\nContent Trust However, there’s an additionally hidden danger. 
It is possible for an attacker to replace a server image with another one infected with malware.\nDocker Content trust gives you the ability to verify both the integrity and the publisher of all the data received from a registry over any channel.\nPrior to version 1.8, Docker didn’t have a way to verify the authenticity of a server image. But in v1.8, a new feature called Docker Content Trust was introduced to automatically sign and verify the signature of a publisher.\nSo, as soon as a server image is downloaded, it is cross-checked with the signature of the publisher to see if someone tampered with it in any way. This solves the problem of trust.\nIn addition, you should scan all images for known vulnerabilities.\n","categories":"","description":"Ensure that you always get the right image","excerpt":"Ensure that you always get the right image","ref":"/docs/guides/applications/content_trust/","tags":"","title":"Integrity and Immutability"},{"body":"IPv6 in Gardener Clusters 🚧 IPv6 networking is currently under development.\n IPv6 Single-Stack Networking GEP-21 proposes IPv6 Single-Stack Support in the local Gardener environment. This documentation will be enhanced while implementing GEP-21, see gardener/gardener#7051.\nTo use IPv6 single-stack networking, the feature gate IPv6SingleStack must be enabled on gardener-apiserver and gardenlet.\nDevelopment/Testing Setup Developing or testing IPv6-related features requires a Linux machine (docker only supports IPv6 on Linux) and native IPv6 connectivity to the internet. 
If you’re on a different OS or don’t have IPv6 connectivity in your office environment or via your home ISP, make sure to check out gardener-community/dev-box-gcp, which allows you to circumvent these limitations.\nTo get started with the IPv6 setup and create a local IPv6 single-stack shoot cluster, run the following commands:\nmake kind-up gardener-up IPFAMILY=ipv6 k apply -f example/provider-local/shoot-ipv6.yaml Please also take a look at the guide on Deploying Gardener Locally for more details on setting up an IPv6 gardener for testing or development purposes.\nContainer Images If you plan on using custom images, make sure your registry supports IPv6 access.\nCheck the component checklist for tips concerning container registries and how to handle their IPv6 support.\n","categories":"","description":"","excerpt":"IPv6 in Gardener Clusters 🚧 IPv6 networking is currently under …","ref":"/docs/gardener/ipv6/","tags":"","title":"Ipv6"},{"body":"Istio Istio offers a service mesh implementation with focus on several important features - traffic, observability, security, and policy.\nPrerequisites Third-party JWT is used, therefore each Seed cluster where this feature is enabled must have Service Account Token Volume Projection enabled. Kubernetes 1.16+ Differences with Istio’s Default Profile The default profile which is recommended for production deployment, is not suitable for the Gardener use case, as it offers more functionality than desired. The current installation goes through heavy refactorings due to the IstioOperator and the mixture of Helm values + Kubernetes API specification makes configuring and fine-tuning it very hard. A more simplistic deployment is used by Gardener. The differences are the following:\n Telemetry is not deployed. istiod is deployed. istio-ingress-gateway is deployed in a separate istio-ingress namespace. istio-egress-gateway is not deployed. None of the Istio addons are deployed. Mixer (deprecated) is not deployed. 
Mixer CRDs are not deployed. Kubernetes Service, Istio’s VirtualService and ServiceEntry are NOT advertised in the service mesh. This means that if a Service needs to be accessed directly from the Istio Ingress Gateway, it should have the networking.istio.io/exportTo: \"*\" annotation. VirtualService and ServiceEntry must have .spec.exportTo: [\"*\"] set on them respectively. Istio injector is not enabled. mTLS is enabled by default. Handling Multiple Availability Zones with Istio For various reasons, e.g., improved resiliency to certain failures, it may be beneficial to use multiple availability zones in a seed cluster. While availability zones have advantages in being able to cover some failure domains, they also come with some additional challenges. Most notably, the latency across availability zone boundaries is higher than within an availability zone. Furthermore, there might be additional cost implied by network traffic crossing an availability zone boundary. Therefore, it may be useful to try to keep traffic within an availability zone if possible. The istio deployment as part of Gardener has been adapted to allow this.\nA seed cluster spanning multiple availability zones may be used for highly-available shoot control planes. Those control planes may use a single or multiple availability zones. In addition to that, ordinary non-highly-available shoot control planes may be scheduled to such a seed cluster as well. The result is that the seed cluster may have control planes spanning multiple availability zones and control planes that are pinned to exactly one availability zone. These two types need to be handled differently when trying to prevent unnecessary cross-zonal traffic.\nThe goal is achieved by using multiple istio ingress gateways. The default istio ingress gateway spans all availability zones. It is used for multi-zonal shoot control planes. 
For each availability zone, there is an additional istio ingress gateway, which is utilized only for single-zone shoot control planes pinned to this availability zone. This is illustrated in the following diagram.\nPlease note that operators may need to perform additional tuning to prevent cross-zonal traffic completely. The loadbalancer settings in the seed specification offer various options, e.g., setting the external traffic policy to local or using infrastructure-specific loadbalancer annotations.\nFurthermore, note that this approach is also taken in case ExposureClasses are used. For each exposure class, additional zonal istio ingress gateways may be deployed to cover for single-zone shoot control planes using the exposure class.\n","categories":"","description":"","excerpt":"Istio Istio offers a service mesh implementation with focus on several …","ref":"/docs/gardener/istio/","tags":"","title":"Istio"},{"body":"Using annotated Istio Gateway and/or Istio Virtual Service as Source This tutorial describes how to use annotated Istio Gateway resources as source for Certificate resources.\nInstall Istio on your cluster Follow the Istio Getting Started to download and install Istio.\nThese are the typical commands for the istio demo installation\nexport KUBECONFIG=... 
curl -L https://istio.io/downloadIstio | sh - istioctl install --set profile=demo -y kubectl label namespace default istio-injection=enabled Note: If you are using a KinD cluster, the istio-ingressgateway service may be pending forever.\n$ kubectl -n istio-system get svc istio-ingressgateway NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE istio-ingressgateway LoadBalancer 10.96.88.189 \u003cpending\u003e 15021:30590/TCP,80:30185/TCP,443:30075/TCP,31400:30129/TCP,15443:30956/TCP 13m In this case, you may patch the status for demo purposes (of course it still would not accept connections)\nkubectl -n istio-system patch svc istio-ingressgateway --type=merge --subresource status --patch '{\"status\":{\"loadBalancer\":{\"ingress\":[{\"ip\":\"1.2.3.4\"}]}}}' Verify that Istio Gateway/VirtualService Source works Install a sample service With automatic sidecar injection:\n$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/httpbin/httpbin.yaml Using a Gateway as a source Create an Istio Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1 kind: Gateway metadata: name: httpbin-gateway namespace: istio-system annotations: #cert.gardener.cloud/dnsnames: \"*.example.com\" # alternative if you want to control the dns names explicitly. 
cert.gardener.cloud/purpose: managed spec: selector: istio: ingressgateway # use Istio default gateway implementation servers: - port: number: 443 name: http protocol: HTTPS hosts: - \"httpbin.example.com\" # this is used by the dns-controller-manager to extract DNS names tls: credentialName: my-tls-secret EOF You should now see a created Certificate resource similar to:\n$ kubectl -n istio-system get cert -oyaml apiVersion: v1 items: - apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: generateName: httpbin-gateway-gateway- name: httpbin-gateway-gateway-hdbjb namespace: istio-system ownerReferences: - apiVersion: networking.istio.io/v1 blockOwnerDeletion: true controller: true kind: Gateway name: httpbin-gateway spec: commonName: httpbin.example.com secretName: my-tls-secret status: ... kind: List metadata: resourceVersion: \"\" Using a VirtualService as a source If the Gateway resource is annotated with cert.gardener.cloud/purpose: managed, hosts from all referencing VirtualServices resources are automatically extracted. 
These resources don’t need an additional annotation.\nCreate an Istio Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1 kind: Gateway metadata: name: httpbin-gateway namespace: istio-system annotations: cert.gardener.cloud/purpose: managed spec: selector: istio: ingressgateway # use Istio default gateway implementation servers: - port: number: 443 name: https protocol: HTTPS hosts: - \"*\" tls: credentialName: my-tls-secret EOF Configure routes for traffic entering via the Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1 kind: VirtualService metadata: name: httpbin namespace: default spec: hosts: - \"httpbin.example.com\" # this is used by dns-controller-manager to extract DNS names gateways: - istio-system/httpbin-gateway http: - match: - uri: prefix: /status - uri: prefix: /delay route: - destination: port: number: 8000 host: httpbin EOF This should show a Certificate resource similar to the one above.\n","categories":"","description":"","excerpt":"Using annotated Istio Gateway and/or Istio Virtual Service as Source …","ref":"/docs/extensions/others/gardener-extension-shoot-cert-service/tutorials/istio-gateways/","tags":"","title":"Istio Gateways"},{"body":"Using annotated Istio Gateway and/or Istio Virtual Service as Source This tutorial describes how to use annotated Istio Gateway resources as source for DNSEntries with the Gardener shoot-dns-service extension.\nInstall Istio on your cluster Using a new or existing shoot cluster, follow the Istio Getting Started to download and install Istio.\nThese are the typical commands for the istio demo installation\nexport KUBECONFIG=... 
curl -L https://istio.io/downloadIstio | sh - istioctl install --set profile=demo -y kubectl label namespace default istio-injection=enabled Verify that Istio Gateway/VirtualService Source works Install a sample service With automatic sidecar injection:\n$ kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.20/samples/httpbin/httpbin.yaml Using a Gateway as a source Create an Istio Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1alpha3 kind: Gateway metadata: name: httpbin-gateway namespace: istio-system annotations: dns.gardener.cloud/dnsnames: \"*\" dns.gardener.cloud/class: garden spec: selector: istio: ingressgateway # use Istio default gateway implementation servers: - port: number: 80 name: http protocol: HTTP hosts: - \"httpbin.example.com\" # this is used by the dns-controller-manager to extract DNS names EOF Configure routes for traffic entering via the Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1alpha3 kind: VirtualService metadata: name: httpbin namespace: default spec: hosts: - \"httpbin.example.com\" # this is also used by the dns-controller-manager to extract DNS names gateways: - istio-system/httpbin-gateway http: - match: - uri: prefix: /status - uri: prefix: /delay route: - destination: port: number: 8000 host: httpbin EOF You should now see events in the namespace of the gateway:\n$ kubectl -n istio-system get events --sort-by={.metadata.creationTimestamp} LAST SEEN TYPE REASON OBJECT MESSAGE ... 
38s Normal dns-annotation gateway/httpbin-gateway httpbin.example.com: created dns entry object shoot--foo--bar/httpbin-gateway-gateway-zpf8n 38s Normal dns-annotation gateway/httpbin-gateway httpbin.example.com: dns entry pending: waiting for dns reconciliation 38s Normal dns-annotation gateway/httpbin-gateway httpbin.example.com: dns entry is pending 36s Normal dns-annotation gateway/httpbin-gateway httpbin.example.com: dns entry active Using a VirtualService as a source If the Gateway resource is annotated with dns.gardener.cloud/dnsnames: \"*\", hosts from all referencing VirtualService resources are automatically extracted. These resources don’t need an additional annotation.\nCreate an Istio Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1alpha3 kind: Gateway metadata: name: httpbin-gateway namespace: istio-system annotations: dns.gardener.cloud/dnsnames: \"*\" dns.gardener.cloud/class: garden spec: selector: istio: ingressgateway # use Istio default gateway implementation servers: - port: number: 80 name: http protocol: HTTP hosts: - \"*\" EOF Configure routes for traffic entering via the Gateway: $ cat \u003c\u003cEOF | kubectl apply -f - apiVersion: networking.istio.io/v1alpha3 kind: VirtualService metadata: name: httpbin namespace: default spec: hosts: - \"httpbin.example.com\" # this is used by dns-controller-manager to extract DNS names gateways: - istio-system/httpbin-gateway http: - match: - uri: prefix: /status - uri: prefix: /delay route: - destination: port: number: 8000 host: httpbin EOF This should show similar events as above.\nTo get the targets for the extracted DNS names, the shoot-dns-service controller is able to gather information from the Kubernetes service of the Istio Ingress Gateway.\nNote: It is also possible to set the targets by specifying an Ingress resource using the dns.gardener.cloud/ingress annotation on the Istio Ingress Gateway resource.\nNote: It is also possible to set the targets 
manually by using the dns.gardener.cloud/targets annotation on the Istio Ingress Gateway resource.\nAccess the sample service using curl $ curl -I http://httpbin.example.com/status/200 HTTP/1.1 200 OK server: istio-envoy date: Tue, 13 Feb 2024 07:49:37 GMT content-type: text/html; charset=utf-8 access-control-allow-origin: * access-control-allow-credentials: true content-length: 0 x-envoy-upstream-service-time: 15 Accessing any other URL that has not been explicitly exposed should return an HTTP 404 error:\n$ curl -I http://httpbin.example.com/headers HTTP/1.1 404 Not Found date: Tue, 13 Feb 2024 08:09:41 GMT server: istio-envoy transfer-encoding: chunked ","categories":"","description":"","excerpt":"Using annotated Istio Gateway and/or Istio Virtual Service as Source …","ref":"/docs/extensions/others/gardener-extension-shoot-dns-service/tutorials/istio-gateways/","tags":"","title":"Istio Gateways"},{"body":"Overview Use the Kubernetes command-line tool, kubectl, to deploy and manage applications on Kubernetes. Using kubectl, you can inspect cluster resources, as well as create, delete, and update components.\nBy default, the kubectl configuration is located at ~/.kube/config.\nLet us suppose that you have two clusters, one for development work and one for scratch work.\nHow to handle this easily without copying the used configuration always to the right place?\nExport the KUBECONFIG Environment Variable bash$ export KUBECONFIG=\u003cPATH-TO-M\u003e-CONFIG\u003e/kubeconfig-dev.yaml How to determine which cluster is used by the kubectl command?\nDetermine Active Cluster bash$ kubectl cluster-info Kubernetes master is running at https://api.dev.garden.shoot.canary.k8s-hana.ondemand.com KubeDNS is running at https://api.dev.garden.shoot.canary.k8s-hana.ondemand.com/api/v1/proxy/namespaces/kube-system/services/kube-dns To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'. 
bash$ Display Cluster in the bash - Linux and Alike I found this tip on Stackoverflow and think it is worth adding here.\nEdit your ~/.bash_profile and add the following code snippet to show the current K8s context in the shell’s prompt:\nprompt_k8s(){ k8s_current_context=$(kubectl config current-context 2\u003e /dev/null) if [[ $? -eq 0 ]] ; then echo -e \"(${k8s_current_context}) \"; fi } PS1+='$(prompt_k8s)' After this, your bash command prompt contains the active KUBECONFIG context and you always know which cluster is active - develop or production.\nFor example:\nbash$ export KUBECONFIG=/Users/d023280/Documents/workspace/gardener-ui/kubeconfig_gardendev.yaml bash (garden_dev)$ Note the (garden_dev) prefix in the bash command prompt.\nThis helps immensely to avoid thoughtless mistakes.\nDisplay Cluster in the PowerShell - Windows Display the current K8s cluster in the title of the PowerShell window.\nCreate a profile file for your shell under %UserProfile%\\Documents\\WindowsPowerShell\\Microsoft.PowerShell_profile.ps1\nCopy the following code to Microsoft.PowerShell_profile.ps1\n function prompt_k8s { $k8s_current_context = (kubectl config current-context) | Out-String if($?) { return $k8s_current_context }else { return \"No K8S context found\" } } $host.ui.rawui.WindowTitle = prompt_k8s If you want to switch to a different cluster, you can set KUBECONFIG to a new value, and re-run the file Microsoft.PowerShell_profile.ps1\n","categories":"","description":"Expose the active kubeconfig into bash","excerpt":"Expose the active kubeconfig into bash","ref":"/docs/guides/client-tools/bash-kubeconfig/","tags":"","title":"Kubeconfig Context as bash Prompt"},{"body":"This HowTo covers common Kubernetes antipatterns that we have seen over the past months.\nRunning as Root User Whenever possible, do not run containers as root user. One could be tempted to say that Kubernetes pods and nodes are well separated. However, the host and the containers running on it share the same kernel. 
If a container is compromised, the root user in the container has full control over the underlying node.\nWatch the very good presentation by Liz Rice at the KubeCon 2018\n Use RUN groupadd -r anygroup \u0026\u0026 useradd -r -g anygroup myuser to create a group and add a user to it. Use the USER instruction to switch to this user. Note that you may also consider providing an explicit UID/GID if required.\nFor example:\nARG GF_UID=\"500\" ARG GF_GID=\"500\" # add group \u0026 user RUN groupadd -r -g $GF_GID appgroup \u0026\u0026 \\ useradd appuser -r -u $GF_UID -g appgroup USER appuser Store Data or Logs in Containers Containers are ideal for stateless applications and should be transient. This means that no data or logs should be stored in the container, as they are lost when the container is terminated. Use persistent volumes instead to persist data outside of containers. Using an ELK stack is another good option for storing and processing logs.\nUsing Pod IP Addresses Each pod is assigned an IP address. It is necessary for pods to communicate with each other to build an application, e.g. an application must communicate with a database. Existing pods are terminated and new pods are constantly started. If you rely on the IP address of a pod or container, you need to update the application configuration constantly. This makes the application fragile.\nCreate services instead. They provide a logical name that can be assigned independently of the varying number and IP addresses of containers. Services are the basic concept for load balancing within Kubernetes.\nMore Than One Process in a Container A docker file provides a CMD and ENTRYPOINT to start the image. CMD is often used around a script that makes a configuration and then starts the container. Do not try to start multiple processes with this script. It is important to consider the separation of concerns when creating docker images. 
Running multiple processes in a single pod makes managing your containers, collecting logs and updating each process more difficult.\nYou can split the image into multiple containers and manage them independently - even in one pod. Bear in mind that Kubernetes only monitors the process with PID=1. If more than one process is started within a container, then these no longer fall under the control of Kubernetes.\nCreating Images in a Running Container A new image can be created with the docker commit command. This is useful if changes have been made to the container and you want to persist them for later error analysis. However, images created like this are not reproducible and completely worthless for a CI/CD environment. Furthermore, another developer cannot recognize which components the image contains. Instead, always make changes to the docker file, close existing containers and start a new container with the updated image.\nSaving Passwords in a docker Image 💀 Do not save passwords in a Docker file! They are in plain text and are checked into a repository. That makes them completely vulnerable even if you are using a private repository like the Artifactory.\nAlways use Secrets or ConfigMaps to provision passwords or inject them by mounting a persistent volume.\nUsing the ’latest’ Tag Starting an image with tomcat is tempting. If no tags are specified, a container is started with the tomcat:latest image. This image may no longer be up to date and refer to an older version instead. Running a production application requires complete control of the environment with exact versions of the image.\nMake sure you always use a tag or even better the sha256 hash of the image, e.g., tomcat@sha256:c34ce3c1fcc0c7431e1392cc3abd0dfe2192ffea1898d5250f199d3ac8d8720f.\nWhy Use the sha256 Hash? Tags are not immutable and can be overwritten by a developer at any time. 
In this case you don’t have complete control over your image - which is bad.\nDifferent Images per Environment Don’t create different images for development, testing, staging and production environments. The image should be the source of truth and should only be created once and pushed to the repository. This image:tag should be used for different environments in the future.\nDepend on Start Order of Pods Applications often depend on containers being started in a certain order. For example, a database container must be up and running before an application can connect to it. The application should be resilient to such changes, as the db pod can be unreachable or restarted at any time. The application container should be able to handle such situations without terminating or crashing.\nAdditional Anti-Patterns and Patterns In the community, vast experience has been collected to improve the stability and usability of Docker and Kubernetes.\nRefer to Kubernetes Production Patterns for more information.\n","categories":"","description":"Common antipatterns for Kubernetes and Docker","excerpt":"Common antipatterns for Kubernetes and Docker","ref":"/docs/guides/applications/antipattern/","tags":"","title":"Kubernetes Antipatterns"},{"body":"Kubernetes Clients in Gardener This document aims at providing a general developer guideline on different aspects of using Kubernetes clients in a large-scale distributed system and project like Gardener. The points included here are not meant to be consulted as absolute rules, but rather as general rules of thumb that allow developers to get a better feeling about certain gotchas and caveats. It should be updated with lessons learned from maintaining the project and running Gardener in production.\nPrerequisites: Please familiarize yourself with the following basic Kubernetes API concepts first, if you’re new to Kubernetes. 
A good understanding of these basics will help you better comprehend the following document.\n Kubernetes API Concepts (including terminology, watch basics, etc.) Extending the Kubernetes API (including Custom Resources and aggregation layer / extension API servers) Extend the Kubernetes API with CustomResourceDefinitions Working with Kubernetes Objects Sample Controller (the diagram helps to build an understanding of a controller’s basic structure) Client Types: Client-Go, Generated, Controller-Runtime For historical reasons, you will find different kinds of Kubernetes clients in Gardener:\nClient-Go Clients client-go is the default/official client for talking to the Kubernetes API in Golang. It features the so-called “client sets” for all built-in Kubernetes API groups and versions (e.g. v1 (aka core/v1), apps/v1). client-go clients are generated from the built-in API types using client-gen and are composed of interfaces for every known API GroupVersionKind. A typical client-go usage looks like this:\nvar ( ctx context.Context c kubernetes.Interface // \"k8s.io/client-go/kubernetes\" deployment *appsv1.Deployment // \"k8s.io/api/apps/v1\" ) updatedDeployment, err := c.AppsV1().Deployments(\"default\").Update(ctx, deployment, metav1.UpdateOptions{}) Important characteristics of client-go clients:\n clients are specific to a given API GroupVersionKind, i.e., clients are hard-coded to corresponding API-paths (don’t need to use the discovery API to map GVK to a REST endpoint path). clients don’t modify the passed in-memory object (e.g. deployment in the above example). Instead, they return a new in-memory object. This means that controllers have to continue working with the new in-memory object or overwrite the shared object to not lose any state updates. 
Generated Client Sets for Gardener APIs Gardener’s APIs extend the Kubernetes API by registering an extension API server (in the garden cluster) and CustomResourceDefinitions (on Seed clusters), meaning that the Kubernetes API will expose additional REST endpoints to manage Gardener resources in addition to the built-in API resources. In order to talk to these extended APIs in our controllers and components, client-gen is used to generate client-go-style clients to pkg/client/{core,extensions,seedmanagement,...}.\nUsage of these clients is equivalent to client-go clients, and the same characteristics apply. For example:\nvar ( ctx context.Context c gardencoreclientset.Interface // \"github.com/gardener/gardener/pkg/client/core/clientset/versioned\" shoot *gardencorev1beta1.Shoot // \"github.com/gardener/gardener/pkg/apis/core/v1beta1\" ) updatedShoot, err := c.CoreV1beta1().Shoots(\"garden-my-project\").Update(ctx, shoot, metav1.UpdateOptions{}) Controller-Runtime Clients controller-runtime is a Kubernetes community project (kubebuilder subproject) for building controllers and operators for custom resources. Therefore, it features a generic client that follows a different approach and does not rely on generated client sets. Instead, the client can be used for managing any Kubernetes resources (built-in or custom) homogeneously. For example:\nvar ( ctx context.Context c client.Client // \"sigs.k8s.io/controller-runtime/pkg/client\" deployment *appsv1.Deployment // \"k8s.io/api/apps/v1\" shoot *gardencorev1beta1.Shoot // \"github.com/gardener/gardener/pkg/apis/core/v1beta1\" ) err := c.Update(ctx, deployment) // or err = c.Update(ctx, shoot) A brief introduction to controller-runtime and its basic constructs can be found at the official Go documentation.\nImportant characteristics of controller-runtime clients:\n The client functions take a generic client.Object or client.ObjectList value. 
These interfaces are implemented by all Golang types, that represent Kubernetes API objects or lists respectively which can be interacted with via usual API requests. [1] The client first consults a runtime.Scheme (configured during client creation) for recognizing the object’s GroupVersionKind (this happens on the client-side only). A runtime.Scheme is basically a registry for Golang API types, defaulting and conversion functions. Schemes are usually provided per GroupVersion (see this example for apps/v1) and can be combined to one single scheme for further usage (example). In controller-runtime clients, schemes are used only for mapping a typed API object to its GroupVersionKind. It then consults a meta.RESTMapper (also configured during client creation) for mapping the GroupVersionKind to a RESTMapping, which contains the GroupVersionResource and Scope (namespaced or cluster-scoped). From these values, the client can unambiguously determine the REST endpoint path of the corresponding API resource. For instance: appsv1.DeploymentList is available at /apis/apps/v1/deployments or /apis/apps/v1/namespaces/\u003cnamespace\u003e/deployments respectively. There are different RESTMapper implementations, but generally they are talking to the API server’s discovery API for retrieving RESTMappings for all API resources known to the API server (either built-in, registered via API extension or CustomResourceDefinitions). The default implementation of a controller-runtime (which Gardener uses as well) is the dynamic RESTMapper. It caches discovery results (i.e. RESTMappings) in-memory and only re-discovers resources from the API server when a client tries to use an unknown GroupVersionKind, i.e., when it encounters a No{Kind,Resource}MatchError. The client writes back results from the API server into the passed in-memory object. This means that controllers don’t have to worry about copying back the results and should just continue to work on the given in-memory object. 
This is a nice and flexible pattern, and helper functions should try to follow it wherever applicable. Meaning, if possible accept an object param, pass it down to clients and keep working on the same in-memory object instead of creating a new one in your helper function. The benefit is that you don’t lose updates to the API object and always have the last-known state in memory. Therefore, you don’t have to read it again, e.g., for getting the current resourceVersion when working with optimistic locking, and thus minimize the chances for running into conflicts. However, controllers must not use the same in-memory object concurrently in multiple goroutines. For example, decoding results from the API server in multiple goroutines into the same maps (e.g., labels, annotations) will cause panics because of “concurrent map writes”. Also, reading from an in-memory API object in one goroutine while decoding into it in another goroutine will yield non-atomic reads, meaning data might be corrupt and represent a non-valid/non-existing API object. Therefore, if you need to use the same in-memory object in multiple goroutines concurrently (e.g., shared state), remember to leverage proper synchronization techniques like channels, mutexes, atomic.Value and/or copy the object prior to use. The average controller however, will not need to share in-memory API objects between goroutines, and it’s typically an indicator that the controller’s design should be improved. The client decoder erases the object’s TypeMeta (apiVersion and kind fields) after retrieval from the API server, see kubernetes/kubernetes#80609, kubernetes-sigs/controller-runtime#1517. Unstructured and metadata-only requests objects are an exception to this because the contained TypeMeta is the only way to identify the object’s type. Because of this behavior, obj.GetObjectKind().GroupVersionKind() is likely to return an empty GroupVersionKind. 
I.e., you must not rely on TypeMeta being set or GetObjectKind() to return something usable. If you need to identify an object’s GroupVersionKind, use a scheme and its ObjectKinds function instead (or the helper function apiutil.GVKForObject). This is not specific to controller-runtime clients and applies to client-go clients as well. [1] Other lower level, config or internal API types (e.g., such as AdmissionReview) don’t implement client.Object. However, you also can’t interact with such objects via the Kubernetes API and thus also not via a client, so this can be disregarded at this point.\nMetadata-Only Clients Additionally, controller-runtime clients can be used to easily retrieve metadata-only objects or lists. This is useful for efficiently checking if at least one object of a given kind exists, or retrieving metadata of an object, if one is not interested in the rest (e.g., spec/status). The Accept header sent to the API server then contains application/json;as=PartialObjectMetadataList;g=meta.k8s.io;v=v1, which makes the API server only return metadata of the retrieved object(s). This saves network traffic and CPU/memory load on the API server and client side. If the client fully lists all objects of a given kind including their spec/status, the resulting list can be quite large and easily exceed the controller’s available memory. 
That’s why it’s important to carefully check if a full list is actually needed, or if a metadata-only list can be used instead.\nFor example:\nvar ( ctx context.Context c client.Client // \"sigs.k8s.io/controller-runtime/pkg/client\" shootList = \u0026metav1.PartialObjectMetadataList{} // \"k8s.io/apimachinery/pkg/apis/meta/v1\" ) shootList.SetGroupVersionKind(gardencorev1beta1.SchemeGroupVersion.WithKind(\"ShootList\")) if err := c.List(ctx, shootList, client.InNamespace(\"garden-my-project\"), client.Limit(1)); err != nil { return err } if len(shootList.Items) \u003e 0 { // project has at least one shoot } else { // project doesn't have any shoots } Gardener’s Client Collection, ClientMaps The Gardener codebase has a collection of clients (kubernetes.Interface), which can return all the above mentioned client types. Additionally, it contains helpers for rendering and applying helm charts (ChartRender, ChartApplier) and retrieving the API server’s version (Version). Client sets are managed by so-called ClientMaps, which are a form of registry for all client sets for a given type of cluster, i.e., Garden, Seed and Shoot. ClientMaps manage the whole lifecycle of clients: they take care of creating them if they don’t exist already, running their caches, refreshing their cached server version and invalidating them when they are no longer needed.\nvar ( ctx context.Context cm clientmap.ClientMap // \"github.com/gardener/gardener/pkg/client/kubernetes/clientmap\" shoot *gardencorev1beta1.Shoot ) cs, err := cm.GetClient(ctx, keys.ForShoot(shoot)) // kubernetes.Interface if err != nil { return err } c := cs.Client() // client.Client The client collection mainly exists for historical reasons (there used to be a lot of code using the client-go style clients). However, Gardener is in the process of moving more towards controller-runtime and only using their clients, as they provide many benefits and are much easier to use. 
Also, gardener/gardener#4251 aims at refactoring our controller and admission components to native controller-runtime components.\n ⚠️ Please always prefer controller-runtime clients over other clients when writing new code or refactoring existing code.\n Cache Types: Informers, Listers, Controller-Runtime Caches Similar to the different types of client(set)s, there are also different kinds of Kubernetes client caches. However, all of them are based on the same concept: Informers. An Informer is a watch-based cache implementation, meaning it opens watch connections to the API server and continuously updates cached objects based on the received watch events (ADDED, MODIFIED, DELETED). Informers offer to add indices to the cache for efficient object lookup (e.g., by name or labels) and to add EventHandlers for the watch events. The latter is used by controllers to fill queues with objects that should be reconciled on watch events.\nInformers are used in and created via several higher-level constructs:\nSharedInformerFactories, Listers The generated clients (built-in as well as extended) feature a SharedInformerFactory for every API group, which can be used to create and retrieve Informers for all GroupVersionKinds. Similarly, it can be used to retrieve Listers that allow getting and listing objects from the Informer’s cache. However, both of these constructs are only used for historical reasons, and we are in the process of migrating away from them in favor of cached controller-runtime clients (see gardener/gardener#2414, gardener/gardener#2822). Thus, they are described only briefly here.\nImportant characteristics of Listers:\n Objects read from Informers and Listers can always be slightly out-of-date (i.e., stale) because the client has to first observe changes to API objects via watch events (which can intermittently lag behind by a second or even more). 
Thus, don’t make any decisions based on data read from Listers if the consequences of deciding wrongfully based on stale state might be catastrophic (e.g. leaking infrastructure resources). In such cases, read directly from the API server via a client instead. Objects retrieved from Informers or Listers are pointers to the cached objects, so they must not be modified without copying them first, otherwise the objects in the cache are also modified. Controller-Runtime Caches controller-runtime features a cache implementation that can be used equivalently as their clients. In fact, it implements a subset of the client.Client interface containing the Get and List functions. Under the hood, a cache.Cache dynamically creates Informers (i.e., opens watches) for every object GroupVersionKind that is being retrieved from it.\nNote that the underlying Informers of a controller-runtime cache (cache.Cache) and the ones of a SharedInformerFactory (client-go) are not related in any way. Both create Informers and watch objects on the API server individually. This means that if you read the same object from different cache implementations, you may receive different versions of the object because the watch connections of the individual Informers are not synced.\n ⚠️ Because of this, controllers/reconcilers should get the object from the same cache in the reconcile loop, where the EventHandler was also added to set up the controller. For example, if a SharedInformerFactory is used for setting up the controller then read the object in the reconciler from the Lister instead of from a cached controller-runtime client.\n By default, the client.Client created by a controller-runtime Manager is a DelegatingClient. It delegates Get and List calls to a Cache, and all other calls to a client that talks directly to the API server. 
Exceptions are requests with *unstructured.Unstructured objects and object kinds that were configured to be excluded from the cache in the DelegatingClient.\n ℹ️ kubernetes.Interface.Client() returns a DelegatingClient that uses the cache returned from kubernetes.Interface.Cache() under the hood. This means that all Client() usages need to be ready for cached clients and should be able to cope with stale cache reads.\n Important characteristics of cached controller-runtime clients:\n Like for Listers, objects read from a controller-runtime cache can always be slightly out of date. Hence, don’t base any important decisions on data read from the cache (see above). In contrast to Listers, controller-runtime caches fill the passed in-memory object with the state of the object in the cache (i.e., they perform something like a “deep copy into”). This means that objects read from a controller-runtime cache can safely be modified without unintended side effects. Reading from a controller-runtime cache or a cached controller-runtime client implicitly starts a watch for the given object kind under the hood. This has important consequences: Reading a given object kind from the cache for the first time can take up to a few seconds depending on size and amount of objects as well as API server latency. This is because the cache has to do a full list operation and wait for an initial watch sync before returning results. ⚠️ Controllers need appropriate RBAC permissions for the object kinds they retrieve via cached clients (i.e., list and watch). ⚠️ By default, watches started by a controller-runtime cache are cluster-scoped, meaning it watches and caches objects across all namespaces. Thus, be careful which objects to read from the cache as it might significantly increase the controller’s memory footprint. There is no interaction with the cache on writing calls (Create, Update, Patch and Delete), see below. 
Uncached objects, filtered caches, APIReaders:\nIn order to allow more granular control over which object kinds should be cached and which calls should bypass the cache, controller-runtime offers a few mechanisms to further tweak the client/cache behavior:\n When creating a DelegatingClient, certain object kinds can be configured to always be read directly from the API instead of from the cache. Note that this does not prevent starting a new Informer when retrieving them directly from the cache. Watches can be restricted to a given (set of) namespace(s) by setting cache.Options.Namespaces. Watches can be filtered (e.g., by label) per object kind by configuring cache.Options.SelectorsByObject on creation of the cache. Retrieving metadata-only objects or lists from a cache results in a metadata-only watch/cache for that object kind. The APIReader can be used to always talk directly to the API server for a given Get or List call (use with care and only as a last resort!). To Cache or Not to Cache Although watch-based caches are an important factor for the immense scalability of Kubernetes, they definitely come at a price (mainly in terms of memory consumption). Thus, developers need to be careful when introducing new API calls and caching new object kinds. Here are some general guidelines on choosing whether to read from a cache or not:\n Always try to use the cache wherever possible and make your controller able to tolerate stale reads. Leverage optimistic locking: use deterministic naming for objects you create (this is what the Deployment controller does [2]). Leverage optimistic locking / concurrency control of the API server: send updates/patches with the last-known resourceVersion from the cache (see below). This will make the request fail, if there were concurrent updates to the object (conflict error), which indicates that we have operated on stale data and might have made wrong decisions. 
In this case, let the controller handle the error with exponential backoff. This will make the controller eventually consistent. Track the actions you took, e.g., when creating objects with generateName (this is what the ReplicaSet controller does [3]). The actions can be tracked in memory and repeated if the expected watch events don’t occur after a given amount of time. Always try to write controllers with the assumption that data will only be eventually correct and can be slightly out of date (even if read directly from the API server!). If there is already some other code that needs a cache (e.g., a controller watch), reuse it instead of doing extra direct reads. Don’t read an object again if you just sent a write request. Write requests (Create, Update, Patch and Delete) don’t interact with the cache. Hence, use the current state that the API server returned (filled into the passed in-memory object), which is basically a “free direct read” instead of reading the object again from a cache, because this will probably set back the object to an older resourceVersion. If you are concerned about the impact of the resulting cache, try to minimize that by using filtered or metadata-only watches. If watching and caching an object type is not feasible, for example because there will be a lot of updates, and you are only interested in the object every ~5m, or because it will blow up the controller’s memory footprint, fall back to a direct read. This can either be done by disabling caching the object type generally or doing a single request via an APIReader. In any case, please bear in mind that every direct API call results in a quorum read from etcd, which can be costly in a heavily-utilized cluster and impose significant scalability limits. Thus, always try to minimize the impact of direct calls by filtering results by namespace or labels, limiting the number of results and/or using metadata-only calls. 
[2] The Deployment controller uses the pattern \u003cdeployment-name\u003e-\u003cpodtemplate-hash\u003e for naming ReplicaSets. This means, the name of a ReplicaSet it tries to create/update/delete at any given time is deterministically calculated based on the Deployment object. By this, it is insusceptible to stale reads from its ReplicaSets cache.\n[3] In simple terms, the ReplicaSet controller tracks its CREATE pod actions as follows: when creating new Pods, it increases a counter of expected ADDED watch events for the corresponding ReplicaSet. As soon as such events arrive, it decreases the counter accordingly. It only creates new Pods for a given ReplicaSet once all expected events occurred (counter is back to zero) or a timeout has occurred. This way, it prevents creating more Pods than desired because of stale cache reads and makes the controller eventually consistent.\nConflicts, Concurrency Control, and Optimistic Locking Every Kubernetes API object contains the metadata.resourceVersion field, which identifies an object’s version in the backing data store, i.e., etcd. Every write to an object in etcd results in a newer resourceVersion. This field is mainly used for concurrency control on the API server in an optimistic locking fashion, but also for efficient resumption of interrupted watch connections.\nOptimistic locking in the Kubernetes API sense means that when a client wants to update an API object, then it includes the object’s resourceVersion in the request to indicate the object’s version the modifications are based on. If the resourceVersion in etcd has not changed in the meantime, the update request is accepted by the API server and the updated object is written to etcd. If the resourceVersion sent by the client does not match the one of the object stored in etcd, there were concurrent modifications to the object. 
Consequently, the request is rejected with a conflict error (status code 409, API reason Conflict), for example:\n{ \"kind\": \"Status\", \"apiVersion\": \"v1\", \"metadata\": {}, \"status\": \"Failure\", \"message\": \"Operation cannot be fulfilled on configmaps \\\"foo\\\": the object has been modified; please apply your changes to the latest version and try again\", \"reason\": \"Conflict\", \"details\": { \"name\": \"foo\", \"kind\": \"configmaps\" }, \"code\": 409 } This concurrency control is an important mechanism in Kubernetes as there are typically multiple clients acting on API objects at the same time (humans, different controllers, etc.). If a client receives a conflict error, it should read the object’s latest version from the API server, make the modifications based on the newest changes, and retry the update. The reasoning behind this is that a client might choose to make different decisions based on the concurrent changes made by other actors compared to the outdated version that it operated on.\nImportant points about concurrency control and conflicts:\n The resourceVersion field carries a string value and clients must not assume numeric values (the type and structure of versions depend on the backing data store). This means clients may compare resourceVersion values to detect whether objects were changed. But they must not compare resourceVersions to figure out which one is newer/older, i.e., no greater/less-than comparisons are allowed. By default, update calls (e.g. via client-go and controller-runtime clients) use optimistic locking as the passed in-memory object usually contains the latest resourceVersion known to the controller, which is then also sent to the API server. API servers can also choose to accept update calls without optimistic locking (i.e., without a resourceVersion in the object’s metadata) for any given resource. 
However, sending update requests without optimistic locking is strongly discouraged, as doing so overwrites the entire object, discarding any concurrent changes made to it. On the other side, patch requests can always be executed either with or without optimistic locking, by (not) including the resourceVersion in the patched object’s metadata. Sending patch requests without optimistic locking might be safe and even desirable as a patch typically updates only a specific section of the object. However, there are also situations where patching without optimistic locking is not safe (see below). Don’t Retry on Conflict Similar to how a human would typically handle a conflict error, there are helper functions implementing RetryOnConflict-semantics, i.e., try an update call, then re-read the object if a conflict occurs, apply the modification again and retry the update. However, controllers should generally not use RetryOnConflict-semantics. Instead, controllers should abort their current reconciliation run and let the queue handle the conflict error with exponential backoff. The reasoning behind this is that a conflict error indicates that the controller has operated on stale data and might have made wrong decisions earlier on in the reconciliation. When using a helper function that implements RetryOnConflict-semantics, the controller doesn’t check which fields were changed and doesn’t revise its previous decisions accordingly. Instead, retrying on conflict basically just ignores any conflict error and blindly applies the modification.\nTo properly solve the conflict situation, controllers should immediately return with the error from the update call. This will cause retries with exponential backoff so that the cache has a chance to observe the latest changes to the object. In a later run, the controller will then make correct decisions based on the newest version of the object, not run into conflict errors, and will then be able to successfully reconcile the object. 
This way, the controller becomes eventually consistent.\nThe other way to solve the situation is to modify objects without optimistic locking in order to avoid running into a conflict in the first place (only if this is safe). This can be a preferable solution for controllers with long-running reconciliations (which is actually an anti-pattern but quite unavoidable in some of Gardener’s controllers). Aborting the entire reconciliation run is rather undesirable in such cases, as it will add a lot of unnecessary waiting time for end users and overhead in terms of compute and network usage.\nHowever, in any case, retrying on conflict is probably not the right option to solve the situation (there are some correct use cases for it, though, they are very rare). Hence, don’t retry on conflict.\nTo Lock or Not to Lock As explained before, conflicts are actually important and prevent clients from doing wrongful concurrent updates. This means that conflicts are not something we generally want to avoid or ignore. However, in many cases controllers are exclusive owners of the fields they want to update and thus it might be safe to run without optimistic locking.\nFor example, the gardenlet is the exclusive owner of the spec section of the Extension resources it creates on behalf of a Shoot (e.g., the Infrastructure resource for creating a VPC). Meaning, it knows the exact desired state and no other actor is supposed to update the Infrastructure’s spec fields. When the gardenlet now updates the Infrastructure’s spec section as part of the Shoot reconciliation, it can simply issue a PATCH request that only updates the spec and runs without optimistic locking. If another controller concurrently updated the object in the meantime (e.g., the status section), the resourceVersion got changed, which would cause a conflict error if running with optimistic locking. 
However, concurrent status updates would not change the gardenlet’s mind on the desired spec of the Infrastructure resource as it is determined only by looking at the Shoot’s specification. If the spec section was changed concurrently, it’s still fine to overwrite it because the gardenlet should reconcile the spec back to its desired state.\nGenerally speaking, if a controller is the exclusive owner of a given set of fields and they are independent of concurrent changes to other fields in that object, it can patch these fields without optimistic locking. This might ignore concurrent changes to other fields or blindly overwrite changes to the same fields, but this is fine if the mentioned conditions apply. Obviously, this applies only to patch requests that modify only a specific set of fields but not to update requests that replace the entire object.\nIn such cases, it’s even desirable to run without optimistic locking as it will be more performant and save retries. If certain requests are made with high frequency and have a good chance of causing conflicts, retries because of optimistic locking can cause a lot of additional network traffic in a large-scale Gardener installation.\nUpdates, Patches, Server-Side Apply There are different ways of modifying Kubernetes API objects. 
The following snippet demonstrates how to do a given modification with the most frequently used options using a controller-runtime client:\nvar ( ctx context.Context c client.Client shoot *gardencorev1beta1.Shoot ) // update shoot.Spec.Kubernetes.Version = \"1.26\" err := c.Update(ctx, shoot) // json merge patch patch := client.MergeFrom(shoot.DeepCopy()) shoot.Spec.Kubernetes.Version = \"1.26\" err = c.Patch(ctx, shoot, patch) // strategic merge patch patch = client.StrategicMergeFrom(shoot.DeepCopy()) shoot.Spec.Kubernetes.Version = \"1.26\" err = c.Patch(ctx, shoot, patch) Important characteristics of the shown request types:\n Update requests always send the entire object to the API server and update all fields accordingly. By default, optimistic locking is used (resourceVersion is included). Both patch types run without optimistic locking by default. However, it can be enabled explicitly if needed: // json merge patch + optimistic locking patch := client.MergeFromWithOptions(shoot.DeepCopy(), client.MergeFromWithOptimisticLock{}) // ... // strategic merge patch + optimistic locking patch = client.StrategicMergeFrom(shoot.DeepCopy(), client.MergeFromWithOptimisticLock{}) // ... Patch requests only contain the changes made to the in-memory object between the copy passed to client.*MergeFrom and the object passed to Client.Patch(). The diff is calculated on the client-side based on the in-memory objects only. This means that if in the meantime some fields were changed on the API server to a different value than the one on the client-side, the fields will not be changed back as long as they are not changed on the client-side as well (there will be no diff in memory). Thus, if you want to ensure a given state using patch requests, always read the object first before patching it, as there will be no diff otherwise, meaning the patch will be empty. For more information, see gardener/gardener#4057 and the comments in gardener/gardener#4027. 
Also, always send updates and patch requests even if your controller hasn’t made any changes to the current state on the API server. I.e., don’t make any optimization for preventing empty patches or no-op updates. There might be mutating webhooks in the system that will modify the object and that rely on update/patch requests being sent (even if they are no-op). Gardener’s extension concept makes heavy use of mutating webhooks, so it’s important to keep this in mind. JSON merge patches always replace lists as a whole and don’t merge them. Keep this in mind when operating on lists with merge patch requests. If the controller is the exclusive owner of the entire list, it’s safe to run without optimistic locking. Though, if you want to prevent overwriting concurrent changes to the list or its items made by other actors (e.g., additions/removals to the metadata.finalizers list), enable optimistic locking. Strategic merge patches are able to make more granular modifications to lists and their elements without replacing the entire list. It uses Golang struct tags of the API types to determine which and how lists should be merged. See Update API Objects in Place Using kubectl patch or the strategic merge patch documentation for more in-depth explanations and comparison with JSON merge patches. With this, controllers might be able to issue patch requests for individual list items without optimistic locking, even if they are not exclusive owners of the entire list. Remember to check the patchStrategy and patchMergeKey struct tags of the fields you want to modify before blindly adding patch requests without optimistic locking. Strategic merge patches are only supported by built-in Kubernetes resources and custom resources served by Extension API servers. Strategic merge patches are not supported by custom resources defined by CustomResourceDefinitions (see this comparison). In that case, fallback to JSON merge patches. 
Server-side Apply is yet another mechanism to modify API objects, which is supported by all API resources (in newer Kubernetes versions). However, it has a few problems and more caveats preventing us from using it in Gardener at the time of writing. See gardener/gardener#4122 for more details. Generally speaking, patches are often the better option compared to update requests because they can save network traffic, encoding/decoding effort, and avoid conflicts under the presented conditions. If choosing a patch type, consider which type is supported by the resource you’re modifying and what will happen in case of a conflict. Consider whether your modification is safe to run without optimistic locking. However, there is no simple rule of thumb on which patch type to choose.\n On Helper Functions Here is a note on some helper functions, that should be avoided and why:\ncontrollerutil.CreateOrUpdate does a basic get, mutate and create or update call chain, which is often used in controllers. We should avoid using this helper function in Gardener, because it is likely to cause conflicts for cached clients and doesn’t send no-op requests if nothing was changed, which can cause problems because of the heavy use of webhooks in Gardener extensions (see above). That’s why usage of this function was completely replaced in gardener/gardener#4227 and similar PRs.\ncontrollerutil.CreateOrPatch is similar to CreateOrUpdate but does a patch request instead of an update request. It has the same drawback as CreateOrUpdate regarding no-op updates. Also, controllers can’t use optimistic locking or strategic merge patches when using CreateOrPatch. Another reason for avoiding use of this function is that it also implicitly patches the status section if it was changed, which is confusing for others reading the code. 
To accomplish this, the func does some back and forth conversion, comparison and checks, which are unnecessary in most of our cases and simply wasted CPU cycles and complexity we want to avoid.\nThere were some Try{Update,UpdateStatus,Patch,PatchStatus} helper functions in Gardener that were already removed by gardener/gardener#4378 but are still used in some extension code at the time of writing. The reason for eliminating these functions is that they implement RetryOnConflict-semantics. Meaning, they first get the object, mutate it, then try to update and retry if a conflict error occurs. As explained above, retrying on conflict is a controller anti-pattern and should be avoided in almost every situation. The other problem with these functions is that they read the object first from the API server (always do a direct call), although in most cases we already have a recent version of the object at hand. So, using this function generally does unnecessary API calls and therefore causes unwanted compute and network load.\nFor the reasons explained above, there are similar helper functions that accomplish similar things but address the mentioned drawbacks: controllerutils.{GetAndCreateOrMergePatch,GetAndCreateOrStrategicMergePatch}. These can be safely used as replacements for the aforementioned helper funcs. 
If they are not fitting for your use case, for example because you need to use optimistic locking, just do the appropriate calls in the controller directly.\nRelated Links Kubernetes Client usage in Gardener (Community Meeting talk, 2020-06-26) These resources are only partially related to the topics covered in this doc, but might still be interesting for developer seeking a deeper understanding of Kubernetes API machinery, architecture and foundational concepts.\n API Conventions The Kubernetes Resource Model ","categories":"","description":"","excerpt":"Kubernetes Clients in Gardener This document aims at providing a …","ref":"/docs/gardener/kubernetes-clients/","tags":"","title":"Kubernetes Clients"},{"body":"KUBERNETES_SERVICE_HOST Environment Variable Injection In each Shoot cluster’s kube-system namespace a DaemonSet called apiserver-proxy is deployed. It routes traffic to the upstream Shoot Kube APIServer. See the APIServer SNI GEP for more details.\nTo skip this extra network hop, a mutating webhook called apiserver-proxy.networking.gardener.cloud is deployed next to the API server in the Seed. It adds a KUBERNETES_SERVICE_HOST environment variable to each container and init container that do not specify it. See the webhook repository for more information.\nOpt-Out of Pod Injection In some cases it’s desirable to opt-out of Pod injection:\n DNS is disabled on that individual Pod, but it still needs to talk to the kube-apiserver. Want to test the kube-proxy and kubelet in-cluster discovery. 
Opt-Out of Pod Injection for Specific Pods To opt out of the injection, the Pod should be labeled with apiserver-proxy.networking.gardener.cloud/inject: disable, e.g.:\napiVersion: apps/v1 kind: Deployment metadata: name: nginx labels: app: nginx spec: replicas: 1 selector: matchLabels: app: nginx template: metadata: labels: app: nginx apiserver-proxy.networking.gardener.cloud/inject: disable spec: containers: - name: nginx image: nginx:1.14.2 ports: - containerPort: 80 Opt-Out of Pod Injection on Namespace Level To opt out of the injection for all Pods in a namespace, label the namespace with apiserver-proxy.networking.gardener.cloud/inject: disable, e.g.:\napiVersion: v1 kind: Namespace metadata: labels: apiserver-proxy.networking.gardener.cloud/inject: disable name: my-namespace or via kubectl for an existing namespace:\nkubectl label namespace my-namespace apiserver-proxy.networking.gardener.cloud/inject=disable Note: Please be aware that it’s not possible to disable injection on a namespace level and enable it for individual pods in it.\n Opt-Out of Pod Injection for the Entire Cluster If the injection is causing problems for different workloads and ignoring individual pods or namespaces is not possible, then the feature can be disabled for the entire cluster with the alpha.featuregates.shoot.gardener.cloud/apiserver-sni-pod-injector annotation with value disable on the Shoot resource itself:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: annotations: alpha.featuregates.shoot.gardener.cloud/apiserver-sni-pod-injector: 'disable' name: my-cluster or via kubectl for an existing shoot cluster (note that this is an annotation, not a label):\nkubectl annotate shoot my-cluster alpha.featuregates.shoot.gardener.cloud/apiserver-sni-pod-injector=disable Note: Please be aware that it’s not possible to disable injection on a cluster level and enable it for individual pods in it.\n ","categories":"","description":"","excerpt":"KUBERNETES_SERVICE_HOST Environment Variable Injection In each Shoot 
…","ref":"/docs/gardener/shoot_kubernetes_service_host_injection/","tags":"","title":"KUBERNETES_SERVICE_HOST Environment Variable Injection"},{"body":"Introduction Lakom is a Kubernetes admission controller whose purpose is to implement cosign image signature verification with a public cosign key. It also takes care of resolving image tags to sha256 digests. A built-in cache mechanism can be enabled to reduce the load on the OCI registry.\nFlags The Lakom admission controller is configurable via command-line flags. The trusted cosign public keys and the algorithms associated with them are set via a configuration file provided with the flag --lakom-config-path.\n Flag Name Description Default Value --bind-address Address to bind to “0.0.0.0” --cache-refresh-interval Refresh interval for the cached objects 30s --cache-ttl TTL for the cached objects. Set to 0, if cache has to be disabled 10m0s --contention-profiling Enable lock contention profiling, if profiling is enabled false --health-bind-address Bind address for the health server “:8081” -h, --help help for lakom --insecure-allow-insecure-registries If set, communication via HTTP with registries will be allowed. false --insecure-allow-untrusted-images If set, the webhook will just return warning for the images without trusted signatures. false --kubeconfig Paths to a kubeconfig. Only required if out-of-cluster. --lakom-config-path Path to file with lakom configuration containing cosign public keys used to verify the image signatures --metrics-bind-address Bind address for the metrics server “:8080” --port Webhook server port 9443 --profiling Enable profiling via web interface host:port/debug/pprof/ false --tls-cert-dir Directory with server TLS certificate and key (must contain a tls.crt and tls.key file) --use-only-image-pull-secrets If set, only the credentials from the image pull secrets of the pod are used to access the OCI registry. Otherwise, the node identity and docker config are also used. 
false --version prints version information and quits; --version=vX.Y.Z… sets the reported version Lakom Cosign Public Keys Configuration File The Lakom cosign public keys configuration file should be YAML or JSON formatted. It can define multiple trusted keys, each of which must be given a name. The supported types of public keys are RSA, ECDSA and Ed25519. RSA keys can additionally be configured with a signature verification algorithm specifying the scheme and hash function used during signature verification. As of now, ECDSA and Ed25519 keys cannot be configured with a specific algorithm.\npublicKeys: - name: example-public-key algorithm: RSASSA-PSS-SHA256 key: |- -----BEGIN PUBLIC KEY----- MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBAPeQXbIWMMXYV+9+j9b4jXTflnpfwn4E GMrmqYVhm0sclXb2FPP5aV/NFH6SZdHDZcT8LCNsNgxzxV4N+UE/JIsCAwEAAQ== -----END PUBLIC KEY----- Supported RSA Signature Verification Algorithms RSASSA-PKCS1-v1_5-SHA256: uses RSASSA-PKCS1-v1_5 scheme with SHA256 hash func RSASSA-PKCS1-v1_5-SHA384: uses RSASSA-PKCS1-v1_5 scheme with SHA384 hash func RSASSA-PKCS1-v1_5-SHA512: uses RSASSA-PKCS1-v1_5 scheme with SHA512 hash func RSASSA-PSS-SHA256: uses RSASSA-PSS scheme with SHA256 hash func RSASSA-PSS-SHA384: uses RSASSA-PSS scheme with SHA384 hash func RSASSA-PSS-SHA512: uses RSASSA-PSS scheme with SHA512 hash func ","categories":"","description":"","excerpt":"Introduction Lakom is kubernetes admission controller which purpose is …","ref":"/docs/extensions/others/gardener-extension-shoot-lakom-service/lakom/","tags":"","title":"Lakom"},{"body":"Gardener Extension for lakom services \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. 
With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the shoot-lakom-service extension.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nLakom Admission Controller Lakom is a Kubernetes admission controller whose purpose is to implement cosign image signature verification against a public cosign key. It also takes care of resolving image tags to sha256 digests and caches all OCI artifacts to reduce the load on the OCI registry.\nExtension Resources Example extension resource:\n apiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: extension-shoot-lakom-service namespace: shoot--project--abc spec: type: shoot-lakom-service When an extension resource is reconciled, the extension controller will create an instance of the Lakom admission controller. These resources are placed inside the shoot namespace on the seed. The controller also takes care of generating the necessary RBAC resources for the seed as well as for the shoot.\nPlease note, this extension controller relies on the Gardener-Resource-Manager to deploy k8s resources to seed and shoot clusters.\nHow to start using or developing this extension controller locally The Lakom admission controller can be configured with make dev-setup and started with make start-lakom. You can run the lakom extension controller locally on your machine by executing make start.\nIf you’d like to develop Lakom using a local cluster such as KinD, make sure your KUBECONFIG environment variable is targeting the local Garden cluster. Add 127.0.0.1 garden.local.gardener.cloud to your /etc/hosts. 
You can then run:\nmake extension-up This will trigger a skaffold deployment that builds the images, pushes them to the registry and installs the helm charts from /charts.\nWe are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"A k8s admission controller verifying pods are using signed images (cosign signatures) and a gardener extension to install it for shoots and seeds.","excerpt":"A k8s admission controller verifying pods are using signed images …","ref":"/docs/extensions/others/gardener-extension-shoot-lakom-service/","tags":"","title":"Lakom service"},{"body":"e2e Test Suite Developers can run extended e2e tests, in addition to unit tests, for Etcd-Druid in or from their local environments. This is recommended to verify the desired behavior of several features and to avoid regressions in future releases.\nThe very same tests typically run as part of the component’s release job as well as on demand, e.g., when triggered by Etcd-Druid maintainers for open pull requests.\nTesting Etcd-Druid automatically involves a certain test coverage for gardener/etcd-backup-restore which is deployed as a side-car to the actual etcd container.\nPrerequisites The e2e test lifecycle is managed with the help of skaffold. 
Every involved step like setup, deploy, undeploy or cleanup is executed against a Kubernetes cluster, which makes a cluster a mandatory prerequisite. Only skaffold itself with involved docker, helm and kubectl executions as well as the e2e-tests are executed locally. Required binaries are automatically downloaded if you use the corresponding make target, as described in this document.\nIt’s expected that especially the deploy step is run against a Kubernetes cluster which doesn’t contain a Druid deployment or any left-overs like druid.gardener.cloud CRDs. The deploy step will likely fail if such left-overs are present.\n Tip: Create a fresh KinD cluster or a similar one with a small footprint before executing the tests.\n Providers The following providers are supported for e2e tests:\n AWS Azure GCP Local Valid credentials need to be provided when tests are executed with the mentioned cloud providers.\n Flow An e2e test execution involves the following steps:\n Step Description setup Create a storage bucket which is used for etcd backups (only with cloud providers). deploy Build Docker image, upload it to registry (if remote cluster - see Docker build), deploy Helm chart (charts/druid) to Kubernetes cluster. test Execute e2e tests as defined in test/e2e. undeploy Remove the deployed artifacts from Kubernetes cluster. cleanup Delete storage bucket and Druid deployment from test cluster. Make target Executing e2e-tests is as easy as executing the following command with defined Env-Vars as described in the following section and as needed for your test scenario.\nmake test-e2e Common Env Variables The following environment variables influence how the flow described above is executed:\n PROVIDERS: Providers used for testing (all, aws, azure, gcp, local). Multiple entries must be comma separated. Note: Some tests will use the very first entry from env PROVIDERS for e2e testing (ex: multi-node tests). 
So for multi-node tests to use specific provider, specify that provider as first entry in env PROVIDERS.\n KUBECONFIG: Kubeconfig pointing to cluster where Etcd-Druid will be deployed (preferably KinD). TEST_ID: Some ID which is used to create assets for and during testing. STEPS: Steps executed by make target (setup, deploy, test, undeploy, cleanup - default: all steps). AWS Env Variables AWS_ACCESS_KEY_ID: Key ID of the user. AWS_SECRET_ACCESS_KEY: Access key of the user. AWS_REGION: Region in which the test bucket is created. Example:\nmake \\ AWS_ACCESS_KEY_ID=\"abc\" \\ AWS_SECRET_ACCESS_KEY=\"xyz\" \\ AWS_REGION=\"eu-central-1\" \\ KUBECONFIG=\"$HOME/.kube/config\" \\ PROVIDERS=\"aws\" \\ TEST_ID=\"some-test-id\" \\ STEPS=\"setup,deploy,test,undeploy,cleanup\" \\ test-e2e Azure Env Variables STORAGE_ACCOUNT: Storage account used for managing the storage container. STORAGE_KEY: Key of storage account. Example:\nmake \\ STORAGE_ACCOUNT=\"abc\" \\ STORAGE_KEY=\"eHl6Cg==\" \\ KUBECONFIG=\"$HOME/.kube/config\" \\ PROVIDERS=\"azure\" \\ TEST_ID=\"some-test-id\" \\ STEPS=\"setup,deploy,test,undeploy,cleanup\" \\ test-e2e GCP Env Variables GCP_SERVICEACCOUNT_JSON_PATH: Path to the service account json file used for this test. GCP_PROJECT_ID: ID of the GCP project. Example:\nmake \\ GCP_SERVICEACCOUNT_JSON_PATH=\"/var/lib/secrets/serviceaccount.json\" \\ GCP_PROJECT_ID=\"xyz-project\" \\ KUBECONFIG=\"$HOME/.kube/config\" \\ PROVIDERS=\"gcp\" \\ TEST_ID=\"some-test-id\" \\ STEPS=\"setup,deploy,test,undeploy,cleanup\" \\ test-e2e Local Env Variables No special environment variables are required for running e2e tests with Local provider.\nExample:\nmake \\ KUBECONFIG=\"$HOME/.kube/config\" \\ PROVIDERS=\"local\" \\ TEST_ID=\"some-test-id\" \\ STEPS=\"setup,deploy,test,undeploy,cleanup\" \\ test-e2e e2e test with localstack The above-mentioned e2e tests need storage from real cloud providers to be setup. 
But there is a tool named localstack that makes it possible to run e2e tests with mock AWS storage. We can also provision a KIND cluster for e2e tests. So, together with localstack and a KIND cluster, we don’t need to depend on any actual cloud provider infrastructure being set up to run e2e tests.\nHow are the KIND cluster and localstack set up KIND or Kubernetes-In-Docker is a Kubernetes cluster that is set up inside a docker container. This cluster has limited capability as it does not have much compute power. But it can easily be set up inside a container and can be torn down just by removing the container. That’s why a KIND cluster is very easy to use for e2e tests. A Makefile command helps to spin up a KIND cluster and use the cluster to run e2e tests.\nThere is a docker image for localstack. The image is deployed as a pod inside the KIND cluster through hack/e2e-test/infrastructure/localstack/localstack.yaml. The Makefile takes care of deploying the yaml file in a KIND cluster.\nThe developer needs to run the make ci-e2e-kind command. This command in turn runs hack/ci-e2e-kind.sh, which spins up the KIND cluster, deploys localstack in it and then runs the e2e tests using localstack as a mock AWS storage provider. The e2e tests are actually run on the host machine but deploy the druid controller inside the KIND cluster. The druid controller spawns multinode etcd clusters inside the KIND cluster. The e2e tests verify whether the druid controller performs its jobs correctly or not. Mock localstack storage is cleaned up after every e2e test. That’s why the e2e tests need to access the localstack pod running inside the KIND cluster. 
The network traffic between host machine and localstack pod is resolved via mapping localstack pod port to host port while setting up the KIND cluster via hack/e2e-test/infrastructure/kind/cluster.yaml\nHow to execute e2e tests with localstack and KIND cluster Run the following make command to spin up a KinD cluster, deploy localstack and run the e2e tests with provider aws:\nmake ci-e2e-kind ","categories":"","description":"","excerpt":"e2e Test Suite Developers can run extended e2e tests, in addition to …","ref":"/docs/other-components/etcd-druid/local-e2e-tests/","tags":"","title":"Local e2e Tests"},{"body":"Local development Purpose Develop new feature and fix bug on the Gardener Dashboard.\nRequirements Yarn. For the required version, refer to .engines.yarn in package.json. Node.js. For the required version, refer to .engines.node in package.json. Steps 1. Clone repository Clone the gardener/dashboard repository\ngit clone git@github.com:gardener/dashboard.git 2. Install dependencies Run yarn at the repository root to install all dependencies.\ncd dashboard yarn 3. Configuration Place the Gardener Dashboard configuration under ${HOME}/.gardener/config.yaml or alternatively set the path to the configuration file using the GARDENER_CONFIG environment variable.\nA local configuration example could look like follows:\nport: 3030 logLevel: debug logFormat: text apiServerUrl: https://my-local-cluster # garden cluster kube-apiserver url - kubectl config view --minify -ojsonpath='{.clusters[].cluster.server}' sessionSecret: c2VjcmV0 # symmetric key used for encryption frontend: dashboardUrl: pathname: /api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy/ defaultHibernationSchedule: evaluation: - start: 00 17 * * 1,2,3,4,5 development: - start: 00 17 * * 1,2,3,4,5 end: 00 08 * * 1,2,3,4,5 production: ~ 4. Run it locally The Gardener Dashboard backend server requires a kubeconfig for the Garden cluster. You can set it e.g. 
by using the KUBECONFIG environment variable.\nIf you want to run the Garden cluster locally, follow the getting started locally documentation. Gardener Dashboard supports the local infrastructure provider that comes with the local Gardener cluster setup. See 5. Login to the dashboard for more information on how to use the Dashboard with a local Gardener or any other Gardener landscape.\nStart the backend server (http://localhost:3030).\ncd backend export KUBECONFIG=/path/to/garden/cluster/kubeconfig.yaml yarn serve To start the frontend server, you have two options for handling the server certificate:\n Recommended Method: Run yarn setup in the frontend directory to generate a new self-signed CA and TLS server certificate before starting the frontend server for the first time. The CA is automatically added to the keychain on macOS. If you prefer not to add it to the keychain, you can use the --skip-keychain flag. For other operating systems, you will need to manually add the generated certificates to the local trust store.\n Alternative Method: If you prefer not to run yarn setup, a temporary self-signed certificate will be generated automatically. This certificate will not be added to the keychain. Note that you will need to click through the insecure warning in your browser to access the dashboard.\n We need to start a TLS dev server because we use cookie names with the __Host- prefix. This requires the secure attribute to be set. For more information, see OWASP Host Prefix.\nStart the frontend dev server (https://localhost:8443) with https and hot reload enabled.\ncd frontend # yarn setup yarn serve You can now access the UI on https://localhost:8443/\n5. Login to the dashboard To log in to the dashboard, you can either configure oidc or, alternatively, log in using a token:\nTo log in using a token, first create a service account.\nkubectl -n garden create serviceaccount dashboard-user Assign it a role, e.g. 
cluster-admin.\nkubectl set subject clusterrolebinding cluster-admin --serviceaccount=garden:dashboard-user Get the token of the service account.\nkubectl -n garden create token dashboard-user --duration 24h Copy the token and login to the dashboard.\nBuild Build docker image locally.\nmake build Push Push docker image to Google Container Registry.\nmake push This command expects a valid gcloud configuration named gardener.\ngcloud config configurations describe gardener is_active: true name: gardener properties: core: account: john.doe@example.org project: johndoe-1008 ","categories":"","description":"","excerpt":"Local development Purpose Develop new feature and fix bug on the …","ref":"/docs/dashboard/local-setup/","tags":"","title":"Local Setup"},{"body":"admission-alicloud admission-alicloud is an admission webhook server which is responsible for the validation of the cloud provider (Alicloud in this case) specific fields and resources. The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.\nFollow the steps below to run the admission webhook server locally.\n Start the Gardener API server.\nFor details, check the Gardener local setup.\n Start the webhook server\nMake sure that the KUBECONFIG environment variable is pointing to the local garden cluster.\nmake start-admission Setup the ValidatingWebhookConfiguration.\nhack/dev-setup-admission-alicloud.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. 
It will also apply the ValidatingWebhookConfiguration manifest.\n./hack/dev-setup-admission-alicloud.sh You are now ready to experiment with the admission-alicloud webhook server locally.\n","categories":"","description":"","excerpt":"admission-alicloud admission-alicloud is an admission webhook server …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/local-setup/","tags":"","title":"Local Setup"},{"body":"admission-aws admission-aws is an admission webhook server which is responsible for the validation of the cloud provider (AWS in this case) specific fields and resources. The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.\nFollow the steps below to run the admission webhook server locally.\n Start the Gardener API server.\nFor details, check the Gardener local setup.\n Start the webhook server\nMake sure that the KUBECONFIG environment variable is pointing to the local garden cluster.\nmake start-admission Setup the ValidatingWebhookConfiguration.\nhack/dev-setup-admission-aws.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. It will also apply the ValidatingWebhookConfiguration manifest.\n./hack/dev-setup-admission-aws.sh You are now ready to experiment with the admission-aws webhook server locally.\n","categories":"","description":"","excerpt":"admission-aws admission-aws is an admission webhook server which is …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/local-setup/","tags":"","title":"Local Setup"},{"body":"admission-azure admission-azure is an admission webhook server which is responsible for the validation of the cloud provider (Azure in this case) specific fields and resources. 
The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.\nFollow the steps below to run the admission webhook server locally.\n Start the Gardener API server.\nFor details, check the Gardener local setup.\n Start the webhook server\nMake sure that the KUBECONFIG environment variable is pointing to the local garden cluster.\nmake start-admission Setup the ValidatingWebhookConfiguration.\nhack/dev-setup-admission-azure.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. It will also apply the ValidatingWebhookConfiguration manifest.\n./hack/dev-setup-admission-azure.sh You are now ready to experiment with the admission-azure webhook server locally.\n","categories":"","description":"","excerpt":"admission-azure admission-azure is an admission webhook server which …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/local-setup/","tags":"","title":"Local Setup"},{"body":"admission-gcp admission-gcp is an admission webhook server which is responsible for the validation of the cloud provider (GCP in this case) specific fields and resources. The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.\nFollow the steps below to run the admission webhook server locally.\n Start the Gardener API server.\nFor details, check the Gardener local setup.\n Start the webhook server\nMake sure that the KUBECONFIG environment variable is pointing to the local garden cluster.\nmake start-admission Setup the ValidatingWebhookConfiguration.\nhack/dev-setup-admission-gcp.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. 
It will also apply the ValidatingWebhookConfiguration manifest.\n./hack/dev-setup-admission-gcp.sh You are now ready to experiment with the admission-gcp webhook server locally.\n","categories":"","description":"","excerpt":"admission-gcp admission-gcp is an admission webhook server which is …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/local-setup/","tags":"","title":"Local Setup"},{"body":"admission-openstack admission-openstack is an admission webhook server which is responsible for the validation of the cloud provider (OpenStack in this case) specific fields and resources. The Gardener API server is cloud provider agnostic and it wouldn’t be able to perform similar validation.\nFollow the steps below to run the admission webhook server locally.\n Start the Gardener API server.\nFor details, check the Gardener local setup.\n Start the webhook server\nMake sure that the KUBECONFIG environment variable is pointing to the local garden cluster.\nmake start-admission Setup the ValidatingWebhookConfiguration.\nhack/dev-setup-admission-openstack.sh will configure the webhook Service which will allow the kube-apiserver of your local cluster to reach the webhook server. It will also apply the ValidatingWebhookConfiguration manifest.\n./hack/dev-setup-admission-openstack.sh You are now ready to experiment with the admission-openstack webhook server locally.\n","categories":"","description":"","excerpt":"admission-openstack admission-openstack is an admission webhook server …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-openstack/local-setup/","tags":"","title":"Local Setup"},{"body":"Overview Conceptually, all Gardener components are designed to run as a Pod inside a Kubernetes cluster. The Gardener API server extends the Kubernetes API via the user-aggregated API server concepts. 
However, if you want to develop it, you may want to work locally with the Gardener without building a Docker image and deploying it to a cluster each and every time. That means that the Gardener runs outside a Kubernetes cluster which requires providing a Kubeconfig in your local filesystem and point the Gardener to it when starting it (see below).\nFurther details can be found in\n Principles of Kubernetes, and its components Kubernetes Development Guide Architecture of Gardener This guide is split into two main parts:\n Preparing your setup by installing all dependencies and tools Getting the Gardener source code locally Preparing the Setup [macOS only] Installing homebrew The copy-paste instructions in this guide are designed for macOS and use the package manager Homebrew.\nOn macOS run\n/bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\" [macOS only] Installing GNU bash Built-in apple-darwin bash is missing some features that could cause shell scripts to fail locally.\nbrew install bash Installing git We use git as VCS which you need to install. On macOS run\nbrew install git For other OS, please check the Git installation documentation.\nInstalling Go Install the latest version of Go. On macOS run\nbrew install go For other OS, please check Go installation documentation.\nInstalling kubectl Install kubectl. Please make sure that the version of kubectl is at least v1.25.x. On macOS run\nbrew install kubernetes-cli For other OS, please check the kubectl installation documentation.\nInstalling Docker You need to have docker installed and running. On macOS run\nbrew install --cask docker For other OS please check the docker installation documentation.\nInstalling iproute2 iproute2 provides a collection of utilities for network administration and configuration. On macOS run\nbrew install iproute2mac Installing jq jq is a lightweight and flexible command-line JSON processor. 
On macOS run\nbrew install jq Installing yq yq is a lightweight and portable command-line YAML processor. On macOS run\nbrew install yq Installing GNU Parallel GNU Parallel is a shell tool for executing jobs in parallel, used by the code generation scripts (make generate). On macOS run\nbrew install parallel [macOS only] Install GNU Core Utilities When running on macOS, install the GNU core utilities and friends:\nbrew install coreutils gnu-sed gnu-tar grep gzip This will create symbolic links for the GNU utilities with g prefix on your PATH, e.g., gsed or gbase64. To allow using them without the g prefix, add the gnubin directories to the beginning of your PATH environment variable (brew install and brew info will print out instructions for each formula):\nexport PATH=$(brew --prefix)/opt/coreutils/libexec/gnubin:$PATH export PATH=$(brew --prefix)/opt/gnu-sed/libexec/gnubin:$PATH export PATH=$(brew --prefix)/opt/gnu-tar/libexec/gnubin:$PATH export PATH=$(brew --prefix)/opt/grep/libexec/gnubin:$PATH export PATH=$(brew --prefix)/opt/gzip/bin:$PATH [Windows Only] WSL2 Apart from Linux distributions and macOS, the local gardener setup can also run on the Windows Subsystem for Linux 2.\nWhile WSL1, plain docker for Windows and various Linux distributions and local Kubernetes environments may be supported, this setup was verified with:\n WSL2 Docker Desktop WSL2 Engine Ubuntu 18.04 LTS on WSL2 Nodeless local garden (see below) The Gardener repository and all the above-mentioned tools (git, golang, kubectl, …) should be installed in your WSL2 distro, according to the distribution-specific Linux installation instructions.\nGet the Sources Clone the repository from GitHub into your $GOPATH.\nmkdir -p $(go env GOPATH)/src/github.com/gardener cd $(go env GOPATH)/src/github.com/gardener git clone git@github.com:gardener/gardener.git cd gardener Note: Gardener is using Go modules and cloning the repository into $GOPATH is not a hard requirement. 
However it is still recommended to clone into $GOPATH because k8s.io/code-generator does not work yet outside of $GOPATH - kubernetes/kubernetes#86753.\n Start the Gardener Please see getting_started_locally.md how to build and deploy Gardener from your local sources.\n","categories":"","description":"","excerpt":"Overview Conceptually, all Gardener components are designed to run as …","ref":"/docs/gardener/local_setup/","tags":"","title":"Local Setup"},{"body":"Preparing the Local Development Setup (Mac OS X) Preparing the Local Development Setup (Mac OS X) Installing Golang environment Installing Docker (Optional) Setup Docker Hub account (Optional) Local development Installing the Machine Controller Manager locally Prepare the cluster Getting started Testing Machine Classes Usage Conceptionally, the Machine Controller Manager is designed to run in a container within a Pod inside a Kubernetes cluster. For development purposes, you can run the Machine Controller Manager as a Go process on your local machine. This process connects to your remote cluster to manage VMs for that cluster. That means that the Machine Controller Manager runs outside a Kubernetes cluster which requires providing a Kubeconfig in your local filesystem and point the Machine Controller Manager to it when running it (see below).\nAlthough the following installation instructions are for Mac OS X, similar alternate commands could be found for any Linux distribution.\nInstalling Golang environment Install the latest version of Golang (at least v1.8.3 is required) by using Homebrew:\n$ brew install golang In order to perform linting on the Go source code, install Golint:\n$ go get -u golang.org/x/lint/golint Installing Docker (Optional) In case you want to build Docker images for the Machine Controller Manager you have to install Docker itself. 
We recommend using Docker for Mac OS X which can be downloaded from here.\nSetup Docker Hub account (Optional) Create a Docker hub account at Docker Hub if you don’t already have one.\nLocal development ⚠️ Before you start developing, please ensure to comply with the following requirements:\n You have understood the principles of Kubernetes, and its components, what their purpose is and how they interact with each other. You have understood the architecture of the Machine Controller Manager The development of the Machine Controller Manager could happen by targeting any cluster. You basically need a Kubernetes cluster running on a set of machines. You just need the Kubeconfig file with the required access permissions attached to it.\nInstalling the Machine Controller Manager locally Clone the repository from GitHub.\n$ git clone git@github.com:gardener/machine-controller-manager.git $ cd machine-controller-manager Prepare the cluster Connect to the remote kubernetes cluster where you plan to deploy the Machine Controller Manager using kubectl. Set the environment variable KUBECONFIG to the path of the yaml file containing your cluster info Now, create the required CRDs on the remote cluster using the following command, $ kubectl apply -f kubernetes/crds.yaml Getting started Setup and Restore with Gardener\nSetup\nIn gardener access to static kubeconfig files is no longer supported due to security reasons. One needs to generate short-lived (max TTL = 1 day) admin kube configs for target and control clusters. A convenience script/Makefile target has been provided to do the required initial setup which includes:\n Creating a temporary directory where target and control kubeconfigs will be stored. Create a request to generate the short lived admin kubeconfigs. These are downloaded and stored in the temporary folder created above. In gardener clusters DWD (Dependency Watchdog) runs as an additional component which can interfere when MCM/CA is scaled down. 
To prevent that an annotation dependency-watchdog.gardener.cloud/ignore-scaling is added to machine-controller-manager deployment which prevents DWD from scaling up the deployment replicas. Scales down machine-controller-manager deployment in the control cluster to 0 replica. Creates the required .env file and populates required environment variables which are then used by the Makefile in both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. Copies the generated and downloaded kubeconfig files for the target and control clusters to machine-controller-manager-provider-\u003cprovider-name\u003e project as well. To do the above you can either invoke make gardener-setup or you can directly invoke the script ./hack/gardener_local_setup.sh. If you invoke the script with -h or --help option then it will give you all CLI options that one can pass.\nRestore\nOnce the testing is over you can invoke a convenience script/Makefile target which does the following:\n Removes all generated admin kubeconfig files from both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. Removes the .env file that was generated as part of the setup from both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. Scales up machine-controller-manager deployment in the control cluster back to 1 replica. Removes the annotation dependency-watchdog.gardener.cloud/ignore-scaling that was added to prevent DWD to scale up MCM. To do the above you can either invoke make gardener-restore or you can directly invoke the script ./hack/gardener_local_restore.sh. 
If you invoke the script with -h or --help option then it will give you all CLI options that one can pass.\nSetup and Restore without Gardener\nSetup\nIf you are not running MCM components in a gardener cluster, then it is assumed that there is not going to be any DWD (Dependency Watchdog) component. A convenience script/Makefile target has been provided to the required initial setup which includes:\n Copies the provided control and target kubeconfig files to machine-controller-manager-provider-\u003cprovider-name\u003e project. Scales down machine-controller-manager deployment in the control cluster to 0 replica. Creates the required .env file and populates required environment variables which are then used by the Makefile in both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. To do the above you can either invoke make non-gardener-setup or you can directly invoke the script ./hack/non_gardener_local_setup.sh. If you invoke the script with -h or --help option then it will give you all CLI options that one can pass.\nRestore\nOnce the testing is over you can invoke a convenience script/Makefile target which does the following:\n Removes all provided kubeconfig files from both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. Removes the .env file that was generated as part of the setup from both machine-controller-manager and in machine-controller-manager-provider-\u003cprovider-name\u003e projects. Scales up machine-controller-manager deployment in the control cluster back to 1 replica. To do the above you can either invoke make non-gardener-restore or you can directly invoke the script ./hack/non_gardener_local_restore.sh. 
If you invoke the script with -h or --help option then it will give you all CLI options that one can pass.\nOnce the setup is done then you can start the machine-controller-manager as a local process using the following Makefile target:\n$ make start I1227 11:08:19.963638 55523 controllermanager.go:204] Starting shared informers I1227 11:08:20.766085 55523 controller.go:247] Starting machine-controller-manager ⚠️ The file dev/target-kubeconfig.yaml points to the cluster whose nodes you want to manage. dev/control-kubeconfig.yaml points to the cluster from where you want to manage the nodes from. However, dev/control-kubeconfig.yaml is optional.\nThe Machine Controller Manager should now be ready to manage the VMs in your kubernetes cluster.\n⚠️ This is assuming that your MCM is built to manage machines for any in-tree supported providers. There is a new way to deploy and manage out of tree (external) support for providers whose development can be found here\nTesting Machine Classes To test the creation/deletion of a single instance for one particular machine class you can use the managevm cli. The corresponding INFRASTRUCTURE-machine-class.yaml and the INFRASTRUCTURE-secret.yaml need to be defined upfront. 
To build and run it\nGO111MODULE=on go build -o managevm cmd/machine-controller-manager-cli/main.go # create machine ./managevm --secret PATH_TO/INFRASTRUCTURE-secret.yaml --machineclass PATH_TO/INFRASTRUCTURE-machine-class.yaml --classkind INFRASTRUCTURE --machinename test # delete machine ./managevm --secret PATH_TO/INFRASTRUCTURE-secret.yaml --machineclass PATH_TO/INFRASTRUCTURE-machine-class.yaml --classkind INFRASTRUCTURE --machinename test --machineid INFRASTRUCTURE:///REGION/INSTANCE_ID Usage To start using Machine Controller Manager, follow the links given at usage here.\n","categories":"","description":"","excerpt":"Preparing the Local Development Setup (Mac OS X) Preparing the Local …","ref":"/docs/other-components/machine-controller-manager/local_setup/","tags":"","title":"Local Setup"},{"body":"How to Create Log Parser for Container into fluent-bit If our log message is parsed correctly, it has to be showed in Plutono like this:\n{\"log\":\"OpenAPI AggregationController: Processing item v1beta1.metrics.k8s.io\",\"pid\":\"1\",\"severity\":\"INFO\",\"source\":\"controller.go:107\"} Otherwise it will looks like this:\n{ \"log\":\"{ \\\"level\\\":\\\"info\\\",\\\"ts\\\":\\\"2020-06-01T11:23:26.679Z\\\",\\\"logger\\\":\\\"gardener-resource-manager.health-reconciler\\\",\\\"msg\\\":\\\"Finished ManagedResource health checks\\\",\\\"object\\\":\\\"garden/provider-aws-dsm9r\\\" }\\n\" } } Create a Custom Parser First of all, we need to know how the log for the specific container looks like (for example, lets take a log from the alertmanager : level=info ts=2019-01-28T12:33:49.362015626Z caller=main.go:175 build_context=\"(go=go1.11.2, user=root@4ecc17c53d26, date=20181109-15:40:48))\n We can see that this log contains 4 subfields(severity=info, timestamp=2019-01-28T12:33:49.362015626Z, source=main.go:175 and the actual message). So we have to write a regex which matches this log in 4 groups(We can use https://regex101.com/ like helping tool). 
So, for this purpose our regex looks like this:\n ^level=(?\u003cseverity\u003e\\w+)\\s+ts=(?\u003ctime\u003e\\d{4}-\\d{2}-\\d{2}[Tt].*[zZ])\\s+caller=(?\u003csource\u003e[^\\s]*+)\\s+(?\u003clog\u003e.*) Now we have to create correct time format for the timestamp (We can use this site for this purpose: http://ruby-doc.org/stdlib-2.4.1/libdoc/time/rdoc/Time.html#method-c-strptime). So our timestamp matches correctly the following format: %Y-%m-%dT%H:%M:%S.%L It’s time to apply our new regex into fluent-bit configuration. To achieve that we can just deploy in the cluster where the fluent-operator is deployed the following custom resources: apiVersion: fluentbit.fluent.io/v1alpha2 kind: ClusterFilter metadata: labels: fluentbit.gardener/type: seed name: \u003c\u003c pod-name \u003e\u003e--(\u003c\u003c container-name \u003e\u003e) spec: filters: - parser: keyName: log parser: \u003c\u003c container-name \u003e\u003e-parser reserveData: true match: kubernetes.\u003c\u003c pod-name \u003e\u003e*\u003c\u003c container-name \u003e\u003e* EXAMPLE apiVersion: fluentbit.fluent.io/v1alpha2 kind: ClusterFilter metadata: labels: fluentbit.gardener/type: seed name: alertmanager spec: filters: - parser: keyName: log parser: alertmanager-parser reserveData: true match: \"kubernetes.alertmanager*alertmanager*\" Now lets check if there already exists ClusterParser with such a regex and time format that we need. 
If it doesn’t, create one: apiVersion: fluentbit.fluent.io/v1alpha2 kind: ClusterParser metadata: name: \u003c\u003c container-name \u003e\u003e-parser labels: fluentbit.gardener/type: \"seed\" spec: regex: timeKey: time timeFormat: \u003c\u003c time-format \u003e\u003e regex: \"\u003c\u003c regex \u003e\u003e\" EXAMPLE apiVersion: fluentbit.fluent.io/v1alpha2 kind: ClusterParser metadata: name: alertmanager-parser labels: fluentbit.gardener/type: \"seed\" spec: regex: timeKey: time timeFormat: \"%Y-%m-%dT%H:%M:%S.%L\" regex: \"^level=(?\u003cseverity\u003e\\\\w+)\\\\s+ts=(?\u003ctime\u003e\\\\d{4}-\\\\d{2}-\\\\d{2}[Tt].*[zZ])\\\\s+caller=(?\u003csource\u003e[^\\\\s]*+)\\\\s+(?\u003clog\u003e.*)\" Follow your development setup to validate that the parsers are working correctly. ","categories":"","description":"","excerpt":"How to Create Log Parser for Container into fluent-bit If our log …","ref":"/docs/gardener/log_parsers/","tags":"","title":"Log Parsers"},{"body":"Logging in Gardener Components This document aims at providing a general developer guideline on different aspects of logging practices and conventions used in the Gardener codebase. It contains mostly Gardener-specific points, and references other existing and commonly accepted logging guidelines for general advice. Developers and reviewers should consult this guide when writing, refactoring, and reviewing Gardener code. If parts are unclear or new learnings arise, this guide should be adapted accordingly.\nLogging Libraries / Implementations Historically, Gardener components have been using logrus. There is a global logrus logger (logger.Logger) that is initialized by components on startup and used across the codebase. 
In most places, it is used as a printf-style logger and only in some instances we make use of logrus’ structured logging functionality.\nIn the process of migrating our components to native controller-runtime components (see gardener/gardener#4251), we also want to make use of controller-runtime’s built-in mechanisms for streamlined logging. controller-runtime uses logr, a simple structured logging interface, for library-internal logging and logging in controllers.\nlogr itself is only an interface and doesn’t provide an implementation out of the box. Instead, it needs to be backed by a logging implementation like zapr. Code that uses the logr interface is thereby not tied to a specific logging implementation and makes the implementation easily exchangeable. controller-runtime already provides a set of helpers for constructing zapr loggers, i.e., logr loggers backed by zap, which is a popular logging library in the go community. Hence, we are migrating our component logging from logrus to logr (backed by zap) as part of gardener/gardener#4251.\n ⚠️ logger.Logger (logrus logger) is deprecated in Gardener and shall not be used in new code – use logr loggers when writing new code! (also see Migration from logrus to logr)\nℹ️ Don’t use zap loggers directly, always use the logr interface in order to avoid tight coupling to a specific logging implementation.\n gardener-apiserver differs from the other components as it is based on the apiserver library and therefore uses klog – just like kube-apiserver. As gardener-apiserver writes (almost) no logs in our coding (outside the apiserver library), there is currently no plan for switching the logging implementation. Hence, the following sections focus on logging in the controller and admission components only.\nlogcheck Tool To ensure a smooth migration to logr and make logging in Gardener components more consistent, the logcheck tool was added. 
It enforces (parts of) this guideline and detects programmer-level errors early on in order to prevent bugs. Please check out the tool’s documentation for a detailed description.\nStructured Logging Similar to efforts in the Kubernetes project, we want to migrate our component logs to structured logging. As motivated above, we will use the logr interface instead of klog though.\nYou can read more about the motivation behind structured logging in logr’s background and FAQ (also see this blog post by Dave Cheney). Also, make sure to check out controller-runtime’s logging guideline with specifics for projects using the library. The following sections will focus on the most important takeaways from those guidelines and give general instructions on how to apply them to Gardener and its controller-runtime components.\n Note: Some parts in this guideline differ slightly from controller-runtime’s document.\n TL;DR of Structured Logging ❌ Stop using printf-style logging:\nvar logger *logrus.Logger logger.Infof(\"Scaling deployment %s/%s to %d replicas\", deployment.Namespace, deployment.Name, replicaCount) ✅ Instead, write static log messages and enrich them with additional structured information in form of key-value pairs:\nvar logger logr.Logger logger.Info(\"Scaling deployment\", \"deployment\", client.ObjectKeyFromObject(deployment), \"replicas\", replicaCount) Log Configuration Gardener components can be configured to either log in json (default) or text format: json format is supposed to be used in production, while text format might be nicer for development.\n# json {\"level\":\"info\",\"ts\":\"2021-12-16T08:32:21.059+0100\",\"msg\":\"Hello botanist\",\"garden\":\"eden\"} # text 2021-12-16T08:32:21.059+0100 INFO Hello botanist {\"garden\": \"eden\"} Components can be set to one of the following log levels (with increasing verbosity): error, info (default), debug.\nLog Levels logr uses V-levels (numbered log levels), higher V-level means higher verbosity. 
V-levels are relative (in contrast to klog’s absolute V-levels), i.e., V(1) creates a logger, that is one level more verbose than its parent logger.\nIn Gardener components, the mentioned log levels in the component config (error, info, debug) map to the zap levels with the same names (see here). Hence, our loggers follow the same mapping from numerical logr levels to named zap levels like described in zapr, i.e.:\n component config specifies debug ➡️ both V(0) and V(1) are enabled component config specifies info ➡️ V(0) is enabled, V(1) will not be shown component config specifies error ➡️ neither V(0) nor V(1) will be shown Error() logs will always be shown This mapping applies to the components’ root loggers (the ones that are not “derived” from any other logger; constructed on component startup). If you derive a new logger with e.g. V(1), the mapping will shift by one. For example, V(0) will then log at zap’s debug level.\nThere is no warning level (see Dave Cheney’s post). If there is an error condition (e.g., unexpected error received from a called function), the error should either be handled or logged at error if it is neither handled nor returned. If you have an error value at hand that doesn’t represent an actual error condition, but you still want to log it as an informational message, log it at info level with key err.\nWe might consider to make use of a broader range of log levels in the future when introducing more logs and common command line flags for our components (comparable to --v of Kubernetes components). 
For now, we stick to the mentioned two log levels like controller-runtime: info (V(0)) and debug (V(1)).\nLogging in Controllers Named Loggers Controllers should use named loggers that include their name, e.g.:\ncontrollerLogger := rootLogger.WithName(\"controller\").WithName(\"shoot\") controllerLogger.Info(\"Deploying kube-apiserver\") results in\n2021-12-16T09:27:56.550+0100 INFO controller.shoot Deploying kube-apiserver Logger names are hierarchical. You can make use of it, where controllers are composed of multiple “subcontrollers”, e.g., controller.shoot.hibernation or controller.shoot.maintenance.\nUsing the global logger logf.Log directly is discouraged and should be rather exceptional because it makes correlating logs with code harder. Preferably, all parts of the code should use some named logger.\nReconciler Loggers In your Reconcile function, retrieve a logger from the given context.Context. It inherits from the controller’s logger (i.e., is already named) and is preconfigured with name and namespace values for the reconciliation request:\nfunc (r *reconciler) Reconcile(ctx context.Context, request reconcile.Request) (reconcile.Result, error) { log := logf.FromContext(ctx) log.Info(\"Reconciling Shoot\") // ... return reconcile.Result{}, nil } results in\n2021-12-16T09:35:59.099+0100 INFO controller.shoot Reconciling Shoot {\"name\": \"sunflower\", \"namespace\": \"garden-greenhouse\"} The logger is injected by controller-runtime’s Controller implementation. The logger returned by logf.FromContext is never nil. If the context doesn’t carry a logger, it falls back to the global logger (logf.Log), which might discard logs if not configured, but is also never nil.\n ⚠️ Make sure that you don’t overwrite the name or namespace value keys for such loggers, otherwise you will lose information about the reconciled object.\n The controller implementation (controller-runtime) itself takes care of logging the error returned by reconcilers. 
Hence, don’t log an error that you are returning. Generally, functions should not return an error if they have already logged it, because that means the error is already handled and not an error anymore. See Dave Cheney’s post for more on this.\nMessages Log messages should be static. Don’t put variable content in there, i.e., no fmt.Sprintf or string concatenation (+). Use key-value pairs instead. Log messages should be capitalized. Note: This contrasts with error messages, which should not be capitalized. However, both should not end with a punctuation mark. Keys and Values Use WithValues instead of repeatedly adding key-value pairs for multiple log statements. WithValues creates a new logger from the parent, that carries the given key-value pairs. E.g., use it when acting on one object in multiple steps and logging something for each step:\nlog := parentLog.WithValues(\"infrastructure\", client.ObjectKeyFromObject(infrastructure)) // ... log.Info(\"Creating Infrastructure\") // ... log.Info(\"Waiting for Infrastructure to be reconciled\") // ... Note: WithValues bypasses controller-runtime’s special zap encoder that nicely encodes ObjectKey/NamespacedName and runtime.Object values, see kubernetes-sigs/controller-runtime#1290. Thus, the end result might look different depending on the value and its Stringer implementation.\n Use lowerCamelCase for keys. 
Don’t put spaces in keys, as it will make log processing with simple tools like jq harder.\n Keys should be constant, human-readable, consistent across the codebase and naturally match parts of the log message, see logr guideline.\n When logging object keys (name and namespace), use the object’s type as the log key and a client.ObjectKey/types.NamespacedName value as value, e.g.:\nvar deployment *appsv1.Deployment log.Info(\"Creating Deployment\", \"deployment\", client.ObjectKeyFromObject(deployment)) which results in\n{\"level\":\"info\",\"ts\":\"2021-12-16T08:32:21.059+0100\",\"msg\":\"Creating Deployment\",\"deployment\":{\"name\": \"bar\", \"namespace\": \"foo\"}} There are cases where you don’t have the full object key or the object itself at hand, e.g., if an object references another object (in the same namespace) by name (think secretRef or similar). In such a cases, either construct the full object key including the implied namespace or log the object name under a key ending in Name, e.g.:\nvar ( // object to reconcile shoot *gardencorev1beta1.Shoot // retrieved via logf.FromContext, preconfigured by controller with namespace and name of reconciliation request log logr.Logger ) // option a: full object key, manually constructed log.Info(\"Shoot uses SecretBinding\", \"secretBinding\", client.ObjectKey{Namespace: shoot.Namespace, Name: *shoot.Spec.SecretBindingName}) // option b: only name under respective *Name log key log.Info(\"Shoot uses SecretBinding\", \"secretBindingName\", *shoot.Spec.SecretBindingName) Both options result in well-structured logs, that are easy to interpret and process:\n{\"level\":\"info\",\"ts\":\"2022-01-18T18:00:56.672+0100\",\"msg\":\"Shoot uses SecretBinding\",\"name\":\"my-shoot\",\"namespace\":\"garden-project\",\"secretBinding\":{\"namespace\":\"garden-project\",\"name\":\"aws\"}} {\"level\":\"info\",\"ts\":\"2022-01-18T18:00:56.673+0100\",\"msg\":\"Shoot uses 
SecretBinding\",\"name\":\"my-shoot\",\"namespace\":\"garden-project\",\"secretBindingName\":\"aws\"} When handling generic client.Object values (e.g. in helper funcs), use object as key.\n When adding timestamps to key-value pairs, use time.Time values. By this, they will be encoded in the same format as the log entry’s timestamp.\nDon’t use metav1.Time values, as they will be encoded in a different format by their Stringer implementation. Pass \u003csomeTimestamp\u003e.Time to loggers in case you have a metav1.Time value at hand.\n Same applies to durations. Use time.Duration values instead of *metav1.Duration. Durations can be handled specially by zap just like timestamps.\n Event recorders not only create Event objects but also log them. However, both Gardener’s manually instantiated event recorders and the ones that controller-runtime provides log to debug level and use generic formats, that are not very easy to interpret or process (no structured logs). Hence, don’t use event recorders as replacements for well-structured logs. If a controller records an event for a completed action or important information, it should probably log it as well, e.g.:\nlog.Info(\"Creating ManagedSeed\", \"replica\", r.GetObjectKey()) a.recorder.Eventf(managedSeedSet, corev1.EventTypeNormal, EventCreatingManagedSeed, \"Creating ManagedSeed %s\", r.GetFullName()) Logging in Test Code If the tested production code requires a logger, you can pass logr.Discard() or logf.NullLogger{} in your test, which simply discards all logs.\n logf.Log is safe to use in tests and will not cause a nil pointer deref, even if it’s not initialized via logf.SetLogger. 
It is initially set to a NullLogger by default, which means all logs are discarded, unless logf.SetLogger is called in the first 30 seconds of execution.\n Pass zap.WriteTo(GinkgoWriter) in tests where you want to see the logs on test failure but not on success, for example:\nlogf.SetLogger(logger.MustNewZapLogger(logger.DebugLevel, logger.FormatJSON, zap.WriteTo(GinkgoWriter))) log := logf.Log.WithName(\"test\") ","categories":"","description":"","excerpt":"Logging in Gardener Components This document aims at providing a …","ref":"/docs/gardener/logging/","tags":"","title":"Logging"},{"body":"Logging and Monitoring for Extensions Gardener provides an integrated logging and monitoring stack for alerting, monitoring, and troubleshooting of its managed components by operators or end users. For further information on how to make use of it in these roles, refer to the corresponding guides for exploring logs and for monitoring with Plutono.\nThe components that constitute the logging and monitoring stack are managed by Gardener. By default, it deploys Prometheus and Alertmanager (managed via prometheus-operator), and Plutono into the garden namespace of all seed clusters. If logging is enabled in the gardenlet configuration (logging.enabled), it will deploy fluent-operator and Vali in the garden namespace too.\nEach shoot namespace hosts managed logging and monitoring components. As part of the shoot reconciliation flow, Gardener deploys a shoot-specific Prometheus, blackbox-exporter, Plutono, and, if configured, an Alertmanager into the shoot namespace, next to the other control plane components. If logging is enabled in the gardenlet configuration (logging.enabled) and the shoot purpose is not testing, it deploys a shoot-specific Vali in the shoot namespace too.\nThe logging and monitoring stack is extensible by configuration. 
Gardener extensions can take advantage of that and contribute monitoring configurations encoded in ConfigMaps for their own, specific dashboards, alerts and other supported assets and integrate with it. As with other Gardener resources, they will be continuously reconciled. The extensions can also deploy directly fluent-operator custom resources which will be created in the seed cluster and plugged into the fluent-bit instance.\nThis guide is about the roles and extensibility options of the logging and monitoring stack components, and how to integrate extensions with:\n Monitoring Logging Monitoring Seed Cluster Cache Prometheus The central Prometheus instance in the garden namespace (called “cache Prometheus”) fetches metrics and data from all seed cluster nodes and all seed cluster pods. It uses the federation concept to allow the shoot-specific instances to scrape only the metrics for the pods of the control plane they are responsible for. This mechanism allows to scrape the metrics for the nodes/pods once for the whole cluster, and to have them distributed afterwards. For more details, continue reading here.\nTypically, this is not necessary, but in case an extension wants to extend the configuration for this cache Prometheus, they can create the prometheus-operator’s custom resources and label them with prometheus=cache, for example:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: cache name: cache-my-component namespace: garden spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics Seed Prometheus Another Prometheus instance in the garden namespace (called “seed Prometheus”) fetches metrics and data from seed system components, kubelets, cAdvisors, and extensions. 
If you want your extension pods to be scraped then they must be annotated with prometheus.io/scrape=true and prometheus.io/port=\u003cmetrics-port\u003e. For more details, continue reading here.\nTypically, this is not necessary, but in case an extension wants to extend the configuration for this seed Prometheus, they can create the prometheus-operator’s custom resources and label them with prometheus=seed, for example:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: seed name: seed-my-component namespace: garden spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics Aggregate Prometheus Another Prometheus instance in the garden namespace (called “aggregate Prometheus”) stores pre-aggregated data from the cache Prometheus and shoot Prometheus. An ingress exposes this Prometheus instance allowing it to be scraped from another cluster. For more details, continue reading here.\nTypically, this is not necessary, but in case an extension wants to extend the configuration for this aggregate Prometheus, they can create the prometheus-operator’s custom resources and label them with prometheus=aggregate, for example:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: aggregate name: aggregate-my-component namespace: garden spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics Plutono A Plutono instance is deployed by gardenlet into the seed cluster’s garden namespace for visualizing monitoring metrics and logs via dashboards. 
In order to provide custom dashboards, create a ConfigMap in the garden namespace labelled with dashboard.monitoring.gardener.cloud/seed=true that contains the respective JSON documents, for example:\napiVersion: v1 kind: ConfigMap metadata: labels: dashboard.monitoring.gardener.cloud/seed: \"true\" name: extension-foo-my-custom-dashboard namespace: garden data: my-custom-dashboard.json: \u003cdashboard-JSON-document\u003e Shoot Cluster Shoot Prometheus The shoot-specific metrics are then made available to operators and users in the shoot Plutono, using the shoot Prometheus as data source.\nExtension controllers might deploy components as part of their reconciliation next to the shoot’s control plane. Examples for this would be a cloud-controller-manager or CSI controller deployments. Extensions that want to have their managed control plane components integrated with monitoring can contribute their per-shoot configuration for scraping Prometheus metrics, Alertmanager alerts or Plutono dashboards.\nExtensions Monitoring Integration In case an extension wants to extend the configuration for the shoot Prometheus, they can create the prometheus-operator’s custom resources and label them with prometheus=shoot.\nServiceMonitor When the component runs in the seed cluster (e.g., as part of the shoot control plane), ServiceMonitor resources should be used:\napiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: labels: prometheus: shoot name: shoot-my-controlplane-component namespace: shoot--foo--bar spec: selector: matchLabels: app: my-component endpoints: - metricRelabelings: - action: keep regex: ^(metric1|metric2|...)$ sourceLabels: - __name__ port: metrics In case HTTPS scheme is used, the CA certificate should be provided like this:\nspec: scheme: HTTPS tlsConfig: ca: secret: name: \u003cname-of-ca-bundle-secret\u003e key: bundle.crt In case the component requires credentials when contacting its metrics endpoint, provide them like this:\nspec: 
authorization: credentials: name: \u003cname-of-secret-containing-credentials\u003e key: \u003cdata-keyin-secret\u003e If the component delegates authorization to the kube-apiserver of the shoot cluster, you can use the shoot-access-prometheus-shoot secret:\nspec: authorization: credentials: name: shoot-access-prometheus-shoot key: token # in case the component's server certificate is signed by the cluster CA: scheme: HTTPS tlsConfig: ca: secret: name: \u003cname-of-ca-bundle-secret\u003e key: bundle.crt ScrapeConfigs If the component runs in the shoot cluster itself, metrics are scraped via the kube-apiserver proxy. In this case, Prometheus needs to authenticate itself with the API server. This can be done like this:\napiVersion: monitoring.coreos.com/v1alpha1 kind: ScrapeConfig metadata: labels: prometheus: shoot name: shoot-my-cluster-component namespace: shoot--foo--bar spec: authorization: credentials: name: shoot-access-prometheus-shoot key: token scheme: HTTPS tlsConfig: ca: secret: name: \u003cname-of-ca-bundle-secret\u003e key: bundle.crt kubernetesSDConfigs: - apiServer: https://kube-apiserver authorization: credentials: name: shoot-access-prometheus-shoot key: token followRedirects: true namespaces: names: - kube-system role: endpoints tlsConfig: ca: secret: name: \u003cname-of-ca-bundle-secret\u003e key: bundle.crt cert: {} metricRelabelings: - sourceLabels: - __name__ action: keep regex: ^(metric1|metric2)$ - sourceLabels: - namespace action: keep regex: kube-system relabelings: - action: replace replacement: my-cluster-component targetLabel: job - sourceLabels: [__meta_kubernetes_service_name, __meta_kubernetes_pod_container_port_name] separator: ; regex: my-component-service;metrics replacement: $1 action: keep - sourceLabels: [__meta_kubernetes_endpoint_node_name] separator: ; regex: (.*) targetLabel: node replacement: $1 action: replace - sourceLabels: [__meta_kubernetes_pod_name] separator: ; regex: (.*) targetLabel: pod replacement: $1 action: 
replace - targetLabel: __address__ replacement: kube-apiserver:443 - sourceLabels: [__meta_kubernetes_pod_name, __meta_kubernetes_pod_container_port_number] separator: ; regex: (.+);(.+) targetLabel: __metrics_path__ replacement: /api/v1/namespaces/kube-system/pods/${1}:${2}/proxy/metrics action: replace [!TIP] Developers can make use of the pkg/component/observability/monitoring/prometheus/shoot.ClusterComponentScrapeConfigSpec function in order to generate a ScrapeConfig like above.\n PrometheusRule Similar to ServiceMonitors, PrometheusRules can be created with the prometheus=shoot label:\napiVersion: monitoring.coreos.com/v1 kind: PrometheusRule metadata: labels: prometheus: shoot name: shoot-my-component namespace: shoot--foo--bar spec: groups: - name: my.rules rules: # ... Plutono Dashboards A Plutono instance is deployed by gardenlet into the shoot cluster’s namespace for visualizing monitoring metrics and logs via dashboards. In order to provide custom dashboards, create a ConfigMap in the shoot cluster’s namespace labelled with dashboard.monitoring.gardener.cloud/shoot=true that contains the respective JSON documents, for example:\napiVersion: v1 kind: ConfigMap metadata: labels: dashboard.monitoring.gardener.cloud/shoot: \"true\" name: extension-foo-my-custom-dashboard namespace: shoot--project--name data: my-custom-dashboard.json: \u003cdashboard-JSON-document\u003e Logging In Kubernetes clusters, container logs are non-persistent and do not survive stopped and destroyed containers. Gardener addresses this problem for the components hosted in a seed cluster by introducing its own managed logging solution. It is integrated with the Gardener monitoring stack to have all troubleshooting context in one place.\nGardener logging consists of components in three roles - log collectors and forwarders, log persistency and exploration/consumption interfaces. 
All of them live in the seed clusters in multiple instances:\n Logs are persisted by Vali instances deployed as StatefulSets - one per shoot namespace, if the logging is enabled in the gardenlet configuration (logging.enabled) and the shoot purpose is not testing, and one in the garden namespace. The shoot instances store logs from the control plane components hosted there. The garden Vali instance is responsible for logs from the rest of the seed namespaces - kube-system, garden, extension-*, and others. Fluent-bit DaemonSets deployed by the fluent-operator on each seed node collect logs from it. A custom plugin takes care to distribute the collected log messages to the Vali instances that they are intended for. This allows to fetch the logs once for the whole cluster, and to distribute them afterwards. Plutono is the UI component used to explore monitoring and log data together for easier troubleshooting and in context. Plutono instances are configured to use the corresponding Vali instances, sharing the same namespace as data providers. There is one Plutono Deployment in the garden namespace and one Deployment per shoot namespace (exposed to the end users and to the operators). Logs can be produced from various sources, such as containers or systemd, and in different formats. The fluent-bit design supports configurable data pipeline to address that problem. Gardener provides such configuration for logs produced by all its core managed components as ClusterFilters and ClusterParsers . Extensions can contribute their own, specific configurations as fluent-operator custom resources too. See for example the logging configuration for the Gardener AWS provider extension.\nFluent-bit Log Parsers and Filters To integrate with Gardener logging, extensions can and should specify how fluent-bit will handle the logs produced by the managed components that they contribute to Gardener. 
Normally, that would require to configure a parser for the specific logging format, if none of the available is applicable, and a filter defining how to apply it. For a complete reference for the configuration options, refer to fluent-bit’s documentation.\nTo contribute its own configuration to the fluent-bit agents data pipelines, an extension must deploy a fluent-operator custom resource labeled with fluentbit.gardener/type: seed in the seed cluster.\n Note: Take care to provide the correct data pipeline elements in the corresponding fields and not to mix them.\n Example: Logging configuration for provider-specific cloud-controller-manager deployed into shoot namespaces that reuses the kube-apiserver-parser defined in logging.go to parse the component logs:\napiVersion: fluentbit.fluent.io/v1alpha2 kind: ClusterFilter metadata: labels: fluentbit.gardener/type: \"seed\" name: cloud-controller-manager-aws-cloud-controller-manager spec: filters: - parser: keyName: log parser: kube-apiserver-parser reserveData: true match: kubernetes.*cloud-controller-manager*aws-cloud-controller-manager* Further details how to define parsers and use them with examples can be found in the following guide.\nPlutono The two types of Plutono instances found in a seed cluster are configured to expose logs of different origin in their dashboards:\n Garden Plutono dashboards expose logs from non-shoot namespaces of the seed clusters Pod Logs Extensions Systemd Logs Shoot Plutono dashboards expose logs from the shoot cluster namespace where they belong Kube Apiserver Kube Controller Manager Kube Scheduler Cluster Autoscaler VPA components Kubernetes Pods If the type of logs exposed in the Plutono instances needs to be changed, it is necessary to update the corresponding instance dashboard configurations.\nTips Be careful to create ClusterFilters and ClusterParsers with unique names because they are not namespaced. 
We use pod_name for filters with one container and pod_name--container_name for pods with multiple containers. Be careful to match exactly the log names that you need for a particular parser in your filters configuration. The regular expression you will supply will match names in the form kubernetes.pod_name.\u003cmetadata\u003e.container_name. If there are extensions with the same container and pod names, they will all match the same parser in a filter. That may be a desired effect, if they all share the same log format. But it will be a problem if they don’t. To solve it, either the pod or container names must be unique, and the regular expression in the filter has to match that unique pattern. A recommended approach is to prefix containers with the extension name and tune the regular expression to match it. For example, using myextension-container as container name and a regular expression kubernetes.mypod.*myextension-container will guarantee match of the right log name. Make sure that the regular expression does not match more than you expect. For example, kubernetes.systemd.*systemd.* will match both systemd-service and systemd-monitor-service. You will want to be as specific as possible. It’s a good idea to put the logging configuration into the Helm chart that also deploys the extension controller, while the monitoring configuration can be part of the Helm chart/deployment routine that deploys the component managed by the controller. 
References and Additional Resources GitHub Issue Describing the Concept Exemplary Implementation (Monitoring) for the GCP Provider Exemplary Implementation (ClusterFilter) for the AWS Provider Exemplary Implementation (ClusterParser) for the Shoot DNS Service ","categories":"","description":"","excerpt":"Logging and Monitoring for Extensions Gardener provides an integrated …","ref":"/docs/gardener/extensions/logging-and-monitoring/","tags":"","title":"Logging And Monitoring"},{"body":"Logging Stack Motivation Kubernetes uses the underlying container runtime logging, which does not persist logs for stopped and destroyed containers. This makes it difficult to investigate issues in the very common case of not running containers. Gardener provides a solution to this problem for the managed cluster components by introducing its own logging stack.\nComponents A Fluent-bit daemonset which works like a log collector and custom Golang plugin which spreads log messages to their Vali instances. One Vali Statefulset in the garden namespace which contains logs for the seed cluster and one per shoot namespace which contains logs for shoot’s controlplane. One Plutono Deployment in garden namespace and two Deployments per shoot namespace (one exposed to the end users and one for the operators). Plutono is the UI component used in the logging stack. Container Logs Rotation and Retention Container log rotation in Kubernetes describes a subtile but important implementation detail depending on the type of the used high-level container runtime. When the used container runtime is not CRI compliant (such as dockershim), then the kubelet does not provide any rotation or retention implementations, hence leaving those aspects to the downstream components. 
When the used container runtime is CRI compliant (such as containerd), then the kubelet provides the necessary implementation with two configuration options:\n ContainerLogMaxSize for rotation ContainerLogMaxFiles for retention ContainerD Runtime In this case, it is possible to configure the containerLogMaxSize and containerLogMaxFiles fields in the Shoot specification. Both fields are optional and if nothing is specified, then the kubelet rotates on the size 100M. Those fields are part of provider’s workers definition. Here is an example:\nspec: provider: workers: - cri: name: containerd kubernetes: kubelet: # accepted values are of resource.Quantity containerLogMaxSize: 150Mi containerLogMaxFiles: 10 The values of the containerLogMaxSize and containerLogMaxFiles fields need to be considered with care since container log files claim disk space from the host. On the opposite side, log rotations on too small sizes may result in frequent rotations which can be missed by other components (log shippers) observing these rotations.\nIn the majority of the cases, the defaults should do just fine. Custom configuration might be of use under rare conditions.\nExtension of the Logging Stack The logging stack is extended to scrape logs from the systemd services of each shoots’ nodes and from all Gardener components in the shoot kube-system namespace. These logs are exposed only to the Gardener operators.\nAlso, in the shoot control plane an event-logger pod is deployed, which scrapes events from the shoot kube-system namespace and shoot control-plane namespace in the seed. The event-logger logs the events to the standard output. Then the fluent-bit gets these events as container logs and sends them to the Vali in the shoot control plane (similar to how it works for any other control plane component). How to Access the Logs The logs are accessible via Plutono. To access them:\n Authenticate via basic auth to gain access to Plutono. 
The Plutono URL can be found in the Logging and Monitoring section of a cluster in the Gardener Dashboard alongside the credentials. The secret containing the credentials is stored in the project namespace following the naming pattern \u003cshoot-name\u003e.monitoring. For Gardener operators, the credentials are also stored in the control-plane (shoot--\u003cproject-name\u003e--\u003cshoot-name\u003e) namespace in the observability-ingress-users-\u003chash\u003e secret in the seed.\n Plutono contains several dashboards that aim to facilitate the work of operators and users. From the Explore tab, users and operators have unlimited abilities to extract and manipulate logs.\n Note: Gardener operators are members of the Gardener team with operator permissions, not operators of the end-user cluster!\n How to Use the Explore Tab If you click on the Log browser \u003e button, you will see all of the available labels. By clicking on a label, you can see all of its available values for the period of time you have specified. If you are searching for logs for the past one hour, do not expect to see labels or values for which there were no logs for that period of time. When you click on a value, Plutono automatically eliminates all other labels and/or values with which no valid log stream can be made. After choosing the right labels and their values, click on the Show logs button. This will build the log query and execute it. This approach is convenient when you don’t know the label names or their values. Once you feel comfortable, you can start to use the LogQL language to search for logs. Next to the Log browser \u003e button is the place where you can type log queries.\nExamples:\n If you want to get logs for calico-node-\u003chash\u003e pod in the cluster kube-system: The name of the node on which calico-node was running is known, but not the hash suffix of the calico-node pod. 
We also want to search for errors in the logs.\n{pod_name=~\"calico-node-.+\", nodename=\"ip-10-222-31-182.eu-central-1.compute.internal\"} |~ \"error\"\nHere, Plutono helps as much as possible by offering suggestions and auto-completion.\n If you want to get the logs from kubelet systemd service of a given node and search for a pod name in the logs:\n{unit=\"kubelet.service\", nodename=\"ip-10-222-31-182.eu-central-1.compute.internal\"} |~ \"pod name\"\n Note: The unit label covers only the docker, containerd, kubelet, and kernel logs.\n If you want to get the logs from gardener-node-agent systemd service of a given node and search for a string in the logs:\n{job=\"systemd-combine-journal\",nodename=\"ip-10-222-31-182.eu-central-1.compute.internal\"} | unpack | unit=\"gardener-node-agent.service\"\n Note: The {job=\"systemd-combine-journal\",nodename=\"\u003cnode name\u003e\"} stream packs all logs from systemd services except docker, containerd, kubelet, and kernel. To filter those logs by unit, you have to unpack them first.\n Retrieving events: If you want to get the events from the shoot kube-system namespace generated by kubelet and related to the node-problem-detector:\n{job=\"event-logging\"} | unpack | origin_extracted=\"shoot\",source=\"kubelet\",object=~\".*node-problem-detector.*\"\n If you want to get the events generated by MCM in the shoot control plane in the seed:\n{job=\"event-logging\"} | unpack | origin_extracted=\"seed\",source=~\".*machine-controller-manager.*\"\n Note: In order to group events by origin, one has to specify origin_extracted because the origin label is reserved for all of the logs from the seed and the event-logger resides in the seed, so all of its logs appear to come from the seed only. The actual origin is embedded in the unpacked event. 
When unpacked, the embedded origin becomes origin_extracted.\n Expose Logs for Component to User Plutono Exposing logs for a new component to the User’s Plutono is described in the How to Expose Logs to the Users section.\nConfiguration Fluent-bit The Fluent-bit configurations can be found on pkg/component/observability/logging/fluentoperator/customresources There are six different specifications:\n FluentBit: Defines the fluent-bit DaemonSet specifications ClusterFluentBitConfig: Defines the labelselectors of the resources which fluent-bit will match ClusterInput: Defines the location of the input stream of the logs ClusterOutput: Defines the location of the output source (Vali for example) ClusterFilter: Defines filters which match specific keys ClusterParser: Defines parsers which are used by the filters Vali The Vali configurations can be found on charts/seed-bootstrap/charts/vali/templates/vali-configmap.yaml\nThe main specifications there are:\n Index configuration: Currently the following one is used: schema_config: configs: - from: 2018-04-15 store: boltdb object_store: filesystem schema: v11 index: prefix: index_ period: 24h from: Is the date from which logs collection is started. Using a date in the past is okay. store: The DB used for storing the index. object_store: Where the data is stored. schema: Schema version which should be used (v11 is currently recommended). index.prefix: The prefix for the index. index.period: The period for updating the indices. Adding a new index happens with new config block definition. The from field should start from the current day + previous index.period and should not overlap with the current index. 
The prefix also should be different.\n schema_config: configs: - from: 2018-04-15 store: boltdb object_store: filesystem schema: v11 index: prefix: index_ period: 24h - from: 2020-06-18 store: boltdb object_store: filesystem schema: v11 index: prefix: index_new_ period: 24h chunk_store_config Configuration chunk_store_config: max_look_back_period: 336h chunk_store_config.max_look_back_period should be the same as the retention_period\n table_manager Configuration table_manager: retention_deletes_enabled: true retention_period: 336h table_manager.retention_period is the living time for each log message. Vali will keep messages for (table_manager.retention_period - index.period) time due to specification in the Vali implementation.\nPlutono This is the Vali configuration that Plutono uses:\n - name: vali type: vali access: proxy url: http://logging.{{ .Release.Namespace }}.svc:3100 jsonData: maxLines: 5000 name: Is the name of the datasource. type: Is the type of the datasource. access: Should be set to proxy. url: Vali’s url svc: Vali’s port jsonData.maxLines: The limit of the log messages which Plutono will show to the users. 
Decrease this value if the browser works slowly!\n","categories":"","description":"","excerpt":"Logging Stack Motivation Kubernetes uses the underlying container …","ref":"/docs/gardener/logging-usage/","tags":"","title":"Logging Usage"},{"body":"Creating/Deleting machines (VM) Creating/Deleting machines (VM) Setting up your usage environment Important : Creating machine Inspect status of machine Delete machine Setting up your usage environment Follow the steps described here Important : Make sure that the kubernetes/machine_objects/machine.yaml points to the same class name as the kubernetes/machine_classes/aws-machine-class.yaml.\n Similarly kubernetes/machine_objects/aws-machine-class.yaml secret name and namespace should be same as that mentioned in kubernetes/secrets/aws-secret.yaml\n Creating machine Modify kubernetes/machine_objects/machine.yaml as per your requirement and create the VM as shown below: $ kubectl apply -f kubernetes/machine_objects/machine.yaml You should notice that the Machine Controller Manager has immediately picked up your manifest and started to create a new machine by talking to the cloud provider.\n Check Machine Controller Manager machines in the cluster $ kubectl get machine NAME STATUS AGE test-machine Running 5m A new machine is created with the name provided in the kubernetes/machine_objects/machine.yaml file.\n After a few minutes (~3 minutes for AWS), you should notice a new node joining the cluster. You can verify this by running: $ kubectl get nodes NAME STATUS AGE VERSION ip-10-250-14-52.eu-east-1.compute.internal. 
Ready 1m v1.8.0 This shows that a new node has successfully joined the cluster.\nInspect status of machine To inspect the status of any created machine, run the command given below.\n$ kubectl get machine test-machine -o yaml apiVersion: machine.sapcloud.io/v1alpha1 kind: Machine metadata: annotations: kubectl.kubernetes.io/last-applied-configuration: | {\"apiVersion\":\"machine.sapcloud.io/v1alpha1\",\"kind\":\"Machine\",\"metadata\":{\"annotations\":{},\"labels\":{\"test-label\":\"test-label\"},\"name\":\"test-machine\",\"namespace\":\"\"},\"spec\":{\"class\":{\"kind\":\"AWSMachineClass\",\"name\":\"test-aws\"}}} clusterName: \"\" creationTimestamp: 2017-12-27T06:58:21Z finalizers: - machine.sapcloud.io/operator generation: 0 initializers: null labels: node: ip-10-250-14-52.eu-east-1.compute.internal test-label: test-label name: test-machine namespace: \"\" resourceVersion: \"12616948\" selfLink: /apis/machine.sapcloud.io/v1alpha1/test-machine uid: 535e596c-ead3-11e7-a6c0-828f843e4186 spec: class: kind: AWSMachineClass name: test-aws providerID: aws:///eu-east-1/i-00bef3f2618ffef23 status: conditions: - lastHeartbeatTime: 2017-12-27T07:00:46Z lastTransitionTime: 2017-12-27T06:59:16Z message: kubelet has sufficient disk space available reason: KubeletHasSufficientDisk status: \"False\" type: OutOfDisk - lastHeartbeatTime: 2017-12-27T07:00:46Z lastTransitionTime: 2017-12-27T06:59:16Z message: kubelet has sufficient memory available reason: KubeletHasSufficientMemory status: \"False\" type: MemoryPressure - lastHeartbeatTime: 2017-12-27T07:00:46Z lastTransitionTime: 2017-12-27T06:59:16Z message: kubelet has no disk pressure reason: KubeletHasNoDiskPressure status: \"False\" type: DiskPressure - lastHeartbeatTime: 2017-12-27T07:00:46Z lastTransitionTime: 2017-12-27T07:00:06Z message: kubelet is posting ready status reason: KubeletReady status: \"True\" type: Ready currentStatus: lastUpdateTime: 2017-12-27T07:00:06Z phase: Running lastOperation: description: Machine 
is now ready lastUpdateTime: 2017-12-27T07:00:06Z state: Successful type: Create node: ip-10-250-14-52.eu-west-1.compute.internal Delete machine To delete the VM, use kubernetes/machine_objects/machine.yaml as shown below:\n$ kubectl delete -f kubernetes/machine_objects/machine.yaml Now the Machine Controller Manager picks up the manifest immediately and starts to delete the existing VM by talking to the cloud provider. The node should be detached from the cluster in a few minutes (~1min for AWS).\n","categories":"","description":"","excerpt":"Creating/Deleting machines (VM) Creating/Deleting machines (VM) …","ref":"/docs/other-components/machine-controller-manager/machine/","tags":"","title":"Machine"},{"body":"machine-controller-manager-provider-local Out of tree (controller-based) implementation for local as a new provider. The local out-of-tree provider implements the interface defined at MCM OOT driver.\nFundamental Design Principles Following are the basic principles kept in mind while developing the external plugin.\n Communication between this Machine Controller (MC) and Machine Controller Manager (MCM) is achieved using the Kubernetes native declarative approach. Machine Controller (MC) behaves as the controller used to interact with the local provider and manage the VMs corresponding to the machine objects. Machine Controller Manager (MCM) deals with higher-level objects such as machine-set and machine-deployment objects. 
","categories":"","description":"","excerpt":"machine-controller-manager-provider-local Out of tree …","ref":"/docs/gardener/extensions/machine-controller-provider-local/","tags":"","title":"Machine Controller Provider Local"},{"body":"Maintaining machine replicas using machines-deployments Maintaining machine replicas using machines-deployments Setting up your usage environment Important ⚠️ Creating machine-deployment Inspect status of machine-deployment Health monitoring Update your machines Inspect existing cluster configuration Perform a rolling update Re-check cluster configuration More variants of updates Undo an update Pause an update Delete machine-deployment Setting up your usage environment Follow the steps described here\nImportant ⚠️ Make sure that the kubernetes/machine_objects/machine-deployment.yaml points to the same class name as the kubernetes/machine_classes/aws-machine-class.yaml.\n Similarly kubernetes/machine_classes/aws-machine-class.yaml secret name and namespace should be same as that mentioned in kubernetes/secrets/aws-secret.yaml\n Creating machine-deployment Modify kubernetes/machine_objects/machine-deployment.yaml as per your requirement. Modify the number of replicas to the desired number of machines. Then, create an machine-deployment. 
$ kubectl apply -f kubernetes/machine_objects/machine-deployment.yaml Now the Machine Controller Manager picks up the manifest immediately and starts to create new machines based on the number of replicas you have provided in the manifest.\n Check Machine Controller Manager machine-deployments in the cluster $ kubectl get machinedeployment NAME READY DESIRED UP-TO-DATE AVAILABLE AGE test-machine-deployment 3 3 3 0 10m You will notice a new machine-deployment with your given name\n Check Machine Controller Manager machine-sets in the cluster $ kubectl get machineset NAME DESIRED CURRENT READY AGE test-machine-deployment-5bc6dd7c8f 3 3 0 10m You will notice a new machine-set backing your machine-deployment\n Check Machine Controller Manager machines in the cluster $ kubectl get machine NAME STATUS AGE test-machine-deployment-5bc6dd7c8f-5d24b Pending 5m test-machine-deployment-5bc6dd7c8f-6mpn4 Pending 5m test-machine-deployment-5bc6dd7c8f-dpt2q Pending 5m Now you will notice N (number of replicas specified in the manifest) new machines whose names are prefixed with the machine-deployment object name that you created.\n After a few minutes (~3 minutes for AWS), you should see that new nodes have joined the cluster. 
You can see this using $ kubectl get nodes NAME STATUS AGE VERSION ip-10-250-20-19.eu-west-1.compute.internal Ready 1m v1.8.0 ip-10-250-27-123.eu-west-1.compute.internal Ready 1m v1.8.0 ip-10-250-31-80.eu-west-1.compute.internal Ready 1m v1.8.0 This shows how new nodes have joined your cluster\nInspect status of machine-deployment To inspect the status of any created machine-deployment run the command below,\n$ kubectl get machinedeployment test-machine-deployment -o yaml You should get the following output.\napiVersion: machine.sapcloud.io/v1alpha1 kind: MachineDeployment metadata: annotations: deployment.kubernetes.io/revision: \"1\" kubectl.kubernetes.io/last-applied-configuration: | {\"apiVersion\":\"machine.sapcloud.io/v1alpha1\",\"kind\":\"MachineDeployment\",\"metadata\":{\"annotations\":{},\"name\":\"test-machine-deployment\",\"namespace\":\"\"},\"spec\":{\"minReadySeconds\":200,\"replicas\":3,\"selector\":{\"matchLabels\":{\"test-label\":\"test-label\"}},\"strategy\":{\"rollingUpdate\":{\"maxSurge\":1,\"maxUnavailable\":1},\"type\":\"RollingUpdate\"},\"template\":{\"metadata\":{\"labels\":{\"test-label\":\"test-label\"}},\"spec\":{\"class\":{\"kind\":\"AWSMachineClass\",\"name\":\"test-aws\"}}}}} clusterName: \"\" creationTimestamp: 2017-12-27T08:55:56Z generation: 0 initializers: null name: test-machine-deployment namespace: \"\" resourceVersion: \"12634168\" selfLink: /apis/machine.sapcloud.io/v1alpha1/test-machine-deployment uid: c0b488f7-eae3-11e7-a6c0-828f843e4186 spec: minReadySeconds: 200 replicas: 3 selector: matchLabels: test-label: test-label strategy: rollingUpdate: maxSurge: 1 maxUnavailable: 1 type: RollingUpdate template: metadata: creationTimestamp: null labels: test-label: test-label spec: class: kind: AWSMachineClass name: test-aws status: availableReplicas: 3 conditions: - lastTransitionTime: 2017-12-27T08:57:22Z lastUpdateTime: 2017-12-27T08:57:22Z message: Deployment has minimum availability. 
reason: MinimumReplicasAvailable status: \"True\" type: Available readyReplicas: 3 replicas: 3 updatedReplicas: 3 Health monitoring Health monitoring is applied in the same way as described for machine-sets\nUpdate your machines Let us consider the scenario where you wish to update all nodes of your cluster from t2.xlarge machines to m5.xlarge machines. Assume that your current test-aws has its spec.machineType: t2.xlarge and your deployment test-machine-deployment points to this AWSMachineClass.\nInspect existing cluster configuration Check Nodes present in the cluster $ kubectl get nodes NAME STATUS AGE VERSION ip-10-250-20-19.eu-west-1.compute.internal Ready 1m v1.8.0 ip-10-250-27-123.eu-west-1.compute.internal Ready 1m v1.8.0 ip-10-250-31-80.eu-west-1.compute.internal Ready 1m v1.8.0 Check Machine Controller Manager machine-sets in the cluster. You will notice one machine-set backing your machine-deployment $ kubectl get machineset NAME DESIRED CURRENT READY AGE test-machine-deployment-5bc6dd7c8f 3 3 3 10m Log in to your cloud provider (AWS). In the VM management console, you will find N VMs created of type t2.xlarge. Perform a rolling update To update the VMs of this machine-deployment to m5.xlarge, do the following:\n Copy your existing aws-machine-class.yaml cp kubernetes/machine_classes/aws-machine-class.yaml kubernetes/machine_classes/aws-machine-class-new.yaml Modify aws-machine-class-new.yaml, and update its metadata.name: test-aws2 and spec.machineType: m5.xlarge Now create this modified MachineClass kubectl apply -f kubernetes/machine_classes/aws-machine-class-new.yaml Edit your existing machine-deployment kubectl edit machinedeployment test-machine-deployment Update from spec.template.spec.class.name: test-aws to spec.template.spec.class.name: test-aws2 Re-check cluster configuration After a few minutes (~3mins)\n Check nodes present in cluster now. They are different nodes. 
$ kubectl get nodes NAME STATUS AGE VERSION ip-10-250-11-171.eu-west-1.compute.internal Ready 4m v1.8.0 ip-10-250-17-213.eu-west-1.compute.internal Ready 5m v1.8.0 ip-10-250-31-81.eu-west-1.compute.internal Ready 5m v1.8.0 Check Machine Controller Manager machine-sets in the cluster. You will notice two machine-sets backing your machine-deployment $ kubectl get machineset NAME DESIRED CURRENT READY AGE test-machine-deployment-5bc6dd7c8f 0 0 0 1h test-machine-deployment-86ff45cc5 3 3 3 20m Log in to your cloud provider (AWS). In the VM management console, you will find N VMs created of type t2.xlarge in terminated state, and N new VMs of type m5.xlarge in running state. This shows how a rolling update of a cluster from nodes with t2.xlarge to m5.xlarge went through.\nMore variants of updates The above demonstration was a simple use case. Updates can be more complex, such as updating system disk image versions, kubelet versions, or security patches. You can also play around with the maxSurge and maxUnavailable fields in machine-deployment.yaml You can also change the update strategy from rollingupdate to recreate Undo an update Edit the existing machine-deployment $ kubectl edit machinedeployment test-machine-deployment Edit the deployment to add the field spec.rollbackTo.revision: 0, as shown in the comments in kubernetes/machine_objects/machine-deployment.yaml This will roll your update back to the previous version. 
Pause an update You can also pause an update while it is in progress by editing the existing machine-deployment $ kubectl edit machinedeployment test-machine-deployment Edit the deployment to add the field spec.paused: true, as shown in the comments in kubernetes/machine_objects/machine-deployment.yaml\n This will pause the rollingUpdate if it’s in progress\n To resume the update, edit the deployment as mentioned above and remove the field spec.paused: true added earlier\n Delete machine-deployment To delete the VMs using kubernetes/machine_objects/machine-deployment.yaml $ kubectl delete -f kubernetes/machine_objects/machine-deployment.yaml The Machine Controller Manager picks up the manifest and starts to delete the existing VMs by talking to the cloud provider. The nodes should be detached from the cluster in a few minutes (~1min for AWS).\n","categories":"","description":"","excerpt":"Maintaining machine replicas using machines-deployments Maintaining …","ref":"/docs/other-components/machine-controller-manager/machine_deployment/","tags":"","title":"Machine Deployment"},{"body":"Machine Error code handling Notational Conventions The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” are to be interpreted as described in RFC 2119 (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, March 1997).\nThe key words “unspecified”, “undefined”, and “implementation-defined” are to be interpreted as described in the rationale for the C99 standard.\nAn implementation is not compliant if it fails to satisfy one or more of the MUST, REQUIRED, or SHALL requirements for the protocols it implements. 
An implementation is compliant if it satisfies all the MUST, REQUIRED, and SHALL requirements for the protocols it implements.\nTerminology Term Definition CR Custom Resource (CR) is defined by a cluster admin using the Kubernetes Custom Resource Definition primitive. VM A Virtual Machine (VM) provisioned and managed by a provider. It could also refer to a physical machine in case of a bare metal provider. Machine Machine refers to a VM that is provisioned/managed by MCM. It typically describes the metadata used to store/represent a Virtual Machine Node Native kubernetes Node object. The objects you get to see when you do a “kubectl get nodes”. Although nodes can be either physical/virtual machines, for the purposes of our discussions it refers to a VM. MCM Machine Controller Manager (MCM) is the controller used to manage higher-level Machine Custom Resources (CRs) such as machine-set and machine-deployment CRs. Provider/Driver/MC Provider (or) Driver (or) Machine Controller (MC) is the driver responsible for managing machine objects present in the cluster from whom it manages these machines. A simple example could be creation/deletion of VM on the provider. Pre-requisite MachineClass Resources MCM introduces the CRD MachineClass. This is a blueprint for creating machines that join a certain cluster as nodes in a certain role. The provider only works with MachineClass resources that have the structure described here.\nProviderSpec The MachineClass resource contains a providerSpec field that is passed in the ProviderSpec request field to CMI methods such as CreateMachine. The ProviderSpec can be thought of as a machine template from which the VM specification must be adopted. It can contain key-value pairs of these specs. An example of these key-value pairs is given below.\n Parameter Mandatory Type Description vmPool Yes string VM pool name, e.g. TEST-WORKER-POOL size Yes string VM size, e.g. xsmall, small, etc. Each size maps to a number of CPUs and memory size. 
rootFsSize No int Root (/) filesystem size in GB tags Yes map Tags to be put on the created VM Most of the ProviderSpec fields are not mandatory. If not specified, the provider passes an empty value in the respective Create VM parameter.\nThe tags can be used to map a VM to its corresponding machine object’s Name\nThe ProviderSpec is validated by methods that receive it as a request field for presence of all mandatory parameters and tags, and for validity of all parameters.\nSecrets The MachineClass resource also contains a secretRef field that contains a reference to a secret. The keys of this secret are passed in the Secrets request field to CMI methods.\nThe secret can contain sensitive data such as\n cloud-credentials secret data used to authenticate at the provider cloud-init scripts used to initialize a new VM. The cloud-init script is expected to contain scripts to initialize the Kubelet and make it join the cluster. Identifying Cluster Machines To implement certain methods, the provider should be able to identify all machines associated with a particular Kubernetes cluster. This can be achieved using one/more of the below mentioned ways:\n Names of VMs created by the provider are prefixed by the cluster ID specified in the ProviderSpec. VMs created by the provider are tagged with the special tags like kubernetes.io/cluster (for the cluster ID) and kubernetes.io/role (for the role), specified in the ProviderSpec. Mapping Resource Groups to individual cluster. Error Scheme All provider API calls defined in this spec MUST return a machine error status, which is very similar to standard machine status.\nMachine Provider Interface The provider MUST have a unique way to map a machine object to a VM which triggers the deletion for the corresponding VM backing the machine object. The provider SHOULD have a unique way to map the ProviderSpec of a machine-class to a unique Cluster. This avoids deletion of other machines, not backed by the MCM. 
CreateMachine A Provider is REQUIRED to implement this interface method. This interface method will be called by the MCM to provision a new VM on behalf of the requesting machine object.\n This call requests the provider to create a VM backing the machine-object.\n If a VM backing the Machine.Name already exists, and is compatible with the specified Machine object in the CreateMachineRequest, the Provider MUST reply 0 OK with the corresponding CreateMachineResponse.\n The provider can OPTIONALLY make use of the MachineClass supplied in the MachineClass in the CreateMachineRequest to communicate with the provider.\n The provider can OPTIONALLY make use of the secrets supplied in the Secret in the CreateMachineRequest to communicate with the provider.\n The provider can OPTIONALLY make use of the Status.LastKnownState in the Machine object to decode the state of the VM operation based on the last known state of the VM. This can be useful to restart/continue operations that are meant to be atomic.\n The provider MUST have a unique way to map a machine object to a VM. This could be implicitly provided by the provider by letting you set VM-names (or) could be explicitly specified by the provider using appropriate tags to map the same.\n This operation SHOULD be idempotent.\n The CreateMachineResponse returned by this method is expected to return\n ProviderID that uniquely identifies the VM at the provider. This is expected to match with the node.Spec.ProviderID on the node object. NodeName that is the expected name of the machine when it joins the cluster. It must match with the node name. LastKnownState is an OPTIONAL field that can store details of the last known state of the VM. It can be used by future operation calls to determine current infrastructure state. This state is saved on the machine object. 
// CreateMachine call is responsible for VM creation on the provider CreateMachine(context.Context, *CreateMachineRequest) (*CreateMachineResponse, error)// CreateMachineRequest is the create request for VM creation type CreateMachineRequest struct {\t// Machine object from whom VM is to be created \tMachine *v1alpha1.Machine\t// MachineClass backing the machine object \tMachineClass *v1alpha1.MachineClass\t// Secret backing the machineClass object \tSecret *corev1.Secret}// CreateMachineResponse is the create response for VM creation type CreateMachineResponse struct {\t// ProviderID is the unique identification of the VM at the cloud provider. \t// ProviderID typically matches with the node.Spec.ProviderID on the node object. \t// Eg: gce://project-name/region/vm-ID \tProviderID string\t// NodeName is the name of the node-object registered to kubernetes. \tNodeName string\t// LastKnownState represents the last state of the VM during a creation/deletion error \tLastKnownState string}CreateMachine Errors If the provider is unable to complete the CreateMachine call successfully, it MUST return a non-ok machine code in the machine status. If the conditions defined below are encountered, the provider MUST return the specified machine error code. The MCM MUST implement the specified error recovery behavior when it encounters the machine error code.\n machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call was successful in creating/adopting a VM that matches the supplied creation request. The CreateMachineResponse is returned with desired values N 1 CANCELED Cancelled Call was cancelled. Perform any pending clean-up tasks and return the call N 2 UNKNOWN Something went wrong Not enough information on what went wrong Retry operation after some time Y 3 INVALID_ARGUMENT Re-check supplied parameters Re-check the supplied Machine.Name and ProviderSpec. Make sure all parameters are in permitted range of values. 
Exact issue to be given in .message Update providerSpec to fix issues. N 4 DEADLINE_EXCEEDED Timeout The call processing exceeded supplied deadline Retry operation after some time Y 6 ALREADY_EXISTS Already exists but desired parameters don’t match Parameters of the existing VM don’t match the ProviderSpec Create machine with a different name N 7 PERMISSION_DENIED Insufficient permissions The requestor doesn’t have enough permissions to create a VM and its required dependencies Update requestor permissions to grant the same N 8 RESOURCE_EXHAUSTED Resource limits have been reached The requestor doesn’t have enough resource limits to process this creation request Enhance resource limits associated with the user/account to process this N 9 PRECONDITION_FAILED VM is in inconsistent state The VM is in a state that is invalid for this operation Manual intervention might be needed to fix the state of the VM N 10 ABORTED Operation is pending Indicates that there is already an operation pending for the specified machine Wait until previous pending operation is processed Y 11 OUT_OF_RANGE Resources were out of range The requested number of CPUs, memory size, or FS size in ProviderSpec falls outside of the corresponding valid range Update the request parameters to request valid resources N 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Retry with an alternate logic or implement this method at the provider. Most methods by default are in this state N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. Needs manual intervention to fix this N 14 UNAVAILABLE Not Available Unavailable indicates the service is currently unavailable. 
Retry operation after some time Y 16 UNAUTHENTICATED Missing provider credentials Request does not have valid authentication credentials for the operation Fix the provider credentials N The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nInitializeMachine A Provider can OPTIONALLY implement this driver call. Otherwise, it should return an UNIMPLEMENTED status in error.\nThis interface method will be called by the MCM to initialize a new VM just after creation. This can be used, for example, to apply network configuration.\n This call requests the provider to initialize a newly created VM backing the machine-object. The InitializeMachineResponse returned by this method is expected to return ProviderID that uniquely identifies the VM at the provider. This is expected to match with the node.Spec.ProviderID on the node object. NodeName that is the expected name of the machine when it joins the cluster. It must match with the node name. // InitializeMachine call is responsible for VM initialization on the provider. InitializeMachine(context.Context, *InitializeMachineRequest) (*InitializeMachineResponse, error)// InitializeMachineRequest encapsulates params for the VM Initialization operation (Driver.InitializeMachine). type InitializeMachineRequest struct {\t// Machine object representing VM that must be initialized \tMachine *v1alpha1.Machine\t// MachineClass backing the machine object \tMachineClass *v1alpha1.MachineClass\t// Secret backing the machineClass object \tSecret *corev1.Secret}// InitializeMachineResponse is the response for VM instance initialization (Driver.InitializeMachine). type InitializeMachineResponse struct {\t// ProviderID is the unique identification of the VM at the cloud provider. \t// ProviderID typically matches with the node.Spec.ProviderID on the node object. 
\t// Eg: gce://project-name/region/vm-ID \tProviderID string\t// NodeName is the name of the node-object registered to kubernetes. \tNodeName string}InitializeMachine Errors If the provider is unable to complete the InitializeMachine call successfully, it MUST return a non-ok machine code in the machine status.\nIf the conditions defined below are encountered, the provider MUST return the specified machine error code. The MCM MUST implement the specified error recovery behavior when it encounters the machine error code.\n machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call was successful in initializing a VM that matches the supplied initialization request. The InitializeMachineResponse is returned with desired values N 5 NOT_FOUND Timeout VM Instance for Machine isn’t found at provider Skip Initialization and Continue N 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Skip Initialization and continue N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. Needs investigation and possible intervention to fix this Y 17 UNINITIALIZED Failed Initialization VM Instance could not be initialized Initialization is reattempted in next reconcile cycle Y The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nDeleteMachine A Provider is REQUIRED to implement this driver call. 
This driver call will be called by the MCM to deprovision/delete/terminate a VM backed by the requesting machine object.\n If a VM corresponding to the specified machine-object’s name does not exist or the artifacts associated with the VM do not exist anymore (after deletion), the Provider MUST reply 0 OK.\n The provider SHALL only act on machines belonging to the cluster-id/cluster-name obtained from the ProviderSpec.\n The provider can OPTIONALLY make use of the secrets supplied in the Secrets map in the DeleteMachineRequest to communicate with the provider.\n The provider can OPTIONALLY make use of the Spec.ProviderID map in the Machine object.\n The provider can OPTIONALLY make use of the Status.LastKnownState in the Machine object to decode the state of the VM operation based on the last known state of the VM. This can be useful to restart/continue operations that are meant to be atomic.\n This operation SHOULD be idempotent.\n The provider must have a unique way to map a machine object to a VM which triggers the deletion for the corresponding VM backing the machine object.\n The DeleteMachineResponse returned by this method is expected to return\n LastKnownState is an OPTIONAL field that can store details of the last known state of the VM. It can be used by future operation calls to determine current infrastructure state. This state is saved on the machine object. 
// DeleteMachine call is responsible for VM deletion/termination on the provider DeleteMachine(context.Context, *DeleteMachineRequest) (*DeleteMachineResponse, error)// DeleteMachineRequest is the delete request for VM deletion type DeleteMachineRequest struct {\t// Machine object from whom VM is to be deleted \tMachine *v1alpha1.Machine\t// MachineClass backing the machine object \tMachineClass *v1alpha1.MachineClass\t// Secret backing the machineClass object \tSecret *corev1.Secret}// DeleteMachineResponse is the delete response for VM deletion type DeleteMachineResponse struct {\t// LastKnownState represents the last state of the VM during a creation/deletion error \tLastKnownState string}DeleteMachine Errors If the provider is unable to complete the DeleteMachine call successfully, it MUST return a non-ok machine code in the machine status. If the conditions defined below are encountered, the provider MUST return the specified machine error code.\n machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call was successful in deleting a VM that matches the supplied deletion request. N 1 CANCELED Cancelled Call was cancelled. Perform any pending clean-up tasks and return the call N 2 UNKNOWN Something went wrong Not enough information on what went wrong Retry operation after some time Y 3 INVALID_ARGUMENT Re-check supplied parameters Re-check the supplied Machine.Name and make sure that it is in the desired format and not a blank value. Exact issue to be given in .message Update Machine.Name to fix issues. 
N 4 DEADLINE_EXCEEDED Timeout The call processing exceeded supplied deadline Retry operation after some time Y 7 PERMISSION_DENIED Insufficient permissions The requestor doesn’t have enough permissions to delete a VM and its required dependencies Update requestor permissions to grant the same N 9 PRECONDITION_FAILED VM is in inconsistent state The VM is in a state that is invalid for this operation Manual intervention might be needed to fix the state of the VM N 10 ABORTED Operation is pending Indicates that there is already an operation pending for the specified machine Wait until previous pending operation is processed Y 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Retry with an alternate logic or implement this method at the provider. Most methods by default are in this state N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. Needs manual intervention to fix this N 14 UNAVAILABLE Not Available Unavailable indicates the service is currently unavailable. Retry operation after some time Y 16 UNAUTHENTICATED Missing provider credentials Request does not have valid authentication credentials for the operation Fix the provider credentials N The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nGetMachineStatus A Provider can OPTIONALLY implement this driver call. Otherwise, it should return an UNIMPLEMENTED status in error. This call will be invoked by the MC to get the status of a machine. 
This optional driver call helps in optimizing the working of the provider by avoiding unwanted calls to CreateMachine() and DeleteMachine().\n If a VM corresponding to the specified machine object’s Machine.Name exists on the provider, the GetMachineStatusResponse fields are to be filled similar to the CreateMachineResponse. The provider SHALL only act on machines belonging to the cluster-id/cluster-name obtained from the ProviderSpec. The provider can OPTIONALLY make use of the secrets supplied in the Secrets map in the GetMachineStatusRequest to communicate with the provider. The provider can OPTIONALLY make use of the VM unique ID (returned by the provider on machine creation) passed in the ProviderID map in the GetMachineStatusRequest. This operation MUST be idempotent. // GetMachineStatus call gets the status of the VM backing the machine object on the provider GetMachineStatus(context.Context, *GetMachineStatusRequest) (*GetMachineStatusResponse, error)// GetMachineStatusRequest is the get request for VM info type GetMachineStatusRequest struct {\t// Machine object from whom VM status is to be fetched \tMachine *v1alpha1.Machine\t// MachineClass backing the machine object \tMachineClass *v1alpha1.MachineClass\t// Secret backing the machineClass object \tSecret *corev1.Secret}// GetMachineStatusResponse is the get response for VM info type GetMachineStatusResponse struct {\t// ProviderID is the unique identification of the VM at the cloud provider. \t// ProviderID typically matches with the node.Spec.ProviderID on the node object. \t// Eg: gce://project-name/region/vm-ID \tProviderID string\t// NodeName is the name of the node-object registered to kubernetes. \tNodeName string}GetMachineStatus Errors If the provider is unable to complete the GetMachineStatus call successfully, it MUST return a non-ok machine code in the machine status. 
If the conditions defined below are encountered, the provider MUST return the specified machine error code.\n machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call was successful in getting machine details for given machine Machine.Name N 1 CANCELED Cancelled Call was cancelled. Perform any pending clean-up tasks and return the call N 2 UNKNOWN Something went wrong Not enough information on what went wrong Retry operation after some time Y 3 INVALID_ARGUMENT Re-check supplied parameters Re-check the supplied Machine.Name and make sure that it is in the desired format and not a blank value. Exact issue to be given in .message Update Machine.Name to fix issues. N 4 DEADLINE_EXCEEDED Timeout The call processing exceeded supplied deadline Retry operation after some time Y 5 NOT_FOUND Machine isn’t found at provider The machine could not be found at provider Not required N 7 PERMISSION_DENIED Insufficient permissions The requestor doesn’t have enough permissions to get details for the VM and its required dependencies Update requestor permissions to grant the same N 9 PRECONDITION_FAILED VM is in inconsistent state The VM is in a state that is invalid for this operation Manual intervention might be needed to fix the state of the VM N 11 OUT_OF_RANGE Multiple VMs found Multiple VMs found with matching machine object names Orphan VM handler to cleanup orphan VMs / Manual intervention may be required if orphan VM handler isn’t enabled. Y 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Retry with an alternate logic or implement this method at the provider. Most methods by default are in this state N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. 
Needs manual intervention to fix this N 14 UNAVAILABLE Not Available Unavailable indicates the service is currently unavailable. Retry operation after some time Y 16 UNAUTHENTICATED Missing provider credentials Request does not have valid authentication credentials for the operation Fix the provider credentials N 17 UNINITIALIZED Failed Initialization VM Instance could not be initialized Initialization is reattempted in next reconcile cycle N The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nListMachines A Provider can OPTIONALLY implement this driver call. Otherwise, it should return an UNIMPLEMENTED status in error. The Provider SHALL return the information about all the machines associated with the MachineClass. Make sure to use appropriate filters to achieve the same to avoid data transfer overheads. This optional driver call helps in cleaning up orphan VMs present in the cluster. If not implemented, any orphan VM that might have been created incorrectly by the MCM/Provider (due to bugs in code/infra) might require manual clean up.\n If the Provider succeeded in returning a list of Machine.Name with their corresponding ProviderID, then return 0 OK. The ListMachineResponse contains a map of MachineList whose Key is expected to contain the ProviderID \u0026 Value is expected to contain the Machine.Name corresponding to its Kubernetes machine CR object The provider can OPTIONALLY make use of the secrets supplied in the Secrets map in the ListMachinesRequest to communicate with the provider. 
// ListMachines lists all the machines that might have been created by the supplied machineClass ListMachines(context.Context, *ListMachinesRequest) (*ListMachinesResponse, error)// ListMachinesRequest is the request object to get a list of VMs belonging to a machineClass type ListMachinesRequest struct {\t// MachineClass object \tMachineClass *v1alpha1.MachineClass\t// Secret backing the machineClass object \tSecret *corev1.Secret}// ListMachinesResponse is the response object of the list of VMs belonging to a machineClass type ListMachinesResponse struct {\t// MachineList is the map of list of machines. Format for the map should be \u003cProviderID, MachineName\u003e. \tMachineList map[string]string}ListMachines Errors If the provider is unable to complete the ListMachines call successfully, it MUST return a non-ok machine code in the machine status. If the conditions defined below are encountered, the provider MUST return the specified machine error code. The MCM MUST implement the specified error recovery behavior when it encounters the machine error code.\n machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call for listing all VMs associated with ProviderSpec was successful. N 1 CANCELED Cancelled Call was cancelled. Perform any pending clean-up tasks and return the call N 2 UNKNOWN Something went wrong Not enough information on what went wrong Retry operation after some time Y 3 INVALID_ARGUMENT Re-check supplied parameters Re-check the supplied ProviderSpec and make sure that all required fields are present in their desired value format. Exact issue to be given in .message Update ProviderSpec to fix issues. 
N 4 DEADLINE_EXCEEDED Timeout The call processing exceeded supplied deadline Retry operation after some time Y 7 PERMISSION_DENIED Insufficient permissions The requestor doesn’t have enough permissions to list VMs and its required dependencies Update requestor permissions to grant the same N 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Retry with an alternate logic or implement this method at the provider. Most methods by default are in this state N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. Needs manual intervention to fix this N 14 UNAVAILABLE Not Available Unavailable indicates the service is currently unavailable. Retry operation after some time Y 16 UNAUTHENTICATED Missing provider credentials Request does not have valid authentication credentials for the operation Fix the provider credentials N The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nGetVolumeIDs A Provider can OPTIONALLY implement this driver call. Otherwise, it should return an UNIMPLEMENTED status in error. This driver call will be called by the MCM to get the VolumeIDs for the list of PersistentVolumes (PVs) supplied. This OPTIONAL (but recommended) driver call helps in serialized eviction of pods with PVs while draining machines. This implies applications backed by PVs would be evicted one by one, leading to shorter application downtimes.\n On successfully returning a list of Volume-IDs for all supplied PVSpecs, the Provider MUST reply 0 OK. The GetVolumeIDsResponse is expected to return a repeated list of strings consisting of the VolumeIDs for PVSpec that could be extracted. 
If for any PV the Provider wasn’t able to identify the Volume-ID, the provider MAY choose to ignore it and return the Volume-IDs for the rest of the PVs for which the Volume-ID was found. Getting the VolumeID from the PVSpec depends on the Cloud-provider. You can extract this information by parsing the PVSpec based on the ProviderType https://github.com/kubernetes/api/blob/release-1.15/core/v1/types.go#L297-L339 https://github.com/kubernetes/api/blob/release-1.15//core/v1/types.go#L175-L257 This operation MUST be idempotent. // GetVolumeIDsRequest is the request object to get a list of VolumeIDs for a PVSpec type GetVolumeIDsRequest struct {\t// PVSpecsList is a list of PV specs for whom volume-IDs are required \t// Plugin should parse this raw data into pre-defined list of PVSpecs \tPVSpecs []*corev1.PersistentVolumeSpec}// GetVolumeIDsResponse is the response object of the list of VolumeIDs for a PVSpec type GetVolumeIDsResponse struct {\t// VolumeIDs is a list of VolumeIDs. \tVolumeIDs []string}GetVolumeIDs Errors machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful The call getting list of VolumeIDs for the list of PersistentVolumes was successful. N 1 CANCELED Cancelled Call was cancelled. Perform any pending clean-up tasks and return the call N 2 UNKNOWN Something went wrong Not enough information on what went wrong Retry operation after some time Y 3 INVALID_ARGUMENT Re-check supplied parameters Re-check the supplied PVSpecList and make sure that it is in the desired format. Exact issue to be given in .message Update PVSpecList to fix issues. N 4 DEADLINE_EXCEEDED Timeout The call processing exceeded supplied deadline Retry operation after some time Y 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this service. Retry with an alternate logic or implement this method at the provider. 
Most methods by default are in this state N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. Needs manual intervention to fix this N 14 UNAVAILABLE Not Available Unavailable indicates the service is currently unavailable. Retry operation after some time Y The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nGenerateMachineClassForMigration A Provider SHOULD implement this driver call, else it MUST return an UNIMPLEMENTED status in error. This driver call will be called by the Machine Controller to try to perform a machineClass migration for an unknown machineClass Kind. This helps in migration of one kind of machineClass to another kind. For instance, a machineClass custom resource of kind AWSMachineClass to MachineClass.\n On successful generation of machine class the Provider MUST reply 0 OK (or) nil error. GenerateMachineClassForMigrationRequest expects the provider-specific machine class (e.g. AWSMachineClass) to be supplied as the ProviderSpecificMachineClass. The provider is responsible for unmarshalling the golang struct. It also passes a reference to an existing MachineClass object. The provider is expected to fill in this MachineClass object based on the conversions. An optional ClassSpec containing the type ClassSpec struct is also provided to decode the provider info. GenerateMachineClassForMigration is only responsible for filling up the passed MachineClass object. The task of creating the new CR of the new kind (MachineClass) with the same name as the previous one and also annotating the old machineClass CR with a migrated annotation and migrating existing references is done by the calling library implicitly. This operation MUST be idempotent. 
// GenerateMachineClassForMigrationRequest is the request for generating the generic machineClass // for the provider specific machine class type GenerateMachineClassForMigrationRequest struct {\t// ProviderSpecificMachineClass is provider specific machine class object. \t// E.g. AWSMachineClass \tProviderSpecificMachineClass interface{}\t// MachineClass is the machine class object generated that is to be filled up \tMachineClass *v1alpha1.MachineClass\t// ClassSpec contains the class spec object to determine the machineClass kind \tClassSpec *v1alpha1.ClassSpec}// GenerateMachineClassForMigrationResponse is the response for generating the generic machineClass // for the provider specific machine class type GenerateMachineClassForMigrationResponse struct{}MigrateMachineClass Errors machine Code Condition Description Recovery Behavior Auto Retry Required 0 OK Successful Migration of provider specific machine class was successful Machine reconciliation is retried once the new class has been created Y 12 UNIMPLEMENTED Not implemented Unimplemented indicates operation is not implemented or not supported/enabled in this provider. None N 13 INTERNAL Major error Means some invariants expected by underlying system have been broken. If you see one of these errors, something is very broken. Might need manual intervention to fix this Y The status message MUST contain a human readable description of error, if the status code is not OK. This string MAY be surfaced by MCM to end users.\nConfiguration and Operation Supervised Lifecycle Management For Providers packaged in software form: Provider Packages SHOULD use a well-documented container image format (e.g., Docker, OCI). The chosen package image format MAY expose configurable Provider properties as environment variables, unless otherwise indicated in the section below. Variables so exposed SHOULD be assigned default values in the image manifest. 
A Provider Supervisor MAY programmatically evaluate or otherwise scan a Provider Package’s image manifest in order to discover configurable environment variables. A Provider SHALL NOT assume that an operator or Provider Supervisor will scan an image manifest for environment variables. Environment Variables Variables defined by this specification SHALL be identifiable by their MC_ name prefix. Configuration properties not defined by the MC specification SHALL NOT use the same MC_ name prefix; this prefix is reserved for common configuration properties defined by the MC specification. The Provider Supervisor SHOULD supply all RECOMMENDED MC environment variables to a Provider. The Provider Supervisor SHALL supply all REQUIRED MC environment variables to a Provider. Logging Providers SHOULD generate log messages to ONLY standard output and/or standard error. In this case the Provider Supervisor SHALL assume responsibility for all log lifecycle management. Provider implementations that deviate from the above recommendation SHALL clearly and unambiguously document the following: Logging configuration flags and/or variables, including working sample configurations. Default log destination(s) (where do the logs go if no configuration is specified?) Log lifecycle management ownership and related guidance (size limits, rate limits, rolling, archiving, expunging, etc.) applicable to the logging mechanism embedded within the Provider. Providers SHOULD NOT write potentially sensitive data to logs (e.g. secrets). Available Services Provider Packages MAY support all or a subset of CMI services; service combinations MAY be configurable at runtime by the Provider Supervisor. This specification does not dictate the mechanism by which mode of operation MUST be discovered, and instead places that burden upon the VM Provider. Misconfigured provider software SHOULD fail-fast with an OS-appropriate error code. 
Linux Capabilities Providers SHOULD clearly document any additionally required capabilities and/or security context. Cgroup Isolation A Provider MAY be constrained by cgroups. Resource Requirements VM Providers SHOULD unambiguously document all of a Provider’s resource requirements. Deploying Recommended: The MCM and Provider are typically expected to run as two containers inside a common Pod. However, for security reasons, they could run in separate Pods, provided they have a secure way to exchange data between them. ","categories":"","description":"","excerpt":"Machine Error code handling Notational Conventions The keywords …","ref":"/docs/other-components/machine-controller-manager/machine_error_codes/","tags":"","title":"Machine Error Codes"},{"body":"Maintaining machine replicas using machines-sets Maintaining machine replicas using machines-sets Setting up your usage environment Important ⚠️ Creating machine-set Inspect status of machine-set Health monitoring Delete machine-set Setting up your usage environment Follow the steps described here Important ⚠️ Make sure that the kubernetes/machine_objects/machine-set.yaml points to the same class name as the kubernetes/machine_classes/aws-machine-class.yaml.\n Similarly, the secret name and namespace in kubernetes/machine_classes/aws-machine-class.yaml should be the same as those specified in kubernetes/secrets/aws-secret.yaml\n Creating machine-set Modify kubernetes/machine_objects/machine-set.yaml as per your requirement. You can modify the number of replicas to the desired number of machines. 
Then, create a machine-set: $ kubectl apply -f kubernetes/machine_objects/machine-set.yaml You should notice that the Machine Controller Manager has immediately picked up your manifest and started to create new machines based on the number of replicas you have provided in the manifest.\n Check Machine Controller Manager machine-sets in the cluster $ kubectl get machineset NAME DESIRED CURRENT READY AGE test-machine-set 3 3 0 1m You will see a new machine-set with your given name\n Check Machine Controller Manager machines in the cluster: $ kubectl get machine NAME STATUS AGE test-machine-set-b57zs Pending 5m test-machine-set-c4bg8 Pending 5m test-machine-set-kvskg Pending 5m Now you will see N (number of replicas specified in the manifest) new machines whose names are prefixed with the machine-set object name that you created.\n After a few minutes (~3 minutes for AWS), you should notice new nodes joining the cluster. You can verify this by running: $ kubectl get nodes NAME STATUS AGE VERSION ip-10-250-0-234.eu-west-1.compute.internal Ready 3m v1.8.0 ip-10-250-15-98.eu-west-1.compute.internal Ready 3m v1.8.0 ip-10-250-6-21.eu-west-1.compute.internal Ready 2m v1.8.0 This shows how new nodes have joined your cluster\nInspect status of machine-set To inspect the status of any created machine-set run the following command: $ kubectl get machineset test-machine-set -o yaml apiVersion: machine.sapcloud.io/v1alpha1 kind: MachineSet metadata: annotations: kubectl.kubernetes.io/last-applied-configuration: | {\"apiVersion\":\"machine.sapcloud.io/v1alpha1\",\"kind\":\"MachineSet\",\"metadata\":{\"annotations\":{},\"name\":\"test-machine-set\",\"namespace\":\"\",\"test-label\":\"test-label\"},\"spec\":{\"minReadySeconds\":200,\"replicas\":3,\"selector\":{\"matchLabels\":{\"test-label\":\"test-label\"}},\"template\":{\"metadata\":{\"labels\":{\"test-label\":\"test-label\"}},\"spec\":{\"class\":{\"kind\":\"AWSMachineClass\",\"name\":\"test-aws\"}}}}} clusterName: \"\" 
creationTimestamp: 2017-12-27T08:37:42Z finalizers: - machine.sapcloud.io/operator generation: 0 initializers: null name: test-machine-set namespace: \"\" resourceVersion: \"12630893\" selfLink: /apis/machine.sapcloud.io/v1alpha1/test-machine-set uid: 3469faaa-eae1-11e7-a6c0-828f843e4186 spec: machineClass: {} minReadySeconds: 200 replicas: 3 selector: matchLabels: test-label: test-label template: metadata: creationTimestamp: null labels: test-label: test-label spec: class: kind: AWSMachineClass name: test-aws status: availableReplicas: 3 fullyLabeledReplicas: 3 machineSetCondition: null lastOperation: lastUpdateTime: null observedGeneration: 0 readyReplicas: 3 replicas: 3 Health monitoring If you try to delete/terminate any of the machines backing the machine-set by either talking to the Machine Controller Manager or from the cloud provider, the Machine Controller Manager recreates a matching healthy machine to replace the deleted machine. Similarly, if any of your machines are unreachable or in an unhealthy state (kubelet not ready / disk pressure) for longer than the configured timeout (~ 5mins), the Machine Controller Manager recreates the nodes to replace the unhealthy nodes. Delete machine-set To delete the VMs using the kubernetes/machine_objects/machine-set.yaml: $ kubectl delete -f kubernetes/machine_objects/machine-set.yaml Now the Machine Controller Manager has immediately picked up your manifest and started to delete the existing VMs by talking to the cloud provider. Your nodes should be detached from the cluster in a few minutes (~1min for AWS).\n","categories":"","description":"","excerpt":"Maintaining machine replicas using machines-sets Maintaining machine …","ref":"/docs/other-components/machine-controller-manager/machine_set/","tags":"","title":"Machine Set"},{"body":"Manage certificates with Gardener for public domain Introduction Dealing with applications on Kubernetes which offer secure service endpoints (e.g. 
HTTPS) also requires you to enable secured communication via SSL/TLS. With the certificate extension enabled, Gardener can manage commonly trusted X.509 certificates for your application endpoint. From initially requesting a certificate to renewing it in time, Gardener handles the process using the free Let’s Encrypt API.\nThere are two scenarios in which you can use the certificate extension:\n You want to use a certificate for a subdomain of the shoot’s default DNS (see .spec.dns.domain of your shoot resource, e.g. short.ingress.shoot.project.default-domain.gardener.cloud). If this is your case, please see Manage certificates with Gardener for default domain You want to use a certificate for a custom domain. If this is your case, please keep reading this article. Prerequisites Before you start this guide there are a few requirements you need to fulfill:\n You have an existing shoot cluster Your custom domain is under a public top level domain (e.g. .com) Your custom zone is resolvable with a public resolver via the internet (e.g. 8.8.8.8) You have a custom DNS provider configured and working (see “DNS Providers”) As part of the Let’s Encrypt ACME challenge validation process, Gardener sets a DNS TXT entry and Let’s Encrypt checks if it can both resolve and authenticate it. Therefore, it’s important that your DNS entries are publicly resolvable. You can check this by querying e.g. Google’s public DNS server; if it returns an entry, your DNS is publicly visible:\n# returns the A record for cert-example.example.com using Google’s DNS server (8.8.8.8) dig cert-example.example.com @8.8.8.8 A DNS provider In order to issue certificates for a custom domain you need to specify a DNS provider which is permitted to create DNS records for subdomains of your requested domain in the certificate. For example, if you request a certificate for host.example.com your DNS provider must be capable of managing subdomains of host.example.com.\nDNS providers are normally specified in the shoot manifest. 
To learn more about how to configure one, please see the DNS provider documentation.\nIssue a certificate Every X.509 certificate is represented by a Kubernetes custom resource certificate.cert.gardener.cloud in your cluster. A Certificate resource may be used to initiate a new certificate request as well as to manage its lifecycle. Gardener’s certificate service regularly checks the expiration timestamp of Certificates, triggers a renewal process if necessary and replaces the existing X.509 certificate with a new one.\n Your application should be able to reload replaced certificates in a timely manner to avoid service disruptions.\n Certificates can be requested via the following resource types\n Ingress Service (type LoadBalancer) Gateways (both Istio gateways and from the Gateway API) Certificate (Gardener CRD) If either of the first 2 are used, a corresponding Certificate resource will be created automatically.\nUsing an Ingress Resource apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed # Optional but recommended, this is going to create the DNS entry at the same time dns.gardener.cloud/class: garden dns.gardener.cloud/ttl: \"600\" #cert.gardener.cloud/commonname: \"*.example.com\" # optional, if not specified the first name from spec.tls[].hosts is used as common name #cert.gardener.cloud/dnsnames: \"\" # optional, if not specified the names from spec.tls[].hosts are used #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to 
specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" spec: tls: - hosts: # Must not exceed 64 characters. - amazing.example.com # Certificate and private key reside in this secret. secretName: tls-secret rules: - host: amazing.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Replace the hosts and rules[].host values with your own domain and adjust the remaining Ingress attributes in accordance with your deployment (e.g. the above is for an Istio Ingress controller and forwards traffic to the service amazing-svc on port 8080).\nUsing a Service of type LoadBalancer apiVersion: v1 kind: Service metadata: annotations: cert.gardener.cloud/secretname: tls-secret dns.gardener.cloud/dnsnames: example.example.com dns.gardener.cloud/class: garden # Optional dns.gardener.cloud/ttl: \"600\" cert.gardener.cloud/commonname: \"*.example.example.com\" cert.gardener.cloud/dnsnames: \"\" #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" name: test-service namespace: default spec: ports: - name: http port: 80 protocol: 
TCP targetPort: 8080 type: LoadBalancer Using a Gateway resource Please see Istio Gateways or Gateway API for details.\nUsing the custom Certificate resource apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: commonName: amazing.example.com secretRef: name: tls-secret namespace: default # Optional if using the default issuer issuerRef: name: garden # If delegated domain for DNS01 challenge should be used. This has only an effect if a CNAME record is set for # '_acme-challenge.amazing.example.com'. # For example: If a CNAME record exists '_acme-challenge.amazing.example.com' =\u003e '_acme-challenge.writable.domain.com', # the DNS challenge will be written to '_acme-challenge.writable.domain.com'. #followCNAME: true # optionally set labels for the secret #secretLabels: # key1: value1 # key2: value2 # Optionally specify the preferred certificate chain: if the CA offers multiple certificate chains, prefer the chain with an issuer matching this Subject Common Name. If no match, the default offered chain will be used. #preferredChain: \"ISRG Root X1\" # Optionally specify algorithm and key size for private key. Allowed algorithms: \"RSA\" (allowed sizes: 2048, 3072, 4096) and \"ECDSA\" (allowed sizes: 256, 384) # If not specified, RSA with 2048 is used. #privateKey: # algorithm: ECDSA # size: 384 Supported attributes Here is a list of all supported annotations regarding the certificate extension:\n Path Annotation Value Required Description N/A cert.gardener.cloud/purpose: managed Yes when using annotations Flag for Gardener that this specific Ingress or Service requires a certificate spec.commonName cert.gardener.cloud/commonname: E.g. “*.demo.example.com” or “special.example.com” Certificate and Ingress : No Service: Yes, if DNS names unset Specifies for which domain the certificate request will be created. If not specified, the names from spec.tls[].hosts are used. 
This entry must comply with the 64 character limit. spec.dnsNames cert.gardener.cloud/dnsnames: E.g. “special.example.com” Certificate and Ingress : No Service: Yes, if common name unset Additional domains the certificate should be valid for (Subject Alternative Name). If not specified, the names from spec.tls[].hosts are used. Entries in this list can be longer than 64 characters. spec.secretRef.name cert.gardener.cloud/secretname: any-name Yes for certificate and Service Specifies the secret which contains the certificate/key pair. If the secret is not available yet, it’ll be created automatically as soon as the certificate has been issued. spec.issuerRef.name cert.gardener.cloud/issuer: E.g. gardener No Specifies the issuer you want to use. Only necessary if you request certificates for custom domains. N/A cert.gardener.cloud/revoked: true otherwise always false No Use only to revoke a certificate, see reference for more details spec.followCNAME cert.gardener.cloud/follow-cname E.g. true No Specifies that the usage of a delegated domain for DNS challenges is allowed. For details, see Follow CNAME. spec.preferredChain cert.gardener.cloud/preferred-chain E.g. ISRG Root X1 No Specifies the Common Name of the issuer for selecting the certificate chain. For details, see Preferred Chain. spec.secretLabels cert.gardener.cloud/secret-labels for annotation use e.g. key1=value1,key2=value2 No Specifies labels for the certificate secret. spec.privateKey.algorithm cert.gardener.cloud/private-key-algorithm RSA, ECDSA No Specifies algorithm for private key generation. The default value depends on the configuration of the extension (if the extension is not configured otherwise, the default is RSA). You may request a new certificate without privateKey settings to find out the concrete defaults in your Gardener. spec.privateKey.size cert.gardener.cloud/private-key-size \"256\", \"384\", \"2048\", \"3072\", \"4096\" No Specifies size for private key generation. Allowed values for RSA are 2048, 3072, and 4096. 
For ECDSA allowed values are 256 and 384. The default values depend on the configuration of the extension (if the extension is not configured otherwise, the defaults are 3072 for RSA and 384 for ECDSA). Request a wildcard certificate In order to avoid the creation of multiple certificates, one for every single endpoint, you may want to create a wildcard certificate for your shoot’s default cluster.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/commonname: \"*.example.com\" spec: tls: - hosts: - amazing.example.com secretName: tls-secret rules: - host: amazing.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Please note that this can also be achieved by directly adding an annotation to a Service of type LoadBalancer. You could also create a Certificate object with a wildcard domain.\nUsing a custom Issuer Most Gardener deployments with the certificate extension enabled have a preconfigured garden issuer. It is also usually configured to use Let’s Encrypt as the certificate provider.\nIf you need a custom issuer for a specific cluster, please see Using a custom Issuer\nQuotas For security reasons there may be a default quota on the certificate requests per day set globally in the controller registration of the shoot-cert-service.\nThe default quota only applies if there is no explicit quota defined for the issuer itself with the field requestsPerDayQuota, e.g.:\nkind: Shoot ... 
spec: extensions: - type: shoot-cert-service providerConfig: apiVersion: service.cert.extensions.gardener.cloud/v1alpha1 kind: CertConfig issuers: - email: your-email@example.com name: custom-issuer # issuer name must be specified in every custom issuer request, must not be \"garden\" server: 'https://acme-v02.api.letsencrypt.org/directory' requestsPerDayQuota: 10 DNS Propagation As stated before, cert-manager uses the ACME challenge protocol to authenticate that you are the DNS owner for the domain’s certificate you are requesting. This works by creating a DNS TXT record in your DNS provider under _acme-challenge.example.example.com containing a token to compare with. The TXT record is only applied during the domain validation. Typically, the record is propagated within a few minutes. But if the record is not visible to the ACME server for any reason, the certificate request is retried after several minutes. This means you may have to wait up to one hour after the propagation problem has been resolved before the certificate request is retried. 
Take a look at the events with kubectl describe ingress example for troubleshooting.\nCharacter Restrictions Because the common name is restricted to 64 characters, you may have to leave the common name unset for longer domain names.\nFor example, the following request is invalid:\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-invalid namespace: default spec: commonName: morethan64characters.ingress.shoot.project.default-domain.gardener.cloud But it is valid to request a certificate for this domain if you have left the common name unset:\napiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: dnsNames: - morethan64characters.ingress.shoot.project.default-domain.gardener.cloud References Gardener cert-management Managing DNS with Gardener ","categories":"","description":"Use the Gardener cert-management to get fully managed, publicly trusted TLS certificates","excerpt":"Use the Gardener cert-management to get fully managed, publicly …","ref":"/docs/guides/networking/certificate-extension/","tags":["task"],"title":"Manage Certificates with Gardener"},{"body":"Manage certificates with Gardener for default domain Introduction Dealing with applications on Kubernetes which offer secure service endpoints (e.g. HTTPS) also requires you to enable secured communication via SSL/TLS. With the certificate extension enabled, Gardener can manage commonly trusted X.509 certificates for your application endpoint. From initially requesting a certificate to renewing it in time, Gardener handles the process using the free Let’s Encrypt API.\nThere are two scenarios in which you can use the certificate extension:\n You want to use a certificate for a subdomain of the shoot’s default DNS (see .spec.dns.domain of your shoot resource, e.g. short.ingress.shoot.project.default-domain.gardener.cloud). If this is your case, please keep reading this article. You want to use a certificate for a custom domain. 
If this is your case, please see Manage certificates with Gardener for public domain Prerequisites Before you start this guide there are a few requirements you need to fulfill:\n You have an existing shoot cluster Since you are using the default DNS name, all DNS configuration should already be done and ready.\nIssue a certificate Every X.509 certificate is represented by a Kubernetes custom resource certificate.cert.gardener.cloud in your cluster. A Certificate resource may be used to initiate a new certificate request as well as to manage its lifecycle. Gardener’s certificate service regularly checks the expiration timestamp of Certificates, triggers a renewal process if necessary and replaces the existing X.509 certificate with a new one.\n Your application should be able to reload replaced certificates in a timely manner to avoid service disruptions.\n Certificates can be requested via 3 resource types\n Ingress Service (type LoadBalancer) Certificate (Gardener CRD) If either of the first 2 are used, a corresponding Certificate resource will automatically be created.\nUsing an ingress Resource apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and 
for ECDSA \"256\" and \"384\" spec: tls: - hosts: # Must not exceed 64 characters. - short.ingress.shoot.project.default-domain.gardener.cloud # Certificate and private key reside in this secret. secretName: tls-secret rules: - host: short.ingress.shoot.project.default-domain.gardener.cloud http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Using a service type LoadBalancer apiVersion: v1 kind: Service metadata: annotations: cert.gardener.cloud/purpose: managed # Certificate and private key reside in this secret. cert.gardener.cloud/secretname: tls-secret # You may add more domains separated by commas (e.g. \"service.shoot.project.default-domain.gardener.cloud, amazing.shoot.project.default-domain.gardener.cloud\") dns.gardener.cloud/dnsnames: \"service.shoot.project.default-domain.gardener.cloud\" dns.gardener.cloud/ttl: \"600\" #cert.gardener.cloud/issuer: custom-issuer # optional to specify custom issuer (use namespace/name for shoot issuers) #cert.gardener.cloud/follow-cname: \"true\" # optional, same as spec.followCNAME in certificates #cert.gardener.cloud/secret-labels: \"key1=value1,key2=value2\" # optional labels for the certificate secret #cert.gardener.cloud/preferred-chain: \"chain name\" # optional to specify preferred-chain (value is the Subject Common Name of the root issuer) #cert.gardener.cloud/private-key-algorithm: ECDSA # optional to specify algorithm for private key, allowed values are 'RSA' or 'ECDSA' #cert.gardener.cloud/private-key-size: \"384\" # optional to specify size of private key, allowed values for RSA are \"2048\", \"3072\", \"4096\" and for ECDSA \"256\" and \"384\" name: test-service namespace: default spec: ports: - name: http port: 80 protocol: TCP targetPort: 8080 type: LoadBalancer Using the custom Certificate resource apiVersion: cert.gardener.cloud/v1alpha1 kind: Certificate metadata: name: cert-example namespace: default spec: commonName: 
short.ingress.shoot.project.default-domain.gardener.cloud secretRef: name: tls-secret namespace: default # Optional if using the default issuer issuerRef: name: garden If you’re interested in the current progress of your request, you’re advised to consult the description, more specifically the status attribute in case the issuance failed.\nRequest a wildcard certificate In order to avoid the creation of multiple certificates, one for every single endpoint, you may want to create a wildcard certificate for your shoot’s default cluster.\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: cert.gardener.cloud/purpose: managed cert.gardener.cloud/commonname: \"*.ingress.shoot.project.default-domain.gardener.cloud\" spec: tls: - hosts: - amazing.ingress.shoot.project.default-domain.gardener.cloud secretName: tls-secret rules: - host: amazing.ingress.shoot.project.default-domain.gardener.cloud http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 Please note that this can also be achieved by directly adding an annotation to a Service of type LoadBalancer. You could also create a Certificate object with a wildcard domain.\nMore information For more information and more examples about using the certificate extension, please see Manage certificates with Gardener for public domain\n","categories":"","description":"Use the Gardener cert-management to get fully managed, publicly trusted TLS certificates","excerpt":"Use the Gardener cert-management to get fully managed, publicly …","ref":"/docs/guides/networking/certificate-extension-default-domain/","tags":["task"],"title":"Manage Certificates with Gardener for Default Domain"},{"body":"ManagedSeeds: Register Shoot as Seed An existing shoot can be registered as a seed by creating a ManagedSeed resource. This resource contains:\n The name of the shoot that should be registered as seed. 
A gardenlet section that contains: gardenlet deployment parameters, such as the number of replicas, the image, etc. The GardenletConfiguration resource that contains controllers configuration, feature gates, and a seedConfig section that contains the Seed spec and parts of its metadata. Additional configuration parameters, such as the garden connection bootstrap mechanism (see TLS Bootstrapping), and whether to merge the provided configuration with the configuration of the parent gardenlet. gardenlet is deployed to the shoot, and it registers a new seed upon startup based on the seedConfig section.\n Note: Earlier, Gardener allowed specifying a seedTemplate directly in the ManagedSeed resource. This feature is discontinued; any seed configuration must be provided via the GardenletConfiguration.\n Note the following important aspects:\n Unlike the Seed resource, the ManagedSeed resource is namespaced. Currently, managed seeds are restricted to the garden namespace. The newly created Seed resource always has the same name as the ManagedSeed resource. Attempting to specify a different name in the seedConfig will fail. The ManagedSeed resource must always refer to an existing shoot. Attempting to create a ManagedSeed referring to a non-existing shoot will fail. A shoot that is being referred to by a ManagedSeed cannot be deleted. Attempting to delete such a shoot will fail. You can omit practically everything from the gardenlet section, including all or most of the Seed spec fields. Proper defaults will be supplied in all cases, based either on the most common use cases or the information already available in the Shoot resource. Also, if your seed is configured to host HA shoot control planes, then gardenlet will be deployed with multiple replicas across nodes or availability zones by default. 
Some Seed spec fields, for example the provider type and region, networking CIDRs for pods, services, and nodes, etc., must be the same as the corresponding Shoot spec fields of the shoot that is being registered as seed. Attempting to use different values (except empty ones, so that they are supplied by the defaulting mechanism) will fail. Deploying gardenlet to the Shoot To register a shoot as a seed and deploy gardenlet to the shoot using a default configuration, create a ManagedSeed resource similar to the following:\napiVersion: seedmanagement.gardener.cloud/v1alpha1 kind: ManagedSeed metadata: name: my-managed-seed namespace: garden spec: shoot: name: crazy-botany gardenlet: {} For an example that uses non-default configuration, see 55-managed-seed-gardenlet.yaml\nRenewing the Gardenlet Kubeconfig Secret In order to make the ManagedSeed controller renew the gardenlet’s kubeconfig secret, annotate the ManagedSeed with gardener.cloud/operation=renew-kubeconfig. This will trigger a reconciliation during which the kubeconfig secret is deleted and the bootstrapping is performed again (during which gardenlet obtains a new client certificate).\nIt is also possible to trigger the renewal on the secret directly, see Rotate Certificates Using Bootstrap kubeconfig.\nSpecifying apiServer replicas and autoscaler Options There are a few configuration options that are not supported in a Shoot resource, but for backward compatibility reasons it is possible to specify them for a Shoot that is referred to by a ManagedSeed. These options are:\n Option Description apiServer.autoscaler.minReplicas Controls the minimum number of kube-apiserver replicas for the shoot registered as seed cluster. apiServer.autoscaler.maxReplicas Controls the maximum number of kube-apiserver replicas for the shoot registered as seed cluster. apiServer.replicas Controls how many kube-apiserver replicas the shoot registered as seed cluster gets by default. 
It is possible to specify these options via the shoot.gardener.cloud/managed-seed-api-server annotation on the Shoot resource. Example configuration:\n annotations: shoot.gardener.cloud/managed-seed-api-server: \"apiServer.replicas=3,apiServer.autoscaler.minReplicas=3,apiServer.autoscaler.maxReplicas=6\" Enforced Configuration Options The following configuration options are enforced by Gardener API server for the ManagedSeed resources:\n The vertical pod autoscaler should be enabled from the Shoot specification.\nThe vertical pod autoscaler is a prerequisite for a Seed cluster. It is possible to enable the VPA feature for a Seed (using the Seed spec) and for a Shoot (using the Shoot spec). In context of ManagedSeeds, enabling the VPA in the Seed spec (instead of the Shoot spec) offers less flexibility and increases the network transfer and cost. Due to these reasons, the Gardener API server enforces the vertical pod autoscaler to be enabled from the Shoot specification.\n The nginx-ingress addon should not be enabled for a Shoot referred by a ManagedSeed.\nAn Ingress controller is also a prerequisite for a Seed cluster. For a Seed cluster, it is possible to enable Gardener managed Ingress controller or to deploy self-managed Ingress controller. There is also the nginx-ingress addon that can be enabled for a Shoot (using the Shoot spec). However, the Shoot nginx-ingress addon is in deprecated mode and it is not recommended for production clusters. Due to these reasons, the Gardener API server does not allow the Shoot nginx-ingress addon to be enabled for ManagedSeeds.\n ","categories":"","description":"","excerpt":"ManagedSeeds: Register Shoot as Seed An existing shoot can be …","ref":"/docs/gardener/managed_seed/","tags":"","title":"Managed Seed"},{"body":"Deploy Resources to the Shoot Cluster We have introduced a component called gardener-resource-manager that is deployed as part of every shoot control plane in the seed. 
One of its tasks is to manage CRDs, so called ManagedResources. Managed resources contain Kubernetes resources that shall be created, reconciled, updated, and deleted by the gardener-resource-manager.\nExtension controllers may create these ManagedResources in the shoot namespace if they need to create any resource in the shoot cluster itself, for example RBAC roles (or anything else).\nWhere can I find more examples and more information how to use ManagedResources? Please take a look at the respective documentation.\n","categories":"","description":"","excerpt":"Deploy Resources to the Shoot Cluster We have introduced a component …","ref":"/docs/gardener/extensions/managedresources/","tags":"","title":"Managedresources"},{"body":"Request DNS Names in Shoot Clusters Introduction Within a shoot cluster, it is possible to request DNS records via the following resource types:\n Ingress Service DNSEntry It is necessary that the Gardener installation your shoot cluster runs in is equipped with a shoot-dns-service extension. This extension uses the seed’s dns management infrastructure to maintain DNS names for shoot clusters. Please ask your Gardener operator if the extension is available in your environment.\nShoot Feature Gate In some Gardener setups the shoot-dns-service extension is not enabled globally and thus must be configured per shoot cluster. Please adapt the shoot specification by the configuration shown below to activate the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-dns-service ... Before you start You should :\n Have created a shoot cluster Have created and correctly configured a DNS Provider (Please consult this page for more information) Have a basic understanding of DNS (see link under References) There are 2 types of DNS that you can use within Kubernetes :\n internal (usually managed by coreDNS) external (managed by a public DNS provider). 
This page, and the extension, work exclusively for external DNS handling.\nGardener allows 2 ways of managing your external DNS:\n Manually, which means you are in charge of creating / maintaining your Kubernetes related DNS entries Via the Gardener DNS extension Gardener DNS extension The managed external DNS records feature of the Gardener clusters makes all this easier. You do not need DNS service provider specific knowledge, and in fact you do not need to leave your cluster at all to achieve that. You simply annotate the Ingress / Service that needs its DNS records managed and they will be automatically created / managed by Gardener.\nManaged external DNS records are supported with the following DNS provider types:\n aws-route53 azure-dns azure-private-dns google-clouddns openstack-designate alicloud-dns cloudflare-dns Request DNS records for Ingress resources To request a DNS name for an Ingress, Service or Gateway (Istio or Gateway API) object in the shoot cluster, the object must be annotated with the DNS class garden and an annotation denoting the desired DNS names.\nExample for an annotated Ingress resource:\napiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: amazing-ingress annotations: # Let Gardener manage external DNS records for this Ingress. dns.gardener.cloud/dnsnames: special.example.com # Use \"*\" to collect domain names from .spec.rules[].host dns.gardener.cloud/ttl: \"600\" dns.gardener.cloud/class: garden # If you are delegating the certificate management to Gardener, uncomment the following line #cert.gardener.cloud/purpose: managed spec: rules: - host: special.example.com http: paths: - pathType: Prefix path: \"/\" backend: service: name: amazing-svc port: number: 8080 # Uncomment the following part if you are delegating the certificate management to Gardener #tls: # - hosts: # - special.example.com # secretName: my-cert-secret-name For an Ingress, the DNS names are already declared in the specification. 
Nevertheless the dnsnames annotation must be present. Here a subset of the DNS names of the ingress can be specified. If DNS names for all names are desired, the value all can be used.\nKeep in mind that ingress resources are ignored unless an ingress controller is set up. Gardener does not provide an ingress controller by default. For more details, see Ingress Controllers and Service in the Kubernetes documentation.\nRequest DNS records for service type LoadBalancer Example for an annotated Service (it must have the type LoadBalancer) resource:\napiVersion: v1 kind: Service metadata: name: amazing-svc annotations: # Let Gardener manage external DNS records for this Service. dns.gardener.cloud/dnsnames: special.example.com dns.gardener.cloud/ttl: \"600\" dns.gardener.cloud/class: garden spec: selector: app: amazing-app ports: - protocol: TCP port: 80 targetPort: 8080 type: LoadBalancer Request DNS records for Gateway resources Please see Istio Gateways or Gateway API for details.\nCreating a DNSEntry resource explicitly It is also possible to create a DNS entry via the Kubernetes resource called DNSEntry:\napiVersion: dns.gardener.cloud/v1alpha1 kind: DNSEntry metadata: annotations: # Let Gardener manage this DNS entry. dns.gardener.cloud/class: garden name: special-dnsentry namespace: default spec: dnsName: special.example.com ttl: 600 targets: - 1.2.3.4 If one of the accepted DNS names is a direct subname of the shoot’s ingress domain, this is already handled by the standard wildcard entry for the ingress domain. Therefore this name should be excluded from the dnsnames list in the annotation. 
If only this DNS name is configured in the ingress, no explicit DNS entry is required, and the DNS annotations should be omitted entirely.\nYou can check the status of the DNSEntry with\n$ kubectl get dnsentry NAME DNS TYPE PROVIDER STATUS AGE mydnsentry special.example.com aws-route53 default/aws Ready 24s As soon as the status of the entry is Ready, the provider has accepted the new DNS record. Depending on the provider and your DNS settings and cache, it may take up to 24 hours for the new entry to be propagated across the internet.\nMore examples can be found here\nRequest DNS records for Service/Ingress resources using a DNSAnnotation resource In rare cases it may not be possible to add annotations to a Service or Ingress resource object.\nE.g.: the helm chart used to deploy the resource may not be adaptable for some reason or some automation is used, which always restores the original content of the resource object by dropping any additional annotations.\nIn these cases, it is recommended to use an additional DNSAnnotation resource in order to have more flexibility than DNSEntry resources. The DNSAnnotation resource makes the DNS shoot service behave as if annotations had been added to the referenced resource.\nFor the Ingress example shown above, you can create a DNSAnnotation resource alternatively to provide the annotations.\napiVersion: dns.gardener.cloud/v1alpha1 kind: DNSAnnotation metadata: annotations: dns.gardener.cloud/class: garden name: test-ingress-annotation namespace: default spec: resourceRef: kind: Ingress apiVersion: networking.k8s.io/v1 name: test-ingress namespace: default annotations: dns.gardener.cloud/dnsnames: '*' dns.gardener.cloud/class: garden Note that the DNSAnnotation resource itself needs the dns.gardener.cloud/class=garden annotation. 
This also only works for annotations known to the DNS shoot service (see Accepted External DNS Records Annotations).\nFor more details, see also DNSAnnotation objects\nAccepted External DNS Records Annotations Here are all of the accepted annotations related to the DNS extension:\n Annotation Description dns.gardener.cloud/dnsnames Mandatory for service and ingress resources, accepts a comma-separated list of DNS names if multiple names are required. For ingress you can use the special value '*'. In this case, the DNS names are collected from .spec.rules[].host. dns.gardener.cloud/class Mandatory, in the context of the shoot-dns-service it must always be set to garden. dns.gardener.cloud/ttl Recommended, overrides the default Time-To-Live of the DNS record. dns.gardener.cloud/cname-lookup-interval Only relevant if multiple domain name targets are specified. It specifies the lookup interval for CNAMEs to map them to IP addresses (in seconds) dns.gardener.cloud/realms Internal, for restricting provider access for shoot DNS entries. Typically not set by users of the shoot-dns-service. dns.gardener.cloud/ip-stack Only relevant for provider type aws-route53 if target is an AWS load balancer domain name. Can be set for service, ingress and DNSEntry resources. It specifies which DNS records with alias targets are created instead of the usual CNAME records. If the annotation is not set (or has the value ipv4), only an A record is created. With value dual-stack, both A and AAAA records are created. With value ipv6 only an AAAA record is created. service.beta.kubernetes.io/aws-load-balancer-ip-address-type=dualstack For services, behaves similarly to dns.gardener.cloud/ip-stack=dual-stack. loadbalancer.openstack.org/load-balancer-address Internal, for services only: support for PROXY protocol on OpenStack (which needs a hostname as ingress). Typically not set by users of the shoot-dns-service. 
If one of the accepted DNS names is a direct subdomain of the shoot’s ingress domain, this is already handled by the standard wildcard entry for the ingress domain. Therefore, this name should be excluded from the dnsnames list in the annotation. If only this DNS name is configured in the ingress, no explicit DNS entry is required, and the DNS annotations should be omitted entirely.\nTroubleshooting General DNS tools To check the DNS resolution, use the nslookup or dig command.\n$ nslookup special.your-domain.com or with dig\n$ dig +short special.example.com Depending on your network settings, you may get a successful response faster using a public DNS server (e.g. 8.8.8.8, 8.8.4.4, or 1.1.1.1) dig @8.8.8.8 +short special.example.com DNS record events The DNS controller publishes Kubernetes events for the resource which requested the DNS record (Ingress, Service, DNSEntry). These events reveal more information about the DNS requests being processed and are especially useful to check any kind of misconfiguration, e.g. 
requests for a domain you don’t own.\nEvents for a successfully created DNS record:\n$ kubectl describe service my-service Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal dns-annotation 19s dns-controller-manager special.example.com: dns entry is pending Normal dns-annotation 19s (x3 over 19s) dns-controller-manager special.example.com: dns entry pending: waiting for dns reconciliation Normal dns-annotation 9s (x3 over 10s) dns-controller-manager special.example.com: dns entry active Please note, events vanish after their retention period (usually 1h).\nDNSEntry status DNSEntry resources offer a .status sub-resource which can be used to check the current state of the object.\nStatus of an erroneous DNSEntry.\n status: message: No responsible provider found observedGeneration: 3 provider: remote state: Error References Understanding DNS Kubernetes Internal DNS DNSEntry API (Golang) Managing Certificates with Gardener ","categories":"","description":"Setup Gardener-managed DNS records in cluster.","excerpt":"Setup Gardener-managed DNS records in cluster.","ref":"/docs/guides/networking/dns-extension/","tags":"","title":"Managing DNS with Gardener"},{"body":"Hugo uses Markdown for its simple content format. However, there are a lot of things that Markdown doesn’t support well. You could use pure HTML to expand possibilities. A typical example is reducing the original dimensions of an image.\nHowever, use HTML judiciously and to the minimum extent possible. Using HTML in Markdown makes it harder to maintain and publish coherent documentation bundles. This is a job typically performed by a publishing platform’s mechanisms, such as Hugo’s layouts. Considering that the source documentation might be published by multiple platforms you should be considerate in using markup that may bind it to a particular one.\nFor the same reason, avoid inline scripts and styles in your content. 
If you absolutely need to use them and they are not working as expected, please create a documentation issue and describe your case.\nTip Markdown is great for its simplicity but may be also constraining for the same reason. Before looking at HTML to make up for that, first check the shortcodes for alternatives. ","categories":"","description":"","excerpt":"Hugo uses Markdown for its simple content format. However, there are a …","ref":"/docs/contribute/documentation/markup/","tags":"","title":"Markdown"},{"body":"Monitoring etcd-druid uses Prometheus for metrics reporting. The metrics can be used for real-time monitoring and debugging of compaction jobs.\nThe simplest way to see the available metrics is to cURL the metrics endpoint /metrics. The format is described here.\nFollow the Prometheus getting started doc to spin up a Prometheus server to collect etcd metrics.\nThe naming of metrics follows the suggested Prometheus best practices. All compaction related metrics are put under namespace etcddruid and the respective subsystems.\nSnapshot Compaction These metrics provide information about the compaction jobs that run after some interval in shoot control planes. Studying the metrics, we can deduce how many compaction job ran successfully, how many failed, how many delta events compacted etc.\n Name Description Type etcddruid_compaction_jobs_total Total number of compaction jobs initiated by compaction controller. Counter etcddruid_compaction_jobs_current Number of currently running compaction job. Gauge etcddruid_compaction_job_duration_seconds Total time taken in seconds to finish a running compaction job. Histogram etcddruid_compaction_num_delta_events Total number of etcd events to be compacted by a compaction job. Gauge There are two labels for etcddruid_compaction_jobs_total metrics. 
The label succeeded shows how many of the compaction jobs succeeded and the label failed shows how many of the compaction jobs failed.\nThere are two labels for etcddruid_compaction_job_duration_seconds metrics. The label succeeded shows how much time a successful job took to complete and the label failed shows how much time a failed compaction job took.\netcddruid_compaction_jobs_current metric comes with label etcd_namespace that indicates the namespace of the Etcd running in the control plane of a shoot cluster.\nEtcd These metrics are exposed by the etcd process that runs in each etcd pod.\nThe following list of metrics is applicable to clustering of a multi-node etcd cluster. The full list of metrics exposed by etcd is available here.\n No. Metrics Name Description Comments 1 etcd_disk_wal_fsync_duration_seconds latency distributions of fsync called by WAL. High disk operation latencies indicate disk issues. 2 etcd_disk_backend_commit_duration_seconds latency distributions of commit called by backend. High disk operation latencies indicate disk issues. 3 etcd_server_has_leader whether or not a leader exists. 1: leader exists, 0: leader does not exist. To capture quorum loss or to check the availability of etcd cluster. 4 etcd_server_is_leader whether or not this member is a leader. 1 if it is, 0 otherwise. 5 etcd_server_leader_changes_seen_total number of leader changes seen. Helpful in fine tuning the zonal cluster like etcd-heartbeat time etc, it can also indicate the etcd load and network issues. 6 etcd_server_is_learner whether or not this member is a learner. 1 if it is, 0 otherwise. 7 etcd_server_learner_promote_successes total number of successful learner promotions while this member is leader. Might be helpful in checking the success of API calls called by backup-restore. 8 etcd_network_client_grpc_received_bytes_total total number of bytes received from grpc clients. Client Traffic In. 
9 etcd_network_client_grpc_sent_bytes_total total number of bytes sent to grpc clients. Client Traffic Out. 10 etcd_network_peer_sent_bytes_total total number of bytes sent to peers. Useful for network usage. 11 etcd_network_peer_received_bytes_total total number of bytes received from peers. Useful for network usage. 12 etcd_network_active_peers current number of active peer connections. Might be useful in detecting issues like network partition. 13 etcd_server_proposals_committed_total total number of consensus proposals committed. A consistently large lag between a single member and its leader indicates that member is slow or unhealthy. 14 etcd_server_proposals_pending current number of pending proposals to commit. Pending proposals suggest there is a high client load or the member cannot commit proposals. 15 etcd_server_proposals_failed_total total number of failed proposals seen. Might indicate downtime caused by a loss of quorum. 16 etcd_server_proposals_applied_total total number of consensus proposals applied. Difference between etcd_server_proposals_committed_total and etcd_server_proposals_applied_total should usually be small. 17 etcd_mvcc_db_total_size_in_bytes total size of the underlying database physically allocated in bytes. 18 etcd_server_heartbeat_send_failures_total total number of leader heartbeat send failures. Might be helpful in fine-tuning the cluster or detecting slow disk or any network issues. 19 etcd_network_peer_round_trip_time_seconds round-trip-time histogram between peers. Might be helpful in fine-tuning network usage, especially for zonal etcd clusters. 20 etcd_server_slow_apply_total total number of slow apply requests. Might indicate overload from a slow disk. 21 etcd_server_slow_read_indexes_total total number of pending read indexes not in sync with leader’s or timed out read index requests. 
The full list of metrics is available here.\nEtcd-Backup-Restore These metrics are exposed by the etcd-backup-restore container in each etcd pod.\nThe following list of metrics is applicable to clustering of a multi-node etcd cluster. The full list of metrics exposed by etcd-backup-restore is available here.\n No. Metrics Name Description 1. etcdbr_cluster_size to capture the scale-up/scale-down scenarios. 2. etcdbr_is_learner whether or not this member is a learner. 1 if it is, 0 otherwise. 3. etcdbr_is_learner_count_total total number of times the member was added as a learner. 4. etcdbr_restoration_duration_seconds total latency distribution required to restore the etcd member. 5. etcdbr_add_learner_duration_seconds total latency distribution of adding the etcd member as a learner to the cluster. 6. etcdbr_member_remove_duration_seconds total latency distribution of removing the etcd member from the cluster. 7. etcdbr_member_promote_duration_seconds total latency distribution of promoting the learner to the voting member. 8. etcdbr_defragmentation_duration_seconds total latency distribution of defragmentation of each etcd cluster member. Prometheus supplied metrics The Prometheus client library provides a number of metrics under the go and process namespaces.\n","categories":"","description":"","excerpt":"Monitoring etcd-druid uses Prometheus for metrics reporting. The …","ref":"/docs/other-components/etcd-druid/metrics/","tags":"","title":"Metrics"},{"body":"Migrate Azure Shoot Load Balancer from basic to standard SKU This guide describes how to migrate the Load Balancer of an Azure Shoot cluster from the basic SKU to the standard SKU. Be aware: You need to delete and recreate all services of type Load Balancer, which means that the public IP addresses of your service endpoints will change. Please do this only if the Stakeholder really needs to migrate this Shoot to use standard Load Balancers. 
All new Shoot clusters will automatically use Azure Standard Load Balancers.\n Temporarily disable Gardener’s reconciliation.\nThe Gardener Controller Manager needs to be configured to allow ignoring Shoot clusters. This can be configured in its ControllerManagerConfiguration via the field .controllers.shoot.respectSyncPeriodOverwrite=\"true\". # In the Garden cluster. kubectl annotate shoot \u003cshoot-name\u003e shoot.garden.sapcloud.io/ignore=\"true\" # In the Seed cluster. kubectl -n \u003cshoot-namespace\u003e scale deployment gardener-resource-manager --replicas=0 Back up all Kubernetes services of type Load Balancer. # In the Shoot cluster. # Determine all Load Balancer services. kubectl get service --all-namespaces | grep LoadBalancer # Back up each Load Balancer service. echo \"---\" \u003e\u003e service-backup.yaml \u0026\u0026 kubectl -n \u003cnamespace\u003e get service \u003cservice-name\u003e -o yaml \u003e\u003e service-backup.yaml Delete all Load Balancer services. # In the Shoot cluster. kubectl -n \u003cnamespace\u003e delete service \u003cservice-name\u003e Wait until the Load Balancer is deleted. Wait until all services of type Load Balancer are deleted and the Azure Load Balancer resource is also deleted. Check via the Azure Portal if the Load Balancer within the Shoot Resource Group has been deleted. This should happen automatically within a few minutes after all Kubernetes Load Balancer services are gone. Alternatively the Azure CLI can be used to check the Load Balancer in the Shoot Resource Group. The credentials to configure the CLI are available on the Seed cluster in the Shoot namespace.\n# In the Seed cluster. # Fetch the credentials from cloudprovider secret. kubectl -n \u003cshoot-namespace\u003e get secret cloudprovider -o yaml # Configure the Azure CLI with the base64 decoded values of the cloudprovider secret. 
az login --service-principal --username \u003cclientID\u003e --password \u003cclientSecret\u003e --tenant \u003ctenantID\u003e az account set -s \u003csubscriptionID\u003e # Constantly fetch the Shoot Load Balancer in the Shoot Resource Group. Wait until the resource is gone. watch 'az network lb show -g shoot--\u003cproject-name\u003e--\u003cshoot-name\u003e -n shoot--\u003cproject-name\u003e--\u003cshoot-name\u003e' # Log out. az logout Modify the cloud-provider-config configmap in the Seed namespace of the Shoot. The key cloudprovider.conf contains the Kubernetes cloud-provider configuration. The value is a multiline string. Please change the value of the field loadBalancerSku from basic to standard. If the field does not exist, then append loadBalancerSku: \\\"standard\\\"\\n to the value/string. # In the Seed cluster. kubectl -n \u003cshoot-namespace\u003e edit cm cloud-provider-config Enable Gardener’s reconciliation and trigger a reconciliation. # In the Garden cluster # Enable reconciliation kubectl annotate shoot \u003cshoot-name\u003e shoot.garden.sapcloud.io/ignore- # Trigger reconciliation kubectl annotate shoot \u003cshoot-name\u003e shoot.garden.sapcloud.io/operation=\"reconcile\" Wait until the cluster has been reconciled.\nRecreate the services from the backup file. You probably need to remove some fields from the service definitions, e.g. .spec.clusterIP, .metadata.uid or .status etc. kubectl apply -f service-backup.yaml If successful, remove the backup file. # Delete the backup file. 
rm -f service-backup.yaml ","categories":"","description":"","excerpt":"Migrate Azure Shoot Load Balancer from basic to standard SKU This …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/migrate-loadbalancer/","tags":"","title":"Migrate Loadbalancer"},{"body":"Control Plane Migration Control Plane Migration is a new Gardener feature that has been recently implemented as proposed in GEP-7 Shoot Control Plane Migration. 
It should be properly supported by all extension controllers. This document outlines some important points that extension maintainers should keep in mind to properly support migration in their extensions.\nOverall Principles The following principles should always be upheld:\n All state maintained by the extension that is external to the seed cluster, for example infrastructure resources in a cloud provider, DNS entries, etc., should be kept during the migration. No such state should be deleted and then recreated, as this might cause disruption in the availability of the shoot cluster. All Kubernetes resources maintained by the extension in the shoot cluster itself should also be kept during the migration. No such resources should be deleted and then recreated. Migrate and Restore Operations Two new operations have been introduced in Gardener. They can be specified as values of the gardener.cloud/operation annotation on an extension resource to indicate that an operation different from a normal reconcile should be performed by the corresponding extension controller:\n The migrate operation is used to ask the extension controller in the source seed to stop reconciling extension resources (in case they are requeued due to errors) and perform cleanup activities, if such are required. These cleanup activities might involve removing finalizers on resources in the shoot namespace that have been previously created by the extension controller and deleting them without actually deleting any resources external to the seed cluster. This is also the last opportunity for extensions to persist their state into the .status.state field of the reconciled extension resource before it is restored in the new destination seed cluster. The restore operation is used to ask the extension controller in the destination seed to restore any state saved in the extension resource status, before performing the actual reconciliation. 
Unlike the reconcile operation, extension controllers must remove the gardener.cloud/operation annotation at the end of a successful reconciliation when the current operation is migrate or restore, not at the beginning of a reconciliation.\nCleaning-Up Source Seed Resources All resources in the source seed that have been created by an extension controller, for example secrets, config maps, managed resources, etc., should be properly cleaned up by the extension controller when the current operation is migrate. As mentioned above, such resources should be deleted without actually deleting any resources external to the seed cluster.\nThere is one exception to this: Secrets labeled with persist=true created via the secrets manager. They should be kept (i.e., the Cleanup function of secrets manager should not be called) and will be garbage collected automatically at the end of the migrate operation. This ensures that they can be properly persisted in the ShootState resource and get restored on the new destination seed cluster.\nFor many custom resources, for example MCM resources, the above requirement means in practice that any finalizers should be removed before deleting the resource, in addition to ensuring that the resource deletion is not reconciled by its respective controller if there is no finalizer. For managed resources, the above requirement means in practice that the spec.keepObjects field should be set to true before deleting the extension resource.\nHere it is assumed that any resources that contain state needed by the extension controller can be safely deleted, since any such state has been saved as described in Saving and Restoring Extension States at the end of the last successful reconciliation.\nSaving and Restoring Extension States Some extension controllers create and maintain their own state when reconciling extension resources. 
For example, most infrastructure controllers use Terraform and maintain the terraform state in a special config map in the shoot namespace. This state must be properly migrated to the new seed cluster during control plane migration, so that subsequent reconciliations in the new seed could find and use it appropriately.\nAll extension controllers that require such state migration must save their state in the status.state field of their extension resource at the end of a successful reconciliation. They must also restore their state from that same field upon reconciling an extension resource when the current operation is restore, as specified by the gardener.cloud/operation annotation, before performing the actual reconciliation.\nAs an example, an infrastructure controller that uses Terraform must save the terraform state in the status.state field of the Infrastructure resource. An Infrastructure resource with a properly saved state might look as follows:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--foo--bar spec: type: azure region: eu-west-1 secretRef: name: cloudprovider namespace: shoot--foo--bar providerConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig resourceGroup: name: mygroup ... status: state: |{ \"version\": 3, \"terraform_version\": \"0.11.14\", \"serial\": 2, \"lineage\": \"3a1e2faa-e7b6-f5f0-5043-368dd8ea6c10\", ... 
} Extension controllers that do not use a saved state and therefore do not require state migration can leave the status.state field as nil at the end of a successful reconciliation, and just perform a normal reconciliation when the current operation is restore.\nIn addition, extension controllers that use referenced resources (usually secrets) must also make sure that these resources are added to the status.resources field of their extension resource at the end of a successful reconciliation, so that they can be properly migrated by Gardener to the destination seed.\nImplementation Details Migrate and Restore Actuator Methods Most extension controller implementations follow a common pattern where a generic Reconciler implementation delegates to an Actuator interface that contains the methods Reconcile and Delete, provided by the extension. Two methods, Migrate and Restore, are available in all such Actuator interfaces; see the infrastructure Actuator interface as an example. These methods are called by the generic reconcilers for the migrate and restore operations respectively, and should be implemented by the extension according to the above guidelines.\nExtension Controllers Based on Generic Actuators In practice, the implementation of many extension controllers (for example, the ControlPlane and Worker controllers in most provider extensions) is based on a generic Actuator implementation that only delegates to extension methods for behavior that is truly provider-specific. In all such cases, the Migrate and Restore methods have already been implemented properly in the generic actuators and there is nothing more to do in the extension itself.\nIn some rare cases, extension controllers based on a generic actuator might still introduce a custom Actuator implementation to override some of the generic actuator methods in order to enhance or change their behavior in a certain way. 
In such cases, the Migrate and Restore methods might need to be overridden as well; see the Azure controlplane controller as an example.\nWorker State Note that the machine state is handled specially by gardenlet (i.e., all relevant objects in the machine.sapcloud.io/v1alpha1 API are directly persisted by gardenlet and NOT by the generic actuators). In the past, they were persisted to the Worker’s .status.state field by the so-called “worker state reconciler”; however, this reconciler was dropped and changed as part of GEP-22. Nowadays, gardenlet directly writes the state to the ShootState resource during the Migrate phase of a Shoot (without the detour of the Worker’s .status.state field). On restoration, unlike for other extension kinds, gardenlet no longer populates the machine state into the Worker’s .status.state field. Instead, the extension controller should read the machine state directly from the ShootState in the garden cluster (see this document for information on how to access the garden cluster) and use it to subsequently restore the relevant machine.sapcloud.io/v1alpha1 resources. This flow is implemented in the generic Worker actuator. As a result, extension controllers using this generic actuator do not need to implement any custom logic.\nExtension Controllers Not Based on Generic Actuators The implementation of some extension controllers (for example, the infrastructure controllers in all provider extensions) is not based on a generic Actuator implementation. Such extension controllers must always provide a proper implementation of the Migrate and Restore methods according to the above guidelines; see the AWS infrastructure controller as an example. 
In practice, this might result in code duplication between the different extensions, since the Migrate and Restore code is usually not provider or OS-specific.\n If you do not use the generic Worker actuator, see this section for information on how to handle the machine state related to the Worker resource.\n ","categories":"","description":"","excerpt":"Control Plane Migration Control Plane Migration is a new Gardener …","ref":"/docs/gardener/extensions/migration/","tags":"","title":"Migration"},{"body":"Migration from Gardener v0 to v1 Please refer to the document for older Gardener versions.\n","categories":"","description":"","excerpt":"Migration from Gardener v0 to v1 Please refer to the document for …","ref":"/docs/gardener/deployment/migration_v0_to_v1/","tags":"","title":"Migration V0 To V1"},{"body":"Monitoring Work In Progress We will be introducing metrics for Dependency-Watchdog-Prober and Dependency-Watchdog-Weeder. These metrics will be pushed to Prometheus. Once that is completed, we will provide details on all the metrics that will be supported here.\n","categories":"","description":"","excerpt":"Monitoring Work In Progress We will be introducing metrics for …","ref":"/docs/other-components/dependency-watchdog/deployment/monitor/","tags":"","title":"Monitor"},{"body":"Monitoring The shoot-rsyslog-relp extension exposes metrics for the rsyslog service running on a Shoot’s nodes so that they can be easily viewed by cluster owners and operators in the Shoot’s Prometheus and Plutono instances. The exposed monitoring data offers valuable insights into the operation of the rsyslog service and can be used to detect and debug ongoing issues. This guide describes the various metrics, alerts and logs available to cluster owners and operators.\nMetrics Metrics for the rsyslog service originate from its impstats module. 
These include the number of messages in the various queues, the number of ingested messages, the number of processed messages by configured actions, system resources used by the rsyslog service, and others. More information about them can be found in the impstats documentation and the statistics counter documentation. They are exposed via the node-exporter running on each Shoot node and are scraped by the Shoot’s Prometheus instance.\nThese metrics can also be viewed in a dedicated dashboard named Rsyslog Stats in the Shoot’s Plutono instance. You can select the node for which you wish the metrics to be displayed from the Node dropdown menu (by default metrics are summed over all nodes).\nThe following is a list of all exposed rsyslog metrics. The name and origin labels can be used to determine whether the metric is for a queue, an action, plugins or system stats; the node label can be used to determine the node the metric originates from:\nrsyslog_pstat_submitted Number of messages that were submitted to the rsyslog service from its input. Currently rsyslog uses the /run/systemd/journal/syslog socket as input.\n Type: Counter Labels: name node origin rsyslog_pstat_processed Number of messages that are successfully processed by an action and sent to the target server.\n Type: Counter Labels: name node origin rsyslog_pstat_failed Number of messages that could not be processed by an action nor sent to the target server.\n Type: Counter Labels: name node origin rsyslog_pstat_suspended Total number of times an action suspended itself. Note that this counts the number of times the action transitioned from active to suspended state. The counter is no indication of how long the action was suspended or how often it was retried.\n Type: Counter Labels: name node origin rsyslog_pstat_suspended_duration The total number of seconds this action was disabled.\n Type: Counter Labels: name node origin rsyslog_pstat_resumed The total number of times this action resumed itself. 
A resumption occurs after the action has detected that a failure condition no longer exists.\n Type: Counter Labels: name node origin rsyslog_pstat_utime User time used in microseconds.\n Type: Counter Labels: name node origin rsyslog_pstat_stime System time used in microseconds.\n Type: Counter Labels: name node origin rsyslog_pstat_maxrss Maximum resident set size.\n Type: Gauge Labels: name node origin rsyslog_pstat_minflt Total number of minor faults the task has made per second, those which have not required loading a memory page from disk.\n Type: Counter Labels: name node origin rsyslog_pstat_majflt Total number of major faults the task has made per second, those which have required loading a memory page from disk.\n Type: Counter Labels: name node origin rsyslog_pstat_inblock Filesystem input operations.\n Type: Counter Labels: name node origin rsyslog_pstat_oublock Filesystem output operations.\n Type: Counter Labels: name node origin rsyslog_pstat_nvcsw Voluntary context switches.\n Type: Counter Labels: name node origin rsyslog_pstat_nivcsw Involuntary context switches.\n Type: Counter Labels: name node origin rsyslog_pstat_openfiles Number of open files.\n Type: Counter Labels: name node origin rsyslog_pstat_size Messages currently in queue.\n Type: Gauge Labels: name node origin rsyslog_pstat_enqueued Total messages enqueued.\n Type: Counter Labels: name node origin rsyslog_pstat_full Times queue was full.\n Type: Counter Labels: name node origin rsyslog_pstat_discarded_full Messages discarded due to queue being full.\n Type: Counter Labels: name node origin rsyslog_pstat_discarded_nf Messages discarded when queue not full.\n Type: Counter Labels: name node origin rsyslog_pstat_maxqsize Maximum size queue has reached.\n Type: Gauge Labels: name node origin rsyslog_augenrules_load_success Shows whether the augenrules --load command was executed successfully or not on the node.\n Type: Gauge Labels: node Alerts There are three alerts defined for the 
rsyslog service in the Shoot’s Prometheus instance:\nRsyslogTooManyRelpActionFailures This indicates that the cumulative failure rate in processing relp action messages is greater than 2%. In other words, it compares the rate of processed relp action messages to the rate of failed relp action messages and fires an alert when the following expression evaluates to true:\nsum(rate(rsyslog_pstat_failed{origin=\"core.action\",name=\"rsyslog-relp\"}[5m])) / sum(rate(rsyslog_pstat_processed{origin=\"core.action\",name=\"rsyslog-relp\"}[5m])) \u003e bool 0.02 RsyslogRelpActionProcessingRateIsZero This indicates that no messages are being sent to the upstream rsyslog target by the relp action. An alert is fired when the following expression evaluates to true:\nrate(rsyslog_pstat_processed{origin=\"core.action\",name=\"rsyslog-relp\"}[5m]) == 0 RsyslogRelpAuditRulesNotLoadedSuccessfully This indicates that augenrules --load was not executed successfully when called to load the configured audit rules. You should check if the auditd configuration you provided is valid. An alert is fired when the following expression evaluates to true:\nabsent(rsyslog_augenrules_load_success == 1) Users can subscribe to these alerts by following the Gardener alerting guide.\nLogging There are two ways to view the logs of the rsyslog service running on the Shoot’s nodes - either using the Explore tab of the Shoot’s Plutono instance, or ssh-ing directly to a node.\nTo view logs in Plutono, navigate to the Explore tab and select vali from the Explore dropdown menu. Afterwards, enter the following vali query:\n{nodename=\"\u003cname-of-node\u003e\"} |~ \"\\\"unit\\\":\\\"rsyslog.service\\\"\"\nNotice that you cannot use the unit label to filter for the rsyslog.service unit logs. 
Instead, you have to grep for the service as displayed in the example above.\nTo view logs when directly ssh-ing to a node in the Shoot cluster, use either of the following commands on the node:\nsystemctl status rsyslog\njournalctl -u rsyslog\n","categories":"","description":"","excerpt":"Monitoring The shoot-rsyslog-relp extension exposes metrics for the …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/monitoring/","tags":"","title":"Monitoring"},{"body":"Monitoring Roles of the different Prometheus instances Cache Prometheus Deployed in the garden namespace. Important scrape targets:\n cadvisor node-exporter kube-state-metrics Purpose: Act as a reverse proxy that supports server-side filtering, which is not supported by Prometheus exporters but by federation. Metrics in this Prometheus are kept for a short amount of time (~1 day) since other Prometheus instances are expected to federate from it and move metrics over. For example, the shoot Prometheus queries this Prometheus to retrieve metrics corresponding to the shoot’s control plane. This way, we achieve isolation so that shoot owners are only able to query metrics for their shoots. Please note Prometheus does not support isolation features. Another example is if another Prometheus needs access to cadvisor metrics, which does not support server-side filtering, so it will query this Prometheus instead of the cadvisor. This strategy also reduces load on the kubelets and API Server.\nNote some of these Prometheus’ metrics have high cardinality (e.g., metrics related to all shoots managed by the seed). Some of these are aggregated with recording rules. These pre-aggregated metrics are scraped by the aggregate Prometheus.\nThis Prometheus is not used for alerting.\nAggregate Prometheus Deployed in the garden namespace. Important scrape targets:\n other Prometheus instances logging components Purpose: Store pre-aggregated data from the cache Prometheus and shoot Prometheus. 
An ingress exposes this Prometheus allowing it to be scraped from another cluster. Such pre-aggregated data is also used for alerting.\nSeed Prometheus Deployed in the garden namespace. Important scrape targets:\n pods in extension namespaces annotated with: prometheus.io/scrape=true prometheus.io/port=\u003cport\u003e prometheus.io/name=\u003cname\u003e cadvisor metrics from pods in the garden and extension namespaces The job name label will be applied to all metrics from that service.\nPurpose: Entrypoint for operators when debugging issues with extensions or other garden components.\nThis Prometheus is not used for alerting.\nShoot Prometheus Deployed in the shoot control plane namespace. Important scrape targets:\n control plane components shoot nodes (node-exporter) blackbox-exporter used to measure connectivity Purpose: Monitor all relevant components belonging to a shoot cluster managed by Gardener. Shoot owners can view the metrics in Plutono dashboards and receive alerts based on these metrics. For alerting internals refer to this document.\nCollect all shoot Prometheus with remote write An optional collection of all shoot Prometheus metrics to a central Prometheus (or cortex) instance is possible with the monitoring.shoot setting in GardenletConfiguration:\nmonitoring: shoot: remoteWrite: url: https://remoteWriteUrl # remote write URL keep:# metrics that should be forwarded to the external write endpoint. If empty all metrics get forwarded - kube_pod_container_info externalLabels: # add additional labels to metrics to identify it on the central instance additional: label If basic auth is needed it can be set via secret in garden namespace (Gardener API Server). 
Example secret\nDisable Gardener Monitoring If you wish to disable metric collection for every shoot and roll your own, you can simply set:\nmonitoring: shoot: enabled: false ","categories":"","description":"","excerpt":"Monitoring Roles of the different Prometheus instances Cache …","ref":"/docs/gardener/monitoring/readme/","tags":"","title":"Monitoring"},{"body":"Extending the Monitoring Stack This document provides instructions to extend the Shoot cluster monitoring stack by integrating new scrape targets, alerts and dashboards.\nPlease ensure that you have understood the basic principles of Prometheus and its ecosystem before you continue.\n‼️ The purpose of the monitoring stack is to observe the behaviour of the control plane and the system components deployed by Gardener onto the worker nodes. Monitoring of custom workloads running in the cluster is out of scope.\nOverview Each Shoot cluster comes with its own monitoring stack. The following components are deployed into the seed and shoot:\n Seed Prometheus Plutono blackbox-exporter kube-state-metrics (Seed metrics) kube-state-metrics (Shoot metrics) Alertmanager (Optional) Shoot node-exporter(s) kube-state-metrics blackbox-exporter In each Seed cluster there is a Prometheus in the garden namespace responsible for collecting metrics from the Seed kubelets and cAdvisors. These metrics are provided to each Shoot Prometheus via federation.\nThe alerts for all Shoot clusters hosted on a Seed are routed to a central Alertmanager running in the garden namespace of the Seed. The purpose of this central Alertmanager is to forward all important alerts to the operators of the Gardener setup.\nThe Alertmanager in the Shoot namespace on the Seed is only responsible for forwarding alerts from its Shoot cluster to a cluster owner/cluster alert receiver via email. 
The Alertmanager is optional and the conditions for a deployment are already described in Alerting.\nThe node-exporter’s textfile collector is enabled and configured to parse all *.prom files in the /var/lib/node-exporter/textfile-collector directory on each Shoot node. Scripts and programs which run on Shoot nodes and cannot expose an endpoint to be scraped by Prometheus can use this directory to export metrics in files that match the glob *.prom using the text format.\nAdding New Monitoring Targets After exploring the metrics which your component provides or adding new metrics, you should be aware of which metrics are required to write the needed alerts and dashboards.\nPrometheus prefers a pull-based metrics collection approach and therefore the targets to observe need to be defined upfront. The targets are defined in charts/seed-monitoring/charts/core/charts/prometheus/templates/config.yaml. New scrape jobs can be added in the section scrape_configs. Detailed information on how to configure scrape jobs and how to use the Kubernetes service discovery is available in the Prometheus documentation.\nThe job_name of a scrape job should be the name of the component, e.g. kube-apiserver or vpn. The collection interval should be the default of 30s. You do not need to specify this in the configuration.\nPlease do not ingest all metrics which are provided by a component. Rather, collect only those metrics which are needed to define the alerts and dashboards (i.e. whitelist). This can be achieved by adding the following metric_relabel_configs statement to your scrape jobs (replace exampleComponent with component name).\n - job_name: example-component ... 
metric_relabel_configs: {{ include \"prometheus.keep-metrics.metric-relabel-config\" .Values.allowedMetrics.exampleComponent | indent 6 }} The whitelist for the metrics of your job can be maintained in charts/seed-monitoring/charts/core/charts/prometheus/values.yaml in section allowedMetrics.exampleComponent (replace exampleComponent with component name). Check the following example:\nallowedMetrics: ... exampleComponent: - metrics_name_1 - metrics_name_2 ... Adding Alerts The alert definitions are located in charts/seed-monitoring/charts/core/charts/prometheus/rules. There are two approaches for adding new alerts.\n Adding additional alerts for a component which already has a set of alerts. In this case you have to extend the existing rule file for the component. Adding alerts for a new component. In this case a new rule file with name scheme example-component.rules.yaml needs to be added. Add the new alert to alertInhibitionGraph.dot, add any required inhibition flows and render the new graph. To render the graph, run: dot -Tpng ./content/alertInhibitionGraph.dot -o ./content/alertInhibitionGraph.png Create a test for the new alert. See Alert Tests. Example alert:\ngroups: - name: example.rules rules: - alert: ExampleAlert expr: absent(up{job=\"exampleJob\"} == 1) for: 20m labels: service: example severity: critical # How severe is the alert? (blocker|critical|info|warning) type: shoot # For which topology is the alert relevant? (seed|shoot) visibility: all # Who should receive the alerts? (all|operator|owner) annotations: description: A longer description of the example alert that should also explain the impact of the alert. summary: Short summary of an example alert. If the deployment of the component is optional, then the alert definitions need to be added to charts/seed-monitoring/charts/core/charts/prometheus/optional-rules instead. 
Furthermore, the alerts for the component need to be activatable in charts/seed-monitoring/charts/core/charts/prometheus/values.yaml via rules.optional.example-component.enabled. The default should be true.\nBasic instructions on how to define alert rules can be found in the Prometheus documentation.\nRouting Tree The Alertmanager groups incoming alerts based on labels into buckets. Each bucket has its own configuration like alert receivers, initial delaying duration or resending frequency, etc. You can find more information about Alertmanager routing in the Prometheus/Alertmanager documentation. The routing trees for the Alertmanagers deployed by Gardener are depicted below.\nCentral Seed Alertmanager\n∟ main route (all alerts for all shoots on the seed will enter) ∟ group by project and shoot name ∟ group by visibility \"all\" and \"operator\" ∟ group by severity \"blocker\", \"critical\", and \"info\" → route to Garden operators ∟ group by severity \"warning\" (dropped) ∟ group by visibility \"owner\" (dropped) Shoot Alertmanager\n∟ main route (only alerts for one Shoot will enter) ∟ group by visibility \"all\" and \"owner\" ∟ group by severity \"blocker\", \"critical\", and \"info\" → route to cluster alert receiver ∟ group by severity \"warning\" (dropped, will change soon → route to cluster alert receiver) ∟ group by visibility \"operator\" (dropped) Alert Inhibition All alerts related to components running on the Shoot workers are inhibited in case of an issue with the vpn connection, because those components can’t be scraped anymore and Prometheus will fire alerts as a consequence. The components running on the workers are probably healthy and the alerts are presumably false positives. The inhibition flow is shown in the figure below. If you add a new alert, make sure to add it to the diagram.\nAlert Attributes Each alert rule definition has to contain the following annotations:\n summary: A short description of the issue. 
description: A detailed explanation of the issue with hints to the possible root causes and the impact assessment of the issue. In addition, each alert must contain the following labels:\n type shoot: Components running on the Shoot worker nodes in the kube-system namespace. seed: Components running on the Seed in the Shoot namespace as part of/next to the control plane. service Name of the component (in lowercase) e.g. kube-apiserver, alertmanager or vpn. severity blocker: All issues which make the cluster entirely unusable, e.g. KubeAPIServerDown or KubeSchedulerDown. critical: All issues which affect single functionalities/components but do not affect the cluster in its core functionality, e.g. VPNDown or KubeletDown. info: All issues that do not affect the cluster or its core functionality, but if this component is down we cannot determine if a blocker alert is firing (i.e., a component with an info level severity is a dependency for a component with a blocker severity). warning: No currently existing issue, rather a hint for situations which could lead to a real issue in the near future, e.g. HighLatencyApiServerToWorkers or ApiServerResponseSlow. Adding Plutono Dashboards The dashboard definition files are located in charts/seed-monitoring/charts/plutono/dashboards. Every dashboard needs its own file.\nIf you are adding a new component dashboard, please also update the overview dashboard by adding a chart for its current up/down status and a drill-down option to the component dashboard.\nDashboard Structure The dashboards should be structured in the following way. The assignment of the component dashboards to the categories should be handled via dashboard tags.\n Kubernetes control plane components (Tag: control-plane) All components which are part of the Kubernetes control plane e.g. 
Kube API Server, Kube Controller Manager, Kube Scheduler and Cloud Controller Manager ETCD + Backup/Restore Kubernetes Addon Manager Node/Machine components (Tag: node/machine) All metrics which are related to the behaviour/control of the Kubernetes nodes and kubelets Machine-Controller-Manager + Cluster Autoscaler Networking components (Tag: network) CoreDNS, KubeProxy, Calico, VPN, Nginx Ingress Addon components (Tag: addon) Cert Broker Monitoring components (Tag: monitoring) Logging components (Tag: logging) Mandatory Charts for Component Dashboards For each new component, its corresponding dashboard should contain the following charts in the first row, before adding custom charts for the component in the subsequent rows.\n Pod up/down status up{job=\"example-component\"} Pod/containers CPU utilization Pod/containers memory consumption Pod/containers network I/O That information is provided by the cAdvisor metrics. These metrics are already integrated. Please check the other dashboards for detailed information on how to query.\nChart Requirements Each chart needs to contain:\n a meaningful name a detailed description (for non-trivial charts) appropriate x/y axis descriptions appropriate scaling levels for the x/y axis proper units for the x/y axis Dashboard Parameters The following parameters should be added to all dashboards to ensure a homogeneous experience across all dashboards.\nDashboards have to:\n contain a title which refers to the component name(s) contain a timezone statement which should be the browser time contain tags which express where the component is running (seed or shoot) and to which category the component belongs (see dashboard structure) contain a version statement with a value of 1 be immutable Example dashboard configuration:\n{ \"title\": \"example-component\", \"timezone\": \"utc\", \"tags\": [ \"seed\", \"control-plane\" ], \"version\": 1, \"editable\": \"false\" } Furthermore, all dashboards should contain the following time 
options:\n{ \"time\": { \"from\": \"now-1h\", \"to\": \"now\" }, \"timepicker\": { \"refresh_intervals\": [ \"30s\", \"1m\", \"5m\" ], \"time_options\": [ \"5m\", \"15m\", \"1h\", \"6h\", \"12h\", \"24h\", \"2d\", \"10d\" ] } } ","categories":"","description":"","excerpt":"Extending the Monitoring Stack This document provides instructions to …","ref":"/docs/gardener/monitoring-stack/","tags":"","title":"Monitoring Stack"},{"body":"Overview You can configure a NetworkPolicy to deny all the traffic from other namespaces while allowing all the traffic coming from the same namespace the pod was deployed into.\nThere are many reasons why you may choose to employ Kubernetes network policies:\n Isolate multi-tenant deployments Regulatory compliance Ensure containers assigned to different environments (e.g. dev/staging/prod) cannot interfere with each other Kubernetes network policies are application-centric compared to infrastructure/network-centric standard firewalls. There are no explicit CIDRs or IP addresses used for matching source or destination IPs. Network policies build upon labels and selectors which are key concepts of Kubernetes that are used to organize (for example, all DB tier pods of an app) and select subsets of objects.\nExample We create two nginx HTTP servers in two namespaces and block all traffic between the two namespaces. E.g. 
you are unable to get content from namespace1 if you are sitting in namespace2.\nSetup the Namespaces # create two namespaces for test purpose kubectl create ns customer1 kubectl create ns customer2 # create a standard HTTP web server kubectl create deployment nginx --image=nginx --replicas=1 --port=80 -n=customer1 kubectl create deployment nginx --image=nginx --replicas=1 --port=80 -n=customer2 # expose the port 80 for external access kubectl expose deployment nginx --port=80 --type=NodePort -n=customer1 kubectl expose deployment nginx --port=80 --type=NodePort -n=customer2 Test Without NP Create a pod with curl preinstalled inside the namespace customer1:\n# create a \"bash\" pod in one namespace kubectl run -i --tty client --image=tutum/curl -n=customer1 Try to curl the exposed nginx server to get the default index.html page. Execute this in the bash prompt of the pod created above.\n# get the index.html from the nginx of the namespace \"customer1\" =\u003e success curl http://nginx.customer1 # get the index.html from the nginx of the namespace \"customer2\" =\u003e success curl http://nginx.customer2 Both calls are done in a pod within the namespace customer1 and both nginx servers are always reachable, no matter in what namespace.\n Test with NP Install the NetworkPolicy from your shell:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: deny-from-other-namespaces spec: podSelector: matchLabels: ingress: - from: - podSelector: {} It applies the policy to ALL pods in the named namespace, as spec.podSelector.matchLabels is empty and therefore selects all pods. It allows traffic from ALL pods in the named namespace, as spec.ingress.from.podSelector is empty and therefore selects all pods. 
kubectl apply -f ./network-policy.yaml -n=customer1 kubectl apply -f ./network-policy.yaml -n=customer2 After this, curl http://nginx.customer2 shouldn’t work anymore if you are a service inside the namespace customer1, and vice versa. Note This policy, once applied, will also disable all external traffic to these pods. For example, you can create a service of type LoadBalancer in namespace customer1 that matches the nginx pod. When you request the service by its \u003cEXTERNAL_IP\u003e:\u003cPORT\u003e, the network policy will deny the ingress traffic from the service and the request will time out. Related Links You can get more information on how to configure the NetworkPolicies at:\n Calico WebSite Kubernetes NP Recipes ","categories":"","description":"Deny all traffic from other namespaces","excerpt":"Deny all traffic from other namespaces","ref":"/docs/guides/applications/network-isolation/","tags":"","title":"Namespace Isolation"},{"body":"Necessary Labeling for Custom CSI Components Some provider extensions for Gardener are using CSI components to manage persistent volumes in the shoot clusters. Additionally, most of the provider extensions are deploying controllers for taking volume snapshots (CSI snapshotter).\nEnd-users can deploy their own CSI components and controllers into shoot clusters. In such situations, there are multiple controllers acting on the VolumeSnapshot custom resources (each responsible for those instances associated with their respective driver provisioner types).\nHowever, this might lead to operational conflicts that cannot be overcome by Gardener alone. Concretely, Gardener cannot know which custom CSI components were installed by end-users, which can lead to issues, especially during shoot cluster deletion. You can add a label to your custom CSI components indicating that Gardener should not try to remove them during shoot cluster deletion. 
This means you have to take care of the lifecycle for these components yourself!\nRecommendations Custom CSI components are typically regular Deployments running in the shoot clusters.\nPlease label them with the shoot.gardener.cloud/no-cleanup=true label.\nBackground Information When a shoot cluster is deleted, Gardener deletes most Kubernetes resources (Deployments, DaemonSets, StatefulSets, etc.). Gardener will also try to delete CSI components if they are not marked with the above-mentioned label.\nThis can result in VolumeSnapshot resources still having finalizers that will never be cleaned up. Consequently, manual intervention is required to clean them up before the cluster deletion can continue.\n","categories":"","description":"","excerpt":"Necessary Labeling for Custom CSI Components Some provider extensions …","ref":"/docs/gardener/csi_components/","tags":"","title":"Necessary Labeling for Custom CSI Components"},{"body":"Gardener Network Extension Gardener is an open-source project that provides a nested user model. Basically, there are two types of services provided by Gardener to its users:\n Managed: end-users only request a Kubernetes cluster (Clusters-as-a-Service) Hosted: operators utilize Gardener to provide their own managed version of Kubernetes (Cluster-Provisioner-as-a-service) Whether a user is an operator or an end-user, it makes sense to provide choice. For example, for an end-user it might make sense to choose a network-plugin that would support enforcing network policies (some plugins do not come with network-policy support by default). For operators, however, choice only matters for delegation purposes, i.e., when providing their own managed service, it becomes important to also provide choice over which network-plugins to use.\nFurthermore, Gardener provisions clusters on different cloud-providers with different networking requirements. 
For example, Azure does not support Calico overlay networking with IP in IP [1]. This leads to the introduction of manual exceptions in static add-on charts, which is error-prone and can lead to failures during upgrades.\nFinally, every provider is different, and thus the network always needs to adapt to the infrastructure needs to provide better performance. Consistency does not necessarily lie in the implementation but in the interface.\nMotivation Prior to the Network Extensibility concept, Gardener followed a mono network-plugin support model (i.e., Calico). Although this seemed to be the easier approach, it did not completely reflect the real use-case. The goal of the Gardener Network Extensions is to support different network plugins. Therefore, the specification for the network resource won’t be fixed and will be customized based on the underlying network plugin.\nTo do so, a ProviderConfig field will be provided in the spec, in which each plugin defines its own configuration. Below is an example of how to deploy Calico as the cluster network plugin.\nThe Network Extensions Resource Here is what a typical Network resource looks like:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Network metadata: name: my-network spec: ipFamilies: - IPv4 podCIDR: 100.244.0.0/16 serviceCIDR: 100.32.0.0/13 type: calico providerConfig: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig backend: bird ipam: cidr: usePodCIDR type: host-local The above resource is divided into two parts (more information can be found at Using the Networking Calico Extension):\n global configuration (e.g., podCIDR, serviceCIDR, and type) provider specific config (e.g., for calico we can choose to configure a bird backend) Note: Certain cloud-provider extensions might have webhooks that would modify the network-resource to fit into their network specific context. 
As previously mentioned, Azure does not support IPIP. As a result, the Azure provider extension implements a webhook to mutate the backend and set it to None instead of bird.\n Supporting a New Network Extension Provider To add support for another networking provider (e.g., weave, Cilium, Flannel), a network extension controller needs to be implemented, which would optionally have its own custom configuration specified in the spec.providerConfig in the Network resource. For example, if support for a network plugin named gardenet is required, the following Network resource would be created:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Network metadata: name: my-network spec: ipFamilies: - IPv4 podCIDR: 100.244.0.0/16 serviceCIDR: 100.32.0.0/13 type: gardenet providerConfig: apiVersion: gardenet.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig gardenetCustomConfigField: \u003cvalue\u003e ipam: cidr: usePodCIDR type: host-local Once applied, the presumably implemented Gardenet extension controller would pick the configuration up, parse the providerConfig, and create the necessary resources in the shoot.\nFor additional reference, please have a look at the networking-calico provider extension, which provides more information on how to configure the necessary charts, as well as the actuators required to reconcile networking inside the Shoot cluster to the desired state.\nSupporting kube-proxy-less Service Routing Some networking extensions support service routing without the kube-proxy component. This is why Gardener supports disabling of kube-proxy for service routing by setting .spec.kubernetes.kubeproxy.enabled to false in the Shoot specification. 
The implicit contract of the flag is:\nIf kube-proxy is disabled, then the networking extension is responsible for the service routing.\nThe networking extensions need to handle this twofold:\n During the reconciliation of the networking resources, the extension needs to check whether kube-proxy takes care of the service routing or the networking extension itself should handle it. In case the networking extension should be responsible according to .spec.kubernetes.kubeproxy.enabled (but is unable to perform the service routing), it should raise an error during the reconciliation. If the networking extension should handle the service routing, it may reconfigure itself accordingly. (Optional) In case the networking extension does not support taking over the service routing (in some scenarios), it is recommended to also provide a validating admission webhook to reject corresponding changes early on. The validation may take the current operating mode of the networking extension into consideration. Related Links [1] Calico overlay networking on Azure ","categories":"","description":"","excerpt":"Gardener Network Extension Gardener is an open-source project that …","ref":"/docs/gardener/extensions/network/","tags":"","title":"Network"},{"body":"NetworkPolicys In Garden, Seed, Shoot Clusters This document describes which Kubernetes NetworkPolicys are deployed by Gardener into the various clusters.\nGarden Cluster (via gardener-operator and gardener-resource-manager)\nThe gardener-operator runs a NetworkPolicy controller which is responsible for the following namespaces:\n garden istio-system *istio-ingress-* shoot-* extension-* (in case the garden cluster is a seed cluster at the same time) It deploys the following so-called “general NetworkPolicys”:\n Name Purpose deny-all Denies all ingress and egress traffic for all pods in this namespace. Hence, all traffic must be explicitly allowed. 
allow-to-dns Allows egress traffic from pods labeled with networking.gardener.cloud/to-dns=allowed to DNS pods running in the kube-system namespace. In practice, most of the pods performing network egress traffic need this label. allow-to-runtime-apiserver Allows egress traffic from pods labeled with networking.gardener.cloud/to-runtime-apiserver=allowed to the API server of the runtime cluster. allow-to-blocked-cidrs Allows egress traffic from pods labeled with networking.gardener.cloud/to-blocked-cidrs=allowed to explicitly blocked addresses configured by human operators (configured via .spec.networking.blockedCIDRs in the Seed). For instance, this can be used to block the cloud provider’s metadata service. allow-to-public-networks Allows egress traffic from pods labeled with networking.gardener.cloud/to-public-networks=allowed to all public network IPs, except for private networks (RFC1918), carrier-grade NAT (RFC6598), and explicitly blocked addresses configured by human operators. In practice, this blocks egress traffic to all networks in the cluster and only allows egress traffic to public IPv4 addresses. allow-to-private-networks Allows egress traffic from pods labeled with networking.gardener.cloud/to-private-networks=allowed to the private networks (RFC1918) and carrier-grade NAT (RFC6598) except for cluster-specific networks (configured via .spec.networks in the Seed). Apart from those, the gardener-operator also enables the NetworkPolicy controller of gardener-resource-manager. Please find more information in the linked document. In summary, most of the pods that initiate connections with other pods will have labels with networking.resources.gardener.cloud/ prefixes. This way, they leverage the automatically created NetworkPolicys by the controller. 
As a result, in most cases no special/custom-crafted NetworkPolicys need to be created anymore.\nSeed Cluster (via gardenlet and gardener-resource-manager)\nIn seed clusters it works the same way as in the garden cluster managed by gardener-operator. When a seed cluster is the garden cluster at the same time, gardenlet does not enable the NetworkPolicy controller (since gardener-operator already runs it). Otherwise, it uses the exact same controller and code as gardener-operator, resulting in the same behaviour in both garden and seed clusters.\nLogging \u0026 Monitoring Seed System Namespaces As part of the seed reconciliation flow, the gardenlet deploys various Prometheus instances into the garden namespace. See also this document for more information. Each pod that should be scraped for metrics by these instances must have a Service which is annotated with\nannotations: networking.resources.gardener.cloud/from-all-seed-scrape-targets-allowed-ports: '[{\"port\":\u003cmetrics-port-on-pod\u003e,\"protocol\":\"\u003cprotocol, typically TCP\u003e\"}]' If the respective pod is not running in the garden namespace, the Service needs this annotation in addition:\nannotations: networking.resources.gardener.cloud/namespace-selectors: '[{\"matchLabels\":{\"kubernetes.io/metadata.name\":\"garden\"}}]' If the respective pod is running in an extension-* namespace, the Service needs this annotation in addition:\nannotations: networking.resources.gardener.cloud/pod-label-selector-namespace-alias: extensions This automatically allows the needed network traffic from the respective Prometheus pods.\nShoot Namespaces As part of the shoot reconciliation flow, the gardenlet deploys a shoot-specific Prometheus into the shoot namespace. 
Each pod that should be scraped for metrics must have a Service which is annotated with\nannotations: networking.resources.gardener.cloud/from-all-scrape-targets-allowed-ports: '[{\"port\":\u003cmetrics-port-on-pod\u003e,\"protocol\":\"\u003cprotocol, typically TCP\u003e\"}]' This automatically allows the network traffic from the Prometheus pod.\nWebhook Servers Components serving webhook handlers that must be reached by kube-apiservers of the virtual garden cluster or shoot clusters just need to annotate their Service as follows:\nannotations: networking.resources.gardener.cloud/from-all-webhook-targets-allowed-ports: '[{\"port\":\u003cserver-port-on-pod\u003e,\"protocol\":\"\u003cprotocol, typically TCP\u003e\"}]' This automatically allows the network traffic from the API server pods.\nIn case the servers run in a different namespace than the kube-apiservers, the following annotations are needed:\nannotations: networking.resources.gardener.cloud/from-all-webhook-targets-allowed-ports: '[{\"port\":\u003cserver-port-on-pod\u003e,\"protocol\":\"\u003cprotocol, typically TCP\u003e\"}]' networking.resources.gardener.cloud/pod-label-selector-namespace-alias: extensions # for the virtual garden cluster: networking.resources.gardener.cloud/namespace-selectors: '[{\"matchLabels\":{\"kubernetes.io/metadata.name\":\"garden\"}}]' # for shoot clusters: networking.resources.gardener.cloud/namespace-selectors: '[{\"matchLabels\":{\"gardener.cloud/role\":\"shoot\"}}]' Additional Namespace Coverage in Garden/Seed Cluster In some cases, garden or seed clusters might run components in dedicated namespaces which are not covered by the controller by default (see list above). 
Still, it might (or should) be desired to also include such “custom namespaces” in the control of the NetworkPolicy controllers.\nIn order to do so, human operators can adapt the component configs of gardener-operator or gardenlet by providing label selectors for additional namespaces:\ncontrollers: networkPolicy: additionalNamespaceSelectors: - matchLabels: foo: bar Communication With kube-apiserver For Components In Custom Namespaces Egress Traffic Components running in such custom namespaces might need to initiate the communication with the kube-apiservers of the virtual garden cluster or a shoot cluster. In order to achieve this, their custom namespace must be labeled with networking.gardener.cloud/access-target-apiserver=allowed. This will make the NetworkPolicy controllers automatically provision the required policies into their namespace.\nAs a result, the respective component pods just need to be labeled with\n networking.resources.gardener.cloud/to-garden-virtual-garden-kube-apiserver-tcp-443=allowed (virtual garden cluster) networking.resources.gardener.cloud/to-all-shoots-kube-apiserver-tcp-443=allowed (shoot clusters) Ingress Traffic Components running in such custom namespaces might serve webhook handlers that must be reached by the kube-apiservers of the virtual garden cluster or a shoot cluster. In order to achieve this, their Service must be annotated. Please refer to this section for more information.\nShoot Cluster (via gardenlet)\nFor shoot clusters, the concepts mentioned above don’t apply and are not enabled. Instead, gardenlet only deploys a few “custom” NetworkPolicys for the shoot system components running in the kube-system namespace. All other namespaces in the shoot cluster do not contain network policies deployed by gardenlet.\nAs a best practice, every pod deployed into the kube-system namespace should use appropriate NetworkPolicys in order to only allow required network traffic. 
Therefore, pods should have labels matching the selectors of the available network policies.\ngardenlet deploys the following NetworkPolicys:\nNAME POD-SELECTOR gardener.cloud--allow-dns k8s-app in (kube-dns) gardener.cloud--allow-from-seed networking.gardener.cloud/from-seed=allowed gardener.cloud--allow-to-dns networking.gardener.cloud/to-dns=allowed gardener.cloud--allow-to-apiserver networking.gardener.cloud/to-apiserver=allowed gardener.cloud--allow-to-from-nginx app=nginx-ingress gardener.cloud--allow-to-kubelet networking.gardener.cloud/to-kubelet=allowed gardener.cloud--allow-to-public-networks networking.gardener.cloud/to-public-networks=allowed gardener.cloud--allow-vpn app=vpn-shoot Note that a deny-all policy will not be created by gardenlet. Shoot owners can create it manually if needed/desired. The NetworkPolicys listed above ensure that the traffic for the shoot system components is allowed in case such a deny-all policy is created.\nWebhook Servers in Shoot Clusters Shoot components serving webhook handlers must be reached by kube-apiservers of the shoot cluster. However, the control plane components, e.g. kube-apiserver, run on the seed cluster decoupled by a VPN connection. Therefore, shoot components serving webhook handlers need to allow the VPN endpoints in the shoot cluster as clients to allow kube-apiservers to call them.\nFor the kube-system namespace, the network policy gardener.cloud--allow-from-seed fulfils the purpose of allowing pods to mark themselves as targets for such calls, allowing corresponding traffic to pass through.\nFor custom namespaces, operators can use the network policy gardener.cloud--allow-from-seed as a template. Please note that the label selector may change over time, i.e. with Gardener version updates. 
This is why a simpler variant with a reduced label selector like the example below is recommended:\napiVersion: networking.k8s.io/v1 kind: NetworkPolicy metadata: name: allow-from-seed namespace: custom-namespace spec: ingress: - from: - namespaceSelector: matchLabels: gardener.cloud/purpose: kube-system podSelector: matchLabels: app: vpn-shoot Implications for Gardener Extensions Gardener extensions sometimes need to deploy additional components into the shoot namespace in the seed cluster hosting the control plane. For example, the gardener-extension-provider-aws deploys the cloud-controller-manager into the shoot namespace. In most cases, such pods require network policy labels to allow the traffic they are initiating.\nFor components deployed in the kube-system namespace of the shoots (e.g., CNI plugins or CSI drivers, etc.), custom NetworkPolicys might be required to ensure the respective components can still communicate in case the user creates a deny-all policy.\n","categories":"","description":"","excerpt":"NetworkPolicys In Garden, Seed, Shoot Clusters This document describes …","ref":"/docs/gardener/network_policies/","tags":"","title":"Network Policies"},{"body":"Gardener Extension for Network Problem Detector \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the shoot-networking-problemdetector extension.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nExtension Resources Currently there is nothing to specify in the extension spec.\nExample extension resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: extension-shoot-networking-problemdetector namespace: shoot--project--abc spec: When an extension resource is reconciled, the extension controller will create two daemonsets nwpd-agent-pod-net and nwpd-agent-node-net deploying the “network problem detector agent”. These daemon sets perform and collect various checks between all nodes of the Kubernetes cluster, to its Kube API server and/or external endpoints. Checks are performed using TCP connections, PING (ICMP) or mDNS (UDP). More details about the network problem detector agent can be found in its repository gardener/network-problem-detector.\nPlease note, this extension controller relies on the Gardener-Resource-Manager to deploy k8s resources to seed and shoot clusters.\nHow to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nWe are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension for deploying network problem detector","excerpt":"Gardener extension for deploying network problem detector","ref":"/docs/extensions/others/gardener-extension-shoot-networking-problemdetector/","tags":"","title":"Networking problemdetector"},{"body":"Adding Cloud Providers This document provides an overview of how to integrate a new cloud provider into Gardener. Each component that requires integration has a detailed description of how to integrate it and the steps required.\nCloud Components Gardener is composed of 2 or more Kubernetes clusters:\n Shoot: These are the end-user clusters, the regular Kubernetes clusters you have seen. They provide places for your workloads to run. Seed: This is the “management” cluster. It manages the control planes of shoots by running them as native Kubernetes workloads. These two clusters can run in the same cloud provider, but they do not need to. For example, you could run your Seed in AWS, while having one shoot in Azure, two in Google, two in Alicloud, and three in Equinix Metal.\nThe Seed cluster deploys and manages the Shoot clusters. Importantly, for this discussion, the etcd data store backing each Shoot runs as workloads inside the Seed. 
Thus, to use the above example, the clusters in Azure, Google, Alicloud and Equinix Metal will have their worker nodes and master nodes running in those clouds, but the etcd clusters backing them will run as separate deployments in the Seed Kubernetes cluster on AWS.\nThis distinction becomes important when preparing the integration to a new cloud provider.\nGardener Cloud Integration Gardener and its related components integrate with cloud providers at the following key lifecycle elements:\n Create/destroy/get/list machines for the Shoot. Create/destroy/get/list infrastructure components for the Shoot, e.g. VPCs, subnets, routes, etc. Backup/restore etcd for the Seed via writing files to and reading them from object storage. Thus, the integrations you need for your cloud provider depend on whether you want to deploy Shoot clusters to the provider, Seed or both.\n Shoot Only: machine lifecycle management, infrastructure Seed: etcd backup/restore Gardener API In addition to the requirements to integrate with the cloud provider, you also need to enable the core Gardener app to receive, validate, and process requests to use that cloud provider.\n Expose the cloud provider to the consumers of the Gardener API, so it can be told to use that cloud provider as an option. Validate that API as requests come in. Write cloud provider specific implementation (called “provider extension”). Cloud Provider API Requirements In order for a cloud provider to integrate with Gardener, the provider must have an API to perform machine lifecycle events, specifically:\n Create a machine Destroy a machine Get information about a machine and its state List machines In addition, if the Seed is to run on the given provider, it also must have an API to save files to object storage and retrieve them, for etcd backup/restore.\nThe current integration with cloud providers is to add their API calls to Gardener and the Machine Controller Manager. 
As both Gardener and the Machine Controller Manager are written in Go, the cloud provider should have a Go SDK. However, if it has an API that is wrappable in Go, e.g. a REST API, then you can use that to integrate.\nThe Gardener team is working on bringing cloud provider integrations out-of-tree, making them pluggable, which should simplify the process and make it possible to use other SDKs.\nSummary To add a new cloud provider, you need some or all of the following. Each repository contains instructions on how to extend it to a new cloud provider.\n Type Purpose Location Documentation Seed or Shoot Machine Lifecycle machine-controller-manager MCM new cloud provider Seed only etcd backup/restore etcd-backup-restore In process All Extension implementation gardener Extension controller ","categories":"","description":"","excerpt":"Adding Cloud Providers This document provides an overview of how to …","ref":"/docs/gardener/new-cloud-provider/","tags":"","title":"New Cloud Provider"},{"body":"Adding Support For a New Kubernetes Version This document describes the steps that need to be performed in order to confidently add support for a new Kubernetes minor version.\n ⚠️ Typically, once a minor Kubernetes version vX.Y is supported by Gardener, then all patch versions vX.Y.Z are also automatically supported without any required action. This is because patch versions do not introduce any new feature or API changes, so there is nothing that needs to be adapted in gardener/gardener code.\n The Kubernetes community releases a new minor version roughly every 4 months. Please refer to the official documentation about their release cycles for any additional information.\nShortly before a new release, an “umbrella” issue should be opened which is used to collect the required adaptations and to track the work items. For example, #5102 can be used as a template for the issue description. 
As you can see, the task of supporting a new Kubernetes version also includes the provider extensions maintained in the gardener GitHub organization and is not restricted to gardener/gardener only.\nGenerally, the work items can be split into two groups: The first group contains tasks specific to the changes in the given Kubernetes release, the second group contains Kubernetes release-independent tasks.\n ℹ️ Upgrading the k8s.io/* and sigs.k8s.io/controller-runtime Golang dependencies is typically tracked and worked on separately (see e.g. #4772 or #5282).\n Deriving Release-Specific Tasks Most new minor Kubernetes releases incorporate API changes, deprecations, or new features. The community announces them via their change logs. In order to derive the release-specific tasks, the respective change log for the new version vX.Y has to be read and understood (for example, the changelog for v1.24).\nAs already mentioned, typical changes to watch out for are:\n API version promotions or deprecations Feature gate promotions or deprecations CLI flag changes for Kubernetes components New default values in resources New available fields in resources New features potentially relevant for the Gardener system Changes of labels or annotations Gardener relies on … Obviously, this requires a certain experience and understanding of the Gardener project so that all “relevant changes” can be identified. While reading the change log, add the tasks (along with the respective PR in kubernetes/kubernetes) to the umbrella issue.\n ℹ️ Some of the changes might be specific to certain cloud providers. 
Pay attention to those as well and add related tasks to the issue.\n List Of Release-Independent Tasks The following paragraphs describe recurring tasks that need to be performed for each new release.\nMake Sure a New hyperkube Image Is Released The gardener/hyperkube repository is used to release container images consisting of the kubectl and kubelet binaries.\nThere is a CI/CD job that runs periodically and releases a new hyperkube image when there is a new Kubernetes release. Before proceeding with the next steps, make sure that a new hyperkube image is released for the corresponding new Kubernetes minor version. Make sure that container image is present in GCR.\nAdapting Gardener Allow instantiation of a Kubernetes client for the new minor version and update the README.md: See this example commit. The list of supported versions is meanwhile maintained here in the SupportedVersions variable. Maintain the Kubernetes feature gates used for validation of Shoot resources: The feature gates are maintained in this file. To maintain this list for new Kubernetes versions, run hack/compare-k8s-feature-gates.sh \u003cold-version\u003e \u003cnew-version\u003e (e.g. hack/compare-k8s-feature-gates.sh v1.26 v1.27). It will present 3 lists of feature gates: those added and those removed in \u003cnew-version\u003e compared to \u003cold-version\u003e and feature gates that got locked to default in \u003cnew-version\u003e. Add all added feature gates to the map with \u003cnew-version\u003e as AddedInVersion and no RemovedInVersion. For any removed feature gates, add \u003cnew-version\u003e as RemovedInVersion to the already existing feature gate in the map. For feature gates locked to default, add \u003cnew-version\u003e as LockedToDefaultInVersion to the already existing feature gate in the map. See this example commit. Maintain the Kubernetes kube-apiserver admission plugins used for validation of Shoot resources: The admission plugins are maintained in this file. 
To maintain this list for new Kubernetes versions, run hack/compare-k8s-admission-plugins.sh \u003cold-version\u003e \u003cnew-version\u003e (e.g. hack/compare-k8s-admission-plugins.sh 1.26 1.27). It will present 2 lists of admission plugins: those added and those removed in \u003cnew-version\u003e compared to \u003cold-version\u003e. Add all added admission plugins to the admissionPluginsVersionRanges map with \u003cnew-version\u003e as AddedInVersion and no RemovedInVersion. For any removed admission plugins, add \u003cnew-version\u003e as RemovedInVersion to the already existing admission plugin in the map. Flag any admission plugins that are required (plugins that must not be disabled in the Shoot spec) by setting the Required boolean variable to true for the admission plugin in the map. Flag any admission plugins that are forbidden by setting the Forbidden boolean variable to true for the admission plugin in the map. Maintain the Kubernetes kube-apiserver API groups used for validation of Shoot resources: The API groups are maintained in this file. To maintain this list for new Kubernetes versions, run hack/compare-k8s-api-groups.sh \u003cold-version\u003e \u003cnew-version\u003e (e.g. hack/compare-k8s-api-groups.sh 1.26 1.27). It will present 2 lists of API GroupVersions and 2 lists of API GroupVersionResources: those added and those removed in \u003cnew-version\u003e compared to \u003cold-version\u003e. Add all added group versions to the apiGroupVersionRanges map and group version resources to the apiGVRVersionRanges map with \u003cnew-version\u003e as AddedInVersion and no RemovedInVersion. For any removed APIs, add \u003cnew-version\u003e as RemovedInVersion to the already existing API in the corresponding map. Flag any APIs that are required (APIs that must not be disabled in the Shoot spec) by setting the Required boolean variable to true for the API in the apiGVRVersionRanges map. 
If this API also should not be disabled for Workerless Shoots, then set RequiredForWorkerless boolean variable also to true. If the API is required for both Shoot types, then both of these booleans need to be set to true. If the whole API Group is required, then mark it correspondingly in the apiGroupVersionRanges map. Maintain the Kubernetes kube-controller-manager controllers for each API group used in deploying required KCM controllers based on active APIs: The API groups are maintained in this file. To maintain this list for new Kubernetes versions, run hack/compute-k8s-controllers.sh \u003cold-version\u003e \u003cnew-version\u003e (e.g. hack/compute-k8s-controllers.sh 1.28 1.29). If it complains that the path for the controller is not present in the map, check the release branch of the new Kubernetes version and find the correct path for the missing/wrong controller. You can do so by checking the file cmd/kube-controller-manager/app/controllermanager.go and where the controller is initialized from. As of now, there is no straight-forward way to map each controller to its file. If this has improved, please enhance the script. If the paths are correct, it will present 2 lists of controllers: those added and those removed for each API group in \u003cnew-version\u003e compared to \u003cold-version\u003e. Add all added controllers to the APIGroupControllerMap map and under the corresponding API group with \u003cnew-version\u003e as AddedInVersion and no RemovedInVersion. For any removed controllers, add \u003cnew-version\u003e as RemovedInVersion to the already existing controller in the corresponding API group map. If you are unable to find the removed controller name, then check for its alias. Either in the staging/src/k8s.io/cloud-provider/names/controller_names.go file (example) or in the cmd/kube-controller-manager/app/* files (example for apps API group). 
This is because for Kubernetes versions starting from v1.28, we don’t maintain the aliases in the controller, but the controller names themselves, since some controllers can be initialized without aliases as well (example). The old alias should still be working since it should be backwards compatible as explained here. Once the support for Kubernetes versions \u003c v1.28 is dropped, we can drop the usages of these aliases and move completely to controller names. Make sure that the API groups in this file are in sync with the groups in this file. For example, core/v1 is replaced by the script as v1 and apiserverinternal as internal. This is because the API groups registered by the apiserver (example) and the file path imported by the controllers (example) might be slightly different in some cases. Maintain the ServiceAccount names for the controllers part of kube-controller-manager: The names are maintained in this file. To maintain this list for new Kubernetes versions, run hack/compare-k8s-controllers.sh \u003cold-version\u003e \u003cnew-version\u003e (e.g. hack/compare-k8s-controllers.sh 1.26 1.27). It will present 2 lists of controllers: those added and those removed in \u003cnew-version\u003e compared to \u003cold-version\u003e. Double check whether such ServiceAccount indeed appears in the kube-system namespace when creating a cluster with \u003cnew-version\u003e. Note that it sometimes might be hidden behind a default-off feature gate. You can create a local cluster with the new version using the local provider. It can happen that the name of the controller is used in the form of a constant and not a string, see example. In that case, note the value of the constant separately. You could also cross check the names with the result of the compute-k8s-controllers.sh script used in the previous step. If it appears, add all added controllers to the list based on the Kubernetes version (example). 
For any removed controllers, add them only to the Kubernetes version if it is low enough. Maintain the names of controllers used for workerless Shoots, here after carefully evaluating whether they are needed if there are no workers. Maintain copies of the DaemonSet controller’s scheduling logic: gardener-resource-manager’s Node controller uses a copy of parts of the DaemonSet controller’s logic for determining whether a specific Node should run a daemon pod of a given DaemonSet: see this file. Check the referenced upstream files for changes to the DaemonSet controller’s logic and adapt our copies accordingly. This might include introducing version-specific checks in our codebase to handle different shoot cluster versions. Maintain version specific defaulting logic in shoot admission plugin: Sometimes default values for shoots are intentionally changed with the introduction of a new Kubernetes version. The final Kubernetes version for a shoot is determined in the Shoot Validator Admission Plugin. Any defaulting logic that depends on the version should be placed in this admission plugin (example). Ensure that maintenance-controller is able to auto-update shoots to the new Kubernetes version. Changes to the shoot spec required for the Kubernetes update should be enforced in such cases (examples). Bump the used Kubernetes version for local e2e test. See this example commit. Filing the Pull Request Work on all the tasks you have collected and validate them using the local provider. Execute the e2e tests and if everything looks good, then go ahead and file the PR (example PR). 
Generally, it is great if you add the PRs also to the umbrella issue so that they can be tracked more easily.\nAdapting Provider Extensions After the PR in gardener/gardener for the support of the new version has been merged, you can go ahead and work on the provider extensions.\n Actually, you can already start even if the PR is not yet merged and use the branch of your fork.\n Update the github.com/gardener/gardener dependency in the extension and update the README.md. Work on release-specific tasks related to this provider. Maintaining the cloud-controller-manager Images Some of the cloud providers are not yet using upstream cloud-controller-manager images. Instead, we build and maintain them ourselves:\n cloud-provider-gcp Until we switch to upstream images, you need to update the Kubernetes dependencies and release a new image. The required steps are as follows:\n Checkout the legacy-cloud-provider branch of the respective repository Bump the versions in the Dockerfile (example commit). Update the VERSION to vX.Y.Z-dev where Z is the latest available Kubernetes patch version for the vX.Y minor version. Update the k8s.io/* dependencies in the go.mod file to vX.Y.Z and run go mod tidy (example commit). Checkout a new release-vX.Y branch and release it (example) As you are already on it, it is great if you also bump the k8s.io/* dependencies for the last three minor releases as well. In this case, you need to checkout the release-vX.{Y-{1,2,3}} branches and only perform the last three steps (example branch, example commit).\n Now you need to update the new releases in the imagevector/images.yaml of the respective provider extension so that they are used (see this example commit for reference).\nFiling the Pull Request Again, work on all the tasks you have collected. This time, you cannot use the local provider for validation but should create real clusters on the various infrastructures. 
Typically, the following validations should be performed:\n Create new clusters with versions \u003c vX.Y Create new clusters with version = vX.Y Upgrade old clusters from version vX.{Y-1} to version vX.Y Delete clusters with versions \u003c vX.Y Delete clusters with version = vX.Y If everything looks good, then go ahead and file the PR (example PR). Generally, it is again great if you add the PRs also to the umbrella issue so that they can be tracked more easily.\n","categories":"","description":"","excerpt":"Adding Support For a New Kubernetes Version This document describes …","ref":"/docs/gardener/new-kubernetes-version/","tags":"","title":"New Kubernetes Version"},{"body":"Gardener Extension to configure rsyslog with relp module \nGardener extension controller which configures the rsyslog and auditd services installed on shoot nodes.\nUsage Configuring the Rsyslog Relp Extension - learn what is the use-case for rsyslog-relp, how to enable it and configure it Local Setup and Development Deploying the Rsyslog Relp Extension Locally - learn how to set up a local development environment Developer Docs for Gardener Shoot Rsyslog Relp Extension - learn about the inner workings ","categories":"","description":"Gardener extension controller which configures the rsyslog and auditd services installed on shoot nodes.","excerpt":"Gardener extension controller which configures the rsyslog and auditd …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/","tags":"","title":"Node Audit Logging"},{"body":"NodeLocalDNS Configuration This is a short guide describing how to enable DNS caching on the shoot cluster nodes.\nBackground Currently in Gardener we are using CoreDNS as a deployment that is auto-scaled horizontally to cover for QPS-intensive applications. However, doing so does not seem to be enough to completely circumvent DNS bottlenecks such as:\n Cloud provider limits for DNS lookups. 
Unreliable UDP connections that force a timeout period when packets are dropped. Unnecessary node hopping since CoreDNS is not deployed on all nodes, and as a result DNS queries end up traversing multiple nodes before reaching the destination server. Inefficient load-balancing of services (e.g., round-robin might not be enough when using IPTables mode) and more … To work around the issues described above, node-local-dns was introduced. The architecture is described below. The idea is simple:\n For new queries, the connection is upgraded from UDP to TCP and forwarded towards the cluster IP for the original CoreDNS server. For previously resolved queries, an immediate response from the same node where the requester workload / pod resides is provided. Configuring NodeLocalDNS All that needs to be done to enable the usage of the node-local-dns feature is to set the corresponding option (spec.systemComponents.nodeLocalDNS.enabled) in the Shoot resource to true:\n... spec: ... systemComponents: nodeLocalDNS: enabled: true ... It is worth noting that:\n When migrating from IPVS to IPTables, existing pods will continue to leverage the node-local-dns cache. When migrating from IPTables to IPVS, only newer pods will be switched to the node-local-dns cache. During the reconfiguration of the node-local-dns there might be a short disruption in terms of domain name resolution depending on the setup. Usually, DNS requests are repeated for some time as UDP is an unreliable protocol, but that strictly depends on the application/way the domain name resolution happens. It is recommended to let the shoot be reconciled during the next maintenance period. Enabling or disabling node-local-dns triggers a rollout of all shoot worker nodes, see also this document. For more information about node-local-dns, please refer to the KEP or to the usage documentation.\nKnown Issues Custom DNS configuration may not work as expected in conjunction with NodeLocalDNS. 
Please refer to Custom DNS Configuration.\n","categories":"","description":"","excerpt":"NodeLocalDNS Configuration This is a short guide describing how to …","ref":"/docs/gardener/node-local-dns/","tags":"","title":"NodeLocalDNS Configuration"},{"body":"Gardener Extension for openid connect services \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the shoot-oidc-service extension.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nCompatibility The following lists compatibility requirements of this extension controller with regards to other Gardener components.\n OIDC Extension Gardener Notes == v0.15.0 \u003e= 1.60.0 \u003c= v1.64.0 A typical side-effect when running Gardener \u003c v1.63.0 is an unexpected scale-down of the OIDC webhook from 2 -\u003e 1. == v0.16.0 \u003e= 1.65.0 Extension Resources Example extension resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Extension metadata: name: extension-shoot-oidc-service namespace: shoot--project--abc spec: type: shoot-oidc-service When an extension resource is reconciled, the extension controller will create an instance of OIDC Webhook Authenticator. These resources are placed inside the shoot namespace on the seed. 
Also, the controller takes care of generating the necessary RBAC resources for the seed as well as for the shoot.\nPlease note, this extension controller relies on the Gardener-Resource-Manager to deploy k8s resources to seed and shoot clusters.\nHow to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nWe are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for OpenID Connect services for shoot clusters","excerpt":"Gardener extension controller for OpenID Connect services for shoot …","ref":"/docs/extensions/others/gardener-extension-shoot-oidc-service/","tags":"","title":"OpenID Connect services"},{"body":"ClusterOpenIDConnectPreset and OpenIDConnectPreset This page provides an overview of ClusterOpenIDConnectPresets and OpenIDConnectPresets, which are objects for injecting OpenIDConnect Configuration into Shoot at creation time. The injected information contains configuration for the Kube API Server and optionally configuration for kubeconfig generation using said configuration.\nOpenIDConnectPreset An OpenIDConnectPreset is an API resource for injecting additional runtime OIDC requirements into a Shoot at creation time. 
You use label selectors to specify the Shoot to which a given OpenIDConnectPreset applies.\nUsing OpenIDConnectPresets spares project owners from explicitly providing the same OIDC configuration for every Shoot in their Project.\nFor more information about the background, see the issue for OpenIDConnectPreset.\nHow OpenIDConnectPreset Works Gardener provides an admission controller (OpenIDConnectPreset) which, when enabled, applies OpenIDConnectPresets to incoming Shoot creation requests. When a Shoot creation request occurs, the system does the following:\n Retrieve all OpenIDConnectPresets available for use in the Shoot namespace.\n Check if the shoot label selectors of any OpenIDConnectPreset match the labels on the Shoot being created.\n If multiple presets match, then only one is chosen and results are sorted based on:\n .spec.weight value. lexicographically ordering their names (e.g., 002preset \u003e 001preset) If the Shoot already has a .spec.kubernetes.kubeAPIServer.oidcConfig, then no mutation occurs.\n Simple OpenIDConnectPreset Example This is a simple example to show how a Shoot is modified by the OpenIDConnectPreset:\napiVersion: settings.gardener.cloud/v1alpha1 kind: OpenIDConnectPreset metadata: name: test-1 namespace: default spec: shootSelector: matchLabels: oidc: enabled server: clientID: test-1 issuerURL: https://foo.bar # caBundle: | # -----BEGIN CERTIFICATE----- # Li4u # -----END CERTIFICATE----- groupsClaim: groups-claim groupsPrefix: groups-prefix usernameClaim: username-claim usernamePrefix: username-prefix signingAlgs: - RS256 requiredClaims: key: value weight: 90 Create the OpenIDConnectPreset:\nkubectl apply -f preset.yaml Examine the created OpenIDConnectPreset:\nkubectl get openidconnectpresets NAME ISSUER SHOOT-SELECTOR AGE test-1 https://foo.bar oidc=enabled 1s Simple Shoot example:\nThis is a sample of a Shoot with some fields omitted:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: preset 
namespace: default labels: oidc: enabled spec: kubernetes: version: 1.20.2 Create the Shoot:\nkubectl apply -f shoot.yaml Examine the created Shoot:\nkubectl get shoot preset -o yaml apiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: preset namespace: default labels: oidc: enabled spec: kubernetes: kubeAPIServer: oidcConfig: clientID: test-1 groupsClaim: groups-claim groupsPrefix: groups-prefix issuerURL: https://foo.bar requiredClaims: key: value signingAlgs: - RS256 usernameClaim: username-claim usernamePrefix: username-prefix version: 1.20.2 Disable OpenIDConnectPreset The OpenIDConnectPreset admission control is enabled by default. To disable it, use the --disable-admission-plugins flag on the gardener-apiserver.\nFor example:\n--disable-admission-plugins=OpenIDConnectPreset ClusterOpenIDConnectPreset A ClusterOpenIDConnectPreset is an API resource for injecting additional runtime OIDC requirements into a Shoot at creation time. In contrast to OpenIDConnectPreset, it’s a cluster-scoped resource. You use label selectors to specify the Project and Shoot to which a given ClusterOpenIDConnectPreset applies.\nUsing ClusterOpenIDConnectPresets spares cluster owners from explicitly providing the same OIDC configuration for every Shoot in a specific Project.\nFor more information about the background, see the issue for ClusterOpenIDConnectPreset.\nHow ClusterOpenIDConnectPreset Works Gardener provides an admission controller (ClusterOpenIDConnectPreset) which, when enabled, applies ClusterOpenIDConnectPresets to incoming Shoot creation requests. 
When a Shoot creation request occurs, the system does the following:\n Retrieve all ClusterOpenIDConnectPresets available.\n Check if the project label selector of any ClusterOpenIDConnectPreset matches the labels of the Project in which the Shoot is being created.\n Check if the shoot label selectors of any ClusterOpenIDConnectPreset match the labels on the Shoot being created.\n If multiple presets match, then only one is chosen and results are sorted based on:\n .spec.weight value. lexicographically ordering their names (e.g., 002preset \u003e 001preset) If the Shoot already has a .spec.kubernetes.kubeAPIServer.oidcConfig, then no mutation occurs.\n Note: Due to the previous requirement, if a Shoot is matched by both OpenIDConnectPreset and ClusterOpenIDConnectPreset, then OpenIDConnectPreset takes precedence over ClusterOpenIDConnectPreset.\n Simple ClusterOpenIDConnectPreset Example This is a simple example to show how a Shoot is modified by the ClusterOpenIDConnectPreset:\napiVersion: settings.gardener.cloud/v1alpha1 kind: ClusterOpenIDConnectPreset metadata: name: test spec: shootSelector: matchLabels: oidc: enabled projectSelector: {} # selects all projects. 
server: clientID: cluster-preset issuerURL: https://foo.bar # caBundle: | # -----BEGIN CERTIFICATE----- # Li4u # -----END CERTIFICATE----- groupsClaim: groups-claim groupsPrefix: groups-prefix usernameClaim: username-claim usernamePrefix: username-prefix signingAlgs: - RS256 requiredClaims: key: value weight: 90 Create the ClusterOpenIDConnectPreset:\nkubectl apply -f preset.yaml Examine the created ClusterOpenIDConnectPreset:\nkubectl get clusteropenidconnectpresets NAME ISSUER PROJECT-SELECTOR SHOOT-SELECTOR AGE test https://foo.bar \u003cnone\u003e oidc=enabled 1s This is a sample of a Shoot, with some fields omitted:\nkind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: preset namespace: default labels: oidc: enabled spec: kubernetes: version: 1.20.2 Create the Shoot:\nkubectl apply -f shoot.yaml Examine the created Shoot:\nkubectl get shoot preset -o yaml apiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: preset namespace: default labels: oidc: enabled spec: kubernetes: kubeAPIServer: oidcConfig: clientID: cluster-preset groupsClaim: groups-claim groupsPrefix: groups-prefix issuerURL: https://foo.bar requiredClaims: key: value signingAlgs: - RS256 usernameClaim: username-claim usernamePrefix: username-prefix version: 1.20.2 Disable ClusterOpenIDConnectPreset The ClusterOpenIDConnectPreset admission control is enabled by default. To disable it, use the --disable-admission-plugins flag on the gardener-apiserver.\nFor example:\n--disable-admission-plugins=ClusterOpenIDConnectPreset ","categories":"","description":"","excerpt":"ClusterOpenIDConnectPreset and OpenIDConnectPreset This page provides …","ref":"/docs/gardener/openidconnect-presets/","tags":"","title":"OpenIDConnect Presets"},{"body":"Register OpenID Connect provider in Shoot Clusters Introduction Within a shoot cluster, it is possible to dynamically register OpenID Connect providers. 
The Gardener installation your shoot cluster runs in must be equipped with the shoot-oidc-service extension. Please ask your Gardener operator if the extension is available in your environment.\nShoot Feature Gate In most Gardener setups, the shoot-oidc-service extension is not enabled globally and thus must be configured per shoot cluster. Please adapt the shoot specification with the configuration shown below to activate the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-oidc-service ... OpenID Connect provider In order to register an OpenID Connect provider, an openidconnect resource should be deployed in the shoot cluster.\nIt is strongly recommended to NOT disable prefixing since it may result in unwanted impersonations. The rule of thumb is to always use meaningful and unique prefixes for both username and groups. A good way to ensure this is to use the name of the openidconnect resource as shown in the example below.\napiVersion: authentication.gardener.cloud/v1alpha1 kind: OpenIDConnect metadata: name: abc spec: # issuerURL is the URL the provider signs ID Tokens as. # This will be the \"iss\" field of all tokens produced by the provider and is used for configuration discovery. issuerURL: https://abc-oidc-provider.example # clientID is the audience for which the JWT must be issued, the \"aud\" field. clientID: my-shoot-cluster # usernameClaim is the JWT field to use as the user's username. usernameClaim: sub # usernamePrefix, if specified, causes claims mapping to username to be prefixed with the provided value. # A value \"oidc:\" would result in usernames like \"oidc:john\". # If not provided, the prefix defaults to \"( .metadata.name )/\". The value \"-\" can be used to disable all prefixing. usernamePrefix: \"abc:\" # groupsClaim, if specified, causes the OIDCAuthenticator to try to populate the user's groups with an ID Token field. 
# If the groupsClaim field is present in an ID Token the value must be a string or list of strings. # groupsClaim: groups # groupsPrefix, if specified, causes claims mapping to group names to be prefixed with the value. # A value \"oidc:\" would result in groups like \"oidc:engineering\" and \"oidc:marketing\". # If not provided, the prefix defaults to \"( .metadata.name )/\". # The value \"-\" can be used to disable all prefixing. # groupsPrefix: \"abc:\" # caBundle is a PEM encoded CA bundle which will be used to validate the OpenID server's certificate. If unspecified, system's trusted certificates are used. # caBundle: \u003cbase64 encoded bundle\u003e # supportedSigningAlgs sets the accepted set of JOSE signing algorithms that can be used by the provider to sign tokens. # The default value is RS256. # supportedSigningAlgs: # - RS256 # requiredClaims, if specified, causes the OIDCAuthenticator to verify that all the # required claims key value pairs are present in the ID Token. # requiredClaims: # customclaim: requiredvalue # maxTokenExpirationSeconds if specified, sets a limit in seconds to the maximum validity duration of a token. # Tokens issued with validity greater than this value will not be verified. # Setting this will require that the tokens have the \"iat\" and \"exp\" claims. # maxTokenExpirationSeconds: 3600 # jwks if specified, provides an option to specify JWKS keys offline. # jwks: # keys is a base64 encoded JSON webkey Set. If specified, the OIDCAuthenticator skips the request to the issuer's jwks_uri endpoint to retrieve the keys. 
# keys: \u003cbase64 encoded jwks\u003e ","categories":"","description":"","excerpt":"Register OpenID Connect provider in Shoot Clusters Introduction Within …","ref":"/docs/extensions/others/gardener-extension-shoot-oidc-service/openidconnects/","tags":"","title":"Openidconnects"},{"body":"Contract: OperatingSystemConfig Resource Gardener uses the machine API and leverages the functionalities of the machine-controller-manager (MCM) in order to manage the worker nodes of a shoot cluster. The machine-controller-manager itself simply takes a reference to an OS-image and (optionally) some user-data (a script or configuration that is executed when a VM is bootstrapped), and forwards both to the provider’s API when creating VMs. MCM does not have any restrictions regarding supported operating systems as it does not modify or influence the machine’s configuration in any way - it just creates/deletes machines with the provided metadata.\nConsequently, Gardener needs to provide this information when interacting with the machine-controller-manager. This means that basically every operating system is possible to be used, as long as there is some implementation that generates the OS-specific configuration in order to provision/bootstrap the machines.\n⚠️ Currently, there are a few requirements of pre-installed components that must be present in all OS images:\n containerd ctr (client CLI) containerd must listen on its default socket path: unix:///run/containerd/containerd.sock containerd must be configured to work with the default configuration file in: /etc/containerd/config.toml (eventually created by Gardener). systemd The reasons for that will become evident later.\nWhat does the user-data bootstrapping the machines contain? Gardener installs a few components onto every worker machine in order to allow it to join the shoot cluster. 
There is the kubelet process, some scripts for continuously checking the health of kubelet and containerd, but also configuration for log rotation, CA certificates, etc. You can find the complete configuration in the components folder. We are calling this the “original” user-data.\nHow does Gardener bootstrap the machines? gardenlet makes use of gardener-node-agent to perform the bootstrapping and reconciliation of systemd units and files on the machine. Please refer to this document for a first overview.\nUsually, you would submit all the components you want to install onto the machine as part of the user-data during creation time. However, some providers do have a size limitation (around 16 KB) for that user-data. That’s why we do not send the “original” user-data to the machine-controller-manager (which then forwards it to the provider’s API). Instead, we only send a small “init” script that bootstraps the gardener-node-agent. It fetches the “original” content from a Secret and applies it on the machine directly. This way we can extend the “original” user-data without any size restrictions (except for the 1 MB limit for Secrets).\nThe high-level flow is as follows:\n For every worker pool X in the Shoot specification, Gardener creates a Secret named cloud-config-\u003cX\u003e in the kube-system namespace of the shoot cluster. The secret contains the “original” OperatingSystemConfig (i.e., systemd units and files for kubelet, etc.). Gardener generates a kubeconfig with minimal permissions just allowing reading these secrets. It is used by the gardener-node-agent later. Gardener provides the gardener-node-init.sh bash script and the machine image stated in the Shoot specification to the machine-controller-manager. Based on this information, the machine-controller-manager creates the VM. After the VM has been provisioned, the gardener-node-init.sh script starts, fetches the gardener-node-agent binary, and starts it. 
The gardener-node-agent will read the gardener-node-agent-\u003cX\u003e Secret for its worker pool (containing the “original” OperatingSystemConfig), and reconciles it. The gardener-node-agent can update itself in case of newer Gardener versions, and it performs a continuous reconciliation of the systemd units and files in the provided OperatingSystemConfig (just like any other Kubernetes controller).\nWhat needs to be implemented to support a new operating system? As part of the Shoot reconciliation flow, gardenlet will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: \u003cmy-operating-system\u003e purpose: reconcile units: - name: containerd.service dropIns: - name: 10-containerd-opts.conf content: |[Service] Environment=\"SOME_OPTS=--foo=bar\" - name: containerd-monitor.service command: start enable: true content: |[Unit] Description=Containerd-monitor daemon After=kubelet.service [Install] WantedBy=multi-user.target [Service] Restart=always EnvironmentFile=/etc/environment ExecStart=/opt/bin/health-monitor containerd files: - path: /var/lib/kubelet/ca.crt permissions: 0644 encoding: b64 content: secretRef: name: default-token-5dtjz dataKey: token - path: /etc/sysctl.d/99-k8s-general.conf permissions: 0644 content: inline: data: |# A higher vm.max_map_count is great for elasticsearch, mongo, or other mmap users # See https://github.com/kubernetes/kops/issues/1340 vm.max_map_count = 135217728 In order to support a new operating system, you need to write a controller that watches all OperatingSystemConfigs with .spec.type=\u003cmy-operating-system\u003e. 
For those it shall generate a configuration blob that fits to your operating system.\nOperatingSystemConfigs can have two purposes: either provision or reconcile.\nprovision Purpose The provision purpose is used by gardenlet for the user-data that it later passes to the machine-controller-manager (and then to the provider’s API) when creating new VMs. It contains the gardener-node-init.sh script and systemd unit.\nThe OS controller has to translate the .spec.units and .spec.files into configuration that fits to the operating system. For example, a Flatcar controller might generate a CoreOS cloud-config or Ignition, SLES might generate cloud-init, and others might simply generate a bash script translating the .spec.units into systemd units, and .spec.files into real files on the disk.\n ⚠️ Please avoid mixing in additional systemd units or files - this step should just translate what gardenlet put into .spec.units and .spec.files.\n After generation, extension controllers are asked to store their OS config inside a Secret (as it might contain confidential data) in the same namespace. The secret’s .data could look like this:\napiVersion: v1 kind: Secret metadata: name: osc-result-pool-01-original namespace: default ownerReferences: - apiVersion: extensions.gardener.cloud/v1alpha1 blockOwnerDeletion: true controller: true kind: OperatingSystemConfig name: pool-01-original uid: 99c0c5ca-19b9-11e9-9ebd-d67077b40f82 data: cloud_config: base64(generated-user-data) Finally, the secret’s metadata must be provided in the OperatingSystemConfig’s .status field:\n... 
status: cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default lastOperation: description: Successfully generated cloud config lastUpdateTime: \"2019-01-23T07:45:23Z\" progress: 100 state: Succeeded type: Reconcile observedGeneration: 5 reconcile Purpose The reconcile purpose contains the “original” OperatingSystemConfig (which is later stored in Secrets in the shoot’s kube-system namespace (see step 1)).\nThe OS controller does not need to translate anything here, but it has the option to provide additional systemd units or files via the .status field:\nstatus: extensionUnits: - name: my-custom-service.service command: start enable: true content: |[Unit] // some systemd unit content extensionFiles: - path: /etc/some/file permissions: 0644 content: inline: data: some-file-content lastOperation: description: Successfully generated cloud config lastUpdateTime: \"2019-01-23T07:45:23Z\" progress: 100 state: Succeeded type: Reconcile observedGeneration: 5 The gardener-node-agent will merge .spec.units and .status.extensionUnits as well as .spec.files and .status.extensionFiles when applying.\nYou can find an example implementation here.\nBootstrap Tokens gardenlet adds a file with the content \u003c\u003cBOOTSTRAP_TOKEN\u003e\u003e to the OperatingSystemConfig with purpose provision and sets transmitUnencoded=true. This instructs the responsible OS extension to pass this file (with its content in clear-text) to the corresponding Worker resource.\nmachine-controller-manager makes sure that\n a bootstrap token gets created per machine the \u003c\u003cBOOTSTRAP_TOKEN\u003e\u003e string in the user data of the machine gets replaced by the generated token. After the machine has been bootstrapped, the token secret in the shoot cluster gets deleted again.\nThe token is used to bootstrap Gardener Node Agent and kubelet.\nWhat needs to be implemented to support a new operating system? 
As part of the shoot flow Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: \u003cmy-operating-system\u003e purpose: reconcile units: - name: docker.service dropIns: - name: 10-docker-opts.conf content: |[Service] Environment=\"DOCKER_OPTS=--log-opt max-size=60m --log-opt max-file=3\" - name: docker-monitor.service command: start enable: true content: |[Unit] Description=Containerd-monitor daemon After=kubelet.service [Install] WantedBy=multi-user.target [Service] Restart=always EnvironmentFile=/etc/environment ExecStart=/opt/bin/health-monitor docker files: - path: /var/lib/kubelet/ca.crt permissions: 0644 encoding: b64 content: secretRef: name: default-token-5dtjz dataKey: token - path: /etc/sysctl.d/99-k8s-general.conf permissions: 0644 content: inline: data: |# A higher vm.max_map_count is great for elasticsearch, mongo, or other mmap users # See https://github.com/kubernetes/kops/issues/1340 vm.max_map_count = 135217728 In order to support a new operating system, you need to write a controller that watches all OperatingSystemConfigs with .spec.type=\u003cmy-operating-system\u003e. For those it shall generate a configuration blob that fits to your operating system. For example, a CoreOS controller might generate a CoreOS cloud-config or Ignition, SLES might generate cloud-init, and others might simply generate a bash script translating the .spec.units into systemd units, and .spec.files into real files on the disk.\nOperatingSystemConfigs can have two purposes which can be used (or ignored) by the extension controllers: either provision or reconcile.\n The provision purpose is used by Gardener for the user-data that it later passes to the machine-controller-manager (and then to the provider’s API) when creating new VMs. 
It contains the gardener-node-init unit. The reconcile purpose contains the “original” user-data, which is then stored in Secrets in the shoot’s kube-system namespace (see step 1). It is downloaded and applied later (see step 5). As described above, the “original” user-data must be re-applicable to allow in-place updates. How this is done is specific to the generated operating system config (e.g., for CoreOS cloud-init the command is /usr/bin/coreos-cloudinit --from-file=\u003cpath\u003e, whereas SLES would run cloud-init --file \u003cpath\u003e single -n write_files --frequency=once). Consequently, besides the generated OS config, the extension controller must also provide a command for re-applying an updated version of the user-data. As the examples above show, the command requires a path to the user-data file. As soon as Gardener detects that the user data has changed, it will reload the systemd daemon and restart all the units provided in the .status.units[] list (see the example below). The same logic applies during the very first application of the whole configuration.\nAfter generation, extension controllers are asked to store their OS config inside a Secret (as it might contain confidential data) in the same namespace. The secret’s .data could look like this:\napiVersion: v1 kind: Secret metadata: name: osc-result-pool-01-original namespace: default ownerReferences: - apiVersion: extensions.gardener.cloud/v1alpha1 blockOwnerDeletion: true controller: true kind: OperatingSystemConfig name: pool-01-original uid: 99c0c5ca-19b9-11e9-9ebd-d67077b40f82 data: cloud_config: base64(generated-user-data) Finally, the secret’s metadata, the OS-specific command to re-apply the configuration, and the list of systemd units that shall be considered for restart when an updated version of the user-data is re-applied must be provided in the OperatingSystemConfig’s .status field:\n... 
status: cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default lastOperation: description: Successfully generated cloud config lastUpdateTime: \"2019-01-23T07:45:23Z\" progress: 100 state: Succeeded type: Reconcile observedGeneration: 5 units: - docker-monitor.service Once the .status indicates that the extension controller finished reconciling Gardener will continue with the next step of the shoot reconciliation flow.\nCRI Support Gardener supports specifying a Container Runtime Interface (CRI) configuration in the OperatingSystemConfig resource. If the .spec.cri section exists, then the name property is mandatory. The only supported value for cri.name at the moment is: containerd. For example:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: \u003cmy-operating-system\u003e purpose: reconcile cri: name: containerd # cgroupDriver: cgroupfs # or systemd containerd: sandboxImage: registry.k8s.io/pause # registries: # - upstream: docker.io # server: https://registry-1.docker.io # hosts: # - url: http://\u003cservice-ip\u003e:\u003cport\u003e] # plugins: # - op: add # add (default) or remove # path: [io.containerd.grpc.v1.cri, containerd] # values: '{\"default_runtime_name\": \"runc\"}' ... To support containerd, an OS extension must satisfy the following criteria:\n The operating system must have built-in containerd and ctr (client CLI). containerd must listen on its default socket path: unix:///run/containerd/containerd.sock containerd must be configured to work with the default configuration file in: /etc/containerd/config.toml (Created by Gardener). For a convenient handling, gardener-node-agent can manage various aspects of containerd’s config, e.g. the registry configuration, if given in the OperatingSystemConfig. Any Gardener extension which needs to modify the config, should check the functionality exposed through this API first. 
If applicable, adjustments can be implemented through mutating webhooks, acting on the created or updated OperatingSystemConfig resource.\nIf CRI configurations are not supported, it is recommended to create a validating webhook running in the garden cluster that prevents specifying the .spec.provider.workers[].cri section in the Shoot objects.\nReferences and Additional Resources OperatingSystemConfig API (Golang Specification) Gardener Node Agent ","categories":"","description":"","excerpt":"Contract: OperatingSystemConfig Resource Gardener uses the machine API …","ref":"/docs/gardener/extensions/operatingsystemconfig/","tags":"","title":"Operatingsystemconfig"},{"body":"Using the Alicloud provider extension with Gardener as operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration. The core.gardener.cloud/v1beta1.Seed resource is structured similarly. Additionally, it allows configuring settings for the backups of the main etcds’ data of shoot clusters’ control planes running in this seed cluster.\nThis document explains the necessary configuration for this provider extension. It also describes how to enable the use of customized machine images for Alicloud.\nCloudProfile resource This section describes what the configuration for CloudProfile looks like for Alicloud by providing an example CloudProfile manifest with minimal configuration that can be used to allow the creation of Alicloud shoot clusters.\nCloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the Alicloud environment (AMIs). 
You have to map every version that you specify in .spec.machineImages[].versions here such that the Alicloud extension knows the AMI for every version you want to offer.\nAn example CloudProfileConfig for the Alicloud extension looks as follows:\napiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2023.4.0 regions: - name: eu-central-1 id: coreos_2023_4_0_64_30G_alibase_20190319.vhd Example CloudProfile manifest Please find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: alicloud spec: type: alicloud kubernetes: versions: - version: 1.27.3 - version: 1.26.8 expirationDate: \"2022-10-31T23:59:59Z\" machineImages: - name: coreos versions: - version: 2023.4.0 machineTypes: - name: ecs.sn2ne.large cpu: \"2\" gpu: \"0\" memory: 8Gi volumeTypes: - name: cloud_efficiency class: standard - name: cloud_essd class: premium regions: - name: eu-central-1 zones: - name: eu-central-1a - name: eu-central-1b providerConfig: apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2023.4.0 regions: - name: eu-central-1 id: coreos_2023_4_0_64_30G_alibase_20190319.vhd Enable customized machine images for the Alicloud extension Customized machine images can be created for an Alicloud account and shared with other Alicloud accounts. The same customized machine image has different image ID in different regions on Alicloud. If you need to enable encrypted system disk, you must provide customized machine images. Administrators/Operators need to explicitly declare them per imageID per region as below:\nmachineImages: - name: customized_coreos regions: - imageID: \u003cimage_id_in_eu_central_1\u003e region: eu-central-1 - imageID: \u003cimage_id_in_cn_shanghai\u003e region: cn-shanghai ... version: 2191.4.1 ... 
End-users have to have the permission to use the customized image from its creator Alicloud account. To enable end-users to use customized images, the images are shared from Alicloud account of Seed operator with end-users’ Alicloud accounts. Administrators/Operators need to explicitly provide Seed operator’s Alicloud account access credentials (base64 encoded) as below:\nmachineImageOwnerSecret: name: machine-image-owner accessKeyID: \u003cbase64_encoded_access_key_id\u003e accessKeySecret: \u003cbase64_encoded_access_key_secret\u003e As a result, a Secret named machine-image-owner by default will be created in namespace of Alicloud provider extension.\nOperators should also maintain custom image IDs which are to be shared with end-users as below:\ntoBeSharedImageIDs: - \u003cimage_id_1\u003e - \u003cimage_id_2\u003e - \u003cimage_id_3\u003e Example ControllerDeployment manifest for enabling customized machine images apiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment metadata: name: extension-provider-alicloud spec: type: helm providerConfig: chart: | H4sIFAAAAAAA/yk... values: config: machineImageOwnerSecret: accessKeyID: \u003cbase64_encoded_access_key_id\u003e accessKeySecret: \u003cbase64_encoded_access_key_secret\u003e toBeSharedImageIDs: - \u003cimage_id_1\u003e - \u003cimage_id_2\u003e ... machineImages: - name: customized_coreos regions: - imageID: \u003cimage_id_in_eu_central_1\u003e region: eu-central-1 - imageID: \u003cimage_id_in_cn_shanghai\u003e region: cn-shanghai ... version: 2191.4.1 ... csi: enableADController: true resources: limits: cpu: 500m memory: 1Gi requests: memory: 128Mi Seed resource This provider extension does not support any provider configuration for the Seed’s .spec.provider.providerConfig field. 
However, it supports managing backup infrastructure, i.e., you can specify a configuration for the .spec.backup field.\nBackup configuration A Seed of type alicloud can be configured to perform backups for the main etcds of the shoot clusters’ control planes using Alicloud Object Storage Service.\nThe location/region where the backups will be stored defaults to the region of the Seed (spec.provider.region).\nPlease find below an example Seed manifest (partly) that configures backups using Alicloud Object Storage Service.\n--- apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: my-seed spec: provider: type: alicloud region: cn-shanghai backup: provider: alicloud secretRef: name: backup-credentials namespace: garden ... An example of the referenced secret containing the credentials for the Alicloud Object Storage Service can be found in the example folder.\nPermissions for Alicloud Object Storage Service Please make sure the RAM user associated with the provided AccessKey pair has the following permission.\n AliyunOSSFullAccess ","categories":"","description":"","excerpt":"Using the Alicloud provider extension with Gardener as operator The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/operations/","tags":"","title":"Operations"},{"body":"Using the AWS provider extension with Gardener as operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration. The core.gardener.cloud/v1beta1.Seed resource is structured similarly. 
Additionally, it allows configuring settings for the backups of the main etcds’ data of shoot clusters’ control planes running in this seed cluster.\nThis document explains the necessary configuration for this provider extension.\nCloudProfile resource This section describes what the configuration for CloudProfiles looks like for AWS and provides an example CloudProfile manifest with minimal configuration that you can use to allow creating AWS shoot clusters.\nCloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the AWS environment (AMIs). You have to map every version that you specify in .spec.machineImages[].versions here such that the AWS extension knows the AMI for every version you want to offer. For each AMI, an architecture field can specify the CPU architecture of the machines on which the given machine image can be used.\nAn example CloudProfileConfig for the AWS extension looks as follows:\napiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2135.6.0 regions: - name: eu-central-1 ami: ami-034fd8c3f4026eb39 # architecture: amd64 # optional Example CloudProfile manifest Please find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: aws spec: type: aws kubernetes: versions: - version: 1.27.3 - version: 1.26.8 expirationDate: \"2022-10-31T23:59:59Z\" machineImages: - name: coreos versions: - version: 2135.6.0 machineTypes: - name: m5.large cpu: \"2\" gpu: \"0\" memory: 8Gi usable: true volumeTypes: - name: gp2 class: standard usable: true - name: io1 class: premium usable: true regions: - name: eu-central-1 zones: - name: eu-central-1a - name: eu-central-1b - name: eu-central-1c providerConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 
2135.6.0 regions: - name: eu-central-1 ami: ami-034fd8c3f4026eb39 # architecture: amd64 # optional Seed resource This provider extension does not support any provider configuration for the Seed’s .spec.provider.providerConfig field. However, it supports managing backup infrastructure, i.e., you can specify a configuration for the .spec.backup field.\nBackup configuration Please find below an example Seed manifest (partly) that configures backups. As you can see, the location/region where the backups will be stored can be different from the region where the seed cluster is running.\napiVersion: v1 kind: Secret metadata: name: backup-credentials namespace: garden type: Opaque data: accessKeyID: base64(access-key-id) secretAccessKey: base64(secret-access-key) --- apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: my-seed spec: provider: type: aws region: eu-west-1 backup: provider: aws region: eu-central-1 secretRef: name: backup-credentials namespace: garden ... Please look up https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys as well.\nPermissions for AWS IAM user Please make sure that the provided credentials have the correct privileges. You can use the following AWS IAM policy document and attach it to the IAM user backed by the credentials you provided (please check the official AWS documentation as well):\n Click to expand the AWS IAM policy document! 
{ \"Version\": \"2012-10-17\", \"Statement\": [ { \"Effect\": \"Allow\", \"Action\": \"s3:*\", \"Resource\": \"*\" } ] } ","categories":"","description":"","excerpt":"Using the AWS provider extension with Gardener as operator The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/operations/","tags":"","title":"Operations"},{"body":"Using the Azure provider extension with Gardener as an operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration. The core.gardener.cloud/v1beta1.Seed resource is structured similarly. Additionally, it allows configuring settings for the backups of the main etcds’ data of shoot clusters’ control planes running in this seed cluster.\nThis document explains the necessary configuration for the Azure provider extension.\nCloudProfile resource This section describes what the configuration for CloudProfiles looks like for Azure by providing an example CloudProfile manifest with minimal configuration that can be used to allow the creation of Azure shoot clusters.\nCloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the Azure environment (image urn, id, communityGalleryImageID or sharedGalleryImageID). You have to map every version that you specify in .spec.machineImages[].versions to an available VM image in your subscription. The VM image can come from the Azure Marketplace, in which case it is identified via a urn; it can be a custom VM image from a shared image gallery, identified by its sharedGalleryImageID; or it can come from a community image gallery, identified by its communityGalleryImageID. You can also use the id field to specify the image location in the Azure Compute Gallery (in which case it would have a different kind of path), but this is not recommended because it sometimes causes problems with cross-subscription image sharing. 
For each machine image version, an architecture field can specify the CPU architecture of the machines on which the given machine image can be used.\nAn example CloudProfileConfig for the Azure extension looks as follows:\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig countUpdateDomains: - region: westeurope count: 5 countFaultDomains: - region: westeurope count: 3 machineTypes: - name: Standard_D3_v2 acceleratedNetworking: true - name: Standard_X machineImages: - name: coreos versions: - version: 2135.6.0 urn: \"CoreOS:CoreOS:Stable:2135.6.0\" # architecture: amd64 # optional acceleratedNetworking: true - name: myimage versions: - version: 1.0.0 id: \"/subscriptions/\u003csubscription ID where the gallery is located\u003e/resourceGroups/myGalleryRG/providers/Microsoft.Compute/galleries/myGallery/images/myImageDefinition/versions/1.0.0\" - name: GardenLinuxCommunityImage versions: - version: 1.0.0 communityGalleryImageID: \"/CommunityGalleries/gardenlinux-567905d8-921f-4a85-b423-1fbf4e249d90/Images/gardenlinux/Versions/576.1.1\" - name: SharedGalleryImageName versions: - version: 1.0.0 sharedGalleryImageID: \"/SharedGalleries/sharedGalleryName/Images/sharedGalleryImageName/Versions/sharedGalleryImageVersionName\" The cloud profile configuration contains the update domain counts (via .countUpdateDomains[]) and the fault domain counts (via .countFaultDomains[]) for the Azure regions you want to offer.\nThe .machineTypes[] list contains provider-specific information about the machine types, e.g., whether a machine type supports Azure Accelerated Networking (see .machineTypes[].acceleratedNetworking).\nAdditionally, it contains the real machine image identifiers in the Azure environment. You can provide either the urn of an Azure Marketplace image or the id of a Shared Image Gallery image. 
When a Shared Image Gallery is used, you have to ensure that the image is available in the desired regions and that the end-user subscriptions have access to the image or to the whole gallery. You have to map every version that you specify in .spec.machineImages[].versions here such that the Azure extension knows the machine image identifiers for every version you want to offer. Furthermore, you can specify for each image version via .machineImages[].versions[].acceleratedNetworking whether Azure Accelerated Networking is supported.\nExample CloudProfile manifest The possible values for .spec.volumeTypes[].name on Azure are Standard_LRS, StandardSSD_LRS and Premium_LRS. There is another volume type called UltraSSD_LRS, but this type cannot be used as an OS disk. If an end user selects a volume type whose name is not one of the valid values, the machine will be created with the default volume type that belongs to the selected machine type. Therefore, it is recommended to configure only the valid values for .spec.volumeTypes[].name in the CloudProfile.\nPlease find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: azure spec: type: azure kubernetes: versions: - version: 1.28.2 - version: 1.23.8 expirationDate: \"2022-10-31T23:59:59Z\" machineImages: - name: coreos versions: - version: 2135.6.0 machineTypes: - name: Standard_D3_v2 cpu: \"4\" gpu: \"0\" memory: 14Gi - name: Standard_D4_v3 cpu: \"4\" gpu: \"0\" memory: 16Gi volumeTypes: - name: Standard_LRS class: standard usable: true - name: StandardSSD_LRS class: premium usable: false - name: Premium_LRS class: premium usable: false regions: - name: westeurope providerConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineTypes: - name: Standard_D3_v2 acceleratedNetworking: true - name: Standard_D4_v3 countUpdateDomains: - region: westeurope count: 5 countFaultDomains: - region: westeurope count: 3 
machineImages: - name: coreos versions: - version: 2303.3.0 urn: CoreOS:CoreOS:Stable:2303.3.0 # architecture: amd64 # optional acceleratedNetworking: true - version: 2135.6.0 urn: \"CoreOS:CoreOS:Stable:2135.6.0\" # architecture: amd64 # optional Seed resource This provider extension does not support any provider configuration for the Seed’s .spec.provider.providerConfig field. However, it supports managing backup infrastructure, i.e., you can specify a configuration for the .spec.backup field.\nBackup configuration A Seed of type azure can be configured to perform backups for the main etcds of the shoot clusters’ control planes using Azure Blob storage.\nThe location/region where the backups will be stored defaults to the region of the Seed (spec.provider.region), but can also be explicitly configured via the field spec.backup.region. The region of the backup can be different from where the Seed cluster is running. However, usually it makes sense to pick the same region for the backup bucket as used for the Seed cluster.\nPlease find below an example Seed manifest (partly) that configures backups using Azure Blob storage.\n--- apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: my-seed spec: provider: type: azure region: westeurope backup: provider: azure region: westeurope # default region secretRef: name: backup-credentials namespace: garden ... The referenced secret has to contain the provider credentials of the Azure subscription. Please take a look here on how to create an Azure Application and Service Principal and how to obtain credentials. 
The example below demonstrates what the secret has to look like.\napiVersion: v1 kind: Secret metadata: name: core-azure namespace: garden-dev type: Opaque data: clientID: base64(client-id) clientSecret: base64(client-secret) subscriptionID: base64(subscription-id) tenantID: base64(tenant-id) Permissions for Azure Blob storage Please make sure the Azure application has the following IAM roles.\n Contributor Miscellaneous Gardener managed Service Principals The operators of the Gardener Azure extension can provide a list of managed service principals (technical users) that can be used for Azure Shoots. This eliminates the need for users to provide their own service principals for their clusters.\nThe user would need to grant the managed service principal access to their subscription with proper permissions.\nAs service principals are managed per Azure Active Directory, a separate service principal needs to be provided for each supported Active Directory.\nIn case the user provides their own service principal in the Shoot secret, this one will be used instead of the managed one provided by the operator.\nEach managed service principal will be maintained in a Secret like this:\napiVersion: v1 kind: Secret metadata: name: service-principal-my-tenant namespace: extension-provider-azure labels: azure.provider.extensions.gardener.cloud/purpose: tenant-service-principal-secret data: tenantID: base64(my-tenant) clientID: base64(my-service-principal-id) clientSecret: base64(my-service-principal-secret) type: Opaque The user needs to provide a tenantID and subscriptionID in their Shoot secret.\nThe managed service principal will be assigned based on the tenantID. In case there is a managed service principal secret with a matching tenantID, this one will be used for the Shoot. 
If there is no matching managed service principal secret, the next Shoot operation will fail.\nOne of the benefits of having managed service principals is that the operator controls the lifecycle of the service principal and can rotate its secrets.\nAfter the service principal secret has been rotated and the corresponding secret is updated, all Shoot clusters using it need to be reconciled or have their last operation retried.\n","categories":"","description":"","excerpt":"Using the Azure provider extension with Gardener as an operator The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/operations/","tags":"","title":"Operations"},{"body":"Using the Equinix Metal provider extension with Gardener as operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration.\nThis document describes what this configuration looks like for Equinix Metal and provides an example CloudProfile manifest with minimal configuration that you can use to allow creating Equinix Metal shoot clusters.\nExample CloudProfile manifest Please find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: equinix-metal spec: type: equinixmetal kubernetes: versions: - version: 1.27.2 - version: 1.26.7 - version: 1.25.10 #expirationDate: \"2023-03-15T23:59:59Z\" machineImages: - name: flatcar versions: - version: 0.0.0-stable machineTypes: - name: t1.small cpu: \"4\" gpu: \"0\" memory: 8Gi usable: true regions: # List of offered metros - name: ny zones: # List of offered facilities within the respective metro - name: ewr1 - name: ny5 - name: ny7 providerConfig: apiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: flatcar versions: - version: 0.0.0-stable id: flatcar_stable - version: 3510.2.2 ipxeScriptUrl: 
https://stable.release.flatcar-linux.net/amd64-usr/3510.2.2/flatcar_production_packet.ipxe CloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the Equinix Metal environment (IDs). You have to map every version that you specify in .spec.machineImages[].versions here such that the Equinix Metal extension knows the ID for every version you want to offer.\nEquinix Metal supports two different options to specify the image:\n Supported Operating System: Images that are provided by Equinix Metal. They are referenced by their ID (slug). See the Operating Systems Reference (https://deploy.equinix.com/developers/docs/metal/operating-systems/supported/#operating-systems-reference) for all supported operating systems and their IDs. Custom iPXE Boot: Equinix Metal supports passing custom iPXE scripts during provisioning, which allows you to install a custom operating system manually. This is useful if you want to have a custom image or want to pin to a specific version. See Custom iPXE Boot for details. An example CloudProfileConfig for the Equinix Metal extension looks as follows:\napiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: flatcar versions: - version: 0.0.0-stable id: flatcar_stable - version: 3510.2.2 ipxeScriptUrl: https://stable.release.flatcar-linux.net/amd64-usr/3510.2.2/flatcar_production_packet.ipxe NOTE: CloudProfileConfig is not a Custom Resource, so you cannot create it directly.\n ","categories":"","description":"","excerpt":"Using the Equinix Metal provider extension with Gardener as operator …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-equinix-metal/operations/","tags":"","title":"Operations"},{"body":"Using the GCP provider extension with Gardener as operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration. 
The core.gardener.cloud/v1beta1.Seed resource is structured similarly. Additionally, it allows configuring settings for the backups of the main etcds’ data of shoot clusters’ control planes running in this seed cluster.\nThis document explains the necessary configuration for this provider extension.\nCloudProfile resource This section describes what the configuration for CloudProfiles looks like for GCP by providing an example CloudProfile manifest with minimal configuration that can be used to allow the creation of GCP shoot clusters.\nCloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the GCP environment (image URLs). You have to map every version that you specify in .spec.machineImages[].versions here such that the GCP extension knows the image URL for every version you want to offer. For each machine image version, an architecture field can specify the CPU architecture of the machines on which the given machine image can be used.\nAn example CloudProfileConfig for the GCP extension looks as follows:\napiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2135.6.0 image: projects/coreos-cloud/global/images/coreos-stable-2135-6-0-v20190801 # architecture: amd64 # optional Example CloudProfile manifest If you want to allow shoots to create VMs with local SSD volumes, you have to specify a disk type named SCRATCH in the .spec.volumeTypes[] list. 
Please find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: gcp spec: type: gcp kubernetes: versions: - version: 1.27.3 - version: 1.26.8 expirationDate: \"2022-10-31T23:59:59Z\" machineImages: - name: coreos versions: - version: 2135.6.0 machineTypes: - name: n1-standard-4 cpu: \"4\" gpu: \"0\" memory: 15Gi volumeTypes: - name: pd-standard class: standard - name: pd-ssd class: premium - name: SCRATCH class: standard regions: - name: europe-west1 zones: - name: europe-west1-b - name: europe-west1-c - name: europe-west1-d providerConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2135.6.0 image: projects/coreos-cloud/global/images/coreos-stable-2135-6-0-v20190801 # architecture: amd64 # optional Seed resource This provider extension does not support any provider configuration for the Seed’s .spec.provider.providerConfig field. However, it supports managing backup infrastructure, i.e., you can specify a configuration for the .spec.backup field.\nBackup configuration A Seed of type gcp can be configured to perform backups for the main etcds of the shoot clusters’ control planes using Google Cloud Storage buckets.\nThe location/region where the backups will be stored defaults to the region of the Seed (spec.provider.region), but can also be explicitly configured via the field spec.backup.region. The region of the backup can be different from where the seed cluster is running. 
However, usually it makes sense to pick the same region for the backup bucket as used for the Seed cluster.\nPlease find below an example Seed manifest (partly) that configures backups using Google Cloud Storage buckets.\n--- apiVersion: core.gardener.cloud/v1beta1 kind: Seed metadata: name: my-seed spec: provider: type: gcp region: europe-west1 backup: provider: gcp region: europe-west1 # default region secretRef: name: backup-credentials namespace: garden ... An example of the referenced secret containing the credentials for the GCP Cloud storage can be found in the example folder.\nPermissions for GCP Cloud Storage Please make sure the service account associated with the provided credentials has the following IAM roles.\n Storage Admin ","categories":"","description":"","excerpt":"Using the GCP provider extension with Gardener as operator The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/operations/","tags":"","title":"Operations"},{"body":"Using the OpenStack provider extension with Gardener as operator The core.gardener.cloud/v1beta1.CloudProfile resource declares a providerConfig field that is meant to contain provider-specific configuration.\nIn this document we are describing how this configuration looks like for OpenStack and provide an example CloudProfile manifest with minimal configuration that you can use to allow creating OpenStack shoot clusters.\nCloudProfileConfig The cloud profile configuration contains information about the real machine image IDs in the OpenStack environment (image names). You have to map every version that you specify in .spec.machineImages[].versions here such that the OpenStack extension knows the image ID for every version you want to offer.\nIt also contains optional default values for DNS servers that shall be used for shoots. 
In the dnsServers[] list you can specify IP addresses that are used as DNS configuration for created shoot subnets.\nAlso, you have to specify the Keystone URL of your environment in the keystoneURL field.\nAdditionally, you can influence the HTTP request timeout when talking to the OpenStack API in the requestTimeout field. This may help when you have, for example, a long list of load balancers in your environment.\nIn case your OpenStack system uses Octavia for network load balancing then you have to set the useOctavia field to true such that the cloud-controller-manager for OpenStack gets correctly configured (it defaults to false).\nSome hypervisors (especially those which are VMware-based) don’t automatically send a new volume size to a Linux kernel when a volume is resized and in-use. For those hypervisors you can enable the storage plugin interacting with Cinder to tell the SCSI block device to refresh its information so that its updated size is reported to the kernel. You might need to enable this behavior depending on the underlying hypervisor of your OpenStack installation. The rescanBlockStorageOnResize field controls this. Please note that it only applies for Kubernetes versions where CSI is used.\nSome OpenStack configurations do not allow attaching more than a specific number of volumes to a single node. To tell the k8s scheduler to not overschedule volumes on a node, you can set nodeVolumeAttachLimit which defaults to 256. Some OpenStack configurations have different names for volume and compute availability zones, which might cause pods to go into pending state as there are no nodes available in the detected volume AZ. To ignore the volume AZ when scheduling pods, you can set ignoreVolumeAZ to true (it defaults to false). 
See CSI Cinder driver.\nThe cloud profile config also contains constraints for floating pools and load balancer providers that can be used in shoots.\nIf your OpenStack system supports server groups, the serverGroupPolicies property will enable your end-users to create shoots with workers where the nodes are managed by Nova’s server groups. Specifying serverGroupPolicies is optional and can be omitted. If enabled, the end-user can choose whether or not to use this feature for a shoot’s workers. Gardener will handle the creation of the server group and node assignment.\nTo enable this feature, an operator should:\n specify the allowed policy values (e.g. affinity, anti-affinity) in this section. Only the policies in the allow-list will be available for end-users. make sure that your OpenStack project has enough server group capacity. Otherwise, shoot creation will fail. If your OpenStack system has multiple volume-types, the storageClasses property enables the creation of Kubernetes storageClasses for shoots. Set storageClasses[].parameters.type to map it to an OpenStack volume-type. Specifying storageClasses is optional and can be omitted.\nAn example CloudProfileConfig for the OpenStack extension looks as follows:\napiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2135.6.0 # Fallback to image name if no region mapping is found # Only works for amd64 and is strongly discouraged. Prefer image IDs! 
image: coreos-2135.6.0 regions: - name: europe id: \"1234-amd64\" architecture: amd64 # optional, defaults to amd64 - name: europe id: \"1234-arm64\" architecture: arm64 - name: asia id: \"5678-amd64\" architecture: amd64 # keystoneURL: https://url-to-keystone/v3/ # keystoneURLs: # - region: europe # url: https://europe.example.com/v3/ # - region: asia # url: https://asia.example.com/v3/ # dnsServers: # - 10.10.10.11 # - 10.10.10.12 # requestTimeout: 60s # useOctavia: true # useSNAT: true # rescanBlockStorageOnResize: true # ignoreVolumeAZ: true # nodeVolumeAttachLimit: 30 # serverGroupPolicies: # - soft-anti-affinity # - anti-affinity # resolvConfOptions: # - rotate # - timeout:1 # storageClasses: # - name: example-sc # default: false # provisioner: cinder.csi.openstack.org # volumeBindingMode: WaitForFirstConsumer # parameters: # type: storage_premium_perf0 constraints: floatingPools: - name: fp-pool-1 # region: europe # loadBalancerClasses: # - name: lb-class-1 # floatingSubnetID: \"1234\" # floatingNetworkID: \"4567\" # subnetID: \"7890\" # - name: \"fp-pool-*\" # region: europe # loadBalancerClasses: # - name: lb-class-1 # floatingSubnetID: \"1234\" # floatingNetworkID: \"4567\" # subnetID: \"7890\" # - name: \"fp-pool-eu-demo\" # region: europe # domain: demo # loadBalancerClasses: # - name: lb-class-1 # floatingSubnetID: \"1234\" # floatingNetworkID: \"4567\" # subnetID: \"7890\" # - name: \"fp-pool-eu-dev\" # region: europe # domain: dev # nonConstraining: true # loadBalancerClasses: # - name: lb-class-1 # floatingSubnetID: \"1234\" # floatingNetworkID: \"4567\" # subnetID: \"7890\" loadBalancerProviders: - name: haproxy # - name: f5 # region: asia # - name: haproxy # region: asia Please note that it is possible to configure a region mapping for keystone URLs, floating pools, and load balancer providers. Additionally, floating pools can be constrained to a keystone domain by specifying the domain field. 
Floating pool names may also contain simple wildcard expressions, like * or fp-pool-* or *-fp-pool. Please note that the * must either stand alone or be at the beginning or at the end of the name. Consequently, fp-*-pool is not allowed. The default behavior is that, if found, the regional (and/or domain restricted) entry is taken. If no entry for the given region exists then the fallback value is the best-matching entry (w.r.t. wildcard matching) in the list without a region field (or the keystoneURL value for the keystone URLs). If an additional floating pool should be selectable for a region and/or domain, you can mark it as non-constraining by setting the optional field nonConstraining to true. Multiple loadBalancerProviders can be specified in the CloudProfile. Each provider may specify a region constraint for where it can be used. If at least one region specific entry exists in the CloudProfile, the shoot’s specified loadBalancerProvider must adhere to the list of the available providers of that region. Otherwise, one of the non-regional specific providers should be used. Each entry in the loadBalancerProviders must be uniquely identified by its name and if applicable, its region.\nThe loadBalancerClasses field is an optional list of load balancer classes which can be used when the corresponding floating pool network is chosen. The load balancer classes can be configured in the same way as in the ControlPlaneConfig in the Shoot resource, therefore see here for more details.\nSome OpenStack environments don’t need these regional mappings, hence, the region and keystoneURLs fields are optional. 
If your OpenStack environment only has regional values and it doesn’t make sense to provide a (non-regional) fallback then simply omit keystoneURL and always specify region.\nIf Gardener creates and manages the router of a shoot cluster, it is additionally possible to specify that the enable_snat field is set to true via useSNAT: true in the CloudProfileConfig.\nOn some OpenStack environments, there may be the need to set options in the file /etc/resolv.conf on worker nodes. If the field resolvConfOptions is set, a systemd service will be installed which copies /run/systemd/resolve/resolv.conf on every change to /etc/resolv.conf and appends the given options.\nExample CloudProfile manifest Please find below an example CloudProfile manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: openstack spec: type: openstack kubernetes: versions: - version: 1.27.3 - version: 1.26.8 expirationDate: \"2022-10-31T23:59:59Z\" machineImages: - name: coreos versions: - version: 2135.6.0 architectures: # optional, defaults to [amd64] - amd64 - arm64 machineTypes: - name: medium_4_8 cpu: \"4\" gpu: \"0\" memory: 8Gi architecture: amd64 # optional, defaults to amd64 storage: class: standard type: default size: 40Gi - name: medium_4_8_arm cpu: \"4\" gpu: \"0\" memory: 8Gi architecture: arm64 storage: class: standard type: default size: 40Gi regions: - name: europe-1 zones: - name: europe-1a - name: europe-1b - name: europe-1c providerConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: CloudProfileConfig machineImages: - name: coreos versions: - version: 2135.6.0 # Fallback to image name if no region mapping is found # Only works for amd64 and is strongly discouraged. Prefer image IDs! 
image: coreos-2135.6.0 regions: - name: europe id: \"1234-amd64\" architecture: amd64 # optional, defaults to amd64 - name: europe id: \"1234-arm64\" architecture: arm64 - name: asia id: \"5678-amd64\" architecture: amd64 keystoneURL: https://url-to-keystone/v3/ constraints: floatingPools: - name: fp-pool-1 loadBalancerProviders: - name: haproxy ","categories":"","description":"","excerpt":"Using the OpenStack provider extension with Gardener as operator The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-openstack/operations/","tags":"","title":"Operations"},{"body":"Using the Calico networking extension with Gardener as operator This document explains configuration options supported by the networking-calico extension.\nRun calico-node in non-privileged and non-root mode Feature State: Alpha\nMotivation Running containers in privileged mode is not recommended as privileged containers run with all Linux capabilities enabled and can access the host’s resources. Running containers in privileged mode exposes a number of security threats, such as a breakout to the underlying host OS.\nSupport for non-privileged and non-root mode The Calico project has preliminary support for running the calico-node component in non-privileged mode (see this guide). Similar to the Tigera Calico operator, the networking-calico extension can also run calico-node in non-privileged and non-root mode. This feature is controlled via a feature gate named NonPrivilegedCalicoNode. The feature gates are configured in the ControllerConfiguration of networking-calico. 
The corresponding ControllerDeployment configuration that enables the NonPrivilegedCalicoNode feature gate would look like:\napiVersion: core.gardener.cloud/v1beta1 kind: ControllerDeployment metadata: name: networking-calico type: helm providerConfig: values: chart: \u003comitted\u003e config: featureGates: NonPrivilegedCalicoNode: true Limitations The support for the non-privileged mode in the Calico project is not ready for production usage. The upstream documentation states that in non-privileged mode the support for features added after Calico v3.21 is not guaranteed. Calico in non-privileged mode does not support the eBPF dataplane. That’s why when the eBPF dataplane is enabled, calico-node has to run in privileged mode (even when the NonPrivilegedCalicoNode feature gate is enabled). (At the time of writing this guide) there is the following issue projectcalico/calico#5348 that is not addressed. (At the time of writing this guide) the upstream adoption seems to be low. The Calico charts and manifests in projectcalico/calico run calico-node in privileged mode. ","categories":"","description":"","excerpt":"Using the Calico networking extension with Gardener as operator This …","ref":"/docs/extensions/network-extensions/gardener-extension-networking-calico/operations/","tags":"","title":"Operations"},{"body":"Packages:\n operations.gardener.cloud/v1alpha1 operations.gardener.cloud/v1alpha1 Package v1alpha1 is a version of the API.\nResource Types: Bastion Bastion Bastion holds details about an SSH bastion for a shoot cluster.\n Field Description apiVersion string operations.gardener.cloud/v1alpha1 kind string Bastion metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec BastionSpec Specification of the Bastion.\n shootRef Kubernetes core/v1.LocalObjectReference ShootRef defines the target shoot for a Bastion. 
The name field of the ShootRef is immutable.\n seedName string (Optional) SeedName is the name of the seed to which this Bastion is currently scheduled. This field is populated at the beginning of a create/reconcile operation.\n providerType string (Optional) ProviderType is cloud provider used by the referenced Shoot.\n sshPublicKey string SSHPublicKey is the user’s public key. This field is immutable.\n ingress []BastionIngressPolicy Ingress controls from where the created bastion host should be reachable.\n status BastionStatus (Optional) Most recently observed status of the Bastion.\n BastionIngressPolicy (Appears on: BastionSpec) BastionIngressPolicy represents an ingress policy for SSH bastion hosts.\n Field Description ipBlock Kubernetes networking/v1.IPBlock IPBlock defines an IP block that is allowed to access the bastion.\n BastionSpec (Appears on: Bastion) BastionSpec is the specification of a Bastion.\n Field Description shootRef Kubernetes core/v1.LocalObjectReference ShootRef defines the target shoot for a Bastion. The name field of the ShootRef is immutable.\n seedName string (Optional) SeedName is the name of the seed to which this Bastion is currently scheduled. This field is populated at the beginning of a create/reconcile operation.\n providerType string (Optional) ProviderType is cloud provider used by the referenced Shoot.\n sshPublicKey string SSHPublicKey is the user’s public key. 
This field is immutable.\n ingress []BastionIngressPolicy Ingress controls from where the created bastion host should be reachable.\n BastionStatus (Appears on: Bastion) BastionStatus holds the most recently observed status of the Bastion.\n Field Description ingress Kubernetes core/v1.LoadBalancerIngress (Optional) Ingress holds the public IP and/or hostname of the bastion instance.\n conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of a Bastion’s current state.\n lastHeartbeatTimestamp Kubernetes meta/v1.Time (Optional) LastHeartbeatTimestamp is the time when the bastion was last marked as not to be deleted. When this is set, the ExpirationTimestamp is advanced as well.\n expirationTimestamp Kubernetes meta/v1.Time (Optional) ExpirationTimestamp is the time after which a Bastion is supposed to be garbage collected.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this Bastion. 
It corresponds to the Bastion’s generation, which is updated on mutation by the API Server.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n operations.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/operations/","tags":"","title":"Operations"},{"body":"Packages:\n operator.gardener.cloud/v1alpha1 operator.gardener.cloud/v1alpha1 Package v1alpha1 contains the configuration of the Gardener Operator.\nResource Types: ACMEIssuer (Appears on: DefaultIssuer) ACMEIssuer specifies an issuer using an ACME server.\n Field Description email string Email is the e-mail for the ACME user.\n server string Server is the ACME server endpoint.\n secretRef Kubernetes core/v1.LocalObjectReference (Optional) SecretRef is a reference to a secret containing a private key of the issuer (data key ‘privateKey’).\n precheckNameservers []string (Optional) PrecheckNameservers overwrites the default precheck nameservers used for checking DNS propagation. Format host or host:port, e.g. “8.8.8.8” same as “8.8.8.8:53” or “google-public-dns-a.google.com:53”.\n AdmissionDeploymentSpec (Appears on: Deployment) AdmissionDeploymentSpec contains the deployment specification for the admission controller of an extension.\n Field Description runtimeCluster DeploymentSpec (Optional) RuntimeCluster is the deployment configuration for the admission in the runtime cluster. The runtime deployment is responsible for creating the admission controller in the runtime cluster.\n virtualCluster DeploymentSpec (Optional) VirtualCluster is the deployment configuration for the admission deployment in the garden cluster. The garden deployment installs necessary resources in the virtual garden cluster e.g. RBAC that are necessary for the admission controller.\n values k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON (Optional) Values are the deployment values. 
The values will be applied to both admission deployments.\n AuditWebhook (Appears on: GardenerAPIServerConfig, KubeAPIServerConfig) AuditWebhook contains settings related to an audit webhook configuration.\n Field Description batchMaxSize int32 (Optional) BatchMaxSize is the maximum size of a batch.\n kubeconfigSecretName string KubeconfigSecretName specifies the name of a secret containing the kubeconfig for this webhook.\n version string (Optional) Version is the API version to send and expect from the webhook.\n Authentication (Appears on: KubeAPIServerConfig) Authentication contains settings related to authentication.\n Field Description webhook AuthenticationWebhook (Optional) Webhook contains settings related to an authentication webhook configuration.\n AuthenticationWebhook (Appears on: Authentication) AuthenticationWebhook contains settings related to an authentication webhook configuration.\n Field Description cacheTTL Kubernetes meta/v1.Duration (Optional) CacheTTL is the duration to cache responses from the webhook authenticator.\n kubeconfigSecretName string KubeconfigSecretName specifies the name of a secret containing the kubeconfig for this webhook.\n version string (Optional) Version is the API version to send and expect from the webhook.\n Backup (Appears on: ETCDMain) Backup contains the object store configuration for backups for the virtual garden etcd.\n Field Description provider string Provider is a provider name. This field is immutable.\n bucketName string BucketName is the name of the backup bucket.\n secretRef Kubernetes core/v1.LocalObjectReference SecretRef is a reference to a Secret object containing the cloud provider credentials for the object store where backups should be stored. 
It should have enough privileges to manipulate the objects as well as buckets.\n CAIssuer (Appears on: DefaultIssuer) CAIssuer specifies an issuer using a root or intermediate CA to be used for signing.\n Field Description secretRef Kubernetes core/v1.LocalObjectReference SecretRef is a reference to a TLS secret containing the CA for signing certificates.\n CertManagement (Appears on: RuntimeCluster) CertManagement configures the cert-management component for issuing TLS certificates from an ACME server.\n Field Description config CertManagementConfig (Optional) Config contains configuration for deploying the cert-controller-manager.\n defaultIssuer DefaultIssuer DefaultIssuer is the default issuer used for requesting TLS certificates.\n CertManagementConfig (Appears on: CertManagement) CertManagementConfig contains information for deploying the cert-controller-manager.\n Field Description caCertificatesSecretRef Kubernetes core/v1.LocalObjectReference (Optional) CACertificatesSecretRef are additional root certificates to access ACME servers with private TLS certificates. 
The certificates are expected at key ‘bundle.crt’.\n ControlPlane (Appears on: VirtualCluster) ControlPlane holds information about the general settings for the control plane of the virtual garden cluster.\n Field Description highAvailability HighAvailability (Optional) HighAvailability holds the configuration settings for high availability settings.\n Credentials (Appears on: GardenStatus) Credentials contains information about the virtual garden cluster credentials.\n Field Description rotation CredentialsRotation (Optional) Rotation contains information about the credential rotations.\n CredentialsRotation (Appears on: Credentials) CredentialsRotation contains information about the rotation of credentials.\n Field Description certificateAuthorities github.com/gardener/gardener/pkg/apis/core/v1beta1.CARotation (Optional) CertificateAuthorities contains information about the certificate authority credential rotation.\n serviceAccountKey github.com/gardener/gardener/pkg/apis/core/v1beta1.ServiceAccountKeyRotation (Optional) ServiceAccountKey contains information about the service account key credential rotation.\n etcdEncryptionKey github.com/gardener/gardener/pkg/apis/core/v1beta1.ETCDEncryptionKeyRotation (Optional) ETCDEncryptionKey contains information about the ETCD encryption key credential rotation.\n observability github.com/gardener/gardener/pkg/apis/core/v1beta1.ObservabilityRotation (Optional) Observability contains information about the observability credential rotation.\n workloadIdentityKey WorkloadIdentityKeyRotation (Optional) WorkloadIdentityKey contains information about the workload identity key credential rotation.\n DNS (Appears on: VirtualCluster) DNS holds information about DNS settings.\n Field Description domains []string (Optional) Domains are the external domains of the virtual garden cluster. 
The first given domain in this list is immutable.\n DashboardGitHub (Appears on: GardenerDashboardConfig) DashboardGitHub contains configuration for the GitHub ticketing feature.\n Field Description apiURL string APIURL is the URL to the GitHub API.\n organisation string Organisation is the name of the GitHub organisation.\n repository string Repository is the name of the GitHub repository.\n secretRef Kubernetes core/v1.LocalObjectReference SecretRef is the reference to a secret in the garden namespace containing the GitHub credentials.\n pollInterval Kubernetes meta/v1.Duration (Optional) PollInterval is the interval of how often the GitHub API is polled for issue updates. This field is used as a fallback mechanism to ensure state synchronization, even when there is a GitHub webhook configuration. If a webhook event is missed or not successfully delivered, the polling will help catch up on any missed updates. If this field is not provided and there is no ‘webhookSecret’ key in the referenced secret, it will be implicitly defaulted to 15m.\n DashboardOIDC (Appears on: GardenerDashboardConfig) DashboardOIDC contains configuration for the OIDC settings.\n Field Description sessionLifetime Kubernetes meta/v1.Duration (Optional) SessionLifetime is the maximum duration of a session.\n additionalScopes []string (Optional) AdditionalScopes is the list of additional OIDC scopes.\n secretRef Kubernetes core/v1.LocalObjectReference SecretRef is the reference to a secret in the garden namespace containing the OIDC client ID and secret for the dashboard.\n DashboardTerminal (Appears on: GardenerDashboardConfig) DashboardTerminal contains configuration for the terminal settings.\n Field Description container DashboardTerminalContainer Container contains configuration for the dashboard terminal container.\n allowedHosts []string (Optional) AllowedHosts should consist of permitted hostnames (without the scheme) for terminal connections. 
It is important to consider that the usage of wildcards follows the rules defined by the content security policy. ‘.seed.local.gardener.cloud’, or ‘.other-seeds.local.gardener.cloud’. For more information, see https://github.com/gardener/dashboard/blob/master/docs/operations/webterminals.md#allowlist-for-hosts.\n DashboardTerminalContainer (Appears on: DashboardTerminal) DashboardTerminalContainer contains configuration for the dashboard terminal container.\n Field Description image string Image is the container image for the dashboard terminal container.\n description string (Optional) Description is a description for the dashboard terminal container with hints for the user.\n DefaultIssuer (Appears on: CertManagement) DefaultIssuer specifies an issuer to be created on the cluster.\n Field Description acme ACMEIssuer (Optional) ACME is the ACME protocol specific spec. Either ACME or CA must be specified.\n ca CAIssuer (Optional) CA is the CA specific spec. Either ACME or CA must be specified.\n Deployment (Appears on: ExtensionSpec) Deployment specifies how an extension can be installed for a Gardener landscape. 
It includes the specification for installing an extension and/or an admission controller.\n Field Description extension ExtensionDeploymentSpec (Optional) ExtensionDeployment contains the deployment configuration for an extension.\n admission AdmissionDeploymentSpec (Optional) AdmissionDeployment contains the deployment configuration for an admission controller.\n DeploymentSpec (Appears on: AdmissionDeploymentSpec, ExtensionDeploymentSpec) DeploymentSpec is the specification for the deployment of a component.\n Field Description helm ExtensionHelm Helm contains the specification for a Helm deployment.\n ETCD (Appears on: VirtualCluster) ETCD contains configuration for the etcds of the virtual garden cluster.\n Field Description main ETCDMain (Optional) Main contains configuration for the main etcd.\n events ETCDEvents (Optional) Events contains configuration for the events etcd.\n ETCDEvents (Appears on: ETCD) ETCDEvents contains configuration for the events etcd.\n Field Description storage Storage (Optional) Storage contains storage configuration.\n ETCDMain (Appears on: ETCD) ETCDMain contains configuration for the main etcd.\n Field Description backup Backup (Optional) Backup contains the object store configuration for backups for the virtual garden etcd.\n storage Storage (Optional) Storage contains storage configuration.\n Extension Extension describes a Gardener extension.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec ExtensionSpec Spec contains the specification of this extension.\n resources []github.com/gardener/gardener/pkg/apis/core/v1beta1.ControllerResource (Optional) Resources is a list of combinations of kinds (DNSRecord, Backupbucket, …) and their actual types (aws-route53, gcp).\n deployment Deployment (Optional) Deployment contains deployment configuration for an extension and its admission controller.\n status ExtensionStatus Status contains the status of this extension.\n ExtensionDeploymentSpec (Appears on: Deployment) ExtensionDeploymentSpec specifies how to install the extension in a gardener landscape. The installation is split into two parts: - installing the extension in the virtual garden cluster by creating the ControllerRegistration and ControllerDeployment - installing the extension in the runtime cluster (if necessary).\n Field Description DeploymentSpec DeploymentSpec (Members of DeploymentSpec are embedded into this type.) (Optional) DeploymentSpec is the deployment configuration for the extension.\n values k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON (Optional) Values are the deployment values used in the creation of the ControllerDeployment in the virtual garden cluster.\n runtimeClusterValues k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.JSON (Optional) RuntimeClusterValues are the deployment values for the extension deployment running in the runtime garden cluster. If no values are specified, a runtime deployment is considered deactivated.\n policy github.com/gardener/gardener/pkg/apis/core/v1beta1.ControllerDeploymentPolicy (Optional) Policy controls how the controller is deployed. It defaults to ‘OnDemand’.\n seedSelector Kubernetes meta/v1.LabelSelector (Optional) SeedSelector contains an optional label selector for seeds. Only if the labels match then this controller will be considered for a deployment. 
An empty list means that all seeds are selected.\n ExtensionHelm (Appears on: DeploymentSpec) ExtensionHelm is the configuration for a helm deployment.\n Field Description ociRepository github.com/gardener/gardener/pkg/apis/core/v1.OCIRepository (Optional) OCIRepository defines where to pull the chart from.\n ExtensionSpec (Appears on: Extension) ExtensionSpec contains the specification of a Gardener extension.\n Field Description resources []github.com/gardener/gardener/pkg/apis/core/v1beta1.ControllerResource (Optional) Resources is a list of combinations of kinds (DNSRecord, Backupbucket, …) and their actual types (aws-route53, gcp).\n deployment Deployment (Optional) Deployment contains deployment configuration for an extension and its admission controller.\n ExtensionStatus (Appears on: Extension) ExtensionStatus is the status of a Gardener extension.\n Field Description observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this resource.\n conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of an Extension’s current state.\n providerStatus k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderStatus contains type-specific status.\n Garden Garden describes a list of gardens.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec GardenSpec Spec contains the specification of this garden.\n runtimeCluster RuntimeCluster RuntimeCluster contains configuration for the runtime cluster.\n virtualCluster VirtualCluster VirtualCluster contains configuration for the virtual cluster.\n status GardenStatus Status contains the status of this garden.\n GardenSpec (Appears on: Garden) GardenSpec contains the specification of a garden environment.\n Field Description runtimeCluster RuntimeCluster RuntimeCluster contains configuration for the runtime cluster.\n virtualCluster VirtualCluster VirtualCluster contains configuration for the virtual cluster.\n GardenStatus (Appears on: Garden) GardenStatus is the status of a garden environment.\n Field Description gardener github.com/gardener/gardener/pkg/apis/core/v1beta1.Gardener (Optional) Gardener holds information about the Gardener which last acted on the Garden.\n conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition Conditions is a list of conditions.\n lastOperation github.com/gardener/gardener/pkg/apis/core/v1beta1.LastOperation (Optional) LastOperation holds information about the last operation on the Garden.\n observedGeneration int64 ObservedGeneration is the most recent generation observed for this resource.\n credentials Credentials (Optional) Credentials contains information about the virtual garden cluster credentials.\n encryptedResources []string (Optional) EncryptedResources is the list of resources which are currently encrypted in the virtual garden by the virtual kube-apiserver. Resources which are encrypted by default will not appear here. See https://github.com/gardener/gardener/blob/master/docs/concepts/operator.md#etcd-encryption-config for more details.\n Gardener (Appears on: VirtualCluster) Gardener contains the configuration settings for the Gardener components.\n Field Description clusterIdentity string ClusterIdentity is the identity of the garden cluster. 
This field is immutable.\n gardenerAPIServer GardenerAPIServerConfig (Optional) APIServer contains configuration settings for the gardener-apiserver.\n gardenerAdmissionController GardenerAdmissionControllerConfig (Optional) AdmissionController contains configuration settings for the gardener-admission-controller.\n gardenerControllerManager GardenerControllerManagerConfig (Optional) ControllerManager contains configuration settings for the gardener-controller-manager.\n gardenerScheduler GardenerSchedulerConfig (Optional) Scheduler contains configuration settings for the gardener-scheduler.\n gardenerDashboard GardenerDashboardConfig (Optional) Dashboard contains configuration settings for the gardener-dashboard.\n gardenerDiscoveryServer GardenerDiscoveryServerConfig (Optional) DiscoveryServer contains configuration settings for the gardener-discovery-server.\n GardenerAPIServerConfig (Appears on: Gardener) GardenerAPIServerConfig contains configuration settings for the gardener-apiserver.\n Field Description KubernetesConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.KubernetesConfig (Members of KubernetesConfig are embedded into this type.) 
admissionPlugins []github.com/gardener/gardener/pkg/apis/core/v1beta1.AdmissionPlugin (Optional) AdmissionPlugins contains the list of user-defined admission plugins (additional to those managed by Gardener), and, if desired, the corresponding configuration.\n auditConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.AuditConfig (Optional) AuditConfig contains configuration settings for the audit of the kube-apiserver.\n auditWebhook AuditWebhook (Optional) AuditWebhook contains settings related to an audit webhook configuration.\n logging github.com/gardener/gardener/pkg/apis/core/v1beta1.APIServerLogging (Optional) Logging contains configuration for the log level and HTTP access logs.\n requests github.com/gardener/gardener/pkg/apis/core/v1beta1.APIServerRequests (Optional) Requests contains configuration for request-specific settings for the kube-apiserver.\n watchCacheSizes github.com/gardener/gardener/pkg/apis/core/v1beta1.WatchCacheSizes (Optional) WatchCacheSizes contains configuration of the API server’s watch cache sizes. Configuring these flags might be useful for large-scale Garden clusters with a lot of parallel update requests and a lot of watching controllers (e.g. large ManagedSeed clusters). When the API server’s watch cache’s capacity is too small to cope with the amount of update requests and watchers for a particular resource, it might happen that controller watches are permanently stopped with too old resource version errors. 
Starting from kubernetes v1.19, the API server’s watch cache size is adapted dynamically and setting the watch cache size flags will have no effect, except when setting it to 0 (which disables the watch cache).\n encryptionConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.EncryptionConfig (Optional) EncryptionConfig contains customizable encryption configuration of the Gardener API server.\n GardenerAdmissionControllerConfig (Appears on: Gardener) GardenerAdmissionControllerConfig contains configuration settings for the gardener-admission-controller.\n Field Description logLevel string (Optional) LogLevel is the configured log level for the gardener-admission-controller. Must be one of [info,debug,error]. Defaults to info.\n resourceAdmissionConfiguration ResourceAdmissionConfiguration (Optional) ResourceAdmissionConfiguration is the configuration for resource size restrictions for arbitrary Group-Version-Kinds.\n GardenerControllerManagerConfig (Appears on: Gardener) GardenerControllerManagerConfig contains configuration settings for the gardener-controller-manager.\n Field Description KubernetesConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.KubernetesConfig (Members of KubernetesConfig are embedded into this type.) defaultProjectQuotas []ProjectQuotaConfiguration (Optional) DefaultProjectQuotas is the default configuration matching projects are set up with if a quota is not already specified.\n logLevel string (Optional) LogLevel is the configured log level for the gardener-controller-manager. Must be one of [info,debug,error]. Defaults to info.\n GardenerDashboardConfig (Appears on: Gardener) GardenerDashboardConfig contains configuration settings for the gardener-dashboard.\n Field Description enableTokenLogin bool (Optional) EnableTokenLogin specifies whether it is possible to log into the dashboard with a JWT token. 
If disabled, OIDC must be configured.\n frontendConfigMapRef Kubernetes core/v1.LocalObjectReference (Optional) FrontendConfigMapRef is the reference to a ConfigMap in the garden namespace containing the frontend configuration.\n assetsConfigMapRef Kubernetes core/v1.LocalObjectReference (Optional) AssetsConfigMapRef is the reference to a ConfigMap in the garden namespace containing the assets (logos/icons).\n gitHub DashboardGitHub (Optional) GitHub contains configuration for the GitHub ticketing feature.\n logLevel string (Optional) LogLevel is the configured log level. Must be one of [trace,debug,info,warn,error]. Defaults to info.\n oidcConfig DashboardOIDC (Optional) OIDC contains configuration for the OIDC provider. This field must be provided when EnableTokenLogin is false.\n terminal DashboardTerminal (Optional) Terminal contains configuration for the terminal settings.\n GardenerDiscoveryServerConfig (Appears on: Gardener) GardenerDiscoveryServerConfig contains configuration settings for the gardener-discovery-server.\nGardenerSchedulerConfig (Appears on: Gardener) GardenerSchedulerConfig contains configuration settings for the gardener-scheduler.\n Field Description KubernetesConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.KubernetesConfig (Members of KubernetesConfig are embedded into this type.) logLevel string (Optional) LogLevel is the configured log level for the gardener-scheduler. Must be one of [info,debug,error]. 
Defaults to info.\n GroupResource (Appears on: KubeAPIServerConfig) GroupResource contains a list of resources which should be stored in etcd-events instead of etcd-main.\n Field Description group string Group is the API group name.\n resource string Resource is the resource name.\n HighAvailability (Appears on: ControlPlane) HighAvailability specifies the configuration settings for high availability for a resource.\nIngress (Appears on: RuntimeCluster) Ingress configures the Ingress specific settings of the runtime cluster.\n Field Description domains []string (Optional) Domains specify the ingress domains of the cluster pointing to the ingress controller endpoint. They will be used to construct ingress URLs for system applications running in runtime cluster.\n controller github.com/gardener/gardener/pkg/apis/core/v1beta1.IngressController Controller configures a Gardener managed Ingress Controller listening on the ingressDomain.\n KubeAPIServerConfig (Appears on: Kubernetes) KubeAPIServerConfig contains configuration settings for the kube-apiserver.\n Field Description KubeAPIServerConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.KubeAPIServerConfig (Members of KubeAPIServerConfig are embedded into this type.) (Optional) KubeAPIServerConfig contains all configuration values not specific to the virtual garden cluster.\n auditWebhook AuditWebhook (Optional) AuditWebhook contains settings related to an audit webhook configuration.\n authentication Authentication (Optional) Authentication contains settings related to authentication.\n resourcesToStoreInETCDEvents []GroupResource (Optional) ResourcesToStoreInETCDEvents contains a list of resources which should be stored in etcd-events instead of etcd-main. The ‘events’ resource is always stored in etcd-events. 
Note that adding or removing resources from this list will not migrate them automatically from the etcd-main to etcd-events or vice versa.\n sni SNI (Optional) SNI contains configuration options for the TLS SNI settings.\n KubeControllerManagerConfig (Appears on: Kubernetes) KubeControllerManagerConfig contains configuration settings for the kube-controller-manager.\n Field Description KubeControllerManagerConfig github.com/gardener/gardener/pkg/apis/core/v1beta1.KubeControllerManagerConfig (Members of KubeControllerManagerConfig are embedded into this type.) (Optional) KubeControllerManagerConfig contains all configuration values not specific to the virtual garden cluster.\n certificateSigningDuration Kubernetes meta/v1.Duration (Optional) CertificateSigningDuration is the maximum length of duration signed certificates will be given. Individual CSRs may request shorter certs by setting spec.expirationSeconds.\n Kubernetes (Appears on: VirtualCluster) Kubernetes contains the version and configuration options for the Kubernetes components of the virtual garden cluster.\n Field Description kubeAPIServer KubeAPIServerConfig (Optional) KubeAPIServer contains configuration settings for the kube-apiserver.\n kubeControllerManager KubeControllerManagerConfig (Optional) KubeControllerManager contains configuration settings for the kube-controller-manager.\n version string Version is the semantic Kubernetes version to use for the virtual garden cluster.\n Maintenance (Appears on: VirtualCluster) Maintenance contains information about the time window for maintenance operations.\n Field Description timeWindow github.com/gardener/gardener/pkg/apis/core/v1beta1.MaintenanceTimeWindow TimeWindow contains information about the time window for maintenance operations.\n Networking (Appears on: VirtualCluster) Networking defines networking parameters for the virtual garden cluster.\n Field Description services string Services is the CIDR of the service network. 
This field is immutable.\n ProjectQuotaConfiguration (Appears on: GardenerControllerManagerConfig) ProjectQuotaConfiguration defines quota configurations.\n Field Description config k8s.io/apimachinery/pkg/runtime.RawExtension Config is the quota specification used for the project set-up. Only v1.ResourceQuota resources are supported.\n projectSelector Kubernetes meta/v1.LabelSelector (Optional) ProjectSelector is an optional setting to select the projects considered for quotas. Defaults to empty LabelSelector, which matches all projects.\n Provider (Appears on: RuntimeCluster) Provider defines the provider-specific information for this cluster.\n Field Description zones []string (Optional) Zones is the list of availability zones the cluster is deployed to.\n ResourceAdmissionConfiguration (Appears on: GardenerAdmissionControllerConfig) ResourceAdmissionConfiguration contains settings about arbitrary kinds and the size each resource should have at most.\n Field Description limits []ResourceLimit Limits contains configuration for resources which are subjected to size limitations.\n unrestrictedSubjects []Kubernetes rbac/v1.Subject (Optional) UnrestrictedSubjects contains references to users, groups, or service accounts which aren’t subjected to any resource size limit.\n operationMode ResourceAdmissionWebhookMode (Optional) OperationMode specifies the mode the webhooks operates in. Allowed values are “block” and “log”. Defaults to “block”.\n ResourceAdmissionWebhookMode (string alias)\n (Appears on: ResourceAdmissionConfiguration) ResourceAdmissionWebhookMode is an alias type for the resource admission webhook mode.\nResourceLimit (Appears on: ResourceAdmissionConfiguration) ResourceLimit contains settings about a kind and the size each resource should have at most.\n Field Description apiGroups []string (Optional) APIGroups is the name of the APIGroup that contains the limited resource. 
WildcardAll represents all groups.\n apiVersions []string (Optional) APIVersions is the version of the resource. WildcardAll represents all versions.\n resources []string Resources is the name of the resource this rule applies to. WildcardAll represents all resources.\n size k8s.io/apimachinery/pkg/api/resource.Quantity Size specifies the imposed limit.\n RuntimeCluster (Appears on: GardenSpec) RuntimeCluster contains configuration for the runtime cluster.\n Field Description ingress Ingress Ingress configures Ingress specific settings for the Garden cluster.\n networking RuntimeNetworking Networking defines the networking configuration of the runtime cluster.\n provider Provider Provider defines the provider-specific information for this cluster.\n settings Settings (Optional) Settings contains certain settings for this cluster.\n volume Volume (Optional) Volume contains settings for persistent volumes created in the runtime cluster.\n certManagement CertManagement (Optional) CertManagement configures the cert-management component for issuing TLS certificates from an ACME server.\n RuntimeNetworking (Appears on: RuntimeCluster) RuntimeNetworking defines the networking configuration of the runtime cluster.\n Field Description nodes string (Optional) Nodes is the CIDR of the node network. This field is immutable.\n pods string Pods is the CIDR of the pod network. This field is immutable.\n services string Services is the CIDR of the service network. This field is immutable.\n blockCIDRs []string (Optional) BlockCIDRs is a list of network addresses that should be blocked.\n SNI (Appears on: KubeAPIServerConfig) SNI contains configuration options for the TLS SNI settings.\n Field Description secretName string SecretName is the name of a secret containing the TLS certificate and private key.\n domainPatterns []string (Optional) DomainPatterns is a list of fully qualified domain names, possibly with prefixed wildcard segments. 
The domain patterns also allow IP addresses, but IPs should only be used if the apiserver has visibility to the IP address requested by a client. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names.\n SettingLoadBalancerServices (Appears on: Settings) SettingLoadBalancerServices controls certain settings for services of type load balancer that are created in the runtime cluster.\n Field Description annotations map[string]string (Optional) Annotations is a map of annotations that will be injected/merged into every load balancer service object.\n SettingTopologyAwareRouting (Appears on: Settings) SettingTopologyAwareRouting controls certain settings for topology-aware traffic routing in the cluster. See https://github.com/gardener/gardener/blob/master/docs/operations/topology_aware_routing.md.\n Field Description enabled bool Enabled controls whether certain Services deployed in the cluster should be topology-aware. These Services are virtual-garden-etcd-main-client, virtual-garden-etcd-events-client and virtual-garden-kube-apiserver. Additionally, other components that are deployed to the runtime cluster via other means can read this field and according to its value enable/disable topology-aware routing for their Services.\n SettingVerticalPodAutoscaler (Appears on: Settings) SettingVerticalPodAutoscaler controls certain settings for the vertical pod autoscaler components deployed in the seed.\n Field Description enabled bool (Optional) Enabled controls whether the VPA components shall be deployed into this cluster. It is true by default because the operator (and Gardener) heavily rely on a VPA being deployed. You should only disable this if your runtime cluster already has another, manually/custom managed VPA deployment. 
If this is not the case, but you still disable it, then reconciliation will fail.\n Settings (Appears on: RuntimeCluster) Settings contains certain settings for this cluster.\n Field Description loadBalancerServices SettingLoadBalancerServices (Optional) LoadBalancerServices controls certain settings for services of type load balancer that are created in the runtime cluster.\n verticalPodAutoscaler SettingVerticalPodAutoscaler (Optional) VerticalPodAutoscaler controls certain settings for the vertical pod autoscaler components deployed in the cluster.\n topologyAwareRouting SettingTopologyAwareRouting (Optional) TopologyAwareRouting controls certain settings for topology-aware traffic routing in the cluster. See https://github.com/gardener/gardener/blob/master/docs/operations/topology_aware_routing.md.\n Storage (Appears on: ETCDEvents, ETCDMain) Storage contains storage configuration.\n Field Description capacity k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) Capacity is the storage capacity for the volumes.\n className string (Optional) ClassName is the name of a storage class.\n VirtualCluster (Appears on: GardenSpec) VirtualCluster contains configuration for the virtual cluster.\n Field Description controlPlane ControlPlane (Optional) ControlPlane holds information about the general settings for the control plane of the virtual cluster.\n dns DNS DNS holds information about DNS settings.\n etcd ETCD (Optional) ETCD contains configuration for the etcds of the virtual garden cluster.\n gardener Gardener Gardener contains the configuration options for the Gardener control plane components.\n kubernetes Kubernetes Kubernetes contains the version and configuration options for the Kubernetes components of the virtual garden cluster.\n maintenance Maintenance Maintenance contains information about the time window for maintenance operations.\n networking Networking Networking contains information about cluster networking such as CIDRs, etc.\n Volume (Appears 
on: RuntimeCluster) Volume contains settings for persistent volumes created in the runtime cluster.\n Field Description minimumSize k8s.io/apimachinery/pkg/api/resource.Quantity (Optional) MinimumSize defines the minimum size that should be used for PVCs in the runtime cluster.\n WorkloadIdentityKeyRotation (Appears on: CredentialsRotation) WorkloadIdentityKeyRotation contains information about the workload identity key credential rotation.\n Field Description phase github.com/gardener/gardener/pkg/apis/core/v1beta1.CredentialsRotationPhase Phase describes the phase of the workload identity key credential rotation.\n lastCompletionTime Kubernetes meta/v1.Time (Optional) LastCompletionTime is the most recent time when the workload identity key credential rotation was successfully completed.\n lastInitiationTime Kubernetes meta/v1.Time (Optional) LastInitiationTime is the most recent time when the workload identity key credential rotation was initiated.\n lastInitiationFinishedTime Kubernetes meta/v1.Time (Optional) LastInitiationFinishedTime is the recent time when the workload identity key credential rotation initiation was completed.\n lastCompletionTriggeredTime Kubernetes meta/v1.Time (Optional) LastCompletionTriggeredTime is the recent time when the workload identity key credential rotation completion was triggered.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n operator.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/operator/","tags":"","title":"Operator"},{"body":"DEP-05: Operator Out-of-band Tasks Table of Contents DEP-05: Operator Out-of-band Tasks Table of Contents Summary Terminology Motivation Goals Non-Goals Proposal Custom Resource Golang API Spec Status Custom Resource YAML API Lifecycle Creation Execution Deletion Use Cases Recovery from permanent quorum loss Task Config Pre-Conditions Trigger on-demand snapshot compaction Possible scenarios Task Config Pre-Conditions Trigger 
on-demand full/delta snapshot Possible scenarios Task Config Pre-Conditions Trigger on-demand maintenance of etcd cluster Possible Scenarios Task Config Pre-Conditions Copy Backups Task Possible Scenarios Task Config Pre-Conditions Metrics Summary This DEP proposes an enhancement to etcd-druid’s capabilities to handle out-of-band tasks, which are presently performed manually or invoked programmatically via suboptimal APIs. The document proposes the establishment of a unified interface by defining a well-structured API to harmonize the initiation of any out-of-band task, monitor its status, and simplify the process of adding new tasks and managing their lifecycles.\nTerminology etcd-druid: etcd-druid is an operator that manages etcd clusters.\n backup-sidecar: It is the etcd-backup-restore sidecar container running in each etcd-member pod of an etcd cluster.\n leading-backup-sidecar: A backup-sidecar that is associated with the etcd leader of an etcd cluster.\n out-of-band task: Any on-demand task or operation that can be executed on an etcd cluster without modifying the Etcd custom resource spec (desired state).\n Motivation Today, etcd-druid mainly acts as an etcd cluster provisioner (creation, maintenance and deletion). In the future, the capabilities of etcd-druid will be enhanced via the etcd-member proposal by providing it access to much more detailed information about each etcd cluster member. While we enhance the reconciliation and monitoring capabilities of etcd-druid, it still lacks the ability to allow users to invoke out-of-band tasks on an existing etcd cluster.\nOperating etcd clusters at scale has shown that we regularly need capabilities to trigger out-of-band tasks which are outside of the purview of a regular etcd reconciliation run. Many of these tasks are multi-step processes, and performing them manually is error-prone, even if an operator follows a well-written step-by-step guide. 
Thus, there is a need to automate these tasks. Some examples of on-demand/out-of-band tasks:\n Recover from a permanent quorum loss of an etcd cluster. Trigger an on-demand full/delta snapshot. Trigger an on-demand snapshot compaction. Trigger on-demand maintenance of an etcd cluster. Copy the backups from one object store to another object store. Goals Establish a unified interface for operator tasks by defining a single dedicated custom resource for out-of-band tasks. Define a contract (in terms of prerequisites) which needs to be adhered to by any task implementation. Facilitate the easy addition of new out-of-band task(s) through this custom resource. Provide CLI capabilities to operators, making it easy to invoke supported out-of-band tasks. Non-Goals In the current scope, the capability to abort or suspend an out-of-band task will not be provided. This could be considered as an enhancement based on demand. Ordering (by establishing dependency) of out-of-band tasks submitted for the same etcd cluster has not been considered in the first increment. In a future version, based on how operator tasks are used, we will enhance this proposal and the implementation. Proposal The authors propose creating a single dedicated custom resource to represent an out-of-band task. Etcd-druid will be enhanced to process the task requests and update each task's status, which can then be tracked/observed.\nCustom Resource Golang API EtcdOperatorTask is the new custom resource that will be introduced. This API will be introduced in version v1alpha1 and is subject to change. We will respect the Kubernetes Deprecation Policy.\n// EtcdOperatorTask represents an out-of-band operator task resource. type EtcdOperatorTask struct { metav1.TypeMeta metav1.ObjectMeta // Spec is the specification of the EtcdOperatorTask resource. Spec EtcdOperatorTaskSpec `json:\"spec\"` // Status is the most recently observed status of the EtcdOperatorTask resource. 
Status EtcdOperatorTaskStatus `json:\"status,omitempty\"` } Spec The authors propose that the following fields should be specified in the spec (desired state) of the EtcdOperatorTask custom resource.\n To capture the type of out-of-band operator task to be performed, a .spec.type field should be defined. It can take the value of any supported out-of-band task, e.g. “OnDemandSnapshotTask”, “QuorumLossRecoveryTask”, etc. To capture the configuration specific to each task, a .spec.config field of type string should be defined, as each task can have a different input configuration. // EtcdOperatorTaskSpec is the spec for an EtcdOperatorTask resource. type EtcdOperatorTaskSpec struct { // Type specifies the type of out-of-band operator task to be performed. Type string `json:\"type\"` // Config is a task specific configuration. Config string `json:\"config,omitempty\"` // TTLSecondsAfterFinished is the time-to-live to garbage collect the // related resource(s) of task once it has been completed. // +optional TTLSecondsAfterFinished *int32 `json:\"ttlSecondsAfterFinished,omitempty\"` // OwnerEtcdReference refers to the name and namespace of the corresponding // Etcd owner for which the task has been invoked. OwnerEtcdRefrence types.NamespacedName `json:\"ownerEtcdRefrence\"` } Status The authors propose the following fields for the Status (current state) of the EtcdOperatorTask custom resource to monitor the progress of the task.\n// EtcdOperatorTaskStatus is the status for an EtcdOperatorTask resource. type EtcdOperatorTaskStatus struct { // ObservedGeneration is the most recent generation observed for the resource. ObservedGeneration *int64 `json:\"observedGeneration,omitempty\"` // State is the last known state of the task. State TaskState `json:\"state\"` // InitiatedAt is the time at which the task moved from \"pending\" state to any other state. InitiatedAt metav1.Time `json:\"initiatedAt\"` // LastErrors represents the errors encountered when processing the task. 
// +optional LastErrors []LastError `json:\"lastErrors,omitempty\"` // LastOperation captures the status of the last operation if the task involves multiple stages. // +optional LastOperation *LastOperation `json:\"lastOperation,omitempty\"` } type LastOperation struct { // Name of the LastOperation. Name opsName `json:\"name\"` // State of the last operation, one of Pending, InProgress, Completed, Failed. State OperationState `json:\"state\"` // LastTransitionTime is the time at which the operation state last transitioned from one state to another. LastTransitionTime metav1.Time `json:\"lastTransitionTime\"` // Reason is a human-readable message indicating details about the last operation. Reason string `json:\"reason\"` } // LastError stores details of the most recent error encountered for the task. type LastError struct { // Code is an error code that uniquely identifies an error. Code ErrorCode `json:\"code\"` // Description is a human-readable message indicating details of the error. Description string `json:\"description\"` // ObservedAt is the time at which the error was observed. ObservedAt metav1.Time `json:\"observedAt\"` } // TaskState represents the state of the task. type TaskState string const ( TaskStateFailed TaskState = \"Failed\" TaskStatePending TaskState = \"Pending\" TaskStateRejected TaskState = \"Rejected\" TaskStateSucceeded TaskState = \"Succeeded\" TaskStateInProgress TaskState = \"InProgress\" ) // OperationState represents the state of the last operation. 
type OperationState string const ( OperationStateFailed OperationState = \"Failed\" OperationStatePending OperationState = \"Pending\" OperationStateCompleted OperationState = \"Completed\" OperationStateInProgress OperationState = \"InProgress\" ) Custom Resource YAML API apiVersion: druid.gardener.cloud/v1alpha1 kind: EtcdOperatorTask metadata: name: \u003cname of operator task resource\u003e namespace: \u003ccluster namespace\u003e generation: \u003cspecific generation of the desired state\u003e spec: type: \u003ctype/category of supported out-of-band task\u003e ttlSecondsAfterFinished: \u003ctime-to-live to garbage collect the custom resource after it has been completed\u003e config: \u003ctask specific configuration\u003e ownerEtcdRefrence: \u003cname and namespace of the corresponding etcd owner for which the task has been invoked\u003e status: observedGeneration: \u003cspecific observedGeneration of the resource\u003e state: \u003clast known state of the out-of-band task\u003e initiatedAt: \u003ctime at which the task moved to any other state from the \"pending\" state\u003e lastErrors: - code: \u003cerror-code\u003e description: \u003cdescription of the error\u003e observedAt: \u003ctime the error was observed\u003e lastOperation: name: \u003coperation-name\u003e state: \u003ctask state as seen at the completion of last operation\u003e lastTransitionTime: \u003ctime of transition to this state\u003e reason: \u003creason/message if any\u003e Lifecycle Creation A task is created by creating an instance of the EtcdOperatorTask custom resource specific to that task.\n Note: In the future, either a kubectl extension plugin or a druidctl tool will be introduced. Dedicated sub-commands will be created for each out-of-band task. 
This will greatly improve usability for operators performing such tasks, as the CLI extension will automatically create the relevant instance(s) of EtcdOperatorTask with the provided configuration.\n Execution The authors propose introducing a new controller that watches EtcdOperatorTask custom resources. Each out-of-band task may have some task specific configuration defined in .spec.config. The controller needs to parse this task specific config, which comes as a string, according to the schema defined for each task. For every out-of-band task, a set of pre-conditions can be defined. These pre-conditions are evaluated against the current state of the target etcd cluster. Based on the evaluation result (boolean), the task is permitted or denied execution. If multiple tasks are invoked simultaneously or are in the pending state, they will be executed in a First-In-First-Out (FIFO) manner. Note: Dependent ordering among tasks will be addressed later, which will enable concurrent execution of tasks when possible.\n Deletion Upon completion of the task, irrespective of its final state, Etcd-druid will ensure the garbage collection of the task custom resource and any other Kubernetes resources created to execute the task. This will be done according to .spec.ttlSecondsAfterFinished if it is defined in the spec; otherwise a default expiry time will be assumed.\nUse Cases Recovery from permanent quorum loss Recovery from permanent quorum loss involves two phases - identification and recovery - both of which are done manually today. This proposal intends to automate the latter. Recovery today is a multi-step process and needs to be performed carefully by a human operator. Automating these steps would be prudent, to make recovery quicker and less error-prone. 
The identification of the permanent quorum loss would remain a manual process, requiring a human operator to investigate and confirm that there is indeed a permanent quorum loss with no possibility of auto-healing.\nTask Config We do not need any config for this task. When creating an instance of EtcdOperatorTask for this scenario, .spec.config will be set to nil (unset).\nPre-Conditions There should be a quorum loss in a multi-member etcd cluster. For a single-member etcd cluster, invoking this task is unnecessary as the restoration of the single member is automatically handled by the backup-restore process. There should not already be a permanent-quorum-loss-recovery-task running for the same etcd cluster. Trigger on-demand snapshot compaction Etcd-druid provides a configurable etcd-events-threshold flag. When this threshold is breached, a snapshot compaction is triggered for the etcd cluster. However, there are scenarios where an ad-hoc snapshot compaction may be required.\nPossible scenarios If an operator anticipates a scenario of permanent quorum loss, they can trigger an on-demand snapshot compaction to create a compacted full-snapshot. This can potentially reduce the recovery time from a permanent quorum loss. As an additional benefit, a human operator can leverage the current implementation of snapshot compaction, which internally triggers restoration. Hence, by initiating an on-demand snapshot compaction task, the operator can verify the integrity of etcd cluster backups, particularly in cases of potential backup corruption or re-encryption. The success or failure of this snapshot compaction can offer valuable insights into these scenarios. Task Config We do not need any config for this task. When creating an instance of EtcdOperatorTask for this scenario, .spec.config will be set to nil (unset).\nPre-Conditions There should not be an on-demand snapshot compaction task already running for the same etcd cluster. 
Note: on-demand snapshot compaction runs as a separate job in a separate pod, which interacts with the backup bucket and not the etcd cluster itself, hence it doesn’t depend on the health of etcd cluster members.\n Trigger on-demand full/delta snapshot Etcd custom resource provides an ability to set FullSnapshotSchedule which currently defaults to run once in 24 hrs. DeltaSnapshotPeriod is also made configurable which defines the duration after which a delta snapshot will be taken. If a human operator does not wish to wait for the scheduled full/delta snapshot, they can trigger an on-demand (out-of-schedule) full/delta snapshot on the etcd cluster, which will be taken by the leading-backup-restore.\nPossible scenarios An on-demand full snapshot can be triggered if scheduled snapshot fails due to any reason. Gardener Shoot Hibernation: Every etcd cluster incurs an inherent cost of preserving the volumes even when a gardener shoot control plane is scaled down, i.e the shoot is in a hibernated state. However, it is possible to save on hyperscaler costs by invoking this task to take a full snapshot before scaling down the etcd cluster, and deleting the etcd data volumes afterwards. Gardener Control Plane Migration: In gardener, a cluster control plane can be moved from one seed cluster to another. This process currently requires the etcd data to be replicated on the target cluster, so a full snapshot of the etcd cluster in the source seed before the migration would allow for faster restoration of the etcd cluster in the target seed. Task Config // SnapshotType can be full or delta snapshot. type SnapshotType string const ( SnapshotTypeFull SnapshotType = \"full\" SnapshotTypeDelta SnapshotType = \"delta\" ) type OnDemandSnapshotTaskConfig struct { // Type of on-demand snapshot. Type SnapshotType `json:\"type\"` } spec: config: | type: \u003ctype of on-demand snapshot\u003e Pre-Conditions Etcd cluster should have a quorum. 
There should not already be an on-demand snapshot task running with the same SnapshotType for the same etcd cluster. Trigger on-demand maintenance of etcd cluster An operator can trigger on-demand maintenance of an etcd cluster, which includes operations like etcd compaction, etcd defragmentation etc.\nPossible Scenarios If an etcd cluster is heavily loaded, causing performance degradation, and the operator does not want to wait for the scheduled maintenance window, then an on-demand maintenance task can be triggered which will invoke etcd-compaction, etcd-defragmentation etc. on the target etcd cluster. This will make the etcd cluster lean and clean, thus improving cluster performance. Task Config type OnDemandMaintenanceTaskConfig struct { // MaintenanceType defines the maintenance operations that need to be performed on the etcd cluster. MaintenanceType maintenanceOps `json:\"maintenanceType\"` } type maintenanceOps struct { // EtcdCompaction if set to true will trigger an etcd compaction on the target etcd. // +optional EtcdCompaction bool `json:\"etcdCompaction,omitempty\"` // EtcdDefragmentation if set to true will trigger an etcd defragmentation on the target etcd. // +optional EtcdDefragmentation bool `json:\"etcdDefragmentation,omitempty\"` } spec: config: | maintenanceType: etcdCompaction: \u003ctrue/false\u003e etcdDefragmentation: \u003ctrue/false\u003e Pre-Conditions Etcd cluster should have a quorum. There should not already be a duplicate task running with the same maintenanceType. Copy Backups Task Copy the backups (full and delta snapshots) of an etcd cluster from one object store (source) to another object store (target).\nPossible Scenarios In Gardener, the Control Plane Migration process utilizes the copy-backups task. This task is responsible for copying backups from one object store to another, typically located in different regions. Task Config // EtcdCopyBackupsTaskConfig defines the parameters for the copy backups task. 
type EtcdCopyBackupsTaskConfig struct { // SourceStore defines the specification of the source object store provider. SourceStore StoreSpec `json:\"sourceStore\"` // TargetStore defines the specification of the target object store provider for storing backups. TargetStore StoreSpec `json:\"targetStore\"` // MaxBackupAge is the maximum age in days that a backup must have in order to be copied. // By default all backups will be copied. // +optional MaxBackupAge *uint32 `json:\"maxBackupAge,omitempty\"` // MaxBackups is the maximum number of backups that will be copied starting with the most recent ones. // +optional MaxBackups *uint32 `json:\"maxBackups,omitempty\"` } spec: config: |sourceStore: \u003csource object store specification\u003e targetStore: \u003ctarget object store specification\u003e maxBackupAge: \u003cmaximum age in days that a backup must have in order to be copied\u003e maxBackups: \u003cmaximum no. of backups that will be copied\u003e Note: For detailed object store specification please refer here\n Pre-Conditions There should not already be a copy-backups task running. Note: copy-backups-task runs as a separate job, and it operates only on the backup bucket, hence it doesn’t depend on health of etcd cluster members.\n Note: copy-backups-task has already been implemented and it’s currently being used in Control Plane Migration but copy-backups-task will be harmonized with EtcdOperatorTask custom resource.\n Metrics Authors proposed to introduce the following metrics:\n etcddruid_operator_task_duration_seconds : Histogram which captures the runtime for each etcd operator task. Labels:\n Key: type, Value: all supported tasks Key: state, Value: One-Of {failed, succeeded, rejected} Key: etcd, Value: name of the target etcd resource Key: etcd_namespace, Value: namespace of the target etcd resource etcddruid_operator_tasks_total: Counter which counts the number of etcd operator tasks. 
Labels:\n Key: type, Value: all supported tasks Key: state, Value: One-Of {failed, succeeded, rejected} Key: etcd, Value: name of the target etcd resource Key: etcd_namespace, Value: namespace of the target etcd resource ","categories":"","description":"","excerpt":"DEP-05: Operator Out-of-band Tasks Table of Contents DEP-05: Operator …","ref":"/docs/other-components/etcd-druid/proposals/05-etcd-operator-tasks/","tags":"","title":"operator out-of-band tasks"},{"body":"Disclaimer If an application depends on other services deployed separately, do not rely on a certain start sequence of containers. Instead, ensure that the application can cope with unavailability of the services it depends on.\nIntroduction Kubernetes offers a feature called InitContainers to perform some tasks during a pod’s initialization. In this tutorial, we demonstrate how to use InitContainers in order to orchestrate a starting sequence of multiple containers. The tutorial uses the example app url-shortener, which consists of two components:\n postgresql database webapp which depends on the postgresql database and provides two endpoints: create a short url from a given location and redirect from a given short URL to the corresponding target location This app represents the minimal example where an application relies on another service or database. 
In this example, if the application starts before the database is ready, the application will fail as shown below:\n$ kubectl logs webapp-958cf5567-h247n time=\"2018-06-12T11:02:42Z\" level=info msg=\"Connecting to Postgres database using: host=`postgres:5432` dbname=`url_shortener_db` username=`user`\\n\" time=\"2018-06-12T11:02:42Z\" level=fatal msg=\"failed to start: failed to open connection to database: dial tcp: lookup postgres on 100.64.0.10:53: no such host\\n\" $ kubectl get po -w NAME READY STATUS RESTARTS AGE webapp-958cf5567-h247n 0/1 Pending 0 0s webapp-958cf5567-h247n 0/1 Pending 0 0s webapp-958cf5567-h247n 0/1 ContainerCreating 0 0s webapp-958cf5567-h247n 0/1 ContainerCreating 0 1s webapp-958cf5567-h247n 0/1 Error 0 2s webapp-958cf5567-h247n 0/1 Error 1 3s webapp-958cf5567-h247n 0/1 CrashLoopBackOff 1 4s webapp-958cf5567-h247n 0/1 Error 2 18s webapp-958cf5567-h247n 0/1 CrashLoopBackOff 2 29s webapp-958cf5567-h247n 0/1 Error 3 43s webapp-958cf5567-h247n 0/1 CrashLoopBackOff 3 56s If the restartPolicy is set to Always (default) in the yaml file, Kubernetes will keep restarting the failed container with an exponential back-off delay.\nUsing InitContainers To avoid such a situation, InitContainers can be defined, which are executed prior to the application container. 
If one of the InitContainers fails, the application container won’t be started.\napiVersion: apps/v1 kind: Deployment metadata: name: webapp spec: selector: matchLabels: app: webapp template: metadata: labels: app: webapp spec: initContainers: # check if DB is ready, and only continue when true - name: check-db-ready image: postgres:9.6.5 command: ['sh', '-c', 'until pg_isready -h postgres -p 5432; do echo waiting for database; sleep 2; done;'] containers: - image: xcoulon/go-url-shortener:0.1.0 name: go-url-shortener env: - name: POSTGRES_HOST value: postgres - name: POSTGRES_PORT value: \"5432\" - name: POSTGRES_DATABASE value: url_shortener_db - name: POSTGRES_USER value: user - name: POSTGRES_PASSWORD value: mysecretpassword ports: - containerPort: 8080 In the above example, the InitContainer uses the Docker image postgres:9.6.5, which is different from the image of the application container.\nThis also brings the advantage of not having to include unnecessary tools (e.g., pg_isready) in the application container.\nWith InitContainers in place, if the database is not available yet, the pod startup will look similar to:\n$ kubectl get po -w NAME READY STATUS RESTARTS AGE nginx-deployment-5cc79d6bfd-t9n8h 1/1 Running 0 5d privileged-pod 1/1 Running 0 4d webapp-fdcb49cbc-4gs4n 0/1 Pending 0 0s webapp-fdcb49cbc-4gs4n 0/1 Pending 0 0s webapp-fdcb49cbc-4gs4n 0/1 Init:0/1 0 0s webapp-fdcb49cbc-4gs4n 0/1 Init:0/1 0 1s $ kubectl logs webapp-fdcb49cbc-4gs4n Error from server (BadRequest): container \"go-url-shortener\" in pod \"webapp-fdcb49cbc-4gs4n\" is waiting to start: PodInitializing ","categories":"","description":"How to orchestrate a startup sequence of multiple containers","excerpt":"How to orchestrate a startup sequence of multiple containers","ref":"/docs/guides/applications/container-startup/","tags":"","title":"Orchestration of Container Startup"},{"body":"The Gardener project implements the documentation-as-code paradigm. 
Essentially this means that:\n Documentation resides close to the code it describes - in the corresponding GitHub repositories. Only documentation with regard to cross-cutting concerns that cannot be affiliated to a specific component repository is hosted in the general gardener/documentation repository. We use tools to develop, validate and integrate documentation sources. The change management process is largely automated with automatic validation, integration and deployment using docforge and docs-toolbelt. The documentation sources are intended for reuse and not bound to a specific publishing platform. The physical organization in a repository is irrelevant for the tool support. What needs to be maintained is the intended result in a docforge documentation bundle manifest configuration, very much like virtual machine configurations, that docforge can reliably recreate in any case. We use GitHub as a distributed, versioned storage system and docforge to pull sources in their desired state to forge documentation bundles according to a desired specification provided as a manifest. Content Organization Documentation that can be affiliated to a component is hosted and maintained in the component repository.\nA good way to organize your documentation is to place it in a ‘docs’ folder and create separate subfolders per role activity. For example:\nrepositoryX |_ docs |_ usage | |_ images | |_ 01.png | |_ hibernation.md |_ operations |_ deployment Do not use folders just because they are in the template. Stick to the predefined roles and corresponding activities as the naming convention. A system makes it easier to maintain and get oriented. 
While recommended, this is not a mandatory way of organizing the documentation.\n User: usage Operator: operations Gardener (service) provider: deployment Gardener Developer: development Gardener Extension Developer: extensions Publishing on gardener.cloud The Gardener website is one of the multiple optional publishing channels where the source material might end up as documentation. We use docforge and automated integration and publish process to enable transparent change management.\nTo have documentation published on the website it is necessary to use the docforge manifests available at gardener/documentation/.docforge and register a reference to your documentation.\nNote This is work in progress and we are transitioning to a more transparent way of integrating component documentation. This guide will be updated as we progress. These manifests describe a particular publishing goal, i.e. using Hugo to publish on the website, and you will find out that they contain Hugo-specific front-matter properties. Consult with the documentation maintainers for details. Use the gardener channel in slack or open a PR.\n","categories":"","description":"","excerpt":"The Gardener project implements the documentation-as-code paradigm. …","ref":"/docs/contribute/documentation/organization/","tags":"","title":"Organization"},{"body":"Overview The kubectl command-line tool uses kubeconfig files to find the information it needs to choose a cluster and communicate with the API server of a cluster.\nProblem If you’ve become aware of a security breach that affects you, you may want to revoke or cycle credentials in case anything was leaked. However, this is not possible with the initial or master kubeconfig from your cluster.\nPitfall Never distribute the kubeconfig, which you can download directly within the Gardener dashboard, for a productive cluster.\nCreate a Custom kubeconfig File for Each User Create a separate kubeconfig for each user. 
One of the big advantages of this approach is that you can revoke them and control the permissions better. Restricting access to a single namespace is also possible here.\nThe script creates a new ServiceAccount with read privileges in the whole cluster (Secrets are excluded). To run the script, Deno, a secure TypeScript runtime, must be installed.\n#!/usr/bin/env -S deno run --allow-run /* * This script creates a Kubernetes ServiceAccount and the other required resources and prints the KUBECONFIG to the console. * Depending on your requirements, you might want to change the clusterRoleBindingTemplate() function. * * To execute this script, Deno (https://deno.land/), a TypeScript \u0026 JavaScript runtime, must be installed. * It is a single executable binary for the major OSs, from the original author of Node.js. * example: deno run --allow-run kubeconfig-for-custom-user.ts d00001 * example: deno run --allow-run kubeconfig-for-custom-user.ts d00001 --delete * * known issue: the shebang works under Linux but not under the Windows Subsystem for Linux */ const KUBECTL = \"/usr/local/bin/kubectl\" //or // const KUBECTL = \"C:\\\\Program Files\\\\Docker\\\\Docker\\\\resources\\\\bin\\\\kubectl.exe\" const serviceAccName = Deno.args[0] const deleteIt = Deno.args[1] if (serviceAccName == undefined || serviceAccName == \"--delete\" ) { console.log(\"please provide username as an argument, for example: deno run --allow-run kubeconfig-for-custom-user.ts USER_NAME [--delete]\") Deno.exit(1) } if (deleteIt == \"--delete\") { exec([KUBECTL, \"delete\", \"serviceaccount\", serviceAccName]) exec([KUBECTL, \"delete\", \"secret\", `${serviceAccName}-secret`]) exec([KUBECTL, \"delete\", \"clusterrolebinding\", `view-${serviceAccName}-global`]) Deno.exit(0) } await exec([KUBECTL, \"create\", \"serviceaccount\", serviceAccName, \"-o\", \"json\"]) await exec([KUBECTL, \"create\", \"-o\", \"json\", \"-f\", \"-\"], secretYamlTemplate()) let secret = await exec([KUBECTL, \"get\", \"secret\", 
`${serviceAccName}-secret`, \"-o\", \"json\"]) let caCRT = secret.data[\"ca.crt\"]; let userToken = atob(secret.data[\"token\"]); //decode base64 let kubeConfig = await exec([KUBECTL, \"config\", \"view\", \"--minify\", \"-o\", \"json\"]); let clusterApi = kubeConfig.clusters[0].cluster.server let clusterName = kubeConfig.clusters[0].name await exec([KUBECTL, \"create\", \"-o\", \"json\", \"-f\", \"-\"], clusterRoleBindingTemplate()) console.log(kubeConfigTemplate(caCRT, userToken, clusterApi, clusterName, serviceAccName + \"-\" + clusterName)) async function exec(args: string[], stdInput?: string): Promise\u003cObject\u003e { console.log(\"# \"+args.join(\" \")) let opt: Deno.RunOptions = { cmd: args, stdout: \"piped\", stderr: \"piped\", stdin: \"piped\", }; const p = Deno.run(opt); if (stdInput != undefined) { await p.stdin.write(new TextEncoder().encode(stdInput)); await p.stdin.close(); } const status = await p.status() const output = await p.output() const stderrOutput = await p.stderrOutput() if (status.code === 0) { return JSON.parse(new TextDecoder().decode(output)) } else { let error = new TextDecoder().decode(stderrOutput); return \"\" } } function clusterRoleBindingTemplate() { return ` apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: view-${serviceAccName}-global subjects: - kind: ServiceAccount name: ${serviceAccName}namespace: default roleRef: kind: ClusterRole name: view apiGroup: rbac.authorization.k8s.io ` } function secretYamlTemplate() { return ` apiVersion: v1 kind: Secret metadata: name: ${serviceAccName}-secret annotations: kubernetes.io/service-account.name: ${serviceAccName}type: kubernetes.io/service-account-token` } function kubeConfigTemplate(certificateAuthority: string, token: string, clusterApi: string, clusterName: string, username: string) { return ` ## KUBECONFIG generated on ${new Date()}apiVersion: v1 clusters: - cluster: certificate-authority-data: ${certificateAuthority}server: 
${clusterApi}name: ${clusterName}contexts: - context: cluster: ${clusterName}user: ${username}name: ${clusterName}current-context: ${clusterName}kind: Config preferences: {} users: - name: ${username}user: token: ${token}` } If edit or admin rights are to be assigned, the ClusterRoleBinding must be adapted in the roleRef section with the roles listed below.\nFurthermore, you can restrict this to a single namespace by not creating a ClusterRoleBinding but only a RoleBinding within the desired namespace.\n Default ClusterRole Default ClusterRoleBinding Description cluster-admin system:masters group Allows super-user access to perform any action on any resource. When used in a ClusterRoleBinding, it gives full control over every resource in the cluster and in all namespaces. When used in a RoleBinding, it gives full control over every resource in the rolebinding’s namespace, including the namespace itself. admin None Allows admin access, intended to be granted within a namespace using a RoleBinding. If used in a RoleBinding, allows read/write access to most resources in a namespace, including the ability to create roles and rolebindings within the namespace. It does not allow write access to resource quota or to the namespace itself. edit None Allows read/write access to most objects in a namespace. It does not allow viewing or modifying roles or rolebindings. view None Allows read-only access to see most objects in a namespace. It does not allow viewing roles or rolebindings. It does not allow viewing secrets, since those are escalating. 
","categories":"","description":"","excerpt":"Overview The kubectl command-line tool uses kubeconfig files to find …","ref":"/docs/guides/client-tools/working-with-kubeconfig/","tags":"","title":"Organizing Access Using kubeconfig Files"},{"body":"Problem After updating your HTML and JavaScript sources in your web application, the Kubernetes cluster delivers outdated versions - why?\nOverview By default, Kubernetes service pods are not accessible from the external network, but only from other pods within the same Kubernetes cluster.\nThe Gardener cluster has a built-in configuration for HTTP load balancing called Ingress, defining rules for external connectivity to Kubernetes services. Users who want external access to their Kubernetes services create an ingress resource that defines rules, including the URI path, backing service name, and other information. The Ingress controller can then automatically program a frontend load balancer to enable Ingress configuration.\nExample Ingress Configuration apiVersion: networking.k8s.io/v1beta1 kind: Ingress metadata: name: vuejs-ingress spec: rules: - host: test.ingress.\u003cGARDENER-CLUSTER\u003e.\u003cGARDENER-PROJECT\u003e.shoot.canary.k8s-hana.ondemand.com http: paths: - backend: serviceName: vuejs-svc servicePort: 8080 where:\n \u003cGARDENER-CLUSTER\u003e: The cluster name in the Gardener \u003cGARDENER-PROJECT\u003e: You project name in the Gardener Diagnosing the Problem The ingress controller we are using is NGINX. NGINX is a software load balancer, web server, and content cache built on top of open source NGINX.\nNGINX caches the content as specified in the HTTP header. 
If the HTTP header is missing, the content is assumed to be cacheable forever and, in the worst case, NGINX never updates it.\nSolution In general, you can avoid this pitfall with one of the solutions below:\n Use a cache buster + HTTP-Cache-Control (preferred) Use HTTP-Cache-Control with a lower retention period Disable the caching in the ingress (just for dev purposes) Learning how to set the HTTP header or set up a cache buster is left to you, as an exercise for your web framework (e.g., Express/NodeJS, SpringBoot, …)\nHere is an example of how to disable caching for your ingress, done with an annotation in your ingress YAML (during development).\n--- apiVersion: networking.k8s.io/v1beta1 kind: Ingress metadata: annotations: ingress.kubernetes.io/cache-enable: \"false\" name: vuejs-ingress spec: rules: - host: test.ingress.\u003cGARDENER-CLUSTER\u003e.\u003cGARDENER-PROJECT\u003e.shoot.canary.k8s-hana.ondemand.com http: paths: - backend: serviceName: vuejs-svc servicePort: 8080 ","categories":"","description":"Why is my application always outdated?","excerpt":"Why is my application always outdated?","ref":"/docs/guides/applications/service-cache-control/","tags":"","title":"Out-Dated HTML and JS Files Delivered"},{"body":"Machine Controller Manager CORE – ./machine-controller-manager(provider independent) Out of tree : Machine controller (provider specific) MCM is a set of controllers:\n Machine Deployment Controller\n Machine Set Controller\n Machine Controller\n Machine Safety Controller\n Questions and refactoring Suggestions Refactoring Statement FilePath Status “ConcurrentNodeSyncs” is a bad name - it has nothing to do with node syncs actually. If its value is ’10’ then it will start 10 goroutines (workers) per resource type (machine, machineset, machinedeployment, provider-specific-class, node) - study the different resource types. 
cmd/machine-controller-manager/app/options/options.go pending LeaderElectionConfiguration is very similar to the one present in “client-go/tools/leaderelection/leaderelection.go” - can we simply use the one in client-go instead of defining it again? pkg/options/types.go - MachineControllerManagerConfiguration pending Have all userAgents as constant. Right now there is just one. cmd/app/controllermanager.go pending Shouldn’t run function be defined on MCMServer struct itself? cmd/app/controllermanager.go pending clientcmd.BuildConfigFromFlags falls back to inClusterConfig which will surely not work as that is not the target. Should it not check and exit early? cmd/app/controllermanager.go - run Function pending A more direct way to create an in cluster config is using k8s.io/client-go/rest -\u003e rest.InClusterConfig instead of using clientcmd.BuildConfigFromFlags passing empty arguments and depending upon the implementation to fall back to creating an inClusterConfig. If they change the implementation, then you get affected. cmd/app/controllermanager.go - run Function pending Introduce a method on MCMServer which gets a target KubeConfig and controlKubeConfig or alternatively which creates respective clients. cmd/app/controllermanager.go - run Function pending Why can’t we use Kubernetes.NewConfigOrDie also for kubeClientControl? cmd/app/controllermanager.go - run Function pending I do not see any benefit of client builders actually. All you need to do is pass in a config and then directly use client-go functions to create a client. cmd/app/controllermanager.go - run Function pending Function: getAvailableResources - rename this to getApiServerResources cmd/app/controllermanager.go pending Move the method which waits for API server to be up and ready to a separate method which returns a discoveryClient when the API server is ready. cmd/app/controllermanager.go - getAvailableResources function pending Many methods in client-go used are now deprecated. 
Switch to the ones that are now recommended to be used instead. cmd/app/controllermanager.go - startControllers pending This method needs a general overhaul cmd/app/controllermanager.go - startControllers pending If the design is influenced/copied from KCM then its very different. There are different controller structs defined for deployment, replicaset etc which makes the code much more clearer. You can see “kubernetes/cmd/kube-controller-manager/apps.go” and then follow the trail from there. - agreed needs to be changed in future (if time permits) pkg/controller/controller.go pending I am not sure why “MachineSetControlInterface”, “RevisionControlInterface”, “MachineControlInterface”, “FakeMachineControl” are defined in this file? pkg/controller/controller_util.go pending IsMachineActive - combine the first 2 conditions into one with OR. pkg/controller/controller_util.go pending Minor change - correct the comment, first word should always be the method name. Currently none of the comments have correct names. pkg/controller/controller_util.go pending There are too many deep copies made. What is the need to make another deep copy in this method? You are not really changing anything here. pkg/controller/deployment.go - updateMachineDeploymentFinalizers pending Why can’t these validations be done as part of a validating webhook? pkg/controller/machineset.go - reconcileClusterMachineSet pending Small change to the following if condition. else if is not required a simple else is sufficient. Code1 pkg/controller/machineset.go - reconcileClusterMachineSet pending Why call these inactiveMachines, these are live and running and therefore active. pkg/controller/machineset.go - terminateMachines pending Clarification Statement FilePath Status Why are there 2 versions - internal and external versions? 
General pending Safety controller freezes MCM controllers in the following cases: * Num replicas go beyond a threshold (above the defined replicas) * Target API service is not reachable There seems to be an overlap between DWD and MCM Safety controller. In the meltdown scenario, why is MCM being added to DWD? You could have used the Safety controller for that. General pending All machine resources are v1alpha1 - should we not promote them to beta? V1alpha1 has a different semantic and does not give any confidence to the consumers. cmd/app/controllermanager.go pending Shouldn’t controller manager use context.Context instead of creating a stop channel? - Check if signals (os.Interrupt and SIGTERM) are handled properly. Do not see code where this is handled currently. cmd/app/controllermanager.go pending What is the rationale behind a timeout of 10s? If the API server is not up, should this not just block, as it cannot do anything anyway? Also, if there is an error returned then you exit the MCM which does not make much sense actually as it will be started again and you will again do the poll for the API server to come back up. Forcing an exit of MCM will not have any impact on the reachability of the API server in any way, so why exit? cmd/app/controllermanager.go - getAvailableResources pending There is a very weird check - availableResources[machineGVR] || availableResources[machineSetGVR] || availableResources[machineDeploymentGVR] Shouldn’t this be conjunction instead of disjunction? * What happens if you do not find one or all of these resources? Currently an error log is printed and nothing else is done. MCM can be used outside gardener context where consumers can directly create MachineClass and Machine and not create MachineSet / MachineDeployment. There is no distinction made between context (gardener or outside-gardener). 
cmd/app/controllermanager.go - StartControllers pending Instead of having an empty select {} to block forever, isn’t it better to wait on the stop channel? cmd/app/controllermanager.go - StartControllers pending Do we need provider specific queues and syncs and listers pkg/controller/controller.go pending Why are resource types prefixed with “Cluster”? - not sure , check PR pkg/controller/controller.go pending When will forgetAfterSuccess be false and why? - as per the current code this is never the case. - Himanshu will check cmd/app/controllermanager.go - createWorker pending What is the use of “ExpectationsInterface” and “UIDTrackingContExpectations”? * All expectations related code should be in its own file “expectations.go” and not in this file. pkg/controller/controller_util.go pending Why do we not use lister but directly use the controlMachingClient to get the deployment? Is it because you want to avoid any potential delays caused by update of the local cache held by the informer and accessed by the lister? What is the load on API server due to this? pkg/controller/deployment.go - reconcileClusterMachineDeployment pending Why is this conversion needed? code2 pkg/controller/deployment.go - reconcileClusterMachineDeployment pending A deep copy of machineDeployment is already passed and within the function another deepCopy is made. Any reason for it? pkg/controller/deployment.go - addMachineDeploymentFinalizers pending What is an Status.ObservedGeneration? *Read more about generations and observedGeneration at: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata https://alenkacz.medium.com/kubernetes-operator-best-practices-implementing-observedgeneration-250728868792 Ideally the update to the ObservedGeneration should only be made after successful reconciliation and not before. 
I see that this is just copied from deployment_controller.go as is. pkg/controller/deployment.go - reconcileClusterMachineDeployment pending Why and when will a MachineDeployment be marked as frozen and when will it be un-frozen? pkg/controller/deployment.go - reconcileClusterMachineDeployment pending Shouldn’t the validation of the machine deployment be done during the creation via a validating webhook instead of allowing it to be stored in etcd and then failing the validation during sync? I saw the checks and these can be done via validation webhook. pkg/controller/deployment.go - reconcileClusterMachineDeployment pending RollbackTo has been marked as deprecated. What is the replacement? code3 pkg/controller/deployment.go - reconcileClusterMachineDeployment pending What is the maximum number of machineSet deletions that you could process in a single run? The reason for asking this question is that for every machineSetDeletion a new goroutine is spawned. * Is the Delete call a synchronous call? Which means it blocks till the machineset deletion is triggered which then also deletes the machines (due to cascade-delete and blockOwnerDeletion=true)? pkg/controller/deployment.go - terminateMachineSets pending If there are validation errors or an error when creating the label selector then a nil is returned. In the worker reconcile loop if the return value is nil then it will remove it from the queue (forget + done). What is the way to see any errors? Typically when we describe a resource the errors are displayed. Will these be displayed when we describe a MachineDeployment? pkg/controller/deployment.go - reconcileClusterMachineSet pending If an error is returned by updateMachineSetStatus and it is an IsNotFound error then returning an error will again queue the MachineSet. Is this desired as IsNotFound indicates the MachineSet has been deleted and is no longer there? 
pkg/controller/deployment.go - reconcileClusterMachineSet pending Is machineControl.DeleteMachine a synchronous operation which will wait till the machine has been deleted? Also where is the DeletionTimestamp set on the Machine? Will it be automatically done by the API server? pkg/controller/deployment.go - prepareMachineForDeletion pending Bugs/Enhancements Statement + TODO FilePath Status This defines QPS and Burst for its requests to the KAPI. Check if it would make sense to explicitly define a FlowSchema and PriorityLevelConfiguration to ensure that the requests from this controller are given a well-defined preference. What is the rationale behind deciding these values? pkg/options/types.go - MachineControllerManagerConfiguration pending In function “validateMachineSpec” the fldPath func parameter is never used. pkg/apis/machine/validation/machine.go pending If there is an update failure then this method recursively calls itself without any sort of delay, which could lead to a LOT of load on the API server. (opened: https://github.com/gardener/machine-controller-manager/issues/686) pkg/controller/deployment.go - updateMachineDeploymentFinalizers pending We are updating filteredMachines by invoking syncMachinesNodeTemplates, syncMachinesConfig and syncMachinesClassKind but we do not create any deepCopy here. Everywhere else the general principle is that when you mutate, always make a deepCopy and then mutate the copy instead of the original, as a lister is used and mutating the original changes the cached copy. Fix: SatisfiedExpectations check has been commented out and there is a TODO there to fix it. Is there a PR for this? 
pkg/controller/machineset.go - reconcileClusterMachineSet pending Code references\n1.1 code1 if machineSet.DeletionTimestamp == nil { // manageReplicas is the core machineSet method where scale up/down occurs // It is not called when deletion timestamp is set manageReplicasErr = c.manageReplicas(ctx, filteredMachines, machineSet) ​ } else if machineSet.DeletionTimestamp != nil { //FIX: change this to simple else without the if 1.2 code2 defer dc.enqueueMachineDeploymentAfter(deployment, 10*time.Minute) * `Clarification`: Why is this conversion needed? err = v1alpha1.Convert_v1alpha1_MachineDeployment_To_machine_MachineDeployment(deployment, internalMachineDeployment, nil) 1.3 code3 // rollback is not re-entrant in case the underlying machine sets are updated with a new \t// revision so we should ensure that we won't proceed to update machine sets until we \t// make sure that the deployment has cleaned up its rollback spec in subsequent enqueues. \tif d.Spec.RollbackTo != nil { \treturn dc.rollback(ctx, d, machineSets, machineMap) \t} ","categories":"","description":"","excerpt":"Machine Controller Manager CORE – …","ref":"/docs/other-components/machine-controller-manager/todo/outline/","tags":"","title":"Outline"},{"body":"Extensibility Overview Initially, everything was developed in-tree in the Gardener project. All cloud providers and the configuration for all the supported operating systems were released together with the Gardener core itself. But as the project grew, it got more and more difficult to add new providers and maintain the existing code base. As a consequence and in order to become agile and flexible again, we proposed GEP-1 (Gardener Enhancement Proposal). 
The document describes an out-of-tree extension architecture that keeps the Gardener core logic independent of provider-specific knowledge (similar to what Kubernetes has achieved with out-of-tree cloud providers or with CSI volume plugins).\nBasic Concepts Gardener keeps running in the “garden cluster” and implements the core logic of shoot cluster reconciliation / deletion. Extensions are Kubernetes controllers themselves (like Gardener) and run in the seed clusters. As usual, we try to use Kubernetes wherever applicable. We rely on Kubernetes extension concepts in order to enable extensibility for Gardener. The main ideas of GEP-1 are the following:\n During the shoot reconciliation process, Gardener will write CRDs into the seed cluster that are watched and managed by the extension controllers. They will reconcile (based on the .spec) and report whether everything went well or errors occurred in the CRD’s .status field.\n Gardener keeps deploying the provider-independent control plane components (etcd, kube-apiserver, etc.). However, some of these components might still need little customization by providers, e.g., additional configuration, flags, etc. In this case, the extension controllers register webhooks in order to manipulate the manifests.\n Example 1:\nGardener creates a new AWS shoot cluster and requires the preparation of infrastructure in order to proceed (networks, security groups, etc.). 
It writes the following CRD into the seed cluster:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: Infrastructure metadata: name: infrastructure namespace: shoot--core--aws-01 spec: type: aws providerConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 internal: - 10.250.112.0/22 public: - 10.250.96.0/22 workers: - 10.250.0.0/19 zones: - eu-west-1a dns: apiserver: api.aws-01.core.example.com region: eu-west-1 secretRef: name: my-aws-credentials sshPublicKey: | base64(key) Please note that the .spec.providerConfig is a raw blob and not evaluated or known in any way by Gardener. Instead, it was specified by the user (in the Shoot resource) and just “forwarded” to the extension controller. Only the AWS controller understands this configuration and will now start provisioning/reconciling the infrastructure. It reports the result in the .status field:\nstatus: observedGeneration: ... state: ... lastError: .. lastOperation: ... providerStatus: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus vpc: id: vpc-1234 subnets: - id: subnet-acbd1234 name: workers zone: eu-west-1 securityGroups: - id: sg-xyz12345 name: workers iam: nodesRoleARN: \u003csome-arn\u003e instanceProfileName: foo ec2: keyName: bar Gardener waits until the .status.lastOperation / .status.lastError indicates that the operation reached a final state and either continues with the next step, or stops and reports the potential error. The extension-specific output in .status.providerStatus is - similar to .spec.providerConfig - not evaluated, and simply forwarded to CRDs in subsequent steps.\nExample 2:\nGardener deploys the control plane components into the seed cluster, e.g. the kube-controller-manager deployment with the following flags:\napiVersion: apps/v1 kind: Deployment ... 
spec: template: spec: containers: - command: - /usr/local/bin/kube-controller-manager - --allocate-node-cidrs=true - --attach-detach-reconcile-sync-period=1m0s - --controllers=*,bootstrapsigner,tokencleaner - --cluster-cidr=100.96.0.0/11 - --cluster-name=shoot--core--aws-01 - --cluster-signing-cert-file=/srv/kubernetes/ca/ca.crt - --cluster-signing-key-file=/srv/kubernetes/ca/ca.key - --concurrent-deployment-syncs=10 - --concurrent-replicaset-syncs=10 ... The AWS controller requires some additional flags in order to make the cluster functional. It needs to provide a Kubernetes cloud-config and also some cloud-specific flags. Consequently, it registers a MutatingWebhookConfiguration on Deployments and adds these flags to the container:\n - --cloud-provider=external - --external-cloud-volume-plugin=aws - --cloud-config=/etc/kubernetes/cloudprovider/cloudprovider.conf Of course, it would have needed to create a ConfigMap containing the cloud config and to add the proper volume and volumeMounts to the manifest as well.\n(Please note for this special example: The Kubernetes community is also working on making the kube-controller-manager provider-independent. However, there will most probably be still components other than the kube-controller-manager which need to be adapted by extensions.)\nIf you are interested in writing an extension, or generally in digging deeper to find out the nitty-gritty details of the extension concepts, please read GEP-1. We are truly looking forward to your feedback!\nCurrent Status Meanwhile, the out-of-tree extension architecture of Gardener is in place and has been productively validated. 
We are tracking all internal and external extensions of Gardener in the Gardener Extensions Library repo.\n","categories":"","description":"","excerpt":"Extensibility Overview Initially, everything was developed in-tree in …","ref":"/docs/gardener/extensions/overview/","tags":"","title":"Overview"},{"body":"Setting up the usage environment Setting up the usage environment Important ⚠️ Set KUBECONFIG Replace provider credentials and desired VM configurations Deploy required CRDs and Objects Check current cluster state Important ⚠️ All paths are relative to the root location of this project repository.\n Run the Machine Controller Manager either as described in Setting up a local development environment or Deploying the Machine Controller Manager into a Kubernetes cluster.\n Make sure that the following steps are run before managing machines/ machine-sets/ machine-deploys.\n Set KUBECONFIG Using the existing Kubeconfig, open another Terminal panel/window with the KUBECONFIG environment variable pointing to this Kubeconfig file as shown below,\n$ export KUBECONFIG=\u003cPATH_TO_REPO\u003e/dev/kubeconfig.yaml Replace provider credentials and desired VM configurations Open kubernetes/machine_classes/aws-machine-class.yaml and replace required values there with the desired VM configurations.\nSimilarly, open kubernetes/secrets/aws-secret.yaml and replace userData, providerAccessKeyId, and providerSecretAccessKey with the base64 encoded values of the cloudconfig file, AWS access key id, and AWS secret access key respectively. 
Use the following command to get the base64 encoded value of your details\n$ echo \"sample-cloud-config\" | base64 base64-encoded-cloud-config Do the same for your access key id and secret access key.\nDeploy required CRDs and Objects Create all the required CRDs in the cluster using kubernetes/crds.yaml\n$ kubectl apply -f kubernetes/crds.yaml Create the class template that will be used as a machine template to create VMs using kubernetes/machine_classes/aws-machine-class.yaml\n$ kubectl apply -f kubernetes/machine_classes/aws-machine-class.yaml Create the secret used for the cloud credentials and cloudconfig using kubernetes/secrets/aws-secret.yaml\n$ kubectl apply -f kubernetes/secrets/aws-secret.yaml Check current cluster state Get to know the current cluster state using the following commands,\n Checking aws-machine-class in the cluster $ kubectl get awsmachineclass NAME MACHINE TYPE AMI AGE test-aws t2.large ami-123456 5m Checking kubernetes secrets in the cluster $ kubectl get secret NAME TYPE DATA AGE test-secret Opaque 3 21h Checking kubernetes nodes in the cluster $ kubectl get nodes Lists the default set of nodes attached to your cluster\n Checking Machine Controller Manager machines in the cluster $ kubectl get machine No resources found. Checking Machine Controller Manager machine-sets in the cluster $ kubectl get machineset No resources found. Checking Machine Controller Manager machine-deploys in the cluster $ kubectl get machinedeployment No resources found. ","categories":"","description":"","excerpt":"Setting up the usage environment Setting up the usage environment …","ref":"/docs/other-components/machine-controller-manager/prerequisite/","tags":"","title":"Prerequisite"},{"body":"PriorityClasses in Gardener Clusters Gardener makes use of PriorityClasses to improve the overall robustness of the system. 
In order to benefit from the full potential of PriorityClasses, the gardenlet manages a set of well-known PriorityClasses with fine-granular priority values.\nAll components of the system should use these well-known PriorityClasses instead of creating and using separate ones with arbitrary values, which would compromise the overall goal of using PriorityClasses in the first place. The gardenlet manages the well-known PriorityClasses listed in this document, so that third parties (e.g., Gardener extensions) can rely on them to be present when deploying components to Seed and Shoot clusters.\nThe listed well-known PriorityClasses follow this rough concept:\n Values are close to the maximum that can be declared by the user. This is important to ensure that Shoot system components have higher priority than the workload deployed by end-users. Values have a bit of headroom in between to ensure flexibility when the need for intermediate priority values arises. Values of PriorityClasses created on Seed clusters are lower than the ones on Shoots to ensure that Shoot system components have higher priority than Seed components, if the Seed is backed by a Shoot (ManagedSeed), e.g. coredns should have higher priority than gardenlet. Names simply include the last digits of the value to minimize confusion caused by many (similar) names like critical, importance-high, etc. 
Garden Clusters When using the gardener-operator for managing the garden runtime and virtual cluster, the following PriorityClasses are available:\nPriorityClasses for Garden Control Plane Components Name Priority Associated Components (Examples) gardener-garden-system-critical 999999550 gardener-operator, gardener-resource-manager, istio gardener-garden-system-500 999999500 virtual-garden-etcd-events, virtual-garden-etcd-main, virtual-garden-kube-apiserver, gardener-apiserver gardener-garden-system-400 999999400 virtual-garden-gardener-resource-manager, gardener-admission-controller, Extension Admission Controllers gardener-garden-system-300 999999300 virtual-garden-kube-controller-manager, vpa-admission-controller, etcd-druid, nginx-ingress-controller gardener-garden-system-200 999999200 vpa-recommender, vpa-updater, hvpa-controller, gardener-scheduler, gardener-controller-manager, gardener-dashboard, terminal-controller-manager, gardener-discovery-server, Extension Controllers gardener-garden-system-100 999999100 fluent-operator, fluent-bit, gardener-metrics-exporter, kube-state-metrics, plutono, vali, prometheus-operator, alertmanager-garden, prometheus-garden, blackbox-exporter, prometheus-longterm Seed Clusters PriorityClasses for Seed System Components Name Priority Associated Components (Examples) gardener-system-critical 999998950 gardenlet, gardener-resource-manager, istio-ingressgateway, istiod gardener-system-900 999998900 Extensions, reversed-vpn-auth-server gardener-system-800 999998800 dependency-watchdog-endpoint, dependency-watchdog-probe, etcd-druid, vpa-admission-controller gardener-system-700 999998700 hvpa-controller, vpa-recommender, vpa-updater gardener-system-600 999998600 alertmanager-seed, fluent-operator, fluent-bit, plutono, kube-state-metrics, nginx-ingress-controller, nginx-k8s-backend, prometheus-operator, prometheus-aggregate, prometheus-cache, prometheus-seed, vali gardener-reserve-excess-capacity -5 reserve-excess-capacity (ref) 
PriorityClasses for Shoot Control Plane Components Name Priority Associated Components (Examples) gardener-system-500 999998500 etcd-events, etcd-main, kube-apiserver gardener-system-400 999998400 gardener-resource-manager gardener-system-300 999998300 cloud-controller-manager, cluster-autoscaler, csi-driver-controller, kube-controller-manager, kube-scheduler, machine-controller-manager, terraformer, vpn-seed-server gardener-system-200 999998200 csi-snapshot-controller, csi-snapshot-validation, cert-controller-manager, shoot-dns-service, vpa-admission-controller, vpa-recommender, vpa-updater gardener-system-100 999998100 alertmanager-shoot, plutono, kube-state-metrics, prometheus-shoot, blackbox-exporter, vali, event-logger Shoot Clusters PriorityClasses for Shoot System Components Name Priority Associated Components (Examples) system-node-critical (created by Kubernetes) 2000001000 calico-node, kube-proxy, apiserver-proxy, csi-driver, egress-filter-applier system-cluster-critical (created by Kubernetes) 2000000000 calico-typha, calico-kube-controllers, coredns, vpn-shoot, registry-cache gardener-shoot-system-900 999999900 node-problem-detector gardener-shoot-system-800 999999800 calico-typha-horizontal-autoscaler, calico-typha-vertical-autoscaler gardener-shoot-system-700 999999700 blackbox-exporter, node-exporter gardener-shoot-system-600 999999600 addons-nginx-ingress-controller, addons-nginx-ingress-k8s-backend, kubernetes-dashboard, kubernetes-metrics-scraper ","categories":"","description":"","excerpt":"PriorityClasses in Gardener Clusters Gardener makes use of …","ref":"/docs/gardener/priority-classes/","tags":"","title":"Priority Classes"},{"body":"Prober Overview Prober starts asynchronous and periodic probes for every shoot cluster. The first probe is the api-server probe which checks the reachability of the API Server from the control plane. 
The second probe is the lease probe, which is done after the api server probe is successful and checks if the number of expired node leases is below a certain threshold. If the lease probe fails, it will scale down the dependent kubernetes resources. Once the connectivity to kube-apiserver is reestablished and the number of expired node leases is within the accepted threshold, the prober will then proactively scale up the dependent kubernetes resources it had scaled down earlier. The failure threshold fraction for lease probe and dependent kubernetes resources are defined in configuration that is passed to the prober.\nOrigin In a shoot cluster (a.k.a. data plane) each node runs a kubelet which periodically renews its lease. Leases serve as heartbeats informing Kube Controller Manager that the node is alive. The connectivity between the kubelet and the Kube ApiServer can break for different reasons and not recover in time.\nAs an example, consider a large shoot cluster with several hundred nodes. There is an issue with a NAT gateway on the shoot cluster which prevents the kubelet on any node in the shoot cluster from reaching its control plane Kube ApiServer. As a consequence, Kube Controller Manager transitions the nodes of this shoot cluster to Unknown state.\nMachine Controller Manager, which also runs in the shoot control plane, reacts to any changes to the Node status and then takes action to recover backing VMs/machine(s). It waits for a grace period and then it will begin to replace the unhealthy machine(s) with new ones.\nThis replacement of healthy machines due to broken connectivity between the worker nodes and the control plane Kube ApiServer results in undesired downtimes for customer workloads that were running on these otherwise healthy nodes. 
It is therefore required that there be an actor which detects the connectivity loss between the kubelet and shoot cluster’s Kube ApiServer and proactively scales down components in the shoot control namespace which could further degrade the availability of nodes in the shoot cluster.\nDependency Watchdog Prober in Gardener Prober is a central component which is deployed in the garden namespace in the seed cluster. Control plane components for a shoot are deployed in a dedicated shoot namespace for the shoot within the seed cluster.\n NOTE: If you are not familiar with Gardener components such as seed and shoot, please see the appendix for links.\n Prober periodically probes Kube ApiServer via two separate probes:\n API Server Probe: Local cluster DNS name which resolves to the ClusterIP of the Kube Apiserver Lease Probe: Checks for number of expired leases to be within the specified threshold. The threshold defines the limit after which DWD can say that the kubelets are not able to reach the API server. Behind the scene For all active shoot clusters (which have not been hibernated or deleted or moved to another seed via control-plane-migration), prober will schedule a probe to run periodically. During each run of a probe it will do the following:\n Checks if the Kube ApiServer is reachable via local cluster DNS. This should always succeed and will fail only when the Kube ApiServer has gone down. If the Kube ApiServer is down then there can be no further damage to the existing shoot cluster (barring new requests to the Kube Api Server). Only if the probe is able to reach the Kube ApiServer via local cluster DNS, will it attempt to check the number of expired node leases in the shoot. The node lease renewal is done by the Kubelet, and so we can say that the lease probe is checking if the kubelet is able to reach the API server. If the number of expired node leases reaches the threshold, then the probe fails. 
If and when a lease probe fails, then it will initiate a scale-down operation for dependent resources as defined in the prober configuration. In subsequent runs it will keep performing the lease probe. If it is successful, then it will start the scale-up operation for dependent resources as defined in the configuration. Prober lifecycle A reconciler is registered to listen to all events for Cluster resource.\nWhen a Reconciler receives a request for a Cluster change, it will query the extension kube-api server to get the Cluster resource.\nIn the following cases it will either remove an existing probe for this cluster or skip creating a new probe:\n Cluster is marked for deletion. Hibernation has been enabled for the cluster. There is an ongoing seed migration for this cluster. If a new cluster is created with no workers. If an update is made to the cluster by removing all workers (in other words making it worker-less). If none of the above conditions are true and there is no existing probe for this cluster then a new probe will be created, registered and started.\nProbe failure identification DWD probe can either be a success or it could return an error. If the API server probe fails, the lease probe is not done and the probes will be retried. If the error is a TooManyRequests error due to requests to the Kube-API-Server being throttled, then the probes are retried after a backOff of backOffDurationForThrottledRequests.\nIf the lease probe fails, then the error could be due to failure in listing the leases. In this case, no scaling operations are performed. If the error in listing the leases is a TooManyRequests error due to requests to the Kube-API-Server being throttled, then the probes are retried after a backOff of backOffDurationForThrottledRequests.\nIf there is no error in listing the leases, then the Lease probe fails if the number of expired leases reaches the threshold fraction specified in the configuration. 
A lease is considered expired in the following scenario:\n\ttime.Now() \u003e= lease.Spec.RenewTime + (p.config.KCMNodeMonitorGraceDuration.Duration * expiryBufferFraction) Here, lease.Spec.RenewTime is the time when the current holder of a lease last updated the lease. config is the probe config generated from the configuration and KCMNodeMonitorGraceDuration is the amount of time which KCM allows a running Node to be unresponsive before marking it unhealthy (see ref). expiryBufferFraction is a hard-coded value of 0.75. Using this fraction allows the prober to intervene before KCM marks a node as unknown, while still allowing the kubelet sufficient retries to renew the node lease (the kubelet renews the lease every 10s, see ref).\nAppendix Gardener Reverse Cluster VPN ","categories":"","description":"","excerpt":"Prober Overview Prober starts asynchronous and periodic probes for …","ref":"/docs/other-components/dependency-watchdog/concepts/prober/","tags":"","title":"Prober"},{"body":"Hotfixes This document describes how to contribute hotfixes\n Hotfixes Cherry Picks Prerequisites Initiate a Cherry Pick Cherry Picks This section explains how to initiate cherry picks on hotfix branches within the gardener/dashboard repository.\n Prerequisites Initiate a Cherry Pick Prerequisites Before you initiate a cherry pick, make sure that the following prerequisites are accomplished.\n A pull request merged against the master branch. The hotfix branch exists (check in the branches section). Have the gardener/dashboard repository cloned as follows: the origin remote should point to your fork (alternatively this can be overwritten by passing FORK_REMOTE=\u003cfork-remote\u003e). the upstream remote should point to the Gardener GitHub org (alternatively this can be overwritten by passing UPSTREAM_REMOTE=\u003cupstream-remote\u003e). Have hub installed, e.g. brew install hub assuming you have a standard golang development environment. 
A GitHub token which has permissions to create a PR in an upstream branch. Initiate a Cherry Pick Run the [cherry pick script][cherry-pick-script].\nThis example applies a master branch PR #1824 to the remote branch upstream/hotfix-1.74:\nGITHUB_USER=\u003cyour-user\u003e hack/cherry-pick-pull.sh upstream/hotfix-1.74 1824 Be aware the cherry pick script assumes you have a git remote called upstream that points at the Gardener GitHub org.\n You will need to run the cherry pick script separately for each patch release you want to cherry pick to. Cherry picks should be applied to all active hotfix branches where the fix is applicable.\n When asked for your GitHub password, provide the created GitHub token rather than your actual GitHub password. Refer to https://github.com/github/hub/issues/2655#issuecomment-735836048\n cherry-pick-script\n ","categories":"","description":"","excerpt":"Hotfixes This document describes how to contribute hotfixes\n Hotfixes …","ref":"/docs/dashboard/process/","tags":"","title":"Process"},{"body":"Releases, Features, Hotfixes This document describes how to contribute features or hotfixes, and how new Gardener releases are usually scheduled, validated, etc.\n Releases, Features, Hotfixes Releases Release Responsible Plan Release Validation Contributing New Features or Fixes TODO Statements Deprecations and Backwards-Compatibility Cherry Picks Prerequisites Initiate a Cherry Pick Releases The @gardener-maintainers are trying to provide a new release roughly every other week (depending on their capacity and the stability/robustness of the master branch).\nHotfixes are usually maintained for the latest three minor releases, though there are no fixed release dates.\nRelease Responsible Plan Version Week No Begin Validation Phase Due Date Release Responsible v1.101 Week 31-32 July 29, 2024 August 11, 2024 @rfranzke v1.102 Week 33-34 August 12, 2024 August 25, 2024 @plkokanov v1.103 Week 35-36 August 26, 2024 September 8, 2024 @oliver-goetz 
v1.104 Week 37-38 September 9, 2024 September 22, 2024 @ialidzhikov v1.105 Week 39-40 September 23, 2024 October 6, 2024 @acumino v1.106 Week 41-42 October 7, 2024 October 20, 2024 @timuthy v1.107 Week 43-44 October 21, 2024 November 3, 2024 @LucaBernstein v1.108 Week 45-46 November 4, 2024 November 17, 2024 @shafeeqes v1.109 Week 47-48 November 18, 2024 December 1, 2024 @ary1992 v1.110 Week 48-49 December 2, 2024 December 15, 2024 @ScheererJ v1.111 Week 50-51 December 30, 2024 January 26, 2025 @oliver-goetz v1.112 Week 01-04 January 27, 2025 February 9, 2025 @tobschli v1.113 Week 05-06 February 10, 2025 February 23, 2025 @plkokanov v1.114 Week 07-08 February 24, 2025 March 9, 2025 @rfranzke v1.115 Week 09-10 March 10, 2025 March 23, 2025 @ialidzhikov Apart from the release of the next version, the release responsible is also taking care of potential hotfix releases of the last three minor versions. The release responsible is the main contact person for coordinating new feature PRs for the next minor versions or cherry-pick PRs for the last three minor versions.\n Click to expand the archived release responsible associations! 
Version Week No Begin Validation Phase Due Date Release Responsible v1.17 Week 07-08 February 15, 2021 February 28, 2021 @rfranzke v1.18 Week 09-10 March 1, 2021 March 14, 2021 @danielfoehrKn v1.19 Week 11-12 March 15, 2021 March 28, 2021 @timebertt v1.20 Week 13-14 March 29, 2021 April 11, 2021 @vpnachev v1.21 Week 15-16 April 12, 2021 April 25, 2021 @timuthy v1.22 Week 17-18 April 26, 2021 May 9, 2021 @BeckerMax v1.23 Week 19-20 May 10, 2021 May 23, 2021 @ialidzhikov v1.24 Week 21-22 May 24, 2021 June 5, 2021 @stoyanr v1.25 Week 23-24 June 7, 2021 June 20, 2021 @rfranzke v1.26 Week 25-26 June 21, 2021 July 4, 2021 @danielfoehrKn v1.27 Week 27-28 July 5, 2021 July 18, 2021 @timebertt v1.28 Week 29-30 July 19, 2021 August 1, 2021 @ialidzhikov v1.29 Week 31-32 August 2, 2021 August 15, 2021 @timuthy v1.30 Week 33-34 August 16, 2021 August 29, 2021 @BeckerMax v1.31 Week 35-36 August 30, 2021 September 12, 2021 @stoyanr v1.32 Week 37-38 September 13, 2021 September 26, 2021 @vpnachev v1.33 Week 39-40 September 27, 2021 October 10, 2021 @voelzmo v1.34 Week 41-42 October 11, 2021 October 24, 2021 @plkokanov v1.35 Week 43-44 October 25, 2021 November 7, 2021 @kris94 v1.36 Week 45-46 November 8, 2021 November 21, 2021 @timebertt v1.37 Week 47-48 November 22, 2021 December 5, 2021 @danielfoehrKn v1.38 Week 49-50 December 6, 2021 December 19, 2021 @rfranzke v1.39 Week 01-04 January 3, 2022 January 30, 2022 @ialidzhikov, @timuthy v1.40 Week 05-06 January 31, 2022 February 13, 2022 @BeckerMax v1.41 Week 07-08 February 14, 2022 February 27, 2022 @plkokanov v1.42 Week 09-10 February 28, 2022 March 13, 2022 @kris94 v1.43 Week 11-12 March 14, 2022 March 27, 2022 @rfranzke v1.44 Week 13-14 March 28, 2022 April 10, 2022 @timebertt v1.45 Week 15-16 April 11, 2022 April 24, 2022 @acumino v1.46 Week 17-18 April 25, 2022 May 8, 2022 @ialidzhikov v1.47 Week 19-20 May 9, 2022 May 22, 2022 @shafeeqes v1.48 Week 21-22 May 23, 2022 June 5, 2022 @ary1992 v1.49 Week 23-24 June 6, 2022 June 
19, 2022 @plkokanov v1.50 Week 25-26 June 20, 2022 July 3, 2022 @rfranzke v1.51 Week 27-28 July 4, 2022 July 17, 2022 @timebertt v1.52 Week 29-30 July 18, 2022 July 31, 2022 @acumino v1.53 Week 31-32 August 1, 2022 August 14, 2022 @kris94 v1.54 Week 33-34 August 15, 2022 August 28, 2022 @ialidzhikov v1.55 Week 35-36 August 29, 2022 September 11, 2022 @oliver-goetz v1.56 Week 37-38 September 12, 2022 September 25, 2022 @shafeeqes v1.57 Week 39-40 September 26, 2022 October 9, 2022 @ary1992 v1.58 Week 41-42 October 10, 2022 October 23, 2022 @plkokanov v1.59 Week 43-44 October 24, 2022 November 6, 2022 @rfranzke v1.60 Week 45-46 November 7, 2022 November 20, 2022 @acumino v1.61 Week 47-48 November 21, 2022 December 4, 2022 @ialidzhikov v1.62 Week 49-50 December 5, 2022 December 18, 2022 @oliver-goetz v1.63 Week 01-04 January 2, 2023 January 29, 2023 @shafeeqes v1.64 Week 05-06 January 30, 2023 February 12, 2023 @ary1992 v1.65 Week 07-08 February 13, 2023 February 26, 2023 @timuthy v1.66 Week 09-10 February 27, 2023 March 12, 2023 @plkokanov v1.67 Week 11-12 March 13, 2023 March 26, 2023 @rfranzke v1.68 Week 13-14 March 27, 2023 April 9, 2023 @acumino v1.69 Week 15-16 April 10, 2023 April 23, 2023 @oliver-goetz v1.70 Week 17-18 April 24, 2023 May 7, 2023 @ialidzhikov v1.71 Week 19-20 May 8, 2023 May 21, 2023 @shafeeqes v1.72 Week 21-22 May 22, 2023 June 4, 2023 @ary1992 v1.73 Week 23-24 June 5, 2023 June 18, 2023 @timuthy v1.74 Week 25-26 June 19, 2023 July 2, 2023 @oliver-goetz v1.75 Week 27-28 July 3, 2023 July 16, 2023 @rfranzke v1.76 Week 29-30 July 17, 2023 July 30, 2023 @plkokanov v1.77 Week 31-32 July 31, 2023 August 13, 2023 @ialidzhikov v1.78 Week 33-34 August 14, 2023 August 27, 2023 @acumino v1.79 Week 35-36 August 28, 2023 September 10, 2023 @shafeeqes v1.80 Week 37-38 September 11, 2023 September 24, 2023 @ScheererJ v1.81 Week 39-40 September 25, 2023 October 8, 2023 @ary1992 v1.82 Week 41-42 October 9, 2023 October 22, 2023 @timuthy v1.83 Week 43-44 
October 23, 2023 November 5, 2023 @oliver-goetz v1.84 Week 45-46 November 6, 2023 November 19, 2023 @rfranzke v1.85 Week 47-48 November 20, 2023 December 3, 2023 @plkokanov v1.86 Week 49-50 December 4, 2023 December 17, 2023 @ialidzhikov v1.87 Week 01-04 January 1, 2024 January 28, 2024 @acumino v1.88 Week 05-06 January 29, 2024 February 11, 2024 @timuthy v1.89 Week 07-08 February 12, 2024 February 25, 2024 @ScheererJ v1.90 Week 09-10 February 26, 2024 March 10, 2024 @ary1992 v1.91 Week 11-12 March 11, 2024 March 24, 2024 @shafeeqes v1.92 Week 13-14 March 25, 2024 April 7, 2024 @oliver-goetz v1.93 Week 15-16 April 8, 2024 April 21, 2024 @rfranzke v1.94 Week 17-18 April 22, 2024 May 5, 2024 @plkokanov v1.95 Week 19-20 May 6, 2024 May 19, 2024 @ialidzhikov v1.96 Week 21-22 May 20, 2024 June 2, 2024 @acumino v1.97 Week 23-24 June 3, 2024 June 16, 2024 @timuthy v1.98 Week 25-26 June 17, 2024 June 30, 2024 @ScheererJ v1.99 Week 27-28 July 1, 2024 July 14, 2024 @ary1992 v1.100 Week 29-30 July 15, 2024 July 28, 2024 @shafeeqes Release Validation The release phase for a new minor version lasts two weeks. Typically, the first week is used for the validation of the release. This phase includes the following steps:\n master (or latest release-* branch) is deployed to a development landscape that already hosts some existing seed and shoot clusters. An extended test suite is triggered by the “release responsible” which: executes the Gardener integration tests for different Kubernetes versions, infrastructures, and Shoot settings. executes the Kubernetes conformance tests. executes further tests like Kubernetes/OS patch/minor version upgrades. Additionally, every four hours (or on demand) more tests (e.g., including the Kubernetes e2e test suite) are executed for different infrastructures. The “release responsible” is verifying new features or other notable changes (derived of the draft release notes) in this development system. 
Usually, the new release is triggered in the beginning of the second week if all tests are green, all checks were successful, and if all of the planned verifications were performed by the release responsible.\nContributing New Features or Fixes Please refer to the Gardener contributor guide. Besides a lot of general information, it also provides a checklist for newly created pull requests that may help you to prepare your changes for an efficient review process. If you are contributing a fix or major improvement, please take care to open cherry-pick PRs to all affected and still supported versions once the change is approved and merged in the master branch.\n⚠️ Please ensure that your modifications pass the verification checks (linting, formatting, static code checks, tests, etc.) by executing\nmake verify before filing your pull request.\nThe guide applies for both changes to the master and to any release-* branch. All changes must be submitted via a pull request and be reviewed and approved by at least one code owner.\nTODO Statements Sometimes, TODO statements are being introduced when one cannot follow up immediately with certain tasks or when temporary migration code is required. In order to properly follow-up with such TODOs and to prevent them from piling up without getting attention, the following rules should be followed:\n Each TODO statement should have an associated person and state when it can be removed. Example: // TODO(\u003cgithub-username\u003e): Remove this code after v1.75 has been released. When the task depends on a certain implementation, a GitHub issue should be opened and referenced in the statement. Example: // TODO(\u003cgithub-username\u003e): Remove this code after https://github.com/gardener/gardener/issues/\u003cissue-number\u003e has been implemented. 
The associated person should actively drive the implementation of the referenced issue (unless it cannot be done because of third-party dependencies or conditions) so that the TODO statement does not get stale. TODO statements without actionable tasks or those that are unlikely to ever be implemented (maybe because of very low priorities) should not be specified in the first place. If a TODO is specified, the associated person should make sure to actively follow-up. Deprecations and Backwards-Compatibility In case you have to remove functionality relevant to end-users (e.g., a field or default value in the Shoot API), please connect it with a Kubernetes minor version upgrade. This way, end-users are forced to actively adapt their manifests when they perform their Kubernetes upgrades. For example, the .spec.kubernetes.enableStaticTokenKubeconfig field in the Shoot API is no longer allowed to be set for Kubernetes versions \u003e= 1.27.\nIn case you have to remove or change functionality which cannot be directly connected with a Kubernetes version upgrade, please consider introducing a feature gate. This way, landscape operators can announce the planned changes to their users and communicate a timeline when they plan to activate the feature gate. End-users can then prepare for it accordingly. For example, the fact that changes to kubelet.kubeReserved in the Shoot API will lead to a rolling update of the worker nodes (previously, these changes were updated in-place) is controlled via the NewWorkerPoolHash feature gate.\nIn case you have to remove functionality relevant to Gardener extensions, please deprecate it first, and add a TODO statement to remove it only after at least 9 releases. Do not forget to write a proper release note as part of your pull request. This gives extension developers enough time (~18 weeks) to adapt to the changes (and to release a new version of their extension) before Gardener finally removes the functionality. 
Examples are removing a field in the extensions.gardener.cloud/v1alpha1 API group, or removing a controller in the extensions library.\nIn case you have to run migration code (which is mostly internal), please add a TODO statement to remove it only after 3 releases. This way, we can ensure that the Gardener version skew policy is not violated. For example, the migration code for moving the Prometheus instances under management of prometheus-operator was running for three releases.\n [!TIP] Please revisit the version skew policy.\n Cherry Picks This section explains how to initiate cherry picks on release branches within the gardener/gardener repository.\n Prerequisites Initiate a Cherry Pick Prerequisites Before you initiate a cherry pick, make sure that the following prerequisites are met.\n A pull request merged against the master branch. The release branch exists (check in the branches section). Have the gardener/gardener repository cloned as follows: the origin remote should point to your fork (alternatively this can be overwritten by passing FORK_REMOTE=\u003cfork-remote\u003e). the upstream remote should point to the Gardener GitHub org (alternatively this can be overwritten by passing UPSTREAM_REMOTE=\u003cupstream-remote\u003e). Have hub installed, which is most easily installed via go get github.com/github/hub assuming you have a standard golang development environment. A GitHub token which has permissions to create a PR in an upstream branch. Initiate a Cherry Pick Run the [cherry pick script][cherry-pick-script].\nThis example applies a master branch PR #3632 to the remote branch upstream/release-v3.14:\nGITHUB_USER=\u003cyour-user\u003e hack/cherry-pick-pull.sh upstream/release-v3.14 3632 Be aware the cherry pick script assumes you have a git remote called upstream that points at the Gardener GitHub org.\n You will need to run the cherry pick script separately for each patch release you want to cherry pick to. 
Cherry picks should be applied to all active release branches where the fix is applicable.\n When asked for your GitHub password, provide the created GitHub token rather than your actual GitHub password. Refer to https://github.com/github/hub/issues/2655#issuecomment-735836048\n cherry-pick-script\n ","categories":"","description":"","excerpt":"Releases, Features, Hotfixes This document describes how to contribute …","ref":"/docs/gardener/process/","tags":"","title":"Process"},{"body":"Profiling Gardener Components Similar to Kubernetes, Gardener components support profiling using standard Go tools for analyzing CPU and memory usage by different code sections and more. This document shows how to enable and use profiling handlers with Gardener components.\nHow profiling handlers are enabled and the ports on which they are exposed differ between components. However, once the handlers are enabled, they provide profiles via the same HTTP endpoint paths, from which you can retrieve them via curl/wget or directly using go tool pprof. 
(You might need to use kubectl port-forward in order to access HTTP endpoints of Gardener components running in clusters.)\nFor example (gardener-controller-manager):\n$ curl http://localhost:2718/debug/pprof/heap \u003e /tmp/heap-controller-manager $ go tool pprof /tmp/heap-controller-manager Type: inuse_space Time: Sep 3, 2021 at 10:05am (CEST) Entering interactive mode (type \"help\" for commands, \"o\" for options) (pprof) or\n$ go tool pprof http://localhost:2718/debug/pprof/heap Fetching profile over HTTP from http://localhost:2718/debug/pprof/heap Saved profile in /Users/timebertt/pprof/pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.008.pb.gz Type: inuse_space Time: Sep 3, 2021 at 10:05am (CEST) Entering interactive mode (type \"help\" for commands, \"o\" for options) (pprof) gardener-apiserver gardener-apiserver provides the same flags as kube-apiserver for enabling profiling handlers (enabled by default):\n--contention-profiling Enable lock contention profiling, if profiling is enabled --profiling Enable profiling via web interface host:port/debug/pprof/ (default true) The handlers are served on the same port as the API endpoints (configured via --secure-port). This means that you will also have to authenticate against the API server according to the configured authentication and authorization policy.\ngardener-{admission-controller,controller-manager,scheduler,resource-manager}, gardenlet gardener-controller-manager, gardener-admission-controller, gardener-scheduler, gardener-resource-manager and gardenlet also allow enabling profiling handlers via their respective component configs (currently disabled by default). Here is an example for the gardener-admission-controller’s configuration and how to enable it (it looks similar for the other components):\napiVersion: admissioncontroller.config.gardener.cloud/v1alpha1 kind: AdmissionControllerConfiguration # ... 
server: metrics: port: 2723 debugging: enableProfiling: true enableContentionProfiling: true However, the handlers are served on the same port as configured in server.metrics.port via HTTP.\nFor example (gardener-admission-controller):\n$ curl http://localhost:2723/debug/pprof/heap \u003e /tmp/heap $ go tool pprof /tmp/heap ","categories":"","description":"","excerpt":"Profiling Gardener Components Similar to Kubernetes, Gardener …","ref":"/docs/gardener/monitoring/profiling/","tags":"","title":"Profiling"},{"body":"Project Operations This section demonstrates how to use the standard Kubernetes tool for cluster operation kubectl for common cluster operations with emphasis on Gardener resources. For more information on kubectl, see kubectl on kubernetes.io.\n Project Operations Prerequisites Using kubeconfig for remote project operations Downloading your kubeconfig List Gardener API resources Check your permissions Working with projects Working with clusters List project clusters Create a new cluster Delete cluster Get kubeconfig for a Shoot Cluster Related Links Prerequisites You’re logged on to the Gardener Dashboard. You’ve created a cluster and its status is operational. It’s recommended that you get acquainted with the resources in the Gardener API.\nUsing kubeconfig for remote project operations The kubeconfig for project operations is different from the one for cluster operations. It has a larger scope and allows a different set of operations that are applicable for a project administrator role, such as lifecycle control on clusters and managing project members.\nDepending on your goal, you can create a service account suitable for automation and use it for your pipelines, or you can get a user-specific kubeconfig and use it to manage your project resources via kubectl.\nDownloading your kubeconfig Kubernetes doesn’t offer its own resource type for human users that access the API server. 
Instead, you either have to manage unique user strings, or use an OpenID-Connect (OIDC) compatible Identity Provider (IDP) to do the job.\nOnce the latter is set up, each Gardener user can use the kubelogin plugin for kubectl to authenticate against the API server:\n Set up kubelogin if you don’t have it yet. More information: kubelogin setup.\n Open the menu at the top right of the screen, then choose MY ACCOUNT.\n On the Access card, choose the arrow to see all options for the personalized command-line interface access.\n The personal bearer token that is also offered here only provides access for a limited amount of time for one time operations, for example, in curl commands. The kubeconfig provided for the personalized access is used by kubelogin to grant access to the Gardener API for the user permanently by using a refresh token.\n Check that the right Project is chosen and keep the settings otherwise. Download the kubeconfig file and add its path to the KUBECONFIG environment variable.\n You can now execute kubectl commands on the garden cluster using the identity of your user.\n Note: You can also manage your Gardener project resources automatically using a Gardener service account. 
For more information, see Automating Project Resource Management.\n List Gardener API resources Using a kubeconfig for project operations, you can list the Gardener API resources using the following command:\nkubectl api-resources | grep garden The response looks like this:\nbackupbuckets bbc core.gardener.cloud false BackupBucket backupentries bec core.gardener.cloud true BackupEntry cloudprofiles cprofile,cpfl core.gardener.cloud false CloudProfile controllerinstallations ctrlinst core.gardener.cloud false ControllerInstallation controllerregistrations ctrlreg core.gardener.cloud false ControllerRegistration plants pl core.gardener.cloud true Plant projects core.gardener.cloud false Project quotas squota core.gardener.cloud true Quota secretbindings sb core.gardener.cloud true SecretBinding seeds core.gardener.cloud false Seed shoots core.gardener.cloud true Shoot shootstates core.gardener.cloud true ShootState terminals dashboard.gardener.cloud true Terminal clusteropenidconnectpresets coidcps settings.gardener.cloud false ClusterOpenIDConnectPreset openidconnectpresets oidcps settings.gardener.cloud true OpenIDConnectPreset Enter the following command to view the Gardener API versions:\nkubectl api-versions | grep garden The response looks like this:\ncore.gardener.cloud/v1alpha1 core.gardener.cloud/v1beta1 dashboard.gardener.cloud/v1alpha1 settings.gardener.cloud/v1alpha1 Check your permissions The operations on project resources are limited by the role of the identity that tries to perform them. 
To get an overview of your permissions, use the following command:\nkubectl auth can-i --list | grep garden The response looks like this:\nplants.core.gardener.cloud [] [] [create delete deletecollection get list patch update watch] quotas.core.gardener.cloud [] [] [create delete deletecollection get list patch update watch] secretbindings.core.gardener.cloud [] [] [create delete deletecollection get list patch update watch] shoots.core.gardener.cloud [] [] [create delete deletecollection get list patch update watch] terminals.dashboard.gardener.cloud [] [] [create delete deletecollection get list patch update watch] openidconnectpresets.settings.gardener.cloud [] [] [create delete deletecollection get list patch update watch] cloudprofiles.core.gardener.cloud [] [] [get list watch] projects.core.gardener.cloud [] [flowering] [get patch update delete] namespaces [] [garden-flowering] [get] Try to execute an operation that you aren’t allowed to perform, for example:\nkubectl get projects You receive an error message like this:\nError from server (Forbidden): projects.core.gardener.cloud is forbidden: User \"system:serviceaccount:garden-flowering:robot\" cannot list resource \"projects\" in API group \"core.gardener.cloud\" at the cluster scope Working with projects You can get the details for a project of which you (or the service account) are a member.\nkubectl get project flowering The response looks like this:\nNAME NAMESPACE STATUS OWNER CREATOR AGE flowering garden-flowering Ready [PROJECT-ADMIN]@domain [PROJECT-ADMIN]@domain system 45m For more information, see Project in the API reference.\n To query the names of the members of a project, use the following command:\nkubectl get project flowering -o jsonpath='{.spec.members[*].name }' The response looks like this:\n[PROJECT-ADMIN]@domain system:serviceaccount:garden-flowering:robot For more information, see members in the API reference.\n Working with clusters The Gardener domain object for a managed cluster is called 
Shoot.\nList project clusters To query the clusters in a project:\nkubectl get shoots The output looks like this:\nNAME CLOUDPROFILE VERSION SEED DOMAIN HIBERNATION OPERATION PROGRESS APISERVER CONTROL NODES SYSTEM AGE geranium aws 1.18.3 aws-eu1 geranium.flowering.shoot.\u003ctruncated\u003e Awake Succeeded 100 True True True True 74m Create a new cluster To create a new cluster using the command line, you need a YAML definition of the Shoot resource.\n To get started, copy the following YAML definition to a new file, for example, daffodil.yaml (or copy file shoot.yaml to daffodil.yaml) and adapt it to your needs.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: daffodil namespace: garden-flowering spec: secretBindingName: trial-secretbinding-gcp cloudProfileName: gcp region: europe-west1 purpose: evaluation provider: type: gcp infrastructureConfig: kind: InfrastructureConfig apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 networks: workers: 10.250.0.0/16 controlPlaneConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 zone: europe-west1-c kind: ControlPlaneConfig workers: - name: cpu-worker maximum: 2 minimum: 1 maxSurge: 1 maxUnavailable: 0 machine: type: n1-standard-2 image: name: coreos version: 2303.3.0 volume: type: pd-standard size: 50Gi zones: - europe-west1-c networking: type: calico pods: 100.96.0.0/11 nodes: 10.250.0.0/16 services: 100.64.0.0/13 maintenance: timeWindow: begin: 220000+0100 end: 230000+0100 autoUpdate: kubernetesVersion: true machineImageVersion: true hibernation: enabled: true schedules: - start: '00 17 * * 1,2,3,4,5' location: Europe/Kiev kubernetes: allowPrivilegedContainers: true kubeControllerManager: nodeCIDRMaskSize: 24 kubeProxy: mode: IPTables version: 1.18.3 addons: nginxIngress: enabled: false kubernetesDashboard: enabled: false In your new YAML definition file, replace the value of field metadata.namespace with your namespace following the convention garden-[YOUR-PROJECTNAME].\n 
Create a cluster using this manifest (with flag --wait=false the command returns immediately, otherwise it doesn’t return until the process is finished):\nkubectl apply -f daffodil.yaml --wait=false The response looks like this:\nshoot.core.gardener.cloud/daffodil created It takes 5–10 minutes until the cluster is created. To watch the progress, get all shoots and use the -w flag.\nkubectl get shoots -w For a more extended example, see Gardener example shoot manifest.\nDelete cluster To delete a shoot cluster, you must first annotate the shoot resource to confirm the operation with confirmation.gardener.cloud/deletion: \"true\":\n Add the annotation to your manifest (daffodil.yaml in the previous example):\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: daffodil namespace: garden-flowering annotations: confirmation.gardener.cloud/deletion: \"true\" spec: addons: ... Apply your changes of daffodil.yaml.\nkubectl apply -f daffodil.yaml The response looks like this:\nshoot.core.gardener.cloud/daffodil configured Trigger the deletion.\nkubectl delete shoot daffodil --wait=false The response looks like this:\nshoot.core.gardener.cloud \"daffodil\" deleted It takes 5–10 minutes to delete the cluster. To watch the progress, get all shoots and use the -w flag.\nkubectl get shoots -w Get kubeconfig for a Shoot Cluster To get the kubeconfig for a shoot cluster in Gardener from the command line, use one of the following methods:\n Using shoots/admin/kubeconfig Subresource:\n You can obtain a temporary admin kubeconfig by using the shoots/admin/kubeconfig subresource. Detailed instructions can be found in the Gardener documentation here. Using gardenctl and gardenlogin: gardenctl simplifies targeting Shoot clusters. It automatically downloads a kubeconfig that uses the gardenlogin kubectl auth plugin. 
This plugin transparently manages Shoot cluster authentication and certificate renewal without embedding any credentials in the kubeconfig file.\n When installing gardenctl via Homebrew or Chocolatey, gardenlogin will be installed as a dependency. Refer to the installation instructions here. Both tools can share the same configuration. To set up the tools, refer to the documentation here. To get the kubeconfig, use either the target or kubeconfig command: Target Command: This command targets the specified Shoot cluster and automatically downloads the kubeconfig.\ngardenctl target --garden landscape-dev --project my-project --shoot my-shoot To set the KUBECONFIG environment variable to point to the downloaded kubeconfig file, use the following command (for bash):\neval $(gardenctl kubectl-env bash) Detailed instructions can be found here.\n Kubeconfig Command: This command directly downloads the kubeconfig for the specified Shoot cluster and outputs it in raw format.\ngardenctl kubeconfig --garden landscape-dev --project my-project --shoot my-shoot --raw Related Links Automating Project Resource Management Authenticating with an Identity Provider. ","categories":"","description":"","excerpt":"Project Operations This section demonstrates how to use the standard …","ref":"/docs/dashboard/project-operations/","tags":"","title":"Project Operations"},{"body":"Extending Project Roles The Project resource allows to specify a list of roles for every member (.spec.members[*].roles). There are a few standard roles defined by Gardener itself. Please consult Projects for further information.\nHowever, extension controllers running in the garden cluster may also create CustomResourceDefinitions that project members might be able to CRUD. 
For this purpose, Gardener also allows to specify extension roles.\nAn extension role is prefixed with extension:, e.g.\napiVersion: core.gardener.cloud/v1beta1 kind: Project metadata: name: dev spec: members: - apiGroup: rbac.authorization.k8s.io kind: User name: alice.doe@example.com role: admin roles: - owner - extension:foo The project controller will, for every extension role, create a ClusterRole with name gardener.cloud:extension:project:\u003cprojectName\u003e:\u003croleName\u003e, i.e., for the above example: gardener.cloud:extension:project:dev:foo. This ClusterRole aggregates other ClusterRoles that are labeled with rbac.gardener.cloud/aggregate-to-extension-role=foo which might be created by extension controllers.\nAn extension that might want to contribute to the core admin or viewer roles can use the labels rbac.gardener.cloud/aggregate-to-project-member=true or rbac.gardener.cloud/aggregate-to-project-viewer=true, respectively.\nPlease note that the names of the extension roles are restricted to 20 characters!\nMoreover, the project controller will also create a corresponding RoleBinding with the same name in the project namespace. It will automatically assign all members that are assigned to this extension role.\n","categories":"","description":"","excerpt":"Extending Project Roles The Project resource allows to specify a list …","ref":"/docs/gardener/extensions/project-roles/","tags":"","title":"Project Roles"},{"body":"Projects The Gardener API server supports a cluster-scoped Project resource which is used for data isolation between individual Gardener consumers. 
For example, each development team has its own project to manage its own shoot clusters.\nEach Project is backed by a Kubernetes Namespace that contains the actual related Kubernetes resources, like Secrets or Shoots.\nExample resource:\napiVersion: core.gardener.cloud/v1beta1 kind: Project metadata: name: dev spec: namespace: garden-dev description: \"This is my first project\" purpose: \"Experimenting with Gardener\" owner: apiGroup: rbac.authorization.k8s.io kind: User name: john.doe@example.com members: - apiGroup: rbac.authorization.k8s.io kind: User name: alice.doe@example.com role: admin # roles: # - viewer # - uam # - serviceaccountmanager # - extension:foo - apiGroup: rbac.authorization.k8s.io kind: User name: bob.doe@example.com role: viewer # tolerations: # defaults: # - key: \u003csome-key\u003e # whitelist: # - key: \u003csome-key\u003e The .spec.namespace field is optional and is initialized if unset. The name of the resulting namespace will be determined based on the Project name and UID, e.g., garden-dev-5aef3. It’s also possible to adopt existing namespaces by labeling them gardener.cloud/role=project and project.gardener.cloud/name=dev beforehand (otherwise, they cannot be adopted).\nWhen deleting a Project resource, the corresponding namespace is also deleted. To keep a namespace after project deletion, an administrator/operator (not Project members!) can annotate the project-namespace with namespace.gardener.cloud/keep-after-project-deletion.\nThe .spec.description and .spec.purpose fields can be used to describe to fellow team members and Gardener operators what this project is used for.\nEach project has one dedicated owner, configured in .spec.owner using the rbac.authorization.k8s.io/v1.Subject type. The owner is the main contact person for Gardener operators. 
Please note that the .spec.owner field is deprecated and will be removed in future API versions in favor of the owner role, see below.\nThe list of members (again a list in .spec.members[] using the rbac.authorization.k8s.io/v1.Subject type) contains all the people that are associated with the project in any way. Each project member must have at least one role (currently described in .spec.members[].role, additional roles can be added to .spec.members[].roles[]). The following roles exist:\n admin: This allows to fully manage resources inside the project (e.g., secrets, shoots, configmaps, and similar). Mind that the admin role has read only access to service accounts. serviceaccountmanager: This allows to fully manage service accounts inside the project namespace and request tokens for them. The permissions of the created service accounts are instead managed by the admin role. Please refer to Service Account Manager. uam: This allows to add/modify/remove human users or groups to/from the project member list. viewer: This allows to read all resources inside the project except secrets. owner: This combines the admin, uam, and serviceaccountmanager roles. Extension roles (prefixed with extension:): Please refer to Extending Project Roles. The project controller inside the Gardener Controller Manager is managing RBAC resources that grant the described privileges to the respective members.\nThere are three central ClusterRoles gardener.cloud:system:project-member, gardener.cloud:system:project-viewer, and gardener.cloud:system:project-serviceaccountmanager that grant the permissions for namespaced resources (e.g., Secrets, Shoots, ServiceAccounts). Via referring RoleBindings created in the respective namespace the project members get bound to these ClusterRoles and, thus, the needed permissions. 
There are also project-specific ClusterRoles granting the permissions for cluster-scoped resources, e.g., the Namespace or Project itself.\nFor each role, the following ClusterRoles, ClusterRoleBindings, and RoleBindings are created:\n Role ClusterRole ClusterRoleBinding RoleBinding admin gardener.cloud:system:project-member:\u003cprojectName\u003e gardener.cloud:system:project-member:\u003cprojectName\u003e gardener.cloud:system:project-member serviceaccountmanager gardener.cloud:system:project-serviceaccountmanager uam gardener.cloud:system:project-uam:\u003cprojectName\u003e gardener.cloud:system:project-uam:\u003cprojectName\u003e viewer gardener.cloud:system:project-viewer:\u003cprojectName\u003e gardener.cloud:system:project-viewer:\u003cprojectName\u003e gardener.cloud:system:project-viewer owner gardener.cloud:system:project:\u003cprojectName\u003e gardener.cloud:system:project:\u003cprojectName\u003e extension:* gardener.cloud:extension:project:\u003cprojectName\u003e:\u003cextensionRoleName\u003e gardener.cloud:extension:project:\u003cprojectName\u003e:\u003cextensionRoleName\u003e User Access Management For Projects created before Gardener v1.8, all admins were allowed to manage other members. Beginning with v1.8, the new uam role is being introduced. It is backed by the manage-members custom RBAC verb which allows to add/modify/remove human users or groups to/from the project member list. Human users are subjects with kind=User and name!=system:serviceaccount:*, and groups are subjects with kind=Group. The management of service account subjects (kind=ServiceAccount or name=system:serviceaccount:*) is not controlled via the uam custom verb but with the standard update/patch verbs for projects.\nAll newly created projects will only bind the owner to the uam role. The owner can still grant the uam role to other members if desired. 
For projects created before Gardener v1.8, the Gardener Controller Manager will migrate all projects to also assign the uam role to all admin members (to not break existing use-cases). The corresponding migration logic is present in Gardener Controller Manager from v1.8 to v1.13. The project owner can gradually remove these roles if desired.\nStale Projects When a project is not actively used for some period of time, it is marked as “stale”. This is done by a controller called “Stale Projects Reconciler”. Once a project is marked as stale, that controller deletes it if it remains unused for a further period of time.\nFour-Eyes-Principle For Resource Deletion In order to delete a Shoot, the deletion must be confirmed upfront with the confirmation.gardener.cloud/deletion=true annotation. Without this annotation being set, gardener-apiserver denies any DELETE request. Still, users sometimes accidentally shoot themselves in the foot, meaning that they accidentally delete a Shoot despite the confirmation requirement.\nTo prevent that (or make it harder, at least), the Project can be configured to apply the dual approval concept for Shoot deletion. This means that the subject confirming the deletion must not be the same as the subject sending the DELETE request.\nExample:\nspec: dualApprovalForDeletion: - resource: shoots selector: matchLabels: {} includeServiceAccounts: true [!NOTE] As of today, core.gardener.cloud/v1beta1.Shoot is the only resource for which this concept is implemented.\n As usual, .spec.dualApprovalForDeletion[].selector.matchLabels={} matches all resources, .spec.dualApprovalForDeletion[].selector.matchLabels=null matches none at all. 
You can also specify an individual label selector if this concept should only apply to a subset of the Shoots in the project (e.g., CI/development clusters shall be excluded).\nThe includeServiceAccounts field (default: true) controls whether the concept also applies when the Shoot deletion confirmation and actual deletion are triggered via ServiceAccounts. Setting it to false spares CI jobs from having to follow this concept as well, which would add complexity/overhead. Alternatively, you could also use two ServiceAccounts, one for confirming the deletion, and another one for actually sending the DELETE request, if desired.\n [!IMPORTANT] Project members can still change the labels of Shoots (or the selector itself) to circumvent the dual approval concept. This concern is intentionally excluded/ignored for now since the principle is not a “security feature” but is only meant to help prevent accidental deletion.\n ","categories":"","description":"Project operations and roles. Four-Eyes-Principle for resource deletion","excerpt":"Project operations and roles. Four-Eyes-Principle for resource …","ref":"/docs/gardener/projects/","tags":"","title":"Projects"},{"body":"Gardener Extension for Alicloud provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the Alicloud provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.30 1.30.0+ Kubernetes 1.29 1.29.0+ Kubernetes 1.28 1.28.0+ Kubernetes 1.27 1.27.0+ Kubernetes 1.26 1.26.0+ Kubernetes 1.25 1.25.0+ Please take a look here to see which versions are supported by Gardener in general.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the Alibaba cloud provider","excerpt":"Gardener extension controller for the Alibaba cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/","tags":"","title":"Provider Alicloud"},{"body":"Gardener Extension for AWS provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the AWS provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.30 1.30.0+ Kubernetes 1.29 1.29.0+ Kubernetes 1.28 1.28.0+ Kubernetes 1.27 1.27.0+ Kubernetes 1.26 1.26.0+ Kubernetes 1.25 1.25.0+ Please take a look here to see which versions are supported by Gardener in general.\nCompatibility The following lists known compatibility issues of this extension controller with other Gardener components.\n AWS Extension Gardener Action Notes \u003c= v1.15.0 \u003ev1.10.0 Please update the provider version to \u003e v1.15.0 or disable the feature gate MountHostCADirectories in the Gardenlet. Applies if feature flag MountHostCADirectories in the Gardenlet is enabled. Shoots with CSI enabled (Kubernetes version \u003e= 1.18) miss a mount to the directory /etc/ssl in the Shoot API Server. This can lead to not trusting external Root CAs when the API Server makes requests via webhooks or OIDC. How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the AWS cloud provider","excerpt":"Gardener extension controller for the AWS cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/","tags":"","title":"Provider AWS"},{"body":"Gardener Extension for Azure provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the Azure provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.31 1.31.0+ N/A Kubernetes 1.30 1.30.0+ Kubernetes 1.29 1.29.0+ Kubernetes 1.28 1.28.0+ Kubernetes 1.27 1.27.0+ Kubernetes 1.26 1.26.0+ Kubernetes 1.25 1.25.0+ Please take a look here to see which versions are supported by Gardener in general.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the Azure cloud provider","excerpt":"Gardener extension controller for the Azure cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/","tags":"","title":"Provider Azure"},{"body":"Gardener Extension for Equinix Metal provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the Equinix Metal provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.30 untested N/A Kubernetes 1.29 untested N/A Kubernetes 1.28 untested N/A Kubernetes 1.27 untested N/A Kubernetes 1.26 untested N/A Kubernetes 1.25 untested N/A Please take a look here to see which versions are supported by Gardener in general.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the Equinix Metal cloud provider","excerpt":"Gardener extension controller for the Equinix Metal cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-equinix-metal/","tags":"","title":"Provider Equinix Metal"},{"body":"Gardener Extension for GCP provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. 
This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the GCP provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.30 1.30.0+ Kubernetes 1.29 1.29.0+ Kubernetes 1.28 1.28.0+ Kubernetes 1.27 1.27.0+ Kubernetes 1.26 1.26.0+ Kubernetes 1.25 1.25.0+ Please take a look here to see which versions are supported by Gardener in general.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the GCP cloud provider","excerpt":"Gardener extension controller for the GCP cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/","tags":"","title":"Provider GCP"},{"body":"Packages:\n local.provider.extensions.gardener.cloud/v1alpha1 local.provider.extensions.gardener.cloud/v1alpha1 Package v1alpha1 contains the local provider API resources.\nResource Types: CloudProfileConfig WorkerStatus CloudProfileConfig CloudProfileConfig contains provider-specific configuration that is embedded into Gardener’s CloudProfile resource.\n Field Description apiVersion string local.provider.extensions.gardener.cloud/v1alpha1 kind string CloudProfileConfig machineImages []MachineImages MachineImages is the list of machine images that are understood by the controller. It maps logical names and versions to provider-specific identifiers.\n WorkerStatus WorkerStatus contains information about created worker resources.\n Field Description apiVersion string local.provider.extensions.gardener.cloud/v1alpha1 kind string WorkerStatus machineImages []MachineImage (Optional) MachineImages is a list of machine images that have been used in this worker. Usually, the extension controller gets the mapping from name/version to the provider-specific machine image data from the CloudProfile. However, if a version that is still in use gets removed from this componentconfig it cannot reconcile anymore existing Worker resources that are still using this version. 
Hence, it stores the used versions in the provider status to ensure reconciliation is possible.\n MachineImage (Appears on: WorkerStatus) MachineImage is a mapping from logical names and versions to provider-specific machine image data.\n Field Description name string Name is the logical name of the machine image.\n version string Version is the logical version of the machine image.\n image string Image is the image for the machine image.\n MachineImageVersion (Appears on: MachineImages) MachineImageVersion contains a version and a provider-specific identifier.\n Field Description version string Version is the version of the image.\n image string Image is the image for the machine image.\n MachineImages (Appears on: CloudProfileConfig) MachineImages is a mapping from logical names and versions to provider-specific identifiers.\n Field Description name string Name is the logical name of the machine image.\n versions []MachineImageVersion Versions contains versions and a provider-specific identifier.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n local.provider.extensions.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/provider-local/","tags":"","title":"Provider Local"},{"body":"Local Provider Extension The “local provider” extension is used to allow the usage of seed and shoot clusters which run entirely locally without any real infrastructure or cloud provider involved. 
It implements Gardener’s extension contract (GEP-1) and thus comprises several controllers and webhooks acting on resources in seed and shoot clusters.\nThe code is maintained in pkg/provider-local.\nMotivation The motivation for maintaining such an extension is the following:\n 🛡 Output Qualification: Run fast and cost-efficient end-to-end tests, locally and in CI systems (increased confidence ⛑ before merging pull requests) ⚙️ Development Experience: Develop Gardener entirely on a local machine without any external resources involved (improved costs 💰 and productivity 🚀) 🤝 Open Source: Quick and easy setup for a first evaluation of Gardener and a good basis for first contributions Current Limitations The following lists the current limitations of the implementation. Please note that none of them are technical limitations/blockers; they are simply advanced scenarios that we haven’t invested in yet.\n No load balancers for Shoot clusters.\nWe have not yet developed a cloud-controller-manager which could reconcile load balancer Services in the shoot cluster.\n In case a seed cluster with multiple availability zones, i.e. multiple entries in .spec.provider.zones, is used in conjunction with a single-zone shoot control plane, i.e. a shoot cluster without .spec.controlPlane.highAvailability or with .spec.controlPlane.highAvailability.failureTolerance.type set to node, the local address of the API server endpoint needs to be determined manually or via the in-cluster coredns.\nAs the different istio ingress gateway loadbalancers have individual external IP addresses, single-zone shoot control planes can end up in a random availability zone. Having the local host use the coredns in the cluster as name resolver would form a name resolution cycle. 
The tests mitigate the issue by adapting the DNS configuration inside the affected test.\n ManagedSeeds It is possible to deploy ManagedSeeds with provider-local by first creating a Shoot in the garden namespace and then creating a referencing ManagedSeed object.\n Please note that this is only supported by the Skaffold-based setup.\n The corresponding e2e test can be run via:\n./hack/test-e2e-local.sh --label-filter \"ManagedSeed\" Implementation Details The images locally built by Skaffold for the Gardener components which are deployed to this shoot cluster are managed by a container registry in the registry namespace in the kind cluster. provider-local configures this registry as mirror for the shoot by mutating the OperatingSystemConfig and using the default contract for extending the containerd configuration.\nIn order to bootstrap a seed cluster, the gardenlet deploys PersistentVolumeClaims and Services of type LoadBalancer. While storage is supported in shoot clusters by using the local-path-provisioner, load balancers are not supported yet. However, provider-local runs a Service controller which specifically reconciles the seed-related Services of type LoadBalancer. This way, they get an IP and gardenlet can finish its bootstrapping process. 
Note that these IPs are not reachable, however for the sake of developing ManagedSeeds this is sufficient for now.\nAlso, please note that the provider-local extension only gets deployed because of the Always deployment policy in its corresponding ControllerRegistration and because the DNS provider type of the seed is set to local.\nImplementation Details This section contains information about how the respective controllers and webhooks in provider-local are implemented and what their purpose is.\nBootstrapping The Helm chart of the provider-local extension defined in its ControllerDeployment contains a special deployment for a CoreDNS instance in a gardener-extension-provider-local-coredns namespace in the seed cluster.\nThis CoreDNS instance is responsible for enabling the components running in the shoot clusters to be able to resolve the DNS names when they communicate with their kube-apiservers.\nIt contains a static configuration to resolve the DNS names based on local.gardener.cloud to istio-ingressgateway.istio-ingress.svc.\nControllers There are controllers for all resources in the extensions.gardener.cloud/v1alpha1 API group except for BackupBucket and BackupEntrys.\nControlPlane This controller is deploying the local-path-provisioner as well as a related StorageClass in order to support PersistentVolumeClaims in the local shoot cluster. Additionally, it creates a few (currently unused) dummy secrets (CA, server and client certificate, basic auth credentials) for the sake of testing the secrets manager integration in the extensions library.\nDNSRecord The controller adapts the cluster internal DNS configuration by extending the coredns configuration for every observed DNSRecord. 
It will add two corresponding entries in the custom DNS configuration per shoot cluster:\ndata: api.local.local.external.local.gardener.cloud.override: | rewrite stop name regex api.local.local.external.local.gardener.cloud istio-ingressgateway.istio-ingress.svc.cluster.local answer auto api.local.local.internal.local.gardener.cloud.override: | rewrite stop name regex api.local.local.internal.local.gardener.cloud istio-ingressgateway.istio-ingress.svc.cluster.local answer auto Infrastructure This controller generates a NetworkPolicy which allows the control plane pods (like kube-apiserver) to communicate with the worker machine pods (see Worker section).\nNetwork This controller is not implemented anymore. In the initial version of provider-local, there was a Network controller deploying kindnetd (see release v1.44.1). However, we decided to drop it because this setup prevented us from using NetworkPolicys (kindnetd does not ship a NetworkPolicy controller). In addition, we had issues with shoot clusters having more than one node (hence, we couldn’t support rolling updates, see PR #5666).\nOperatingSystemConfig This controller renders a simple cloud-init template which can later be executed by the shoot worker nodes.\nThe shoot worker nodes are Pods with a container based on the kindest/node image. This is maintained in the gardener/machine-controller-manager-provider-local repository and has a special run-userdata systemd service which executes the cloud-init generated earlier by the OperatingSystemConfig controller.\nWorker This controller leverages the standard generic Worker actuator in order to deploy the machine-controller-manager as well as the machine-controller-manager-provider-local.\nAdditionally, it generates the MachineClasses and the MachineDeployments based on the specification of the Worker resources.\nIngress The gardenlet creates a wildcard DNS record for the Seed’s ingress domain pointing to the nginx-ingress-controller’s LoadBalancer. 
This domain is commonly used by all Ingress objects created in the Seed for Seed and Shoot components. As provider-local implements the DNSRecord extension API (see the DNSRecord section), this controller reconciles all Ingresss and creates DNSRecords of type local for each host included in spec.rules. This only happens for shoot namespaces (gardener.cloud/role=shoot label) to make Ingress domains resolvable on the machine pods.\nService This controller reconciles Services of type LoadBalancer in the local Seed cluster. Since the local Kubernetes clusters used as Seed clusters typically don’t support such services, this controller sets the .status.loadBalancer.ingress[0].ip to the IP of the host. It makes important LoadBalancer Services (e.g. istio-ingress/istio-ingressgateway and garden/nginx-ingress-controller) available to the host by setting spec.ports[].nodePort to well-known ports that are mapped to hostPorts in the kind cluster configuration.\nistio-ingress/istio-ingressgateway is set to be exposed on nodePort 30433 by this controller.\nIn case the seed has multiple availability zones (.spec.provider.zones) and it uses SNI, the different zone-specific istio-ingressgateway loadbalancers are exposed via different IP addresses. Per default, IP addresses 172.18.255.10, 172.18.255.11, and 172.18.255.12 are used for the zones 0, 1, and 2 respectively.\nETCD Backups This controller reconciles the BackupBucket and BackupEntry of the shoot allowing the etcd-backup-restore to create and copy backups using the local provider functionality. The backups are stored on the host file system. This is achieved by mounting that directory to the etcd-backup-restore container.\nExtension Seed This controller reconciles Extensions of type local-ext-seed. It creates a single serviceaccount named local-ext-seed in the shoot’s namespace in the seed. The extension is reconciled before the kube-apiserver. 
More on extension lifecycle strategies can be read in Registering Extension Controllers.\nExtension Shoot This controller reconciles Extensions of type local-ext-shoot. It creates a single serviceaccount named local-ext-shoot in the kube-system namespace of the shoot. The extension is reconciled after the kube-apiserver. More on extension lifecycle strategies can be read in Registering Extension Controllers.\nExtension Shoot After Worker This controller reconciles Extensions of type local-ext-shoot-after-worker. It creates a deployment named local-ext-shoot-after-worker in the kube-system namespace of the shoot. The extension is reconciled after the workers and waits until the deployment is ready. More on extension lifecycle strategies can be read in Registering Extension Controllers.\nHealth Checks The health check controller leverages the health check library in order to:\n check the health of the ManagedResource/extension-controlplane-shoot-webhooks and populate the SystemComponentsHealthy condition in the ControlPlane resource. check the health of the ManagedResource/extension-networking-local and populate the SystemComponentsHealthy condition in the Network resource. check the health of the ManagedResource/extension-worker-mcm-shoot and populate the SystemComponentsHealthy condition in the Worker resource. check the health of the Deployment/machine-controller-manager and populate the ControlPlaneHealthy condition in the Worker resource. check the health of the Nodes and populate the EveryNodeReady condition in the Worker resource. Webhooks Control Plane This webhook reacts on the OperatingSystemConfig containing the configuration of the kubelet and sets the failSwapOn to false (independent of what is configured in the Shoot spec) (ref).\nDNS Config This webhook reacts on events for the dependency-watchdog-probe Deployment, the blackbox-exporter Deployment, as well as on events for Pods created when the machine-controller-manager reconciles Machines. 
All these pods need to be able to resolve the DNS names for shoot clusters. It sets the .spec.dnsPolicy=None and .spec.dnsConfig.nameServers to the cluster IP of the coredns Service created in the gardener-extension-provider-local-coredns namespace so that these pods can resolve the DNS records for shoot clusters (see the Bootstrapping section for more details).\nMachine Controller Manager This webhook mutates the global ClusterRole related to machine-controller-manager and injects permissions for Service resources. The machine-controller-manager-provider-local deploys Pods for each Machine (while real infrastructure providers deploy VMs, so no Kubernetes resources directly). It also deploys a Service for these machine pods, and in order to do so, the ClusterRole must allow the needed permissions for Service resources.\nNode This webhook reacts on updates to nodes/status in both seed and shoot clusters and sets the .status.{allocatable,capacity}.cpu=\"100\" and .status.{allocatable,capacity}.memory=\"100Gi\" fields.\nBackground: Typically, the .status.{capacity,allocatable} values are determined by the resources configured for the Docker daemon (see for example the docker Quick Start Guide for Mac). Since many of the Pods deployed by Gardener have quite high .spec.resources.requests, the Nodes easily get filled up and only a few Pods can be scheduled (even if they barely consume any of their reserved resources). In order to improve the user experience, on startup/leader election the provider-local extension submits an empty patch which triggers the “node webhook” (see the below section) for the seed cluster. The webhook will increase the capacity of the Nodes to allow all Pods to be scheduled. 
For the shoot clusters, this empty patch trigger is not needed since the MutatingWebhookConfiguration is reconciled by the ControlPlane controller and exists before the Node object gets registered.\nShoot This webhook reacts on the ConfigMap used by the kube-proxy and sets the maxPerCore field to 0 since other values don’t work well in conjunction with the kindest/node image which is used as base for the shoot worker machine pods (ref).\nDNS Configuration for Multi-Zonal Seeds In case a seed cluster has multiple availability zones as specified in .spec.provider.zones, multiple istio ingress gateways are deployed, one per availability zone in addition to the default deployment. The result is that single-zone shoot control planes, i.e. shoot clusters without .spec.controlPlane.highAvailability or with .spec.controlPlane.highAvailability.failureTolerance.type set to node, may be exposed via any of the zone-specific istio ingress gateways. Previously, the endpoints were statically mapped via /etc/hosts. Unfortunately, this is no longer possible due to the aforementioned dynamic endpoint selection.\nFor multi-zonal seed clusters, there is an additional configuration following coredns’s view plugin mapping the external IP addresses of the zone-specific loadbalancers to the corresponding internal istio ingress gateway domain names. This configuration is only in place for requests from outside of the seed cluster. Those requests are currently being identified by the protocol. UDP requests are interpreted as originating from within the seed cluster while TCP requests are assumed to come from outside the cluster via the docker hostport mapping.\nThe corresponding test sets the DNS configuration accordingly so that the name resolution during the test uses coredns in the cluster.\nFuture Work Future work could mostly focus on resolving the above listed limitations, i.e.:\n Implement a cloud-controller-manager and deploy it via the ControlPlane controller. 
Properly implement .spec.machineTypes in the CloudProfiles (i.e., configure .spec.resources properly for the created shoot worker machine pods). ","categories":"","description":"","excerpt":"Local Provider Extension The “local provider” extension is used to …","ref":"/docs/gardener/extensions/provider-local/","tags":"","title":"Provider Local"},{"body":"Gardener Extension for OpenStack provider \nProject Gardener implements the automated management and operation of Kubernetes clusters as a service. Its main principle is to leverage Kubernetes concepts for all of its tasks.\nRecently, most of the vendor specific logic has been developed in-tree. However, the project has grown to a size where it is very hard to extend, maintain, and test. With GEP-1 we have proposed how the architecture can be changed in a way to support external controllers that contain their very own vendor specifics. This way, we can keep Gardener core clean and independent.\nThis controller implements Gardener’s extension contract for the OpenStack provider.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\nSupported Kubernetes versions This extension controller supports the following Kubernetes versions:\n Version Support Conformance test results Kubernetes 1.31 1.31.0+ N/A Kubernetes 1.30 1.30.0+ Kubernetes 1.29 1.29.0+ Kubernetes 1.28 1.28.0+ Kubernetes 1.27 1.27.0+ Kubernetes 1.26 1.26.0+ Kubernetes 1.25 1.25.0+ Please take a look here to see which versions are supported by Gardener in general.\n Compatibility The following lists known compatibility issues of this extension controller with other Gardener components.\n OpenStack Extension Gardener Action Notes \u003c v1.12.0 \u003e v1.10.0 Please update the provider version to \u003e= v1.12.0 or disable the feature gate MountHostCADirectories in the Gardenlet. 
Applies if feature flag MountHostCADirectories in the Gardenlet is enabled. This is to prevent duplicate volume mounts to /usr/share/ca-certificates in the Shoot API Server. How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start.\nStatic code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io GEP-1 (Gardener Enhancement Proposal) on extensibility GEP-4 (New core.gardener.cloud/v1beta1 API) Extensibility API documentation Gardener Extensions Golang library Gardener API Reference ","categories":"","description":"Gardener extension controller for the OpenStack cloud provider","excerpt":"Gardener extension controller for the OpenStack cloud provider","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-openstack/","tags":"","title":"Provider Openstack"},{"body":"Overview When opening a pull request, it is best to give all the necessary details in order to help the reviewers understand your changes and why you are proposing them. 
Here is the template that you will need to fill out:\n**What this PR does / why we need it**: \u003c!-- Describe the purpose of this PR and what changes have been proposed in it --\u003e **Which issue(s) this PR fixes**: Fixes # \u003c!-- If you are opening a PR in response to a specific issue, linking it will automatically close the issue once the PR has been merged --\u003e **Special notes for your reviewer**: \u003c!-- Any additional information your reviewer might need to know to better process your PR --\u003e **Release note**: \u003c!-- Write your release note: 1.Enter your release note in the below block. 2.If no release note is required, just write \"NONE\" within the block. Format of block header: \u003ccategory\u003e \u003ctarget_group\u003e Possible values: - category: improvement|noteworthy|action - target_group: user|operator|developer --\u003e ```other operator EXAMPLE \\``` Writing Release Notes Some guidelines and tips for writing release notes include:\n Be as descriptive as needed. Only use lists if you are describing multiple different additions. You can freely use markdown formatting, including links. You can find various examples in the Releases sections of the gardener/documentation and gardener/gardener repositories.\n","categories":"","description":"","excerpt":"Overview When opening a pull request, it is best to give all the …","ref":"/docs/contribute/documentation/pr-description/","tags":"","title":"Pull Request Description"},{"body":"Readiness of Shoot Worker Nodes Background When registering new Nodes, kubelet adds the node.kubernetes.io/not-ready taint to prevent scheduling workload Pods to the Node until the Ready condition gets True. However, the kubelet does not consider the readiness of node-critical Pods. 
Hence, the Ready condition might get True and the node.kubernetes.io/not-ready taint might get removed, for example, before the CNI daemon Pod (e.g., calico-node) has successfully placed the CNI binaries on the machine.\nThis problem has been discussed extensively in kubernetes, e.g., in kubernetes/kubernetes#75890. However, several proposals have been rejected because the problem can be solved by using the --register-with-taints kubelet flag and dedicated controllers (ref).\nImplementation in Gardener Gardener makes sure that workload Pods are only scheduled to Nodes where all node-critical components required for running workload Pods are ready. For this, Gardener follows the proposed solution by the Kubernetes community and registers new Node objects with the node.gardener.cloud/critical-components-not-ready taint (effect NoSchedule). gardener-resource-manager’s Node controller reacts on newly created Node objects that have this taint. The controller removes the taint once all node-critical Pods are ready (determined by checking the Pods’ Ready conditions).\nThe Node controller considers all DaemonSets and Pods with the label node.gardener.cloud/critical-component=true as node-critical. If there are DaemonSets that contain the node.gardener.cloud/critical-component=true label in their metadata and in their Pod template, the Node controller waits for corresponding daemon Pods to be scheduled and to get ready before removing the taint.\nAdditionally, the Node controller checks for the readiness of csi-driver-node components if a respective Pod indicates that it uses such a driver. This is achieved through a well-defined annotation prefix (node.gardener.cloud/wait-for-csi-node-). For example, the csi-driver-node Pod for Openstack Cinder is annotated with node.gardener.cloud/wait-for-csi-node-cinder=cinder.csi.openstack.org. A key prefix is used instead of a “regular” annotation to allow for multiple CSI drivers being registered by one csi-driver-node Pod. 
The annotation key’s suffix can be chosen arbitrarily (in this case cinder) and the annotation value needs to match the actual driver name as specified in the CSINode object. The Node controller will verify that the used driver is properly registered in this object before removing the node.gardener.cloud/critical-components-not-ready taint. Note that the csi-driver-node Pod still needs to be labelled and tolerate the taint as described above to be considered in this additional check.\nMarking Node-Critical Components To make use of this feature, node-critical DaemonSets and Pods need to:\n Tolerate the node.gardener.cloud/critical-components-not-ready NoSchedule taint. Be labelled with node.gardener.cloud/critical-component=true. csi-driver-node Pods additionally need to:\n Be annotated with node.gardener.cloud/wait-for-csi-node-\u003cname\u003e=\u003cfull-driver-name\u003e. It’s required that these Pods fulfill the above criteria (label and toleration) as well. Gardener already marks components like kube-proxy, apiserver-proxy and node-local-dns as node-critical. Provider extensions mark components like csi-driver-node as node-critical and add the wait-for-csi-node annotation. Network extensions mark components responsible for setting up CNI on worker Nodes (e.g., calico-node) as node-critical. If shoot owners manage any additional node-critical components, they can make use of this feature as well.\n","categories":"","description":"Implementation in Gardener for readiness of Shoot worker Nodes. How to mark components as node-critical","excerpt":"Implementation in Gardener for readiness of Shoot worker Nodes. How to …","ref":"/docs/gardener/node-readiness/","tags":"","title":"Readiness of Shoot Worker Nodes"},{"body":"Reconcile Trigger Gardener dictates the time of reconciliation for resources of the API group extensions.gardener.cloud. It does that by annotating the respective resource with gardener.cloud/operation=reconcile. 
Extension controllers shall react to this annotation and start reconciling the resource. They have to remove this annotation as soon as they begin with their reconcile operation and maintain the status of the extension resource accordingly.\nThe reason for this behaviour is that it is possible to configure Gardener to reconcile only in the shoots’ maintenance time windows. To prevent extension controllers from reconciling outside of the shoot’s maintenance time window, we have introduced this contract. This way extension controllers don’t need to care about when the shoot maintenance time window happens. Gardener keeps control and decides when the shoot shall be reconciled/updated.\nOur extension controller library provides all the required utilities to conveniently implement this behaviour.\n","categories":"","description":"","excerpt":"Reconcile Trigger Gardener dictates the time of reconciliation for …","ref":"/docs/gardener/extensions/reconcile-trigger/","tags":"","title":"Reconcile Trigger"},{"body":"What is impacted during a reconciliation? Infrastructure and DNSRecord reconciliation are only done during usual reconciliation if there were relevant changes. Otherwise, they are only done during maintenance.\nHow do you steer a reconciliation? Reconciliation is bound to the maintenance time window of a cluster. This means that your shoot will be reconciled regularly, without need for input.\nOutside of the maintenance time window your shoot will only reconcile if you change the specification or if you explicitly trigger it. To learn how, see Trigger shoot operations.\n","categories":"","description":"","excerpt":"What is impacted during a reconciliation? 
Infrastructure and DNSRecord …","ref":"/docs/faq/reconciliation-impact/","tags":"","title":"Reconciliation"},{"body":"Recovery from Permanent Quorum Loss in an Etcd Cluster Quorum loss in Etcd Cluster Quorum loss occurs when the majority of Etcd pods (greater than or equal to n/2 + 1) are down simultaneously for some reason.\nThere are two types of quorum loss that can happen to an Etcd multinode cluster:\n Transient quorum loss - A quorum loss is called transient when the majority of Etcd pods are down simultaneously for some time. The pods may be down due to network unavailability, high resource usage, etc. When the pods come back after some time, they can re-join the cluster and quorum is recovered automatically without any manual intervention. There should not be a permanent failure for the majority of etcd pods due to hardware failure or disk corruption.\n Permanent quorum loss - A quorum loss is called permanent when the majority of Etcd cluster members experience permanent failure, whether due to hardware failure or disk corruption, etc. In that case, the etcd cluster is not going to recover automatically from the quorum loss. A human operator will now need to intervene and execute the following steps to recover the multi-node Etcd cluster.\n If permanent quorum loss occurs in a multinode Etcd cluster, the operator needs to note down the PVCs, configmaps, statefulsets, CRs, etc. related to that Etcd cluster and work on those resources only. The following steps guide a human operator to recover from permanent quorum loss of an etcd cluster. We assume the name of the Etcd CR for the Etcd cluster is etcd-main.\nEtcd cluster in shoot control plane of gardener deployment: There are two Etcd clusters running in the shoot control plane. One is named etcd-events and the other is named etcd-main. The operator needs to take care of permanent quorum loss in a specific cluster. 
If permanent quorum loss occurs in the etcd-events cluster, the operator needs to note down the PVCs, configmaps, statefulsets, CRs, etc. related to the etcd-events cluster and work on those resources only.\n⚠️ Note: Please note that manually restoring etcd can result in data loss. This guide is the last resort to bring an Etcd cluster up and running again.\nIf etcd-druid and etcd-backup-restore are being used with gardener, then:\nTarget the control plane of the affected shoot cluster via kubectl. Alternatively, you can use gardenctl to target the control plane of the affected shoot cluster. You can get the details to target the control plane from the Access tile in the shoot cluster details page on the Gardener dashboard. Ensure that you are targeting the correct namespace.\n Add the following annotations to the Etcd resource etcd-main:\n kubectl annotate etcd etcd-main druid.gardener.cloud/suspend-etcd-spec-reconcile=\n kubectl annotate etcd etcd-main druid.gardener.cloud/disable-resource-protection=\n Note down the configmap name that is attached to the etcd-main statefulset. If you describe the statefulset with kubectl describe sts etcd-main, look for lines similar to the following to identify the attached configmap name. 
It will be needed at later stages:\nVolumes: etcd-config-file: Type: ConfigMap (a volume populated by a ConfigMap) Name: etcd-bootstrap-4785b0 Optional: false Alternatively, the related configmap name can be obtained by executing following command as well:\nkubectl get sts etcd-main -o jsonpath='{.spec.template.spec.volumes[?(@.name==\"etcd-config-file\")].configMap.name}'\n Scale down the etcd-main statefulset replicas to 0:\nkubectl scale sts etcd-main --replicas=0\n The PVCs will look like the following on listing them with the command kubectl get pvc:\nmain-etcd-etcd-main-0 Bound pv-shoot--garden--aws-ha-dcb51848-49fa-4501-b2f2-f8d8f1fad111 80Gi RWO gardener.cloud-fast 13d main-etcd-etcd-main-1 Bound pv-shoot--garden--aws-ha-b4751b28-c06e-41b7-b08c-6486e03090dd 80Gi RWO gardener.cloud-fast 13d main-etcd-etcd-main-2 Bound pv-shoot--garden--aws-ha-ff17323b-d62e-4d5e-a742-9de823621490 80Gi RWO gardener.cloud-fast 13d Delete all PVCs that are attached to etcd-main cluster.\nkubectl delete pvc -l instance=etcd-main\n Check the etcd’s member leases. There should be leases starting with etcd-main as many as etcd-main replicas. One of those leases will have holder identity as \u003cetcd-member-id\u003e:Leader and rest of etcd member leases have holder identities as \u003cetcd-member-id\u003e:Member. Please ignore the snapshot leases, i.e., those leases which have the suffix snap.\netcd-main member leases:\n NAME HOLDER AGE etcd-main-0 4c37667312a3912b:Member 1m etcd-main-1 75a9b74cfd3077cc:Member 1m etcd-main-2 c62ee6af755e890d:Leader 1m Delete all etcd-main member leases.\n Edit the etcd-main cluster’s configmap (ex: etcd-bootstrap-4785b0) as follows:\nFind the initial-cluster field in the configmap. 
It should look similar to the following:\n# Initial cluster initial-cluster: etcd-main-0=https://etcd-main-0.etcd-main-peer.default.svc:2380,etcd-main-1=https://etcd-main-1.etcd-main-peer.default.svc:2380,etcd-main-2=https://etcd-main-2.etcd-main-peer.default.svc:2380 Change the initial-cluster field to have only one member (etcd-main-0) in the string. It should now look like this:\n# Initial cluster initial-cluster: etcd-main-0=https://etcd-main-0.etcd-main-peer.default.svc:2380 Scale up the etcd-main statefulset replicas to 1:\nkubectl scale sts etcd-main --replicas=1\n Wait for the single-member etcd cluster to be completely ready.\nkubectl get pods etcd-main-0 will give the following output when ready:\nNAME READY STATUS RESTARTS AGE etcd-main-0 2/2 Running 0 1m Remove the following annotations from the Etcd resource etcd-main:\n kubectl annotate etcd etcd-main druid.gardener.cloud/suspend-etcd-spec-reconcile-\n kubectl annotate etcd etcd-main druid.gardener.cloud/disable-resource-protection-\n Finally, add the following annotation to the Etcd resource etcd-main:\nkubectl annotate etcd etcd-main gardener.cloud/operation='reconcile'\n Verify that the etcd cluster is formed correctly.\nAll the etcd-main pods will have outputs similar to following:\nNAME READY STATUS RESTARTS AGE etcd-main-0 2/2 Running 0 5m etcd-main-1 2/2 Running 0 1m etcd-main-2 2/2 Running 0 1m Additionally, check if the Etcd CR is ready with kubectl get etcd etcd-main:\nNAME READY AGE etcd-main true 13d Additionally, check the leases for 30 seconds at least. There should be leases starting with etcd-main as many as etcd-main replicas. One of those leases will have holder identity as \u003cetcd-member-id\u003e:Leader and rest of those leases have holder identities as \u003cetcd-member-id\u003e:Member. 
The AGE of those leases can also be inspected to identify if those leases were updated in conjunction with the restart of the Etcd cluster: Example:\nNAME HOLDER AGE etcd-main-0 4c37667312a3912b:Member 1m etcd-main-1 75a9b74cfd3077cc:Member 1m etcd-main-2 c62ee6af755e890d:Leader 1m ","categories":"","description":"","excerpt":"Recovery from Permanent Quorum Loss in an Etcd Cluster Quorum loss in …","ref":"/docs/other-components/etcd-druid/recovery-from-permanent-quorum-loss-in-etcd-cluster/","tags":"","title":"Recovery From Permanent Quorum Loss In Etcd Cluster"},{"body":"Topic Title (the topic title can also be placed in the frontmatter)\nContent This section gives the user all the information needed in order to understand the topic.\n Content Type Definition Example Name 1 Definition of Name 1 Relevant link Name 2 Definition of Name 2 Relevant link Related Links Link 1 Link 2 ","categories":"","description":"Describes the contents of a reference topic","excerpt":"Describes the contents of a reference topic","ref":"/docs/contribute/documentation/style-guide/reference_template/","tags":"","title":"Reference Topic Structure"},{"body":"Referenced Resources The Shoot resource can include a list of resources (usually secrets) that can be referenced by name in the extension providerConfig and other Shoot sections, for example:\nkind: Shoot apiVersion: core.gardener.cloud/v1beta1 metadata: name: crazy-botany namespace: garden-dev ... spec: ... extensions: - type: foobar providerConfig: apiVersion: foobar.extensions.gardener.cloud/v1alpha1 kind: FooBarConfig foo: bar secretRef: foobar-secret resources: - name: foobar-secret resourceRef: apiVersion: v1 kind: Secret name: my-foobar-secret Gardener expects to find these referenced resources in the project namespace (e.g. 
garden-dev) and will copy them to the Shoot namespace in the Seed cluster when reconciling a Shoot, adding a prefix to their names to avoid naming collisions with Gardener’s own resources.\nExtension controllers can resolve the references to these resources by accessing the Shoot via the Cluster resource. To properly read a referenced resource, extension controllers should use the utility function GetObjectByReference from the extensions/pkg/controller package, for example:\n ... ref = \u0026autoscalingv1.CrossVersionObjectReference{ APIVersion: \"v1\", Kind: \"Secret\", Name: \"foo\", } secret := \u0026corev1.Secret{} if err := controller.GetObjectByReference(ctx, client, ref, \"shoot--test--foo\", secret); err != nil { return err } // Use secret ... ","categories":"","description":"","excerpt":"Referenced Resources The Shoot resource can include a list of …","ref":"/docs/gardener/extensions/referenced-resources/","tags":"","title":"Referenced Resources"},{"body":"Gardener Extension for Registry Cache \nGardener extension controller which deploys pull-through caches for container registries.\nUsage Configuring the Registry Cache Extension - learn what is the use-case for a pull-through cache, how to enable it and configure it How to provide credentials for upstream repository? 
Configuring the Registry Mirror Extension - learn what is the use-case for a registry mirror, how to enable and configure it Local Setup and Development Deploying Registry Cache Extension Locally - learn how to set up a local development environment Deploying Registry Cache Extension in Gardener’s Local Setup with Provider Extensions - learn how to set up a development environment using own Seed clusters on an existing Kubernetes cluster Developer Docs for Gardener Extension Registry Cache - learn about the inner workings ","categories":"","description":"Gardener extension controller which deploys pull-through caches for container registries.","excerpt":"Gardener extension controller which deploys pull-through caches for …","ref":"/docs/extensions/others/gardener-extension-registry-cache/","tags":"","title":"Registry cache"},{"body":"Overview If you commit sensitive data, such as a kubeconfig.yaml or SSH key into a Git repository, you can remove it from the history. To entirely remove unwanted files from a repository’s history you can use the git filter-branch command.\nThe git filter-branch command rewrites your repository’s history, which changes the SHAs for existing commits that you alter and any dependent commits. Changed commit SHAs may affect open pull requests in your repository. Merging or closing all open pull requests before removing files from your repository is recommended.\nWarning If someone has already checked out the repository, then of course they have the secret on their computer. So ALWAYS revoke the OAuthToken/Password or whatever it was immediately. Purging a File from Your Repository’s History Warning If you run git filter-branch after stashing changes, you won’t be able to retrieve your changes with other stash commands. Before running git filter-branch, we recommend unstashing any changes you’ve made. To unstash the last set of changes you’ve stashed, run git stash show -p | git apply -R. 
For more information, see Git Tools - Stashing and Cleaning. To illustrate how git filter-branch works, we’ll show you how to remove your file with sensitive data from the history of your repository and add it to .gitignore to ensure that it is not accidentally re-committed.\n1. Navigate into the repository’s working directory:\ncd YOUR-REPOSITORY 2. Run the following command, replacing PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA with the path to the file you want to remove, not just its filename.\nThese arguments will:\n Force Git to process, but not check out, the entire history of every branch and tag Remove the specified file, as well as any empty commits generated as a result Overwrite your existing tags git filter-branch --force --index-filter \\ 'git rm --cached --ignore-unmatch PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA' \\ --prune-empty --tag-name-filter cat -- --all 3. Add your file with sensitive data to .gitignore to ensure that you don’t accidentally commit it again:\n echo \"YOUR-FILE-WITH-SENSITIVE-DATA\" \u003e\u003e .gitignore Double-check that you’ve removed everything you wanted to from your repository’s history, and that all of your branches are checked out. Once you’re happy with the state of your repository, continue to the next step.\n4. Force-push your local changes to overwrite your GitHub repository, as well as all the branches you’ve pushed up:\ngit push origin --force --all 5. In order to remove the sensitive file from your tagged releases, you’ll also need to force-push against your Git tags:\ngit push origin --force --tags Warning Tell your collaborators to rebase, not merge, any branches they created off of your old (tainted) repository history. One merge commit could reintroduce some or all of the tainted history that you just went to the trouble of purging. 
Related Links Removing Sensitive Data from a Repository ","categories":"","description":"Never ever commit a kubeconfig.yaml into github","excerpt":"Never ever commit a kubeconfig.yaml into github","ref":"/docs/guides/applications/commit-secret-fail/","tags":"","title":"Remove Committed Secrets in Github 💀"},{"body":"Packages:\n resources.gardener.cloud/v1alpha1 resources.gardener.cloud/v1alpha1 Package v1alpha1 contains the configuration of the Gardener Resource Manager.\nResource Types: ManagedResource ManagedResource describes a list of managed resources.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ManagedResourceSpec Spec contains the specification of this managed resource.\n class string (Optional) Class holds the resource class used to control the responsibility for multiple resource manager instances\n secretRefs []Kubernetes core/v1.LocalObjectReference SecretRefs is a list of secret references.\n injectLabels map[string]string (Optional) InjectLabels injects the provided labels into every resource that is part of the referenced secrets.\n forceOverwriteLabels bool (Optional) ForceOverwriteLabels specifies that all existing labels should be overwritten. Defaults to false.\n forceOverwriteAnnotations bool (Optional) ForceOverwriteAnnotations specifies that all existing annotations should be overwritten. Defaults to false.\n keepObjects bool (Optional) KeepObjects specifies whether the objects should be kept although the managed resource has already been deleted. 
Defaults to false.\n equivalences [][]k8s.io/apimachinery/pkg/apis/meta/v1.GroupKind (Optional) Equivalences specifies possible group/kind equivalences for objects.\n deletePersistentVolumeClaims bool (Optional) DeletePersistentVolumeClaims specifies if PersistentVolumeClaims created by StatefulSets, which are managed by this resource, should also be deleted when the corresponding StatefulSet is deleted (defaults to false).\n status ManagedResourceStatus Status contains the status of this managed resource.\n ManagedResourceSpec (Appears on: ManagedResource) ManagedResourceSpec contains the specification of this managed resource.\n Field Description class string (Optional) Class holds the resource class used to control the responsibility for multiple resource manager instances\n secretRefs []Kubernetes core/v1.LocalObjectReference SecretRefs is a list of secret references.\n injectLabels map[string]string (Optional) InjectLabels injects the provided labels into every resource that is part of the referenced secrets.\n forceOverwriteLabels bool (Optional) ForceOverwriteLabels specifies that all existing labels should be overwritten. Defaults to false.\n forceOverwriteAnnotations bool (Optional) ForceOverwriteAnnotations specifies that all existing annotations should be overwritten. Defaults to false.\n keepObjects bool (Optional) KeepObjects specifies whether the objects should be kept although the managed resource has already been deleted. 
Defaults to false.\n equivalences [][]k8s.io/apimachinery/pkg/apis/meta/v1.GroupKind (Optional) Equivalences specifies possible group/kind equivalences for objects.\n deletePersistentVolumeClaims bool (Optional) DeletePersistentVolumeClaims specifies if PersistentVolumeClaims created by StatefulSets, which are managed by this resource, should also be deleted when the corresponding StatefulSet is deleted (defaults to false).\n ManagedResourceStatus (Appears on: ManagedResource) ManagedResourceStatus is the status of a managed resource.\n Field Description conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition observedGeneration int64 ObservedGeneration is the most recent generation observed for this resource.\n resources []ObjectReference (Optional) Resources is a list of objects that have been created.\n secretsDataChecksum string (Optional) SecretsDataChecksum is the checksum of referenced secrets data.\n ObjectReference (Appears on: ManagedResourceStatus) ObjectReference is a reference to another object.\n Field Description ObjectReference Kubernetes core/v1.ObjectReference (Members of ObjectReference are embedded into this type.) labels map[string]string Labels is a map of labels that were used during last update of the resource.\n annotations map[string]string Annotations is a map of annotations that were used during last update of the resource.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n resources.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/resources/","tags":"","title":"Resources"},{"body":"Restoration of a single member in multi-node etcd deployed by etcd-druid Note:\n For a cluster with n members, we are proposing the solution to only single member restoration within a etcd cluster not the quorum loss scenario (when majority of members within a cluster fail). 
In this proposal, we are not targeting the recovery of a single member that got separated from the cluster due to a network partition. Motivation If a single etcd member within a multi-node etcd cluster goes down due to DB corruption, PVC corruption, or an invalid data-dir, then it needs to be brought back. Unlike in the single-node case, a minority member of a multi-node cluster can’t be restored from the snapshots present in the storage container. The old snapshots contain the metadata information of the cluster, so a member whose DB was restored from them would read stale metadata, and the resulting memberID mismatch prevents the new member from coming up.\nSolution If a backup-restore sidecar detects that its corresponding etcd is down due to data-dir corruption or an invalid data-dir, then backup-restore will first remove the failing etcd member from the cluster using the MemberRemove API call and clean the data-dir of the failed etcd member. This won’t affect the etcd cluster, as quorum is still maintained. After successfully removing the failed etcd member from the cluster, the backup-restore sidecar will try to add a new etcd member to the cluster to restore the previous cluster size. Backup-restore first adds the new member as a learner using the MemberAddAsLearner API call. Once the learner is added to the cluster, is in sync with the leader, and is up-to-date, backup-restore promotes the learner (non-voting member) to a voting member using the MemberPromote API call. So, the failed member first needs to be removed from the cluster and then added as a new member. Example If a 3-member etcd cluster has 1 downed member (due to an invalid data-dir), the cluster can still make forward progress because the quorum is 2. The downed etcd member gets restarted and its corresponding backup-restore sidecar receives an initialization request. Then, the backup-restore sidecar checks for data corruption or an invalid data-dir. 
The backup-restore sidecar detects that the data-dir is invalid and that it’s a multi-node etcd cluster. Then, the backup-restore sidecar removes the downed etcd member from the cluster. The number of members in the cluster becomes 2 and the quorum remains at 2, so it won’t affect the etcd cluster. Clean the data-dir and add a member as a learner (non-voting member). As soon as the learner gets in sync with the leader, promote the learner to a voting member, hence increasing the number of members in the cluster back to 3. ","categories":"","description":"","excerpt":"Restoration of a single member in multi-node etcd deployed by …","ref":"/docs/other-components/etcd-druid/restoring-single-member-in-multi-node-etcd-cluster/","tags":"","title":"Restoring Single Member In Multi Node Etcd Cluster"},{"body":"Reversed VPN Tunnel Setup and Configuration The Reversed VPN Tunnel is enabled by default. A highly available VPN connection is automatically deployed in all shoots that configure an HA control-plane.\nReversed VPN Tunnel In the first VPN solution, connection establishment was initiated by a VPN client in the seed cluster. Due to several issues with this solution, the tunnel establishment direction has been reversed. The client is deployed in the shoot and initiates the connection from there. This way, there is no need to deploy a special purpose loadbalancer for the sake of addressing the data-plane. In addition to saving costs, this is considered the more secure alternative. 
For more information on how this is achieved, please have a look at the following GEP.\nConnection establishment with a reversed tunnel:\nAPIServer --\u003e Envoy-Proxy | VPN-Seed-Server \u003c-- Istio/Envoy-Proxy \u003c-- SNI API Server Endpoint \u003c-- LB (one for all clusters of a seed) \u003c--- internet \u003c--- VPN-Shoot-Client --\u003e Pods | Nodes | Services\nHigh Availability for Reversed VPN Tunnel Shoots which define spec.controlPlane.highAvailability.failureTolerance: {node, zone} get an HA control-plane, including a highly available VPN connection by deploying redundant VPN servers and clients.\nPlease note that it is not possible to move an open connection to another VPN tunnel. Especially long-running commands like kubectl exec -it ... or kubectl logs -f ... will still break if the routing path must be switched because either VPN server or client are not reachable anymore. A new request should be possible within seconds.\nHA Architecture for VPN Establishing a connection from the VPN client on the shoot to the server in the control plane works nearly the same way as in the non-HA case. The only difference is that the VPN client targets one of two VPN servers, represented by two services vpn-seed-server-0 and vpn-seed-server-1 with endpoints in pods with the same name. The VPN tunnel is used by a kube-apiserver to reach nodes, services, or pods in the shoot cluster. In the non-HA case, a kube-apiserver uses an HTTP proxy running as a side-car in the VPN server to address the shoot networks via the VPN tunnel and the vpn-shoot acts as a router. In the HA case, the setup is more complicated. Instead of an HTTP proxy in the VPN server, the kube-apiserver has additional side-cars, one side-car for each VPN client to connect to the corresponding VPN server. On the shoot side, there are now two vpn-shoot pods, each with two VPN clients for each VPN server. With this setup, there would be four possible routes, but only one can be used. 
Switching the route kills all open connections. Therefore, another layer is introduced: link aggregation, also named bonding. In Linux, you can create a network link by using several other links as slaves. Bonding here is used with active-backup mode. This means the traffic only goes through the active sublink and is only changed if the active one becomes unavailable. Switching happens in the bonding network driver without changing any routes. So with this layer, vpn-seed-server pods can be rolled without disrupting open connections.\nWith bonding, there are 2 possible routing paths, ensuring that there is at least one routing path intact even if one vpn-seed-server pod and one vpn-shoot pod are unavailable at the same time.\nAs multi-path routing is not available on the worker nodes, one routing path must be configured explicitly. For this purpose, the path-controller app is running in another side-car of the kube-apiserver pod. It pings all shoot-side VPN clients regularly every few seconds. If the active routing path is not responsive anymore, the routing is switched to the other responsive routing path.\nUsing an IPv6 transport network for communication between the bonding devices of the VPN clients, additional tunnel devices are needed on both ends to allow transport of both IPv4 and IPv6 packets. 
For this purpose, ip6tnl type tunnel devices are in place (an IPv4/IPv6 over IPv6 tunnel interface).\nThe connection establishment with a reversed tunnel in HA case is:\nAPIServer[k] --\u003e ip6tnl-device[j] --\u003e bond-device --\u003e tap-device[i] | VPN-Seed-Server[i] \u003c-- Istio/Envoy-Proxy \u003c-- SNI API Server Endpoint \u003c-- LB (one for all clusters of a seed) \u003c--- internet \u003c--- VPN-Shoot-Client[j] --\u003e tap-device[i] --\u003e bond-device --\u003e ip6tnl-device[k] --\u003e Pods | Nodes | Services\nHere, [k] is the index of the kube-apiserver instance, [j] of the VPN shoot instance, and [i] of VPN seed server.\nFor each kube-apiserver instance, an own ip6tnl tunnel device is needed on the shoot side. Additionally, the back routes from the VPN shoot to any new kube-apiserver instance must be set dynamically. Both tasks are managed by the tunnel-controller running in each VPN shoot client. It listens for UDP6 packets sent periodically from the path-controller running in the kube-apiserver pods. These UDP6 packets contain the IPv6 address of the bond device. If the tunnel controller detects a new kube-apiserver this way, it creates a new tunnel device and route to it.\nFor general information about HA control-plane, see GEP-20.\n","categories":"","description":"","excerpt":"Reversed VPN Tunnel Setup and Configuration The Reversed VPN Tunnel is …","ref":"/docs/gardener/reversed-vpn-tunnel/","tags":"","title":"Reversed VPN Tunnel"},{"body":"Scoped API Access for gardenlets and Extensions By default, gardenlets have administrative access in the garden cluster. They are able to execute any API request on any object independent of whether the object is related to the seed cluster the gardenlet is responsible for. 
As RBAC is not powerful enough for fine-grained checks and for the sake of security, Gardener provides two optional but recommended configurations for your environments that scope the API access for gardenlets.\nSimilar to the Node authorization mode in Kubernetes, Gardener features a SeedAuthorizer plugin. It is a special-purpose authorization plugin that specifically authorizes API requests made by the gardenlets.\nLikewise, similar to the NodeRestriction admission plugin in Kubernetes, Gardener features a SeedRestriction plugin. It is a special-purpose admission plugin that specifically limits the Kubernetes objects gardenlets can modify.\n📚 You might be interested to look into the design proposal for scoped Kubelet API access from the Kubernetes community. It can be translated to Gardener and Gardenlets with their Seed and Shoot resources.\nHistorically, gardenlet has been the only component running in the seed cluster that has access to both the seed cluster and the garden cluster. Starting from Gardener v1.74.0, extensions running on seed clusters can also get access to the garden cluster using a token for a dedicated ServiceAccount. Extensions using this mechanism only get permission to read global resources like CloudProfiles (this is granted to all authenticated users) unless the plugins described in this document are enabled.\nGenerally, the plugins handle extension clients exactly like gardenlet clients with some minor exceptions. Extension clients in the sense of the plugins are clients authenticated as a ServiceAccount with the extension- name prefix in a seed- namespace of the garden cluster. Other ServiceAccounts are not considered as seed clients, not handled by the plugins, and only get the described read access to global resources.\nFlow Diagram The following diagram shows how the two plugins are included in the request flow of a gardenlet. 
When they are not enabled, then the kube-apiserver is internally authorizing the request via RBAC before forwarding the request directly to the gardener-apiserver, i.e., the gardener-admission-controller would not be consulted (this is not entirely correct because it also serves other admission webhook handlers, but for simplicity reasons this document focuses on the API access scope only).\nWhen enabling the plugins, there is one additional step for each before the gardener-apiserver responds to the request.\nPlease note that the example shows a request to an object (Shoot) residing in one of the API groups served by gardener-apiserver. However, the gardenlet is also interacting with objects in API groups served by the kube-apiserver (e.g., Secret,ConfigMap). In this case, the consultation of the SeedRestriction admission plugin is performed by the kube-apiserver itself before it forwards the request to the gardener-apiserver.\nImplemented Rules Today, the following rules are implemented:\n Resource Verbs Path(s) Description BackupBucket get, list, watch, create, update, patch, delete BackupBucket -\u003e Seed Allow get, list, watch requests for all BackupBuckets. Allow only create, update, patch, delete requests for BackupBuckets assigned to the gardenlet’s Seed. BackupEntry get, list, watch, create, update, patch BackupEntry -\u003e Seed Allow get, list, watch requests for all BackupEntrys. Allow only create, update, patch requests for BackupEntrys assigned to the gardenlet’s Seed and referencing BackupBuckets assigned to the gardenlet’s Seed. Bastion get, list, watch, create, update, patch Bastion -\u003e Seed Allow get, list, watch requests for all Bastions. Allow only create, update, patch requests for Bastions assigned to the gardenlet’s Seed. CertificateSigningRequest get, create CertificateSigningRequest -\u003e Seed Allow only get, create requests for CertificateSigningRequests related to the gardenlet’s Seed. 
CloudProfile get CloudProfile -\u003e Shoot -\u003e Seed Allow only get requests for CloudProfiles referenced by Shoots that are assigned to the gardenlet’s Seed. ClusterRoleBinding create, get, update, patch, delete ClusterRoleBinding -\u003e ManagedSeed -\u003e Shoot -\u003e Seed Allow create, get, update, patch requests for ManagedSeeds in the bootstrapping phase assigned to the gardenlet’s Seeds. Allow delete requests from gardenlets bootstrapped via ManagedSeeds. ConfigMap get ConfigMap -\u003e Shoot -\u003e Seed Allow only get requests for ConfigMaps referenced by Shoots that are assigned to the gardenlet’s Seed. Allows reading the kube-system/cluster-identity ConfigMap. ControllerRegistration get, list, watch ControllerRegistration -\u003e ControllerInstallation -\u003e Seed Allow get, list, watch requests for all ControllerRegistrations. ControllerDeployment get ControllerDeployment -\u003e ControllerInstallation -\u003e Seed Allow get requests for ControllerDeployments referenced by ControllerInstallations assigned to the gardenlet’s Seed. ControllerInstallation get, list, watch, update, patch ControllerInstallation -\u003e Seed Allow get, list, watch requests for all ControllerInstallations. Allow only update, patch requests for ControllerInstallations assigned to the gardenlet’s Seed. CredentialsBinding get CredentialsBinding -\u003e Shoot -\u003e Seed Allow only get requests for CredentialsBindings referenced by Shoots that are assigned to the gardenlet’s Seed. Event create, patch none Allow create and patch requests for all kinds of Events. ExposureClass get ExposureClass -\u003e Shoot -\u003e Seed Allow get requests for ExposureClasses referenced by Shoots that are assigned to the gardenlet’s Seed. Deny get requests to other ExposureClasses. Gardenlet get, list, watch, update, patch, create Gardenlet -\u003e Seed Allow get, list, watch requests for all Gardenlets. Allow only create, update, and patch requests for Gardenlets belonging to the gardenlet’s Seed. 
Lease create, get, watch, update Lease -\u003e Seed Allow create, get, update, and delete requests for Leases of the gardenlet’s Seed. ManagedSeed get, list, watch, update, patch ManagedSeed -\u003e Shoot -\u003e Seed Allow get, list, watch requests for all ManagedSeeds. Allow only update, patch requests for ManagedSeeds referencing a Shoot assigned to the gardenlet’s Seed. Namespace get Namespace -\u003e Shoot -\u003e Seed Allow get requests for Namespaces of Shoots that are assigned to the gardenlet’s Seed. Always allow get requests for the garden Namespace. NamespacedCloudProfile get NamespacedCloudProfile -\u003e Shoot -\u003e Seed Allow only get requests for NamespacedCloudProfiles referenced by Shoots that are assigned to the gardenlet’s Seed. Project get Project -\u003e Namespace -\u003e Shoot -\u003e Seed Allow get requests for Projects referenced by the Namespace of Shoots that are assigned to the gardenlet’s Seed. SecretBinding get SecretBinding -\u003e Shoot -\u003e Seed Allow only get requests for SecretBindings referenced by Shoots that are assigned to the gardenlet’s Seed. Secret create, get, update, patch, delete(, list, watch) Secret -\u003e Seed, Secret -\u003e Shoot -\u003e Seed, Secret -\u003e SecretBinding -\u003e Shoot -\u003e Seed, Secret -\u003e CredentialsBinding -\u003e Shoot -\u003e Seed, BackupBucket -\u003e Seed Allow get, list, watch requests for all Secrets in the seed-\u003cname\u003e namespace. Allow only create, get, update, patch, delete requests for the Secrets related to resources assigned to the gardenlet’s Seeds. Seed get, list, watch, create, update, patch, delete Seed Allow get, list, watch requests for all Seeds. Allow only create, update, patch, delete requests for the gardenlet’s Seeds. 
[1] ServiceAccount create, get, update, patch, delete ServiceAccount -\u003e ManagedSeed -\u003e Shoot -\u003e Seed, ServiceAccount -\u003e Namespace -\u003e Seed Allow create, get, update, patch requests for ManagedSeeds in the bootstrapping phase assigned to the gardenlet’s Seeds. Allow delete requests from gardenlets bootstrapped via ManagedSeeds. Allow all verbs on ServiceAccounts in the seed-specific namespace. Shoot get, list, watch, update, patch Shoot -\u003e Seed Allow get, list, watch requests for all Shoots. Allow only update, patch requests for Shoots assigned to the gardenlet’s Seed. ShootState get, create, update, patch ShootState -\u003e Shoot -\u003e Seed Allow only get, create, update, patch requests for ShootStates belonging to Shoots that are assigned to the gardenlet’s Seed. WorkloadIdentity get WorkloadIdentity -\u003e CredentialsBinding -\u003e Shoot -\u003e Seed Allow only get requests for WorkloadIdentities referenced by CredentialsBindings referenced by Shoots that are assigned to the gardenlet’s Seed. [1] If you use ManagedSeed resources, then the gardenlet reconciling them (“parent gardenlet”) may be allowed to submit certain requests for the Seed resources resulting from such ManagedSeed reconciliations (even if the “parent gardenlet” is not responsible for them):\n ℹ️ It is allowed to delete the Seed resources if the corresponding ManagedSeed objects already have a deletionTimestamp (this is secure as gardenlets themselves don’t have permissions for deleting ManagedSeeds).\nRule Exceptions for Extension Clients Extension clients are allowed to perform the same operations as gardenlet clients with the following exceptions:\n Extension clients are granted the read-only subset of verbs for CertificateSigningRequests, ClusterRoleBindings, and ServiceAccounts (to prevent privilege escalation). Extension clients are granted full access to Lease objects but only in the seed-specific namespace. 
When the need arises, more exceptions might be added to the access rules for resources that are already handled by the plugins. E.g., if an extension needs to populate additional shoot-specific InternalSecrets, according handling can be introduced. Permissions for resources that are not handled by the plugins can be granted using additional RBAC rules (independent of the plugins).\nSeedAuthorizer Authorization Webhook Enablement The SeedAuthorizer is implemented as a Kubernetes authorization webhook and part of the gardener-admission-controller component running in the garden cluster.\n🎛 In order to activate it, you have to follow these steps:\n Set the following flags for the kube-apiserver of the garden cluster (i.e., the kube-apiserver whose API is extended by Gardener):\n --authorization-mode=RBAC,Node,Webhook (please note that Webhook should appear after RBAC in the list [1]; Node might not be needed if you use a virtual garden cluster) --authorization-webhook-config-file=\u003cpath-to-the-webhook-config-file\u003e --authorization-webhook-cache-authorized-ttl=0 --authorization-webhook-cache-unauthorized-ttl=0 The webhook config file (stored at \u003cpath-to-the-webhook-config-file\u003e) should look as follows:\napiVersion: v1 kind: Config clusters: - name: garden cluster: certificate-authority-data: base64(CA-CERT-OF-GARDENER-ADMISSION-CONTROLLER) server: https://gardener-admission-controller.garden/webhooks/auth/seed users: - name: kube-apiserver user: {} contexts: - name: auth-webhook context: cluster: garden user: kube-apiserver current-context: auth-webhook When deploying the Gardener controlplane Helm chart, set .global.rbac.seedAuthorizer.enabled=true. 
This will ensure that the RBAC resources granting global access for all gardenlets will be deployed.\n Delete the existing RBAC resources granting global access for all gardenlets by running:\nkubectl delete \\ clusterrole.rbac.authorization.k8s.io/gardener.cloud:system:seeds \\ clusterrolebinding.rbac.authorization.k8s.io/gardener.cloud:system:seeds \\ --ignore-not-found Please note that you should activate the SeedRestriction admission handler as well.\n [1] The reason for the fact that Webhook authorization plugin should appear after RBAC is that the kube-apiserver will be depending on the gardener-admission-controller (serving the webhook). However, the gardener-admission-controller can only start when gardener-apiserver runs, but gardener-apiserver itself can only start when kube-apiserver runs. If Webhook is before RBAC, then gardener-apiserver might not be able to start, leading to a deadlock.\n Authorizer Decisions As mentioned earlier, it’s the authorizer’s job to evaluate API requests and return one of the following decisions:\n DecisionAllow: The request is allowed, further configured authorizers won’t be consulted. DecisionDeny: The request is denied, further configured authorizers won’t be consulted. DecisionNoOpinion: A decision cannot be made, further configured authorizers will be consulted. For backwards compatibility, no requests are denied at the moment, so that they are still deferred to a subsequent authorizer like RBAC. Though, this might change in the future.\nFirst, the SeedAuthorizer extracts the Seed name from the API request. This step considers the following two cases:\n If the authenticated user belongs to the gardener.cloud:system:seeds group, it is considered a gardenlet client. This requires a proper TLS certificate that the gardenlet uses to contact the API server and is automatically given if TLS bootstrapping is used. The authorizer extracts the seed name from the username by stripping the gardener.cloud:system:seed: prefix. 
In cases where this information is missing e.g., when a custom Kubeconfig is used, the authorizer cannot make any decision. Thus, RBAC is still a considerable option to restrict the gardenlet’s access permission if the above explained preconditions are not given. If the authenticated user belongs to the system:serviceaccounts group, it is considered an extension client under the following conditions: The ServiceAccount must be located in a seed- namespace. I.e., the user has to belong to a group with the system:serviceaccounts:seed- prefix. The seed name is extracted from this group by stripping the prefix. The ServiceAccount must have the extension- prefix. I.e., the username must have the system:serviceaccount:seed-\u003cseed-name\u003e:extension- prefix. With the Seed name at hand, the authorizer checks for an existing path from the resource that a request is being made for to the Seed belonging to the gardenlet/extension. Take a look at the Implementation Details section for more information.\nImplementation Details Internally, the SeedAuthorizer uses a directed, acyclic graph data structure in order to efficiently respond to authorization requests for gardenlets/extensions:\n A vertex in this graph represents a Kubernetes resource with its kind, namespace, and name (e.g., Shoot:garden-my-project/my-shoot). An edge from vertex u to vertex v in this graph exists when (1) v is referred by u and v is a Seed, or when (2) u is referred by v, or when (3) u is strictly associated with v. For example, a Shoot refers to a Seed, a CloudProfile, a SecretBinding, etc., so it has an outgoing edge to the Seed (1) and incoming edges from the CloudProfile and SecretBinding vertices (2). However, there might also be a ShootState or a BackupEntry resource strictly associated with this Shoot, hence, it has incoming edges from these vertices (3).\nIn the above picture, the resources that are actively watched are shaded. 
Gardener resources are green, while Kubernetes resources are blue. It shows the dependencies between the resources and how the graph is built based on the above rules.\nℹ️ The above picture shows all resources that may be accessed by gardenlets/extensions, except for the Quota resource which is only included for completeness.\nNow, when a gardenlet/extension wants to access certain resources, the SeedAuthorizer uses a depth-first traversal starting from the vertex representing the resource in question, e.g., from a Project vertex. If there is a path from the Project vertex to the vertex representing the Seed the gardenlet/extension is responsible for, then it allows the request.\nMetrics The SeedAuthorizer registers the following metrics related to the mentioned graph implementation:\n Metric Description gardener_admission_controller_seed_authorizer_graph_update_duration_seconds Histogram of duration of resource dependency graph updates in seed authorizer, i.e., how long does it take to update the graph’s vertices/edges when a resource is created, changed, or deleted. gardener_admission_controller_seed_authorizer_graph_path_check_duration_seconds Histogram of duration of checks whether a path exists in the resource dependency graph in seed authorizer. Debug Handler When the .server.enableDebugHandlers field in the gardener-admission-controller’s component configuration is set to true, then it serves a handler that can be used for debugging the resource dependency graph under /debug/resource-dependency-graph.\n🚨 Only use this setting for development purposes, as it enables unauthenticated users to view all data if they have access to the gardener-admission-controller component.\nThe handler renders an HTML page displaying the current graph with a list of vertices and their associated incoming and outgoing edges to other vertices. 
Depending on the size of the Gardener landscape (and consequently, the size of the graph), it might not be possible to render it in its entirety. If there are more than 2000 vertices, then the default filtering will select kind=Seed to prevent overloading the output.\nExample output:\n------------------------------------------------------------------------------- | | # Seed:my-seed | \u003c- (11) | BackupBucket:73972fe2-3d7e-4f61-a406-b8f9e670e6b7 | BackupEntry:garden-my-project/shoot--dev--my-shoot--4656a460-1a69-4f00-9372-7452cbd38ee3 | ControllerInstallation:dns-external-mxt8m | ControllerInstallation:extension-shoot-cert-service-4qw5j | ControllerInstallation:networking-calico-bgrb2 | ControllerInstallation:os-gardenlinux-qvb5z | ControllerInstallation:provider-gcp-w4mvf | Secret:garden/backup | Shoot:garden-my-project/my-shoot | ------------------------------------------------------------------------------- | | # Shoot:garden-my-project/my-shoot | \u003c- (5) | CloudProfile:gcp | Namespace:garden-my-project | Secret:garden-my-project/my-dns-secret | SecretBinding:garden-my-project/my-credentials | ShootState:garden-my-project/my-shoot | -\u003e (1) | Seed:my-seed | ------------------------------------------------------------------------------- | | # ShootState:garden-my-project/my-shoot | -\u003e (1) | Shoot:garden-my-project/my-shoot | ------------------------------------------------------------------------------- ... (etc., similarly for the other resources) There are anchor links to easily jump from one resource to another, and the page provides means for filtering the results based on the kind, namespace, and/or name.\nPitfalls When there is a relevant update to an existing resource, i.e., when a reference to another resource is changed, then the corresponding vertex (along with all associated edges) is first deleted from the graph before it gets added again with the up-to-date edges. 
However, this only works for vertices belonging to resources that are created in exactly one “watch handler”. For example, the vertex for a SecretBinding can either be created in the SecretBinding handler itself or in the Shoot handler. In such cases, deleting the vertex before (re-)computing the edges might lead to race conditions and potentially render the graph invalid. Consequently, instead of deleting the vertex, only the edges the respective handler is responsible for are deleted. If the vertex ends up with no remaining edges, then it also gets deleted automatically. Afterwards, the vertex can either be added again or the updated edges can be created.\nSeedRestriction Admission Webhook Enablement The SeedRestriction is implemented as a Kubernetes admission webhook and is part of the gardener-admission-controller component running in the garden cluster.\n🎛 In order to activate it, you have to set .global.admission.seedRestriction.enabled=true when using the Gardener controlplane Helm chart. This will add an additional webhook in the existing ValidatingWebhookConfiguration of the gardener-admission-controller which contains the configuration for the SeedRestriction handler. Please note that it should only be activated when the SeedAuthorizer is active as well.\nAdmission Decisions The admission’s purpose is to perform extended validation on requests which require the body of the object in question. 
Additionally, it handles CREATE requests of gardenlets/extensions (the above discussed resource dependency graph cannot be used in such cases because there won’t be any vertex/edge for non-existing resources).\nGardenlets/extensions are restricted to only create new resources which are somehow related to the seed clusters they are responsible for.\n","categories":"","description":"","excerpt":"Scoped API Access for gardenlets and Extensions By default, gardenlets …","ref":"/docs/gardener/deployment/gardenlet_api_access/","tags":"","title":"Scoped API Access for gardenlets and Extensions"},{"body":"SecretBinding Provider Controller This page describes the process on how to enable the SecretBinding provider controller.\nOverview With Gardener v1.38.0, the SecretBinding resource now contains a new optional field .provider.type (details about the motivation can be found in https://github.com/gardener/gardener/issues/4888). To automate setting the new field and afterwards enforce validation on it in a backwards-compatible manner, Gardener features the SecretBinding provider controller and a feature gate - SecretBindingProviderValidation.\nProcess A Gardener landscape operator can follow the following steps:\n Enable the SecretBinding provider controller of Gardener Controller Manager.\nThe SecretBinding provider controller is responsible for populating the .provider.type field of a SecretBinding based on its current usage by Shoot resources. For example, if a Shoot crazy-botany with .provider.type=aws is using a SecretBinding my-secret-binding, then the SecretBinding provider controller will take care to set the .provider.type field of the SecretBinding to the same provider type (aws). To enable the SecretBinding provider controller, set the controller.secretBindingProvider.concurrentSyncs field in the ControllerManagerConfiguration (e.g., set it to 5). 
Although it is not recommended, the API allows Shoots from different provider types to reference the same SecretBinding (assuming that the backing Secret contains data for both of the provider types). To preserve the backwards compatibility for such SecretBindings, the provider controller will maintain the multiple provider types in the field (it will join them with the separator , - for example aws,gcp).\n Disable the SecretBinding provider controller and enable the SecretBindingProviderValidation feature gate of Gardener API server.\nThe SecretBindingProviderValidation feature gate of Gardener API server enables a set of validations for the SecretBinding provider field. It forbids creating a Shoot whose provider type differs from that of the referenced SecretBinding. It also enforces immutability on the field. After making sure that the SecretBinding provider controller is enabled and has populated the .provider.type field of a majority of the SecretBindings on a Gardener landscape (the SecretBindings that are unused will have their provider type unset), a Gardener landscape operator has to disable the SecretBinding provider controller and enable the SecretBindingProviderValidation feature gate of Gardener API server. To disable the SecretBinding provider controller, set the controller.secretBindingProvider.concurrentSyncs field in the ControllerManagerConfiguration to 0.\n Implementation History Gardener v1.38: The SecretBinding resource has a new optional field .provider.type. The SecretBinding provider controller is disabled by default. The SecretBindingProviderValidation feature gate of Gardener API server is disabled by default. Gardener v1.42: The SecretBinding provider controller is enabled by default. Gardener v1.51: The SecretBindingProviderValidation feature gate of Gardener API server is enabled by default and the SecretBinding provider controller is disabled by default. 
Gardener v1.53: The SecretBindingProviderValidation feature gate of Gardener API server is unconditionally enabled (can no longer be disabled). Gardener v1.55: The SecretBindingProviderValidation feature gate of Gardener API server and the SecretBinding provider controller are removed. ","categories":"","description":"","excerpt":"SecretBinding Provider Controller This page describes the process on …","ref":"/docs/gardener/deployment/secret_binding_provider_controller/","tags":"","title":"Secret Binding Provider Controller"},{"body":"Secrets Management for Seed and Shoot Cluster The gardenlet needs to create quite some amount of credentials (certificates, private keys, passwords) for seed and shoot clusters in order to ensure secure deployments. Such credentials typically should be renewed automatically when their validity expires, rotated regularly, and they potentially need to be persisted such that they don’t get lost in case of a control plane migration or a lost seed cluster.\nSecretsManager Introduction These requirements can be covered by using the SecretsManager package maintained in pkg/utils/secrets/manager. It is built on top of the ConfigInterface and DataInterface interfaces part of pkg/utils/secrets and provides the following functions:\n Generate(context.Context, secrets.ConfigInterface, ...GenerateOption) (*corev1.Secret, error)\nThis method either retrieves the current secret for the given configuration or it (re)generates it in case the configuration changed, the signing CA changed (for certificate secrets), or when proactive rotation was triggered. If the configuration describes a certificate authority secret then this method automatically generates a bundle secret containing the current and potentially the old certificate. 
Available GenerateOptions:\n SignedByCA(string, ...SignedByCAOption): This is only valid for certificate secrets and automatically retrieves the correct certificate authority in order to sign the provided server or client certificate. There are two SignedByCAOptions: UseCurrentCA. This option will sign server certificates with the new/current CA in case of a CA rotation. For more information, please refer to the “Certificate Signing” section below. UseOldCA. This option will sign client certificates with the old CA in case of a CA rotation. For more information, please refer to the “Certificate Signing” section below. Persist(): This marks the secret such that it gets persisted in the ShootState resource in the garden cluster. Consequently, it should only be used for secrets related to a shoot cluster. Rotate(rotationStrategy): This specifies the strategy in case this secret is to be rotated or regenerated (either InPlace which immediately forgets about the old secret, or KeepOld which keeps the old secret in the system). IgnoreOldSecrets(): This specifies that old secrets should not be considered and loaded (contrary to the default behavior). It should be used when old secrets are no longer important and can be “forgotten” (e.g. in “phase 2” (t2) of the CA certificate rotation). Such old secrets will be deleted on Cleanup(). IgnoreOldSecretsAfter(time.Duration): This specifies that old secrets should not be considered and loaded once a given duration after rotation has passed. It can be used to clean up old secrets after automatic rotation (e.g. the Seed cluster CA is automatically rotated when its validity will soon end and the old CA will be cleaned up 24 hours after triggering the rotation). Validity(time.Duration): This specifies how long the secret should be valid. For certificate secret configurations, the manager will automatically deduce this information from the generated certificate. 
RenewAfterValidityPercentage(int): This specifies the percentage of validity for renewal. The secret will be renewed based on whichever comes first: The specified percentage of validity or 10 days before end of validity. If not specified, the default percentage is 80. Get(string, ...GetOption) (*corev1.Secret, bool)\nThis method retrieves the current secret for the given name. In case the secret in question is a certificate authority secret then it retrieves the bundle secret by default. It is important that this method only knows about secrets for which there were prior Generate calls. Available GetOptions:\n Bundle (default): This retrieves the bundle secret. Current: This retrieves the current secret. Old: This retrieves the old secret. Cleanup(context.Context) error\nThis method deletes secrets which are no longer required. No longer required secrets are those still existing in the system which weren’t detected by prior Generate calls. Consequently, only call Cleanup after you have executed Generate calls for all desired secrets.\n Some exemplary usages would look as follows:\nsecret, err := k.secretsManager.Generate( ctx, \u0026secrets.CertificateSecretConfig{ Name: \"my-server-secret\", CommonName: \"server-abc\", DNSNames: []string{\"first-name\", \"second-name\"}, CertType: secrets.ServerCert, SkipPublishingCACertificate: true, }, secretsmanager.SignedByCA(\"my-ca\"), secretsmanager.Persist(), secretsmanager.Rotate(secretsmanager.InPlace), ) if err != nil { return err } As explained above, the caller does not need to care about the renewal, rotation or the persistence of this secret - all of these concerns are handled by the secrets manager. 
Automatic renewal of secrets happens when 80% of their validity period has passed or when fewer than 10 days are left until expiration, whichever comes first.\nIn case a CA certificate is needed by some component, then it can be retrieved as follows:\ncaSecret, found := k.secretsManager.Get(\"my-ca\") if !found { return fmt.Errorf(\"secret my-ca not found\") } As explained above, this returns the bundle secret for the CA my-ca which might potentially contain both the current and the old CA (in case of rotation/regeneration).\nCertificate Signing Default Behaviour By default, client certificates are signed by the current CA while server certificates are signed by the old CA (if it exists). This is to ensure a smooth exchange of certificates during a CA rotation (which typically has two phases, ref GEP-18):\n Client certificates: In phase 1, clients get new certificates as soon as possible to ensure that all clients have been adapted before phase 2. In phase 2, the respective server stops accepting certificates signed by the old CA. Server certificates: In phase 1, servers still use their old/existing certificates to allow clients to update their CA bundle used for verification of the servers’ certificates. In phase 2, the old CA is dropped, hence servers need to get a certificate signed by the new/current CA. At this point in time, clients have already adapted their CA bundles. Alternative: Sign Server Certificates with Current CA In case you control all clients and update them at the same time as the server, it is possible to make the secrets manager generate even server certificates with the new/current CA. This can help to prevent certificate mismatches when the CA bundle is already exchanged while the server still serves with a certificate signed by a CA no longer part of the bundle.\nLet’s consider the two following examples:\n gardenlet deploys a webhook server (gardener-resource-manager) and a corresponding MutatingWebhookConfiguration at the same time. 
In this case, the server certificate should be generated with the new/current CA to avoid above mentioned certificate mismatches during a CA rotation. gardenlet deploys a server (etcd) in one step, and a client (kube-apiserver) in a subsequent step. In this case, the default behaviour should apply (server certificate should be signed by old/existing CA). Alternative: Sign Client Certificate with Old CA In the unusual case where the client is deployed before the server, it might be useful to always use the old CA for signing the client’s certificate. This can help to prevent certificate mismatches when the client already gets a new certificate while the server still only accepts certificates signed by the old CA.\nLet’s consider the two following examples:\n gardenlet deploys the kube-apiserver before the kubelet. However, the kube-apiserver has a client certificate signed by the ca-kubelet in order to communicate with it (e.g., when retrieving logs or forwarding ports). In this case, the client certificate should be generated with the old CA to avoid above mentioned certificate mismatches during a CA rotation. gardenlet deploys a server (etcd) in one step, and a client (kube-apiserver) in a subsequent step. In this case, the default behaviour should apply (client certificate should be signed by new/current CA). Reusing the SecretsManager in Other Components While the SecretsManager is primarily used by gardenlet, it can be reused by other components (e.g. extensions) as well for managing secrets that are specific to the component or extension. For example, provider extensions might use their own SecretsManager instance for managing the serving certificate of cloud-controller-manager.\nExternal components that want to reuse the SecretsManager should consider the following aspects:\n On initialization of a SecretsManager, pass an identity specific to the component, controller and purpose. 
For example, gardenlet’s shoot controller uses gardenlet as the SecretsManager’s identity, the Worker controller in provider-foo should use provider-foo-worker, and the ControlPlane controller should use provider-foo-controlplane-exposure for ControlPlane objects of purpose exposure. The given identity is added as a value for the manager-identity label on managed Secrets. This label is used by the Cleanup function to select only those Secrets that are actually managed by the particular SecretsManager instance. This is done to prevent the removal of still-needed Secrets that are managed by other instances. Generate dedicated CAs for signing certificates instead of depending on CAs managed by gardenlet. Names of Secrets managed by external SecretsManager instances must not conflict with Secret names from other instances (e.g. gardenlet). For CAs that should be rotated in lock-step with the Shoot CAs managed by gardenlet, components need to pass information about the last rotation initiation time and the current rotation phase to the SecretsManager upon initialization. The relevant information can be retrieved from the Cluster resource under .spec.shoot.status.credentials.rotation.certificateAuthorities. Independent of the specific identity, secrets marked with the Persist option are automatically saved in the ShootState resource by the gardenlet and are also restored by the gardenlet on Control Plane Migration to the new Seed. Migrating Existing Secrets To SecretsManager If you already have existing secrets which were not created with SecretsManager, then you can (optionally) migrate them by labeling them with secrets-manager-use-data-for-name=\u003cconfig-name\u003e. For example, if your SecretsManager generates a CertificateSecretConfig with name foo like this\nsecret, err := k.secretsManager.Generate( ctx, \u0026secrets.CertificateSecretConfig{ Name: \"foo\", // ... 
}, ) and you already have an existing secret in your system whose data should be kept instead of regenerated, then labeling it with secrets-manager-use-data-for-name=foo will instruct SecretsManager accordingly.\n⚠️ Caveat: You have to make sure that the existing data keys match what SecretsManager uses:\n Secret Type Data Keys Basic Auth username, password, auth CA Certificate ca.crt, ca.key Non-CA Certificate tls.crt, tls.key Control Plane Secret ca.crt, username, password, token, kubeconfig ETCD Encryption Key key, secret Kubeconfig kubeconfig RSA Private Key id_rsa, id_rsa.pub Static Token static_tokens.csv VPN TLS Auth vpn.tlsauth Implementation Details The source of truth for the secrets manager is the list of Secrets in the Kubernetes cluster it acts upon (typically, the seed cluster). The persisted secrets in the ShootState are used if and only if the shoot is in the Restore phase - in this case all secrets are just synced to the seed cluster so that they can be picked up by the secrets manager.\nTo spare kubelets unneeded watches (which would cause significant traffic against the kube-apiserver), the Secrets are marked as immutable. Consequently, they have a unique, deterministic name which is computed as follows:\n For CA secrets, the name is exactly the name specified in the configuration (e.g., ca). This is for backwards-compatibility and will be dropped in a future release once all components depending on the static name have been adapted. For all other secrets, the name specified in the configuration is used as prefix followed by an 8-digit hash. This hash is computed out of the checksum of the secret configuration and the checksum of the certificate of the signing CA (only for certificate configurations). 
In all cases, the name of the secret is suffixed with a 5-digit hash computed from the time when the rotation for this secret was last started.\n","categories":"","description":"","excerpt":"Secrets Management for Seed and Shoot Cluster The gardenlet needs to …","ref":"/docs/gardener/secrets_management/","tags":"","title":"Secrets Management"},{"body":"Packages:\n security.gardener.cloud/v1alpha1 security.gardener.cloud/v1alpha1 Package v1alpha1 is a version of the API.\nResource Types: CredentialsBinding WorkloadIdentity CredentialsBinding CredentialsBinding represents a binding to credentials in the same or another namespace.\n Field Description apiVersion string security.gardener.cloud/v1alpha1 kind string CredentialsBinding metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. provider CredentialsBindingProvider Provider defines the provider type of the CredentialsBinding. This field is immutable.\n credentialsRef Kubernetes core/v1.ObjectReference CredentialsRef is a reference to a resource holding the credentials. Accepted resources are core/v1.Secret and security.gardener.cloud/v1alpha1.WorkloadIdentity. This field is immutable.\n quotas []Kubernetes core/v1.ObjectReference (Optional) Quotas is a list of references to Quota objects in the same or another namespace. This field is immutable.\n WorkloadIdentity WorkloadIdentity is a resource that allows workloads to be presented before external systems by giving them identities managed by the Gardener API server. The identity of such a workload is represented by a JSON Web Token issued by the Gardener API server. 
Workload identities are designed to be used by components running in the Gardener environment, seed or runtime cluster, that make use of identity federation inspired by the OIDC protocol.\n Field Description apiVersion string security.gardener.cloud/v1alpha1 kind string WorkloadIdentity metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec WorkloadIdentitySpec Spec configures the JSON Web Token issued by the Gardener API server.\n audiences []string Audiences specify the list of recipients that the JWT is intended for. The values of this field will be set in the ‘aud’ claim.\n targetSystem TargetSystem TargetSystem represents specific configurations for the system that will accept the JWTs.\n status WorkloadIdentityStatus Status contains the latest observed status of the WorkloadIdentity.\n ContextObject (Appears on: TokenRequestSpec) ContextObject identifies the object the token is requested for.\n Field Description kind string Kind of the object the token is requested for. 
Valid kinds are ‘Shoot’, ‘Seed’, etc.\n apiVersion string API version of the object the token is requested for.\n name string Name of the object the token is requested for.\n namespace string (Optional) Namespace of the object the token is requested for.\n uid k8s.io/apimachinery/pkg/types.UID UID of the object the token is requested for.\n CredentialsBindingProvider (Appears on: CredentialsBinding) CredentialsBindingProvider defines the provider type of the CredentialsBinding.\n Field Description type string Type is the type of the provider.\n TargetSystem (Appears on: WorkloadIdentitySpec) TargetSystem represents specific configurations for the system that will accept the JWTs.\n Field Description type string Type is the type of the target system.\n providerConfig k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) ProviderConfig is the configuration passed to extension resource.\n TokenRequest TokenRequest is a resource that is used to request WorkloadIdentity tokens.\n Field Description metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec TokenRequestSpec Spec holds configuration settings for the requested token.\n contextObject ContextObject (Optional) ContextObject identifies the object the token is requested for.\n expirationSeconds int64 (Optional) ExpirationSeconds specifies for how long the requested token should be valid.\n status TokenRequestStatus Status bears the issued token with additional information back to the client.\n TokenRequestSpec (Appears on: TokenRequest) TokenRequestSpec holds configuration settings for the requested token.\n Field Description contextObject ContextObject (Optional) ContextObject identifies the object the token is requested for.\n expirationSeconds int64 (Optional) ExpirationSeconds specifies for how long the requested token should be valid.\n TokenRequestStatus (Appears on: TokenRequest) TokenRequestStatus bears the issued token with additional information back to the client.\n Field Description token string Token is the issued token.\n expirationTimestamp Kubernetes meta/v1.Time ExpirationTimestamp is the time of expiration of the returned token.\n WorkloadIdentitySpec (Appears on: WorkloadIdentity) WorkloadIdentitySpec configures the JSON Web Token issued by the Gardener API server.\n Field Description audiences []string Audiences specify the list of recipients that the JWT is intended for. 
The values of this field will be set in the ‘aud’ claim.\n targetSystem TargetSystem TargetSystem represents specific configurations for the system that will accept the JWTs.\n WorkloadIdentityStatus (Appears on: WorkloadIdentity) WorkloadIdentityStatus contains the latest observed status of the WorkloadIdentity.\n Field Description sub string Sub contains the computed value of the subject that is going to be set in the JWT’s ‘sub’ claim.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n security.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/security/","tags":"","title":"Security"},{"body":"Gardener Security Release Process Gardener is a growing community of volunteers and users. The Gardener community has adopted this security disclosure and response policy to ensure we responsibly handle critical issues.\nGardener Security Team Security vulnerabilities should be handled quickly and sometimes privately. The primary goal of this process is to reduce the total time users are vulnerable to publicly known exploits. The Gardener Security Team is responsible for organizing the entire response, including internal communication and external disclosure, but will need help from relevant developers and release managers to successfully run this process. The Gardener Security Team consists of the following volunteers:\n Vasu Chandrasekhara, (@vasu1124) Christian Cwienk, (@ccwienk) Donka Dimitrova, (@donistz) Claudia Hölters, (@hoeltcl) Vedran Lerenc, (@vlerenc) Dirk Marwinski, (@marwinski) Jordan Jordanov, (@jordanjordanov) Frederik Thormaehlen, (@ThormaehlenFred) Disclosures Private Disclosure Process The Gardener community asks that all suspected vulnerabilities be privately and responsibly disclosed. If you’ve found a vulnerability or a potential vulnerability in Gardener, please let us know by writing an e-mail to secure@sap.com. 
We’ll send a confirmation e-mail to acknowledge your report, and we’ll send an additional e-mail when we’ve identified the issue positively or negatively.\nPublic Disclosure Process If you know of a publicly disclosed vulnerability please IMMEDIATELY write an e-mail to secure@sap.com to inform the Gardener Security Team about the vulnerability so they may start the patch, release, and communication process.\nIf possible, the Gardener Security Team will ask the person making the public report if the issue can be handled via a private disclosure process (for example, if the full exploit details have not yet been published). If the reporter denies the request for private disclosure, the Gardener Security Team will move swiftly with the fix and release process. In extreme cases GitHub can be asked to delete the issue but this generally isn’t necessary and is unlikely to make a public disclosure less damaging.\nPatch, Release, and Public Communication For each vulnerability, a member of the Gardener Security Team will volunteer to lead coordination with the “Fix Team” and is responsible for sending disclosure e-mails to the rest of the community. This lead will be referred to as the “Fix Lead.” The role of the Fix Lead should rotate round-robin across the Gardener Security Team. Note that given the current size of the Gardener community it is likely that the Gardener Security Team is the same as the “Fix Team” (i.e., all maintainers).\nThe Gardener Security Team may decide to bring in additional contributors for added expertise depending on the area of the code that contains the vulnerability. All of the timelines below are suggestions and assume a private disclosure. The Fix Lead drives the schedule using his best judgment based on severity and development time.\nIf the Fix Lead is dealing with a public disclosure, all timelines become ASAP (assuming the vulnerability has a CVSS score \u003e= 7; see below). 
If the fix relies on another upstream project’s disclosure timeline, that will adjust the process as well. We will work with the upstream project to fit their timeline and best protect our users.\nFix Team Organization The Fix Lead will work quickly to identify relevant engineers from the affected projects and packages and CC those engineers into the disclosure thread. These selected developers are the Fix Team. The Fix Lead will give the Fix Team access to a private security repository to develop the fix.\nFix Development Process The Fix Lead and the Fix Team will create a CVSS using the CVSS Calculator. The Fix Lead makes the final call on the calculated CVSS; it is better to move quickly than make the CVSS perfect.\nThe Fix Team will notify the Fix Lead that work on the fix branch is complete once there are LGTMs on all commits in the private repository from one or more maintainers.\nIf the CVSS score is under 7.0 (a medium severity score) the Fix Team can decide to slow the release process down in the face of holidays, developer bandwidth, etc. These decisions must be discussed on the private Gardener Security mailing list.\nFix Disclosure Process With the fix development underway, the Fix Lead needs to come up with an overall communication plan for the wider community. This Disclosure process should begin after the Fix Team has developed a Fix or mitigation so that a realistic timeline can be communicated to users. The Fix Lead will inform the Gardener mailing list that a security vulnerability has been disclosed and that a fix will be made available in the future on a certain release date. The Fix Lead will include any mitigating steps users can take until a fix is available. The communication to Gardener users should be actionable. 
They should know when to block time to apply patches, understand exact mitigation steps, etc.\nFix Release Day The Release Managers will ensure all the binaries are built, publicly available, and functional before the Release Date. The Release Managers will create a new patch release branch from the latest patch release tag + the fix from the security branch. As a practical example, if v0.12.0 is the latest patch release in gardener.git, a new branch will be created called v0.12.1 which includes only patches required to fix the issue. The Fix Lead will cherry-pick the patches onto the master branch and all relevant release branches. The Fix Team will LGTM and merge. The Release Managers will merge these PRs as quickly as possible.\nChanges shouldn’t be made to the commits, even for a typo in the CHANGELOG, as this will change the git sha of the already built commits, leading to confusion and potentially conflicts as the fix is cherry-picked around branches. The Fix Lead will request a CVE from the SAP Product Security Response Team via email to cna@sap.com with all the relevant information (description, potential impact, affected version, fixed version, CVSS v3 base score, and supporting documentation for the CVSS score) for every vulnerability. The Fix Lead will inform the Gardener mailing list and announce the new releases, the CVE number (if available), the location of the binaries, and the relevant merged PRs to get wide distribution and user action.\nAs much as possible, this e-mail should be actionable and include links on how to apply the fix to users’ environments; this can include links to external distributor documentation. The recommended target time is 4pm UTC on a non-Friday weekday. This means the announcement will be seen in the morning in Pacific time, in the early evening in Europe, and in the late evening in Asia. The Fix Lead will remove the Fix Team from the private security repository.\nRetrospective These steps should be completed after the Release Date. 
The retrospective process should be blameless.\nThe Fix Lead will send a retrospective of the process to the Gardener mailing list including details on everyone involved, the timeline of the process, links to relevant PRs that introduced the issue, if relevant, and any critiques of the response and release process. The Release Managers and Fix Team are also encouraged to send their own feedback on the process to the Gardener mailing list. Honest critique is the only way we are going to get good at this as a community.\nCommunication Channel The private or public disclosure process should be triggered exclusively by writing an e-mail to secure@sap.com.\nGardener security announcements will be communicated by the Fix Lead sending an e-mail to the Gardener mailing list (reachable via gardener@googlegroups.com), as well as posting a link in the Gardener Slack channel.\nPublic discussions about Gardener security announcements and retrospectives will primarily happen in the Gardener mailing list. Thus Gardener community members who are interested in participating in discussions related to the Gardener Security Release Process are encouraged to join the Gardener mailing list (how to find and join a group).\nThe members of the Gardener Security Team are subscribed to the private Gardener Security mailing list (reachable via gardener-security@googlegroups.com).\n","categories":"","description":"","excerpt":"Gardener Security Release Process Gardener is a growing community of …","ref":"/docs/contribute/code/security-guide/","tags":"","title":"Security Release Process"},{"body":"Seed Bootstrapping Whenever the gardenlet is responsible for a new Seed resource its “seed controller” is being activated. One part of this controller’s reconciliation logic is deploying certain components into the garden namespace of the seed cluster itself. These components are required to spawn and manage control planes for shoot clusters later on. 
This document provides an overview of the actions performed during this bootstrapping phase and explains the rationale behind them.\nDependency Watchdog The dependency watchdog (abbreviation: DWD) is a component developed separately in the gardener/dependency-watchdog GitHub repository. Gardener is using it for two purposes:\n Prevention of melt-down situations when the load balancer used to expose the kube-apiserver of shoot clusters goes down while the kube-apiserver itself is still up and running. Fast recovery times for crash-looping pods when depending pods are again available. For the sake of separating these concerns, two instances of the DWD are deployed by the seed controller.\nProber The dependency-watchdog-prober deployment is responsible for the first point mentioned above.\nThe kube-apiserver of shoot clusters is exposed via a load balancer, usually with an attached public IP, which serves as the main entry point when it comes to interaction with the shoot cluster (e.g., via kubectl). While end-users are talking to their clusters via this load balancer, other control plane components like the kube-controller-manager or kube-scheduler run in the same namespace/same cluster, so they can communicate via the in-cluster Service directly instead of using the detour with the load balancer. However, the worker nodes of shoot clusters run in isolated, distinct networks. This means that the kubelets and kube-proxies also have to talk to the control plane via the load balancer.\nThe kube-controller-manager has a special control loop called nodelifecycle which will set the status of Nodes to NotReady in case the kubelet stops regularly renewing its lease/sending its heartbeat. This will trigger other self-healing capabilities of Kubernetes, for example, the eviction of pods from such “unready” nodes to healthy nodes. 
Similarly, the cloud-controller-manager has a control loop that will disconnect load balancers from “unready” nodes, i.e., the workload on such nodes is no longer accessible until it is moved to a healthy node. Furthermore, the machine-controller-manager removes “unready” nodes after health-timeout (default 10min).\nWhile these are awesome Kubernetes features on their own, they have a dangerous drawback when applied in the context of Gardener’s architecture: When the kube-apiserver load balancer fails for whatever reason, then the kubelets can’t talk to the kube-apiserver to renew their lease anymore. After a minute or so the kube-controller-manager will get the impression that all nodes have died and will mark them as NotReady. This will trigger the above-mentioned eviction as well as the detachment of load balancers. As a result, the customer’s workload will go down and become unreachable.\nThis is exactly the situation that the DWD prevents: It regularly tries to talk to the kube-apiservers of the shoot clusters, once by using their load balancer, and once by talking via the in-cluster Service. If it detects that the kube-apiserver is reachable internally but not externally, it scales down machine-controller-manager, cluster-autoscaler (if enabled) and kube-controller-manager to 0. This prevents the kube-controller-manager from marking the shoot worker nodes as “unready”. It also prevents the machine-controller-manager from deleting potentially healthy nodes. As soon as the kube-apiserver is reachable externally again, kube-controller-manager, machine-controller-manager and cluster-autoscaler are restored to the state prior to scale-down.\nWeeder The dependency-watchdog-weeder deployment is responsible for the second point mentioned above.\nKubernetes restarts failing pods with an exponentially increasing backoff time. 
While this is a great strategy to prevent system overloads, it has the disadvantage that the delay between restarts grows to multiple minutes very quickly.\nIn the Gardener context, we are deploying many components that are depending on other components. For example, the kube-apiserver is depending on a running etcd, or the kube-controller-manager and kube-scheduler are depending on a running kube-apiserver. In case such a “higher-level” component fails for whatever reason, the dependent pods will fail and end up in crash-loops. As Kubernetes does not know anything about these hierarchies, it won’t recognize that such pods can be restarted faster as soon as their dependencies are up and running again.\nThis is exactly the situation in which the DWD will become active: If it detects that a certain Service is available again (e.g., after the etcd was temporarily down while being moved to another seed node), then DWD will restart all crash-looping dependent pods. These dependent pods are detected via a pre-configured label selector.\nAs of today, the DWD is configured to restart a crash-looping kube-apiserver after etcd became available again, or any pod depending on the kube-apiserver that has a gardener.cloud/role=controlplane label (e.g., kube-controller-manager, kube-scheduler).\n","categories":"","description":"","excerpt":"Seed Bootstrapping Whenever the gardenlet is responsible for a new …","ref":"/docs/gardener/seed_bootstrapping/","tags":"","title":"Seed Bootstrapping"},{"body":"Settings for Seeds The Seed resource offers a few settings that are used to control the behaviour of certain Gardener components. This document provides an overview over the available settings:\nDependency Watchdog Gardenlet can deploy two instances of the dependency-watchdog into the garden namespace of the seed cluster. 
One instance only activates the weeder while the second instance only activates the prober.\nWeeder The weeder helps to alleviate the delay where control plane components remain unavailable by finding the respective pods in CrashLoopBackoff status and restarting them once their dependencies become ready and available again. For example, if etcd goes down then also kube-apiserver goes down (and into a CrashLoopBackoff state). If etcd comes up again then (without the endpoint controller) it might take some time until kube-apiserver gets restarted as well.\n⚠️ .spec.settings.dependencyWatchdog.endpoint.enabled is deprecated and will be removed in a future version of Gardener. Use .spec.settings.dependencyWatchdog.weeder.enabled instead.\nIt can be enabled/disabled via the .spec.settings.dependencyWatchdog.weeder.enabled field. It defaults to true.\nProber The probe controller scales down the kube-controller-manager of shoot clusters in case their respective kube-apiserver is not reachable via its external ingress. This is in order to avoid melt-down situations, since the kube-controller-manager uses in-cluster communication when talking to the kube-apiserver, i.e., it wouldn’t be affected if the external access to the kube-apiserver is interrupted for whatever reason. The kubelets on the shoot worker nodes, however, would indeed be affected since they typically run in different networks and use the external ingress when talking to the kube-apiserver. Hence, without scaling down kube-controller-manager, the nodes might be marked as NotReady and eventually replaced (since the kubelets cannot report their status anymore). To prevent such unnecessary turbulence, kube-controller-manager is being scaled down until the external ingress becomes available again. 
In addition, as a precautionary measure, machine-controller-manager is also scaled down, along with cluster-autoscaler which depends on machine-controller-manager.\n⚠️ .spec.settings.dependencyWatchdog.probe.enabled is deprecated and will be removed in a future version of Gardener. Use .spec.settings.dependencyWatchdog.prober.enabled instead.\nIt can be enabled/disabled via the .spec.settings.dependencyWatchdog.prober.enabled field. It defaults to true.\nReserve Excess Capacity If the excess capacity reservation is enabled, then the gardenlet will deploy a special Deployment into the garden namespace of the seed cluster. This Deployment’s pod template has only one container, the pause container, which simply runs in an infinite loop. The priority of the deployment is very low, so any other pod will preempt these pause pods. This is especially useful if new shoot control planes are created in the seed. In case the seed cluster runs at its capacity, then there is no waiting time required during the scale-up. Instead, the low-priority pause pods will be preempted and allow newly created shoot control plane pods to be scheduled fast. In the meantime, the cluster-autoscaler will trigger the scale-up because the preempted pause pods want to run again. However, this delay doesn’t affect the important shoot control plane pods, which will improve the user experience.\nUse .spec.settings.excessCapacityReservation.configs to create excess capacity reservation deployments which allow you to specify custom values for resources, nodeSelector and tolerations. Each config creates a deployment with a minimum number of 2 replicas and a maximum equal to the number of zones configured for this seed. It defaults to a config reserving 2 CPUs and 6Gi of memory for each pod with no nodeSelector and no tolerations.\nExcess capacity reservation is enabled when .spec.settings.excessCapacityReservation.enabled is true or not specified while configs are present. 
It can be disabled by setting the field to false.\nScheduling By default, the Gardener Scheduler will consider all seed clusters when a new shoot cluster shall be created. However, administrators/operators might want to exclude some of them from being considered by the scheduler. Therefore, seed clusters can be marked as “invisible”. In this case, the scheduler simply ignores them as if they didn’t exist. Shoots can still use the invisible seed but only by explicitly specifying the name in their .spec.seedName field.\nSeed clusters can be marked visible/invisible via the .spec.settings.scheduling.visible field. It defaults to true.\nℹ️ In previous Gardener versions (\u003c 1.5) these settings were controlled via taint keys (seed.gardener.cloud/{disable-capacity-reservation,invisible}). The taint keys are no longer supported and were removed in version 1.12. The rationale behind it is the implementation of tolerations similar to Kubernetes tolerations. More information about it can be found in #2193.\nLoad Balancer Services Gardener creates certain Kubernetes Service objects of type LoadBalancer in the seed cluster. Most prominently, they are used for exposing the shoot control planes, namely the kube-apiserver of the shoot clusters. In most cases, the cloud-controller-manager (responsible for managing these load balancers on the respective underlying infrastructure) supports certain customization and settings via annotations. This document provides a good overview and many examples.\nBy setting the .spec.settings.loadBalancerServices.annotations field the Gardener administrator can specify a list of annotations, which will be injected into the Services of type LoadBalancer.\nExternal Traffic Policy Setting the external traffic policy to Local can be beneficial as it preserves the source IP address of client requests. In addition to that, it removes one hop in the data path and hence reduces request latency. 
On some cloud infrastructures, it can furthermore be used in conjunction with Service annotations as described above to prevent cross-zonal traffic from the load balancer to the backend pod.\nThe default external traffic policy is Cluster, meaning that all traffic from the load balancer will be sent to any cluster node, which then itself will redirect the traffic to the actual receiving pod. This approach adds a node to the data path, may cross the zone boundaries twice, and replaces the source IP with that of one of the cluster nodes.\nUsing external traffic policy Local drops the additional node, i.e., only cluster nodes with corresponding backend pods will be in the list of backends of the load balancer. However, this has multiple implications. The health check port in this scenario is exposed by kube-proxy, i.e., if kube-proxy is not working on a node, a corresponding pod on the node will not receive traffic from the load balancer as the load balancer will see a failing health check. (This is quite different from ordinary service routing where kube-proxy is only responsible for setup, but does not need to run for its operation.) Furthermore, load balancing may become imbalanced if multiple pods run on the same node because load balancers will split the load equally among the nodes and not among the pods. This is mitigated by corresponding node anti-affinities.\nOperators need to take these implications into account when considering switching external traffic policy to Local.\nProxy Protocol Traditionally, the client IP address can be used for security filtering measures, e.g. IP allow listing. However, for this to have any usefulness, the client IP address needs to be correctly transferred to the filtering entity.\nLoad balancers can either act transparently and simply pass the client IP on, or they terminate one connection and forward data on a new connection. The latter (non-transparent) approach requires a separate way to propagate the client IP address. 
Common approaches are an HTTP header for TLS-terminating load balancers or (HA) proxy protocol.\nFor level 3 load balancers, (HA) proxy protocol is the default way to preserve client IP addresses. As it prepends a small proxy protocol header before the actual workload data, the receiving server needs to be aware of it and handle it properly. This means that activating proxy protocol needs to happen on both the load balancer and the receiving server at/around the same time, as otherwise the receiving server will incorrectly interpret data as workload/proxy protocol header.\nFor disruption-free migration to proxy protocol, set .spec.settings.loadBalancerServices.proxyProtocol.allow to true. The migration path should be to enable the option and shortly thereafter also enable proxy protocol on the load balancer with infrastructure-specific means, e.g. a corresponding load balancer annotation.\nWhen switching back from using proxy protocol to not using it, use the inverse order, i.e. disable proxy protocol first on the load balancer before disabling .spec.settings.loadBalancerServices.proxyProtocol.allow.\nZone-Specific Settings In case a seed cluster is configured to use multiple zones via .spec.provider.zones, it may be necessary to configure the load balancers in individual zones in different ways, e.g., by utilizing different annotations. One reason may be to reduce cross-zonal traffic and have zone-specific load balancers in place. Zone-specific load balancers may then be bound to zone-specific subnets or availability zones in the cloud infrastructure.\nBesides the load balancer annotations, it is also possible to set proxy protocol termination and the external traffic policy for each zone-specific load balancer individually.\nVertical Pod Autoscaler Gardener heavily relies on the Kubernetes vertical-pod-autoscaler component. By default, the seed controller deploys the VPA components into the garden namespace of the respective seed clusters. 
In case you want to manage the VPA deployment on your own or have a custom one, then you might want to disable the automatic deployment by Gardener. Otherwise, you might end up with two VPAs, which will cause erratic behaviour. By setting .spec.settings.verticalPodAutoscaler.enabled=false, you can disable the automatic deployment.\n⚠️ In any case, there must be a VPA available for your seed cluster. Using a seed without VPA is not supported.\nVPA Pitfall: Excessive Resource Requests Making Pod Unschedulable VPA is unaware of node capacity, and can increase the resource requests of a pod beyond the capacity of any single node. Such a pod is likely to become permanently unschedulable. That problem can be partly mitigated by using the VerticalPodAutoscaler.Spec.ResourcePolicy.ContainerPolicies[].MaxAllowed field to constrain pod resource requests to the level of nodes’ allocatable resources. The downside is that a pod constrained in such a fashion would be using more resources than it has requested, and can starve for resources and/or negatively impact neighbour pods with which it is sharing a node.\nAs an alternative, in scenarios where MaxAllowed is not set, it is important to maintain a worker pool which can accommodate the highest level of resources that VPA would actually request for the pods it controls.\nFinally, the optimal strategy typically is to both ensure large enough worker pools, and, as an insurance, use MaxAllowed aligned with the allocatable resources of the largest worker.\nTopology-Aware Traffic Routing Refer to the Topology-Aware Traffic Routing documentation as this document contains the documentation for the topology-aware routing Seed setting.\n","categories":"","description":"","excerpt":"Settings for Seeds The Seed resource offers a few settings that are …","ref":"/docs/gardener/seed_settings/","tags":"","title":"Seed Settings"},{"body":"Packages:\n seedmanagement.gardener.cloud/v1alpha1 seedmanagement.gardener.cloud/v1alpha1 Package v1alpha1 
is a version of the API.\nResource Types: Gardenlet ManagedSeed ManagedSeedSet Gardenlet Gardenlet represents a Gardenlet configuration for an unmanaged seed.\n Field Description apiVersion string seedmanagement.gardener.cloud/v1alpha1 kind string Gardenlet metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec GardenletSpec (Optional) Specification of the Gardenlet.\n deployment GardenletSelfDeployment Deployment specifies certain gardenlet deployment parameters, such as the number of replicas, the image, etc.\n config k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) Config is the GardenletConfiguration used to configure gardenlet.\n kubeconfigSecretRef Kubernetes core/v1.LocalObjectReference (Optional) KubeconfigSecretRef is a reference to a secret containing a kubeconfig for the cluster to which gardenlet should be deployed. This is only used by gardener-operator for a very first gardenlet deployment. After that, gardenlet will continuously upgrade itself. If this field is empty, gardener-operator deploys it into its own runtime cluster.\n status GardenletStatus (Optional) Most recently observed status of the Gardenlet.\n ManagedSeed ManagedSeed represents a Shoot that is registered as Seed.\n Field Description apiVersion string seedmanagement.gardener.cloud/v1alpha1 kind string ManagedSeed metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ManagedSeedSpec (Optional) Specification of the ManagedSeed.\n shoot Shoot (Optional) Shoot references a Shoot that should be registered as Seed. 
This field is immutable.\n gardenlet GardenletConfig (Optional) Gardenlet specifies that the ManagedSeed controller should deploy a gardenlet into the cluster with the given deployment parameters and GardenletConfiguration.\n status ManagedSeedStatus (Optional) Most recently observed status of the ManagedSeed.\n ManagedSeedSet ManagedSeedSet represents a set of identical ManagedSeeds.\n Field Description apiVersion string seedmanagement.gardener.cloud/v1alpha1 kind string ManagedSeedSet metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ManagedSeedSetSpec (Optional) Spec defines the desired identities of ManagedSeeds and Shoots in this set.\n replicas int32 (Optional) Replicas is the desired number of replicas of the given Template. Defaults to 1.\n selector Kubernetes meta/v1.LabelSelector Selector is a label query over ManagedSeeds and Shoots that should match the replica count. It must match the ManagedSeeds and Shoots template’s labels. This field is immutable.\n template ManagedSeedTemplate Template describes the ManagedSeed that will be created if insufficient replicas are detected. Each ManagedSeed created / updated by the ManagedSeedSet will fulfill this template.\n shootTemplate github.com/gardener/gardener/pkg/apis/core/v1beta1.ShootTemplate ShootTemplate describes the Shoot that will be created if insufficient replicas are detected for hosting the corresponding ManagedSeed. Each Shoot created / updated by the ManagedSeedSet will fulfill this template.\n updateStrategy UpdateStrategy (Optional) UpdateStrategy specifies the UpdateStrategy that will be employed to update ManagedSeeds / Shoots in the ManagedSeedSet when a revision is made to Template / ShootTemplate.\n revisionHistoryLimit int32 (Optional) RevisionHistoryLimit is the maximum number of revisions that will be maintained in the ManagedSeedSet’s revision history. Defaults to 10. 
This field is immutable.\n status ManagedSeedSetStatus (Optional) Status is the current status of ManagedSeeds and Shoots in this ManagedSeedSet.\n Bootstrap (string alias)\n (Appears on: GardenletConfig) Bootstrap describes a mechanism for bootstrapping gardenlet connection to the Garden cluster.\nGardenletConfig (Appears on: ManagedSeedSpec) GardenletConfig specifies gardenlet deployment parameters and the GardenletConfiguration used to configure gardenlet.\n Field Description deployment GardenletDeployment (Optional) Deployment specifies certain gardenlet deployment parameters, such as the number of replicas, the image, etc.\n config k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) Config is the GardenletConfiguration used to configure gardenlet.\n bootstrap Bootstrap (Optional) Bootstrap is the mechanism that should be used for bootstrapping gardenlet connection to the Garden cluster. One of ServiceAccount, BootstrapToken, None. If set to ServiceAccount or BootstrapToken, a service account or a bootstrap token will be created in the garden cluster and used to compute the bootstrap kubeconfig. If set to None, the gardenClientConnection.kubeconfig field will be used to connect to the Garden cluster. Defaults to BootstrapToken. This field is immutable.\n mergeWithParent bool (Optional) MergeWithParent specifies whether the GardenletConfiguration of the parent gardenlet should be merged with the specified GardenletConfiguration. Defaults to true. This field is immutable.\n GardenletDeployment (Appears on: GardenletConfig, GardenletSelfDeployment) GardenletDeployment specifies certain gardenlet deployment parameters, such as the number of replicas, the image, etc.\n Field Description replicaCount int32 (Optional) ReplicaCount is the number of gardenlet replicas. Defaults to 2.\n revisionHistoryLimit int32 (Optional) RevisionHistoryLimit is the number of old gardenlet ReplicaSets to retain to allow rollback. 
Defaults to 2.\n serviceAccountName string (Optional) ServiceAccountName is the name of the ServiceAccount to use to run gardenlet pods.\n image Image (Optional) Image is the gardenlet container image.\n resources Kubernetes core/v1.ResourceRequirements (Optional) Resources are the compute resources required by the gardenlet container.\n podLabels map[string]string (Optional) PodLabels are the labels on gardenlet pods.\n podAnnotations map[string]string (Optional) PodAnnotations are the annotations on gardenlet pods.\n additionalVolumes []Kubernetes core/v1.Volume (Optional) AdditionalVolumes is the list of additional volumes that should be mounted by gardenlet containers.\n additionalVolumeMounts []Kubernetes core/v1.VolumeMount (Optional) AdditionalVolumeMounts is the list of additional pod volumes to mount into the gardenlet container’s filesystem.\n env []Kubernetes core/v1.EnvVar (Optional) Env is the list of environment variables to set in the gardenlet container.\n vpa bool (Optional) VPA specifies whether to enable VPA for gardenlet. Defaults to true.\nDeprecated: This field is deprecated and has no effect anymore. It will be removed in the future. TODO(rfranzke): Remove this field after v1.110 has been released.\n GardenletHelm (Appears on: GardenletSelfDeployment) GardenletHelm is the Helm deployment configuration for gardenlet.\n Field Description ociRepository github.com/gardener/gardener/pkg/apis/core/v1.OCIRepository OCIRepository defines where to pull the chart.\n GardenletSelfDeployment (Appears on: GardenletSpec) GardenletSelfDeployment specifies certain gardenlet deployment parameters, such as the number of replicas, the image, etc.\n Field Description GardenletDeployment GardenletDeployment (Members of GardenletDeployment are embedded into this type.) 
(Optional) GardenletDeployment specifies common gardenlet deployment parameters.\n helm GardenletHelm Helm is the Helm deployment configuration.\n imageVectorOverwrite string (Optional) ImageVectorOverwrite is the image vector overwrite for the components deployed by this gardenlet.\n componentImageVectorOverwrite string (Optional) ComponentImageVectorOverwrite is the component image vector overwrite for the components deployed by this gardenlet.\n GardenletSpec (Appears on: Gardenlet) GardenletSpec specifies gardenlet deployment parameters and the configuration used to configure gardenlet.\n Field Description deployment GardenletSelfDeployment Deployment specifies certain gardenlet deployment parameters, such as the number of replicas, the image, etc.\n config k8s.io/apimachinery/pkg/runtime.RawExtension (Optional) Config is the GardenletConfiguration used to configure gardenlet.\n kubeconfigSecretRef Kubernetes core/v1.LocalObjectReference (Optional) KubeconfigSecretRef is a reference to a secret containing a kubeconfig for the cluster to which gardenlet should be deployed. This is only used by gardener-operator for a very first gardenlet deployment. After that, gardenlet will continuously upgrade itself. If this field is empty, gardener-operator deploys it into its own runtime cluster.\n GardenletStatus (Appears on: Gardenlet) GardenletStatus is the status of a Gardenlet.\n Field Description conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of a Gardenlet’s current state.\n observedGeneration int64 (Optional) ObservedGeneration is the most recent generation observed for this Gardenlet. 
It corresponds to the Gardenlet’s generation, which is updated on mutation by the API Server.\n Image (Appears on: GardenletDeployment) Image specifies container image parameters.\n Field Description repository string (Optional) Repository is the image repository.\n tag string (Optional) Tag is the image tag.\n pullPolicy Kubernetes core/v1.PullPolicy (Optional) PullPolicy is the image pull policy. One of Always, Never, IfNotPresent. Defaults to Always if latest tag is specified, or IfNotPresent otherwise.\n ManagedSeedSetSpec (Appears on: ManagedSeedSet) ManagedSeedSetSpec is the specification of a ManagedSeedSet.\n Field Description replicas int32 (Optional) Replicas is the desired number of replicas of the given Template. Defaults to 1.\n selector Kubernetes meta/v1.LabelSelector Selector is a label query over ManagedSeeds and Shoots that should match the replica count. It must match the ManagedSeeds and Shoots template’s labels. This field is immutable.\n template ManagedSeedTemplate Template describes the ManagedSeed that will be created if insufficient replicas are detected. Each ManagedSeed created / updated by the ManagedSeedSet will fulfill this template.\n shootTemplate github.com/gardener/gardener/pkg/apis/core/v1beta1.ShootTemplate ShootTemplate describes the Shoot that will be created if insufficient replicas are detected for hosting the corresponding ManagedSeed. Each Shoot created / updated by the ManagedSeedSet will fulfill this template.\n updateStrategy UpdateStrategy (Optional) UpdateStrategy specifies the UpdateStrategy that will be employed to update ManagedSeeds / Shoots in the ManagedSeedSet when a revision is made to Template / ShootTemplate.\n revisionHistoryLimit int32 (Optional) RevisionHistoryLimit is the maximum number of revisions that will be maintained in the ManagedSeedSet’s revision history. Defaults to 10. 
This field is immutable.\n ManagedSeedSetStatus (Appears on: ManagedSeedSet) ManagedSeedSetStatus represents the current state of a ManagedSeedSet.\n Field Description observedGeneration int64 ObservedGeneration is the most recent generation observed for this ManagedSeedSet. It corresponds to the ManagedSeedSet’s generation, which is updated on mutation by the API Server.\n replicas int32 Replicas is the number of replicas (ManagedSeeds and their corresponding Shoots) created by the ManagedSeedSet controller.\n readyReplicas int32 ReadyReplicas is the number of ManagedSeeds created by the ManagedSeedSet controller that have a Ready Condition.\n nextReplicaNumber int32 NextReplicaNumber is the ordinal number that will be assigned to the next replica of the ManagedSeedSet.\n currentReplicas int32 CurrentReplicas is the number of ManagedSeeds created by the ManagedSeedSet controller from the ManagedSeedSet version indicated by CurrentRevision.\n updatedReplicas int32 UpdatedReplicas is the number of ManagedSeeds created by the ManagedSeedSet controller from the ManagedSeedSet version indicated by UpdateRevision.\n currentRevision string CurrentRevision, if not empty, indicates the version of the ManagedSeedSet used to generate ManagedSeeds with smaller ordinal numbers during updates.\n updateRevision string UpdateRevision, if not empty, indicates the version of the ManagedSeedSet used to generate ManagedSeeds with larger ordinal numbers during updates\n collisionCount int32 (Optional) CollisionCount is the count of hash collisions for the ManagedSeedSet. 
The ManagedSeedSet controller uses this field as a collision avoidance mechanism when it needs to create the name for the newest ControllerRevision.\n conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of a ManagedSeedSet’s current state.\n pendingReplica PendingReplica (Optional) PendingReplica, if not empty, indicates the replica that is currently pending creation, update, or deletion. This replica is in a state that requires the controller to wait for it to change before advancing to the next replica.\n ManagedSeedSpec (Appears on: ManagedSeed, ManagedSeedTemplate) ManagedSeedSpec is the specification of a ManagedSeed.\n Field Description shoot Shoot (Optional) Shoot references a Shoot that should be registered as Seed. This field is immutable.\n gardenlet GardenletConfig (Optional) Gardenlet specifies that the ManagedSeed controller should deploy a gardenlet into the cluster with the given deployment parameters and GardenletConfiguration.\n ManagedSeedStatus (Appears on: ManagedSeed) ManagedSeedStatus is the status of a ManagedSeed.\n Field Description conditions []github.com/gardener/gardener/pkg/apis/core/v1beta1.Condition (Optional) Conditions represents the latest available observations of a ManagedSeed’s current state.\n observedGeneration int64 ObservedGeneration is the most recent generation observed for this ManagedSeed. It corresponds to the ManagedSeed’s generation, which is updated on mutation by the API Server.\n ManagedSeedTemplate (Appears on: ManagedSeedSetSpec) ManagedSeedTemplate is a template for creating a ManagedSeed object.\n Field Description metadata Kubernetes meta/v1.ObjectMeta (Optional) Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. 
spec ManagedSeedSpec (Optional) Specification of the desired behavior of the ManagedSeed.\n shoot Shoot (Optional) Shoot references a Shoot that should be registered as Seed. This field is immutable.\n gardenlet GardenletConfig (Optional) Gardenlet specifies that the ManagedSeed controller should deploy a gardenlet into the cluster with the given deployment parameters and GardenletConfiguration.\n PendingReplica (Appears on: ManagedSeedSetStatus) PendingReplica contains information about a replica that is currently pending creation, update, or deletion.\n Field Description name string Name is the replica name.\n reason PendingReplicaReason Reason is the reason for the replica to be pending.\n since Kubernetes meta/v1.Time Since is the moment in time since the replica is pending with the specified reason.\n retries int32 (Optional) Retries is the number of times the shoot operation (reconcile or delete) has been retried after having failed. Only applicable if Reason is ShootReconciling or ShootDeleting.\n PendingReplicaReason (string alias)\n (Appears on: PendingReplica) PendingReplicaReason is a string enumeration type that enumerates all possible reasons for a replica to be pending.\nRollingUpdateStrategy (Appears on: UpdateStrategy) RollingUpdateStrategy is used to communicate parameters for RollingUpdateStrategyType.\n Field Description partition int32 (Optional) Partition indicates the ordinal at which the ManagedSeedSet should be partitioned. Defaults to 0.\n Shoot (Appears on: ManagedSeedSpec) Shoot identifies the Shoot that should be registered as Seed.\n Field Description name string Name is the name of the Shoot that will be registered as Seed.\n UpdateStrategy (Appears on: ManagedSeedSetSpec) UpdateStrategy specifies the strategy that the ManagedSeedSet controller will use to perform updates. 
It includes any additional parameters necessary to perform the update for the indicated strategy.\n Field Description type UpdateStrategyType (Optional) Type indicates the type of the UpdateStrategy. Defaults to RollingUpdate.\n rollingUpdate RollingUpdateStrategy (Optional) RollingUpdate is used to communicate parameters when Type is RollingUpdateStrategyType.\n UpdateStrategyType (string alias)\n (Appears on: UpdateStrategy) UpdateStrategyType is a string enumeration type that enumerates all possible update strategies for the ManagedSeedSet controller.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n seedmanagement.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/seedmanagement/","tags":"","title":"Seedmanagement"},{"body":"Service Account Manager Overview With Gardener v1.47, a new role called serviceaccountmanager was introduced. This role allows a user to fully manage ServiceAccounts in the project namespace and request tokens for them. 
This is the preferred way of managing the access to a project namespace, as it aims to replace the usage of the default ServiceAccount secrets that will no longer be generated automatically.\nActions Once assigned the serviceaccountmanager role, a user can create/update/delete ServiceAccounts in the project namespace.\nCreate a Service Account In order to create a ServiceAccount named “robot-user”, run the following kubectl command:\nkubectl -n project-abc create sa robot-user Request a Token for a Service Account A token for the “robot-user” ServiceAccount can be requested via the TokenRequest API in several ways:\nkubectl -n project-abc create token robot-user --duration=3600s directly calling the Kubernetes HTTP API curl -X POST https://api.gardener/api/v1/namespaces/project-abc/serviceaccounts/robot-user/token \\ -H \"Authorization: Bearer \u003cauth-token\u003e\" \\ -H \"Content-Type: application/json\" \\ -d '{ \"apiVersion\": \"authentication.k8s.io/v1\", \"kind\": \"TokenRequest\", \"spec\": { \"expirationSeconds\": 3600 } }' Mind that the returned token is not stored within the Kubernetes cluster, will be valid for 3600 seconds, and will be invalidated if the “robot-user” ServiceAccount is deleted. Although expirationSeconds can be modified depending on the needs, the returned token’s validity will not exceed the configured service-account-max-token-expiration duration for the garden cluster. It is advised that the actual expirationTimestamp is verified so that expectations are met. 
This can be done by asserting the expirationTimestamp in the TokenRequestStatus or the exp claim in the token itself.\nDelete a Service Account In order to delete the ServiceAccount named “robot-user”, run the following kubectl command:\nkubectl -n project-abc delete sa robot-user This will invalidate all existing tokens for the “robot-user” ServiceAccount.\n","categories":"","description":"The role that allows a user to manage ServiceAccounts in the project namespace","excerpt":"The role that allows a user to manage ServiceAccounts in the project …","ref":"/docs/gardener/service-account-manager/","tags":"","title":"Service Account Manager"},{"body":"Packages:\n settings.gardener.cloud/v1alpha1 settings.gardener.cloud/v1alpha1 Package v1alpha1 is a version of the API.\nResource Types: ClusterOpenIDConnectPreset OpenIDConnectPreset ClusterOpenIDConnectPreset ClusterOpenIDConnectPreset is an OpenID Connect configuration that is applied to Shoot objects cluster-wide.\n Field Description apiVersion string settings.gardener.cloud/v1alpha1 kind string ClusterOpenIDConnectPreset metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec ClusterOpenIDConnectPresetSpec Spec is the specification of this OpenIDConnect preset.\n OpenIDConnectPresetSpec OpenIDConnectPresetSpec (Members of OpenIDConnectPresetSpec are embedded into this type.) projectSelector Kubernetes meta/v1.LabelSelector (Optional) Project decides whether to apply the configuration if the Shoot is in a specific Project matching the label selector. Use the selector only if the OIDC Preset is opt-in, because end users may skip the admission by setting the labels. 
Defaults to the empty LabelSelector, which matches everything.\n OpenIDConnectPreset OpenIDConnectPreset is an OpenID Connect configuration that is applied to a Shoot in a namespace.\n Field Description apiVersion string settings.gardener.cloud/v1alpha1 kind string OpenIDConnectPreset metadata Kubernetes meta/v1.ObjectMeta Standard object metadata.\nRefer to the Kubernetes API documentation for the fields of the metadata field. spec OpenIDConnectPresetSpec Spec is the specification of this OpenIDConnect preset.\n server KubeAPIServerOpenIDConnect Server contains the kube-apiserver’s OpenID Connect configuration. This configuration is not overwriting any existing OpenID Connect configuration already set on the Shoot object.\n client OpenIDConnectClientAuthentication (Optional) Client contains the configuration used for client OIDC authentication of Shoot clusters. This configuration is not overwriting any existing OpenID Connect client authentication already set on the Shoot object.\nDeprecated: The OpenID Connect configuration this field specifies is not used and will be forbidden starting from Kubernetes 1.31. Its use was planned for generating OIDC kubeconfig https://github.com/gardener/gardener/issues/1433 TODO(AleksandarSavchev): Drop this field after support for Kubernetes 1.30 is dropped.\n shootSelector Kubernetes meta/v1.LabelSelector (Optional) ShootSelector decides whether to apply the configuration if the Shoot has matching labels. Use the selector only if the OIDC Preset is opt-in, because end users may skip the admission by setting the labels. Defaults to the empty LabelSelector, which matches everything.\n weight int32 Weight associated with matching the corresponding preset, in the range 1-100. 
Required.\n ClusterOpenIDConnectPresetSpec (Appears on: ClusterOpenIDConnectPreset) ClusterOpenIDConnectPresetSpec contains the OpenIDConnect specification and project selector matching Shoots in Projects.\n Field Description OpenIDConnectPresetSpec OpenIDConnectPresetSpec (Members of OpenIDConnectPresetSpec are embedded into this type.) projectSelector Kubernetes meta/v1.LabelSelector (Optional) Project decides whether to apply the configuration if the Shoot is in a specific Project matching the label selector. Use the selector only if the OIDC Preset is opt-in, because end users may skip the admission by setting the labels. Defaults to the empty LabelSelector, which matches everything.\n KubeAPIServerOpenIDConnect (Appears on: OpenIDConnectPresetSpec) KubeAPIServerOpenIDConnect contains configuration settings for the OIDC provider. Note: Descriptions were taken from the Kubernetes documentation.\n Field Description caBundle string (Optional) If set, the OpenID server’s certificate will be verified by one of the authorities in the oidc-ca-file, otherwise the host’s root CA set will be used.\n clientID string The client ID for the OpenID Connect client. Required.\n groupsClaim string (Optional) If provided, the name of a custom OpenID Connect claim for specifying user groups. The claim value is expected to be a string or array of strings. This field is experimental, please see the authentication documentation for further details.\n groupsPrefix string (Optional) If provided, all groups will be prefixed with this value to prevent conflicts with other authentication strategies.\n issuerURL string The URL of the OpenID issuer, only HTTPS scheme will be accepted. If set, it will be used to verify the OIDC JSON Web Token (JWT). Required.\n requiredClaims map[string]string (Optional) key=value pairs that describe a required claim in the ID Token. 
If set, the claim is verified to be present in the ID Token with a matching value.\n signingAlgs []string (Optional) List of allowed JOSE asymmetric signing algorithms. JWTs with an ‘alg’ header value not in this list will be rejected. Values are defined by RFC 7518 https://tools.ietf.org/html/rfc7518#section-3.1 Defaults to [RS256]\n usernameClaim string (Optional) The OpenID claim to use as the user name. Note that claims other than the default (‘sub’) are not guaranteed to be unique and immutable. This field is experimental, please see the authentication documentation for further details. Defaults to “sub”.\n usernamePrefix string (Optional) If provided, all usernames will be prefixed with this value. If not provided, username claims other than ‘email’ are prefixed by the issuer URL to avoid clashes. To skip any prefixing, provide the value ‘-’.\n OpenIDConnectClientAuthentication (Appears on: OpenIDConnectPresetSpec) OpenIDConnectClientAuthentication contains configuration for OIDC clients.\n Field Description secret string (Optional) The client Secret for the OpenID Connect client.\n extraConfig map[string]string (Optional) Extra configuration added to kubeconfig’s auth-provider. Must not be any of idp-issuer-url, client-id, client-secret, idp-certificate-authority, idp-certificate-authority-data, id-token or refresh-token\n OpenIDConnectPresetSpec (Appears on: OpenIDConnectPreset, ClusterOpenIDConnectPresetSpec) OpenIDConnectPresetSpec contains the Shoot selector for which a specific OpenID Connect configuration is applied.\n Field Description server KubeAPIServerOpenIDConnect Server contains the kube-apiserver’s OpenID Connect configuration. This configuration is not overwriting any existing OpenID Connect configuration already set on the Shoot object.\n client OpenIDConnectClientAuthentication (Optional) Client contains the configuration used for client OIDC authentication of Shoot clusters. 
This configuration is not overwriting any existing OpenID Connect client authentication already set on the Shoot object.\nDeprecated: The OpenID Connect configuration this field specifies is not used and will be forbidden starting from Kubernetes 1.31. Its use was planned for generating OIDC kubeconfig https://github.com/gardener/gardener/issues/1433 TODO(AleksandarSavchev): Drop this field after support for Kubernetes 1.30 is dropped.\n shootSelector Kubernetes meta/v1.LabelSelector (Optional) ShootSelector decides whether to apply the configuration if the Shoot has matching labels. Use the selector only if the OIDC Preset is opt-in, because end users may skip the admission by setting the labels. Defaults to the empty LabelSelector, which matches everything.\n weight int32 Weight associated with matching the corresponding preset, in the range 1-100. Required.\n Generated with gen-crd-api-reference-docs \n","categories":"","description":"","excerpt":"Packages:\n settings.gardener.cloud/v1alpha1 …","ref":"/docs/gardener/api-reference/settings/","tags":"","title":"Settings"},{"body":"Deploying Gardener into a Kubernetes Cluster Similar to Kubernetes, Gardener consists of control plane components (Gardener API server, Gardener controller manager, Gardener scheduler), and an agent component (gardenlet). The control plane is deployed in the so-called garden cluster, while the agent is installed into every seed cluster. Please note that it is possible to use the garden cluster as seed cluster by simply deploying the gardenlet into it.\nWe are providing Helm charts in order to manage the various resources of the components. Please always make sure that you use the Helm chart version that matches the Gardener version you want to deploy.\nDeploying the Gardener Control Plane (API Server, Admission Controller, Controller Manager, Scheduler) The configuration values depict the various options to configure the different components. 
Please consult Gardener Configuration and Usage for component-specific configurations and Authentication of Gardener Control Plane Components Against the Garden Cluster for authentication-related specifics.\nAlso, note that all resources and deployments need to be created in the garden namespace (not overrideable). If you enable the Gardener admission controller as part of your setup, please make sure the garden namespace is labelled with app: gardener. Otherwise, the backing service account for the admission controller Pod might not be created successfully. No action is necessary if you deploy the garden namespace with the Gardener control plane Helm chart.\nAfter preparing your values in a separate controlplane-values.yaml file (values.yaml can be used as a starting point), you can run the following command against your garden cluster:\nhelm install charts/gardener/controlplane \\ --namespace garden \\ --name gardener-controlplane \\ -f controlplane-values.yaml \\ --wait Deploying Gardener Extensions Gardener is an extensible system that does not contain the logic for provider-specific things like DNS management, cloud infrastructures, network plugins, operating system configs, and many more.\nYou have to install extension controllers for these parts. Please consult the documentation regarding extensions to get more information.\nDeploying the Gardener Agent (gardenlet) Please refer to Deploying Gardenlets on how to deploy a gardenlet.\n","categories":"","description":"","excerpt":"Deploying Gardener into a Kubernetes Cluster Similar to Kubernetes, …","ref":"/docs/gardener/deployment/setup_gardener/","tags":"","title":"Setup Gardener"},{"body":"Auto-Scaling in Shoot Clusters There are three auto-scaling scenarios of relevance in Kubernetes clusters in general and Gardener shoot clusters in particular:\n Horizontal node auto-scaling, i.e., dynamically adding and removing worker nodes. Horizontal pod auto-scaling, i.e., dynamically adding and removing pod replicas. 
Vertical pod auto-scaling, i.e., dynamically raising or shrinking the resource requests/limits of pods. This document provides an overview of these scenarios and how the respective auto-scaling components can be enabled and configured. For more details, please see our pod auto-scaling best practices.\nHorizontal Node Auto-Scaling Every shoot cluster that has at least one worker pool with minimum \u003c maximum nodes configuration will get a cluster-autoscaler deployment. Gardener is leveraging the upstream community Kubernetes cluster-autoscaler component. We have forked it to gardener/autoscaler so that it supports the way Gardener manages the worker nodes (leveraging gardener/machine-controller-manager). However, we have not touched the logic by which it makes auto-scaling decisions. Consequently, please refer to the official documentation for this component.\nThe Shoot API allows configuring a few flags of the cluster-autoscaler:\nThere are general options for cluster-autoscaler, and these values will be used for all worker groups except for those overwriting them. Additionally, there are some cluster-autoscaler flags to be set per worker pool. They override any general value such as those specified in the general flags above.\n Only some cluster-autoscaler flags can be configured per worker pool; the set is limited by NodeGroupAutoscalingOptions of the upstream community Kubernetes repository. This list can be found here.\n Horizontal Pod Auto-Scaling This functionality (HPA) is a standard functionality of any Kubernetes cluster (implemented as part of the kube-controller-manager that all Kubernetes clusters have). 
It is always enabled.\nThe Shoot API allows to configure most of the flags of the horizontal-pod-autoscaler.\nVertical Pod Auto-Scaling This form of auto-scaling (VPA) is enabled by default, but it can be switched off in the Shoot by setting .spec.kubernetes.verticalPodAutoscaler.enabled=false in case you deploy your own VPA into your cluster (having more than one VPA on the same set of pods will lead to issues, eventually).\nGardener is leveraging the upstream community Kubernetes vertical-pod-autoscaler. If enabled, Gardener will deploy it as part of the control plane into the seed cluster. It will also be used for the vertical autoscaling of Gardener’s system components deployed into the kube-system namespace of shoot clusters, for example, kube-proxy or metrics-server.\nYou might want to refer to the official documentation for this component to get more information how to use it.\nThe Shoot API allows to configure a few flags of the vertical-pod-autoscaler.\n⚠️ Please note that if you disable VPA, the related CustomResourceDefinitions (ours and yours) will remain in your shoot cluster (whether someone acts on them or not). 
You can delete these CustomResourceDefinitions yourself using kubectl delete crd if you want to get rid of them (in case you statically size all resources, which we do not recommend).\nPod Auto-Scaling Best Practices Please continue reading our pod auto-scaling best practices for more details and recommendations.\n","categories":"","description":"The basics of horizontal Node and vertical Pod auto-scaling","excerpt":"The basics of horizontal Node and vertical Pod auto-scaling","ref":"/docs/gardener/shoot_autoscaling/","tags":"","title":"Shoot Autoscaling"},{"body":"Overview Day two operations for shoot clusters are related to:\n The Kubernetes version of the control plane and the worker nodes The operating system version of the worker nodes Note When referring to an update of the “operating system version” in this document, the update of the machine image of the shoot cluster’s worker nodes is meant. For example, Amazon Machine Images (AMI) for AWS. The following table summarizes what options Gardener offers to maintain these versions:\n Auto-Update Forceful Updates Manual Updates Kubernetes version Patches only Patches and consecutive minor updates only yes Operating system version yes yes yes Allowed Target Versions in the CloudProfile Administrators maintain the allowed target versions that you can update to in the CloudProfile for each IaaS-Provider. Users with access to a Gardener project can check supported target versions with:\nkubectl get cloudprofile [IAAS-SPECIFIC-PROFILE] -o yaml Path Description More Information spec.kubernetes.versions The supported Kubernetes version major.minor.patch. 
Patch releases spec.machineImages The supported operating system versions for worker nodes Both the Kubernetes version and the operating system version follow semantic versioning that allows Gardener to handle updates automatically.\nFor more information, see Semantic Versioning.\nImpact of Version Classifications on Updates Gardener allows to classify versions in the CloudProfile as preview, supported, deprecated, or expired. During maintenance operations, preview versions are excluded from updates, because they’re often recently released versions that haven’t yet undergone thorough testing and may contain bugs or security issues.\nFor more information, see Version Classifications.\nLet Gardener Manage Your Updates The Maintenance Window Gardener can manage updates for you automatically. It offers users to specify a maintenance window during which updates are scheduled:\n The time interval of the maintenance window can’t be less than 30 minutes or more than 6 hours. If there’s no maintenance window specified during the creation of a shoot cluster, Gardener chooses a maintenance window randomly to spread the load. You can either specify the maintenance window in the shoot cluster specification (.spec.maintenance.timeWindow) or the start time of the maintenance window using the Gardener dashboard (CLUSTERS \u003e [YOUR-CLUSTER] \u003e OVERVIEW \u003e Lifecycle \u003e Maintenance).\nAuto-Update and Forceful Updates To trigger updates during the maintenance window automatically, Gardener offers the following methods:\n Auto-update: Gardener starts an update during the next maintenance window whenever there’s a version available in the CloudProfile that is higher than the one of your shoot cluster specification, and that isn’t classified as preview version. 
For Kubernetes versions, auto-update only updates to higher patch levels.\nYou can either activate auto-update on the Gardener dashboard (CLUSTERS \u003e [YOUR-CLUSTER] \u003e OVERVIEW \u003e Lifecycle \u003e Maintenance) or in the shoot cluster specification:\n .spec.maintenance.autoUpdate.kubernetesVersion: true .spec.maintenance.autoUpdate.machineImageVersion: true Forceful updates: In the maintenance window, Gardener compares the current version given in the shoot cluster specification with the version list in the CloudProfile. If the version has an expiration date and if the date is before the start of the maintenance window, Gardener starts an update to the highest version available in the CloudProfile that isn’t classified as preview version. The highest version in CloudProfile can’t have an expiration date. For Kubernetes versions, Gardener only updates to higher patch levels or consecutive minor versions.\n If you don’t want to wait for the next maintenance window, you can annotate the shoot cluster specification with shoot.gardener.cloud/operation: maintain. Gardener then checks immediately if there’s an auto-update or a forceful update needed.\nNote Forceful version updates are executed even if the auto-update for the Kubernetes version(or the auto-update for the machine image version) is deactivated (set to false). With expiration dates, administrators can give shoot cluster owners more time for testing before the actual version update happens, which allows for smoother transitions to new versions.\nKubernetes Update Paths The bigger the delta of the Kubernetes source version and the Kubernetes target version, the better it must be planned and executed by operators. 
Gardener only provides automatic support for updates that can be applied safely to the cluster workload:\n Update Type Example Update Method Patches 1.10.12 to 1.10.13 auto-update or Forceful update Update to consecutive minor version 1.10.12 to 1.11.10 Forceful update Other 1.10.12 to 1.12.0 Manual update Gardener doesn’t support automatic updates of nonconsecutive minor versions, because Kubernetes doesn’t guarantee updateability in this case. However, multiple minor version updates are possible if not only the minor source version is expired, but also the minor target version is expired. Gardener then updates the Kubernetes version first to the expired target version, and waits for the next maintenance window to update this version to the next minor target version.\nWarning The administrator who maintains the CloudProfile has to ensure that the list of Kubernetes versions consists of consecutive minor versions, for example, from 1.10.x to 1.11.y. If the minor version increases in bigger steps, for example, from 1.10.x to 1.12.y, then the shoot cluster updates will fail during the maintenance window. Manual Updates To update the Kubernetes version or the node operating system manually, change the .spec.kubernetes.version field or the .spec.provider.workers.machine.image.version field correspondingly.\nManual updates are required if you would like to do a minor update of the Kubernetes version. 
Gardener doesn’t do such updates automatically, as they can have breaking changes that could impact the cluster workload.\nManual updates are either executed immediately (default) or can be confined to the maintenance time window.\nChoosing the latter option causes changes to the cluster (for example, node pool rolling-updates) and the subsequent reconciliation to only predictably happen during a defined time window (available since Gardener version 1.4).\nFor more information, see Confine Specification Changes/Update Roll Out.\nWarning Before applying such an update on minor or major releases, operators should check for all the breaking changes introduced in the target Kubernetes release changelog. Examples In the examples for the CloudProfile and the shoot cluster specification, only the fields relevant for the example are shown.\nAuto-Update of Kubernetes Version Let’s assume that the Kubernetes versions 1.10.5 and 1.11.0 were added in the following CloudProfile:\nspec: kubernetes: versions: - version: 1.11.0 - version: 1.10.5 - version: 1.10.0 Before this change, the shoot cluster specification looked like this:\nspec: kubernetes: version: 1.10.0 maintenance: timeWindow: begin: 220000+0000 end: 230000+0000 autoUpdate: kubernetesVersion: true As a consequence, the shoot cluster is updated to Kubernetes version 1.10.5 between 22:00-23:00 UTC. 
Your shoot cluster isn’t updated automatically to 1.11.0, even though it’s the highest Kubernetes version in the CloudProfile, because Gardener only does automatic updates of the Kubernetes patch level.\nForceful Update Due to Expired Kubernetes Version Let’s assume the following CloudProfile exists on the cluster:\nspec: kubernetes: versions: - version: 1.12.8 - version: 1.11.10 - version: 1.10.13 - version: 1.10.12 expirationDate: \"2019-04-13T08:00:00Z\" Let’s assume the shoot cluster has the following specification:\nspec: kubernetes: version: 1.10.12 maintenance: timeWindow: begin: 220000+0100 end: 230000+0100 autoUpdate: kubernetesVersion: false The shoot cluster specification refers to a Kubernetes version that has an expirationDate. In the maintenance window on 2019-04-12, the Kubernetes version stays the same as it’s still not expired. But in the maintenance window on 2019-04-14, the Kubernetes version of the shoot cluster is updated to 1.10.13 (independently of the value of .spec.maintenance.autoUpdate.kubernetesVersion).\nForceful Update to New Minor Kubernetes Version Let’s assume the following CloudProfile exists on the cluster:\nspec: kubernetes: versions: - version: 1.12.8 - version: 1.11.10 - version: 1.11.09 - version: 1.10.12 expirationDate: \"2019-04-13T08:00:00Z\" Let’s assume the shoot cluster has the following specification:\nspec: kubernetes: version: 1.10.12 maintenance: timeWindow: begin: 220000+0100 end: 230000+0100 autoUpdate: kubernetesVersion: false The shoot cluster specification refers to a Kubernetes version that has an expirationDate. 
In the maintenance window on 2019-04-14, the Kubernetes version of the shoot cluster is updated to 1.11.10, which is the highest patch version of minor target version 1.11 that follows the source version 1.10.\nAutomatic Update from Expired Machine Image Version Let’s assume the following CloudProfile exists on the cluster:\nspec: machineImages: - name: coreos versions: - version: 2191.5.0 - version: 2191.4.1 - version: 2135.6.0 expirationDate: \"2019-04-13T08:00:00Z\" Let’s assume the shoot cluster has the following specification:\nspec: provider: type: aws workers: - name: name maximum: 1 minimum: 1 maxSurge: 1 maxUnavailable: 0 image: name: coreos version: 2135.6.0 type: m5.large volume: type: gp2 size: 20Gi maintenance: timeWindow: begin: 220000+0100 end: 230000+0100 autoUpdate: machineImageVersion: false The shoot cluster specification refers to a machine image version that has an expirationDate. In the maintenance window on 2019-04-12, the machine image version stays the same as it’s still not expired. But in the maintenance window on 2019-04-14, the machine image version of the shoot cluster is updated to 2191.5.0 (independently of the value of .spec.maintenance.autoUpdate.machineImageVersion) as version 2135.6.0 is expired.\n","categories":"","description":"Understanding and configuring Gardener's Day-2 operations for Shoot clusters.","excerpt":"Understanding and configuring Gardener's Day-2 operations for Shoot …","ref":"/docs/guides/administer-shoots/maintain-shoot/","tags":"","title":"Shoot Cluster Maintenance"},{"body":"Shoot Cluster Purpose The Shoot resource contains a .spec.purpose field indicating how the shoot is used, whose allowed values are as follows:\n evaluation (default): Indicates that the shoot cluster is for evaluation scenarios. development: Indicates that the shoot cluster is for development scenarios. testing: Indicates that the shoot cluster is for testing scenarios. 
production: Indicates that the shoot cluster is for production scenarios. infrastructure: Indicates that the shoot cluster is for infrastructure scenarios (only allowed for shoots in the garden namespace). Behavioral Differences The following lists the differences in the way the shoot clusters are set up based on the selected purpose:\n testing shoot clusters do not get a monitoring or a logging stack as part of their control planes. For production and infrastructure shoot clusters, auto-scaling scale-down of the main ETCD is disabled. There are also differences with respect to how testing shoots are scheduled after creation; please consult the Scheduler documentation.\nFuture Steps We might introduce more behavioral differences depending on the shoot purpose in the future. As of today, there are no such plans.\n","categories":"","description":"Available Shoot cluster purposes and the behavioral differences between them","excerpt":"Available Shoot cluster purposes and the behavioral differences …","ref":"/docs/gardener/shoot_purposes/","tags":"","title":"Shoot Cluster Purposes"},{"body":"Credentials Rotation for Shoot Clusters There are a lot of different credentials for Shoots to make sure that the various components can communicate with each other and to make sure the cluster is usable and operable.\nThis page explains how the varieties of credentials can be rotated so that the cluster can be considered secure.\nUser-Provided Credentials Cloud Provider Keys End-users must provide credentials such that Gardener and Kubernetes controllers can communicate with the respective cloud provider APIs in order to perform infrastructure operations. For example, Gardener uses them to set up and maintain the networks, security groups, subnets, etc., while the cloud-controller-manager uses them to reconcile load balancers and routes, and the CSI controller uses them to reconcile volumes and disks.\nDepending on the cloud provider, the required data keys of the Secret differ. 
Please consult the documentation of the respective provider extension to learn the concrete data keys (e.g., this document for AWS).\nIt is the responsibility of the end-user to regularly rotate those credentials. The following steps are required to perform the rotation:\n Update the data in the Secret with new credentials. ⚠️ Wait until all Shoots using the Secret are reconciled before you disable the old credentials in your cloud provider account! Otherwise, the Shoots will no longer work as expected. Check out this document to learn how to trigger a reconciliation of your Shoots. After all Shoots using the Secret were reconciled, you can go ahead and deactivate the old credentials in your provider account. Gardener-Provided Credentials The credentials below are generated by Gardener when shoot clusters are being created. Those include:\n kubeconfig (if enabled) certificate authorities (and related server and client certificates) observability passwords for Plutono SSH key pair for worker nodes ETCD encryption key ServiceAccount token signing key … 🚨 There is no auto-rotation of those credentials, and it is the responsibility of the end-user to regularly rotate them.\nWhile it is possible to rotate them one by one, there is also a convenient method to combine the rotation of all of those credentials. The rotation happens in two phases since it might be required to update some API clients (e.g., when CAs are rotated). In order to start the rotation (first phase), you have to annotate the shoot with the rotate-credentials-start operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-credentials-start Note: You can check the .status.credentials.rotation field in the Shoot to see when the rotation was last initiated and last completed.\n Kindly consider the detailed descriptions below to learn how the rotation is performed and what your responsibilities are. 
Please note that all respective individual actions apply for this combined rotation as well (e.g., worker nodes are rolled out in the first phase).\nYou can complete the rotation (second phase) by annotating the shoot with the rotate-credentials-complete operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-credentials-complete Kubeconfig If the .spec.kubernetes.enableStaticTokenKubeconfig field is set to true (default), then Gardener generates a kubeconfig with cluster-admin privileges for the Shoots containing credentials for communication with the kube-apiserver (see this document for more information).\nThis Secret is stored with the name \u003cshoot-name\u003e.kubeconfig in the project namespace in the garden cluster and has multiple data keys:\n kubeconfig: the completed kubeconfig ca.crt: the CA bundle for establishing trust to the API server (same as in the Cluster CA bundle secret) Shoots created with Gardener \u003c= 0.28 used to have a kubeconfig based on a client certificate instead of a static token. With the first kubeconfig rotation, such clusters will get a static token as well.\n⚠️ This does not invalidate the old client certificate. In order to do this, you should perform a rotation of the CAs (see section below).\n It is the responsibility of the end-user to regularly rotate those credentials (or disable this kubeconfig entirely). In order to rotate the token in this kubeconfig, annotate the Shoot with gardener.cloud/operation=rotate-kubeconfig-credentials. This operation is not allowed for Shoots that are already marked for deletion. Please note that only the token (and basic auth password, if enabled) are exchanged. 
The CA certificate remains the same (see section below for information about the rotation).\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-kubeconfig-credentials You can check the .status.credentials.rotation.kubeconfig field in the Shoot to see when the rotation was last initiated and last completed.\n Certificate Authorities Gardener generates several certificate authorities (CAs) to ensure secured communication between the various components and actors. Most of those CAs are used for internal communication (e.g., kube-apiserver talks to etcd, vpn-shoot talks to the vpn-seed-server, kubelet talks to kube-apiserver). However, there is also the “cluster CA” which is part of all kubeconfigs and used to sign the server certificate exposed by the kube-apiserver.\nGardener populates a ConfigMap with the name \u003cshoot-name\u003e.ca-cluster in the project namespace in the garden cluster which contains the following data keys:\n ca.crt: the CA bundle of the cluster This bundle contains one or multiple CAs which are used for signing serving certificates of the Shoot’s API server. Hence, the certificates contained in this ConfigMap can be used to verify the API server’s identity when communicating with its public endpoint (e.g., as certificate-authority-data in a kubeconfig). This is the same certificate that is also contained in the kubeconfig’s certificate-authority-data field.\n Shoots created with Gardener \u003e= v1.45 have a dedicated client CA which verifies the legitimacy of client certificates. For older Shoots, the client CA is equal to the cluster CA. With the first CA rotation, such clusters will get a dedicated client CA as well.\n All of the certificates are valid for 10 years. 
Since it requires adaptation for the consumers of the Shoot, there is no automatic rotation and it is the responsibility of the end-user to regularly rotate the CA certificates.\nThe rotation happens in three stages (see also GEP-18 for the full details):\n In stage one, new CAs are created and added to the bundle (together with the old CAs). Client certificates are re-issued immediately. In stage two, end-users update all cluster API clients that communicate with the control plane. In stage three, the old CAs are dropped from the bundle and server certificates are re-issued. Technically, the Preparing phase indicates stage one. Once it is completed, the Prepared phase indicates readiness for stage two. The Completing phase indicates stage three, and the Completed phase states that the rotation process has finished.\n You can check the .status.credentials.rotation.certificateAuthorities field in the Shoot to see when the rotation was last initiated, last completed, and in which phase it currently is.\n In order to start the rotation (stage one), you have to annotate the shoot with the rotate-ca-start operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-ca-start This will trigger a Shoot reconciliation and perform stage one. After it is completed, the .status.credentials.rotation.certificateAuthorities.phase is set to Prepared.\nNow you must update all API clients outside the cluster (such as the kubeconfigs on developer machines) to use the newly issued CA bundle in the \u003cshoot-name\u003e.ca-cluster ConfigMap. Please also note that client certificates must be re-issued now.\nAfter updating all API clients, you can complete the rotation by annotating the shoot with the rotate-ca-complete operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-ca-complete This will trigger another Shoot reconciliation and perform stage three. 
After it is completed, the .status.credentials.rotation.certificateAuthorities.phase is set to Completed. You could update your API clients again and drop the old CA from their bundle.\n Note that the CA rotation also rotates all internal CAs and signed certificates. Hence, most of the components need to be restarted (including etcd and kube-apiserver).\n⚠️ In stage one, all worker nodes of the Shoot will be rolled out to ensure that the Pods as well as the kubelets get the updated credentials as well.\n Observability Password(s) For Plutono and Prometheus For Shoots with .spec.purpose!=testing, Gardener deploys an observability stack with Prometheus for monitoring, Alertmanager for alerting (optional), Vali for logging, and Plutono for visualization. The Plutono instance is exposed via Ingress and accessible for end-users via basic authentication credentials generated and managed by Gardener.\nThose credentials are stored in a Secret with the name \u003cshoot-name\u003e.monitoring in the project namespace in the garden cluster and has multiple data keys:\n username: the user name password: the password auth: the user name with SHA-1 representation of the password It is the responsibility of the end-user to regularly rotate those credentials. In order to rotate the password, annotate the Shoot with gardener.cloud/operation=rotate-observability-credentials. This operation is not allowed for Shoots that are already marked for deletion.\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-observability-credentials You can check the .status.credentials.rotation.observability field in the Shoot to see when the rotation was last initiated and last completed.\n SSH Key Pair for Worker Nodes Gardener generates an SSH key pair whose public key is propagated to all worker nodes of the Shoot. The private key can be used to establish an SSH connection to the workers for troubleshooting purposes. 
It is recommended to use gardenctl-v2 and its gardenctl ssh command since it is required to first open up the security groups and create a bastion VM (no direct SSH access to the worker nodes is possible).\nThe private key is stored in a Secret with the name \u003cshoot-name\u003e.ssh-keypair in the project namespace in the garden cluster and has multiple data keys:\n id_rsa: the private key id_rsa.pub: the public key for SSH In order to rotate the keys, annotate the Shoot with gardener.cloud/operation=rotate-ssh-keypair. This will propagate a new key to all worker nodes while keeping the old key active and valid as well (it will only be invalidated/removed with the next rotation).\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-ssh-keypair You can check the .status.credentials.rotation.sshKeypair field in the Shoot to see when the rotation was last initiated or last completed.\n The old key is stored in a Secret with the name \u003cshoot-name\u003e.ssh-keypair.old in the project namespace in the garden cluster and has the same data keys as the regular Secret.\nETCD Encryption Key This key is used to encrypt the data of Secret resources inside etcd (see upstream Kubernetes documentation).\nThe encryption key has no expiration date. There is no automatic rotation and it is the responsibility of the end-user to regularly rotate the encryption key.\nThe rotation happens in three stages:\n In stage one, a new encryption key is created and added to the bundle (together with the old encryption key). In stage two, all Secrets in the cluster and resources configured in the spec.kubernetes.kubeAPIServer.encryptionConfig of the Shoot (see ETCD Encryption Config) are rewritten by the kube-apiserver so that they become encrypted with the new encryption key. In stage three, the old encryption is dropped from the bundle. Technically, the Preparing phase indicates the stages one and two. 
Once it is completed, the Prepared phase indicates readiness for stage three. The Completing phase indicates stage three, and the Completed phase states that the rotation process has finished.\n You can check the .status.credentials.rotation.etcdEncryptionKey field in the Shoot to see when the rotation was last initiated, last completed, and in which phase it currently is.\n In order to start the rotation (stage one), you have to annotate the shoot with the rotate-etcd-encryption-key-start operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-etcd-encryption-key-start This will trigger a Shoot reconciliation and perform stages one and two. After it is completed, the .status.credentials.rotation.etcdEncryptionKey.phase is set to Prepared. Now you can complete the rotation by annotating the shoot with the rotate-etcd-encryption-key-complete operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-etcd-encryption-key-complete This will trigger another Shoot reconciliation and perform stage three. After it is completed, the .status.credentials.rotation.etcdEncryptionKey.phase is set to Completed.\nServiceAccount Token Signing Key Gardener generates a key which is used to sign the tokens for ServiceAccounts. Those tokens are typically used by workload Pods running inside the cluster in order to authenticate themselves with the kube-apiserver. This also includes system components running in the kube-system namespace.\nThe token signing key has no expiration date. Since it might require adaptation for the consumers of the Shoot, there is no automatic rotation and it is the responsibility of the end-user to regularly rotate the signing key.\nThe rotation happens in three stages, similar to how the CA certificates are rotated:\n In stage one, a new signing key is created and added to the bundle (together with the old signing key). 
In stage two, end-users update all out-of-cluster API clients that communicate with the control plane via ServiceAccount tokens. In stage three, the old signing key is dropped from the bundle. Technically, the Preparing phase indicates stage one. Once it is completed, the Prepared phase indicates readiness for stage two. The Completing phase indicates stage three, and the Completed phase states that the rotation process has finished.\n You can check the .status.credentials.rotation.serviceAccountKey field in the Shoot to see when the rotation was last initiated, last completed, and in which phase it currently is.\n In order to start the rotation (stage one), you have to annotate the shoot with the rotate-serviceaccount-key-start operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-serviceaccount-key-start This will trigger a Shoot reconciliation and perform stage one. After it is completed, the .status.credentials.rotation.serviceAccountKey.phase is set to Prepared.\nNow you must update all API clients outside the cluster using a ServiceAccount token (such as the kubeconfigs on developer machines) to use a token issued by the new signing key. Gardener already generates new secrets for those ServiceAccounts in the cluster, whose static token was automatically created by Kubernetes (typically before v1.22 - ref). However, if you need to create it manually, you can check out this document for instructions.\nAfter updating all API clients, you can complete the rotation by annotating the shoot with the rotate-serviceaccount-key-complete operation:\nkubectl -n \u003cshoot-namespace\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=rotate-serviceaccount-key-complete This will trigger another Shoot reconciliation and perform stage three. 
After it is completed, the .status.credentials.rotation.serviceAccountKey.phase is set to Completed.\n ⚠️ In stage one, all worker nodes of the Shoot will be rolled out to ensure that the Pods use a new token.\n OpenVPN TLS Auth Keys This key is used to ensure encrypted communication for the VPN connection between the control plane in the seed cluster and the shoot cluster. It is currently not rotated automatically and there is no way to trigger it manually.\n","categories":"","description":"","excerpt":"Credentials Rotation for Shoot Clusters There are a lot of different …","ref":"/docs/gardener/shoot_credentials_rotation/","tags":"","title":"Shoot Credentials Rotation"},{"body":"Introduction This extension implements cosign image verification. It is strictly limited only to the kubernetes system components deployed by Gardener and other Gardener Extensions in the kube-system namespace of a shoot cluster.\nShoot Feature Gate In most of the Gardener setups the shoot-lakom-service extension is enabled globally and thus can be configured per shoot cluster. Please adapt the shoot specification by the configuration shown below to disable the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-lakom-service disabled: true providerConfig: apiVersion: lakom.extensions.gardener.cloud/v1alpha1 kind: LakomConfig scope: KubeSystem ... The scope field instructs Lakom which pods to validate. The possible values are:\n KubeSystem Lakom will validate all pods in the kube-system namespace. KubeSystemManagedByGardener Lakom will validate all pods in the kube-system namespace that are annotated with “managed-by/gardener” Cluster Lakom will validate all pods in all namespaces. ","categories":"","description":"","excerpt":"Introduction This extension implements cosign image verification. 
It …","ref":"/docs/extensions/others/gardener-extension-shoot-lakom-service/shoot-extension/","tags":"","title":"Shoot Extension"},{"body":"Contributing to Shoot Health Status Conditions Gardener checks regularly (every minute by default) the health status of all shoot clusters. It categorizes its checks into five different types:\n APIServerAvailable: This type indicates whether the shoot’s kube-apiserver is available or not. ControlPlaneHealthy: This type indicates whether the core components of the Shoot controlplane (ETCD, KAPI, KCM..) are healthy. EveryNodeReady: This type indicates whether all Nodes and all Machine objects report healthiness. ObservabilityComponentsHealthy: This type indicates whether the observability components of the Shoot control plane (Prometheus, Vali, Plutono..) are healthy. SystemComponentsHealthy: This type indicates whether all system components deployed to the kube-system namespace in the shoot do exist and are running fine. In case of workerless Shoot, EveryNodeReady condition is not present in the Shoot’s conditions since there are no nodes in the cluster.\nEvery Shoot resource has a status.conditions[] list that contains the mentioned types, together with a status (True/False) and a descriptive message/explanation of the status.\nMost extension controllers are deploying components and resources as part of their reconciliation flows into the seed or shoot cluster. A prominent example for this is the ControlPlane controller that usually deploys a cloud-controller-manager or CSI controllers as part of the shoot control plane. Now that the extensions deploy resources into the cluster, especially resources that are essential for the functionality of the cluster, they might want to contribute to Gardener’s checks mentioned above.\nWhat can extensions do to contribute to Gardener’s health checks? Every extension resource in Gardener’s extensions.gardener.cloud/v1alpha1 API group also has a status.conditions[] list (like the Shoot). 
Extension controllers can write conditions to the resource they are acting on and use a type that also exists in the shoot’s conditions. One exception is that APIServerAvailable can’t be used, as Gardener clearly can identify the status of this condition and it doesn’t make sense for extensions to try to contribute/modify it.\nAs an example for the ControlPlane controller, let’s take a look at the following resource:\napiVersion: extensions.gardener.cloud/v1alpha1 kind: ControlPlane metadata: name: control-plane namespace: shoot--foo--bar spec: ... status: conditions: - type: ControlPlaneHealthy status: \"False\" reason: DeploymentUnhealthy message: 'Deployment cloud-controller-manager is unhealthy: condition \"Available\" has invalid status False (expected True) due to MinimumReplicasUnavailable: Deployment does not have minimum availability.' lastUpdateTime: \"2014-05-25T12:44:27Z\" - type: ConfigComputedSuccessfully status: \"True\" reason: ConfigCreated message: The cloud-provider-config has been successfully computed. lastUpdateTime: \"2014-05-25T12:43:27Z\" The extension controller has declared in its extension resource that one of the deployments it is responsible for is unhealthy. Also, it has written a second condition using a type that is unknown by Gardener.\nGardener will pick the list of conditions and recognize that there is one with a type ControlPlaneHealthy. It will merge it with its own ControlPlaneHealthy condition and report it back to the Shoot’s status:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: labels: shoot.gardener.cloud/status: unhealthy name: some-shoot namespace: garden-core spec: status: conditions: - type: APIServerAvailable status: \"True\" reason: HealthzRequestSucceeded message: API server /healthz endpoint responded with success status code. 
[response_time:31ms] lastUpdateTime: \"2014-05-23T08:26:52Z\" lastTransitionTime: \"2014-05-25T12:45:13Z\" - type: ControlPlaneHealthy status: \"False\" reason: ControlPlaneUnhealthyReport message: 'Deployment cloud-controller-manager is unhealthy: condition \"Available\" has invalid status False (expected True) due to MinimumReplicasUnavailable: Deployment does not have minimum availability.' lastUpdateTime: \"2014-05-25T12:45:13Z\" lastTransitionTime: \"2014-05-25T12:45:13Z\" ... Hence, the only duty extensions have is to maintain the health status of their components in the extension resource they are managing. This can be accomplished using the health check library for extensions.\nError Codes The Gardener API includes some well-defined error codes, e.g., ERR_INFRA_UNAUTHORIZED, ERR_INFRA_DEPENDENCIES, etc. Extension may set these error codes in the .status.conditions[].codes[] list in case it makes sense. Gardener will pick them up and will similarly merge them into the .status.conditions[].codes[] list in the Shoot:\nstatus: conditions: - type: ControlPlaneHealthy status: \"False\" reason: DeploymentUnhealthy message: 'Deployment cloud-controller-manager is unhealthy: condition \"Available\" has invalid status False (expected True) due to MinimumReplicasUnavailable: Deployment does not have minimum availability.' lastUpdateTime: \"2014-05-25T12:44:27Z\" codes: - ERR_INFRA_UNAUTHORIZED ","categories":"","description":"","excerpt":"Contributing to Shoot Health Status Conditions Gardener checks …","ref":"/docs/gardener/extensions/shoot-health-status-conditions/","tags":"","title":"Shoot Health Status Conditions"},{"body":"Shoot Hibernation Clusters are only needed 24 hours a day if they run productive workload. So whenever you do development in a cluster, or just use it for tests or demo purposes, you can save a lot of money if you scale-down your Kubernetes resources whenever you don’t need them. 
However, scaling them down manually can become time-consuming the more resources you have.\nGardener offers a clever way to automatically scale-down all resources to zero: cluster hibernation. You can either hibernate a cluster by pushing a button, or by defining a hibernation schedule.\n To save costs, it’s recommended to define a hibernation schedule before the creation of a cluster. You can hibernate your cluster or wake up your cluster manually even if there’s a schedule for its hibernation.\n Hibernate a Cluster What Is Hibernation? What Isn’t Affected by the Hibernation? Hibernate Your Cluster Manually Wake Up Your Cluster Manually Create a Schedule to Hibernate Your Cluster What Is Hibernation? When a cluster is hibernated, Gardener scales down the worker nodes and the cluster’s control plane to free resources at the IaaS provider. This affects:\n Your workload, for example, pods, deployments, custom resources. The virtual machines running your workload. The resources of the control plane of your cluster. What Isn’t Affected by the Hibernation? To scale up everything where it was before hibernation, Gardener doesn’t delete state-related information, that is, information stored in persistent volumes. The cluster state as persistent in etcd is also preserved.\nHibernate Your Cluster Manually The .spec.hibernation.enabled field specifies whether the cluster needs to be hibernated or not. If the field is set to true, the cluster’s desired state is to be hibernated. 
If it is set to false or not specified at all, the cluster’s desired state is to be awakened.\nTo hibernate your cluster, you can run the following kubectl command:\n$ kubectl patch shoot -n $NAMESPACE $SHOOT_NAME -p '{\"spec\":{\"hibernation\":{\"enabled\": true}}}' Wake Up Your Cluster Manually To wake up your cluster, you can run the following kubectl command:\n$ kubectl patch shoot -n $NAMESPACE $SHOOT_NAME -p '{\"spec\":{\"hibernation\":{\"enabled\": false}}}' Create a Schedule to Hibernate Your Cluster You can specify a hibernation schedule to automatically hibernate/wake up a cluster.\nLet’s have a look into the following example:\n hibernation: enabled: false schedules: - start: \"0 20 * * *\" # Start hibernation every day at 8PM end: \"0 6 * * *\" # Stop hibernation every day at 6AM location: \"America/Los_Angeles\" # Specify a location for the cron to run in The above section configures a hibernation schedule that hibernates the cluster every day at 08:00 PM and wakes it up at 06:00 AM. The start or end fields can be omitted, though at least one of them has to be specified. Hence, it is possible to configure a hibernation schedule that only hibernates or wakes up a cluster. The location field is the time location used to evaluate the cron expressions.\n","categories":"","description":"What is hibernation? Manual hibernation/wake up and specifying a hibernation schedule","excerpt":"What is hibernation? 
Manual hibernation/wake up and specifying a …","ref":"/docs/gardener/shoot_hibernate/","tags":"","title":"Shoot Hibernation"},{"body":"Highly Available Shoot Control Plane The Shoot resource offers a way to request a highly available control plane.\nFailure Tolerance Types A highly available shoot control plane can be set up with a failure tolerance of either zone or node.\nNode Failure Tolerance The failure tolerance of a node will have the following characteristics:\n Control plane components will be spread across different nodes within a single availability zone. There will not be more than one replica per node for each control plane component which has more than one replica. Worker pool should have a minimum of 3 nodes. A multi-node etcd (quorum size of 3) will be provisioned, offering zero-downtime capabilities with each member in a different node within a single availability zone. Zone Failure Tolerance The failure tolerance of a zone will have the following characteristics:\n Control plane components will be spread across different availability zones. There will be at least one replica per zone for each control plane component which has more than one replica. Gardener scheduler will automatically select a seed which has a minimum of 3 zones to host the shoot control plane. A multi-node etcd (quorum size of 3) will be provisioned, offering zero-downtime capabilities with each member in a different zone. Shoot Spec To request a highly available shoot control plane, Gardener provides the following configuration in the shoot spec:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: controlPlane: highAvailability: failureTolerance: type: \u003cnode | zone\u003e Allowed Transitions\nIf you already have a shoot cluster with non-HA control plane, then the following upgrades are possible:\n Upgrade of non-HA shoot control plane to HA shoot control plane with node failure tolerance. 
Upgrade of non-HA shoot control plane to HA shoot control plane with zone failure tolerance. However, it is essential that the seed which is currently hosting the shoot control plane should be multi-zonal. If it is not, then the request to upgrade will be rejected. Note: There will be a small downtime during the upgrade, especially for etcd, which will transition from a single node etcd cluster to a multi-node etcd cluster.\n Disallowed Transitions\nIf you already have a shoot cluster with HA control plane, then the following transitions are not possible:\n Upgrade of HA shoot control plane from node failure tolerance to zone failure tolerance is currently not supported, mainly because already existing volumes are bound to the zone they were created in originally. Downgrade of HA shoot control plane with zone failure tolerance to node failure tolerance is currently not supported, mainly because of the same reason as above, that already existing volumes are bound to the respective zones they were created in originally. Downgrade of HA shoot control plane with either node or zone failure tolerance, to a non-HA shoot control plane is currently not supported, mainly because etcd-druid does not currently support scaling down of a multi-node etcd cluster to a single-node etcd cluster. Zone Outage Situation Implementing highly available software that can tolerate even a zone outage unscathed is no trivial task. You may find our HA Best Practices helpful to get closer to that goal. In this document, we collected many options and settings for you that also Gardener internally uses to provide a highly available service.\nDuring a zone outage, you may be forced to change your cluster setup on short notice in order to compensate for failures and shortages resulting from the outage. For instance, if the shoot cluster has worker nodes across three zones where one zone goes down, the computing power from these nodes is also gone during that time. 
Changing the worker pool (shoot.spec.provider.workers[]) and infrastructure (shoot.spec.provider.infrastructureConfig) configuration can eliminate this disbalance, having enough machines in healthy availability zones that can cope with the requests of your applications.\nGardener relies on a sophisticated reconciliation flow with several dependencies for which various flow steps wait for the readiness of prior ones. During a zone outage, this can block the entire flow, e.g., because all three etcd replicas can never be ready when a zone is down, and required changes mentioned above can never be accomplished. For this, a special one-off annotation shoot.gardener.cloud/skip-readiness helps to skip any readiness checks in the flow.\n The shoot.gardener.cloud/skip-readiness annotation serves as a last resort if reconciliation is stuck because of important changes during an AZ outage. Use it with caution, only in exceptional cases and after a case-by-case evaluation with your Gardener landscape administrator. If used together with other operations like Kubernetes version upgrades or credential rotation, the annotation may lead to a severe outage of your shoot control plane.\n ","categories":"","description":"Failure tolerance types `node` and `zone`. Possible mitigations for zone or node outages","excerpt":"Failure tolerance types `node` and `zone`. Possible mitigations for …","ref":"/docs/gardener/shoot_high_availability/","tags":"","title":"Shoot High Availability"},{"body":"Shoot Info ConfigMap Overview The gardenlet maintains a ConfigMap inside the Shoot cluster that contains information about the cluster itself. 
The ConfigMap is named shoot-info and located in the kube-system namespace.\nFields The following fields are provided:\napiVersion: v1 kind: ConfigMap metadata: name: shoot-info namespace: kube-system data: domain: crazy-botany.core.my-custom-domain.com # .spec.dns.domain field from the Shoot resource extensions: foobar,foobaz # List of extensions that are enabled kubernetesVersion: 1.25.4 # .spec.kubernetes.version field from the Shoot resource maintenanceBegin: 220000+0100 # .spec.maintenance.timeWindow.begin field from the Shoot resource maintenanceEnd: 230000+0100 # .spec.maintenance.timeWindow.end field from the Shoot resource nodeNetwork: 10.250.0.0/16 # .spec.networking.nodes field from the Shoot resource podNetwork: 100.96.0.0/11 # .spec.networking.pods field from the Shoot resource projectName: dev # .metadata.name of the Project provider: \u003csome-provider-name\u003e # .spec.provider.type field from the Shoot resource region: europe-central-1 # .spec.region field from the Shoot resource serviceNetwork: 100.64.0.0/13 # .spec.networking.services field from the Shoot resource shootName: crazy-botany # .metadata.name from the Shoot resource ","categories":"","description":"","excerpt":"Shoot Info ConfigMap Overview The gardenlet maintains a ConfigMap …","ref":"/docs/gardener/shoot_info_configmap/","tags":"","title":"Shoot Info Configmap"},{"body":"Shoot Kubernetes and Operating System Versioning in Gardener Motivation On the one hand-side, Gardener is responsible for managing the Kubernetes and the Operating System (OS) versions of its Shoot clusters. On the other hand-side, Gardener needs to be configured and updated based on the availability and support of the Kubernetes and Operating System version it provides. For instance, the Kubernetes community releases minor versions roughly every three months and usually maintains three minor versions (the current and the last two) with bug fixes and security updates. 
Patch releases are done more frequently.\nWhen using the term Machine image in the following, we refer to the OS version that comes with the machine image of the node/worker pool of a Gardener Shoot cluster. As such, we are not referring to the CloudProvider specific machine image like the AMI for AWS. For more information on how Gardener maps machine image versions to CloudProvider specific machine images, take a look at the individual gardener extension providers, such as the provider for AWS.\nGardener should be configured accordingly to reflect the “logical state” of a version. It should be possible to define the Kubernetes or Machine image versions that still receive bug fixes and security patches, and, conversely, to define the versions that are out-of-maintenance and are potentially vulnerable. Moreover, this allows Gardener to “understand” the current state of a version and act upon it (more information in the following sections).\nOverview As a Gardener operator:\n I can classify a version based on its logical state (preview, supported, deprecated, and expired; see Version Classification). I can define which Machine image and Kubernetes versions are eligible for the auto update of clusters during the maintenance time. I can define a moment in time when Shoot clusters are forcefully migrated off a certain version (through an expirationDate). I can define an update path for machine images for auto and force updates (see Update path for machine image versions). I can disallow the creation of clusters having a certain version (think of severe security issues). As an end-user/Shoot owner of Gardener:\n I can get information about which Kubernetes and Machine image versions exist and their classification. I can determine the time when my Shoot cluster’s Machine image and Kubernetes version will be forcefully updated to the next patch or minor version (in case the cluster is running a deprecated version with an expiration date). 
I can get this information via API from the CloudProfile. Version Classifications Administrators can classify versions into four distinct “logical states”: preview, supported, deprecated, and expired. The version classification serves as a “point-of-reference” for end-users and also has implications during shoot creation and the maintenance time.\nIf a version is unclassified, Gardener cannot make those decisions based on the “logical state”. Nevertheless, Gardener can operate without version classifications, and classifications can be added at any time to the Kubernetes and machine image versions in the CloudProfile.\nAs a best practice, versions usually start with the classification preview, then are promoted to supported, eventually deprecated and finally expired. This information is programmatically available in the CloudProfiles of the Garden cluster.\n preview: A preview version is a new version that has not yet undergone thorough testing, possibly a new release, and needs time to be validated. Because it is so new, there is a higher probability of undiscovered issues, and it is therefore not yet recommended for production usage. A Shoot does not update (neither auto-update nor force-update) to a preview version during the maintenance time. Also, preview versions are not considered for the defaulting to the highest available version when deliberately omitting the patch version during Shoot creation. Typically, after a fresh release of a new Kubernetes (e.g., v1.25.0) or Machine image version (e.g., suse-chost 15.4.20220818), the operator tags it as preview until they have gained sufficient experience and regard this version as reliable. After the operator has gained sufficient trust, the version can be manually promoted to supported.\n supported: A supported version is the recommended version for new and existing Shoot clusters. This is the version that new Shoot clusters should use and existing clusters should update to. 
Typically for Kubernetes versions, the latest Kubernetes patch versions of the current (if not still in preview) and the last 3 minor Kubernetes versions are maintained by the community. An operator could define these versions as being supported (e.g., v1.27.6, v1.26.10, and v1.25.12).\n deprecated: A deprecated version is a version that approaches the end of its lifecycle and can contain issues which are probably resolved in a supported version. New Shoots should not use this version anymore. Existing Shoots will be updated to a newer version if auto-update is enabled (.spec.maintenance.autoUpdate.kubernetesVersion for Kubernetes version auto-update, or .spec.maintenance.autoUpdate.machineImageVersion for machine image version auto-update). Using automatic upgrades, however, does not guarantee that a Shoot runs a non-deprecated version, as the latest version (overall or of the minor version) can be deprecated as well. Deprecated versions should have an expiration date set for eventual expiration.\n expired: An expired version has an expiration date (based on the Golang time package) in the past. 
New clusters with that version cannot be created and existing clusters are forcefully migrated to a higher version during the maintenance time.\n Below is an example of how the relevant section of the CloudProfile might look:\napiVersion: core.gardener.cloud/v1beta1 kind: CloudProfile metadata: name: alicloud spec: kubernetes: versions: - classification: preview version: 1.27.0 - classification: preview version: 1.26.3 - classification: supported version: 1.26.2 - classification: preview version: 1.25.5 - classification: supported version: 1.25.4 - classification: supported version: 1.24.6 - classification: deprecated expirationDate: \"2022-11-30T23:59:59Z\" version: 1.24.5 Automatic Version Upgrades There are two ways in which the Kubernetes version of the control plane as well as the Kubernetes and machine image version of a worker pool can be upgraded: auto update and forceful update. See Automatic Version Updates for how to enable auto updates for Kubernetes or machine image versions on the Shoot cluster.\nIf a Shoot is running a version after its expiration date has passed, it will be forcefully updated during its maintenance time. This happens even if the owner has opted out of automatic cluster updates!\nWhen is an auto update triggered?\n The Shoot has auto-update enabled and the version is not the latest eligible version for the auto-update. Please note that this latest version that qualifies for an auto-update is not necessarily the overall latest version in the CloudProfile: For Kubernetes version, the latest eligible version for auto-updates is the latest patch version of the current minor. For machine image version, the latest eligible version for auto-updates is controlled by the updateStrategy field of the machine image in the CloudProfile. The Shoot has auto-update disabled and the version is either expired or does not exist. The auto update can fail if the version is already on the latest eligible version for the auto-update. 
A failed auto update triggers a force update. The force and auto update path for Kubernetes and machine image versions differ slightly and are described in more detail below.\nUpdate rules for both Kubernetes and machine image versions\n Both auto and force update first try to update to the latest patch version of the same minor. An auto update prefers supported versions over deprecated versions. If there is a lower supported version and a higher deprecated version, auto update will pick the supported version. If all qualifying versions are deprecated, update to the latest deprecated version. An auto update never updates to an expired version. A force update prefers to update to not-expired versions. If all qualifying versions are expired, update to the latest expired version. Please note that therefore multiple consecutive version upgrades are possible. In this case, the version is again upgraded in the next maintenance time. Update path for machine image versions Administrators can define three different update strategies (field updateStrategy) for machine images in the CloudProfile: patch, minor, major (default). This is to accommodate the different version schemes of Operating Systems (e.g. Gardenlinux only updates major and minor versions with occasional patches).\n patch: update to the latest patch version of the current minor version. When using an expired version: force update to the latest patch of the current minor. If already on the latest patch version, then force update to the next higher (not necessarily +1) minor version. minor: update to the latest minor and patch version. When using an expired version: force update to the latest minor and patch of the current major. If already on the latest minor and patch of the current major, then update to the next higher (not necessarily +1) major version. major: always update to the overall latest version. This is the legacy behavior for automatic machine image version upgrades. 
Force updates are not possible and will fail if the latest version in the CloudProfile for that image is expired (EOL scenario). Example configuration in the CloudProfile:\nmachineImages: - name: gardenlinux updateStrategy: minor versions: - version: 1096.1.0 - version: 934.8.0 - version: 934.7.0 - name: suse-chost updateStrategy: patch versions: - version: 15.3.20220818 - version: 15.3.20221118 Please note that force updates for machine images can skip minor versions (strategy: patch) or major versions (strategy: minor) if the next minor/major version has no qualifying versions (only preview versions).\nUpdate path for Kubernetes versions For Kubernetes versions, the auto update picks the latest non-preview patch version of the current minor version.\nIf the cluster is already on the latest patch version and the latest patch version is also expired, it will continue with the latest patch version of the next consecutive minor (minor +1) Kubernetes version, so it will result in an update of a minor Kubernetes version!\nKubernetes “minor version jumps” are not allowed - meaning to skip the update to the consecutive minor version and directly update to any version after that. For instance, the version 1.24.x can only update to a version 1.25.x, not to 1.26.x or any other version. This is because Kubernetes does not guarantee upgradability in this case, leading to possibly broken Shoot clusters. The administrator has to set up the CloudProfile in such a way that consecutive Kubernetes minor versions are available. Otherwise, Shoot clusters will fail to upgrade during the maintenance time.\nConsider the CloudProfile below with a Shoot using the Kubernetes version 1.24.12. 
Even though the version is expired, due to missing 1.25.x versions, the Gardener Controller Manager cannot upgrade the Shoot’s Kubernetes version.\nspec: kubernetes: versions: - version: 1.26.10 - version: 1.26.9 - version: 1.24.12 expirationDate: \"\u003cexpiration date in the past\u003e\" The CloudProfile must specify versions 1.25.x of the consecutive minor version. Configuring the CloudProfile in such a way, the Shoot’s Kubernetes version will be upgraded to version 1.25.10 in the next maintenance time.\nspec: kubernetes: versions: - version: 1.26.9 - version: 1.25.10 - version: 1.25.9 - version: 1.24.12 expirationDate: \"\u003cexpiration date in the past\u003e\" Version Requirements (Kubernetes and Machine Image) The Gardener API server enforces the following requirements for versions:\n A version that is in use by a Shoot cannot be deleted from the CloudProfile. Creating a new version with expiration date in the past is not allowed. There can be only one supported version per minor version. The latest Kubernetes version cannot have an expiration date. NOTE: The latest version for a machine image can have an expiration date. [*] [*] Useful for cases in which support for a given machine image needs to be deprecated and removed (for example, the machine image reaches end of life).\nRelated Documentation You might want to read about the Shoot Updates and Upgrades procedures to get to know the effects of such operations.\n","categories":"","description":"","excerpt":"Shoot Kubernetes and Operating System Versioning in Gardener …","ref":"/docs/gardener/shoot_versions/","tags":"","title":"Shoot Kubernetes and Operating System Versioning in Gardener"},{"body":"Shoot Maintenance There is a general document about shoot maintenance that you might want to read. 
Here, we describe how you can influence certain operations that happen during a shoot maintenance.\nRestart Control Plane Controllers As outlined in the above linked document, Gardener offers to restart certain control plane controllers running in the seed during a shoot maintenance.\nExtension controllers can extend the amount of pods being affected by these restarts. If your Gardener extension manages pods of a shoot’s control plane (shoot namespace in seed) and it could potentially profit from a regular restart, please consider labeling it with maintenance.gardener.cloud/restart=true.\n","categories":"","description":"","excerpt":"Shoot Maintenance There is a general document about shoot maintenance …","ref":"/docs/gardener/extensions/shoot-maintenance/","tags":"","title":"Shoot Maintenance"},{"body":"Shoot Maintenance Shoots configure a maintenance time window in which Gardener performs certain operations that may restart the control plane, roll out the nodes, result in higher network traffic, etc. A summary of what was changed in the last maintenance time window in shoot specification is kept in the shoot status .status.lastMaintenance field.\nThis document outlines what happens during a shoot maintenance.\nTime Window Via the .spec.maintenance.timeWindow field in the shoot specification, end-users can configure the time window in which maintenance operations are executed. Gardener runs one maintenance operation per day in this time window:\nspec: maintenance: timeWindow: begin: 220000+0100 end: 230000+0100 The offset (+0100) is considered with respect to UTC time. The minimum time window is 30m and the maximum is 6h.\n⚠️ Please note that there is no guarantee that a maintenance operation that, e.g., starts a node roll-out will finish within the time window. 
Especially for large clusters, it may take several hours until a graceful rolling update of the worker nodes succeeds (also depending on the workload and the configured pod disruption budgets/termination grace periods).\nInternally, Gardener is subtracting 15m from the end of the time window to (best-effort) try to finish the maintenance until the end is reached, however, this might not work in all cases.\nIf you don’t specify a time window, then Gardener will randomly compute it. You can change it later, of course.\nAutomatic Version Updates The .spec.maintenance.autoUpdate field in the shoot specification allows you to control how/whether automatic updates of Kubernetes patch and machine image versions are performed. Machine image versions are updated per worker pool.\nspec: maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true During the daily maintenance, the Gardener Controller Manager updates the Shoot’s Kubernetes and machine image version if any of the following criteria applies:\n There is a higher version available and the Shoot opted-in for automatic version updates. The currently used version is expired. The target version for machine image upgrades is controlled by the updateStrategy field for the machine image in the CloudProfile. Allowed update strategies are patch, minor and major.\nGardener (gardener-controller-manager) populates the lastMaintenance field in the Shoot status with the maintenance results.\nLast Maintenance: Description: \"All maintenance operations successful. Control Plane: Updated Kubernetes version from 1.26.4 to 1.27.1. 
Reason: Kubernetes version expired - force update required\" State: Succeeded Triggered Time: 2023-07-28T09:07:27Z Additionally, Gardener creates events with the type MachineImageVersionMaintenance or KubernetesVersionMaintenance on the Shoot describing the action performed during maintenance, including the reason why an update has been triggered.\nLAST SEEN TYPE REASON OBJECT MESSAGE 30m Normal MachineImageVersionMaintenance shoot/local Worker pool \"local\": Updated image from 'gardenlinux' version 'xy' to version 'abc'. Reason: Automatic update of the machine image version is configured (image update strategy: major). 30m Normal KubernetesVersionMaintenance shoot/local Control Plane: Updated Kubernetes version from \"1.26.4\" to \"1.27.1\". Reason: Kubernetes version expired - force update required. 15m Normal KubernetesVersionMaintenance shoot/local Worker pool \"local\": Updated Kubernetes version '1.26.3' to version '1.27.1'. Reason: Kubernetes version expired - force update required. If at least one maintenance operation fails, the lastMaintenance field in the Shoot status is set to Failed:\nLast Maintenance: Description: \"(1/2) maintenance operations successful: Control Plane: Updated Kubernetes version from 1.26.4 to 1.27.1. Reason: Kubernetes version expired - force update required, Worker pool x: 'gardenlinux' machine image version maintenance failed. 
Reason for update: machine image version expired\" FailureReason: \"Worker pool x: either the machine image 'gardenlinux' is reaching end of life and migration to another machine image is required or there is a misconfiguration in the CloudProfile.\" State: Failed Triggered Time: 2023-07-28T09:07:27Z Please refer to the Shoot Kubernetes and Operating System Versioning in Gardener topic for more information about Kubernetes and machine image versions in Gardener.\nCluster Reconciliation Gardener administrators/operators can configure the gardenlet in a way that it only reconciles shoot clusters during their maintenance time windows. This behaviour is not controllable by end-users but might make sense for large Gardener installations. Concretely, your shoot will be reconciled regularly during its maintenance time window. Outside of the maintenance time window it will only reconcile if you change the specification or if you explicitly trigger it, see also Trigger Shoot Operations.\nConfine Specification Changes/Updates Roll Out Via the .spec.maintenance.confineSpecUpdateRollout field you can control whether you want to make Gardener roll out changes/updates to your shoot specification only during the maintenance time window. It is false by default, i.e., any change to your shoot specification triggers a reconciliation (even outside of the maintenance time window). This is helpful if you want to update your shoot but don’t want the changes to be applied immediately. One example use-case would be a Kubernetes version upgrade that you want to roll out during the maintenance time window. Any update to the specification will not increase the .metadata.generation of the Shoot, which is something you should be aware of. Also, even if Gardener administrators/operators have not enabled the “reconciliation in maintenance time window only” configuration (as mentioned above), then your shoot will only reconcile in the maintenance time window. 
The reason is that Gardener cannot differentiate between create/update/reconcile operations.\n⚠️ If confineSpecUpdateRollout=true, please note that if you change the maintenance time window itself, then it will only be effective after the upcoming maintenance.\n⚠️ As exceptions to the above rules, manually triggered reconciliations and changes to the .spec.hibernation.enabled field trigger immediate rollouts. I.e., if you hibernate or wake-up your shoot, or you explicitly tell Gardener to reconcile your shoot, then Gardener gets active right away.\nShoot Operations In case you would like to perform a shoot credential rotation or a reconcile operation during your maintenance time window, you can annotate the Shoot with\nmaintenance.gardener.cloud/operation=\u003coperation\u003e This will execute the specified \u003coperation\u003e during the next maintenance reconciliation. Note that Gardener will remove this annotation after it has been performed in the maintenance reconciliation.\n ⚠️ This is skipped when the Shoot’s .status.lastOperation.state=Failed. Make sure to retry your shoot reconciliation beforehand.\n Special Operations During Maintenance The shoot maintenance controller triggers special operations that are performed as part of the shoot reconciliation.\nInfrastructure and DNSRecord Reconciliation The reconciliation of the Infrastructure and DNSRecord extension resources is only demanded during the shoot’s maintenance time window. The rationale behind it is to prevent sending too many requests against the cloud provider APIs, especially on large landscapes or if a user has many shoot clusters in the same cloud provider account.\nRestart Control Plane Controllers Gardener operators can make Gardener restart/delete certain control plane pods during a shoot maintenance. 
This feature helps to automatically solve service denials of controllers due to stale caches, dead-locks or starving routines.\nPlease note that these are exceptional cases but they are observed from time to time. Gardener, for example, takes this precautionary measure for kube-controller-manager pods.\nSee Shoot Maintenance to see how extension developers can extend this behaviour.\nRestart Some Core Addons Gardener operators can make Gardener restart some core addons (at the moment only CoreDNS) during a shoot maintenance.\nCoreDNS benefits from this feature as it automatically solves problems with clients stuck to a single replica of the deployment and thus overloading it. Please note that these are exceptional cases but they are observed from time to time.\n","categories":"","description":"Defining the maintenance time window, configuring automatic version updates, confining reconciliations to only happen during maintenance, adding an additional maintenance operation, etc.","excerpt":"Defining the maintenance time window, configuring automatic version …","ref":"/docs/gardener/shoot_maintenance/","tags":"","title":"Shoot Maintenance"},{"body":"Shoot Networking Configurations This document contains network related information for Shoot clusters.\nPod Network A Pod network is imperative for any kind of cluster communication with Pods not started within the Node’s host network. More information about the Kubernetes network model can be found in the Cluster Networking topic.\nGardener allows users to configure the Pod network’s CIDR during Shoot creation:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: networking: type: \u003csome-network-extension-name\u003e # {calico,cilium} pods: 100.96.0.0/16 nodes: ... services: ... ⚠️ The networking.pods IP configuration is immutable and cannot be changed afterwards. 
Please consider the following paragraph to choose a configuration which will meet your demands.\n One of the network plugin’s (CNI) tasks is to assign IP addresses to Pods started in the Pod network. Different network plugins come with different IP address management (IPAM) features, so we can’t give any definite advice on how IP ranges should be configured. Nevertheless, we want to outline the standard configuration.\nInformation in .spec.networking.pods matches the --cluster-cidr flag of the Kube-Controller-Manager of your Shoot cluster. This IP range is divided into smaller subnets, also called podCIDRs (default mask /24), which are assigned to the .spec.podCIDR field of Node objects. Pods get their IP address from this smaller node subnet in a default IPAM setup. Thus, it must be guaranteed that enough of these subnets can be created for the maximum number of nodes you expect in the cluster.\nExample 1\nPod network: 100.96.0.0/16 nodeCIDRMaskSize: /24 ------------------------- Number of podCIDRs: 256 --\u003e max. Node count Number of IPs per podCIDRs: 256 With the configuration above a Shoot cluster can at most have 256 nodes which are ready to run workload in the Pod network.\nExample 2\nPod network: 100.96.0.0/20 nodeCIDRMaskSize: /24 ------------------------- Number of podCIDRs: 16 --\u003e max. Node count Number of IPs per podCIDRs: 256 With the configuration above a Shoot cluster can at most have 16 nodes which are ready to run workload in the Pod network.\nBesides the configuration in .spec.networking.pods, users can tune the nodeCIDRMaskSize used by Kube-Controller-Manager on shoot creation. 
A smaller IP range per node means more podCIDRs and thus the ability to provision more nodes in the cluster, but fewer available IPs for Pods running on each of the nodes.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubeControllerManager: nodeCIDRMaskSize: 24 (default) ⚠️ The nodeCIDRMaskSize configuration is immutable and cannot be changed afterwards.\n Example 3\nPod network: 100.96.0.0/20 nodeCIDRMaskSize: /25 ------------------------- Number of podCIDRs: 32 --\u003e max. Node count Number of IPs per podCIDRs: 128 With the configuration above, a Shoot cluster can at most have 32 nodes which are ready to run workload in the Pod network.\n","categories":"","description":"Configuring Pod network. Maximum number of Nodes and Pods per Node","excerpt":"Configuring Pod network. Maximum number of Nodes and Pods per Node","ref":"/docs/gardener/shoot_networking/","tags":"","title":"Shoot Networking Configurations"},{"body":"Register Shoot Networking Filter Extension in Shoot Clusters Introduction Within a shoot cluster, it is possible to enable the networking filter. It is necessary that the Gardener installation your shoot cluster runs in is equipped with a shoot-networking-filter extension. Please ask your Gardener operator if the extension is available in your environment.\nShoot Feature Gate In most of the Gardener setups the shoot-networking-filter extension is not enabled globally and thus must be configured per shoot cluster. Please adapt the shoot specification by the configuration shown below to activate the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-networking-filter ... Opt-out If the shoot networking filter is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-filter disabled: true ... 
Ingress Filtering By default, the networking filter only filters egress traffic. However, if you enable blackholing, incoming traffic will also be blocked. You can enable blackholing on a per-shoot basis.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-filter providerConfig: egressFilter: blackholingEnabled: true ... Ingress traffic can only be blocked by blackhole routing if the source IP address is preserved. On Azure, GCP and AliCloud this works by default. The default on AWS is a classic load balancer that replaces the source IP with its own IP address. Here, a network load balancer has to be configured by adding the annotation service.beta.kubernetes.io/aws-load-balancer-type: \"nlb\" to the service. On OpenStack, load balancers don’t preserve the source address.\nPlease note that if you disable blackholing in an existing shoot, the associated blackhole routes will not be removed automatically. To remove these routes, you can either replace the affected nodes or delete the routes manually.\nCustom IP It is possible to add custom IP addresses to the network filter. This can be useful for testing purposes.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-filter providerConfig: egressFilter: staticFilterList: - network: 1.2.3.4/31 policy: BLOCK_ACCESS - network: 5.6.7.8/32 policy: BLOCK_ACCESS - network: ::2/128 policy: BLOCK_ACCESS ... ","categories":"","description":"","excerpt":"Register Shoot Networking Filter Extension in Shoot Clusters …","ref":"/docs/extensions/others/gardener-extension-shoot-networking-filter/shoot-networking-filter/","tags":"","title":"Shoot Networking Filter"},{"body":"Register Shoot Networking Filter Extension in Shoot Clusters Introduction Within a shoot cluster, it is possible to enable the network problem detector. 
It is necessary that the Gardener installation your shoot cluster runs in is equipped with a shoot-networking-problemdetector extension. Please ask your Gardener operator if the extension is available in your environment.\nShoot Feature Gate In most of the Gardener setups the shoot-networking-problemdetector extension is not enabled globally and thus must be configured per shoot cluster. Please adapt the shoot specification by the configuration shown below to activate the extension individually.\nkind: Shoot ... spec: extensions: - type: shoot-networking-problemdetector ... Opt-out If the shoot network problem detector is globally enabled by default, it can be disabled per shoot. To disable the service for a shoot, the shoot manifest must explicitly state it.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot ... spec: extensions: - type: shoot-networking-problemdetector disabled: true ... ","categories":"","description":"","excerpt":"Register Shoot Networking Filter Extension in Shoot Clusters …","ref":"/docs/extensions/others/gardener-extension-shoot-networking-problemdetector/shoot-networking-problemdetector/","tags":"","title":"Shoot Networking Problemdetector"},{"body":"Enable / disable overlay network for shoots with Calico Gardener can be used with or without the overlay network.\nStarting versions:\n provider-gcp@v1.25.0 provider-alicloud@v1.43.0 provider-aws@v1.38.2 provider-openstack@v1.30.0 The default configuration of shoot clusters is without overlay network.\nUnderstanding overlay network Overlay networking permits the routing of packets between multiple pods located on multiple nodes, even if the pod and the node network are not the same.\nThis is done through the encapsulation of pod packets in the node network so that the routing can be done as usual. We use ipip encapsulation with calico in case the overlay network is enabled. 
This (simply put) sends an IP packet as payload in another IP packet.\nIn order to simplify the troubleshooting of problems and reduce the latency of packets traveling between nodes, the overlay network is disabled by default as stated above for all new clusters.\nThis means that the routing is done directly through the VPC routing table. Basically, when a new node is created, it is assigned a slice (usually a /24) of the pod network. All future pods in that node are going to be in this slice. Then, the cloud-controller-manager updates the cloud provider router to add the new route (all packets within the network slice as destination should go to that node).\nThis has the advantage of:\n Doing less work for the node as encapsulation takes some CPU cycles. The maximum transmission unit (MTU) is slightly bigger resulting in slightly better performance, i.e. potentially more payload bytes per packet. More direct and simpler setup, which makes the problems much easier to troubleshoot. In the case where multiple shoots are in the same VPC and the overlay network is disabled, if the pod’s network is not configured properly, there is a very strong chance that some pod IP addresses might overlap, which is going to cause all sorts of funny problems. To avoid that, make sure that the podCIDRs of the shoots do not overlap with each other.\nEnabling the overlay network In certain cases, the overlay network might be preferable if, for example, the customer wants to create multiple clusters in the same VPC without ensuring there’s no overlap between the pod networks.\nTo enable the overlay network, add the following to the shoot’s YAML:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: ... spec: ... networking: type: calico providerConfig: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig overlay: enabled: true ... 
Disabling the overlay network Inversely, here is how to disable the overlay network:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: ... spec: ... networking: type: calico providerConfig: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig overlay: enabled: false ... How to know if a cluster is using overlay or not? You can look at any of the old nodes. If there are tunl0 devices, the overlay network was used at least at some point in time. Another way is to look into the Network object in the shoot’s control plane namespace on the seed (see example above).\nDo we have some documentation somewhere on how to do the migration? No, not yet. The migration from no overlay to overlay is fairly simple: just set the configuration as specified above. The other way is more complicated as the Network configuration needs to be changed AND the local routes need to be cleaned. Unfortunately, the change will be rolled out slowly (one calico-node at a time). Hence, it implies some network outages during the migration.\nAWS implementation On AWS, it is not possible to use the cloud-controller-manager for managing the routes as it does not support multiple route tables, which Gardener creates. Therefore, a custom controller is created to manage the routes.\n","categories":"","description":"","excerpt":"Enable / disable overlay network for shoots with Calico Gardener can …","ref":"/docs/extensions/network-extensions/gardener-extension-networking-calico/shoot_overlay_network/","tags":"","title":"Shoot Overlay Network"},{"body":"Introduction There are two types of pod autoscaling in Kubernetes: Horizontal Pod Autoscaling (HPA) and Vertical Pod Autoscaling (VPA). HPA (implemented as part of the kube-controller-manager) scales the number of pod replicas, while VPA (implemented as an independent community project) adjusts the CPU and memory requests for the pods. 
Both types of autoscaling aim to optimize resource usage/costs and maintain the performance and (high) availability of applications running on Kubernetes.\nHorizontal Pod Autoscaling (HPA) Horizontal Pod Autoscaling involves increasing or decreasing the number of pod replicas in a deployment, replica set, stateful set, or anything really with a scale subresource that manages pods. HPA adjusts the number of replicas based on specified metrics, such as CPU or memory average utilization (usage divided by requests; most common) or average value (usage; less common). When the demand on your application increases, HPA automatically scales out the number of pods to meet the demand. Conversely, when the demand decreases, it scales in the number of pods to reduce resource usage.\nHPA targets (mostly stateless) applications where adding more instances of the application can linearly increase the ability to handle additional load. It is very useful for applications that experience variable traffic patterns, as it allows for real-time scaling without the need for manual intervention.\n [!NOTE] HPA continuously monitors the metrics of the targeted pods and adjusts the number of replicas based on the observed metrics. It operates solely on the current metrics when it calculates the averages across all pods, meaning it reacts to the immediate resource usage without considering past trends or patterns. Also, all pods are treated equally based on the average metrics. This could potentially lead to situations where some pods are under high load while others are underutilized. Therefore, particular care must be applied to (fair) load-balancing (connection vs. request vs. 
actual resource load balancing are crucial).\n A Few Words on the Cluster-Proportional (Horizontal) Autoscaler (CPA) and the Cluster-Proportional Vertical Autoscaler (CPVA) Besides HPA and VPA, CPA and CPVA are further options for scaling horizontally or vertically (neither is deployed by Gardener and must be deployed by the user). Unlike HPA and VPA, CPA and CPVA do not monitor the actual pod metrics, but scale solely on the number of nodes or CPU cores in the cluster. While this approach may be helpful and sufficient in a few rare cases, it is often a risky and crude scaling scheme that we do not recommend. More often than not, cluster-proportional scaling results in either under- or over-reserving your resources.\nVertical Pod Autoscaling (VPA) Vertical Pod Autoscaling, on the other hand, focuses on adjusting the CPU and memory resources allocated to the pods themselves. Instead of changing the number of replicas, VPA tweaks the resource requests (and limits, but only proportionally, if configured) for the pods in a deployment, replica set, stateful set, daemon set, or anything really with a scale subresource that manages pods. This means that each pod can be given more, or fewer resources as needed.\nVPA is very useful for optimizing the resource requests of pods that have dynamic resource needs over time. It does so by mutating pod requests (unfortunately, not in-place). Therefore, in order to apply new recommendations, pods that are “out of bounds” (i.e. below a configured/computed lower or above a configured/computed upper recommendation percentile) will be evicted proactively, but also pods that are “within bounds” may be evicted after a grace period. 
The corresponding higher-level replication controller will then recreate a new pod that VPA will then mutate to set the currently recommended requests (and proportional limits, if configured).\n [!NOTE] VPA continuously monitors all targeted pods and calculates recommendations based on their usage (one recommendation for the entire target). This calculation is influenced by configurable percentiles, with a greater emphasis on recent usage data and a gradual decrease (=decay) in the relevance of older data. However, this means, that VPA doesn’t take into account individual needs of single pods - eventually, all pods will receive the same recommendation, which may lead to considerable resource waste. Ideally, VPA would update pods in-place depending on their individual needs, but that’s (individual recommendations) not in its design, even if in-place updates get implemented, which may be years away for VPA based on current activity on the component.\n Selecting the Appropriate Autoscaler Before deciding on an autoscaling strategy, it’s important to understand the characteristics of your application:\n Interruptibility: Most importantly, if the clients of your workload are too sensitive to disruptions/cannot cope well with terminating pods, then maybe neither HPA nor VPA is an option (both, HPA and VPA cause pods and connections to be terminated, though VPA even more frequently). Clients must retry on disruptions, which is a reasonable ask in a highly dynamic (and self-healing) environment such as Kubernetes, but this is often not respected (or expected) by your clients (they may not know or care you run the workload in a Kubernetes cluster and have different expectations to the stability of the workload unless you communicated those through SLIs/SLOs/SLAs). Statelessness: Is your application stateless or stateful? 
Stateless applications are typically better candidates for HPA as they can be easily scaled out by adding more replicas without worrying about maintaining state. Traffic Patterns: Does your application experience variable traffic? If so, HPA can help manage these fluctuations by adjusting the number of replicas to handle the load. Resource Usage: Does your application’s resource usage change over time? VPA can adjust the CPU and memory reservations dynamically, which is beneficial for applications with non-uniform resource requirements. Scalability: Can your application handle increased load by scaling vertically (more resources per pod) or does it require horizontal scaling (more pod instances)? HPA is the right choice if:\n Your application is stateless and can handle increased load by adding more instances. You experience short-term fluctuations in traffic that require quick scaling responses. You want to maintain a specific performance metric, such as requests per second per pod. VPA is the right choice if:\n Your application’s resource requirements change over time, and you want to optimize resource usage without manual intervention. You want to avoid the complexity of managing resource requests for each pod, especially when they run code where it’s impossible for you to suggest static requests. In essence:\n For applications that can handle increased load by simply adding more replicas, HPA should be used to handle short-term fluctuations in load by scaling the number of replicas. For applications that require more resources per pod to handle additional work, VPA should be used to adjust the resource allocation for longer-term trends in resource usage. Consequently, if both cases apply (VPA often applies), HPA and VPA can also be combined. However, combining both, especially on the same metrics (CPU and memory), requires understanding and care to avoid conflicts and ensure that the autoscaling actions do not interfere with and rather complement each other. 
For more details, see Combining HPA and VPA.\nHorizontal Pod Autoscaler (HPA) HPA operates by monitoring resource metrics for all pods in a target. It computes the desired number of replicas from the current average metrics and the desired user-defined metrics as follows:\ndesiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]\nHPA checks the metrics at regular intervals, which can be configured by the user. Several types of metrics are supported (classical resource metrics like CPU and memory, but also custom and external metrics like requests per second or queue length can be configured, if available). If a scaling event is necessary, HPA adjusts the replica count for the targeted resource.\nDefining an HPA Resource To configure HPA, you need to create an HPA resource in your cluster. This resource specifies the target to scale, the metrics to be used for scaling decisions, and the desired thresholds. Here’s an example of an HPA configuration:\napiVersion: autoscaling/v2 kind: HorizontalPodAutoscaler metadata: name: foo-hpa spec: scaleTargetRef: apiVersion: apps/v1 kind: Deployment name: foo-deployment minReplicas: 1 maxReplicas: 10 metrics: - type: Resource resource: name: cpu target: type: AverageValue averageValue: 2 - type: Resource resource: name: memory target: type: AverageValue averageValue: 8G behavior: scaleUp: stabilizationWindowSeconds: 30 policies: - type: Percent value: 100 periodSeconds: 60 scaleDown: stabilizationWindowSeconds: 1800 policies: - type: Pods value: 1 periodSeconds: 300 In this example, HPA is configured to scale foo-deployment based on pod average CPU and memory usage. It will maintain an average CPU and memory usage (not utilization, which is usage divided by requests!) across all replicas of 2 CPUs and 8G or lower with as few replicas as possible. 
The number of replicas will be scaled between a minimum of 1 and a maximum of 10 based on this target.\nFor some time now, you can also configure the autoscaling based on the resource usage of individual containers, not only on the resource usage of the entire pod. All you need to do is to switch the type from Resource to ContainerResource and specify the container name.\nIn the official documentation ([1] and [2]) you will find examples with average utilization (averageUtilization), not average usage (averageValue), but this is not particularly helpful, especially if you plan to combine HPA together with VPA on the same metrics (generally discouraged in the documentation). If you want to safely combine both on the same metrics, you should scale on average usage (averageValue) as shown above. For more details, see Combining HPA and VPA.\nFinally, the behavior section influences how fast you scale up and down. Most of the time (depending on your workload), you will want to scale out faster than you scale in. In this example, the configuration will trigger a scale-out only after observing the need to scale out for 30s (stabilizationWindowSeconds) and will then only scale out by at most 100% (value + type) of the current number of replicas every 60s (periodSeconds). The configuration will trigger a scale-in only after observing the need to scale in for 1800s (stabilizationWindowSeconds) and will then only scale in by at most 1 pod (value + type) every 300s (periodSeconds). As you can see, scale-out happens quicker than scale-in in this example.\nHPA (actually KCM) Options HPA is a function of the kube-controller-manager (KCM).\nYou can read up on the full KCM options online and set most of them conveniently in your Gardener shoot cluster spec:\n downscaleStabilization (default 5m): HPA will scale out whenever the formula (in accordance with the behavior section, if present in the HPA resource) yields a higher replica count, but it won’t scale in just as eagerly. 
This option lets you define a trailing time window that HPA must check and only if the recommended replica count is consistently lower throughout the entire time window, HPA will scale in (in accordance with the behavior section, if present in the HPA resource). If at any point in time in that trailing time window the recommended replica count isn’t lower, scale-in won’t happen. This setting is just a default, if nothing is defined in the behavior section of an HPA resource. The default for the upscale stabilization is 0s and it cannot be set via a KCM option (downscale stabilization was historically more important than upscale stabilization and when later the behavior sections were added to the HPA resources, upscale stabilization remained missing from the KCM options). tolerance (default +/-10%): HPA will not scale out or in if the desired replica count is (mathematically as a float) near the actual replica count (see source code for details), which is a form of hysteresis to avoid replica flapping around a threshold. There are a few more configurable options of lesser interest:\n syncPeriod (default 15s): How often HPA retrieves the pods and metrics respectively how often it recomputes and sets the desired replica count.\n cpuInitializationPeriod (default 30s) and initialReadinessDelay (default 5m): Both settings only affect whether or not CPU metrics are considered for scaling decisions. They can be easily misinterpreted as the official docs are somewhat hard to read (see source code for details, which is more readable, if you ignore the comments). Normally, you have little reason to modify them, but here is what they do:\n cpuInitializationPeriod: Defines a grace period after a pod starts during which HPA won’t consider CPU metrics of the pod for scaling if the pod is either not ready or it is ready, but a given CPU metric is older than the last state transition (to ready). 
This is to ignore CPU metrics that predate the current readiness while still in initialization to not make scaling decisions based on potentially misleading data. If the pod is ready and a CPU metric was collected after it became ready, it is considered also within this grace period. initialReadinessDelay: Defines another grace period after a pod starts during which HPA won’t consider CPU metrics of the pod for scaling if the pod is not ready and it became not ready within this grace period (the docs/comments want to check whether the pod was ever ready, but the code only checks whether the pod condition last transition time to not ready happened within that grace period which it could have from being ready or simply unknown before). This is to ignore not (ever have been) ready pods while still in initialization to not make scaling decisions based on potentially misleading data. If the pod is ready, it is considered also within this grace period. So, regardless of the values of these settings, if a pod is reporting ready and it has a CPU metric from the time after it became ready, that pod and its metric will be considered. This holds true even if the pod becomes ready very early into its initialization. These settings cannot be used to “black-out” pods for a certain duration before being considered for scaling decisions. Instead, if it is your goal to ignore a potentially resource-intensive initialization phase that could wrongly lead to further scale-out, you would need to configure your pods to not report as ready until that resource-intensive initialization phase is over.\n Considerations When Using HPA Selection of metrics: Besides CPU and memory, HPA can also target custom or external metrics. Pick those (in addition or exclusively), if you guarantee certain SLOs in your SLAs. Targeting usage or utilization: HPA supports usage (absolute) and utilization (relative). Utilization is often preferred in simple examples, but usage is more precise and versatile. 
Compatibility with VPA: Care must be taken when using HPA in conjunction with VPA, as they can potentially interfere with each other’s scaling decisions. Vertical Pod Autoscaler (VPA) VPA operates by monitoring resource metrics for all pods in a target. It computes a resource requests recommendation from the historic and current resource metrics. VPA checks the metrics at regular intervals, which can be configured by the user. Only CPU and memory are supported. If VPA detects that a pod’s resource allocation is too high or too low, it may evict pods (if within the permitted disruption budget), which will trigger the creation of a new pod by the corresponding higher-level replication controller, which will then be mutated by VPA to match resource requests recommendation. This happens in three different components that work together:\n VPA Recommender: The Recommender observes the historic and current resource metrics of pods and generates recommendations based on this data. VPA Updater: The Updater component checks the recommendations from the Recommender and decides whether any pod’s resource requests need to be updated. If an update is needed, the Updater will evict the pod. VPA Admission Controller: When a pod is (re-)created, the Admission Controller modifies the pod’s resource requests based on the recommendations from the Recommender. This ensures that the pod starts with the optimal amount of resources. Since VPA doesn’t support in-place updates, pods will be evicted. You will want to control voluntary evictions by means of Pod Disruption Budgets (PDBs). Please make yourself familiar with those and use them.\n [!NOTE] PDBs will not always work as expected and can also get in your way, e.g. if the PDB is violated or would be violated, it may possibly block evictions that would actually help your workload, e.g. 
to get a pod out of an OOMKilled CrashLoopBackoff (if the PDB is or would be violated, not even unhealthy pods would be evicted as they could theoretically become healthy again, which VPA doesn’t know). In order to overcome this issue, it is now possible (alpha since Kubernetes v1.26 in combination with the feature gate PDBUnhealthyPodEvictionPolicy on the API server, beta and enabled by default since Kubernetes v1.27) to configure the so-called unhealthy pod eviction policy. The default is still IfHealthyBudget as a change in default would have changed the behavior (as described above), but you can now also set AlwaysAllow at the PDB (spec.unhealthyPodEvictionPolicy). For more information, please check out this discussion, the PR and this document and balance the pros and cons for yourself. In short, the new AlwaysAllow option is probably the better choice in most cases while IfHealthyBudget is useful only if you have frequent temporary transitions or for special cases where you have already implemented controllers that depend on the old behavior.\n Defining a VPA Resource To configure VPA, you need to create a VPA resource in your cluster. This resource specifies the target to scale, the metrics to be used for scaling decisions, and the policies for resource updates. Here’s an example of a VPA configuration:\napiVersion: autoscaling.k8s.io/v1 kind: VerticalPodAutoscaler metadata: name: foo-vpa spec: targetRef: apiVersion: \"apps/v1\" kind: Deployment name: foo-deployment updatePolicy: updateMode: \"Auto\" resourcePolicy: containerPolicies: - containerName: foo-container controlledValues: RequestsOnly minAllowed: cpu: 50m memory: 200M maxAllowed: cpu: 4 memory: 16G In this example, VPA is configured to scale foo-deployment requests (RequestsOnly) from 50m cores (minAllowed) up to 4 cores (maxAllowed) and 200M memory (minAllowed) up to 16G memory (maxAllowed) automatically (updateMode). 
VPA doesn’t support in-place updates, so in updateMode Auto it will evict pods under certain conditions and then mutate the requests (and possibly limits if you omit controlledValues or set it to RequestsAndLimits, which is the default) of upcoming new pods.\nMultiple update modes exist. They influence eviction and mutation. The most important ones are:\n Off: In this mode, recommendations are computed, but never applied. This mode is useful, if you want to learn more about your workload or if you have a custom controller that depends on VPA’s recommendations but shall act instead of VPA. Initial: In this mode, recommendations are computed and applied, but pods are never proactively evicted to enforce new recommendations over time. This mode is useful, if you want to control pod evictions yourself (similar to the StatefulSet updateStrategy OnDelete) or your workload is sensitive to evictions, e.g. some brownfield singleton application or a daemon set pod that is critical for the node. Auto (default): In this mode, recommendations are computed, applied, and pods are even proactively evicted to enforce new recommendations over time. This applies recommendations continuously without you having to worry too much. As mentioned, controlledValues influences whether only requests or requests and limits are scaled:\n RequestsOnly: Updates only requests and doesn’t change limits. Useful if you have defined absolute limits (unrelated to the requests). RequestsAndLimits (default): Updates requests and proportionally scales limits along with the requests. Useful if you have defined relative limits (related to the requests). In this case, the gap between requests and limits should be either zero for QoS Guaranteed or small for QoS Burstable to avoid useless (way beyond the threshold of unhealthy behavior) or absurd (larger than node capacity) values. 
VPA doesn’t offer many more settings that can be tuned per VPA resource than you see above (different than HPA’s behavior section). However, there is one more that isn’t shown above, which allows to scale only up or only down (evictionRequirements[].changeRequirement), in case you need that, e.g. to provide resources when needed, but avoid disruptions otherwise.\nVPA Options VPA is an independent community project that consists of a recommender (computing target recommendations and bounds), an updater (evicting pods that are out of recommendation bounds), and an admission controller (mutating webhook applying the target recommendation to newly created pods). As such, they have independent options.\nVPA Recommender Options You can read up the full VPA recommender options online and set some of them conveniently in your Gardener shoot cluster spec:\n recommendationMarginFraction (default 15%): Safety margin that will be added to the recommended requests. targetCPUPercentile (default 90%): CPU usage percentile that will be targeted with the CPU recommendation (i.e. recommendation will “fit” e.g. 90% of the observed CPU usages). This setting is relevant for balancing your requests reservations vs. your costs. If you want to reduce costs, you can reduce this value (higher risk because of potential under-reservation, but lower costs), because CPU is compressible, but then VPA may lack the necessary signals for scale-up as throttling on an otherwise fully utilized node will go unnoticed by VPA. If you want to err on the safe side, you can increase this value, but you will then target more and more a worst case scenario, quickly (maybe even exponentially) increasing the costs. targetMemoryPercentile (default 90%): Memory usage percentile that will be targeted with the memory recommendation (i.e. recommendation will “fit” e.g. 90% of the observed memory usages). This setting is relevant for balancing your requests reservations vs. your costs. 
If you want to reduce costs, you can reduce this value (higher risk because of potential under-reservation, but lower costs), because OOMs will trigger bump-ups, but those will disrupt the workload. If you want to err on the safe side, you can increase this value, but you will then target more and more a worst case scenario, quickly (maybe even exponentially) increasing the costs. There are a few more configurable options of lesser interest:\n recommenderInterval (default 1m): How often VPA retrieves the pods and metrics and how often it recomputes the recommendations and bounds. There are many more options that you can only configure if you deploy your own VPA and which we will not discuss here, but you can check them out here.\n [!NOTE] Due to an implementation detail (smallest bucket size), VPA cannot create recommendations below 10m cores and 10M memory even if minAllowed is lower.\n VPA Updater Options You can read up on the full VPA updater options online and set some of them conveniently in your Gardener shoot cluster spec:\n evictAfterOOMThreshold (default 10m): Pods where at least one container OOMs within this time period since its start will be actively evicted, which will implicitly apply the new target recommendation that will have been bumped up after OOMKill. Please note, the kubelet may evict pods even before an OOM, but only if kube-reserved is underrun, i.e. node-level resources are running low. In these cases, eviction will happen first by pod priority and second by how much the usage overruns the requests. evictionTolerance (default 50%): Defines a threshold below which no further eligible pod will be evicted anymore, i.e. limits how many eligible pods may be in eviction in parallel (but at least 1). The threshold is computed as follows: running - evicted \u003e replicas - tolerance. 
Example: 10 replicas, 9 running, 8 eligible for eviction, 20% tolerance with 10 replicas which amounts to 2 pods, and no pod evicted in this round yet, then 9 - 0 \u003e 10 - 2 is true and a pod would be evicted, but the next one would be in violation as 9 - 1 = 10 - 2 and no further pod would be evicted anymore in this round. evictionRateBurst (default 1): Defines how many eligible pods may be evicted in one go. evictionRateLimit (default disabled): Defines how many eligible pods may be evicted per second (a value of 0 or -1 disables the rate limiting). In general, avoid modifying these eviction settings unless you have good reasons and try to rely on Pod Disruption Budgets (PDBs) instead. However, PDBs are not available for daemon sets.\nThere are a few more configurable options of lesser interest:\n updaterInterval (default 1m): How often VPA evicts the pods. There are many more options that you can only configure if you deploy your own VPA and which we will not discuss here, but you can check them out here.\nConsiderations When Using VPA Initial Resource Estimates: VPA requires historical resource usage data to base its recommendations on. Until they kick in, your initial resource requests apply and should be sensible. Pod Disruption: When VPA adjusts the resources for a pod, it may need to “recreate” the pod, which can cause temporary disruptions. This should be taken into account. Compatibility with HPA: Care must be taken when using VPA in conjunction with HPA, as they can potentially interfere with each other’s scaling decisions. Combining HPA and VPA HPA and VPA serve different purposes and operate on different axes of scaling. HPA increases or decreases the number of pod replicas based on metrics like CPU or memory usage, effectively scaling the application out or in. 
VPA, on the other hand, adjusts the CPU and memory reservations of individual pods, scaling the application up or down.\nWhen used together, these autoscalers can provide both horizontal and vertical scaling. However, they can also conflict with each other if used on the same metrics (e.g. both on CPU or both on memory). In particular, if VPA adjusts the requests, the utilization, i.e. the ratio between usage and requests, will approach 100% (for various reasons not exactly right, but for this consideration, close enough), which may trigger HPA to scale out, if it’s configured to scale on utilization below 100% (often seen in simple examples), which will spread the load across more pods, which may trigger VPA again to adjust the requests to match the new pod usages.\nThis is a feedback loop and it stems from HPA’s method of calculating the desired number of replicas, which is:\ndesiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]\nIf desiredMetricValue is utilization and VPA adjusts the requests, which changes the utilization, this may inadvertently trigger HPA and create said feedback loop. On the other hand, if desiredMetricValue is usage and VPA adjusts the requests now, this will have no impact on HPA anymore (HPA will always influence VPA, but we can control whether VPA influences HPA).\nTherefore, to safely combine HPA and VPA, consider the following strategies:\n Configure HPA and VPA on different metrics: One way to avoid conflicts is to use HPA and VPA based on different metrics. For instance, you could configure HPA to scale based on requests per seconds (or another representative custom/external metric) and VPA to adjust CPU and memory requests. This way, each autoscaler operates independently based on its specific metric(s). 
Configure HPA to scale on usage, not utilization, when used with VPA: Another way to avoid conflicts is to use HPA not on average utilization (averageUtilization), but instead on average usage (averageValue) as replicas driver, which is an absolute metric (requests don’t affect usage). This way, you can combine both autoscalers even on the same metrics. Pod Autoscaling and Cluster Autoscaler Autoscaling within Kubernetes can be implemented at different levels: pod autoscaling (HPA and VPA) and cluster autoscaling (CA). While pod autoscaling adjusts the number of pod replicas or their resource reservations, cluster autoscaling focuses on the number of nodes in the cluster, so that your pods can be hosted. If your workload isn’t static and especially if you make use of pod autoscaling, it only works if you have sufficient node capacity available. The most effective way to do that, without running a worst-case number of nodes, is to configure burstable worker pools in your shoot spec, i.e. define a true minimum node count and a worst-case maximum node count and leave the node autoscaling to Gardener that internally uses the Cluster Autoscaler to provision and deprovision nodes as needed.\nCluster Autoscaler automatically adjusts the number of nodes by adding or removing nodes based on the demands of the workloads and the available resources. It interacts with the cloud provider’s APIs to provision or deprovision nodes as needed. Cluster Autoscaler monitors the utilization of nodes and the scheduling of pods. If it detects that pods cannot be scheduled due to a lack of resources, it will trigger the addition of new nodes to the cluster. Conversely, if nodes are underutilized for some time and their pods can be placed on other nodes, it will remove those nodes to reduce costs and improve resource efficiency.\nBest Practices:\n Resource Buffering: Maintain a buffer of resources to accommodate temporary spikes in demand without waiting for node provisioning. 
This can be done by deploying pods with low priority that can be preempted when real workloads require resources. This helps in faster pod scheduling and avoids delays in scaling out or up. Pod Disruption Budgets (PDBs): Use PDBs to ensure that during scale-down events, the availability of applications is maintained as the Cluster Autoscaler will not voluntarily evict a pod if a PDB would be violated. Interesting CA Options CA can be configured in your Gardener shoot cluster spec globally and also in parts per worker pool:\n Can only be configured globally: expander (default least-waste): Defines the “expander” algorithm to use during scale-up, see FAQ. scaleDownDelayAfterAdd (default 1h): Defines how long after scaling up a node, a node may be scaled down. scaleDownDelayAfterFailure (default 3m): Defines how long after scaling down a node failed, scaling down will be resumed. scaleDownDelayAfterDelete (default 0s): Defines how long after scaling down a node, another node may be scaled down. Can be configured globally and also overwritten individually per worker pool: scaleDownUtilizationThreshold (default 50%): Defines the threshold below which a node becomes eligible for scaling down. scaleDownUnneededTime (default 30m): Defines the trailing time window the node must be consistently below a certain utilization threshold before it can finally be scaled down. There are many more options that you can only configure if you deploy your own CA and which we will not discuss here, but you can check them out here.\nImportance of Monitoring Monitoring is a critical component of autoscaling for several reasons:\n Performance Insights: It provides insights into how well your autoscaling strategy is meeting the performance requirements of your applications. Resource Utilization: It helps you understand resource utilization patterns, enabling you to optimize resource allocation and reduce waste. 
Cost Management: It allows you to track the cost implications of scaling actions, helping you to maintain control over your cloud spending. Troubleshooting: It enables you to quickly identify and address issues with autoscaling, such as unexpected scaling behavior or resource bottlenecks. To effectively monitor autoscaling, you should leverage the following tools and metrics:\n Kubernetes Metrics Server: Collects resource metrics from kubelets and provides them to HPA and VPA for autoscaling decisions (automatically provided by Gardener). Prometheus: An open-source monitoring system that can collect and store custom metrics, providing a rich dataset for autoscaling decisions. Grafana/Plutono: A visualization tool that integrates with Prometheus to create dashboards for monitoring autoscaling metrics and events. Cloud Provider Tools: Most cloud providers offer native monitoring solutions that can be used to track the performance and costs associated with autoscaling. Key metrics to monitor include:\n CPU and Memory Utilization: Track the resource utilization of your pods and nodes to understand how they correlate with scaling events. Pod Count: Monitor the number of pod replicas over time to see how HPA is responding to changes in load. Scaling Events: Keep an eye on scaling events triggered by HPA and VPA to ensure they align with expected behavior. Application Performance Metrics: Track application-specific metrics such as response times, error rates, and throughput. Based on the insights gained from monitoring, you may need to adjust your autoscaling configurations:\n Refine Thresholds: If you notice frequent scaling actions or periods of underutilization or overutilization, adjust the thresholds used by HPA and VPA to better match the workload patterns. Update Policies: Modify VPA update policies if you observe that the current settings are causing too much or too little pod disruption. 
Custom Metrics: If using custom metrics, ensure they accurately reflect the load on your application and adjust them if they do not. Scaling Limits: Review and adjust the minimum and maximum scaling limits to prevent over-scaling or under-scaling based on the capacity of your cluster and the criticality of your applications. Quality of Service (QoS) A few words on the quality of service for pods. Basically, there are 3 classes of QoS and they influence the eviction of pods when kube-reserved is underrun, i.e. node-level resources are running low:\n BestEffort, i.e. pods where no container has CPU or memory requests or limits: Avoid them unless you have really good reasons. The kube-scheduler will place them just anywhere according to its policy, e.g. balanced or bin-packing, but whatever resources these pods consume, may bring other pods into trouble or even the kubelet and the container runtime itself, if it happens very suddenly. Burstable, i.e. pods where at least one container has CPU or memory requests and at least one has no limits or limits that don’t match the requests: Prefer them unless you have really good reasons for the other QoS classes. Always specify proper requests or use VPA to recommend those. This helps the kube-scheduler to make the right scheduling decisions. Not having limits will additionally provide upward resource flexibility, if the node is not under pressure. Guaranteed, i.e. pods where all containers have CPU and memory requests and equal limits: Avoid them unless you really know the limits or throttling/killing is intended. While “Guaranteed” sounds like something “positive” in the English language, this class comes with the downside, that pods will be actively CPU-throttled and will actively go OOM, even if the node is not under pressure and has excess capacity left. Worse, if containers in the pod are under VPA, their CPU requests/limits will often not be scaled up as CPU throttling will go unnoticed by VPA. 
Summary As a rule of thumb, always set CPU and memory requests (or let VPA do that) and always avoid CPU and memory limits. CPU limits aren’t helpful on an under-utilized node (=may result in needless outages) and even suppress the signals for VPA to act. On a nearly or fully utilized node, CPU limits are practically irrelevant as only the requests matter, which are translated into CPU shares that provide a fair use of the CPU anyway (see CFS). Therefore, if you do not know the healthy range, do not set CPU limits. If you as author of the source code know its healthy range, set them to the upper threshold of that healthy range (everything above, from your knowledge of that code, is definitely an unbound busy loop or similar, which is the main reason for CPU limits, besides batch jobs where throttling is acceptable or even desired). Memory limits may be more useful, but suffer a similar, though not as negative downside. As with CPU limits, memory limits aren’t helpful on an under-utilized node (=may result in needless outages), but different than CPU limits, they result in an OOM, which triggers VPA to provide more memory suddenly (modifies the currently computed recommendations by a configurable factor, defaulting to +20%, see docs). Therefore, if you do not know the healthy range, do not set memory limits. If you as author of the source code know its healthy range, set them to the upper threshold of that healthy range (everything above, from your knowledge of that code, is definitely an unbound memory leak or similar, which is the main reason for memory limits) Horizontal Pod Autoscaling (HPA): Use for pods that support horizontal scaling. Prefer scaling on usage, not utilization, as this is more predictable (not dependent on a second variable, namely the current requests) and conflict-free with vertical pod autoscaling (VPA). As a rule of thumb, set the initial replicas to the 5th percentile of the actually observed replica count in production. 
Since HPA reacts fast, this is not as critical, but may help reduce initial load on the control plane early after deployment. However, be cautious when you update the higher-level resource not to inadvertently reset the current HPA-controlled replica count (a very easy mistake to make that can lead to catastrophic loss of pods). HPA modifies the replica count directly in the spec and you do not want to overwrite that. Even if it reacts fast, it is not instant (not via a mutating webhook as VPA operates) and the damage may already be done. As for minimum and maximum, let your high availability requirements determine the minimum and your theoretical maximum load determine the maximum, flanked with alerts to detect erroneous run-away out-scaling or the actual nearing of your practical maximum load, so that you can intervene. Vertical Pod Autoscaling (VPA): Use for containers that have a significant usage (e.g. any container above 50m CPU or 100M memory) and a significant usage spread over time (by more than 2x), i.e. ignore small (e.g. side-cars) or static (e.g. Java statically allocated heap) containers, but otherwise use it to provide the resources needed on the one hand and keep the costs in check on the other hand. As a rule of thumb, set the initial requests to the 5th percentile of the actually observed CPU or memory usage in production. Since VPA may need some time at first to respond and evict pods, this is especially critical early after deployment. The lower bound, below which pods will be immediately evicted, converges much faster than the upper bound, above which pods will be immediately evicted, but it isn’t instant, e.g. after 5 minutes the lower bound is just at 60% of the computed lower bound; after 12 hours the upper bound is still at 300% of the computed upper bound (see code). Unlike with HPA, you don’t need to be as cautious when updating the higher-level resource in the case of VPA. 
As long as VPA’s mutating webhook (VPA Admission Controller) is operational (which the VPA Updater also checks before evicting pods), it’s generally safe to update the higher-level resource. However, if it’s not up and running, any new pods that are spawned (e.g. as a consequence of a rolling update of the higher-level resource or for any other reason) will not be mutated. Instead, they will receive whatever requests are currently configured at the higher-level resource, which can lead to catastrophic resource under-reservation. Gardener deploys the VPA Admission Controller in HA - if unhealthy, it is reported under the ControlPlaneHealthy shoot status condition. If you have defined absolute limits (unrelated to the requests), configure VPA to only scale the requests or else it will proportionally scale the limits as well, which can easily become useless (way beyond the threshold of unhealthy behavior) or absurd (larger than node capacity): spec: resourcePolicy: containerPolicies: - controlledValues: RequestsOnly ... If you have defined relative limits (related to the requests), the default policy to scale the limits proportionally with the requests is fine, but the gap between requests and limits must be zero for QoS Guaranteed and should best be small for QoS Burstable to avoid useless or absurd limits either, e.g. prefer limits being 5 to at most 20% larger than requests as opposed to being 100% larger or more. As a rule of thumb, set minAllowed to the highest observed VPA recommendation (usually during the initialization phase or during any periodical activity) for an otherwise practically idle container, so that you avoid needless thrashing (e.g. 
resource usage calms down over time and recommendations drop consecutively until eviction, which will then lead again to initialization or later periodical activity and higher recommendations and new evictions).⚠️ You may want to provide higher minAllowed values, if you observe that up-scaling takes too long for CPU or memory for a too large percentile of your workload. This will get you out of the danger zone of too few resources for too many pods at the expense of providing too many resources for a few pods. Memory may react faster than CPU, because CPU throttling is not visible and memory gets aided by OOM bump-up incidents, but still, if you observe that up-scaling takes too long, you may want to increase minAllowed accordingly. As a rule of thumb, set maxAllowed to your theoretical maximum load, flanked with alerts to detect erroneous run-away usage or the actual nearing of your practical maximum load, so that you can intervene. However, VPA can easily recommend requests larger than what is allocatable on a node, so you must either ensure large enough nodes (Gardener can scale up from zero, in case you like to define a low-priority worker pool with more resources for very large pods) and/or cap VPA’s target recommendations using maxAllowed at the node allocatable remainder (after daemon set pods) of the largest eligible machine type (may result in under-provisioning resources for a pod). Use your monitoring and check maximum pod usage to decide about the maximum machine type. Recommendations in a Box Container When to use Value Requests - Set them (recommended) unless:- Do not set requests for QoS BestEffort; useful only if pod can be evicted as often as needed and pod can pick up where it left off without any penalty Set requests to 95th percentile (w/o VPA) of the actually observed CPU resp. memory usage in production resp. 
5th percentile (w/ VPA) (see below) Limits - Avoid them (recommended) unless:- Set limits for QoS Guaranteed; useful only if pod has strictly static resource requirements- Set CPU limits if you want to throttle CPU usage for containers that can be throttled w/o any other disadvantage than processing time (never do that when time-critical operations like leases are involved)- Set limits if you know the healthy range and want to shield against unbound busy loops, unbound memory leaks, or similar If you really can (otherwise not), set limits to healthy theoretical max load Scaler When to use Initial Minimum Maximum HPA Use for pods that support horizontal scaling Set initial replicas to 5th percentile of the actually observed replica count in production (prefer scaling on usage, not utilization) and make sure to never overwrite it later when controlled by HPA Set minReplicas to 0 (requires feature gate and custom/external metrics), to 1 (regular HPA minimum), or whatever the high availability requirements of the workload demand Set maxReplicas to healthy theoretical max load VPA Use for containers that have a significant usage (\u003e50m/100M) and a significant usage spread over time (\u003e2x) Set initial requests to 5th percentile of the actually observed CPU resp. 
memory usage in production Set minAllowed to highest observed VPA recommendation (includes start-up phase) for an otherwise practically idle container (avoids pod thrashing when pod gets evicted after idling) Set maxAllowed to fresh node allocatable remainder after daemonset pods (avoids pending pods when requests exceed fresh node allocatable remainder) or, if you really can (otherwise not), to healthy theoretical max load (less disruptive than limits as no throttling or OOM happens on under-utilized nodes) CA Use for dynamic workloads, definitely if you use HPA and/or VPA N/A Set minimum to 0 or number of nodes required right after cluster creation or wake-up Set maximum to healthy theoretical max load [!NOTE] Theoretical max load may be very difficult to ascertain, especially with modern software that consists of building blocks you do not own or know in detail. If you have comprehensive monitoring in place, you may be tempted to pick the observed maximum and add a safety margin or even factor on top (2x, 4x, or any other number), but this is not to be confused with “theoretical max load” (solely depending on the code, not observations from the outside). At any point in time, your numbers may change, e.g. because you updated a software component or your usage increased. If you decide to use numbers that are set based only on observations, make sure to flank those numbers with monitoring alerts, so that you have sufficient time to investigate, revise, and readjust if necessary.\n Conclusion Pod autoscaling is a dynamic and complex aspect of Kubernetes, but it is also one of the most powerful tools at your disposal for maintaining efficient, reliable, and cost-effective applications. 
By carefully selecting the appropriate autoscaler, setting well-considered thresholds, and continuously monitoring and adjusting your strategies, you can ensure that your Kubernetes deployments are well-equipped to handle your resource demands while not over-paying for the provided resources at the same time.\nAs Kubernetes continues to evolve (e.g. in-place updates) and as new patterns and practices emerge, the approaches to autoscaling may also change. However, the principles discussed above will remain foundational to creating scalable and resilient Kubernetes workloads. Whether you’re a developer or operations engineer, a solid understanding of pod autoscaling will be instrumental in the successful deployment and management of containerized applications.\n","categories":"","description":"","excerpt":"Introduction There are two types of pod autoscaling in Kubernetes: …","ref":"/docs/guides/applications/shoot-pod-autoscaling-best-practices/","tags":"","title":"Shoot Pod Autoscaling Best Practices"},{"body":"Developer Docs for Gardener Shoot Rsyslog Relp Extension This document outlines how Shoot reconciliation and deletion works for a Shoot with the shoot-rsyslog-relp extension enabled.\nShoot Reconciliation This section outlines how the reconciliation works for a Shoot with the shoot-rsyslog-relp extension enabled.\nExtension Enablement / Reconciliation This section outlines how the extension enablement/reconciliation works, e.g., the extension has been added to the Shoot spec.\n As part of the Shoot reconciliation flow, the gardenlet deploys the Extension resource. The shoot-rsyslog-relp extension reconciles the Extension resource. pkg/controller/lifecycle/actuator.go contains the implementation of the extension.Actuator interface. 
The reconciliation of an Extension of type shoot-rsyslog-relp only deploys the necessary monitoring configuration - the shoot-rsyslog-relp-dashboards ConfigMap which contains the definitions for: Plutono dashboard for the Rsyslog component, and the shoot-shoot-rsyslog-relp ServiceMonitor and PrometheusRule resources which contain the definitions for: scraping metrics by Prometheus, alerting rules. As part of the Shoot reconciliation flow, the gardenlet deploys the OperatingSystemConfig resource. The shoot-rsyslog-relp extension serves a webhook that mutates the OperatingSystemConfig resource for Shoots having the shoot-rsyslog-relp extension enabled (the corresponding namespace gets labeled by the gardenlet with extensions.gardener.cloud/shoot-rsyslog-relp=true). pkg/webhook/operatingsystemconfig/ensurer.go contains the implementation of the genericmutator.Ensurer interface. The webhook renders the 60-audit.conf.tpl template script and appends it to the OperatingSystemConfig files. When rendering the template, the configuration of the shoot-rsyslog-relp extension is used to fill in the required template values. The file is installed as /var/lib/rsyslog-relp-configurator/rsyslog.d/60-audit.conf on the host OS. The webhook appends the audit rules to the OperatingSystemConfig. The files are installed under /var/lib/rsyslog-relp-configurator/rules.d on the host OS. If the user has specified alternative audit rules in a config map reference, the webhook fetches the referenced ConfigMap from the Shoot’s control plane namespace and decodes the value of its auditd data key into an object of type Auditd. It then takes the auditRules defined in the object and places those under the /var/lib/rsyslog-relp-configurator/rules.d directory in a single file. The webhook renders the configure-rsyslog.tpl.sh script and appends it to the OperatingSystemConfig files. This script is installed as /var/lib/rsyslog-relp-configurator/configure-rsyslog.sh on the host OS. 
It keeps the configuration of the rsyslog systemd service up-to-date by copying /var/lib/rsyslog-relp-configurator/rsyslog.d/60-audit.conf to /etc/rsyslog.d/60-audit.conf, if /etc/rsyslog.d/60-audit.conf does not exist or the files differ. The script also takes care of syncing the audit rules in /etc/audit/rules.d with the ones installed in /var/lib/rsyslog-relp-configurator/rules.d and restarts the auditd systemd service if necessary. The webhook renders the process-rsyslog-pstats.tpl.sh and appends it to the OperatingSystemConfig files. This script receives metrics from the rsyslog process, transforms them, and writes them to /var/lib/node-exporter/textfile-collector/rsyslog_pstats.prom so that they can be collected by the node-exporter. As part of the Shoot reconciliation, before the shoot-rsyslog-relp extension is deployed, the gardenlet copies all Secret and ConfigMap resources referenced in .spec.resources[] to the Shoot’s control plane namespace on the Seed. When the .tls.enabled field is true in the shoot-rsyslog-relp extension configuration, a value for .tls.secretReferenceName must also be specified so that it references a named resource reference in the Shoot’s .spec.resources[] array. The webhook appends the data of the referenced Secret in the Shoot’s control plane namespace to the OperatingSystemConfig files. The webhook appends the rsyslog-configurator.service unit to the OperatingSystemConfig units. The unit invokes the configure-rsyslog.sh script every 15 seconds. Extension Disablement This section outlines how the extension disablement works, i.e., the extension has to be removed from the Shoot spec.\n As part of the Shoot reconciliation flow, the gardenlet destroys the Extension resource because it is no longer needed. As part of the deletion flow, the shoot-rsyslog-relp extension deploys the rsyslog-relp-configuration-cleaner DaemonSet to the Shoot cluster to clean up the existing rsyslog configuration and revert the audit rules. 
Shoot Deletion This section outlines how the deletion works for a Shoot with the shoot-rsyslog-relp extension enabled.\n As part of the Shoot deletion flow, the gardenlet destroys the Extension resource. In the Shoot deletion flow, the Extension resource is deleted after the Worker resource. Hence, there is no need to deploy the rsyslog-relp-configuration-cleaner DaemonSet to the Shoot cluster to clean up the existing rsyslog configuration and revert the audit rules. ","categories":"","description":"","excerpt":"Developer Docs for Gardener Shoot Rsyslog Relp Extension This document …","ref":"/docs/extensions/others/gardener-extension-shoot-rsyslog-relp/shoot-rsyslog-relp/","tags":"","title":"Shoot Rsyslog Relp"},{"body":"Shoot Scheduling Profiles This guide describes the available scheduling profiles and how they can be configured in the Shoot cluster. It also clarifies how a custom scheduling profile can be configured.\nScheduling Profiles The scheduling process in the kube-scheduler happens in a series of stages. A scheduling profile allows configuring the different stages of the scheduling.\nAs of today, Gardener supports two predefined scheduling profiles:\n balanced (default)\nOverview\nThe balanced profile attempts to spread Pods evenly across Nodes to obtain a more balanced resource usage. This profile provides the default kube-scheduler behavior.\nHow it works?\nThe kube-scheduler is started without any profiles. In such case, by default, one profile with the scheduler name default-scheduler is created. This profile includes the default plugins. If a Pod doesn’t specify the .spec.schedulerName field, kube-apiserver sets it to default-scheduler. Then, the Pod gets scheduled by the default-scheduler accordingly.\n bin-packing\nOverview\nThe bin-packing profile scores Nodes based on the allocation of resources. It prioritizes Nodes with the most allocated resources. 
By favoring the Nodes with the most allocation, some of the other Nodes become under-utilized over time (because new Pods keep being scheduled to the most allocated Nodes). Then, the cluster-autoscaler identifies such under-utilized Nodes and removes them from the cluster. In this way, this profile provides a greater overall resource utilization (compared to the balanced profile).\n Note: The decision of when to remove a Node is a trade-off between optimizing for utilization or the availability of resources. Removing under-utilized Nodes improves cluster utilization, but new workloads might have to wait for resources to be provisioned again before they can run.\n How it works?\nThe kube-scheduler is configured with the following bin packing profile:\napiVersion: kubescheduler.config.k8s.io/v1beta3 kind: KubeSchedulerConfiguration profiles: - schedulerName: bin-packing-scheduler pluginConfig: - name: NodeResourcesFit args: scoringStrategy: type: MostAllocated plugins: score: disabled: - name: NodeResourcesBalancedAllocation To impose the new profile, a MutatingWebhookConfiguration is deployed in the Shoot cluster. The MutatingWebhookConfiguration intercepts CREATE operations for Pods and sets the .spec.schedulerName field to bin-packing-scheduler. Then, the Pod gets scheduled by the bin-packing-scheduler accordingly. Pods that specify a custom scheduler (i.e., having .spec.schedulerName different from default-scheduler and bin-packing-scheduler) are not affected.\n Configuring the Scheduling Profile The scheduling profile can be configured via the .spec.kubernetes.kubeScheduler.profile field in the Shoot:\nspec: # ... kubernetes: kubeScheduler: profile: \"balanced\" # or \"bin-packing\" Custom Scheduling Profiles The kube-scheduler’s component config allows configuring custom scheduling profiles to match the cluster needs. As of today, Gardener supports only two predefined scheduling profiles. 
The profile configuration in the component config is quite expressive and it is not possible to easily define profiles that would match the needs of every cluster. For these reasons, there are no plans to add support for new predefined scheduling profiles. If a cluster owner wants to use a custom scheduling profile, then they have to deploy (and maintain) a dedicated kube-scheduler deployment in the cluster itself.\n","categories":"","description":"Introducing `balanced` and `bin-packing` scheduling profiles","excerpt":"Introducing `balanced` and `bin-packing` scheduling profiles","ref":"/docs/gardener/shoot_scheduling_profiles/","tags":"","title":"Shoot Scheduling Profiles"},{"body":"ServiceAccount Configurations for Shoot Clusters The Shoot specification allows configuring some of the settings for the handling of ServiceAccounts:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot spec: kubernetes: kubeAPIServer: serviceAccountConfig: issuer: foo acceptedIssuers: - foo1 - foo2 extendTokenExpiration: true maxTokenExpiration: 45d ... Issuer and Accepted Issuers The .spec.kubernetes.kubeAPIServer.serviceAccountConfig.{issuer,acceptedIssuers} fields are translated to the --service-account-issuer flag for the kube-apiserver. The issuer will assert its identifier in the iss claim of the issued tokens. According to the upstream specification, values need to meet the following requirements:\n This value is a string or URI. If this option is not a valid URI per the OpenID Discovery 1.0 spec, the ServiceAccountIssuerDiscovery feature will remain disabled, even if the feature gate is set to true. It is highly recommended that this value comply with the OpenID spec: https://openid.net/specs/openid-connect-discovery-1_0.html. In practice, this means that service-account-issuer must be an https URL. 
It is also highly recommended that this URL be capable of serving OpenID discovery documents at {service-account-issuer}/.well-known/openid-configuration.\n By default, Gardener uses the internal cluster domain as issuer (e.g., https://api.foo.bar.example.com). If you specify the issuer, then this default issuer will always be part of the list of accepted issuers (you don’t need to specify it yourself).\n [!CAUTION] If you change from the default issuer to a custom issuer, all previously issued tokens will still be valid/accepted. However, if you change from a custom issuer A to another issuer B (custom or default), then you have to add A to the acceptedIssuers so that previously issued tokens are not invalidated. Otherwise, the control plane components as well as system components and your workload pods might fail. You can remove A from the acceptedIssuers when all currently active tokens have been issued solely by B. This can be ensured by using projected token volumes with a short validity, or by rolling out all pods. Additionally, all ServiceAccount token secrets should be recreated. Apart from this, you should wait for at least 12h to make sure the control plane and system components have received a new token from Gardener.\n Token Expirations The .spec.kubernetes.kubeAPIServer.serviceAccountConfig.extendTokenExpiration configures the --service-account-extend-token-expiration flag of the kube-apiserver. It is enabled by default and has the following specification:\n Turns on projected service account expiration extension during token generation, which helps safe transition from legacy token to bound service account token feature. 
If this flag is enabled, admission injected tokens would be extended up to 1 year to prevent unexpected failure during transition, ignoring value of service-account-max-token-expiration.\n The .spec.kubernetes.kubeAPIServer.serviceAccountConfig.maxTokenExpiration configures the --service-account-max-token-expiration flag of the kube-apiserver. It has the following specification:\n The maximum validity duration of a token created by the service account token issuer. If an otherwise valid TokenRequest with a validity duration larger than this value is requested, a token will be issued with a validity duration of this value.\n [!NOTE] The value for this field must be in the [30d,90d] range. The background for this limitation is that all Gardener components rely on the TokenRequest API and the Kubernetes service account token projection feature with short-lived, auto-rotating tokens. Any values lower than 30d risk impacting the SLO for shoot clusters, and any values above 90d violate security best practices with respect to maximum validity of credentials before they must be rotated. Given that the field just specifies the upper bound, end-users can still use lower values for their individual workload by specifying the .spec.volumes[].projected.sources[].serviceAccountToken.expirationSeconds in the PodSpecs.\n Managed Service Account Issuer Gardener also provides a way to manage the service account issuer of a shoot cluster as well as serving its OIDC discovery documents from a centrally managed server called Gardener Discovery Server. This ability removes the need for changing the .spec.kubernetes.kubeAPIServer.serviceAccountConfig.issuer and exposing it separately.\nPrerequisites [!NOTE] The following prerequisites are the responsibility of the Gardener Administrators and are not something that end users can configure by themselves. 
If uncertain that these requirements are met, please contact your Gardener Administrator.\n Prerequisites:\n The Garden Cluster should have the Gardener Discovery Server deployed and configured. The easiest way to handle this is by using the gardener-operator. The ShootManagedIssuer feature gate should be enabled. Enablement If the prerequisites are met then the feature can be enabled for a shoot cluster by annotating it with authentication.gardener.cloud/issuer=managed. Mind that once enabled, this feature cannot be disabled. After the shoot is reconciled, you can retrieve the new shoot service account issuer value from the shoot’s status. A sample query that will retrieve the managed issuer looks like this:\nkubectl -n my-project get shoot my-shoot -o jsonpath='{.status.advertisedAddresses[?(@.name==\"service-account-issuer\")].url}' Once retrieved, the shoot’s OIDC discovery documents can be explored by querying the /.well-known/openid-configuration endpoint of the issuer.\nMind that this annotation is incompatible with the .spec.kubernetes.kubeAPIServer.serviceAccountConfig.issuer field, so if you want to enable it then the issuer field should not be set in the shoot specification.\n [!CAUTION] If you change from the default issuer to a managed issuer, all previously issued tokens will still be valid/accepted. However, if you change from a custom issuer A to a managed issuer, then you have to add A to the .spec.kubernetes.kubeAPIServer.serviceAccountConfig.acceptedIssuers so that previously issued tokens are not invalidated. Otherwise, the control plane components as well as system components and your workload pods might fail. You can remove A from the acceptedIssuers when all currently active tokens have been issued solely by the managed issuer. This can be ensured by using projected token volumes with a short validity, or by rolling out all pods. Additionally, all ServiceAccount token secrets should be recreated. 
Apart from this, you should wait for at least 12h to make sure the control plane and system components have received a new token from Gardener.\n ","categories":"","description":"","excerpt":"ServiceAccount Configurations for Shoot Clusters The Shoot …","ref":"/docs/gardener/shoot_serviceaccounts/","tags":"","title":"Shoot Serviceaccounts"},{"body":"Shoot Status This document provides an overview of the ShootStatus.\nConditions The Shoot status consists of a set of conditions. A Condition has the following fields:\n Field name Description type Name of the condition. status Indicates whether the condition is applicable, with possible values True, False, Unknown or Progressing. lastTransitionTime Timestamp for when the condition last transitioned from one status to another. lastUpdateTime Timestamp for when the condition was updated. Usually changes when reason or message in condition is updated. reason Machine-readable, UpperCamelCase text indicating the reason for the condition’s last transition. message Human-readable message indicating details about the last status transition. codes Well-defined error codes in case the condition reports a problem. Currently, the available Shoot condition types are:\n APIServerAvailable ControlPlaneHealthy EveryNodeReady ObservabilityComponentsHealthy SystemComponentsHealthy The Shoot conditions are maintained by the shoot care reconciler of the gardenlet. Find more information in the gardenlet documentation.\nSync Period The condition checks are executed periodically at an interval which is configurable in the GardenletConfiguration (.controllers.shootCare.syncPeriod, defaults to 1m).\nCondition Thresholds The GardenletConfiguration also allows configuring condition thresholds (controllers.shootCare.conditionThresholds). A condition threshold is the amount of time to consider a condition as Progressing on condition status changes.\nLet’s check the following example to get a better understanding. 
Let’s say that the APIServerAvailable condition of our Shoot is with status True. If the next condition check fails (for example kube-apiserver becomes unreachable), then the condition first goes to Progressing state. Only if this state persists for the duration of the condition threshold is the condition finally updated to False.\nConstraints Constraints represent conditions of a Shoot’s current state that constrain some operations on it. The current constraints are:\nHibernationPossible:\nThis constraint indicates whether a Shoot is allowed to be hibernated. The rationale behind this constraint is that a Shoot can have ValidatingWebhookConfigurations or MutatingWebhookConfigurations acting on resources that are critical for waking up a cluster. For example, if a webhook has rules for CREATE/UPDATE Pods or Nodes and failurePolicy=Fail, the webhook will block joining Nodes and creating critical system component Pods and thus block the entire wakeup operation, because the server backing the webhook is not running.\nEven if the failurePolicy is set to Ignore, high timeouts (\u003e15s) can lead to blocking requests of control plane components. That’s because most control-plane API calls are made with a client-side timeout of 30s, so if a webhook has timeoutSeconds=30 the overall request might still fail as there is overhead in communication with the API server and potential other webhooks.\nGenerally, it’s best practice to specify low timeouts in WebhookConfigs.\nAs an effort to correct this common problem, the webhook remediator has been created. This is enabled by setting .controllers.shootCare.webhookRemediatorEnabled=true in the gardenlet’s configuration. This feature simply checks whether webhook configurations in shoot clusters match a set of rules described here. If at least one of the rules matches, it will set status=False for the .status.constraints of type HibernationPossible and MaintenancePreconditionsSatisfied in the Shoot resource. 
In addition, the failurePolicy in the affected webhook configurations will be set from Fail to Ignore. Gardenlet will also add an annotation to make it visible to end-users that their webhook configurations were mutated and should be fixed/adapted according to the rules and best practices.\nIn most cases, you can avoid this by simply excluding the kube-system namespace from your webhook via the namespaceSelector:\napiVersion: admissionregistration.k8s.io/v1 kind: MutatingWebhookConfiguration webhooks: - name: my-webhook.example.com namespaceSelector: matchExpressions: - key: gardener.cloud/purpose operator: NotIn values: - kube-system rules: - operations: [\"*\"] apiGroups: [\"\"] apiVersions: [\"v1\"] resources: [\"pods\"] scope: \"Namespaced\" However, some other resources (some of them cluster-scoped) might still trigger the remediator, namely:\n endpoints nodes clusterroles clusterrolebindings customresourcedefinitions apiservices certificatesigningrequests priorityclasses If one of the above resources triggers the remediator, the preferred solution is to remove that particular resource from your webhook’s rules. You can also use the objectSelector to reduce the scope of webhook’s rules. However, in special cases where a webhook is absolutely needed for the workload, it is possible to add the remediation.webhook.shoot.gardener.cloud/exclude=true label to your webhook so that the remediator ignores it. This label should not be used to silence an alert, but rather to confirm that a webhook won’t cause problems. Note that none of this is a perfect solution; it is done on a best-effort basis, and only the owner of the webhook can know whether it indeed is problematic and configured correctly.\nIn a special case, if a webhook has a rule for CREATE/UPDATE lease resources in kube-system namespace, its timeoutSeconds is updated to 3 seconds. 
This is required to ensure the proper functioning of the leader election of essential control plane controllers.\nYou can also find more help in the Kubernetes documentation\nMaintenancePreconditionsSatisfied:\nThis constraint indicates whether all preconditions for a safe maintenance operation are satisfied (see Shoot Maintenance for more information about what happens during a shoot maintenance). As of today, the same checks as in the HibernationPossible constraint are being performed (user-deployed webhooks that might interfere with potential rolling updates of shoot worker nodes). There is no further action being performed on this constraint’s status (maintenance is still being performed). It is meant to make users aware of potential problems that might occur due to their configuration.\nCACertificateValiditiesAcceptable:\nThis constraint indicates that there is at least one CA certificate which expires in less than 1y. It will not be added to the .status.constraints if there is no such CA certificate. However, if it’s visible, then a credentials rotation operation should be considered.\nCRDsWithProblematicConversionWebhooks:\nThis constraint indicates that there is at least one CustomResourceDefinition in the cluster which has multiple stored versions and a conversion webhook configured. This could break the reconciliation flow of a Shoot cluster in some cases. See https://github.com/gardener/gardener/issues/7471 for more details. It will not be added to the .status.constraints if there is no such CRD. However, if it’s visible, then you should consider upgrading the existing objects to the current stored version. See Upgrade existing objects to a new stored version for detailed steps.\nLast Operation The Shoot status holds information about the last operation that is performed on the Shoot. The last operation field reflects overall progress and the tasks that are currently being executed. 
Allowed operation types are Create, Reconcile, Delete, Migrate, and Restore. Allowed operation states are Processing, Succeeded, Error, Failed, Pending, and Aborted. An operation in Error state is an operation that will be retried for a configurable amount of time (controllers.shoot.retryDuration field in GardenletConfiguration, defaults to 12h). If the operation cannot complete successfully for the configured retry duration, it will be marked as Failed. An operation in Failed state is an operation that won’t be retried automatically (to retry such an operation, see Retry failed operation).\nLast Errors The Shoot status also contains information about the last occurred error(s) (if any) during an operation. A LastError consists of the identifier of the task that returned the error, a human-readable message of the error, and the error codes (if any) associated with the error.\nError Codes Known error codes and their classification are:\n Error code User error Description ERR_INFRA_UNAUTHENTICATED true Indicates that the last error occurred due to the client request not being completed because it lacks valid authentication credentials for the requested resource. It is classified as a non-retryable error code. ERR_INFRA_UNAUTHORIZED true Indicates that the last error occurred due to the server understanding the request but refusing to authorize it. It is classified as a non-retryable error code. ERR_INFRA_QUOTA_EXCEEDED true Indicates that the last error occurred due to infrastructure quota limits. It is classified as a non-retryable error code. ERR_INFRA_RATE_LIMITS_EXCEEDED false Indicates that the last error occurred due to exceeded infrastructure request rate limits. ERR_INFRA_DEPENDENCIES true Indicates that the last error occurred due to dependent objects on the infrastructure level. It is classified as a non-retryable error code. 
ERR_RETRYABLE_INFRA_DEPENDENCIES false Indicates that the last error occurred due to dependent objects on the infrastructure level, but the operation should be retried. ERR_INFRA_RESOURCES_DEPLETED true Indicates that the last error occurred due to depleted resources in the infrastructure. ERR_CLEANUP_CLUSTER_RESOURCES true Indicates that the last error occurred due to resources in the cluster that are stuck in deletion. ERR_CONFIGURATION_PROBLEM true Indicates that the last error occurred due to a configuration problem. It is classified as a non-retryable error code. ERR_RETRYABLE_CONFIGURATION_PROBLEM true Indicates that the last error occurred due to a retryable configuration problem. “Retryable” means that the occurred error is likely to be resolved in an ungraceful manner after given period of time. ERR_PROBLEMATIC_WEBHOOK true Indicates that the last error occurred due to a webhook not following the Kubernetes best practices. Please note: Errors classified as User error: true do not require a Gardener operator to resolve but can be remediated by the user (e.g. by refreshing expired infrastructure credentials). Even though ERR_INFRA_RATE_LIMITS_EXCEEDED and ERR_RETRYABLE_INFRA_DEPENDENCIES are classified as User error: false, an operator can’t provide any resolution because they are related to cloud provider issues.\nStatus Label Shoots will be automatically labeled with the shoot.gardener.cloud/status label. Its value might either be healthy, progressing, unhealthy or unknown depending on the .status.conditions, .status.lastOperation, and .status.lastErrors of the Shoot. 
This can be used as an easy filter method to find shoots based on their “health” status.\n","categories":"","description":"Shoot conditions, constraints, and error codes","excerpt":"Shoot conditions, constraints, and error codes","ref":"/docs/gardener/shoot_status/","tags":"","title":"Shoot Status"},{"body":"Supported CPU Architectures for Shoot Worker Nodes Users can create shoot clusters with worker groups having virtual machines of different architectures. CPU architecture of each worker pool can be specified in the Shoot specification as follows:\nExample Usage in a Shoot spec: provider: workers: - name: cpu-worker machine: architecture: \u003csome-cpu-architecture\u003e # optional If no value is specified for the architecture field, it defaults to amd64. For a valid shoot object, a machine type should be present in the respective CloudProfile with the same CPU architecture as specified in the Shoot yaml. Also, a valid machine image should be present in the CloudProfile that supports the required architecture specified in the Shoot worker pool.\nExample Usage in a CloudProfile spec: machineImages: - name: test-image versions: - architectures: # optional - \u003carchitecture-1\u003e - \u003carchitecture-2\u003e version: 1.2.3 machineTypes: - architecture: \u003csome-cpu-architecture\u003e cpu: \"2\" gpu: \"0\" memory: 8Gi name: test-machine Currently, Gardener supports two of the most widely used CPU architectures:\n amd64 arm64 ","categories":"","description":"","excerpt":"Supported CPU Architectures for Shoot Worker Nodes Users can create …","ref":"/docs/gardener/shoot_supported_architectures/","tags":"","title":"Shoot Supported Architectures"},{"body":"Shoot Updates and Upgrades This document describes what happens during shoot updates (changes incorporated in a newly deployed Gardener version) and during shoot upgrades (changes for version controllable by end-users).\nUpdates Updates to all aspects of the shoot cluster happen when the gardenlet reconciles 
the Shoot resource.\nWhen are Reconciliations Triggered Generally, when you change the specification of your Shoot the reconciliation will start immediately, potentially updating your cluster. Please note that you can also confine the reconciliation triggered due to your specification updates to the cluster’s maintenance time window. Please find more information in Confine Specification Changes/Updates Roll Out.\nYou can also annotate your shoot with special operation annotations (for more information, see Trigger Shoot Operations), which will cause the reconciliation to start due to your actions.\nThere is also an automatic reconciliation by Gardener. The period, i.e., how often it is performed, depends on the configuration of the Gardener administrators/operators. In some Gardener installations the operators might enable “reconciliation in maintenance time window only” (for more information, see Cluster Reconciliation), which will result in at least one reconciliation during the time configured in the Shoot’s .spec.maintenance.timeWindow field.\nWhich Updates are Applied As end-users can only control the Shoot resource’s specification but not the used Gardener version, they don’t have any influence on which of the updates are rolled out (other than those settings configurable in the Shoot). A Gardener operator can deploy a new Gardener version at any point in time. Any subsequent reconciliation of Shoots will update them by rolling out the changes incorporated in this new Gardener version.\nSome examples for such shoot updates are:\n Add a new/remove an old component to/from the shoot’s control plane running in the seed, or to/from the shoot’s system components running on the worker nodes. Change the configuration of an existing control plane/system component. 
Restart of existing control plane/system components (this might result in a short unavailability of the Kubernetes API server, e.g., when etcd or a kube-apiserver itself is being restarted) Behavioural Changes Generally, some of such updates (e.g., configuration changes) could theoretically result in different behaviour of controllers. If such changes would be backwards-incompatible, then we usually follow one of those approaches (depends on the concrete change):\n Only apply the change for new clusters. Expose a new field in the Shoot resource that lets users control this changed behaviour to enable it at a convenient point in time. Put the change behind an alpha feature gate (disabled by default) in the gardenlet (only controllable by Gardener operators), which will be promoted to beta (enabled by default) in subsequent releases (in this case, end-users have no influence on when the behaviour changes - Gardener operators should inform their end-users and provide clear timelines when they will enable the feature gate). Upgrades We consider shoot upgrades to change either the:\n Kubernetes version (.spec.kubernetes.version) Kubernetes version of the worker pool if specified (.spec.provider.workers[].kubernetes.version) Machine image version of at least one worker pool (.spec.provider.workers[].machine.image.version) Generally, an upgrade is also performed through a reconciliation of the Shoot resource, i.e., the same concepts as for shoot updates apply. If an end-user triggers an upgrade (e.g., by changing the Kubernetes version) after a new Gardener version was deployed but before the shoot was reconciled again, then this upgrade might incorporate the changes delivered with this new Gardener version.\nIn-Place vs. Rolling Updates If the Kubernetes patch version is changed, then the upgrade happens in-place. This means that the shoot worker nodes remain untouched and only the kubelet process restarts with the new Kubernetes version binary. 
The same applies to configuration changes of the kubelet.\nIf the Kubernetes minor version is changed, then the upgrade is done in a “rolling update” fashion, similar to how pods in Kubernetes are updated (when backed by a Deployment). The worker nodes will be terminated one after another and replaced by new machines. The existing workload is gracefully drained and evicted from the old worker nodes to new worker nodes, respecting the configured PodDisruptionBudgets (see Specifying a Disruption Budget for your Application).\nCustomize Rolling Update Behaviour of Shoot Worker Nodes The .spec.provider.workers[] list exposes two fields that you might configure based on your workload’s needs: maxSurge and maxUnavailable. The same concepts as in Kubernetes apply. Additionally, you might customize how the machine-controller-manager (abbrev.: MCM; the component instrumenting this rolling update) behaves. You can configure the following fields in .spec.provider.workers[].machineControllerManager:\n machineDrainTimeout: Timeout (in duration) used while draining a machine before deletion, beyond which MCM forcefully deletes the machine (default: 2h). machineHealthTimeout: Timeout (in duration) used while re-joining (in case of temporary health issues) of a machine before it is declared as failed (default: 10m). machineCreationTimeout: Timeout (in duration) used while joining (during creation) of a machine before it is declared as failed (default: 10m). maxEvictRetries: Maximum number of times an eviction is attempted on a pod before it is forcibly deleted during the draining of a machine (default: 10). nodeConditions: List of case-sensitive node conditions which will change a machine to a Failed state after the machineHealthTimeout duration. It may further be replaced with a new machine if the machine is backed by a machine-set object (defaults: KernelDeadlock, ReadonlyFilesystem, DiskPressure). 
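As a minimal sketch of how the rolling update fields described above might look together in a Shoot, consider the following excerpt. The pool name and all values are hypothetical and chosen only for illustration; the field names are the ones listed above.

```yaml
# Illustrative Shoot excerpt; the pool name and all values are hypothetical.
spec:
  provider:
    workers:
    - name: worker-pool-1          # hypothetical pool name
      maxSurge: 1                  # at most one additional machine during the roll
      maxUnavailable: 0            # no machine may be unavailable during the roll
      machineControllerManager:
        machineDrainTimeout: 30m   # MCM forcefully deletes the machine after this (default: 2h)
        machineHealthTimeout: 10m  # time before an unhealthy machine is declared failed
        machineCreationTimeout: 10m
        maxEvictRetries: 10        # eviction attempts per pod before forcible deletion
        nodeConditions:            # conditions that mark a machine as Failed
        - KernelDeadlock
        - ReadonlyFilesystem
        - DiskPressure
```

Lowering machineDrainTimeout speeds up a roll at the cost of a higher chance that workload is terminated forcefully, so it is usually worth aligning it with the PodDisruptionBudgets of your applications.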
Rolling Update Triggers Apart from the above-mentioned triggers, a rolling update of the shoot worker nodes is also triggered for some changes to your worker pool specification (.spec.provider.workers[], even if you don’t change the Kubernetes or machine image version). The complete list of fields that trigger a rolling update:\n .spec.kubernetes.version (except for patch version changes) .spec.provider.workers[].machine.image.name .spec.provider.workers[].machine.image.version .spec.provider.workers[].machine.type .spec.provider.workers[].volume.type .spec.provider.workers[].volume.size .spec.provider.workers[].providerConfig (except if feature gate NewWorkerPoolHash is enabled) .spec.provider.workers[].cri.name .spec.provider.workers[].kubernetes.version (except for patch version changes) .spec.systemComponents.nodeLocalDNS.enabled .status.credentials.rotation.certificateAuthorities.lastInitiationTime (changed by Gardener when a shoot CA rotation is initiated) .status.credentials.rotation.serviceAccountKey.lastInitiationTime (changed by Gardener when a shoot service account signing key rotation is initiated) If feature gate NewWorkerPoolHash is enabled:\n .spec.kubernetes.kubelet.kubeReserved (unless a worker pool-specific value is set) .spec.kubernetes.kubelet.systemReserved (unless a worker pool-specific value is set) .spec.kubernetes.kubelet.evictionHard (unless a worker pool-specific value is set) .spec.kubernetes.kubelet.cpuManagerPolicy (unless a worker pool-specific value is set) .spec.provider.workers[].kubernetes.kubelet.kubeReserved .spec.provider.workers[].kubernetes.kubelet.systemReserved .spec.provider.workers[].kubernetes.kubelet.evictionHard .spec.provider.workers[].kubernetes.kubelet.cpuManagerPolicy Changes to kubeReserved or systemReserved do not trigger a node roll if their sum does not change.\nGenerally, the provider extension controllers might have additional constraints for changes leading to rolling updates, so please consult the respective 
documentation as well. In particular, if the feature gate NewWorkerPoolHash is enabled and a worker pool uses the new hash, then the providerConfig as a whole is not included. Instead only fields selected by the provider extension are considered.\nRelated Documentation Shoot Operations Shoot Maintenance Confine Specification Changes/Updates Roll Out To Maintenance Time Window. ","categories":"","description":"","excerpt":"Shoot Updates and Upgrades This document describes what happens during …","ref":"/docs/gardener/shoot_updates/","tags":"","title":"Shoot Updates and Upgrades"},{"body":"Shoot Resource Customization Webhooks Gardener deploys several components/resources into the shoot cluster. Some of these resources are essential (like the kube-proxy), others are optional addons (like the kubernetes-dashboard or the nginx-ingress-controller). In either case, some provider extensions might need to mutate these resources and inject provider-specific bits into it.\nWhat’s the approach to implement such mutations? Similar to how control plane components in the seed are modified, we are using MutatingWebhookConfigurations to achieve the same for resources in the shoot. Both the provider extension and the kube-apiserver of the shoot cluster are running in the same seed. Consequently, the kube-apiserver can talk cluster-internally to the provider extension webhook, which makes such operations even faster.\nHow is the MutatingWebhookConfiguration object created in the shoot? The preferred approach is to use a ManagedResource (see also Deploy Resources to the Shoot Cluster) in the seed cluster. This way the gardener-resource-manager ensures that end-users cannot delete/modify the webhook configuration. The provider extension doesn’t need to care about the same.\nWhat else is needed? The shoot’s kube-apiserver must be allowed to talk to the provider extension. 
To achieve this, you need to make sure that the relevant NetworkPolicy gets created to allow the network traffic. Please refer to this guide for more information.\n","categories":"","description":"","excerpt":"Shoot Resource Customization Webhooks Gardener deploys several …","ref":"/docs/gardener/extensions/shoot-webhooks/","tags":"","title":"Shoot Webhooks"},{"body":"Shoot Worker Nodes Settings Users can configure settings affecting all worker nodes via .spec.provider.workersSettings in the Shoot resource.\nSSH Access SSHAccess indicates whether the sshd.service should be running on the worker nodes. This is ensured by a systemd service called sshd-ensurer.service which runs every 15 seconds on each worker node. When set to true, the systemd service ensures that the sshd.service is unmasked, enabled and running. If it is set to false, the systemd service ensures that sshd.service is disabled, masked and stopped. This also terminates all established SSH connections on the host. In addition, when this value is set to false, existing Bastion resources are deleted during Shoot reconciliation and new ones are prevented from being created, SSH keypairs are not created/rotated, SSH keypair secrets are deleted from the Garden cluster, and the gardener-user.service is not deployed to the worker nodes.\nsshAccess.enabled is set to true by default.\nExample Usage in a Shoot spec: provider: workersSettings: sshAccess: enabled: false ","categories":"","description":"Configuring SSH Access through `.spec.provider.workersSettings`","excerpt":"Configuring SSH Access through `.spec.provider.workersSettings`","ref":"/docs/gardener/shoot_workers_settings/","tags":"","title":"Shoot Worker Nodes Settings"},{"body":"Shortcodes are the Hugo way to extend the limitations of Markdown before resorting to HTML. There are a number of built-in shortcodes available from Hugo. This list is extended with Gardener website shortcodes designed specifically for its content. 
Find a complete reference to the Hugo built-in shortcodes on its website.\nBelow is a reference to the shortcodes developed for the Gardener website.\nalert {{% alert color=\"info\" title=\"Notice\" %}} text {{% /alert %}} produces Notice A notice disclaimer All the color options are info|warning|primary\nYou can also omit the title section from an alert, useful when creating notes.\nIt is important to note that the text that the “alerts” shortcode wraps will not be processed during site building. Do not use shortcodes in it.\nYou should also avoid mixing HTML and markdown formatting in shortcodes, since it won’t render correctly when the site is built.\nAlert Examples Info color Warning color Primary color mermaid The GitHub mermaid fenced code block syntax is used. You can find additional documentation at mermaid’s official website.\n```mermaid graph LR; A[Hard edge] --\u003e|Link text| B(Round edge) B --\u003e C{Decision} C --\u003e|One| D[Result one] C --\u003e|Two| E[Result two] ``` produces:\ngraph LR; A[Hard edge] --\u003e|Link text| B(Round edge) B --\u003e C{Decision} C --\u003e|One| D[Result one] C --\u003e|Two| E[Result two] Default settings can be overridden using the %%init%% header at the start of the diagram definition. 
See the mermaid theming documentation.\n```mermaid %%{init: {'theme': 'neutral', 'themeVariables': { 'mainBkg': '#eee'}}}%% graph LR; A[Hard edge] --\u003e|Link text| B(Round edge) B --\u003e C{Decision} C --\u003e|One| D[Result one] C --\u003e|Two| E[Result two] ``` produces:\n%%{init: {'theme': 'neutral', 'themeVariables': { 'mainBkg': '#eee'}}}%% graph LR; A[Hard edge] --\u003e|Link text| B(Round edge) B --\u003e C{Decision} C --\u003e|One| D[Result one] C --\u003e|Two| E[Result two] ","categories":"","description":"","excerpt":"Shortcodes are the Hugo way to extend the limitations of Markdown …","ref":"/docs/contribute/documentation/shortcodes/","tags":"","title":"Shortcodes"},{"body":"This page gives writing style guidelines for the Gardener documentation. For formatting guidelines, see the Formatting Guide.\nThese are guidelines, not rules. Use your best judgment, and feel free to propose changes to this document in a Pull Request.\n Structure Language and Grammar Related Links Structure Documentation Types Overview The following table summarizes the types of documentation and their mapping to the SAP UA taxonomy. Every topic you create will fall into one of these categories.\n Gardener Content Type Definition Example Content Comparable UA Content Type Concept Introduce a functionality or concept; covers background information. Services Overview, Relevant headings Concept Reference Provide a reference, for example, list all command line options of gardenctl and what they are used for. Overview of kubectl Relevant headings Reference Task A step-by-step description that allows users to complete a specific task. Upgrading kubeadm clusters Overview, Prerequisites, Steps, Result Complex Task Trail Collection of all other content types to cover a big topic. Custom Networking None Maps Tutorial A combination of many tasks that allows users to complete an example task with the goal to learn the details of a given feature. 
Deploying Cassandra with a StatefulSet Overview, Prerequisites, Tasks, Result Tutorial See the Contributors Guide for more details on how to produce and contribute documentation.\nTopic Structure When creating a topic, you will need to follow a certain structure. A topic generally consists of, in order:\n Metadata (Specific for .md files in Gardener) - Additional information about the topic\n Title - A short, descriptive name for the topic\n Content - The main part of the topic. It contains all the information relevant to the user\n Concept content: Overview, Relevant headings Task content: Overview, Prerequisites, Steps, Result Reference content: Relevant headings Related Links (Optional) - A part after the main content that contains links that are not a part of the topic, but are still connected to it\n You can use the provided content description files as a template for your own topics.\nFront Matter Front matter is metadata applied at the head of each content Markdown file. It is used to instruct the static site generator build process. The format is YAML and it must be enclosed in leading and trailing comment dashes (---).\nSample codeblock:\n--- title: Getting Started description: Guides to get you accustomed with Gardener weight: 10 --- There are a number of predefined front matter properties, but not all of them are considered by the layouts developed for the website. The most essential ones to consider are:\n title the content title that will be used as page title and in navigation structures. description describes the content. For some content types such as documentation guides, it may be rendered in the UI. weight a positive integer number that controls the ordering of the content in navigation structures. url if specified, it will override the default url constructed from the file path to the content. Make sure the url you specify is consistent and meaningful. Prefer short paths. Do not provide redundant URLs! 
persona specifies the type of user the topic is aimed towards. Use only a single persona per topic. persona: Users / Operators / Developers While this section will be automatically generated if your topic has a title header, adding more detailed information helps other users, developers, and technical writers better sort, classify and understand the topic.\nBy using a metadata section you can also skip adding a title header or overwrite it in the navigation section.\nAlerts If you want to add a note, tip or a warning to your topic, use the templates provided in the Shortcodes documentation.\nImages If you want to add an image to your topic, it is recommended to follow the guidelines outlined in the Images documentation.\nGeneral Tips Try to create a succinct title and an informative description for your topics If a topic feels too long, it might be better to split it into a few different ones Avoid having more than ten steps in one task topic When writing a tutorial, link the tasks used in it instead of copying their content Language and Grammar Language Gardener documentation uses US English Keep it simple and use words that non-native English speakers are also familiar with Use the Merriam-Webster Dictionary when checking the spelling of words Writing Style Write in a conversational manner and use simple present tense Be friendly and refer to the person reading your content as “you”, instead of standard terms such as “user” Use an active voice - make it clear who is performing the action Creating Titles and Headers Use title case when creating titles or headers Avoid adding additional formatting to the title or header Concept and reference topic titles should be simple and succinct Task and tutorial topic titles begin with a verb Related Links Formatting Guide Contributors Guide Shortcodes Images SAPterm ","categories":"","description":"","excerpt":"This page gives writing style guidelines for the Gardener 
…","ref":"/docs/contribute/documentation/style-guide/","tags":"","title":"Style Guide"},{"body":"Supported Kubernetes Versions We strongly recommend using etcd-druid with the supported kubernetes versions, published in this document. The following is a list of kubernetes versions supported by the respective etcd-druid versions.\n Etcd-druid version Kubernetes version \u003e=0.20 \u003e=1.21 \u003e=0.14 \u0026\u0026 \u003c0.20 All versions supported \u003c0.14 \u003c 1.25 ","categories":"","description":"","excerpt":"Supported Kubernetes Versions We strongly recommend using etcd-druid …","ref":"/docs/other-components/etcd-druid/supported_k8s_versions/","tags":"","title":"Supported K8s Versions"},{"body":"Supported Kubernetes Versions Currently, Gardener supports the following Kubernetes versions:\nGarden Clusters The minimum version of a garden cluster that can be used to run Gardener is 1.25.x.\nSeed Clusters The minimum version of a seed cluster that can be connected to Gardener is 1.25.x.\nShoot Clusters Gardener itself is capable of spinning up clusters with Kubernetes versions 1.25 up to 1.30. However, the concrete versions that can be used for shoot clusters depend on the installed provider extension. Consequently, please consult the documentation of your provider extension to see which Kubernetes versions are supported for shoot clusters.\n 👨🏼‍💻 Developers note: The Adding Support For a New Kubernetes Version topic explains what needs to be done in order to add support for a new Kubernetes version.\n ","categories":"","description":"","excerpt":"Supported Kubernetes Versions Currently, Gardener supports the …","ref":"/docs/gardener/supported_k8s_versions/","tags":"","title":"Supported Kubernetes Versions"},{"body":"Gardener Extension for SUSE CHost \nThis controller operates on the OperatingSystemConfig resource in the extensions.gardener.cloud/v1alpha1 API group. It manages those objects that are requesting SUSE Container Host configuration, i.e. 
suse-chost type:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: suse-chost units: ... files: ... Please find a concrete example in the example folder.\nIt is also capable of supporting the vSMP MemoryOne operating system with the memoryone-chost type. Please find more information here.\nAfter reconciliation the resulting data will be stored in a secret within the same namespace (as the config itself might contain confidential data). The name of the secret will be written into the resource’s .status field:\n... status: ... cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default command: /usr/bin/env bash \u003cpath\u003e units: - docker-monitor.service - kubelet-monitor.service - kubelet.service The secret has one data key cloud_config that stores the generation.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file. Static code checks and tests can be executed by running make verify. We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! 
Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for the SUSE Container Host operating system (CHost)","excerpt":"Gardener extension controller for the SUSE Container Host operating …","ref":"/docs/extensions/os-extensions/gardener-extension-os-suse-chost/","tags":"","title":"SUSE CHost OS"},{"body":"Problem One thing that always bothered me was that I couldn’t get logs of several pods at once with kubectl. A simple tail -f \u003cpath-to-logfile\u003e isn’t possible at all. Certainly, you can use kubectl logs -f \u003cpod-id\u003e, but it doesn’t help if you want to monitor more than one pod at a time.\nThis is something you really need a lot, at least if you run several instances of a pod behind a deployment. This is even more so if you don’t have a Kibana or a similar setup.\nSolution Luckily, there are smart developers out there who always come up with solutions. The finding of the week is a small bash script that allows you to aggregate log files of several pods at the same time in a simple way. 
The script is called kubetail and is available on GitHub.\n","categories":"","description":"Aggregate log files from different pods","excerpt":"Aggregate log files from different pods","ref":"/docs/guides/monitoring-and-troubleshooting/tail-logfile/","tags":"","title":"tail -f /var/log/my-application.log"},{"body":"Access the Kubernetes apiserver from your tailnet Overview If you would like to strengthen the security of your Kubernetes cluster even further, this guide explains how this can be achieved.\nThe most common way to secure a Kubernetes cluster which was created with Gardener is to apply the ACLs described in the Gardener ACL Extension repository or to use ExposureClass, which exposes the Kubernetes apiserver in a corporate network not exposed to the public internet.\nHowever, those solutions are not without their drawbacks. Managing the ACL extension becomes fairly difficult with the growing number of participants, especially in a dynamic environment and work from home scenarios, and using ExposureClass requires you to first have a corporate network suitable for this purpose.\nBut there is a solution which bridges the gap between these two approaches by the use of a mesh VPN based on WireGuard.\nTailscale Tailscale is a mesh VPN network which uses WireGuard under the hood, but automates the key exchange procedure. Please consult the official tailscale documentation for a detailed explanation.\nTarget Architecture Installation In order to be able to access the Kubernetes apiserver only from a tailscale VPN, you need to complete these steps:\n Create a tailscale account and ensure MagicDNS is enabled. Create an OAuth ClientID and Secret. Don’t forget to create the required tags. Install the tailscale operator. If all went well after the operator installation, you should be able to see the tailscale operator by running tailscale status:\n# tailscale status ... 100.83.240.121 tailscale-operator tagged-devices linux - ... 
Expose the Kubernetes apiserver Now you are ready to expose the Kubernetes apiserver in the tailnet by annotating the service which was created by Gardener:\nkubectl annotate -n default service kubernetes tailscale.com/expose=true tailscale.com/hostname=kubernetes It is required to use kubernetes as the hostname, because this is part of the certificate common name of the Kubernetes apiserver.\nAfter annotating the service, it will be exposed in the tailnet and can be shown by running tailscale status:\n# tailscale status ... 100.83.240.121 tailscale-operator tagged-devices linux - 100.96.191.87 kubernetes tagged-devices linux idle, tx 19548 rx 71656 ... Modify the kubeconfig In order to access the cluster via the VPN, you must modify the kubeconfig to point to the Kubernetes service exposed in the tailnet, by changing the server entry to https://kubernetes.\n--- apiVersion: v1 clusters: - cluster: certificate-authority-data: \u003cbase64 encoded secret\u003e server: https://kubernetes name: my-cluster ... Enable ACLs to Block All IPs Now you are ready to use your cluster from every device which is part of your tailnet. Therefore, you can now block all access to the Kubernetes apiserver with the ACL extension.\nCaveats Multiple Kubernetes Clusters You cannot join multiple Kubernetes clusters to your tailnet, because the kubernetes service in every cluster would overlap.\nHeadscale It is possible to host a tailscale coordination server on your own if you do not want to rely on the service tailscale.com offers. The headscale project is an open source implementation of this.\nThis works for basic tailscale VPN setups, but not for the tailscale operator described here, because headscale does not implement all required API endpoints for the tailscale operator. 
The details can be found in this GitHub issue.\n","categories":"","description":"","excerpt":"Access the Kubernetes apiserver from your tailnet Overview If you …","ref":"/docs/guides/administer-shoots/tailscale/","tags":"","title":"Tailscale"},{"body":"Taints and Tolerations for Seeds and Shoots Similar to taints and tolerations for Nodes and Pods in Kubernetes, the Seed resource supports specifying taints (.spec.taints, see this example) while the Shoot resource supports specifying tolerations (.spec.tolerations, see this example). The feature is used to control scheduling to seeds as well as decisions whether a shoot can use a certain seed.\nCompared to Kubernetes, Gardener’s taints and tolerations are very much stripped down right now and have some behavioral differences. Please read the following explanations carefully if you plan to use them.\nScheduling When scheduling a new shoot, the gardener-scheduler will filter out all seed candidates whose taints are not tolerated by the shoot. As Gardener’s taints/tolerations don’t support effects yet, you can compare this behaviour with using a NoSchedule effect taint in Kubernetes.\nBe reminded that taints/tolerations are not a means to define any affinity or selection for seeds - please use .spec.seedSelector in the Shoot to state such desires.\n⚠️ Please note that - unlike how it’s implemented in Kubernetes - a certain seed cluster may only be used when the shoot tolerates all the seed’s taints. This means that specifying .spec.seedName for a seed whose taints are not tolerated will make the gardener-apiserver reject the request.\nConsequently, the taints/tolerations feature can be used as a means to restrict usage of certain seeds.\nToleration Defaults and Whitelist The Project resource features a .spec.tolerations object that may carry defaults and a whitelist (see this example). The corresponding ShootTolerationRestriction admission plugin (cf. 
Kubernetes’ PodTolerationRestriction admission plugin) is responsible for evaluating these settings during creation/update of Shoots.\nWhitelist If a shoot gets created or updated with tolerations, then it is validated that only those tolerations may be used that were added to either a) the Project’s .spec.tolerations.whitelist, or b) to the global whitelist in the ShootTolerationRestriction’s admission config (see this example).\n⚠️ Please note that the tolerations whitelist of Projects can only be changed if the user trying to change it is bound to the modify-spec-tolerations-whitelist custom RBAC role, e.g., via the following ClusterRole:\napiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: name: full-project-modification-access rules: - apiGroups: - core.gardener.cloud resources: - projects verbs: - create - patch - update - modify-spec-tolerations-whitelist - delete Defaults If a shoot gets created, then the default tolerations specified in both the Project’s .spec.tolerations.defaults and the global default list in the ShootTolerationRestriction admission plugin’s configuration will be added to the .spec.tolerations of the Shoot (unless it already specifies a certain key).\n","categories":"","description":"","excerpt":"Taints and Tolerations for Seeds and Shoots Similar to taints and …","ref":"/docs/gardener/tolerations/","tags":"","title":"Taints and Tolerations for Seeds and Shoots"},{"body":"Task Title (the topic title can also be placed in the frontmatter)\nOverview This section provides an overview of the topic and the information provided in it.\nPrerequisites Prerequisite 1 Prerequisite 2 Steps Avoid nesting headings directly on top of each other with no text in between.\n Describe step 1 here Describe step 2 here You can use smaller sections within sections for related tasks Avoid nesting headings directly on top of each other with no text in between.\n Describe step 1 here Describe step 2 here Result Screenshot of the final status once 
all the steps have been completed.\nRelated Links Provide links to other relevant topics, if applicable. Once someone has completed these steps, what might they want to do next?\n Link 1 Link 2 ","categories":"","description":"Describes the contents of a task topic","excerpt":"Describes the contents of a task topic","ref":"/docs/contribute/documentation/style-guide/task_template/","tags":"","title":"Task Topic Structure"},{"body":"Terminal Shortcuts As a user and/or Gardener administrator, you can configure terminal shortcuts, which are preconfigured terminals for frequently used views.\nYou can launch the terminal shortcuts directly on the shoot details screen. You can view the definition of a terminal shortcut by clicking on the eye icon. What has also improved is that when creating a new terminal you can directly alter the configuration. With expanded configuration On the Create Terminal Session dialog you can choose one or multiple terminal shortcuts. Project specific terminal shortcuts created (by a member of the project) have a project icon badge and are listed as Unverified. A warning message is displayed before a project specific terminal shortcut is run, informing the user about the risks. How to create a project specific terminal shortcut\nDisclaimer: “Project specific terminal shortcuts” is an experimental feature and may change in future releases (we plan to introduce a dedicated custom resource).\nYou need to create a secret with the name terminal.shortcuts within your project namespace, containing your terminal shortcut configurations. Under data.shortcuts you add a list of terminal shortcuts (base64 encoded). 
Example terminal.shortcuts secret:\nkind: Secret type: Opaque metadata: name: terminal.shortcuts namespace: garden-myproject apiVersion: v1 data: shortcuts: LS0tCi0gdGl0bGU6IE5ldHdvcmtEZWxheVRlc3RzCiAgZGVzY3JpcHRpb246IFNob3cgbmV0d29ya21hY2hpbmVyeS5pbydzIE5ldHdvcmtEZWxheVRlc3RzCiAgdGFyZ2V0OiBzaG9vdAogIGNvbnRhaW5lcjoKICAgIGltYWdlOiBxdWF5LmlvL2RlcmFpbGVkL2s5czpsYXRlc3QKICAgIGFyZ3M6CiAgICAtIC0taGVhZGxlc3MKICAgIC0gLS1jb21tYW5kPW5ldHdvcmtkZWxheXRlc3QKICBzaG9vdFNlbGVjdG9yOgogICAgbWF0Y2hMYWJlbHM6CiAgICAgIGZvbzogYmFyCi0gdGl0bGU6IFNjYW4gQ2x1c3RlcgogIGRlc2NyaXB0aW9uOiBTY2FucyBsaXZlIEt1YmVybmV0ZXMgY2x1c3RlciBhbmQgcmVwb3J0cyBwb3RlbnRpYWwgaXNzdWVzIHdpdGggZGVwbG95ZWQgcmVzb3VyY2VzIGFuZCBjb25maWd1cmF0aW9ucwogIHRhcmdldDogc2hvb3QKICBjb250YWluZXI6CiAgICBpbWFnZTogcXVheS5pby9kZXJhaWxlZC9rOXM6bGF0ZXN0CiAgICBhcmdzOgogICAgLSAtLWhlYWRsZXNzCiAgICAtIC0tY29tbWFuZD1wb3BleWU= How to configure the dashboard with terminal shortcuts Example values.yaml:\nfrontend: features: terminalEnabled: true projectTerminalShortcutsEnabled: true # members can create a `terminal.shortcuts` secret containing the project specific terminal shortcuts terminal: shortcuts: - title: \"Control Plane Pods\" description: Using K9s to view the pods of the control plane for this cluster target: cp container: image: quay.io/derailed/k9s:latest args: - \"--headless\" - \"--command=pods\" - title: \"Cluster Overview\" description: This gives a quick overview about the status of your cluster using K9s pulse feature target: shoot container: image: quay.io/derailed/k9s:latest args: - \"--headless\" - \"--command=pulses\" - title: \"Nodes\" description: View the nodes for this cluster target: shoot container: image: quay.io/derailed/k9s:latest command: - bin/sh args: - -c - sleep 1 \u0026\u0026 while true; do k9s --headless --command=nodes; done # shootSelector: # matchLabels: # foo: bar [...] 
terminal: # is generally required for the terminal feature container: image: europe-docker.pkg.dev/gardener-project/releases/gardener/ops-toolbelt:0.26.0 containerImageDescriptions: - image: /.*/ops-toolbelt:.*/ description: Run `ghelp` to get information about installed tools and packages gardenTerminalHost: seedRef: my-soil garden: operatorCredentials: serviceAccountRef: name: dashboard-terminal-admin namespace: garden ","categories":"","description":"","excerpt":"Terminal Shortcuts As a user and/or Gardener administrator, you can …","ref":"/docs/dashboard/terminal-shortcuts/","tags":"","title":"Terminal Shortcuts"},{"body":"Testing Jest We use the Jest JavaScript Testing Framework\n Jest can collect code coverage information Jest supports snapshot testing out of the box All in One solution. Replaces Mocha, Chai, Sinon and Istanbul It works with Vue.js and Node.js projects To execute all tests, simply run\nyarn workspaces foreach --all run test or to include test coverage generation\nyarn workspaces foreach --all run test-coverage You can also run tests for frontend, backend and charts directly inside the respective folder via\nyarn test Lint We use ESLint for static code analysis.\nTo execute, run\nyarn workspaces foreach --all run lint ","categories":"","description":"","excerpt":"Testing Jest We use Jest JavaScript Testing Framework\n Jest can …","ref":"/docs/dashboard/testing/","tags":"","title":"Testing"},{"body":"Testing Strategy and Developer Guideline This document walks you through:\n What kind of tests we have in Gardener How to run each of them What purpose each kind of test serves How to best write tests that are correct, stable, fast and maintainable How to debug tests that are not working as expected The document is aimed towards developers that want to contribute code and need to write tests, as well as maintainers and reviewers that review test code. 
It serves as a common guide that we commit to follow in our project to ensure consistency in our tests, good coverage for high confidence, and good maintainability.\nThe guidelines are not meant to be absolute rules. Always apply common sense and adapt the guideline if it doesn’t make much sense for some cases. If in doubt, don’t hesitate to ask questions during a PR review (as an author, but also as a reviewer). Add new learnings as soon as we make them!\nGenerally speaking, tests are a strict requirement for contributing new code. If you touch code that is currently untested, you need to add tests for the new cases that you introduce as a minimum. Ideally though, you would add the missing test cases for the current code as well (boy scout rule – “always leave the campground cleaner than you found it”).\nWriting Tests (Relevant for All Kinds) We follow BDD (behavior-driven development) testing principles and use Ginkgo, along with Gomega. Make sure to check out their extensive guides for more information and how to best leverage all of their features Use By to structure test cases with multiple steps, so that steps are easy to follow in the logs: example test Call defer GinkgoRecover() if making assertions in goroutines: doc, example test Use DeferCleanup instead of cleaning up manually (or use custom coding from the test framework): example test, example test DeferCleanup makes sure to run the cleanup code in the right point in time, e.g., a DeferCleanup added in BeforeEach is executed with AfterEach. Test results should point to locations that cause the failures, so that the CI output isn’t too difficult to debug/fix. Consider using ExpectWithOffset if the test uses assertions made in a helper function, among other assertions defined directly in the test (e.g. expectSomethingWasCreated): example test Make sure to add additional descriptions to Gomega matchers if necessary (e.g. 
in a loop): example test Introduce helper functions for assertions to make test more readable where applicable: example test Introduce custom matchers to make tests more readable where applicable: example matcher Don’t rely on accurate timing of time.Sleep and friends. If doing so, CPU throttling in CI will make tests flaky, example flake Use fake clocks instead, example PR Use the same client schemes that are also used by production code to avoid subtle bugs/regressions: example PR, production schemes, usage in test Make sure that your test is actually asserting the right thing and it doesn’t pass if the exact bug is introduced that you want to prevent. Use specific error matchers instead of asserting any error has happened, make sure that the corresponding branch in the code is tested, e.g., prefer Expect(err).To(MatchError(\"foo\")) over Expect(err).To(HaveOccurred()) If you’re unsure about your test’s behavior, attaching the debugger can sometimes be helpful to make sure your test is correct. About overwriting global variables: This is a common pattern (or hack?) in go for faking calls to external functions. However, this can lead to races, when the global variable is used from a goroutine (e.g., the function is called). Alternatively, set fields on structs (passed via parameter or set directly): this is not racy, as struct values are typically (and should be) only used for a single test case. An alternative to dealing with function variables and fields: Add an interface which your code depends on Write a fake and a real implementation (similar to clock.Clock.Sleep) The real implementation calls the actual function (clock.RealClock.Sleep calls time.Sleep) The fake implementation does whatever you want it to do for your test (clock.FakeClock.Sleep waits until the test code advanced the time) Use constants in test code with care. Typically, you should not use constants from the same package as the tested code, instead use literals. 
If the constant value is changed, tests using the constant will still pass, although the “specification” is not fulfilled anymore. There are cases where it’s fine to use constants, but keep this caveat in mind when doing so. Creating sample data for tests can be a high effort. If valuable, add a package for generating common sample data, e.g. Shoot/Cluster objects. Make use of the testdata directory for storing arbitrary sample data needed by tests (helm charts, YAML manifests, etc.), example PR From https://pkg.go.dev/cmd/go/internal/test: The go tool will ignore a directory named “testdata”, making it available to hold ancillary data needed by the tests.\n Unit Tests Running Unit Tests Run all unit tests:\nmake test Run all unit tests with test coverage:\nmake test-cov open test.coverage.html make test-cov-clean Run unit tests of specific packages:\n# run with same settings like in CI (race detector, timeout, ...) ./hack/test.sh ./pkg/resourcemanager/controller/... ./pkg/utils/secrets/... # freestyle go test ./pkg/resourcemanager/controller/... ./pkg/utils/secrets/... ginkgo run ./pkg/resourcemanager/controller/... ./pkg/utils/secrets/... Debugging Unit Tests Use ginkgo to focus on (a set of) test specs via code or via CLI flags. Remember to unfocus specs before contributing code, otherwise your PR tests will fail.\n$ ginkgo run --focus \"should delete the unused resources\" ./pkg/resourcemanager/controller/garbagecollector ... Will run 1 of 3 specs SS• Ran 1 of 3 Specs in 0.003 seconds SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 2 Skipped PASS Use ginkgo to run tests until they fail:\n$ ginkgo run --until-it-fails ./pkg/resourcemanager/controller/garbagecollector ... Ran 3 of 3 Specs in 0.004 seconds SUCCESS! -- 3 Passed | 0 Failed | 0 Pending | 0 Skipped PASS All tests passed... Will keep running them until they fail. This was attempt #58 No, seriously... you can probably stop now. 
Use the stress tool for deflaking tests that fail sporadically in CI, e.g., due to resource contention (CPU throttling):\n# get the stress tool go install golang.org/x/tools/cmd/stress@latest # build a test binary ginkgo build ./pkg/resourcemanager/controller/garbagecollector # alternatively go test -c ./pkg/resourcemanager/controller/garbagecollector # run the test in parallel and report any failures stress -p 16 ./pkg/resourcemanager/controller/garbagecollector/garbagecollector.test -ginkgo.focus \"should delete the unused resources\" 5s: 1077 runs so far, 0 failures 10s: 2160 runs so far, 0 failures stress will output a path to a file containing the full failure message when a test run fails.\nPurpose of Unit Tests Unit tests prove the correctness of a single unit according to the specification of its interface. Think: Is the unit that I introduced doing what it is supposed to do for all cases? Unit tests protect against regressions caused by adding new functionality to or refactoring of a single unit. Think: Is the unit that was introduced earlier (by someone else) and that I changed still doing what it was supposed to do for all cases? Example units: functions (conversion, defaulting, validation, helpers), structs (helpers, basic building blocks like the Secrets Manager), predicates, event handlers. For these purposes, unit tests need to cover all important cases of input for a single unit and cover edge cases / negative paths as well (e.g., errors). Because of the possible high dimensionality of test input, unit tests need to be fast to execute: individual test cases should not take more than a few seconds, test suites not more than 2 minutes. Fuzzing can be used as a technique in addition to usual test cases for covering edge cases. Test coverage can be used as a tool during test development for covering all cases of a unit. However, test coverage data can be a false safety net. Full line coverage doesn’t mean you have covered all cases of valid input. 
We don’t have strict requirements for test coverage, as it doesn’t necessarily yield the desired outcome. Unit tests should not test too large components, e.g. entire controller Reconcile functions. If a function/component does many steps, it’s probably better to split it up into multiple functions/components that can be unit tested individually. There might be special cases for very small Reconcile functions. If there are a lot of edge cases, extract dedicated functions that cover them and use unit tests to test them. Usual-sized controllers should rather be tested in integration tests. Individual parts (e.g. helper functions) should still be tested in unit tests for covering all cases, though. Unit tests are especially easy to run with a debugger and can help in understanding concrete behavior of components. Writing Unit Tests For the sake of execution speed, fake expensive calls/operations, e.g. secret generation: example test Generally, prefer fakes over mocks, e.g., use controller-runtime fake client over mock clients. Mocks decrease maintainability because they expect the tested component to follow a certain way to reach the desired goal (e.g., call specific functions with particular arguments), example consequence Generally, fakes should be used in “result-oriented” test code (e.g., that a certain object was labelled, but the test doesn’t care if it was via patch or update as both are valid ways to reach the desired goal). Although rare, there are valid use cases for mocks, e.g. if the following aspects are important for correctness: Asserting that an exact function is called Asserting that functions are called in a specific order Asserting that exact parameters/values/… are passed Asserting that a certain function was not called Many of these can also be verified with fakes, although mocks might be simpler Only use mocks if the tested code directly calls the mock; never if the tested code only calls the mock indirectly (e.g., through a helper package/function). 
Keep in mind the maintenance implications of using mocks: Can you make a valid non-behavioral change in the code without breaking the test or dependent tests? It’s valid to mix fakes and mocks in the same test or between test cases. Generally, use the go test package, i.e., declare package \u003cproduction_package\u003e_test: Helps in avoiding cyclic dependencies between production, test and helper packages Also forces you to distinguish between the public (exported) API surface of your code and internal state that might not be of interest to tests It might be valid to use the same package as the tested code if you want to test unexported functions. Alternatively, an internal package can be used to host “internal” helpers: example package Helpers can also be exported if no one is supposed to import the containing package (e.g. controller package). Integration Tests (envtests) Integration tests in Gardener use the sigs.k8s.io/controller-runtime/pkg/envtest package. It sets up a temporary control plane (etcd + kube-apiserver) and runs the test against it. The test suites start their individual envtest environment before running the tested controller/webhook and executing test cases. Before exiting, the test suites tear down the temporary test environment.\nPackage github.com/gardener/gardener/test/envtest augments the controller-runtime’s envtest package by starting and registering gardener-apiserver. This is used to test controllers that act on resources in the Gardener APIs (aggregated APIs).\nHistorically, test machinery tests have also been called “integration tests”. However, test machinery does not perform integration testing but rather executes a form of end-to-end tests against a real landscape. 
Hence, we tried to sharpen the terminology that we use to distinguish between “real” integration tests and test machinery tests, but you might still find “integration tests” referring to test machinery tests in old issues or outdated documents.\nRunning Integration Tests The test-integration make rule prepares the environment automatically by downloading the respective binaries (if not yet present) and setting the necessary environment variables.\nmake test-integration If you want to run a specific set of integration tests, you can also execute them using ./hack/test-integration.sh directly instead of using the test-integration rule. Prior to execution, the PATH environment variable needs to be set to also include the tools binary directory. For example:\nexport PATH=\"$PWD/hack/tools/bin/$(go env GOOS)-$(go env GOARCH):$PATH\" source ./hack/test-integration.env ./hack/test-integration.sh ./test/integration/resourcemanager/tokenrequestor The script takes care of preparing the environment for you. If you want to execute the test suites directly via go test or ginkgo, you have to point the KUBEBUILDER_ASSETS environment variable to the path that contains the etcd and kube-apiserver binaries. Alternatively, you can install the binaries to /usr/local/kubebuilder/bin. Additionally, the environment variables from hack/test-integration.env should be sourced.\nDebugging Integration Tests You can configure envtest to use an existing cluster or control plane instead of starting a temporary control plane that is torn down immediately after executing the test. This can be helpful for debugging integration tests because you can easily inspect what is going on in your test environment with kubectl.\nWhile you can use an existing cluster (e.g., kind), some test suites expect that no controllers and no nodes are running in the test environment (as is the case in envtest test environments). 
Hence, using a full-blown cluster with controllers and nodes might sometimes be impractical, as you would need to stop cluster components for the tests to work.\nYou can use make start-envtest to start an envtest test environment that is managed separately from individual test suites. This allows you to keep the test environment running for as long as you want, and to debug integration tests by executing multiple test runs in parallel or inspecting test runs using kubectl. When you are finished, just hit CTRL-C for tearing down the test environment. The kubeconfig for the test environment is placed in dev/envtest-kubeconfig.yaml.\nmake start-envtest brings up an envtest environment using the default configuration. If your test suite requires a different control plane configuration (e.g., disabled admission plugins or enabled feature gates), feel free to locally modify the configuration in test/start-envtest while debugging.\nRun an envtest suite (not using gardener-apiserver) against an existing test environment:\nmake start-envtest # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml export USE_EXISTING_CLUSTER=true # run test with verbose output ./hack/test-integration.sh -v ./test/integration/resourcemanager/health -ginkgo.v # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml # watch test objects k get managedresource -A -w Run a gardenerenvtest suite (using gardener-apiserver) against an existing test environment:\n# modify GardenerTestEnvironment{} in test/start-envtest to disable admission plugins and enable feature gates like in test suite... 
make start-envtest ENVTEST_TYPE=gardener # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml export USE_EXISTING_GARDENER=true # run test with verbose output ./hack/test-integration.sh -v ./test/integration/controllermanager/bastion -ginkgo.v # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml # watch test objects k get bastion -A -w Similar to debugging unit tests, the stress tool can help hunt flakes in integration tests. However, you might need to run fewer tests in parallel (specified via -p) and have a bit more patience. Generally, reproducing flakes in integration tests is easier when stress-testing against an existing test environment instead of starting temporary individual control planes per test run.\nStress-test an envtest suite (not using gardener-apiserver):\n# build a test binary ginkgo build ./test/integration/resourcemanager/health # prepare a test environment to run the test against make start-envtest # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml export USE_EXISTING_CLUSTER=true # use same timeout settings like in CI source ./hack/test-integration.env # switch to test package directory like `go test` cd ./test/integration/resourcemanager/health # run the test in parallel and report any failures stress -ignore \"unable to grab random port\" -p 16 ./health.test ... Stress-test a gardenerenvtest suite (using gardener-apiserver):\n# modify test/start-envtest to disable admission plugins and enable feature gates like in test suite... 
# build a test binary ginkgo build ./test/integration/controllermanager/bastion # prepare a test environment including gardener-apiserver to run the test against make start-envtest ENVTEST_TYPE=gardener # in another terminal session: export KUBECONFIG=$PWD/dev/envtest-kubeconfig.yaml export USE_EXISTING_GARDENER=true # use same timeout settings like in CI source ./hack/test-integration.env # switch to test package directory like `go test` cd ./test/integration/controllermanager/bastion # run the test in parallel and report any failures stress -ignore \"unable to grab random port\" -p 16 ./bastion.test ... Purpose of Integration Tests Integration tests prove that multiple units are correctly integrated into a fully-functional component of the system. Example components with multiple units: A controller with its reconciler, watches, predicates, event handlers, queues, etc. A webhook with its server, handler, decoder, and webhook configuration. Integration tests set up a full component (including used libraries) and run it against a test environment close to the actual setup. e.g., start controllers against a real Kubernetes control plane to catch bugs that can only happen when talking to a real API server. Integration tests are generally more expensive to run (e.g., in terms of execution time). Integration tests should not cover each and every detailed case. Rather than that, cover a good portion of the “usual” cases that components will face during normal operation (positive and negative test cases). Also, there is no need to cover all failure cases or all cases of predicates -\u003e they should be covered in unit tests already. Generally, not supposed to “generate test coverage” but to provide confidence that components work well. As integration tests typically test only one component (or a cohesive set of components) isolated from others, they cannot catch bugs that occur when multiple controllers interact (could be discovered by e2e tests, though). 
Rule of thumb: a new integration test should be added for each new controller (an integration test doesn’t replace unit tests though). Writing Integration Tests Make sure to have a clean test environment on both test suite and test case level: Set up dedicated test environments (envtest instances) per test suite. Use dedicated namespaces per test suite: Use GenerateName with a test-specific prefix: example test Restrict the controller-runtime manager to the test namespace by setting manager.Options.Namespace: example test Alternatively, use a test-specific prefix with a random suffix determined upfront: example test This can be used to restrict webhooks to a dedicated test namespace: example test This allows running a test in parallel against the same existing cluster for deflaking and stress testing: example PR If the controller works on cluster-scoped resources: Label the resources with a label specific to the test run, e.g. the test namespace’s name: example test Restrict the manager’s cache for these objects with a corresponding label selector: example test Alternatively, use a checksum of a random UUID using uuid.NewUUID() function: example test This allows running a test in parallel against the same existing cluster for deflaking and stress testing, even if it works with cluster-scoped resources that are visible to all parallel test runs: example PR Use dedicated test resources for each test case: Use GenerateName: example test Alternatively, use a checksum of a random UUID using uuid.NewUUID() function: example test Logging the created object names is generally a good idea to support debugging failing or flaky tests: example test Always delete all resources after the test case (e.g., via DeferCleanup) that were created for the test case This avoids conflicts between test cases and cascading failures which distract from the actual root failures Don’t tolerate already existing resources (~dirty test environment), code smell: ignoring already exist errors 
Don’t use a cached client in test code (e.g., the one from a controller-runtime manager), always construct a dedicated test client (uncached): example test Use asynchronous assertions: Eventually and Consistently. Never Expect anything to happen synchronously (immediately). Don’t use retry or wait until functions -\u003e use Eventually, Consistently instead: example test This allows overriding the interval/timeout values from outside instead of hard-coding them in the test (see hack/test-integration.sh): example PR Beware of the default Eventually / Consistently timeouts / poll intervals: docs Don’t set custom (high) timeouts and intervals in test code: example PR Instead, shorten sync period of controllers, overwrite intervals of the tested code, or use fake clocks: example test Pass g Gomega to Eventually/Consistently and use g.Expect in it: docs, example test, example PR Don’t forget to call {Eventually,Consistently}.Should(), otherwise the assertion always silently succeeds without errors: onsi/gomega#561 When using Gardener’s envtest (envtest.GardenerTestEnvironment): Disable gardener-apiserver’s admission plugins that are not relevant to the integration test itself by passing --disable-admission-plugins: example test This makes setup / teardown code simpler and ensures that the test only covers code relevant to the tested component itself (but not the entire set of admission plugins) e.g., you can disable the ShootValidator plugin to create Shoots that reference non-existing SecretBindings or disable the DeletionConfirmation plugin to delete Gardener resources without adding a deletion confirmation first. Use a custom rate limiter for controllers in integration tests: example test This can be used for limiting exponential backoff to shorten wait times. Otherwise, if using the default rate limiter, exponential backoff might exceed the timeout of Eventually calls and cause flakes. 
End-to-End (e2e) Tests (Using provider-local) We run a suite of e2e tests on every pull request and periodically on the master branch. It uses a KinD cluster and skaffold to bootstrap a full installation of Gardener based on the current revision, including provider-local. This allows us to run e2e tests in an isolated test environment and fully locally without any infrastructure interaction. The tests perform a set of operations on Shoot clusters, e.g. creating, deleting, hibernating and waking up.\nThese tests are executed in our prow instance at prow.gardener.cloud, see job definition and job history.\nRunning e2e Tests You can also run these tests on your development machine, using the following commands:\nmake kind-up export KUBECONFIG=$PWD/example/gardener-local/kind/local/kubeconfig make gardener-up make test-e2e-local # alternatively: make test-e2e-local-simple If you want to run a specific set of e2e test cases, you can also execute them using ./hack/test-e2e-local.sh directly in combination with ginkgo label filters. For example:\n./hack/test-e2e-local.sh --label-filter \"Shoot \u0026\u0026 credentials-rotation\" ./test/e2e/gardener/... If you want to use an existing shoot instead of creating a new one for the test case and deleting it afterwards, you can specify the existing shoot via the following flags. This can be useful to speed up the development of e2e tests.\n./hack/test-e2e-local.sh --label-filter \"Shoot \u0026\u0026 credentials-rotation\" ./test/e2e/gardener/... -- --project-namespace=garden-local --existing-shoot-name=local For more information, see Developing Gardener Locally and Deploying Gardener Locally.\nDebugging e2e Tests When debugging e2e test failures in CI, logs of the cluster components can be very helpful. Our e2e test jobs export logs of all containers running in the kind cluster to prow’s artifacts storage. You can find them by clicking the Artifacts link in the top bar in prow’s job view and navigating to artifacts. 
This directory will contain all cluster component logs grouped by node.\nPull all artifacts using gsutil for searching and filtering the logs locally (use the path displayed in the artifacts view):\ngsutil cp -r gs://gardener-prow/pr-logs/pull/gardener_gardener/6136/pull-gardener-e2e-kind/1542030416616099840/artifacts/gardener-local-control-plane /tmp Purpose of e2e Tests e2e tests provide a high level of confidence that our code runs as expected by users when deployed to production. They are supposed to catch bugs resulting from interaction between multiple components. Test cases should be as close as possible to real usage by end users: You should test “from the perspective of the user” (or operator). Example: I create a Shoot and expect to be able to connect to it via the provided kubeconfig. Accordingly, don’t assert details of the system. e.g., the user also wouldn’t expect that there is a kube-apiserver deployment in the seed, they rather expect that they can talk to it no matter how it is deployed Only assert details of the system if the tested feature is not fully visible to the end-user and there is no other way of ensuring that the feature works reliably e.g., the Shoot CA rotation is not fully visible to the user but is assertable by looking at the secrets in the Seed. Pro: can be executed by developers and users without any real infrastructure (provider-local). Con: they currently cannot be executed with real infrastructure (e.g., provider-aws), we will work on this as part of #6016. Keep in mind that the tested scenario is still artificial in a sense of using default configuration, only a few objects, only a few config/settings combinations are covered. We will never be able to cover the full “test matrix” and this should not be our goal. Bugs will still be released and will still happen in production; we can’t avoid it. 
Instead, we should add test cases for preventing bugs in features or settings that were frequently regressed: example PR Usually e2e tests cover the “straight-forward cases”. However, negative test cases can also be included, especially if they are important from the user’s perspective. Writing e2e Tests Always wrap API calls and similar things in Eventually blocks: example test At this point, we are pretty much working with a distributed system and failures can happen anytime. Wrapping calls in Eventually makes tests more stable and more realistic (usually, you wouldn’t call the system broken if a single API call fails because of a short connectivity issue). Most of the points from writing integration tests are relevant for e2e tests as well (especially the points about asynchronous assertions). In contrast to integration tests, in e2e tests, it might make sense to specify higher timeouts for Eventually calls, e.g., when waiting for a Shoot to be reconciled. Generally, try to use the default settings for Eventually specified via the environment variables. Only set higher timeouts if waiting for long-running reconciliations to be finished. Gardener Upgrade Tests (Using provider-local) Gardener upgrade tests setup a kind cluster and deploy Gardener version vX.X.X before upgrading it to a given version vY.Y.Y.\nThis allows verifying whether the current (unreleased) revision/branch (or a specific release) is compatible with the latest (or a specific other) release. The GARDENER_PREVIOUS_RELEASE and GARDENER_NEXT_RELEASE environment variables are used to specify the respective versions.\nThis helps understanding what happens or how the system reacts when Gardener upgrades from versions vX.X.X to vY.Y.Y for existing shoots in different states (creation/hibernation/wakeup/deletion). 
Gardener upgrade tests also help qualifying releases for all flavors (non-HA or HA with failure tolerance node/zone).\nJust like E2E tests, upgrade tests also use a KinD cluster and skaffold for bootstrapping a full Gardener installation based on the current revision/branch, including provider-local. This allows running e2e tests in an isolated test environment, fully locally without any infrastructure interaction. The tests perform a set of operations on Shoot clusters, e.g. create, delete, hibernate and wake up.\nBelow is a sequence describing how the tests are performed.\n Create a kind cluster. Install Gardener version vX.X.X. Run gardener pre-upgrade tests which are labeled with pre-upgrade. Upgrade Gardener version from vX.X.X to vY.Y.Y. Run gardener post-upgrade tests which are labeled with post-upgrade. Tear down seed and kind cluster. How to Run Upgrade Tests Between Two Gardener Releases Sometimes, we need to verify/qualify two Gardener releases when we upgrade from one version to another. This can be performed by fetching the two Gardener versions from the GitHub Gardener release page and setting the appropriate env variables GARDENER_PREVIOUS_RELEASE and GARDENER_NEXT_RELEASE.\n GARDENER_PREVIOUS_RELEASE – This env variable refers to a source revision/branch (or a specific release) which has to be installed first and then upgraded to version GARDENER_NEXT_RELEASE. By default, it fetches the latest release version from GitHub Gardener release page.\n GARDENER_NEXT_RELEASE – This env variable refers to the target revision/branch (or a specific release) to be upgraded to after successful installation of GARDENER_PREVIOUS_RELEASE. 
By default, it considers the local HEAD revision, builds code, and installs Gardener from the current revision from which the Gardener upgrade tests are triggered.\n make ci-e2e-kind-upgrade GARDENER_PREVIOUS_RELEASE=v1.60.0 GARDENER_NEXT_RELEASE=v1.61.0 make ci-e2e-kind-ha-single-zone-upgrade GARDENER_PREVIOUS_RELEASE=v1.60.0 GARDENER_NEXT_RELEASE=v1.61.0 make ci-e2e-kind-ha-multi-zone-upgrade GARDENER_PREVIOUS_RELEASE=v1.60.0 GARDENER_NEXT_RELEASE=v1.61.0 Purpose of Upgrade Tests Tests will ensure that shoot clusters reconciled with the previous version of Gardener work as expected even with the next Gardener version. This will reproduce or catch actual issues faced by end users. One of the test cases ensures no downtime is faced by the end-users for shoots while upgrading Gardener if the shoot’s control-plane is configured as HA. Writing Upgrade Tests Tests are divided into two parts and labeled with pre-upgrade and post-upgrade labels. An example test case ensures that a shoot which was hibernated in a previous Gardener release wakes up as expected in the next release: Creating and hibernating a shoot is the pre-upgrade test case, which should be labeled with the pre-upgrade label. Waking up and deleting the shoot is the post-upgrade test case, which should be labeled with the post-upgrade label. Test Machinery Tests Please see Test Machinery Tests.\nPurpose of Test Machinery Tests Test machinery tests have to be executed against full-blown Gardener installations. They can provide a very high level of confidence that an installation is functional in its current state, this includes: all Gardener components, Extensions, the used Cloud Infrastructure, all relevant settings/configuration. This brings the following benefits: They test more realistic scenarios than e2e tests (real configuration, real infrastructure, etc.). Tests run “where the users are”. However, this also brings significant drawbacks: Tests are difficult to develop and maintain. 
Tests require a full Gardener installation and cannot be executed in CI (on PR-level or against master). Tests require real infrastructure (think cloud provider credentials, cost). Using TestDefinitions under .test-defs requires a full test machinery installation. Accordingly, tests are heavyweight and expensive to run. Testing against real infrastructure can cause flakes sometimes (e.g., in outage situations). Failures are hard to debug, because clusters are deleted after the test (for obvious cost reasons). Bugs can only be caught once it’s “too late”, i.e., when code is merged and deployed. Today, test machinery tests cover a bigger “test matrix” (e.g., Shoot creation across infrastructures, Kubernetes versions, machine image versions). Test machinery also runs Kubernetes conformance tests. However, because of the listed drawbacks, we should rather focus on augmenting our e2e tests, as we can run them locally and in CI in order to catch bugs before they get merged. It’s still a good idea to add test machinery tests if a feature that depends on some installation-specific configuration needs to be tested. Writing Test Machinery Tests Generally speaking, most points from writing integration tests and writing e2e tests apply here as well. However, test machinery tests contain a lot of technical debt and existing code doesn’t follow these best practices. As test machinery tests are out of our general focus, we don’t intend to rework the tests soon or provide more guidance on how to write new ones. Manual Tests Manual tests can be useful when the cost of trying to automatically test certain functionality is too high. Useful for PR verification, if a reviewer wants to verify that all cases are properly tested by automated tests. Currently, it’s the simplest option for testing upgrade scenarios. e.g. 
migration coding is probably best tested manually, as it’s a high effort to write an automated test for little benefit. Obviously, the need for manual tests should be kept at a bare minimum. Instead, we should add e2e tests wherever sensible/valuable. We want to implement some form of general upgrade tests as part of #6016. ","categories":"","description":"","excerpt":"Testing Strategy and Developer Guideline This document walks you …","ref":"/docs/gardener/testing/","tags":"","title":"Testing"},{"body":"Testing Strategy and Developer Guideline Intent of this document is to introduce you (the developer) to the following:\n Categories of tests that exist. Libraries that are used to write tests. Best practices to write tests that are correct, stable, fast and maintainable. How to run each category of tests. For any new contribution, tests are a strict requirement. Boy Scouts Rule is followed: If you touch code for which either no tests exist or coverage is insufficient, then it is expected that you will add relevant tests.\nTools Used for Writing Tests The following tools are used to write all the tests (unit + envtest + vanilla kind cluster tests). It is preferred not to introduce any additional tools / test frameworks for writing tests:\nGomega We use gomega as our matcher or assertion library. Refer to Gomega’s official documentation for details regarding its installation and application in tests.\nTesting Package from Standard Library We use the Testing package provided by the standard library in golang for writing all our tests. Refer to its official documentation to learn how to write tests using the Testing package. You can also refer to this example.\nWriting Tests Common for All Kinds For naming the individual tests (TestXxx and testXxx methods) and helper methods, make sure that the name describes the implementation of the method. 
For example, testScalingWhenMandatoryResourceNotFound tests the behaviour of the scaler when a mandatory resource (KCM deployment) is not present. Maintain proper logging in tests. Use the t.Log() method to add appropriate messages wherever necessary to describe the flow of the test. See this for examples. Make use of the testdata directory for storing arbitrary sample data needed by tests (YAML manifests, etc.). See this package for examples. From https://pkg.go.dev/cmd/go/internal/test: The go tool will ignore a directory named “testdata”, making it available to hold ancillary data needed by the tests.\n Table-driven tests We need a tabular structure in two cases:\n When we have multiple tests which require the same kind of setup: In this case we have a TestXxxSuite method which does the setup and runs all the tests. We have a slice of test structs which holds all the tests (typically a title and a run method). We use a for loop to run all the tests one by one. See this for examples. When we have the same code path and multiple possible values to check: In this case we have the arguments and expectations in a struct. We iterate through the slice of all such structs, passing the arguments to appropriate methods and checking if the expectation is met. See this for examples. Env Tests Env tests in Dependency Watchdog use the sigs.k8s.io/controller-runtime/pkg/envtest package. It sets up a temporary control plane (etcd + kube-apiserver) and runs the test against it. The code to set up and tear down the environment can be checked out here.\nThese are the points to be followed while writing tests that use the envtest setup:\n All tests should be divided into two top level partitions:\n tests with common environment (testXxxCommonEnvTests) tests which need a dedicated environment for each one. (testXxxDedicatedEnvTests) They should be contained within the TestXxxSuite method. See this for examples. 
If all tests are of one kind then this is not needed.\n Create a method named setUpXxxTest for performing setup tasks before all/each test. It should either return a method or have a separate method to perform teardown tasks. See this for examples.\n The tests run by the suite can be table-driven as well.\n Use the envtest setup when there is a need for an environment close to an actual setup. For example, start controllers against a real Kubernetes control plane to catch bugs that can only happen when talking to a real API server.\n NOTE: It is currently not possible to bring up more than one envtest environment. See issue#1363. We enforce serial execution of test suites, each of which uses a different envtest environment. See hack/test.sh.\n Vanilla Kind Cluster Tests There are some tests where we need a vanilla kind cluster setup. For example, the scaler.go code in the prober package uses the scale subresource to scale the deployments mentioned in the prober config. But the envtest setup does not support the scale subresource as of now. So we need this setup to test whether the deployments are scaled as per the config. You can check out the code for this setup here. You can add utility methods for different kubernetes and custom resources in there.\nThese are the points to be followed while writing tests that use the Vanilla Kind Cluster setup:\n Use this setup only if there is a need for an actual Kubernetes cluster (api server + control plane + etcd) to write the tests, because this is slower than the normal envTest setup. Create setUpXxxTest similar to the one in envTest. Follow the same structural pattern used in envTest for writing these tests. See this for examples. 
Run Tests To run unit tests, use the following Makefile target\nmake test To run KIND cluster based tests, use the following Makefile target\nmake kind-tests # these tests will be slower as it brings up a vanilla KIND cluster To view coverage after running the tests, run :\ngo tool cover -html=cover.out Flaky tests If you see that a test is flaky then you can use make stress target which internally uses stress tool\nmake stress test-package=\u003ctest-package\u003e test-func=\u003ctest-func\u003e tool-params=\"\u003ctool-params\u003e\" An example invocation:\nmake stress test-package=./internal/util test-func=TestRetryUntilPredicateWithBackgroundContext tool-params=\"-p 10\" The make target will do the following:\n It will create a test binary for the package specified via test-package at /tmp/pkg-stress.test directory. It will run stress tool passing the tool-params and targets the function test-func. ","categories":"","description":"","excerpt":"Testing Strategy and Developer Guideline Intent of this document is to …","ref":"/docs/other-components/dependency-watchdog/testing/","tags":"","title":"Testing"},{"body":"Testing Strategy and Developer Guideline Intent of this document is to introduce you (the developer) to the following:\n Libraries that are used to write tests. Best practices to write tests that are correct, stable, fast and maintainable. How to run tests. The guidelines are not meant to be absolute rules. Always apply common sense and adapt the guideline if it doesn’t make much sense for some cases. If in doubt, don’t hesitate to ask questions during a PR review (as an author, but also as a reviewer). Add new learnings as soon as we make them!\nFor any new contributions tests are a strict requirement. 
Boy Scouts Rule is followed: If you touch a code for which either no tests exist or coverage is insufficient then it is expected that you will add relevant tests.\nCommon guidelines for writing tests We use the Testing package provided by the standard library in golang for writing all our tests. Refer to its official documentation to learn how to write tests using Testing package. You can also refer to this example.\n We use gomega as our matcher or assertion library. Refer to Gomega’s official documentation for details regarding its installation and application in tests.\n For naming the individual test/helper functions, ensure that the name describes what the function tests/helps-with. Naming is important for code readability even when writing tests - example-testcase-naming.\n Introduce helper functions for assertions to make test more readable where applicable - example-assertion-function.\n Introduce custom matchers to make tests more readable where applicable - example-custom-matcher.\n Do not use time.Sleep and friends as it renders the tests flaky.\n If a function returns a specific error then ensure that the test correctly asserts the expected error instead of just asserting that an error occurred. To help make this assertion consider using DruidError where possible. example-test-utility \u0026 usage.\n Creating sample data for tests can be a high effort. Consider writing test utilities to generate sample data instead. example-test-object-builder.\n If tests require any arbitrary sample data then ensure that you create a testdata directory within the package and keep the sample data as files in it. From https://pkg.go.dev/cmd/go/internal/test\n The go tool will ignore a directory named “testdata”, making it available to hold ancillary data needed by the tests.\n Avoid defining shared variable/state across tests. This can lead to race conditions causing non-deterministic state. 
Additionally, it limits the capability to run tests concurrently via t.Parallel().\n Do not assume or try to establish an order amongst different tests. This leads to brittle tests as the codebase evolves.\n If you need to have logs produced by test runs (especially helpful in failing tests), then consider using t.Log or t.Logf.\n Unit Tests If you need a kubernetes client.Client, prefer using a fake client instead of mocking the client. You can inject errors when building the client, which enables you to test error handling code paths. Mocks decrease maintainability because they expect the tested component to follow a certain way to reach the desired goal (e.g., call specific functions with particular arguments). All unit tests should run quickly. Do not use envtest and do not set up a Kind cluster in unit tests. If you have common setup for variations of a function, consider using table-driven tests. See this as an example. An individual test should test one and only one thing. Do not try to test multiple variants in a single test. Either use table-driven tests or write individual tests for each variation. If a function/component has multiple steps, it’s probably better to split/refactor it into multiple functions/components that can be unit tested individually. If there are a lot of edge cases, extract dedicated functions that cover them and use unit tests to test them. Running Unit Tests NOTE: For unit tests we are currently transitioning away from ginkgo to using golang native tests. The make test-unit target runs both ginkgo and golang native tests. Once the transition is complete this target will be simplified.\n Run all unit tests\n\u003e make test-unit Run unit tests of specific packages:\n# if you have not already installed the gotestfmt tool then install it once. # make test-unit target automatically installs this in ./hack/tools/bin. 
You can alternatively point the GOBIN to this directory and then directly invoke test-go.sh \u003e go install github.com/gotesttools/gotestfmt/v2/cmd/gotestfmt@v2.5.0 \u003e ./hack/test-go.sh \u003cpackage-1\u003e \u003cpackage-2\u003e De-flaking Unit Tests If tests have sporadic failures, then try running ./hack/stress-test.sh which internally uses the stress tool.\n# install the stress tool \u003e go install golang.org/x/tools/cmd/stress@latest # invoke the helper script to execute the stress test \u003e ./hack/stress-test.sh test-package=\u003ctest-package\u003e test-func=\u003ctest-function\u003e tool-params=\"\u003ctool-params\u003e\" An example invocation:\n\u003e ./hack/stress-test.sh test-package=./internal/utils test-func=TestRunConcurrentlyWithAllSuccessfulTasks tool-params=\"-p 10\" 5s: 877 runs so far, 0 failures 10s: 1906 runs so far, 0 failures 15s: 2885 runs so far, 0 failures ... The stress tool will output a path to a file containing the full failure message when a test run fails.\nIntegration Tests (envtests) Integration tests in etcd-druid use envtest. It sets up a minimal temporary control plane (etcd + kube-apiserver) and runs the test against it. Test suites (groups of tests) start their individual envtest environment before running the tests for the respective controller/webhook. Before exiting, the temporary test environment is shut down.\n NOTE: For integration-tests we are currently transitioning away from ginkgo to using golang native tests. All ginkgo integration tests can be found here and golang native integration tests can be found here.\n Integration tests in etcd-druid only target a single controller. It is therefore advised that code (other than common utility functions) should not be shared between any two controllers. If you are sharing a common envtest environment across tests then it is recommended that an individual test is run in a dedicated namespace. Since envtest is used to set up a minimal environment where no controller (e.g. 
KCM, Scheduler) other than etcd and kube-apiserver runs, status updates to resources controlled/reconciled by controllers that are not deployed will not happen. Tests should refrain from asserting changes to status. In case status needs to be set as part of a test setup then it must be done explicitly. If you have common setup and teardown, then consider using TestMain - example. If you have to wait for resources to be provisioned or reach a specific state, then it is recommended that you create smaller assertion functions and use Gomega’s AsyncAssertion functions - example. Beware of the default Eventually / Consistently timeouts / poll intervals: docs. Don’t forget to call {Eventually,Consistently}.Should(), otherwise the assertion always silently succeeds without errors: onsi/gomega#561 Running Integration Tests \u003e make test-integration Debugging Integration Tests There are two ways in which you can debug Integration Tests:\nUsing IDE All commonly used IDEs provide built-in or easy integration with the delve debugger. For debugging integration tests the only additional requirement is to set the KUBEBUILDER_ASSETS environment variable. You can get the value of this environment variable by executing the following command:\n# ENVTEST_K8S_VERSION is the k8s version that you wish to use for testing. \u003e setup-envtest --os $(go env GOOS) --arch $(go env GOARCH) use $ENVTEST_K8S_VERSION -p path NOTE: All integration tests usually have a timeout. If you wish to debug a failing integration-test then increase the timeouts.\n Use standalone envtest We also provide a capability to set up a stand-alone envtest and leverage the cluster to run individual integration tests. 
This gives you more control over when this k8s control plane is destroyed and allows you to inspect the resources at the end of the integration-test run using kubectl.\n NOTE: While you can use an existing cluster (e.g., kind), some test suites expect that no controllers and no nodes are running in the test environment (as is the case in envtest test environments). Hence, using a full-blown cluster with controllers and nodes might sometimes be impractical, as you would need to stop cluster components for the tests to work.\n To set up a standalone envtest and run an integration test against it, do the following:\n# In a terminal session use the following make target to set up a standalone envtest \u003e make start-envtest # As part of the output, the path to the kubeconfig will also be printed on the console. # In another terminal session set up a watch on resource(s): \u003e kubectl get po -A -w # alternatively you can also use the `watch -d \u003ccommand\u003e` utility. # In another terminal session: \u003e export KUBECONFIG=\u003cenvtest-kubeconfig-path\u003e \u003e export USE_EXISTING_K8S_CLUSTER=true # run the test \u003e go test -run=\"\u003cregex-for-test\u003e\" \u003cpackage\u003e # example: go test -run=\"^TestEtcdDeletion/test deletion of all*\" ./test/it/controller/etcd Once you are done testing, you can press Ctrl+C in the terminal session where you started envtest. This will shut down the kubernetes control plane.\nEnd-To-End (e2e) Tests End-To-End tests are run using a Kind cluster and Skaffold. 
These tests provide a high level of confidence that the code runs as expected by users when deployed to production.\n The purpose of running these tests is to catch bugs which result from interactions among different components within etcd-druid.\n In CI pipelines e2e tests are run with S3 compatible LocalStack (in cases where backup functionality has been enabled for an etcd cluster).\n In the future we will only be using a file-system based local provider to reduce the run times for the e2e tests when run in a CI pipeline.\n e2e tests can be triggered either with other cloud provider object-store emulators or they can also be run against actual/remote cloud provider object-store services.\n In contrast to integration tests, in e2e tests, it might make sense to specify higher timeouts for Gomega’s AsyncAssertion calls.\n Running e2e tests locally Detailed instructions on how to run e2e tests can be found here.\n","categories":"","description":"","excerpt":"Testing Strategy and Developer Guideline Intent of this document is to …","ref":"/docs/other-components/etcd-druid/testing/","tags":"","title":"Testing"},{"body":"Dependency management We use golang modules to manage golang dependencies. In order to add a new package dependency to the project, you can perform go get \u003cPACKAGE\u003e@\u003cVERSION\u003e or edit the go.mod file and append the package along with the version you want to use.\nUpdating dependencies The Makefile contains a rule called tidy which performs go mod tidy.\ngo mod tidy makes sure go.mod matches the source code in the module. 
It adds any missing modules necessary to build the current module’s packages and dependencies, and it removes unused modules that don’t provide any relevant packages.\n$ make tidy The dependencies are installed into the go mod cache folder.\n⚠️ Make sure you test the code after you have updated the dependencies!\n","categories":"","description":"","excerpt":"Dependency management We use golang modules to manage golang …","ref":"/docs/other-components/machine-controller-manager/testing_and_dependencies/","tags":"","title":"Testing And Dependencies"},{"body":"Test Machinery Tests In order to automatically qualify Gardener releases, we execute a set of end-to-end tests using Test Machinery. This requires a full Gardener installation including infrastructure extensions, as well as a setup of Test Machinery itself. These tests operate on Shoot clusters across different Cloud Providers, using different supported Kubernetes versions and various configuration options (huge test matrix).\nThis manual gives an overview about test machinery tests in Gardener.\n Structure Add a new test Test Labels Framework Container Images Structure Gardener test machinery tests are split into two test suites that can be found under test/testmachinery/suites:\n The Gardener Test Suite contains all tests that only require a running gardener instance. The Shoot Test Suite contains all tests that require a predefined running shoot cluster. The corresponding tests of a test suite are defined in the import statement of the suite definition (see shoot/run_suite_test.go) and their source code can be found under test/testmachinery.\nThe test directory is structured as follows:\ntest ├── e2e # end-to-end tests (using provider-local) │ ├── gardener │ │ ├── seed │ │ ├── shoot | | └── ... 
│ └── operator ├── framework # helper code shared across integration, e2e and testmachinery tests ├── integration # integration tests (envtests) │ ├── controllermanager │ ├── envtest │ ├── resourcemanager │ ├── scheduler │ └── ... └── testmachinery # test machinery tests ├── gardener # actual test cases imported by suites/gardener │ └── security ├── shoots # actual test cases imported by suites/shoot │ ├── applications │ ├── care │ ├── logging │ ├── operatingsystem │ ├── operations │ └── vpntunnel ├── suites # suites that run against a running garden or shoot cluster │ ├── gardener │ └── shoot └── system # suites that are used for building a full test flow ├── complete_reconcile ├── managed_seed_creation ├── managed_seed_deletion ├── shoot_cp_migration ├── shoot_creation ├── shoot_deletion ├── shoot_hibernation ├── shoot_hibernation_wakeup └── shoot_update A suite can be executed by running the suite definition with ginkgo’s focus and skip flags to control the execution of specific labeled tests. See the example below:\ngo test -timeout=0 ./test/testmachinery/suites/shoot \\ --v -ginkgo.v -ginkgo.show-node-events -ginkgo.no-color \\ --report-file=/tmp/report.json \\ # write elasticsearch formatted output to a file --disable-dump=false \\ # disables dumping of the current state if a test fails -kubecfg=/path/to/gardener/kubeconfig \\ -shoot-name=\u003cshoot-name\u003e \\ # Name of the shoot to test -project-namespace=\u003cgardener project namespace\u003e \\ # Name of the gardener project the test shoot resides in -ginkgo.focus=\"\\[RELEASE\\]\" \\ # Run all tests that are tagged as release -ginkgo.skip=\"\\[SERIAL\\]|\\[DISRUPTIVE\\]\" # Exclude all tests that are tagged SERIAL or DISRUPTIVE Add a New Test To add a new test the framework requires the following steps (steps 1 and 2 can be skipped if the test is added to an existing package):\n Create a new test file e.g. 
test/testmachinery/shoot/security/my-sec-test.go Import the test into the appropriate test suite (gardener or shoot): import _ \"github.com/gardener/gardener/test/testmachinery/shoot/security\" Define your test with the testframework. The framework will automatically add its initialization, cleanup and dump functions. var _ = ginkgo.Describe(\"my suite\", func(){ f := framework.NewShootFramework(nil) f.Beta().CIt(\"my first test\", func(ctx context.Context) { f.ShootClient.Get(xx) // testing ... }) }) The newly created test can be tested by focusing the test with the default ginkgo focus f.Beta().FCIt(\"my first test\", func(ctx context.Context) and running the shoot test suite with:\ngo test -timeout=0 ./test/testmachinery/suites/shoot \\ --v -ginkgo.v -ginkgo.show-node-events -ginkgo.no-color \\ --report-file=/tmp/report.json \\ # write elasticsearch formatted output to a file --disable-dump=false \\ # disables dumping of the current state if a test fails -kubecfg=/path/to/gardener/kubeconfig \\ -shoot-name=\u003cshoot-name\u003e \\ # Name of the shoot to test -project-namespace=\u003cgardener project namespace\u003e \\ -fenced=\u003ctrue|false\u003e # Tested shoot is running in a fenced environment and cannot be reached by gardener or for the gardener suite with:\ngo test -timeout=0 ./test/testmachinery/suites/gardener \\ --v -ginkgo.v -ginkgo.show-node-events -ginkgo.no-color \\ --report-file=/tmp/report.json \\ # write elasticsearch formatted output to a file --disable-dump=false \\ # disables dumping of the current state if a test fails -kubecfg=/path/to/gardener/kubeconfig \\ -project-namespace=\u003cgardener project namespace\u003e ⚠️ Make sure that you do not commit any focused specs as this feature is only intended for local development! 
Ginkgo will fail the test suite if there are any focused specs.\nAlternatively, a test can be triggered by specifying a ginkgo focus regex with the name of the test, e.g.\ngo test -timeout=0 ./test/testmachinery/suites/gardener \\ --v -ginkgo.v -ginkgo.show-node-events -ginkgo.no-color \\ --report-file=/tmp/report.json \\ # write elasticsearch formatted output to a file -kubecfg=/path/to/gardener/kubeconfig \\ -project-namespace=\u003cgardener project namespace\u003e \\ -ginkgo.focus=\"my first test\" # regex to match test cases Test Labels Every test should be labeled by using the predefined labels available with every framework to have consistent labeling across all test machinery tests.\nThe labels are applied to every new It()/CIt() definition by:\nf := framework.NewCommonFramework() f.Default().Serial().It(\"my test\") =\u003e \"[DEFAULT] [SERIAL] my test\" f := framework.NewShootFramework() f.Default().Serial().It(\"my test\") =\u003e \"[DEFAULT] [SERIAL] [SHOOT] my test\" f := framework.NewGardenerFramework() f.Default().Serial().It(\"my test\") =\u003e \"[DEFAULT] [GARDENER] [SERIAL] my test\" Labels:\n Beta: Newly created tests whose stability is not yet known should first be labeled as beta tests. They should be watched (and probably improved) until stable enough to be promoted to Default. Default: Tests that were Beta before and proved to be stable are promoted to Default eventually. Default tests run more often, produce alerts and are considered during the release decision although they don’t necessarily block a release. Release: Tests are release relevant. A failing Release test blocks the release pipeline. Therefore, these tests need to be stable. Only tests proven to be stable will eventually be promoted to Release. Behavior Labels:\n Serial: The test should always be executed in serial with no other tests running, as it may impact other tests. Destructive: The test is destructive. 
This means that it runs with no other tests and may break Gardener or the shoot. Only create such tests if really necessary, as the execution will be expensive (neither Gardener nor the shoot can be reused in this case for other tests). Framework The framework directory contains all the necessary functions / utilities for running test machinery tests. For example, there are methods for creation/deletion of shoots, waiting for shoot deletion/creation, downloading/installing/deploying helm charts, logging, etc.\nThe framework itself consists of three different frameworks that expect different prerequisites and offer context specific functionality.\n CommonFramework: The common framework is the base framework that handles logging and setup of commonly needed resources like helm. It also contains common functions for interacting with Kubernetes clusters like Waiting for resources to be ready or Exec into a running pod. GardenerFramework: contains all functions of the common framework and expects a running Gardener instance with the provided Gardener kubeconfig and a project namespace. It also contains functions to interact with gardener like Waiting for a shoot to be reconciled or Patch a shoot or Get a seed. ShootFramework: contains all functions of the common and the gardener framework. It expects a running shoot cluster defined by the shoot’s name and namespace (project namespace). This framework contains functions to directly interact with the specific shoot. The whole framework also includes commonly used checks, ginkgo wrappers, etc., as well as commonly used tests. These common application tests (like the guestbook test) can be used within multiple tests to have a default application (with ingress, deployment, stateful backend) to test external factors.\nConfig\nEvery framework commandline flag can also be defined by a configuration file (the value of the configuration file is only used if a flag is not specified by commandline). 
The test suite searches for a configuration file (yaml is preferred) if the command line flag --config=/path/to/config/file is provided. A framework flag can be set in the configuration file by just using the flag name as root key, e.g.\nverbose: debug kubecfg: /kubeconfig/path project-namespace: garden-it Report\nThe framework automatically writes the ginkgo default report to stdout and a specifically structured Elasticsearch bulk report file to a specified location. The Elasticsearch bulk report contains one json document per test case and injects the metadata of the whole test suite. An example document for one test case would look like the following document:\n{ \"suite\": { \"name\": \"Shoot Test Suite\", \"phase\": \"Succeeded\", \"tests\": 3, \"failures\": 1, \"errors\": 0, \"time\": 87.427 }, \"name\": \"Shoot application testing [DEFAULT] [RELEASE] [SHOOT] should download shoot kubeconfig successfully\", \"shortName\": \"should download shoot kubeconfig successfully\", \"labels\": [ \"DEFAULT\", \"RELEASE\", \"SHOOT\" ], \"phase\": \"Succeeded\", \"time\": 0.724512057 } Resources\nThe resources directory contains templates used by the tests.\nresources └── templates ├── guestbook-app.yaml.tpl └── logger-app.yaml.tpl System Tests This directory contains the system tests that have a special meaning for the testmachinery with their own Test Definition. 
Currently, these system tests consist of:\n Shoot creation Shoot deletion Shoot Kubernetes update Gardener Full reconcile check Shoot Creation Test The Create Shoot test is meant to test shoot creation.\nExample Run\ngo test -timeout=0 ./test/testmachinery/system/shoot_creation \\ --v -ginkgo.v -ginkgo.show-node-events \\ -kubecfg=$HOME/.kube/config \\ -shoot-name=$SHOOT_NAME \\ -cloud-profile-name=$CLOUDPROFILE \\ -seed=$SEED \\ -secret-binding=$SECRET_BINDING \\ -provider-type=$PROVIDER_TYPE \\ -region=$REGION \\ -k8s-version=$K8S_VERSION \\ -project-namespace=$PROJECT_NAMESPACE \\ -annotations=$SHOOT_ANNOTATIONS \\ -infrastructure-provider-config-filepath=$INFRASTRUCTURE_PROVIDER_CONFIG_FILEPATH \\ -controlplane-provider-config-filepath=$CONTROLPLANE_PROVIDER_CONFIG_FILEPATH \\ -workers-config-filepath=$WORKERS_CONFIG_FILEPATH \\ -worker-zone=$ZONE \\ -networking-pods=$NETWORKING_PODS \\ -networking-services=$NETWORKING_SERVICES \\ -networking-nodes=$NETWORKING_NODES \\ -start-hibernated=$START_HIBERNATED Shoot Deletion Test The Delete Shoot test is meant to test the deletion of a shoot.\nExample Run\ngo test -timeout=0 -ginkgo.v -ginkgo.show-node-events \\ ./test/testmachinery/system/shoot_deletion \\ -kubecfg=$HOME/.kube/config \\ -shoot-name=$SHOOT_NAME \\ -project-namespace=$PROJECT_NAMESPACE Shoot Update Test The Update Shoot test is meant to test the Kubernetes version update of an existing shoot. If no specific version is provided, the next patch version is automatically selected. 
If there is no available newer version, this test is a noop.\nExample Run\ngo test -timeout=0 ./test/testmachinery/system/shoot_update \\ --v -ginkgo.v -ginkgo.show-node-events \\ -kubecfg=$HOME/.kube/config \\ -shoot-name=$SHOOT_NAME \\ -project-namespace=$PROJECT_NAMESPACE \\ -version=$K8S_VERSION Gardener Full Reconcile Test The Gardener Full Reconcile test is meant to test if all shoots of a Gardener instance are successfully reconciled.\nExample Run\ngo test -timeout=0 ./test/testmachinery/system/complete_reconcile \\ --v -ginkgo.v -ginkgo.show-node-events \\ -kubecfg=$HOME/.kube/config \\ -project-namespace=$PROJECT_NAMESPACE \\ -gardenerVersion=$GARDENER_VERSION # needed to validate the last acted gardener version of a shoot Container Images Test machinery tests usually deploy a workload to the Shoot cluster as part of the test execution. When introducing a new container image, consider the following:\n Make sure the container image is multi-arch. Tests are executed against amd64 and arm64 based worker Nodes. Do not use container images from Docker Hub. Docker Hub has rate limiting (see Download rate limit). For anonymous users, the rate limit is set to 100 pulls per 6 hours per IP address. In some fenced environments the network setup can be such that all egress connections are issued from single IP (or set of IPs). In such scenarios the allowed rate limit can be exhausted too fast. See https://github.com/gardener/gardener/issues/4160. Docker Hub registry doesn’t support pulling images over IPv6 (see Beta IPv6 Support on Docker Hub Registry). Avoid manually copying Docker Hub images to Gardener GCR (europe-docker.pkg.dev/gardener-project/releases/3rd/). Use the existing prow job for this (see Copy Images). If possible, use a Kubernetes e2e image (registry.k8s.io/e2e-test-images/\u003cimage-name\u003e). In some cases, there is already a Kubernetes e2e image alternative of the Docker Hub image. 
For example, use registry.k8s.io/e2e-test-images/busybox instead of europe-docker.pkg.dev/gardener-project/releases/3rd/busybox or docker.io/busybox. Kubernetes has multiple test images - see https://github.com/kubernetes/kubernetes/tree/v1.27.0/test/images. agnhost is the most widely used image in Kubernetes e2e tests. It contains multiple testing related binaries inside such as pause, logs-generator, serve-hostname, webhook and others. See all of them in the agnhost’s README.md. The list of available Kubernetes e2e images and tags can be checked in this page. ","categories":"","description":"","excerpt":"Test Machinery Tests In order to automatically qualify Gardener …","ref":"/docs/gardener/testmachinery_tests/","tags":"","title":"Testmachinery Tests"},{"body":"Topology-Aware Traffic Routing Motivation The enablement of highly available shoot control-planes requires multi-zone seed clusters. A garden runtime cluster can also be a multi-zone cluster. The topology-aware routing is introduced to reduce costs and to improve network performance by avoiding the cross availability zone traffic, if possible. The cross availability zone traffic is charged by the cloud providers and it comes with higher latency compared to the traffic within the same zone. The topology-aware routing feature enables topology-aware routing for Services deployed in a seed or garden runtime cluster. For the clients consuming these topology-aware services, kube-proxy favors the endpoints which are located in the same zone where the traffic originated from. In this way, the cross availability zone traffic is avoided.\nHow it works The topology-aware routing feature relies on the Kubernetes feature TopologyAwareHints.\nEndpointSlice Hints Mutating Webhook The component that is responsible for providing hints in the EndpointSlices resources is the kube-controller-manager, in particular this is the EndpointSlice controller. 
However, there are several drawbacks with the TopologyAwareHints feature that don’t allow us to use it in its native way:\n The algorithm in the EndpointSlice controller is based on a CPU-balance heuristic. From the TopologyAwareHints documentation:\n The controller allocates a proportional amount of endpoints to each zone. This proportion is based on the allocatable CPU cores for nodes running in that zone. For example, if one zone had 2 CPU cores and another zone only had 1 CPU core, the controller would allocate twice as many endpoints to the zone with 2 CPU cores.\n In case it is not possible to achieve a balanced distribution of the endpoints, as a safeguard mechanism the controller removes hints from the EndpointSlice resource. In our setup, the clients and the servers are well-known and usually the traffic a component receives does not depend on the zone’s allocatable CPU. Many components deployed by Gardener are scaled automatically by VPA. In case of an overload of a replica, the VPA should provide and apply enhanced CPU and memory resources. Additionally, Gardener uses the cluster-autoscaler to upscale/downscale Nodes dynamically. Hence, it is not possible to ensure a balanced allocatable CPU across the zones.\n The TopologyAwareHints feature does not work at low-endpoint counts. It falls apart for a Service with less than 10 Endpoints.\n Hints provided by the EndpointSlice controller are not deterministic. With cluster-autoscaler running and load increasing, hints can be removed in the next moment. There is no option to enforce the zone-level topology.\n For more details, see the following issue kubernetes/kubernetes#113731.\nTo circumvent these issues with the EndpointSlice controller, a mutating webhook in the gardener-resource-manager assigns hints to EndpointSlice resources. For each endpoint in the EndpointSlice, it sets the endpoint’s hints to the endpoint’s zone. 
The webhook overwrites the hints provided by the EndpointSlice controller in kube-controller-manager. For more details, see the webhook’s documentation.\nkube-proxy By default, with kube-proxy running in iptables mode, traffic is distributed randomly across all endpoints, regardless of where it originates from. In a cluster with 3 zones, traffic is more likely to go to another zone than to stay in the current zone. With the topology-aware routing feature, kube-proxy filters the endpoints it routes to based on the hints in the EndpointSlice resource. In most of the cases, kube-proxy will prefer the endpoint(s) in the same zone. For more details, see the Kubernetes documentation.\nHow to make a Service topology-aware? To make a Service topology-aware, the following annotation and label have to be added to the Service:\napiVersion: v1 kind: Service metadata: annotations: service.kubernetes.io/topology-aware-hints: \"auto\" labels: endpoint-slice-hints.resources.gardener.cloud/consider: \"true\" Note: In Kubernetes 1.27 the service.kubernetes.io/topology-aware-hints=auto annotation is deprecated in favor of the newly introduced service.kubernetes.io/topology-mode=auto. When the runtime cluster’s K8s version is \u003e= 1.27, use the service.kubernetes.io/topology-mode=auto annotation. For more details, see the corresponding upstream PR.\n The service.kubernetes.io/topology-aware-hints=auto annotation is needed for kube-proxy. One of the prerequisites on kube-proxy side for using topology-aware routing is the corresponding Service to be annotated with the service.kubernetes.io/topology-aware-hints=auto. For more details, see the following kube-proxy function. 
The endpoint-slice-hints.resources.gardener.cloud/consider=true label is needed for gardener-resource-manager so that the EndpointSlice hints mutating webhook selects only the EndpointSlice resources that are labeled with the consider label, instead of all EndpointSlice resources.\nThe Gardener extensions can use this approach to make a Service they deploy topology-aware.\nPrerequisites for making a Service topology-aware:\n The Pods backing the Service should be spread across most of the available zones. This should be ensured with appropriate scheduling constraints (topology spread constraints, (anti-)affinity). Enabling the feature for a Service with a single backing Pod or Pods all located in the same zone does not lead to a benefit. The component should be scaled up by VerticalPodAutoscaler. In case of an overload (a large portion of the traffic is originating from a given zone), the VerticalPodAutoscaler should provide better resource recommendations for the overloaded backing Pods. Consider the TopologyAwareHints constraints. Note: The topology-aware routing feature is considered an alpha feature. Use it only for evaluation purposes.\n Topology-aware Services in the Seed cluster etcd-main-client and etcd-events-client The etcd-main-client and etcd-events-client Services are topology-aware. They are consumed by the kube-apiserver.\nkube-apiserver The kube-apiserver Service is topology-aware. It is consumed by the controllers running in the Shoot control plane.\n Note: The istio-ingressgateway component routes traffic in a topology-aware manner - if possible, it routes traffic to the target kube-apiserver Pods in the same zone. If there is no healthy kube-apiserver Pod available in the same zone, the traffic is routed to any of the healthy Pods in the other zones. This behaviour is unconditionally enabled.\n gardener-resource-manager The gardener-resource-manager Service that is part of the Shoot control plane is topology-aware. 
The resource-manager serves webhooks and the Service is consumed by the kube-apiserver for the webhook communication.\nvpa-webhook The vpa-webhook Service that is part of the Shoot control plane is topology-aware. It is consumed by the kube-apiserver for the webhook communication.\nTopology-aware Services in the garden runtime cluster virtual-garden-etcd-main-client and virtual-garden-etcd-events-client The virtual-garden-etcd-main-client and virtual-garden-etcd-events-client Services are topology-aware. virtual-garden-etcd-main-client is consumed by virtual-garden-kube-apiserver and gardener-apiserver, virtual-garden-etcd-events-client is consumed by virtual-garden-kube-apiserver.\nvirtual-garden-kube-apiserver The virtual-garden-kube-apiserver Service is topology-aware. It is consumed by virtual-garden-kube-controller-manager, gardener-controller-manager, gardener-scheduler, gardener-admission-controller, extension admission components, gardener-dashboard and other components.\n Note: Unlike the other Services, the virtual-garden-kube-apiserver Service is of type LoadBalancer. In-cluster components consuming the virtual-garden-kube-apiserver Service by its Service name will have benefit from the topology-aware routing. However, the TopologyAwareHints feature cannot help with external traffic routed to load balancer’s address - such traffic won’t be routed in a topology-aware manner and will be routed according to the cloud-provider specific implementation.\n gardener-apiserver The gardener-apiserver Service is topology-aware. It is consumed by virtual-garden-kube-apiserver. The aggregation layer in virtual-garden-kube-apiserver proxies requests sent for the Gardener API types to the gardener-apiserver.\ngardener-admission-controller The gardener-admission-controller Service is topology-aware. It is consumed by virtual-garden-kube-apiserver and gardener-apiserver for the webhook communication.\nHow to enable the topology-aware routing for a Seed cluster? 
For a Seed cluster the topology-aware routing functionality can be enabled in the Seed specification:\napiVersion: core.gardener.cloud/v1beta1 kind: Seed # ... spec: settings: topologyAwareRouting: enabled: true The topology-aware routing setting can only be enabled for a Seed cluster with more than one zone. gardenlet enables topology-aware Services only for Shoot control planes with failure tolerance type zone (.spec.controlPlane.highAvailability.failureTolerance.type=zone). Control plane Pods of non-HA Shoots and HA Shoots with failure tolerance type node are pinned to a single zone. For more details, see High Availability Of Deployed Components.\nHow to enable the topology-aware routing for a garden runtime cluster? For a garden runtime cluster the topology-aware routing functionality can be enabled in the Garden resource specification:\napiVersion: operator.gardener.cloud/v1alpha1 kind: Garden # ... spec: runtimeCluster: settings: topologyAwareRouting: enabled: true The topology-aware routing setting can only be enabled for a garden runtime cluster with more than one zone.\n","categories":"","description":"","excerpt":"Topology-Aware Traffic Routing Motivation The enablement of highly …","ref":"/docs/gardener/topology_aware_routing/","tags":"","title":"Topology Aware Routing"},{"body":"Trigger Shoot Operations Through Annotations You can trigger a few explicit operations by annotating the Shoot with an operation annotation. This might allow you to induce certain behavior without the need to change the Shoot specification. Some operations cannot be triggered by changing the shoot specification at all because they can’t properly be reflected there. 
Note that once the triggered operation is considered by the controllers, the annotation will be automatically removed and you have to add it each time you want to trigger the operation.\nPlease note: If .spec.maintenance.confineSpecUpdateRollout=true, then the only way to trigger a shoot reconciliation is by setting the reconcile operation, see below.\nImmediate Reconciliation Annotate the shoot with gardener.cloud/operation=reconcile to make the gardenlet start a reconciliation operation without changing the shoot spec and possibly without being in its maintenance time window:\nkubectl -n garden-\u003cproject-name\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=reconcile Immediate Maintenance Annotate the shoot with gardener.cloud/operation=maintain to make the gardener-controller-manager start maintaining your shoot immediately (possibly without being in its maintenance time window). If no reconciliation starts, then nothing needs to be maintained:\nkubectl -n garden-\u003cproject-name\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=maintain Retry Failed Reconciliation Annotate the shoot with gardener.cloud/operation=retry to make the gardenlet start a new reconciliation loop on a failed shoot. Failed shoots are only reconciled again if a new Gardener version is deployed, the shoot specification is changed or this annotation is set:\nkubectl -n garden-\u003cproject-name\u003e annotate shoot \u003cshoot-name\u003e gardener.cloud/operation=retry Credentials Rotation Operations Please consult Credentials Rotation for Shoot Clusters for more information.\nRestart systemd Services on Particular Worker Nodes It is possible to make Gardener restart particular systemd services on your shoot worker nodes if needed. The annotation is not set on the Shoot resource but directly on the Node object you want to target. 
For example, the following will restart both the kubelet and the containerd services:\nkubectl annotate node \u003cnode-name\u003e worker.gardener.cloud/restart-systemd-services=kubelet,containerd It may take up to a minute until the service is restarted. The annotation will be removed from the Node object after all specified systemd services have been restarted. It will also be removed even if the restart of one or more services failed.\n ℹ️ In the example mentioned above, you could additionally verify when/whether the kubelet restarted by using kubectl describe node \u003cnode-name\u003e and looking for such a Starting kubelet event.\n Force Deletion When the ShootForceDeletion feature gate in the gardener-apiserver is enabled, users will be able to force-delete the Shoot. This is only possible if the Shoot fails to be deleted normally. For forceful deletion, the following conditions must be met:\n Shoot has a deletion timestamp. Shoot status contains at least one of the following ErrorCodes: ERR_CLEANUP_CLUSTER_RESOURCES ERR_CONFIGURATION_PROBLEM ERR_INFRA_DEPENDENCIES ERR_INFRA_UNAUTHENTICATED ERR_INFRA_UNAUTHORIZED If the above conditions are satisfied, you can annotate the Shoot with confirmation.gardener.cloud/force-deletion=true, and Gardener will cleanup the Shoot controlplane and the Shoot metadata.\n ⚠️ You MUST ensure that all the resources created in the IaaS account are cleaned up to prevent orphaned resources. Gardener will NOT delete any resources in the underlying infrastructure account. 
Hence, use this annotation at your own risk and only if you are fully aware of these consequences.\n ","categories":"","description":"","excerpt":"Trigger Shoot Operations Through Annotations You can trigger a few …","ref":"/docs/gardener/shoot_operations/","tags":"","title":"Trigger Shoot Operations Through Annotations"},{"body":"Trusted TLS Certificate for Shoot Control Planes Shoot clusters are composed of several control plane components deployed by Gardener and its corresponding extensions.\nSome components are exposed via Ingress resources, which make them addressable under the HTTPS protocol.\nExamples:\n Alertmanager Plutono Prometheus Gardener generates the backing TLS certificates, which are signed by the shoot cluster’s CA by default (self-signed).\nUnlike with a self-contained Kubeconfig file, common internet browsers or operating systems don’t trust a shoot’s cluster CA and adding it as a trusted root is often undesired in enterprise environments.\nTherefore, Gardener operators can predefine trusted wildcard certificates under which the mentioned endpoints will be served instead.\nRegister a trusted wildcard certificate Since control plane components are published under the ingress domain (core.gardener.cloud/v1beta1.Seed.spec.ingress.domain) a wildcard certificate is required.\nFor example:\n Seed ingress domain: dev.my-seed.example.com CN or SAN for a certificate: *.dev.my-seed.example.com A wildcard certificate matches exactly one seed. 
It must be deployed as part of your landscape setup as a Kubernetes Secret inside the garden namespace of the corresponding seed cluster.\nPlease ensure that the secret has the gardener.cloud/role label shown below:\napiVersion: v1 data: ca.crt: base64-encoded-ca.crt tls.crt: base64-encoded-tls.crt tls.key: base64-encoded-tls.key kind: Secret metadata: labels: gardener.cloud/role: controlplane-cert name: seed-ingress-certificate namespace: garden type: Opaque Gardener copies the secret during the reconciliation of shoot clusters to the shoot namespace in the seed. Afterwards, the Ingress resources in that namespace for the mentioned components will refer to the wildcard certificate.\nBest Practice While it is possible to create the wildcard certificates manually and deploy them to seed clusters, it is recommended to let certificate management components do this job. Often, a seed cluster is also a shoot cluster at the same time (ManagedSeed) and might already provide a certificate service extension. 
Otherwise, a Gardener operator may use solutions like Cert-Management or Cert-Manager.\n","categories":"","description":"","excerpt":"Trusted TLS Certificate for Shoot Control Planes Shoot clusters are …","ref":"/docs/gardener/trusted-tls-for-control-planes/","tags":"","title":"Trusted Tls For Control Planes"},{"body":"Trusted TLS Certificate for Garden Runtime Cluster In Garden Runtime Cluster components are exposed via Ingress resources, which make them addressable under the HTTPS protocol.\nExamples:\n Plutono Gardener generates the backing TLS certificates, which are signed by the garden runtime cluster’s CA by default (self-signed).\nUnlike with a self-contained Kubeconfig file, common internet browsers or operating systems don’t trust a garden runtime’s cluster CA and adding it as a trusted root is often undesired in enterprise environments.\nTherefore, Gardener operators can predefine a trusted wildcard certificate under which the mentioned endpoints will be served instead.\nRegister a trusted wildcard certificate Since Garden Runtime Cluster components are published under the ingress domain (operator.gardener.cloud/v1alpha1.Garden.spec.runtimeCluster.ingress.domain) a wildcard certificate is required.\nFor example:\n Garden Runtime cluster ingress domain: dev.my-garden.example.com CN or SAN for a certificate: *.dev.my-garden.example.com It must be deployed as part of your landscape setup as a Kubernetes Secret inside the garden namespace of the garden runtime cluster.\nPlease ensure that the secret has the gardener.cloud/role label shown below:\napiVersion: v1 data: ca.crt: base64-encoded-ca.crt tls.crt: base64-encoded-tls.crt tls.key: base64-encoded-tls.key kind: Secret metadata: labels: gardener.cloud/role: controlplane-cert name: garden-ingress-certificate namespace: garden type: Opaque Best Practice While it is possible to create the wildcard certificate manually and deploy it to the cluster, it is recommended to let certificate management components 
(e.g. gardener/cert-management) do this job.\n","categories":"","description":"","excerpt":"Trusted TLS Certificate for Garden Runtime Cluster In Garden Runtime …","ref":"/docs/gardener/trusted-tls-for-garden-runtime/","tags":"","title":"Trusted Tls For Garden Runtime"},{"body":"Gardener Extension for Ubuntu OS \nThis controller operates on the OperatingSystemConfig resource in the extensions.gardener.cloud/v1alpha1 API group. It manages those objects that are requesting Ubuntu OS configuration (.spec.type=ubuntu). An experimental support for Ubuntu Pro is added (.spec.type=ubuntu-pro):\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfig metadata: name: pool-01-original namespace: default spec: type: ubuntu units: ... files: ... Please find a concrete example in the example folder.\nAfter reconciliation the resulting data will be stored in a secret within the same namespace (as the config itself might contain confidential data). The name of the secret will be written into the resource’s .status field:\n... status: ... cloudConfig: secretRef: name: osc-result-pool-01-original namespace: default command: /usr/bin/env bash \u003cpath\u003e units: - docker-monitor.service - kubelet-monitor.service - kubelet.service The secret has one data key cloud_config that stores the generation.\nAn example for a ControllerRegistration resource that can be used to register this controller to Gardener can be found here.\nPlease find more information regarding the extensibility concepts and a detailed proposal here.\n How to start using or developing this extension controller locally You can run the controller locally on your machine by executing make start. Please make sure to have the kubeconfig to the cluster you want to connect to ready in the ./dev/kubeconfig file. Static code checks and tests can be executed by running make verify. 
We are using Go modules for Golang package dependency management and Ginkgo/Gomega for testing.\nFeedback and Support Feedback and contributions are always welcome. Please report bugs or suggestions as GitHub issues or join our Slack channel #gardener (please invite yourself to the Kubernetes workspace here).\nLearn more! Please find further resources about our project here:\n Our landing page gardener.cloud “Gardener, the Kubernetes Botanist” blog on kubernetes.io “Gardener Project Update” blog on kubernetes.io Gardener Extensions Golang library GEP-1 (Gardener Enhancement Proposal) on extensibility Extensibility API documentation ","categories":"","description":"Gardener extension controller for the Ubuntu operating system","excerpt":"Gardener extension controller for the Ubuntu operating system","ref":"/docs/extensions/os-extensions/gardener-extension-os-ubuntu/","tags":"","title":"Ubuntu OS"},{"body":"Using the Alicloud provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.\nThis document describes the configurable options for Alicloud and provides an example Shoot manifest with minimal configuration that can be used to create an Alicloud cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nAlicloud Provider Credentials In order for Gardener to create a Kubernetes cluster using Alicloud infrastructure components, a Shoot has to provide credentials with sufficient permissions to the desired Alicloud project. 
Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of the Alicloud project.\nThis Secret must look as follows:\napiVersion: v1 kind: Secret metadata: name: core-alicloud namespace: garden-dev type: Opaque data: accessKeyID: base64(access-key-id) accessKeySecret: base64(access-key-secret) The SecretBinding/CredentialsBinding is configurable in the Shoot cluster with the field secretBindingName/credentialsBindingName.\nThe required credentials for the Alicloud project are an AccessKey Pair associated with a Resource Access Management (RAM) User. A RAM user is a special account that can be used by services and applications to interact with Alicloud Cloud Platform APIs. Applications can use AccessKey pair to authorize themselves to a set of APIs and perform actions within the permissions granted to the RAM user.\nMake sure to create a Resource Access Management User, and create an AccessKey Pair that shall be used for the Shoot cluster.\nPermissions Please make sure the provided credentials have the correct privileges. You can use the following Alicloud RAM policy document and attach it to the RAM user backed by the credentials you provided.\n Click to expand the Alicloud RAM policy document! 
{ \"Statement\": [ { \"Action\": [ \"vpc:*\" ], \"Effect\": \"Allow\", \"Resource\": [ \"*\" ] }, { \"Action\": [ \"ecs:*\" ], \"Effect\": \"Allow\", \"Resource\": [ \"*\" ] }, { \"Action\": [ \"slb:*\" ], \"Effect\": \"Allow\", \"Resource\": [ \"*\" ] }, { \"Action\": [ \"ram:GetRole\", \"ram:CreateRole\", \"ram:CreateServiceLinkedRole\" ], \"Effect\": \"Allow\", \"Resource\": [ \"*\" ] }, { \"Action\": [ \"ros:*\" ], \"Effect\": \"Allow\", \"Resource\": [ \"*\" ] } ], \"Version\": \"1\" } InfrastructureConfig The infrastructure configuration mainly describes how the network layout looks like in order to create the shoot worker nodes in a later step, thus, prepares everything relevant to create VMs, load balancers, volumes, etc.\nAn example InfrastructureConfig for the Alicloud extension looks as follows:\napiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: # specify either 'id' or 'cidr' # id: my-vpc cidr: 10.250.0.0/16 # gardenerManagedNATGateway: true zones: - name: eu-central-1a workers: 10.250.1.0/24 # natGateway: # eipAllocationID: eip-ufxsdg122elmszcg The networks.vpc section describes whether you want to create the shoot cluster in an already existing VPC or whether to create a new one:\n If networks.vpc.id is given then you have to specify the VPC ID of the existing VPC that was created by other means (manually, other tooling, …). If networks.vpc.cidr is given then you have to specify the VPC CIDR of a new VPC that will be created during shoot creation. You can freely choose a private CIDR range. Either networks.vpc.id or networks.vpc.cidr must be present, but not both at the same time. When networks.vpc.id is present, in addition, you can also choose to set networks.vpc.gardenerManagedNATGateway. It is by default false. When it is set to true, Gardener will create an Enhanced NATGateway in the VPC and associate it with a VSwitch created in the first zone in the networks.zones. 
Please note that when networks.vpc.id is present, and networks.vpc.gardenerManagedNATGateway is false or not set, you have to manually create an Enhanced NATGateway and associate it with a VSwitch that you manually created. In this case, make sure the worker CIDRs in networks.zones do not overlap with the one you created. If a NATGateway is created manually and a shoot is created in the same VPC with networks.vpc.gardenerManagedNATGateway set to true, you need to manually adjust the route rule accordingly. You may refer to here. The networks.zones section describes which subnets you want to create in availability zones. For every zone, the Alicloud extension creates one subnet:\n The workers subnet is used for all shoot worker nodes, i.e., VMs which later run your applications. For every subnet, you have to specify a CIDR range contained in the VPC CIDR specified above, or the VPC CIDR of your already existing VPC. You can freely choose these CIDRs and it is your responsibility to properly design the network layout to suit your needs.\nIf you want to use multiple availability zones then add a second, third, … entry to the networks.zones[] list and properly specify the AZ name in networks.zones[].name.\nApart from the VPC and the subnets the Alicloud extension will also create a NAT gateway (only if a new VPC is created), a key pair, elastic IPs, VSwitches, a SNAT table entry, and security groups.\nBy default, the Alicloud extension will create a corresponding Elastic IP that it attaches to this NAT gateway and which is used for egress traffic. The networks.zones[].natGateway.eipAllocationID field allows you to specify the Elastic IP Allocation ID of an existing Elastic IP allocation in case you want to bring your own. 
If provided, no new Elastic IP will be created and, instead, the Elastic IP specified by you will be used.\n⚠️ If you change this field for an already existing infrastructure then it will disrupt egress traffic while Alicloud applies this change, because the NAT gateway must be recreated with the new Elastic IP association. Also, please note that the existing Elastic IP will be permanently deleted if it was earlier created by the Alicloud extension.\nControlPlaneConfig The control plane configuration mainly contains values for the Alicloud-specific control plane components. Today, the Alicloud extension deploys the cloud-controller-manager and the CSI controllers.\nAn example ControlPlaneConfig for the Alicloud extension looks as follows:\napiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig csi: enableADController: true # cloudControllerManager: # featureGates: # SomeKubernetesFeature: true The csi.enableADController is used as the value of the environment variable DISK_AD_CONTROLLER, which is used for AliCloud csi-disk-plugin. This field is optional. When a new shoot is created, this field is automatically set to true. For an existing shoot created in previous versions, it remains unchanged. If there are persistent volumes created before year 2021, please be cautious about setting this field to true because they may fail to mount to nodes.\nThe cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage it’s not recommended to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager simply omit the key in the YAML specification.\nWorkerConfig The Alicloud extension does not support a specific WorkerConfig. However, it supports additional data volumes (plus encryption) per machine. By default (if not stated otherwise), all the disks are unencrypted. 
For each data volume, you have to specify a name. It also supports an encrypted system disk. However, only a customized image can currently be used as the base image for an encrypted system disk. Please note that changing the system disk encryption flag causes a reconciliation of the shoot and results in a rolling update of the nodes within the worker group.\nThe following YAML is a snippet of a Shoot resource:\nspec: provider: workers: - name: cpu-worker ... volume: type: cloud_efficiency size: 20Gi encrypted: true dataVolumes: - name: kubelet-dir type: cloud_efficiency size: 25Gi encrypted: true Example Shoot manifest (one availability zone) Please find below an example Shoot manifest for one availability zone:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-alicloud namespace: garden-dev spec: cloudProfileName: alicloud region: eu-central-1 secretBindingName: core-alicloud provider: type: alicloud infrastructureConfig: apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 zones: - name: eu-central-1a workers: 10.250.0.0/19 controlPlaneConfig: apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: ecs.sn2ne.large minimum: 2 maximum: 2 volume: size: 50Gi type: cloud_efficiency zones: - eu-central-1a networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Example Shoot manifest (two availability zones) Please find below an example Shoot manifest for two availability zones:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-alicloud namespace: garden-dev spec: cloudProfileName: alicloud region: eu-central-1 secretBindingName: core-alicloud provider: type: alicloud infrastructureConfig: 
apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 zones: - name: eu-central-1a workers: 10.250.0.0/26 - name: eu-central-1b workers: 10.250.0.64/26 controlPlaneConfig: apiVersion: alicloud.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: ecs.sn2ne.large minimum: 2 maximum: 4 volume: size: 50Gi type: cloud_efficiency # NOTE: Below comment is for the case when encrypted field of an existing shoot is updated from false to true. # It will cause affected nodes to be rolling updated. Users must trigger a MAINTAIN operation of the shoot. # Otherwise, the shoot will fail to reconcile. # You could do it either via Dashboard or annotating the shoot with gardener.cloud/operation=maintain encrypted: true zones: - eu-central-1a - eu-central-1b networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Kubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-alicloud@v1.33.\nShoot CA Certificate and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation feature gate since gardener-extension-provider-alicloud@v1.36 and ShootSARotation feature gate since gardener-extension-provider-alicloud@v1.37.\n","categories":"","description":"","excerpt":"Using the Alicloud provider extension with Gardener as end-user The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-alicloud/usage/","tags":"","title":"Usage"},{"body":"Using the AWS provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to 
contain provider-specific configuration.\nThis document describes what this configuration looks like for AWS and provides an example Shoot manifest with minimal configuration that you can use to create an AWS cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nProvider Secret Data Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of your AWS account. This Secret must look as follows:\napiVersion: v1 kind: Secret metadata: name: core-aws namespace: garden-dev type: Opaque data: accessKeyID: base64(access-key-id) secretAccessKey: base64(secret-access-key) The AWS documentation explains the necessary steps to enable programmatic access, i.e., to create an access key ID and a secret access key, for the user of your choice.\n⚠️ For security reasons, we recommend creating a dedicated user with programmatic access only. Please avoid reusing an IAM user that has access to the AWS console (human user).\n⚠️ Depending on your AWS API usage, it can be problematic to reuse the same AWS account for different Shoot clusters in the same region due to rate limits. Please consider spreading your Shoots over multiple AWS accounts if you are hitting those limits.\nPermissions Please make sure that the provided credentials have the correct privileges. You can use the following AWS IAM policy document and attach it to the IAM user backed by the credentials you provided (please check the official AWS documentation as well):\n 
{ \"Version\": \"2012-10-17\", \"Statement\": [ { \"Effect\": \"Allow\", \"Action\": \"autoscaling:*\", \"Resource\": \"*\" }, { \"Effect\": \"Allow\", \"Action\": \"ec2:*\", \"Resource\": \"*\" }, { \"Effect\": \"Allow\", \"Action\": \"elasticloadbalancing:*\", \"Resource\": \"*\" }, { \"Action\": [ \"iam:GetInstanceProfile\", \"iam:GetPolicy\", \"iam:GetPolicyVersion\", \"iam:GetRole\", \"iam:GetRolePolicy\", \"iam:ListPolicyVersions\", \"iam:ListRolePolicies\", \"iam:ListAttachedRolePolicies\", \"iam:ListInstanceProfilesForRole\", \"iam:CreateInstanceProfile\", \"iam:CreatePolicy\", \"iam:CreatePolicyVersion\", \"iam:CreateRole\", \"iam:CreateServiceLinkedRole\", \"iam:AddRoleToInstanceProfile\", \"iam:AttachRolePolicy\", \"iam:DetachRolePolicy\", \"iam:RemoveRoleFromInstanceProfile\", \"iam:DeletePolicy\", \"iam:DeletePolicyVersion\", \"iam:DeleteRole\", \"iam:DeleteRolePolicy\", \"iam:DeleteInstanceProfile\", \"iam:PutRolePolicy\", \"iam:PassRole\", \"iam:UpdateAssumeRolePolicy\" ], \"Effect\": \"Allow\", \"Resource\": \"*\" }, // The following permission set is only needed if the AWS Load Balancer Controller is enabled (see ControlPlaneConfig) { \"Effect\": \"Allow\", \"Action\": [ \"cognito-idp:DescribeUserPoolClient\", \"acm:ListCertificates\", \"acm:DescribeCertificate\", \"iam:ListServerCertificates\", \"iam:GetServerCertificate\", \"waf-regional:GetWebACL\", \"waf-regional:GetWebACLForResource\", \"waf-regional:AssociateWebACL\", \"waf-regional:DisassociateWebACL\", \"wafv2:GetWebACL\", \"wafv2:GetWebACLForResource\", \"wafv2:AssociateWebACL\", \"wafv2:DisassociateWebACL\", \"shield:GetSubscriptionState\", \"shield:DescribeProtection\", \"shield:CreateProtection\", \"shield:DeleteProtection\" ], \"Resource\": \"*\" } ] } InfrastructureConfig The infrastructure configuration mainly describes what the network layout looks like in order to create the shoot worker nodes in a later step, and thus prepares everything relevant to create VMs, load balancers, volumes, 
etc.\nAn example InfrastructureConfig for the AWS extension looks as follows:\napiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig enableECRAccess: true dualStack: enabled: false networks: vpc: # specify either 'id' or 'cidr' # id: vpc-123456 cidr: 10.250.0.0/16 # gatewayEndpoints: # - s3 zones: - name: eu-west-1a internal: 10.250.112.0/22 public: 10.250.96.0/22 workers: 10.250.0.0/19 # elasticIPAllocationID: eipalloc-123456 ignoreTags: keys: # individual ignored tag keys - SomeCustomKey - AnotherCustomKey keyPrefixes: # ignored tag key prefixes - user.specific/prefix/ The enableECRAccess flag specifies whether the AWS IAM role policy attached to all worker nodes of the cluster shall contain permissions to access the Elastic Container Registry of the respective AWS account. If the flag is not provided, it defaults to true. Please note that if the iamInstanceProfile is set for a worker pool in the WorkerConfig (see below), then enableECRAccess does not have any effect. It only applies to those worker pools whose iamInstanceProfile is not set.\n The default AWS IAM policy document used for the instance profiles looks as follows: { \"Version\": \"2012-10-17\", \"Statement\": [ { \"Effect\": \"Allow\", \"Action\": [ \"ec2:DescribeInstances\" ], \"Resource\": [ \"*\" ] }, // Only if `.enableECRAccess` is `true`. { \"Effect\": \"Allow\", \"Action\": [ \"ecr:GetAuthorizationToken\", \"ecr:BatchCheckLayerAvailability\", \"ecr:GetDownloadUrlForLayer\", \"ecr:GetRepositoryPolicy\", \"ecr:DescribeRepositories\", \"ecr:ListImages\", \"ecr:BatchGetImage\" ], \"Resource\": [ \"*\" ] } ] } The dualStack.enabled flag specifies whether dual-stack or IPv4-only should be supported by the infrastructure. When the flag is set to true, an Amazon-provided IPv6 CIDR block will be attached to the VPC. 
All subnets will receive a /64 block from it and a route entry is added to the main route table to route all IPv6 traffic over the internet gateway (IGW).\nThe networks.vpc section describes whether you want to create the shoot cluster in an already existing VPC or whether to create a new one:\n If networks.vpc.id is given then you have to specify the VPC ID of the existing VPC that was created by other means (manually, other tooling, …). Please make sure that the VPC has an internet gateway attached - the AWS controller won’t create one automatically for existing VPCs. To make sure the nodes are able to join and operate in your cluster properly, please make sure that your VPC has DNS support enabled; specifically, the attributes enableDnsHostnames and enableDnsSupport must be set to true. If networks.vpc.cidr is given then you have to specify the VPC CIDR of a new VPC that will be created during shoot creation. You can freely choose a private CIDR range. Either networks.vpc.id or networks.vpc.cidr must be present, but not both at the same time. networks.vpc.gatewayEndpoints is optional. If specified, each item is used as the service name of a corresponding Gateway VPC Endpoint. The networks.zones section contains configuration for resources you want to create or use in availability zones. For every zone, the AWS extension creates three subnets:\n The internal subnet is used for internal AWS load balancers. The public subnet is used for public AWS load balancers. The workers subnet is used for all shoot worker nodes, i.e., VMs which later run your applications. For every subnet, you have to specify a CIDR range contained in the VPC CIDR specified above, or the VPC CIDR of your already existing VPC. You can freely choose these CIDRs and it is your responsibility to properly design the network layout to suit your needs.\nAlso, the AWS extension creates a dedicated NAT gateway for each zone. 
By default, it also creates a corresponding Elastic IP that it attaches to this NAT gateway and which is used for egress traffic. The elasticIPAllocationID field allows you to specify the ID of an existing Elastic IP allocation in case you want to bring your own. If provided, no new Elastic IP will be created and, instead, the Elastic IP specified by you will be used.\n⚠️ If you change this field for an already existing infrastructure, it will disrupt egress traffic while AWS applies this change. The reason is that the NAT gateway must be recreated with the new Elastic IP association. Also, please note that the existing Elastic IP will be permanently deleted if it was earlier created by the AWS extension.\nYou can configure Gateway VPC Endpoints by adding items in the optional list networks.vpc.gatewayEndpoints. Each item in the list is used as a service name and a corresponding endpoint is created for it. All created endpoints point to the service within the cluster’s region. For example, consider this (partial) shoot config:\nspec: region: eu-central-1 provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: gatewayEndpoints: - s3 The service name of the S3 Gateway VPC Endpoint in this example is com.amazonaws.eu-central-1.s3.\nIf you want to use multiple availability zones then add a second, third, … entry to the networks.zones[] list and properly specify the AZ name in networks.zones[].name.\nApart from the VPC and the subnets, the AWS extension will also create DHCP options and an internet gateway (only if a new VPC is created), routing tables, security groups, elastic IPs, NAT gateways, EC2 key pairs, IAM roles, and IAM instance profiles.\nThe ignoreTags section allows configuring which resource tags on AWS resources managed by Gardener should be ignored during infrastructure reconciliation. 
By default, all tags that are added outside of Gardener’s reconciliation will be removed during the next reconciliation. This field allows users and automation to add custom tags on AWS resources created and managed by Gardener without losing them on the next reconciliation. Tags can be ignored either by specifying exact key values (ignoreTags.keys) or key prefixes (ignoreTags.keyPrefixes). In both cases it is forbidden to ignore the Name tag or any tag starting with kubernetes.io or gardener.cloud. Please note, though, that the tags are only ignored on resources created on behalf of the Infrastructure CR (i.e. VPC, subnets, security groups, keypair, etc.), while tags on machines, volumes, etc. are not in the scope of this controller.\nControlPlaneConfig The control plane configuration mainly contains values for the AWS-specific control plane components. Today, the only component deployed by the AWS extension is the cloud-controller-manager.\nAn example ControlPlaneConfig for the AWS extension looks as follows:\napiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig cloudControllerManager: # featureGates: # SomeKubernetesFeature: true useCustomRouteController: true # loadBalancerController: # enabled: true # ingressClassName: alb storage: managedDefaultClass: false The cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage, it’s not recommended to use this field at all, as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager, simply omit the key in the YAML specification.\nThe cloudControllerManager.useCustomRouteController controls if the custom routes controller should be enabled. 
If enabled, it will add routes to the pod CIDRs for all nodes in the route tables for all zones.\nThe storage.managedDefaultClass controls if the default storage / volume snapshot classes are marked as default by Gardener. Set it to false to mark another storage / volume snapshot class as default without Gardener overwriting this change. If unset, this field defaults to true.\nIf the AWS Load Balancer Controller should be deployed, set loadBalancerController.enabled to true. In this case, it is assumed that an IngressClass named alb is created by the user. You can overwrite the name by setting loadBalancerController.ingressClassName.\nPlease note that currently only the “instance” mode is supported.\nExamples for Ingress and Service managed by the AWS Load Balancer Controller: Prerequisites Make sure you have created an IngressClass. For more details about parameters, please see AWS Load Balancer Controller - IngressClass\napiVersion: networking.k8s.io/v1 kind: IngressClass metadata: name: alb # default name if not specified by `loadBalancerController.ingressClassName` spec: controller: ingress.k8s.aws/alb Ingress apiVersion: networking.k8s.io/v1 kind: Ingress metadata: namespace: default name: echoserver annotations: # complete set of annotations: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/ingress/annotations/ alb.ingress.kubernetes.io/scheme: internet-facing alb.ingress.kubernetes.io/target-type: instance # target-type \"ip\" NOT supported in Gardener spec: ingressClassName: alb rules: - http: paths: - path: / pathType: Prefix backend: service: name: echoserver port: number: 80 For more details see AWS Load Balancer Documentation - Ingress Specification\nService of Type LoadBalancer This can be used to create a Network Load Balancer (NLB).\napiVersion: v1 kind: Service metadata: annotations: # complete set of annotations: https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/annotations/ 
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance # target-type \"ip\" NOT supported in Gardener service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing name: ingress-nginx-controller namespace: ingress-nginx ... spec: ... type: LoadBalancer loadBalancerClass: service.k8s.aws/nlb # mandatory to be managed by AWS Load Balancer Controller (otherwise the Cloud Controller Manager will act on it) For more details see AWS Load Balancer Documentation - Network Load Balancer\nWorkerConfig The AWS extension supports encryption for volumes plus support for additional data volumes per machine. For each data volume, you have to specify a name. By default (if not stated otherwise), all the disks (root \u0026 data volumes) are encrypted. Please make sure that your instance type supports encryption. If your instance type doesn’t support encryption, you will have to disable encryption (which is enabled by default) by setting volume.encrypted to false (see the YAML snippet below).\nThe following YAML is a snippet of a Shoot resource:\nspec: provider: workers: - name: cpu-worker ... volume: type: gp2 size: 20Gi encrypted: false dataVolumes: - name: kubelet-dir type: gp2 size: 25Gi encrypted: true Note: The AWS extension does not support EBS volume (root \u0026 data volumes) encryption with a customer-managed CMK. Support for customer-managed CMKs is out of scope for now. Only AWS-managed CMKs are supported.\n Additionally, it is possible to provide further AWS-specific values for configuring the worker pools. The additional configuration must be specified in the providerConfig field of the respective worker.\nspec: provider: workers: - name: cpu-worker ... providerConfig: # AWS worker config The configuration will be evaluated when provider-aws reconciles the worker pools for the respective shoot.\nAn example WorkerConfig for the AWS extension looks as follows:\nspec: provider: workers: - name: cpu-worker ... 
providerConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: iops: 10000 throughput: 200 dataVolumes: - name: kubelet-dir iops: 12345 throughput: 150 snapshotID: snap-1234 iamInstanceProfile: # (specify either ARN or name) name: my-profile # arn: my-instance-profile-arn instanceMetadataOptions: httpTokens: required httpPutResponseHopLimit: 2 nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime) capacity: cpu: 2 gpu: 0 memory: 50Gi The .volume.iops is the number of I/O operations per second (IOPS) that the volume supports. For the io1 and gp3 volume types, this represents the number of IOPS that are provisioned for the volume. For the gp2 volume type, this represents the baseline performance of the volume and the rate at which the volume accumulates I/O credits for bursting. For more information about General Purpose SSD baseline performance, I/O credits, IOPS range and bursting, see Amazon EBS Volume Types (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html) in the Amazon Elastic Compute Cloud User Guide.\nConstraint: IOPS should be a positive value. Validation of IOPS (i.e. whether it is allowed and is in the specified range for a particular volume type) is done on the AWS side.\nThe volume.throughput is the throughput that the volume supports, in MiB/s. As of 16th Aug 2022, this parameter is valid only for gp3 volume types and will return an error from the provider side if specified for other volume types. Its current range of throughput is from 125 MiB/s to 1000 MiB/s. To know more about throughput and its range, see the official AWS documentation here.\nThe .dataVolumes can optionally contain configurations for the data volumes stated in the Shoot specification in the .spec.provider.workers[].dataVolumes list. The .name must match the name of the data volume in the shoot. It is also possible to provide a snapshot ID. 
This allows restoring the data volume from an existing snapshot.\nThe iamInstanceProfile section allows specifying either the name or the ARN (but not both) of the IAM instance profile that should be used for this worker pool. If not specified, a dedicated IAM instance profile created by the infrastructure controller is used (see above).\nThe instanceMetadataOptions controls access to the instance metadata service (IMDS) for the machines of the worker pool. You can do the following operations:\n access IMDSv1 (default) access IMDSv2 - httpPutResponseHopLimit \u003e= 2 access IMDSv2 only (restrict access to IMDSv1) - httpPutResponseHopLimit \u003e= 2, httpTokens = \"required\" disable access to IMDS - httpTokens = \"required\" Note: The accessibility of IMDS discussed in the previous point is described from the point of view of containers NOT running in the host network. By default, IMDSv2 is already enabled on the host network (but not accessible from inside the pods). It is currently not possible to create a VM with complete restriction to the IMDS service. It is, however, possible to restrict access from inside the pods by setting httpTokens to required and not setting httpPutResponseHopLimit (or setting it to 1).\n You can find more information regarding the options in the AWS documentation.\ncpuOptions grants more fine-grained control over the worker’s CPU configuration. It has two attributes:\n coreCount: Specify a custom number of cores the instance should be configured with. threadsPerCore: How many threads there should be on each core. Set to 1 to disable multi-threading. Note that if you decide to configure cpuOptions, both of these values need to be provided. 
For a list of valid combinations of these values refer to the AWS documentation.\nExample Shoot manifest (one availability zone) Please find below an example Shoot manifest for one availability zone:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-aws namespace: garden-dev spec: cloudProfileName: aws region: eu-central-1 secretBindingName: core-aws provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 zones: - name: eu-central-1a internal: 10.250.112.0/22 public: 10.250.96.0/22 workers: 10.250.0.0/19 controlPlaneConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: m5.large minimum: 2 maximum: 2 volume: size: 50Gi type: gp2 # The following provider config is valid if the volume type is `io1`. # providerConfig: # apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 # kind: WorkerConfig # volume: # iops: 10000 zones: - eu-central-1a networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Example Shoot manifest (three availability zones) Please find below an example Shoot manifest for three availability zones:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-aws namespace: garden-dev spec: cloudProfileName: aws region: eu-central-1 secretBindingName: core-aws provider: type: aws infrastructureConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vpc: cidr: 10.250.0.0/16 zones: - name: eu-central-1a workers: 10.250.0.0/26 public: 10.250.96.0/26 internal: 10.250.112.0/26 - name: eu-central-1b workers: 10.250.0.64/26 public: 10.250.96.64/26 internal: 10.250.112.64/26 - name: eu-central-1c workers: 
10.250.0.128/26 public: 10.250.96.128/26 internal: 10.250.112.128/26 controlPlaneConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: m5.large minimum: 3 maximum: 9 volume: size: 50Gi type: gp2 zones: - eu-central-1a - eu-central-1b - eu-central-1c networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true CSI volume provisioners Every AWS shoot cluster will be deployed with the AWS EBS CSI driver. It is compatible with the legacy in-tree volume provisioner that was deprecated by the Kubernetes community and will be removed in future versions of Kubernetes. End-users might want to update their custom StorageClasses to the new ebs.csi.aws.com provisioner.\nNode-specific Volume Limits The Kubernetes scheduler allows a configurable limit for the number of volumes that can be attached to a node. See https://k8s.io/docs/concepts/storage/storage-limits/#custom-limits.\nCSI drivers usually have a different procedure for configuring this custom limit. By default, the EBS CSI driver parses the machine type name and then decides the volume limit. However, this is only a rough approximation and not good enough in most cases. Specifying the volume attach limit via command line flag (--volume-attach-limit) is currently the alternative until a more sophisticated solution presents itself (dynamically discovering the maximum number of attachable volumes per EC2 machine type, see also https://github.com/kubernetes-sigs/aws-ebs-csi-driver/issues/347). The AWS extension allows the --volume-attach-limit flag of the EBS CSI driver to be configurable via the aws.provider.extensions.gardener.cloud/volume-attach-limit annotation on the Shoot resource. 
If the annotation is added to an existing Shoot, then reconciliation needs to be triggered manually (see Immediate reconciliation), because adding an annotation to a resource generally does not increase .metadata.generation.\nKubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-aws@v1.34.\nShoot CA Certificate and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation and ShootSARotation feature gates since gardener-extension-provider-aws@v1.36.\nFlow Infrastructure Reconciler The extension offers two different reconciler implementations for the infrastructure resource:\n terraform-based native Go SDK based (dubbed the “flow”-based implementation) The default implementation currently is the terraform reconciler which uses https://github.com/gardener/terraformer as the backend for managing the shoot’s infrastructure.\nThe “flow” implementation is a newer implementation that is trying to solve issues we faced with managing terraform infrastructure on Kubernetes. The goal is to have more control over the reconciliation process and be able to perform fine-grained tuning over it. The implementation is completely backwards-compatible and offers a migration route from the legacy terraformer implementation.\nFor most users there will be no noticeable difference. However, for certain use cases, users may notice a slight deviation from the previous behavior. For example, with flow-based infrastructure users may be able to perform certain modifications to infrastructure resources without having them reconciled back by terraform. 
Operations that would degrade the shoot infrastructure are still expected to be reverted.\nFor the time being, to take advantage of the flow reconciler, users have to “opt in” by annotating the shoot manifest with: aws.provider.extensions.gardener.cloud/use-flow=\"true\". For existing shoots with this annotation, the migration will take place on the next infrastructure reconciliation (on maintenance window or if other infrastructure changes are requested). The migration is not revertible.\n","categories":"","description":"","excerpt":"Using the AWS provider extension with Gardener as end-user The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-aws/usage/","tags":"","title":"Usage"},{"body":"Using the Azure provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.\nThis document describes the configurable options for Azure and provides an example Shoot manifest with minimal configuration that can be used to create an Azure cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nAzure Provider Credentials In order for Gardener to create a Kubernetes cluster using Azure infrastructure components, a Shoot has to provide credentials with sufficient permissions to the desired Azure subscription. Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of the Azure subscription. The SecretBinding/CredentialsBinding is configurable in the Shoot cluster with the field secretBindingName/credentialsBindingName.\nCreate an Azure Application and Service Principal and obtain its credentials.\nPlease ensure that the Azure application (spn) has the IAM actions defined here assigned. 
If no fine-grained permissions/actions are required, then simply assign the Contributor role.\nThe example below demonstrates what the secret containing the client credentials of the Azure Application has to look like:\napiVersion: v1 kind: Secret metadata: name: core-azure namespace: garden-dev type: Opaque data: clientID: base64(client-id) clientSecret: base64(client-secret) subscriptionID: base64(subscription-id) tenantID: base64(tenant-id) ⚠️ Depending on your API usage, it can be problematic to reuse the same Service Principal for different Shoot clusters due to rate limits. Please consider spreading your Shoots over Service Principals from different Azure subscriptions if you are hitting those limits.\nManaged Service Principals The operators of the Gardener Azure extension can provide managed service principals. This eliminates the need for users to provide their own service principal for a Shoot.\nTo make use of a managed service principal, the Azure secret of a Shoot cluster must contain only a subscriptionID and a tenantID field, but no clientID and clientSecret. Removing those fields from the secret of an existing Shoot will also let it adopt the managed service principal.\nBased on the tenantID field, the Gardener extension will try to assign the managed service principal to the Shoot. 
If no managed service principal can be assigned then the next operation on the Shoot will fail.\n⚠️ The managed service principal needs to be assigned to the user’s Azure subscription with proper permissions before using it.\nInfrastructureConfig The infrastructure configuration mainly describes what the network layout looks like in order to create the shoot worker nodes in a later step, and thus prepares everything relevant to create VMs, load balancers, volumes, etc.\nAn example InfrastructureConfig for the Azure extension looks as follows:\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: # specify either 'name' and 'resourceGroup' or 'cidr' # name: my-vnet # resourceGroup: my-vnet-resource-group cidr: 10.250.0.0/16 # ddosProtectionPlanID: /subscriptions/test/resourceGroups/test/providers/Microsoft.Network/ddosProtectionPlans/test-ddos-protection-plan workers: 10.250.0.0/19 # natGateway: # enabled: false # idleConnectionTimeoutMinutes: 4 # zone: 1 # ipAddresses: # - name: my-public-ip-name # resourceGroup: my-public-ip-resource-group # zone: 1 # serviceEndpoints: # - Microsoft.Test # zones: # - name: 1 # cidr: \"10.250.0.0/24\" # - name: 2 # cidr: \"10.250.1.0/24\" # natGateway: # enabled: false zoned: false # resourceGroup: # name: mygroup #identity: # name: my-identity-name # resourceGroup: my-identity-resource-group # acrAccess: true Currently, it’s not yet possible to deploy into existing resource groups. The .resourceGroup.name field will allow specifying the name of an already existing resource group that the shoot cluster and all infrastructure resources will be deployed to.\nVia the .zoned boolean you can tell whether you want to use Azure availability zones or not. If you don’t use zones then an availability set will be created and only basic load balancers will be used. 
Zoned clusters use standard load balancers.\nThe networks.vnet section describes whether you want to create the shoot cluster in an already existing VNet or whether to create a new one:\n If networks.vnet.name and networks.vnet.resourceGroup are given then you have to specify the VNet name and VNet resource group name of the existing VNet that was created by other means (manually, other tooling, …). If networks.vnet.cidr is given then you have to specify the VNet CIDR of a new VNet that will be created during shoot creation. You can freely choose a private CIDR range. Either networks.vnet.name and networks.vnet.resourceGroup or networks.vnet.cidr must be present, but not both at the same time. The networks.vnet.ddosProtectionPlanID field can be used to specify the ID of a DDoS protection plan which should be assigned to the VNet. This will only work for a VNet managed by Gardener. For externally managed VNets the DDoS protection plan must be assigned by other means. If a VNet name is given and cilium shoot clusters are created without a network overlay within one VNet, make sure that the pod CIDR specified in shoot.spec.networking.pods is not overlapping with any other pod CIDR used in that VNet. Overlapping pod CIDRs will lead to dysfunctional shoot clusters. It’s possible to place multiple shoot clusters into the same VNet. The networks.workers section describes the CIDR for a subnet that is used for all shoot worker nodes, i.e., VMs which later run your applications. The specified CIDR range must be contained in the VNet CIDR specified above, or the VNet CIDR of your already existing VNet. You can freely choose this CIDR and it is your responsibility to properly design the network layout to suit your needs.\nIn the networks.serviceEndpoints[] list you can specify the list of Azure service endpoints which shall be associated with the worker subnet. 
All available service endpoints and their technical names can be found in the Azure Service Endpoint documentation (https://docs.microsoft.com/en-us/azure/virtual-network/virtual-network-service-endpoints-overview).\nThe networks.natGateway section contains configuration for the Azure NatGateway which can be attached to the worker subnet of a Shoot cluster. Here is some key information about the usage of the NatGateway for a Shoot cluster:\n NatGateway usage is optional and can be enabled or disabled via .networks.natGateway.enabled. If the NatGateway is not used then the egress connections initiated within the Shoot cluster will be NATed via the LoadBalancer of the cluster (default Azure behaviour, see here). NatGateway is only available for zonal clusters (.zoned=true). The NatGateway is currently not deployed in a zone-redundant way. That means the NatGateway of a Shoot cluster will always be in just one zone. This zone can be optionally selected via .networks.natGateway.zone. Caution: Modifying the .networks.natGateway.zone setting requires a recreation of the NatGateway and the managed public IP (automatically used if no own public IP is specified, see below). That means you will most likely get a different public IP for egress connections. It is possible to bring your own zonal public IP(s) via networks.natGateway.ipAddresses. Those public IP(s) need to be in the same zone as the NatGateway (see networks.natGateway.zone) and be of SKU standard. For each public IP the name, the resourceGroup and the zone need to be specified. The field networks.natGateway.idleConnectionTimeoutMinutes allows the configuration of NAT Gateway’s idle connection timeout property. The idle timeout value can be adjusted from 4 minutes up to 120 minutes. Omitting this property will set the idle timeout to its default value according to NAT Gateway’s documentation. 
In the identity section you can specify an Azure user-assigned managed identity which should be attached to all cluster worker machines. With identity.name you can specify the name of the identity and with identity.resourceGroup you can specify the resource group which contains the identity resource on Azure. The identity needs to be created by the user upfront (manually, other tooling, …). Gardener/Azure Extension will only use the referenced one and won’t create an identity. Furthermore the identity has to be in the same subscription as the Shoot cluster. Via identity.acrAccess you can configure the worker machines to use the passed identity for pulling from an Azure Container Registry (ACR). Caution: Adding, exchanging or removing the identity will require a rolling update of all worker machines in the Shoot cluster.\nApart from the VNet and the worker subnet the Azure extension will also create a dedicated resource group, route tables, security groups, and an availability set (if not using zoned clusters).\nInfrastructureConfig with dedicated subnets per zone Another deployment option, for zonal clusters only, is to create and configure a separate subnet per availability zone. This network layout is recommended for users that require fine-grained control over their network setup. One prevalent use case is to create a zone-redundant NAT Gateway deployment by taking advantage of the ability to deploy separate NAT Gateways for each subnet.\nTo use this configuration the following requirements must be met:\n the zoned field must be set to true. the networks.vnet section must not be empty and must contain a valid configuration. For existing clusters that were not using the networks.vnet section, it is enough if the networks.vnet.cidr field is set to the current networks.workers value. For each of the target zones a subnet CIDR range must be specified. 
The specified CIDR range must be contained in the VNet CIDR specified above, or the VNet CIDR of your already existing VNet. In addition, the CIDR ranges must not overlap with the ranges of the other subnets.\nServiceEndpoints and NatGateways can be configured per subnet. Consequently, when networks.zones is specified, the fields networks.workers, networks.serviceEndpoints and networks.natGateway cannot be set. All the configuration for the subnets must be done inside the respective zone’s configuration.\nExample:\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: zoned: true vnet: # specify either 'name' and 'resourceGroup' or 'cidr' cidr: 10.250.0.0/16 zones: - name: 1 cidr: \"10.250.0.0/24\" - name: 2 cidr: \"10.250.1.0/24\" natGateway: enabled: false Migrating to zonal shoots with dedicated subnets per zone For existing zonal clusters it is possible to migrate to a network layout with dedicated subnets per zone. The migration works by creating additional network resources as specified in the configuration and progressively rolling part of your existing nodes to use the new resources. To achieve the controlled rollout of your nodes, parts of the existing infrastructure must be preserved, which is why the following constraint is imposed:\nOne of your specified zones must have the exact same CIDR range as the current networks.workers field. 
Here is an example of such migration:\ninfrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 zoned: true to\ninfrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 zones: - name: 3 cidr: 10.250.0.0/19 # note the preservation of the 'workers' CIDR # optionally add other zones # - name: 2 # cidr: 10.250.32.0/19 # natGateway: # enabled: true zoned: true Another more advanced example with user-provided public IP addresses for the NAT Gateway and how it can be migrated:\ninfrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 natGateway: enabled: true zone: 1 ipAddresses: - name: pip1 resourceGroup: group zone: 1 - name: pip2 resourceGroup: group zone: 1 zoned: true to\ninfrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig zoned: true networks: vnet: cidr: 10.250.0.0/16 zones: - name: 1 cidr: 10.250.0.0/19 # note the preservation of the 'workers' CIDR natGateway: enabled: true ipAddresses: - name: pip1 resourceGroup: group zone: 1 - name: pip2 resourceGroup: group zone: 1 # optionally add other zones # - name: 2 # cidr: 10.250.32.0/19 # natGateway: # enabled: true # ipAddresses: # - name: pip3 # resourceGroup: group You can apply such a change to your shoot by issuing a kubectl patch command to replace your current .spec.provider.infrastructureConfig section:\n$ cat new-infra.json [ { \"op\": \"replace\", \"path\": \"/spec/provider/infrastructureConfig\", \"value\": { \"apiVersion\": \"azure.provider.extensions.gardener.cloud/v1alpha1\", \"kind\": \"InfrastructureConfig\", \"networks\": { \"vnet\": { \"cidr\": \"\u003cyour-vnet-cidr\u003e\" }, \"zones\": [ { \"name\": 1, \"cidr\": \"10.250.0.0/24\", 
\"natGateway\": { \"enabled\": true } }, { \"name\": 2, \"cidr\": \"10.250.1.0/24\", \"natGateway\": { \"enabled\": true } } ] }, \"zoned\": true } } ] kubectl patch --type=\"json\" --patch-file new-infra.json shoot \u003cmy-shoot\u003e ⚠️ The migration to shoots with dedicated subnets per zone is a one-way process. Reverting the shoot to the previous configuration is not supported.\n⚠️ During the migration a subset of the nodes will be rolled to the new subnets.\nControlPlaneConfig The control plane configuration mainly contains values for the Azure-specific control plane components. Today, the only component deployed by the Azure extension is the cloud-controller-manager.\nAn example ControlPlaneConfig for the Azure extension looks as follows:\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig cloudControllerManager: # featureGates: # SomeKubernetesFeature: true The cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage it’s not recommended to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager simply omit the key in the YAML specification.\nstorage contains options for storage-related control plane components. 
storage.managedDefaultStorageClass is enabled by default and will deploy a storageClass and mark it as a default (via the storageclass.kubernetes.io/is-default-class annotation) storage.managedDefaultVolumeSnapshotClass is enabled by default and will deploy a volumeSnapshotClass and mark it as a default (via the snapshot.storage.kubernetes.io/is-default-class annotation) In case you want to manage your own default storageClass or volumeSnapshotClass you need to disable the respective options above, otherwise reconciliation of the controlplane may fail.\nWorkerConfig The Azure extension supports encryption for volumes plus support for additional data volumes per machine. Please note that you cannot specify the encrypted flag for Azure disks as they are encrypted by default/out-of-the-box. For each data volume, you have to specify a name. The following YAML is a snippet of a Shoot resource:\nspec: provider: workers: - name: cpu-worker ... volume: type: Standard_LRS size: 20Gi dataVolumes: - name: kubelet-dir type: Standard_LRS size: 25Gi Additionally, other Azure-specific values are supported and can be configured under .spec.provider.workers[].providerConfig.\nAn example WorkerConfig for the Azure extension looks like:\napiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime) capacity: cpu: 2 gpu: 1 memory: 50Gi diagnosticsProfile: enabled: true # storageURI: https://\u003cstorage-account-name\u003e.blob.core.windows.net/ dataVolumes: - name: test-image imageRef: communityGalleryImageID: /CommunityGalleries/gardenlinux-13e998fe-534d-4b0a-8a27-f16a73aef620/Images/gardenlinux/Versions/1443.10.0 # sharedGalleryImageID: /SharedGalleries/82fc46df-cc38-4306-9880-504e872cee18-VSMP_MEMORYONE_GALLERY/Images/vSMP_MemoryONE/Versions/1062800168.0.0 # id: 
/Subscriptions/2ebd38b6-270b-48a2-8e0b-2077106dc615/Providers/Microsoft.Compute/Locations/westeurope/Publishers/sap/ArtifactTypes/VMImage/Offers/gardenlinux/Skus/greatest/Versions/1443.10.0 # urn: sap:gardenlinux:greatest:1443.10.0 The .nodeTemplate is used to specify resource information of the machine during runtime. This then helps in Scale-from-Zero. Some points to note for this field:\n Currently only cpu, gpu and memory are configurable. A change in the value leads to a rolling update of the machines in the worker pool. All the resources need to be specified. The .diagnosticsProfile is used to enable machine boot diagnostics (disabled by default). A storage account is used for storing the VM’s boot console output and screenshots. If .diagnosticsProfile.storageURI is not specified, Azure managed storage will be used (recommended way).\nThe .dataVolumes field is used to add provider-specific configurations for dataVolumes. .dataVolumes[].name must match one of the names in workers.dataVolumes[].name. To specify an image source for the dataVolume either use communityGalleryImageID, sharedGalleryImageID, id or urn as imageRef. However, users have to make sure that the image really exists, as there is no check in place yet. 
If the image does not exist the machine will get stuck in creation.\nExample Shoot manifest (non-zoned) Please find below an example Shoot manifest for a non-zoned cluster:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-azure namespace: garden-dev spec: cloudProfile: name: azure region: westeurope secretBindingName: core-azure provider: type: azure infrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 zoned: false controlPlaneConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: Standard_D4_v3 minimum: 2 maximum: 2 volume: size: 50Gi type: Standard_LRS # providerConfig: # apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 # kind: WorkerConfig # nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime) # capacity: # cpu: 2 # gpu: 1 # memory: 50Gi networking: type: calico pods: 100.96.0.0/11 nodes: 10.250.0.0/16 services: 100.64.0.0/13 kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Example Shoot manifest (zoned) Please find below an example Shoot manifest for a zoned cluster:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-azure namespace: garden-dev spec: cloudProfile: name: azure region: westeurope secretBindingName: core-azure provider: type: azure infrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 zoned: true controlPlaneConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: Standard_D4_v3 minimum: 2 maximum: 2 
volume: size: 50Gi type: Standard_LRS zones: - \"1\" - \"2\" networking: type: calico pods: 100.96.0.0/11 nodes: 10.250.0.0/16 services: 100.64.0.0/13 kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Example Shoot manifest (zoned with NAT Gateways per zone) Please find below an example Shoot manifest for a zoned cluster using NAT Gateways per zone:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-azure namespace: garden-dev spec: cloudProfile: name: azure region: westeurope secretBindingName: core-azure provider: type: azure infrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 zones: - name: 1 cidr: 10.250.0.0/24 serviceEndpoints: - Microsoft.Storage - Microsoft.Sql natGateway: enabled: true idleConnectionTimeoutMinutes: 4 - name: 2 cidr: 10.250.1.0/24 serviceEndpoints: - Microsoft.Storage - Microsoft.Sql natGateway: enabled: true zoned: true controlPlaneConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: Standard_D4_v3 minimum: 2 maximum: 2 volume: size: 50Gi type: Standard_LRS zones: - \"1\" - \"2\" networking: type: calico pods: 100.96.0.0/11 nodes: 10.250.0.0/16 services: 100.64.0.0/13 kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true CSI volume provisioners Every Azure shoot cluster will be deployed with the Azure Disk CSI driver and the Azure File CSI driver.\nKubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-azure@v1.25.\nShoot CA Certificate 
and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation and ShootSARotation feature gates since gardener-extension-provider-azure@v1.28.\nMiscellaneous Azure Accelerated Networking All worker machines of the cluster will be automatically configured to use Azure Accelerated Networking if the prerequisites are fulfilled. The prerequisites are that the cluster must be zoned, and the used machine type and operating system image version are compatible with Accelerated Networking. Availability Set based shoot clusters will not be enabled for accelerated networking even if the machine type and operating system support it. This is necessary because all machines from the availability set must be scheduled on special hardware; more details can be found here. Supported machine types are listed in the CloudProfile in .spec.providerConfig.machineTypes[].acceleratedNetworking and the supported operating system image versions are defined in .spec.providerConfig.machineImages[].versions[].acceleratedNetworking.\nPreview: Shoot clusters with VMSS Flexible Orchestration (VMSS Flex/VMO) The machines of an Azure cluster can be created while being attached to an Azure Virtual Machine ScaleSet with flexible orchestration. The Virtual Machine ScaleSet with flexible orchestration feature is currently in preview and not yet generally available on Azure. 
Subscriptions need to join the preview to make use of the feature.\nAzure VMSS Flex is intended to replace Azure AvailabilitySet for non-zoned Azure Shoot clusters in the mid-term (once the feature goes GA) as VMSS Flex comes with fewer disadvantages: for example, machine operations are not blocking and it is compatible with Standard SKU load balancers.\nTo configure an Azure Shoot cluster which makes use of VMSS Flex you need to do the following:\n The InfrastructureConfig of the Shoot configuration needs to contain .zoned=false The Shoot resource needs to have the following annotation assigned: alpha.azure.provider.extensions.gardener.cloud/vmo=true Some key facts about VMSS Flex based clusters:\n Unlike regular non-zonal Azure Shoot clusters, which have a primary AvailabilitySet which is shared between all machines in all worker pools of a Shoot cluster, a VMSS Flex based cluster has its own VMSS for each worker pool In case the configuration of the VMSS changes (e.g. the number of fault domains in a region changes; configured in the CloudProfile) all machines of the worker pool need to be rolled It is not possible to migrate an existing primary AvailabilitySet based Shoot cluster to a VMSS Flex based Shoot cluster and vice versa VMSS Flex based clusters are using Standard SKU LoadBalancers instead of Basic SKU LoadBalancers for AvailabilitySet based Shoot clusters ","categories":"","description":"","excerpt":"Using the Azure provider extension with Gardener as end-user The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-azure/usage/","tags":"","title":"Usage"},{"body":"Using the Equinix Metal provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.\nThis document describes what this configuration looks like for Equinix Metal and provides an example Shoot manifest with minimal configuration that you can use to create an Equinix Metal 
cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nProvider secret data Every shoot cluster references a SecretBinding which itself references a Secret, and this Secret contains the provider credentials of your Equinix Metal project. This Secret must look as follows:\napiVersion: v1 kind: Secret metadata: name: my-secret namespace: garden-dev type: Opaque data: apiToken: base64(api-token) projectID: base64(project-id) Please look up https://metal.equinix.com/developers/api/ as well.\nWith Secret created, create a SecretBinding resource referencing it. It may look like this:\napiVersion: core.gardener.cloud/v1beta1 kind: SecretBinding metadata: name: my-secret namespace: garden-dev secretRef: name: my-secret quotas: [] InfrastructureConfig Currently, there is no infrastructure configuration possible for the Equinix Metal environment.\nAn example InfrastructureConfig for the Equinix Metal extension looks as follows:\napiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig The Equinix Metal extension will only create a key pair.\nControlPlaneConfig The control plane configuration mainly contains values for the Equinix Metal-specific control plane components. 
Today, the Equinix Metal extension deploys the cloud-controller-manager and the CSI controllers, however, it doesn’t offer any configuration options at the moment.\nAn example ControlPlaneConfig for the Equinix Metal extension looks as follows:\napiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig WorkerConfig The Equinix Metal extension supports specifying IDs for reserved devices that should be used for the machines of a specific worker pool.\nAn example WorkerConfig for the Equinix Metal extension looks as follows:\napiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig reservationIDs: - my-reserved-device-1 - my-reserved-device-2 reservedDevicesOnly: false The .reservationIDs[] list contains the list of IDs of the reserved devices. The .reservedDevicesOnly field indicates whether only reserved devices from the provided list of reservation IDs should be used when new machines are created. It always will attempt to create a device from one of the reservation IDs. 
If none is available, the behaviour depends on the setting:\n true: return an error false: request a regular on-demand device The default value is false.\nExample Shoot manifest Please find below an example Shoot manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: my-shoot namespace: garden-dev spec: cloudProfileName: equinix-metal region: ny # Corresponds to a metro secretBindingName: my-secret provider: type: equinixmetal infrastructureConfig: apiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig controlPlaneConfig: apiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-pool1 machine: type: t1.small minimum: 2 maximum: 2 volume: size: 50Gi type: storage_1 zones: # Optional list of facilities, all of which MUST be in the metro; if not provided, then random facilities within the metro will be chosen for each machine. - ewr1 - ny5 - name: reserved-pool machine: type: t1.small minimum: 1 maximum: 2 providerConfig: apiVersion: equinixmetal.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig reservationIDs: - reserved-device1 - reserved-device2 reservedDevicesOnly: true volume: size: 50Gi type: storage_1 networking: type: calico kubernetes: version: 1.27.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true ⚠️ Note that if you specify multiple facilities in the .spec.provider.workers[].zones[] list then new machines are randomly created in one of the provided facilities. 
Particularly, it is not ensured that all facilities are used or that all machines are equally or unequally distributed.\nKubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-equinix-metal@v2.2.\nShoot CA Certificate and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation feature gate since gardener-extension-provider-equinix-metal@v2.3 and ShootSARotation feature gate since gardener-extension-provider-equinix-metal@v2.4.\n","categories":"","description":"","excerpt":"Using the Equinix Metal provider extension with Gardener as end-user …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-equinix-metal/usage/","tags":"","title":"Usage"},{"body":"Using the GCP provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.\nThis document describes the configurable options for GCP and provides an example Shoot manifest with minimal configuration that can be used to create a GCP cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nGCP Provider Credentials In order for Gardener to create a Kubernetes cluster using GCP infrastructure components, a Shoot has to provide credentials with sufficient permissions to the desired GCP project. Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of the GCP project. The SecretBinding/CredentialsBinding is configurable in the Shoot cluster with the field secretBindingName/credentialsBindingName.\nThe required credentials for the GCP project are a Service Account Key to authenticate as a GCP Service Account. 
A service account is a special account that can be used by services and applications to interact with Google Cloud Platform APIs. Applications can use service account credentials to authorize themselves to a set of APIs and perform actions within the permissions granted to the service account.\nMake sure to enable the Google Identity and Access Management (IAM) API. Create a Service Account that shall be used for the Shoot cluster. Grant at least the following IAM roles to the Service Account.\n Service Account Admin Service Account Token Creator Service Account User Compute Admin Create a JSON Service Account key for the Service Account. Provide it in the Secret (base64 encoded for field serviceaccount.json), that is being referenced by the SecretBinding in the Shoot cluster configuration.\nThis Secret must look as follows:\napiVersion: v1 kind: Secret metadata: name: core-gcp namespace: garden-dev type: Opaque data: serviceaccount.json: base64(serviceaccount-json) ⚠️ Depending on your API usage it can be problematic to reuse the same Service Account Key for different Shoot clusters due to rate limits. 
Please consider spreading your Shoots over multiple Service Accounts on different GCP projects if you are hitting those limits, see https://cloud.google.com/compute/docs/api-rate-limits.\nInfrastructureConfig The infrastructure configuration mainly describes how the network layout looks like in order to create the shoot worker nodes in a later step, thus, prepares everything relevant to create VMs, load balancers, volumes, etc.\nAn example InfrastructureConfig for the GCP extension looks as follows:\napiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: # vpc: # name: my-vpc # cloudRouter: # name: my-cloudrouter workers: 10.250.0.0/16 # internal: 10.251.0.0/16 # cloudNAT: # minPortsPerVM: 2048 # maxPortsPerVM: 65536 # endpointIndependentMapping: # enabled: false # enableDynamicPortAllocation: false # natIPNames: # - name: manualnat1 # - name: manualnat2 # udpIdleTimeoutSec: 30 # icmpIdleTimeoutSec: 30 # tcpEstablishedIdleTimeoutSec: 1200 # tcpTransitoryIdleTimeoutSec: 30 # tcpTimeWaitTimeoutSec: 120 # flowLogs: # aggregationInterval: INTERVAL_5_SEC # flowSampling: 0.2 # metadata: INCLUDE_ALL_METADATA The networks.vpc section describes whether you want to create the shoot cluster in an already existing VPC or whether to create a new one:\n If networks.vpc.name is given then you have to specify the VPC name of the existing VPC that was created by other means (manually, other tooling, …). 
If you want to get a fresh VPC for the shoot then just omit the networks.vpc field.\n If a VPC name is not given then we will create the cloud router + NAT gateway to ensure that worker nodes don’t get external IPs.\n If a VPC name is given then a cloud router name must also be given; failure to do so would result in validation errors and possibly clusters without egress connectivity.\n If a VPC name is given and calico shoot clusters are created without a network overlay within one VPC, make sure that the pod CIDR specified in shoot.spec.networking.pods does not overlap with any other pod CIDR used in that VPC. Overlapping pod CIDRs will lead to dysfunctional shoot clusters.\n The networks.workers section describes the CIDR for a subnet that is used for all shoot worker nodes, i.e., VMs which later run your applications.\nThe networks.internal section is optional and can describe a CIDR for a subnet that is used for internal load balancers.\nThe networks.cloudNAT.minPortsPerVM is optional and is used to define the minimum number of ports allocated to a VM for the CloudNAT.\nThe networks.cloudNAT.natIPNames is optional and is used to specify the names of the manual IP addresses which should be used by the NAT gateway.\nThe networks.cloudNAT.endpointIndependentMapping is optional and is used to define the endpoint mapping behavior. You can enable it or disable it at any point by toggling networks.cloudNAT.endpointIndependentMapping.enabled. By default, it is disabled.\nnetworks.cloudNAT.enableDynamicPortAllocation is optional (default: false) and allows one to enable dynamic port allocation (https://cloud.google.com/nat/docs/ports-and-addresses#dynamic-port). Note that enabling this puts additional restrictions on the permitted values for networks.cloudNAT.minPortsPerVM and networks.cloudNAT.maxPortsPerVM, namely that they now both are required to be powers of two. 
Also, maxPortsPerVM may not be given if dynamic port allocation is disabled.\nnetworks.cloudNAT.udpIdleTimeoutSec, networks.cloudNAT.icmpIdleTimeoutSec, networks.cloudNAT.tcpEstablishedIdleTimeoutSec, networks.cloudNAT.tcpTransitoryIdleTimeoutSec, and networks.cloudNAT.tcpTimeWaitTimeoutSec give more fine-granular control over various timeout-values. For more details see https://cloud.google.com/nat/docs/public-nat#specs-timeouts.\nThe specified CIDR ranges must be contained in the VPC CIDR specified above, or the VPC CIDR of your already existing VPC. You can freely choose these CIDRs and it is your responsibility to properly design the network layout to suit your needs.\nThe networks.flowLogs section describes the configuration for the VPC flow logs. In order to enable the VPC flow logs at least one of the following parameters needs to be specified in the flow log section:\n networks.flowLogs.aggregationInterval an optional parameter describing the aggregation interval for collecting flow logs. For more details, see aggregation_interval reference.\n networks.flowLogs.flowSampling an optional parameter describing the sampling rate of VPC flow logs within the subnetwork where 1.0 means all collected logs are reported and 0.0 means no logs are reported. For more details, see flow_sampling reference.\n networks.flowLogs.metadata an optional parameter describing whether metadata fields should be added to the reported VPC flow logs. For more details, see metadata reference.\n Apart from the VPC and the subnets the GCP extension will also create a dedicated service account for this shoot, and firewall rules.\nControlPlaneConfig The control plane configuration mainly contains values for the GCP-specific control plane components. 
Today, the only component deployed by the GCP extension is the cloud-controller-manager.\nAn example ControlPlaneConfig for the GCP extension looks as follows:\napiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig zone: europe-west1-b cloudControllerManager: # featureGates: # SomeKubernetesFeature: true storage: managedDefaultStorageClass: true managedDefaultVolumeSnapshotClass: true The zone field tells the cloud-controller-manager in which zone it should mainly operate. You can still create clusters in multiple availability zones, however, the cloud-controller-manager requires one “main” zone. ⚠️ You always have to specify this field!\nThe cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage it’s not recommended to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager simply omit the key in the YAML specification.\nThe members of the storage section allow you to further configure the provided storage classes. If storage.managedDefaultStorageClass is enabled (the default), the default StorageClass deployed will be marked as default (via storageclass.kubernetes.io/is-default-class annotation). Similarly, if storage.managedDefaultVolumeSnapshotClass is enabled (the default), the default VolumeSnapshotClass deployed will be marked as default. 
In case you want to set a different StorageClass or VolumeSnapshotClass as default you need to set the corresponding option to false, as at most one class should be marked as default in each case and the ResourceManager will prevent any changes to the Gardener-managed classes from taking effect.\nWorkerConfig The worker configuration contains:\n Local SSD interface for the additional volumes attached to GCP worker machines.\nIf you attach the disk with SCRATCH type, either an NVMe interface or a SCSI interface must be specified. It is only meaningful to provide this volume interface if only SCRATCH data volumes are used.\n Volume Encryption config that specifies values for kmsKeyName and kmsKeyServiceAccount.\n The kmsKeyName is the key name of the cloud kms disk encryption key and must be specified if CMEK disk encryption is needed. The kmsKeyServiceAccount is the service account granted the roles/cloudkms.cryptoKeyEncrypterDecrypter on the kmsKeyName. If empty, then the role should be given to the Compute Engine Service Agent Account. This CESA account usually has the name: service-PROJECT_NUMBER@compute-system.iam.gserviceaccount.com. See: https://cloud.google.com/iam/docs/service-agents#compute-engine-service-agent Prior to use, the operator should add an IAM policy binding using the gcloud CLI: gcloud projects add-iam-policy-binding projectId --member serviceAccount:name@projectId.iam.gserviceaccount.com --role roles/cloudkms.cryptoKeyEncrypterDecrypter Setting a volume image with dataVolumes.sourceImage. However, this parameter should only be used with particular caution. For example Gardenlinux works with filesystem LABELs only and creating another disk from the very same image causes the LABELs to be duplicated. See: https://github.com/gardener/gardener-extension-provider-gcp/issues/323\n Some hyperdisks allow adjustment of their default values for provisionedIops and provisionedThroughput. 
Keep in mind though that Hyperdisk Extreme and Hyperdisk Throughput volumes can’t be used as boot disks.\n Service Account with its specified scopes, authorized for this worker.\nService accounts created in advance that generate access tokens that can be accessed through the metadata server and used to authenticate applications on the instance.\nNote: If you do not provide service accounts for your workers, the Compute Engine default service account will be used. For more details on the default account, see https://cloud.google.com/compute/docs/access/service-accounts#default_service_account. If the DisableGardenerServiceAccountCreation feature gate is disabled, Gardener will create a shared service account to use for all instances. This feature gate is currently in beta and it will no longer be possible to re-enable the service account creation via feature gate flag.\n GPU with its type and count per node. This will attach that GPU to all the machines in the worker group.\nNote:\n A rolling upgrade of the worker group would be triggered in case the acceleratorType or count is updated.\n Some machineTypes like a2 family come with already attached gpu of a100 type and pre-defined count. If your workerPool consists of such machineTypes, please specify exact GPU configuration for the machine type as specified in Google cloud documentation. The acceleratorType values to use for families with an attached gpu are stated below:\n a2 family -\u003e nvidia-tesla-a100 g2 family -\u003e nvidia-l4 Sufficient quota of gpu is needed in the GCP project. This includes quota to support autoscaling if enabled.\n GPU-attached machines can’t be live migrated during host maintenance events. Find out how to handle that in your application here\n GPU count specified here is considered for forming node template during scale-from-zero in Cluster Autoscaler\n The .nodeTemplate is used to specify resource information of the machine during runtime. This then helps in Scale-from-Zero. 
Some points to note for this field:\n Currently only cpu, gpu and memory are configurable. A change in the value leads to a rolling update of the machines in the worker pool. All the resources need to be specified. An example WorkerConfig for GCP looks as follows:\n apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig volume: interface: NVME encryption: kmsKeyName: \"projects/projectId/locations/\u003czoneName\u003e/keyRings/\u003ckeyRingName\u003e/cryptoKeys/alpha\" kmsKeyServiceAccount: \"user@projectId.iam.gserviceaccount.com\" dataVolumes: - name: test sourceImage: projects/sap-se-gcp-gardenlinux/global/images/gardenlinux-gcp-gardener-prod-amd64-1443-3-c261f887 provisionedIops: 3000 provisionedThroughput: 140 serviceAccount: email: foo@bar.com scopes: - https://www.googleapis.com/auth/cloud-platform gpu: acceleratorType: nvidia-tesla-t4 count: 1 nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime) capacity: cpu: 2 gpu: 1 memory: 50Gi Example Shoot manifest Please find below an example Shoot manifest:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-gcp namespace: garden-dev spec: cloudProfileName: gcp region: europe-west1 secretBindingName: core-gcp provider: type: gcp infrastructureConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: workers: 10.250.0.0/16 controlPlaneConfig: apiVersion: gcp.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig zone: europe-west1-b workers: - name: worker-xoluy machine: type: n1-standard-4 minimum: 2 maximum: 2 volume: size: 50Gi type: pd-standard zones: - europe-west1-b networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true CSI volume provisioners Every GCP shoot cluster will be 
deployed with the GCP PD CSI driver. It is compatible with the legacy in-tree volume provisioner that was deprecated by the Kubernetes community and will be removed in future versions of Kubernetes. End-users might want to update their custom StorageClasses to the new pd.csi.storage.gke.io provisioner.\nKubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-gcp@v1.21.\nShoot CA Certificate and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation and ShootSARotation feature gates since gardener-extension-provider-gcp@v1.23.\n","categories":"","description":"","excerpt":"Using the GCP provider extension with Gardener as end-user The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-gcp/usage/","tags":"","title":"Usage"},{"body":"Using the OpenStack provider extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that are meant to contain provider-specific configuration.\nIn this document we are describing how this configuration looks like for OpenStack and provide an example Shoot manifest with minimal configuration that you can use to create an OpenStack cluster (modulo the landscape-specific information like cloud profile names, secret binding names, etc.).\nProvider Secret Data Every shoot cluster references a SecretBinding or a CredentialsBinding which itself references a Secret, and this Secret contains the provider credentials of your OpenStack tenant. 
This Secret must look as follows:\napiVersion: v1 kind: Secret metadata: name: core-openstack namespace: garden-dev type: Opaque data: domainName: base64(domain-name) tenantName: base64(tenant-name) # either use username/password username: base64(user-name) password: base64(password) # or application credentials #applicationCredentialID: base64(app-credential-id) #applicationCredentialName: base64(app-credential-name) # optional #applicationCredentialSecret: base64(app-credential-secret) Please look up https://docs.openstack.org/keystone/pike/admin/identity-concepts.html as well.\nFor authentication with username/password see Keystone username/password\nAlternatively, for authentication with application credentials see Keystone Application Credentials.\n⚠️ Depending on your API usage it can be problematic to reuse the same provider credentials for different Shoot clusters due to rate limits. Please consider spreading your Shoots over multiple credentials from different tenants if you are hitting those limits.\nInfrastructureConfig The infrastructure configuration mainly describes how the network layout looks like in order to create the shoot worker nodes in a later step, thus, prepares everything relevant to create VMs, load balancers, volumes, etc.\nAn example InfrastructureConfig for the OpenStack extension looks as follows:\napiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig floatingPoolName: MY-FLOATING-POOL # floatingPoolSubnetName: my-floating-pool-subnet-name networks: # id: 12345678-abcd-efef-08af-0123456789ab # router: # id: 1234 workers: 10.250.0.0/19 # shareNetwork: # enabled: true The floatingPoolName is the name of the floating pool you want to use for your shoot. 
If you don’t know which floating pools are available look them up in the respective CloudProfile.\nWith floatingPoolSubnetName you can explicitly define to which subnet in the floating pool network (defined via floatingPoolName) the router should be attached.\nnetworks.id is an optional field. If it is given, you can specify the uuid of an existing private Neutron network (created manually, by other tooling, …) that should be reused. A new subnet for the Shoot will be created in it.\nIf a networks.id is given and calico shoot clusters are created without a network overlay within one network make sure that the pod CIDR specified in shoot.spec.networking.pods is not overlapping with any other pod CIDR used in that network. Overlapping pod CIDRs will lead to dysfunctional shoot clusters.\nThe networks.router section describes whether you want to create the shoot cluster in an already existing router or whether to create a new one:\n If networks.router.id is given then you have to specify the router id of the existing router that was created by other means (manually, other tooling, …). If you want to get a fresh router for the shoot then just omit the networks.router field.\n In any case, the shoot cluster will be created in a new subnet.\n The networks.workers section describes the CIDR for a subnet that is used for all shoot worker nodes, i.e., VMs which later run your applications.\nYou can freely choose these CIDRs and it is your responsibility to properly design the network layout to suit your needs.\nApart from the router and the worker subnet the OpenStack extension will also create a network, router interfaces, security groups, and a key pair.\nThe optional networks.shareNetwork.enabled field controls the creation of a share network. This is only needed if shared file system storage (like NFS) should be used. 
Note, that in this case, the ControlPlaneConfig needs additional configuration, too.\nControlPlaneConfig The control plane configuration mainly contains values for the OpenStack-specific control plane components. Today, the only component deployed by the OpenStack extension is the cloud-controller-manager.\nAn example ControlPlaneConfig for the OpenStack extension looks as follows:\napiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerProvider: haproxy loadBalancerClasses: - name: lbclass-1 purpose: default floatingNetworkID: fips-1-id floatingSubnetName: internet-* - name: lbclass-2 floatingNetworkID: fips-1-id floatingSubnetTags: internal,private - name: lbclass-3 purpose: private subnetID: internal-id # cloudControllerManager: # featureGates: # SomeKubernetesFeature: true # storage: # csiManila: # enabled: true The loadBalancerProvider is the provider name you want to use for load balancers in your shoot. If you don’t know which types are available look it up in the respective CloudProfile.\nThe loadBalancerClasses field contains an optional list of load balancer classes which will be available in the cluster. 
Each entry can have the following fields:\n name to select the load balancer class via the kubernetes service annotations loadbalancer.openstack.org/class=name purpose with values default or private The configuration of the default load balancer class will be used as default for all other kubernetes loadbalancer services without a class annotation The configuration of the private load balancer class will also be set to the global loadbalancer configuration of the cluster, but will be overridden by the default purpose floatingNetworkID can be specified to receive an ip from a floating/external network, additionally the subnet in this network can be selected via floatingSubnetName can be either a full subnet name or a regex/glob to match subnet name floatingSubnetTags a comma separated list of subnet tags floatingSubnetID the id of a specific subnet subnetID can be specified to receive an ip from an internal subnet (will not have an effect in combination with floating/external network configuration) The cloudControllerManager.featureGates contains a map of explicitly enabled or disabled feature gates. For production usage it’s not recommended to use this field at all as you can enable alpha features or disable beta/stable features, potentially impacting the cluster stability. If you don’t want to configure anything for the cloudControllerManager simply omit the key in the YAML specification.\nThe optional storage.csiManila.enabled field is used to enable the deployment of the CSI Manila driver to support NFS persistent volumes. In this case, please make sure to set networks.shareNetwork.enabled=true in the InfrastructureConfig, too. Additionally, if CSI Manila driver is enabled, for each availability zone an NFS StorageClass will be created on the shoot named like csi-manila-nfs-\u003czone\u003e.\nWorkerConfig Each worker group in a shoot may contain provider-specific configurations and options. 
These are contained in the providerConfig section of a worker group and can be configured using a WorkerConfig object. An example of a WorkerConfig looks as follows:\napiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig serverGroup: policy: soft-anti-affinity # nodeTemplate: # (to be specified only if the node capacity would be different from cloudprofile info during runtime) # capacity: # cpu: 2 # gpu: 0 # memory: 50Gi # machineLabels: # - name: my-label # value: foo # - name: my-rolling-label # value: bar # triggerRollingOnUpdate: true # means any change of the machine label value will trigger rolling of all machines of the worker pool ServerGroups When you specify the serverGroup section in your worker group configuration, a new server group will be created with the configured policy for each worker group that enabled this setting and all machines managed by this worker group will be assigned as members of the created server group.\nFor users to have access to the server group feature, it must be enabled on the CloudProfile by your operator. Existing clusters can take advantage of this feature by updating the server group configuration of their respective worker groups. Worker groups that are already configured with server groups can update their setting to change the policy used, or remove it altogether at any time.\nUsers must be aware that any change to the server group settings will result in a rolling deployment of new nodes for the affected worker group.\nPlease note the following restrictions when deploying workers with server groups:\n The serverGroup section is optional, but if it is included in the worker configuration, it must contain a valid policy value. The available policy values that can be used, are defined in the provider specific section of CloudProfile by your operator. Certain policy values may induce further constraints. Using the affinity policy is only allowed when the worker group utilizes a single zone. 
MachineLabels The machineLabels section in the worker group configuration allows you to specify additional machine labels. These labels are added to the machine instances only, but not to the node object. Additionally, they have an optional triggerRollingOnUpdate field. If it is set to true, changing the label value will trigger a rolling of all machines of this worker pool.\nNode Templates Node templates allow users to override the capacity of the nodes as defined by the server flavor specified in the CloudProfile’s machineTypes. This is useful for certain dynamic scenarios as it allows users to customize cluster-autoscaler’s behavior for these worker groups with their provided values.\nExample Shoot manifest (one availability zone) Please find below an example Shoot manifest for one availability zone:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-openstack namespace: garden-dev spec: cloudProfile: name: openstack region: europe-1 secretBindingName: core-openstack provider: type: openstack infrastructureConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig floatingPoolName: MY-FLOATING-POOL networks: workers: 10.250.0.0/19 controlPlaneConfig: apiVersion: openstack.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig loadBalancerProvider: haproxy workers: - name: worker-xoluy machine: type: medium_4_8 minimum: 2 maximum: 2 zones: - europe-1a networking: nodes: 10.250.0.0/16 type: calico kubernetes: version: 1.28.2 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true CSI volume provisioners Every OpenStack shoot cluster will be deployed with the OpenStack Cinder CSI driver. It is compatible with the legacy in-tree volume provisioner that was deprecated by the Kubernetes community and will be removed in future versions of Kubernetes. 
End-users might want to update their custom StorageClasses to the new cinder.csi.openstack.org provisioner.\nKubernetes Versions per Worker Pool This extension supports gardener/gardener’s WorkerPoolKubernetesVersion feature gate, i.e., having worker pools with overridden Kubernetes versions since gardener-extension-provider-openstack@v1.23.\nShoot CA Certificate and ServiceAccount Signing Key Rotation This extension supports gardener/gardener’s ShootCARotation and ShootSARotation feature gates since gardener-extension-provider-openstack@v1.26.\n","categories":"","description":"","excerpt":"Using the OpenStack provider extension with Gardener as end-user The …","ref":"/docs/extensions/infrastructure-extensions/gardener-extension-provider-openstack/usage/","tags":"","title":"Usage"},{"body":"Using the Networking Calico extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a networking field that is meant to contain network-specific configuration.\nIn this document we describe what this configuration looks like for Calico and provide an example Shoot manifest with minimal configuration that you can use to create a cluster.\nCalico Typha Calico Typha is an optional component of Project Calico designed to offload the Kubernetes API server. The Typha daemon sits between the datastore (such as the Kubernetes API server which is the one used by Gardener managed Kubernetes) and many instances of Felix. Typha’s main purpose is to increase scale by reducing each node’s impact on the datastore. You can opt out of Typha via .spec.networking.providerConfig.typha.enabled=false of your Shoot manifest. By default, Typha is enabled.\nEBPF Dataplane Calico can be run in ebpf dataplane mode. This has several benefits: calico scales to higher throughput, uses less cpu per GBit and has native support for kubernetes services (without needing kube-proxy). To switch to a pure ebpf dataplane it is recommended to run without an overlay network. 
The following configuration can be used to run without an overlay and without kube-proxy.\nAn example ebpf dataplane NetworkingConfig manifest:\napiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig ebpfDataplane: enabled: true overlay: enabled: false To disable kube-proxy set the enabled field to false in the shoot manifest.\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: ebpf-shoot namespace: garden-dev spec: kubernetes: kubeProxy: enabled: false Known limitations of the EBPF Dataplane Please note that the default settings for calico’s ebpf dataplane may interfere with accelerated networking in Azure, rendering nodes with accelerated networking unusable in the network. The reason for this is that calico does not ignore the accelerated networking interface enP... as it should, but applies its ebpf programs to it. A simple mitigation for this is to adapt the FelixConfiguration default and ensure that the bpfDataIfacePattern does not include enP.... By default, bpfDataIfacePattern is not set. The default value for this option can be found here. For example, you could apply the following change:\n$ kubectl edit felixconfiguration default ... apiVersion: crd.projectcalico.org/v1 kind: FelixConfiguration metadata: ... name: default ... spec: bpfDataIfacePattern: ^((en|wl|ww|sl|ib)[opsx].*|(eth|wlan|wwan).*|tunl0$|vxlan.calico$|wireguard.cali$|wg-v6.cali$) ... AutoScaling Autoscaling defines how the calico components are automatically scaled. It allows using either static resource assignment, vertical pod autoscaling, or the cluster-proportional autoscaler (default: cluster-proportional).\nThe cluster-proportional autoscaling mode is preferable when conditions require minimal disturbances and vpa mode for improved cluster resource utilization. 
Static resource assignment causes no disruptions due to autoscaling, but has no dynamics to handle changing demands.\nPlease note VPA must be enabled on the shoot as a pre-requisite to enabling vpa mode.\nAn example NetworkingConfig manifest for vertical pod autoscaling:\napiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig autoScaling: mode: \"vpa\" An example NetworkingConfig manifest for static resource assignment:\napiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig autoScaling: mode: \"static\" resources: node: cpu: 100m memory: 100Mi typha: cpu: 100m memory: 100Mi ℹ️ Please note that in static mode, you have the option to configure the resource requests for calico-node and calico-typha. If not specified, default settings will be used. If the resource requests are chosen too low, it might impact the stability/performance of the cluster. Specifying the resource requests for any other autoscaling mode has no effect.\n Example NetworkingConfig manifest An example NetworkingConfig for the Calico extension looks as follows:\napiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig ipam: type: host-local cidr: usePodCIDR vethMTU: 1440 typha: enabled: true overlay: enabled: true autoScaling: mode: \"vpa\" Example Shoot manifest Please find below an example Shoot manifest with calico networking configurations:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: johndoe-azure namespace: garden-dev spec: cloudProfileName: azure region: westeurope secretBindingName: core-azure provider: type: azure infrastructureConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureConfig networks: vnet: cidr: 10.250.0.0/16 workers: 10.250.0.0/19 zoned: true controlPlaneConfig: apiVersion: azure.provider.extensions.gardener.cloud/v1alpha1 kind: ControlPlaneConfig workers: - name: worker-xoluy machine: type: Standard_D4_v3 minimum: 2 maximum: 
2 volume: size: 50Gi type: Standard_LRS zones: - \"1\" - \"2\" networking: type: calico nodes: 10.250.0.0/16 providerConfig: apiVersion: calico.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig ipam: type: host-local vethMTU: 1440 overlay: enabled: true typha: enabled: false kubernetes: version: 1.28.3 maintenance: autoUpdate: kubernetesVersion: true machineImageVersion: true addons: kubernetesDashboard: enabled: true nginxIngress: enabled: true Known Limitations in conjunction with NodeLocalDNS If NodeLocalDNS is active in a shoot cluster, which uses calico as CNI without overlay network, it may be impossible to block DNS traffic to the cluster DNS server via network policy. This is due to FELIX_CHAININSERTMODE being set to APPEND instead of INSERT in case SNAT is being applied to requests to the infrastructure DNS server. In this scenario the iptables rules of NodeLocalDNS already accept the traffic before the network policies are checked.\nThis only applies to traffic directed to NodeLocalDNS. If blocking of all DNS traffic is desired via network policy the pod dnsPolicy should be changed to Default so that the cluster DNS is not used. Alternatives are usage of overlay network or disabling of NodeLocalDNS.\n","categories":"","description":"","excerpt":"Using the Networking Calico extension with Gardener as end-user The …","ref":"/docs/extensions/network-extensions/gardener-extension-networking-calico/usage/","tags":"","title":"Usage"},{"body":"Using the Networking Cilium extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a networking field that is meant to contain network-specific configuration.\nIn this document we describe what this configuration looks like for Cilium and provide an example Shoot manifest with minimal configuration that you can use to create a cluster.\nCilium Hubble Hubble is a fully distributed networking and security observability platform built on top of Cilium and BPF. 
It is optional and is deployed to the cluster when enabled in the NetworkConfig. If the dashboard is not externally exposed\nkubectl port-forward -n kube-system deployment/hubble-ui 8081 can be used to access it locally.\nExample NetworkingConfig manifest An example NetworkingConfig for the Cilium extension looks as follows:\napiVersion: cilium.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig hubble: enabled: true #debug: false #tunnel: vxlan #store: kubernetes NetworkingConfig options The hubble.enabled field describes whether hubble should be deployed into the cluster or not (default).\nThe debug field describes whether you want to run cilium in debug mode or not (default), change this value to true to use debug mode.\nThe tunnel field describes the encapsulation mode for communication between nodes. Possible values are vxlan (default), geneve or disabled.\nThe bpfSocketLBHostnsOnly.enabled field describes whether socket LB will be skipped for services when inside a pod namespace (default), in favor of service LB at the pod interface. Socket LB is still used when in the host namespace. This feature is required when using cilium with a service mesh like istio or linkerd.\nSetting the field cni.exclusive to false might be useful when additional plugins, such as Istio or Linkerd, wish to chain after Cilium. This action disables the default behavior of Cilium, which is to overwrite changes to the CNI configuration file.\nThe egressGateway.enabled field describes whether egress gateways are enabled or not (default). To use this feature kube-proxy must be disabled. This can be done with the following configuration in the Shoot:\nspec: kubernetes: kubeProxy: enabled: false The egress gateway feature is only supported in gardener with an overlay network (shoot.spec.networking.providerConfig.overlay.enabled: true) at the moment. This is because bpf masquerading is required for the egress gateway feature. 
Once the overlay network is enabled bpf.masquerade is set to true in the cilium configmap.\nThe snatToUpstreamDNS.enabled field describes whether the traffic to the upstream dns server should be masqueraded or not (default). This is needed on some infrastructures where traffic to the dns server with the pod CIDR range is blocked.\nExample Shoot manifest Please find below an example Shoot manifest with cilium networking configuration:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: aws-cilium namespace: garden-dev spec: networking: type: cilium providerConfig: apiVersion: cilium.networking.extensions.gardener.cloud/v1alpha1 kind: NetworkConfig hubble: enabled: true pods: 100.96.0.0/11 nodes: 10.250.0.0/16 services: 100.64.0.0/13 ... If you would like to see a provider specific shoot example, please check out the documentation of the well-known extensions. A list of them can be found here.\n","categories":"","description":"","excerpt":"Using the Networking Cilium extension with Gardener as end-user The …","ref":"/docs/extensions/network-extensions/gardener-extension-networking-cilium/usage/","tags":"","title":"Usage"},{"body":"Using the CoreOS extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that must be considered when this OS extension is used.\nIn this document we describe what this configuration looks like and under which circumstances your attention may be required.\nAWS VPC settings for CoreOS workers Gardener allows you to create CoreOS based worker nodes by:\n Using a Gardener managed VPC Reusing a VPC that already exists (VPC id specified in InfrastructureConfig) If the second option applies to your use-case please make sure that your VPC has enabled DNS Support. 
Otherwise CoreOS based nodes aren’t able to join or operate in your cluster properly.\nDNS settings (required):\n enableDnsHostnames: true (necessary for collecting node metrics) enableDnsSupport: true ","categories":"","description":"","excerpt":"Using the CoreOS extension with Gardener as end-user The …","ref":"/docs/extensions/os-extensions/gardener-extension-os-coreos/usage/","tags":"","title":"Usage"},{"body":"Using the SuSE CHost extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that must be considered when this OS extension is used.\nIn this document we describe what this configuration looks like and under which circumstances your attention may be required.\nAWS VPC settings for SuSE CHost workers Gardener allows you to create SuSE CHost based worker nodes by:\n Using a Gardener managed VPC Reusing a VPC that already exists (VPC id specified in InfrastructureConfig) If the second option applies to your use-case please make sure that your VPC has enabled DNS Support. Otherwise SuSE CHost based nodes aren’t able to join or operate in your cluster properly.\nDNS settings (required):\n enableDnsHostnames: true enableDnsSupport: true Support for vSMP MemoryOne This extension controller is also capable of generating user-data for the vSMP MemoryOne operating system in conjunction with SuSE CHost. It reacts to the memoryone-chost extension type. Additionally, it allows certain customizations with the following configuration:\napiVersion: memoryone-chost.os.extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfiguration memoryTopology: \"3\" systemMemory: \"7x\" The memoryTopology field controls the mem_topology setting. If it’s not provided then it will default to 2. The systemMemory field controls the system_memory setting. If it’s not provided then it defaults to 6x. Please note that it was only e2e-tested on AWS. 
Additionally, you need a snapshot ID of a SuSE CHost/CHost volume (see below how to create it).\nAn exemplary worker pool configuration inside a Shoot resource for the vSMP MemoryOne operating system would look as follows:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: vsmp-memoryone namespace: garden-foo spec: ... workers: - name: cpu-worker3 minimum: 1 maximum: 1 maxSurge: 1 maxUnavailable: 0 machine: image: name: memoryone-chost version: 9.5.195 providerConfig: apiVersion: memoryone-chost.os.extensions.gardener.cloud/v1alpha1 kind: OperatingSystemConfiguration memoryTopology: \"2\" systemMemory: \"6x\" type: c5d.metal volume: size: 20Gi type: gp2 dataVolumes: - name: chost size: 50Gi type: gp2 providerConfig: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: WorkerConfig dataVolumes: - name: chost snapshotID: snap-123456 zones: - eu-central-1b Please note that vSMP MemoryOne only works for EC2 bare-metal instance types such as M5d, R5, C5, C5d, etc. - please consult the EC2 instance types overview page and the documentation of vSMP MemoryOne to find out whether the instance type in question is eligible.\nGenerating an AWS snapshot ID for the CHost/CHost operating system The following script will help to generate the snapshot ID on AWS. It runs in the region that is selected in your $HOME/.aws/config file. Consequently, if you want to generate the snapshot in multiple regions, you have to run it multiple times after configuring the respective region using aws configure.\nami=\"ami-1234\" #Replace the ami with the intended one. name=`aws ec2 describe-images --image-ids $ami --query=\"Images[].Name\" --output=text` cur=`aws ec2 describe-snapshots --filters=\"Name=description,Values=snap-$name\" --query=\"Snapshots[].Description\" --output=text` if [ -n \"$cur\" ]; then echo \"AMI $name exists as snapshot $cur\" exit 0 fi echo \"AMI $name... 
creating private snapshot\" inst=`aws ec2 run-instances --instance-type t3.nano --image-id $ami --query 'Instances[0].InstanceId' --output=text --subnet-id subnet-1234 --tag-specifications 'ResourceType=instance,Tags=[{Key=scalemp-test,Value=scalemp-test}]'` #Replace the subnet-id with the intended one. aws ec2 wait instance-running --instance-ids $inst vol=`aws ec2 describe-instances --instance-ids $inst --query \"Reservations[].Instances[].BlockDeviceMappings[0].Ebs.VolumeId\" --output=text` snap=`aws ec2 create-snapshot --description \"snap-$name\" --volume-id $vol --query='SnapshotId' --tag-specifications \"ResourceType=snapshot,Tags=[{Key=Name,Value=\\\"$name\\\"}]\" --output=text` aws ec2 wait snapshot-completed --snapshot-ids $snap aws ec2 terminate-instances --instance-ids $inst \u003e /dev/null echo $snap ","categories":"","description":"","excerpt":"Using the SuSE CHost extension with Gardener as end-user The …","ref":"/docs/extensions/os-extensions/gardener-extension-os-suse-chost/usage/","tags":"","title":"Usage"},{"body":"Using the Ubuntu extension with Gardener as end-user The core.gardener.cloud/v1beta1.Shoot resource declares a few fields that must be considered when this OS extension is used.\nIn this document we describe what this configuration looks like and under which circumstances your attention may be required.\nAWS VPC settings for Ubuntu workers Gardener allows you to create Ubuntu based worker nodes by:\n Using a Gardener managed VPC Reusing a VPC that already exists (VPC id specified in InfrastructureConfig) If the second option applies to your use-case please make sure that your VPC has enabled DNS Support. 
Otherwise Ubuntu based nodes aren’t able to join or operate in your cluster properly.\nDNS settings (required):\n enableDnsHostnames: true enableDnsSupport: true ","categories":"","description":"","excerpt":"Using the Ubuntu extension with Gardener as end-user The …","ref":"/docs/extensions/os-extensions/gardener-extension-os-ubuntu/usage/","tags":"","title":"Usage"},{"body":"Disclaimer This post is meant to give a basic end-to-end description for deploying and using Prometheus and Grafana. Both applications offer a wide range of flexibility, which needs to be considered in case you have specific requirements. Such advanced details are not in the scope of this topic.\nIntroduction Prometheus is an open-source systems monitoring and alerting toolkit for recording numeric time series. It fits both machine-centric monitoring as well as monitoring of highly dynamic service-oriented architectures. In a world of microservices, its support for multi-dimensional data collection and querying is a particular strength.\nPrometheus is the second hosted project to graduate within CNCF.\nThe following characteristics make Prometheus a good match for monitoring Kubernetes clusters:\n Pull-based Monitoring Prometheus is a pull-based monitoring system, which means that the Prometheus server dynamically discovers and pulls metrics from your services running in Kubernetes.\n Labels Prometheus and Kubernetes share the same label (key-value) concept that can be used to select objects in the system.\nLabels are used to identify time series and sets of label matchers can be used in the query language (PromQL) to select the time series to be aggregated.\n Exporters\nThere are many exporters available, which enable integration of databases or even other monitoring systems not already providing a way to export metrics to Prometheus. 
One prominent exporter is the so-called node-exporter, which allows monitoring of hardware and OS related metrics of Unix systems.\n Powerful Query Language The Prometheus query language PromQL lets the user select and aggregate time series data in real time. Results can either be shown as a graph, viewed as tabular data in the Prometheus expression browser, or consumed by external systems via the HTTP API.\n Find query examples on Prometheus Query Examples.\nOne very popular open-source visualization tool, not only for Prometheus, is Grafana. Grafana is a metric analytics and visualization suite. It is popular for visualizing time series data for infrastructure and application analytics but many use it in other domains including industrial sensors, home automation, weather, and process control. For more information, see the Grafana Documentation.\nGrafana accesses data via Data Sources. The continuously growing list of supported backends includes Prometheus.\nDashboards are created by combining panels, e.g., Graph and Dashlist.\nIn this example, we describe an End-To-End scenario including the deployment of Prometheus and a basic monitoring configuration like the one provided for Kubernetes clusters created by Gardener.\nIf you miss elements on the Prometheus web page when accessing it via its service URL https://\u003cyour K8s FQN\u003e/api/v1/namespaces/\u003cyour-prometheus-namespace\u003e/services/prometheus-prometheus-server:80/proxy, this is probably caused by a Prometheus issue - #1583. To work around this issue, set up a port forward kubectl port-forward -n \u003cyour-prometheus-namespace\u003e \u003cprometheus-pod\u003e 9090:9090 on your client and access the Prometheus UI from there with your locally installed web browser. 
This issue is not relevant in case you use the service type LoadBalancer.\nPreparation The deployment of Prometheus and Grafana is based on Helm charts.\nMake sure to implement the Helm settings before deploying the Helm charts.\nThe Kubernetes clusters provided by Gardener use role based access control (RBAC). To authorize the Prometheus node-exporter to access hardware and OS relevant metrics of your cluster’s worker nodes, specific artifacts need to be deployed.\nBind the Prometheus service account to the garden.sapcloud.io:monitoring:prometheus cluster role by running the command kubectl apply -f crbinding.yaml.\nContent of crbinding.yaml\napiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: \u003cyour-prometheus-name\u003e-server roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: garden.sapcloud.io:monitoring:prometheus subjects: - kind: ServiceAccount name: \u003cyour-prometheus-name\u003e-server namespace: \u003cyour-prometheus-namespace\u003e Deployment of Prometheus and Grafana Only minor changes are needed to deploy Prometheus and Grafana based on Helm charts.\nCopy the following configuration into a file called values.yaml and deploy Prometheus: helm install \u003cyour-prometheus-name\u003e --namespace \u003cyour-prometheus-namespace\u003e stable/prometheus -f values.yaml\nTypically, Prometheus and Grafana are deployed into the same namespace. 
There is no technical reason behind this, so feel free to choose different namespaces.\nContent of values.yaml for Prometheus:\nrbac: create: false # Already created in Preparation step nodeExporter: enabled: false # The node-exporter is already deployed by default server: global: scrape_interval: 30s scrape_timeout: 30s serverFiles: prometheus.yml: rule_files: - /etc/config/rules - /etc/config/alerts scrape_configs: - job_name: 'kube-kubelet' honor_labels: false scheme: https tls_config: # This is needed because the kubelets' certificates are not generated # for a specific pod IP insecure_skip_verify: true bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token kubernetes_sd_configs: - role: node relabel_configs: - target_label: __metrics_path__ replacement: /metrics - source_labels: [__meta_kubernetes_node_address_InternalIP] target_label: instance - action: labelmap regex: __meta_kubernetes_node_label_(.+) - job_name: 'kube-kubelet-cadvisor' honor_labels: false scheme: https tls_config: # This is needed because the kubelets' certificates are not generated # for a specific pod IP insecure_skip_verify: true bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token kubernetes_sd_configs: - role: node relabel_configs: - target_label: __metrics_path__ replacement: /metrics/cadvisor - source_labels: [__meta_kubernetes_node_address_InternalIP] target_label: instance - action: labelmap regex: __meta_kubernetes_node_label_(.+) # Example scrape config for probing services via the Blackbox Exporter. 
# # Relabelling allows to configure the actual service scrape endpoint using the following annotations: # # * `prometheus.io/probe`: Only probe services that have a value of `true` - job_name: 'kubernetes-services' metrics_path: /probe params: module: [http_2xx] kubernetes_sd_configs: - role: service relabel_configs: - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe] action: keep regex: true - source_labels: [__address__] target_label: __param_target - target_label: __address__ replacement: blackbox - source_labels: [__param_target] target_label: instance - action: labelmap regex: __meta_kubernetes_service_label_(.+) - source_labels: [__meta_kubernetes_namespace] target_label: kubernetes_namespace - source_labels: [__meta_kubernetes_service_name] target_label: kubernetes_name # Example scrape config for pods # # Relabelling allows to configure the actual service scrape endpoint using the following annotations: # # * `prometheus.io/scrape`: Only scrape pods that have a value of `true` # * `prometheus.io/path`: If the metrics path is not `/metrics` override this. # * `prometheus.io/port`: Scrape the pod on the indicated port instead of the default of `9102`. - job_name: 'kubernetes-pods' kubernetes_sd_configs: - role: pod relabel_configs: - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape] action: keep regex: true - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path] action: replace target_label: __metrics_path__ regex: (.+) - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port] action: replace regex: (.+):(?:\\d+);(\\d+) replacement: ${1}:${2} target_label: __address__ - action: labelmap regex: __meta_kubernetes_pod_label_(.+) - source_labels: [__meta_kubernetes_namespace] action: replace target_label: kubernetes_namespace - source_labels: [__meta_kubernetes_pod_name] action: replace target_label: kubernetes_pod_name # Scrape config for service endpoints. 
# # The relabeling allows the actual service scrape endpoint to be configured # via the following annotations: # # * `prometheus.io/scrape`: Only scrape services that have a value of `true` # * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need # to set this to `https` \u0026 most likely set the `tls_config` of the scrape config. # * `prometheus.io/path`: If the metrics path is not `/metrics` override this. # * `prometheus.io/port`: If the metrics are exposed on a different port to the # service then set this appropriately. - job_name: 'kubernetes-service-endpoints' kubernetes_sd_configs: - role: endpoints relabel_configs: - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape] action: keep regex: true - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme] action: replace target_label: __scheme__ regex: (https?) - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path] action: replace target_label: __metrics_path__ regex: (.+) - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port] action: replace target_label: __address__ regex: (.+)(?::\\d+);(\\d+) replacement: $1:$2 - action: labelmap regex: __meta_kubernetes_service_label_(.+) - source_labels: [__meta_kubernetes_namespace] action: replace target_label: kubernetes_namespace - source_labels: [__meta_kubernetes_service_name] action: replace target_label: kubernetes_name # Add your additional configuration here... Next, deploy Grafana. Since the deployment in this post is based on the Helm default values, the settings below are set explicitly in case the default changed.\nDeploy Grafana via helm install grafana --namespace \u003cyour-prometheus-namespace\u003e stable/grafana -f values.yaml. 
Here, the same namespace is chosen for Prometheus and for Grafana.\nContent of values.yaml for Grafana:\nserver: ingress: enabled: false service: type: ClusterIP Check the running state of the pods on the Kubernetes Dashboard or by running kubectl get pods -n \u003cyour-prometheus-namespace\u003e. In case of errors, check the log files of the pod(s) in question.\nThe text output of Helm after the deployment of Prometheus and Grafana contains very useful information, e.g., the user and password of the Grafana Admin user. The credentials are stored as secrets in the namespace \u003cyour-prometheus-namespace\u003e and can be decoded via kubectl get secret --namespace \u003cyour-prometheus-namespace\u003e grafana -o jsonpath=\"{.data.admin-password}\" | base64 --decode ; echo.\nBasic Functional Tests To access the web UIs of both applications, use port forwarding (port 9090 for Prometheus and port 3000 for Grafana).\nSet up port forwarding for port 9090:\nkubectl port-forward -n \u003cyour-prometheus-namespace\u003e \u003cyour-prometheus-server-pod\u003e 9090:9090 Open http://localhost:9090 in your web browser. Select Graph from the top tab and enter the following expression to show the overall CPU usage for a server (see Prometheus Query Examples):\n100 * (1 - avg by(instance)(irate(node_cpu{mode='idle'}[5m]))) This should show some data in a graph.\nTo show the same data in Grafana, set up port forwarding for port 3000 for the Grafana pod and open the Grafana Web UI by opening http://localhost:3000 in a browser. Enter the credentials of the admin user.\nNext, you need to enter the server name of your Prometheus deployment. This name is shown directly after the installation via helm.\nRun\nhelm status \u003cyour-prometheus-name\u003e to find this name. 
Below, this server name is referenced by \u003cyour-prometheus-server-name\u003e.\nFirst, you need to add your Prometheus server as data source:\n Navigate to Dashboards → Data Sources Choose Add data source Enter: Name: \u003cyour-prometheus-datasource-name\u003e\nType: Prometheus\nURL: http://\u003cyour-prometheus-server-name\u003e\nAccess: proxy Choose Save \u0026 Test In case of failure, check the Prometheus URL in the Kubernetes Dashboard.\nTo add a Graph follow these steps:\n In the left corner, select Dashboards → New to create a new dashboard Select Graph to create a new graph Next, select the Panel Title → Edit Select your Prometheus Data Source in the drop down list Enter the expression 100 * (1 - avg by(instance)(irate(node_cpu{mode='idle'}[5m]))) in the entry field A Select the floppy disk symbol (Save) on top Now you should have a very basic Prometheus and Grafana setup for your Kubernetes cluster.\nAs a next step you can implement monitoring for your applications by implementing the Prometheus client API.\nRelated Links Prometheus Prometheus Helm Chart Grafana Grafana Helm Chart ","categories":"","description":"How to deploy and configure Prometheus and Grafana to collect and monitor kubelet container metrics","excerpt":"How to deploy and configure Prometheus and Grafana to collect and …","ref":"/docs/guides/applications/prometheus/","tags":"","title":"Using Prometheus and Grafana to Monitor K8s"},{"body":"Using the Dashboard Terminal The dashboard features an integrated web-based terminal to your clusters. It allows you to use kubectl without the need to supply kubeconfig. There are several ways to access it and they’re described on this page.\nPrerequisites You are logged on to the Gardener Dashboard. You have created a cluster and its status is operational. 
The landscape administrator has enabled the terminal feature The cluster you want to connect to is reachable from the dashboard On this page:\n Open from cluster list Open from cluster details page Terminal Open from cluster list Choose your project from the menu on the left and choose CLUSTERS.\n Locate a cluster for which you want to open a Terminal and choose the key icon.\n In the dialog, choose the icon on the right of the Terminal label.\n Open from cluster details page Choose your project from the menu on the left and choose CLUSTERS.\n Locate a cluster for which you want to open a Terminal and choose to display its details.\n In the Access section, choose the icon on the right of the Terminal label.\n Terminal Opening up the terminal in either of the ways discussed here results in the following screen:\nIt provides a bash environment and range of useful tools and an installed and configured kubectl (with alias k) to use right away with your cluster.\nTry to list the namespaces in the cluster.\n$ k get ns You get a result like this: ","categories":"","description":"","excerpt":"Using the Dashboard Terminal The dashboard features an integrated …","ref":"/docs/dashboard/using-terminal/","tags":"","title":"Using Terminal"},{"body":"Version Skew Policy This document describes the maximum version skew supported between various Gardener components.\nSupported Gardener Versions Gardener versions are expressed as x.y.z, where x is the major version, y is the minor version, and z is the patch version, following Semantic Versioning terminology.\nThe Gardener project maintains release branches for the three most recent minor releases.\nApplicable fixes, including security fixes, may be backported to those three release branches, depending on severity and feasibility. 
Patch releases are cut from those branches at a regular cadence, plus additional urgent releases when required.\nFor more information, see the Releases document.\nSupported Version Skew Technically, we follow the same policy as the Kubernetes project. However, given that our release cadence is much more frequent compared to Kubernetes (every 14d vs. every 120d), in many cases it might be possible to skip versions, though we do not test these upgrade paths. Consequently, in general it might not work, and to be on the safe side, it is highly recommended to follow the described policy.\n🚨 Note that downgrading Gardener versions is generally not tested during development and should be considered unsupported.\ngardener-apiserver In multi-instance setups of Gardener, the newest and oldest gardener-apiserver instances must be within one minor version.\nExample:\n newest gardener-apiserver is at 1.37 other gardener-apiserver instances are supported at 1.37 and 1.36 gardener-controller-manager, gardener-scheduler, gardener-admission-controller gardener-controller-manager, gardener-scheduler, and gardener-admission-controller must not be newer than the gardener-apiserver instances they communicate with. 
They are expected to match the gardener-apiserver minor version, but may be up to one minor version older (to allow live upgrades).\nExample:\n gardener-apiserver is at 1.37 gardener-controller-manager, gardener-scheduler, and gardener-admission-controller are supported at 1.37 and 1.36 gardenlet gardenlet must not be newer than gardener-apiserver gardenlet may be up to two minor versions older than gardener-apiserver Example:\n gardener-apiserver is at 1.37 gardenlet is supported at 1.37, 1.36, and 1.35 gardener-operator Since gardener-operator manages the Gardener control plane components (gardener-apiserver, gardener-controller-manager, gardener-scheduler, gardener-admission-controller), it follows the same policy as for gardener-apiserver.\nIt implements additional start-up checks to ensure adherence to this policy. Concretely, gardener-operator will crash when\n it gets downgraded. its version gets upgraded and skips at least one minor version. Supported Component Upgrade Order The supported version skew between components has implications on the order in which components must be upgraded. This section describes the order in which components must be upgraded to transition an existing Gardener installation from version 1.37 to version 1.38.\ngardener-apiserver Prerequisites:\n In a single-instance setup, the existing gardener-apiserver instance is 1.37. In a multi-instance setup, all gardener-apiserver instances are at 1.37 or 1.38 (this ensures maximum skew of 1 minor version between the oldest and newest gardener-apiserver instance). The gardener-controller-manager, gardener-scheduler, and gardener-admission-controller instances that communicate with this gardener-apiserver are at version 1.37 (this ensures they are not newer than the existing API server version and are within 1 minor version of the new API server version). 
gardenlet instances on all seeds are at version 1.37 or 1.36 (this ensures they are not newer than the existing API server version and are within 2 minor versions of the new API server version). Actions:\n Upgrade gardener-apiserver to 1.38. gardener-controller-manager, gardener-scheduler, gardener-admission-controller Prerequisites:\n The gardener-apiserver instances these components communicate with are at 1.38 (in multi-instance setups in which these components can communicate with any gardener-apiserver instance in the cluster, all gardener-apiserver instances must be upgraded before upgrading these components). Actions:\n Upgrade gardener-controller-manager, gardener-scheduler, and gardener-admission-controller to 1.38 gardenlet Prerequisites:\n The gardener-apiserver instances the gardenlet communicates with are at 1.38. Actions:\n Optionally upgrade gardenlet instances to 1.38 (or they can be left at 1.37 or 1.36). [!WARNING] Running a landscape with gardenlet instances that are persistently two minor versions behind gardener-apiserver means they must be upgraded before the Gardener control plane can be upgraded.\n gardener-operator Prerequisites:\n All gardener-operator instances are at 1.37. Actions:\n Upgrade gardener-operator to 1.38. Supported Gardener Extension Versions Extensions are maintained and released separately and independently of the gardener/gardener repository. Consequently, providing version constraints is not possible in this document. Sometimes, the documentation of extensions contains compatibility information (e.g., “this extension version is only compatible with Gardener versions higher than 1.80”, see this example).\nHowever, since all extensions typically make use of the extensions library (example), a general constraint is that no extension must depend on a version of the extensions library higher than the version of gardenlet.\nExample 1:\n gardener-apiserver and other Gardener control plane components are at 1.37. 
All gardenlets are at 1.37. Only extensions are supported which depend on 1.37 or lower of the extensions library. Example 2:\n gardener-apiserver and other Gardener control plane components are at 1.37. Some gardenlets are at 1.37, others are at 1.36. Only extensions are supported which depend on 1.36 or lower of the extensions library. Supported Kubernetes Versions Please refer to Supported Kubernetes Versions.\n","categories":"","description":"","excerpt":"Version Skew Policy This document describes the maximum version skew …","ref":"/docs/gardener/deployment/version_skew_policy/","tags":"","title":"Version Skew Policy"},{"body":"Webhooks The etcd-druid controller-manager registers certain admission webhooks that allow for validation or mutation of requests on resources in the cluster, in order to prevent misconfiguration and restrict access to the etcd cluster resources.\nAll webhooks that are a part of etcd-druid reside in package internal/webhook, as sub-packages.\nPackage Structure The typical package structure for the webhooks that are part of etcd-druid is shown with the EtcdComponents Webhook:\ninternal/webhook/etcdcomponents ├── config.go ├── handler.go └── register.go config.go: contains all the logic for the configuration of the webhook, including feature gate activations, CLI flag parsing and validations. register.go: contains the logic for registering the webhook with the etcd-druid controller manager. handler.go: contains the webhook admission handler logic. Each webhook package may also contain auxiliary files which are relevant to that specific webhook.\nEtcd Components Webhook Druid controller-manager registers and runs the etcd controller, which creates and manages various components/resources such as Leases, ConfigMaps, and the Statefulset for the etcd cluster. 
It is essential for all these resources to contain correct configuration for the proper functioning of the etcd cluster.\nUnintended changes to any of these managed resources can lead to misconfiguration of the etcd cluster, leading to unwanted downtime for etcd traffic. To prevent such unintended changes, a validating webhook called EtcdComponents Webhook guards these managed resources, ensuring that only authorized entities can perform operations on these managed resources.\nEtcdComponents webhook prevents UPDATE and DELETE operations on all resources managed by etcd controller, unless such an operation is performed by druid itself, and during reconciliation of the Etcd resource. Operations are also allowed if performed by one of the authorized entities specified by CLI flag --etcd-components-webhook-exempt-service-accounts, but only if the Etcd resource is not being reconciled by etcd-druid at that time.\nThere may be specific cases where a human operator may need to make changes to the managed resources, possibly to test or fix an etcd cluster. An example of this is recovery from permanent quorum loss, where a human operator will need to suspend reconciliation of the Etcd resource, make changes to the underlying managed resources such as StatefulSet and ConfigMap, and then resume reconciliation for the Etcd resource. Such manual interventions will require out-of-band changes to the managed resources. Protection of managed resources for such Etcd resources can be turned off by adding an annotation druid.gardener.cloud/disable-etcd-component-protection on the Etcd resource. 
This will effectively disable EtcdComponents Webhook protection for all managed resources for the specific Etcd.\nNote: UPDATE operations for Leases by etcd members are always allowed, since these are regularly updated by the etcd-backup-restore sidecar.\nThe Etcd Components Webhook is disabled by default, and can be enabled via the CLI flag --enable-etcd-components-webhook.\n","categories":"","description":"","excerpt":"Webhooks The etcd-druid controller-manager registers certain admission …","ref":"/docs/other-components/etcd-druid/concepts/webhooks/","tags":"","title":"Webhooks"},{"body":"Webterminals Architecture Overview Motivation We want to give garden operators and “regular” users of the Gardener dashboard an easy way to have a preconfigured shell directly in the browser.\nThis has several advantages:\n no need to set up any tools locally no need to download / store kubeconfigs locally Each terminal session will have its own “access” service account created. This makes it easier to see “who” did “what” when using the web terminals. The “access” service account is deleted when the terminal session expires Easy “privileged” access to a node (privileged container, hostPID, and hostNetwork enabled, mounted host root fs) for troubleshooting a node, if allowed by PSP. How it’s done - TL;DR On the host cluster, we schedule a pod to which the dashboard frontend client attaches (similar to kubectl attach). Usually the ops-toolbelt image is used, containing all relevant tools like kubectl. 
The Pod has a kubeconfig secret mounted with the necessary privileges for the target cluster - usually cluster-admin.\nTarget types There are currently three targets, where a user can open a terminal session to:\n The (virtual) garden cluster - Currently operator only The shoot cluster The control plane of the shoot cluster - operator only Host Several factors determine which host cluster (and namespace) is chosen by the dashboard:\n The host is chosen depending on the selected target and the role of the user (operator or “regular” user). For performance / low latency reasons, we want to place the “terminal” pods as near as possible to the target kube-apiserver. For example, the user wants to have a terminal for a shoot cluster. The kube-apiserver of the shoot is running in the seed-shoot-ns on the seed.\n If the user is an operator, we place the “terminal” pod directly in the seed-shoot-ns on the seed. However, if the user is a “regular” user, we don’t want to have “untrusted” workload scheduled on the seeds, that’s why the “terminal” pod is scheduled on the shoot itself, in a temporary namespace that is deleted afterwards. Lifecycle of a Web Terminal Session 1. Browser / Dashboard Frontend - Open Terminal User chooses the target and clicks in the browser on Open terminal button. A POST request is made to the dashboard backend to request a new terminal session.\n2. Dashboard Backend - Create Terminal Resource According to the privileges of the user (operator / enduser) and the selected target, the dashboard backend creates a terminal resource on behalf of the user in the (virtual) garden and responds with a handle to the terminal session.\n3. Browser / Dashboard Frontend The frontend makes another POST request to the dashboard backend to fetch the terminal session. The Backend waits until the terminal resource is in a “ready” state (timeout 10s) before sending a response to the frontend. More on that later.\n4. 
Terminal Resource The terminal resource, among other things, holds the information of the desired host and target cluster. The credentials to these clusters are declared as references (secretRef / serviceAccountRef). The terminal resource itself doesn’t contain sensitive information.\n5. Admission A validating webhook is in place to ensure that the user, that created the terminal resource, has the permission to read the referenced credentials. There is also a mutating webhook in place. Both admission configurations have failurePolicy: Fail.\n6. Terminal-Controller-Manager - Apply Resources on Host \u0026 Target Cluster Sidenote: The terminal-controller-manager has no knowledge about the gardener, its shoots, and seeds. In that sense it can be considered as independent from the gardener.\nThe terminal-controller-manager watches terminal resources and ensures the desired state on the host and target cluster. The terminal-controller-manager needs the permission to read all secrets / service accounts in the virtual garden. As additional safety net, the terminal-controller-manager ensures that the terminal resource was not created before the admission configurations were created.\nThe terminal-controller-manager then creates the necessary resources in the host and target cluster.\n Target Cluster: “Access” service account + (cluster)rolebinding usually to cluster-admin cluster role used from within the “terminal” pod Host Cluster: “Attach” service Account + rolebinding to “attach” cluster role (privilege to attach and get pod) will be used by the browser to attach to the pod Kubeconfig secret, containing the “access” token from the target cluster The “terminal” pod itself, having the kubeconfig secret mounted 7. Dashboard Backend - Responds to Frontend As mentioned in step 3, the dashboard backend waits until the terminal resource is “ready”. It then reads the “attach” token from the host cluster on behalf of the user. 
It responds with:\n attach token hostname of the host cluster’s api server name of the pod and namespace 8. Browser / Dashboard Frontend - Attach to Pod Dashboard frontend attaches to the pod located on the host cluster by opening a WebSocket connection using the provided parameter and credentials. As long as the terminal window is open, the dashboard regularly annotates the terminal resource (heartbeat) to keep it alive.\n9. Terminal-Controller-Manager - Cleanup When there is no heartbeat on the terminal resource for a certain amount of time (default is 5m) the created resources in the host and target cluster are cleaned up again and the terminal resource will be deleted.\nBrowser Trusted Certificates for Kube-Apiservers When the dashboard frontend opens a secure WebSocket connection to the kube-apiserver, the certificate presented by the kube-apiserver must be browser trusted. Otherwise, the connection can’t be established due to browser policy. Most kube-apiservers have self-signed certificates from a custom Root CA.\nThe Gardener project now handles the responsibility of exposing the kube-apiservers with browser trusted certificates for Seeds (gardener/gardener#7764) and Shoots (gardener/gardener#7712). For this to work, a Secret must exist in the garden namespace of the Seed cluster. This Secret should have a label gardener.cloud/role=controlplane-cert. The Secret is expected to contain the wildcard certificate for Seeds ingress domain.\nAllowlist for Hosts Motivation When a user starts a terminal session, the dashboard frontend establishes a secure WebSocket connection to the corresponding kube-apiserver. This connection is controlled by the connectSrc directive of the content security policy, which governs the hosts that the browser can connect to.\nBy default, the connectSrc directive only permits connections to the same host. However, to enable the webterminal feature to function properly, connections to additional trusted hosts are required. 
This is where the allowedHostSourceList configuration becomes relevant. It directly impacts the connectSrc directive by specifying the hostnames that the browser is allowed to connect to during a terminal session. By defining this list, you can extend the range of terminal connections to include the necessary trusted hosts, while still preventing any unauthorized or potentially harmful connections.\nConfiguration The allowedHostSourceList can be configured within the global.terminal section of the gardener-dashboard Helm values.yaml file. The list should consist of permitted hostnames (without the scheme) for terminal connections.\nIt is important to consider that the usage of wildcards follows the rules defined by the content security policy.\nHere is an example of how to configure the allowedHostSourceList:\nglobal: terminal: allowedHostSourceList: - \"*.seed.example.com\" In this example, any host under the seed.example.com domain is allowed for terminal connections.\n","categories":"","description":"","excerpt":"Webterminals Architecture Overview Motivation We want to give garden …","ref":"/docs/dashboard/webterminals/","tags":"","title":"Webterminals"},{"body":"Weeder Overview Weeder watches for an update to service endpoints and on receiving such an event it will create a time-bound watch for all configured dependent pods that need to be actively recovered in case they have not yet recovered from CrashLoopBackoff state. In a nutshell it accelerates recovery of pods when an upstream service recovers.\nAn interference in automatic recovery for dependent pods is required because kubernetes pod restarts a container with an exponential backoff when the pod is in CrashLoopBackOff state. This backoff could become quite large if the service stays down for long. 
Weeder prevents this by restarting the pod.\nPrerequisites Before we understand how Weeder works, we need to be familiar with Kubernetes services \u0026 endpoints.\n NOTE: If a Kubernetes service is created with selectors, then Kubernetes will create a corresponding endpoint resource which has the same name as the service. In the weeder implementation, service and endpoint names are used interchangeably.\n Config Weeder can be configured via command line arguments and a weeder configuration. See configure weeder.\nInternals Weeder keeps a watch on the events for the specified endpoints in the config. For every endpoints resource, a list of podSelectors can be specified. It creates a weeder object per endpoints resource when it receives a satisfactory Create or Update event. Then for every podSelector it creates a goroutine. This goroutine keeps a watch on the pods with labels as per the podSelector and kills any pod which turns into CrashLoopBackOff. Each weeder lives for the watchDuration interval, which has a default value of 5 minutes if not explicitly set.\nTo understand the actions taken by the weeder, let’s use the following diagram as a reference. Let us also assume the following configuration for the weeder:\nwatchDuration: 2m0s servicesAndDependantSelectors: etcd-main-client: # name of the service/endpoint for etcd statefulset that weeder will receive events for. podSelectors: # all pods matching the label selector are direct dependencies for etcd service - matchExpressions: - key: gardener.cloud/role operator: In values: - controlplane - key: role operator: In values: - apiserver kube-apiserver: # name of the service/endpoint for kube-api-server pods that weeder will receive events for. 
podSelectors: # all pods matching the label selector are direct dependencies for kube-api-server service - matchExpressions: - key: gardener.cloud/role operator: In values: - controlplane - key: role operator: NotIn values: - main - apiserver For the sake of demonstration, let’s pick the first service -\u003e dependent pods tuple (etcd-main-client as the service endpoint).\n Assume that there are 3 replicas for etcd statefulset. Time here is just for showing the series of events t=0 -\u003e all etcd pods go down t=10 -\u003e kube-api-server pods transition to CrashLoopBackOff t=100 -\u003e all etcd pods recover together t=101 -\u003e Weeder sees Update event for etcd-main-client endpoints resource t=102 -\u003e goroutine created to keep watch on kube-api-server pods t=103 -\u003e Since kube-api-server pods are still in CrashLoopBackOff, weeder deletes the pods to accelerate the recovery. t=104 -\u003e new kube-api-server pod created by replica-set controller in kube-controller-manager Points to Note Weeder only responds to Update events where a notReady endpoints resource turns Ready. That’s why there was no weeder action at time t=10 in the example above. notReady -\u003e no backing pod is Ready Ready -\u003e at least one backing pod is Ready Weeder doesn’t respond to Delete events Weeder will always wait for the entire watchDuration. If the dependent pods transition to CrashLoopBackOff after the watch duration, or if they do not recover even after repeated deletion, then weeder will exit. Quality of service offered via a weeder is only Best-Effort. ","categories":"","description":"","excerpt":"Weeder Overview Weeder watches for an update to service endpoints and …","ref":"/docs/other-components/dependency-watchdog/concepts/weeder/","tags":"","title":"Weeder"},{"body":"Can you adapt a DNS configuration to be used by the workload on the cluster (CoreDNS configuration)? Yes, you can. 
Information on that can be found in Custom DNS Configuration.\nHow to use custom domain names using a DNS provider? Creating custom domain names for the Gardener infrastructure DNS records using DNSRecords resources With DNSRecords, the internal and external domain names of the kube-apiserver are set, as well as the deprecated ingress domain name and an “owner” DNS record for the owning seed.\nFor this purpose, you need either a provider extension supporting the needed resource kind DNSRecord/\u003cprovider-type\u003e or a special extension.\nAll main providers support their respective IaaS specific DNS servers:\n AWS =\u003e DNSRecord/aws-route53 GCP =\u003e DNSRecord/google-cloudns Azure =\u003e DNSRecord/azure-dns OpenStack =\u003e DNSRecord/openstack-designate AliCloud =\u003e DNSRecord/alicloud-dns For Cloudflare, a community extension exists.\nFor other providers like Netlify and infoblox, there is currently no known supporting extension; however, they are supported by shoot-dns-service.\nCreating domain names for cluster resources like ingress or services of type LoadBalancer and for TLS certificates For this purpose, the shoot-dns-service extension is used (DNSProvider and DNSEntry resources).\nYou can read more on it in these documents:\n Deployment of the Shoot DNS Service Extension Request DNS Names in Shoot Clusters DNS Providers Gardener DNS Management for Shoots Request X.509 Certificates Gardener Certificate Management ","categories":"","description":"","excerpt":"Can you adapt a DNS configuration to be used by the workload on the …","ref":"/docs/faq/dns-config/","tags":"","title":"What are the meanings of different DNS configuration options?"},{"body":"Contract: Worker Resource While the control plane of a shoot cluster is living in the seed and deployed as native Kubernetes workload, the worker nodes of the shoot clusters are normal virtual machines (VMs) in the end-users infrastructure account. 
The Gardener project features a sub-project called machine-controller-manager. This controller is extending the Kubernetes API using custom resource definitions to represent actual VMs as Machine objects inside a Kubernetes system. This approach unlocks the possibility to manage virtual machines in the Kubernetes style and benefit from all its design principles.\nWhat is the machine-controller-manager doing exactly? Generally, there are provider-specific MachineClass objects (AWSMachineClass, AzureMachineClass, etc.; similar to StorageClass), and MachineDeployment, MachineSet, and Machine objects (similar to Deployment, ReplicaSet, and Pod). A machine class describes where and how to create virtual machines (in which networks, region, availability zone, SSH key, user-data for bootstrapping, etc.), while a Machine results in an actual virtual machine. You can read up more information in the machine-controller-manager’s repository.\nThe gardenlet deploys the machine-controller-manager, hence, provider extensions only have to inject their specific out-of-tree machine-controller-manager sidecar container into the Deployment.\nWhat needs to be implemented to support a new worker provider? 
As part of the shoot flow Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Worker metadata: name: bar namespace: shoot--foo--bar spec: type: azure region: eu-west-1 secretRef: name: cloudprovider namespace: shoot--foo--bar infrastructureProviderStatus: apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1 kind: InfrastructureStatus ec2: keyName: shoot--foo--bar-ssh-publickey iam: instanceProfiles: - name: shoot--foo--bar-nodes purpose: nodes roles: - arn: arn:aws:iam::0123456789:role/shoot--foo--bar-nodes purpose: nodes vpc: id: vpc-0123456789 securityGroups: - id: sg-1234567890 purpose: nodes subnets: - id: subnet-01234 purpose: nodes zone: eu-west-1b - id: subnet-56789 purpose: public zone: eu-west-1b - id: subnet-0123a purpose: nodes zone: eu-west-1c - id: subnet-5678a purpose: public zone: eu-west-1c pools: - name: cpu-worker minimum: 3 maximum: 5 maxSurge: 1 maxUnavailable: 0 machineType: m4.large machineImage: name: coreos version: 1967.5.0 nodeAgentSecretName: gardener-node-agent-local-ee46034b8269353b nodeTemplate: capacity: cpu: 2 gpu: 0 memory: 8Gi labels: node.kubernetes.io/role: node worker.gardener.cloud/cri-name: containerd worker.gardener.cloud/pool: cpu-worker worker.gardener.cloud/system-components: \"true\" userDataSecretRef: name: user-data-secret key: cloud_config volume: size: 20Gi type: gp2 zones: - eu-west-1b - eu-west-1c machineControllerManager: drainTimeout: 10m healthTimeout: 10m creationTimeout: 10m maxEvictRetries: 30 nodeConditions: - ReadonlyFilesystem - DiskPressure - KernelDeadlock clusterAutoscaler: scaleDownUtilizationThreshold: 0.5 scaleDownGpuUtilizationThreshold: 0.5 scaleDownUnneededTime: 30m scaleDownUnreadyTime: 1h maxNodeProvisionTime: 15m The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used to create the needed virtual machines. 
Also, as you can see, Gardener copies the output of the infrastructure creation (.spec.infrastructureProviderStatus, see Infrastructure resource), into the .spec.\nIn the .spec.pools[] field, the desired worker pools are listed. In the above example, one pool with machine type m4.large and min=3, max=5 machines shall be spread over two availability zones (eu-west-1b, eu-west-1c). This information together with the infrastructure status must be used to determine the proper configuration for the machine classes.\nThe spec.pools[].labels map contains all labels that should be added to all nodes of the corresponding worker pool. Gardener configures kubelet’s --node-labels flag to contain all labels that are mentioned here and allowed by the NodeRestriction admission plugin. This makes sure that kubelet adds all user-specified and gardener-managed labels to the new Node object when registering a new machine with the API server. Nevertheless, this is only effective when bootstrapping new nodes. The provider extension (respectively, machine-controller-manager) is still responsible for updating the labels of existing Nodes when the worker specification changes.\nThe spec.pools[].nodeTemplate.capacity field contains the resource information of the machine like cpu, gpu, and memory. This info is used by Cluster Autoscaler to generate nodeTemplate when scaling the nodeGroup from zero.\nThe spec.pools[].machineControllerManager field allows configuring the settings for the machine-controller-manager component. Providers must propagate these worker pool settings to the related fields in the MachineDeployment.\nThe spec.pools[].clusterAutoscaler field contains cluster-autoscaler settings that are to be applied only to the specific worker group. cluster-autoscaler expects to find these settings as annotations on the MachineDeployment, and so providers must pass these values to the corresponding MachineDeployment via annotations. 
The keys for these annotations can be found here and the values for the corresponding annotations should be the same as what is passed into the field. Providers can use the helper function extensionsv1alpha1helper.GetMachineDeploymentClusterAutoscalerAnnotations that returns the annotation map to be used.\nThe controller must only inject its provider-specific sidecar container into the machine-controller-manager Deployment managed by gardenlet.\nAfter that, it must compute the desired machine classes and the desired machine deployments. Typically, one class maps to one deployment, and one class/deployment is created per availability zone. Following this convention, the created resource would look like this:\napiVersion: v1 kind: Secret metadata: name: shoot--foo--bar-cpu-worker-z1-3db65 namespace: shoot--foo--bar labels: gardener.cloud/purpose: machineclass type: Opaque data: providerAccessKeyId: eW91ci1hd3MtYWNjZXNzLWtleS1pZAo= providerSecretAccessKey: eW91ci1hd3Mtc2VjcmV0LWFjY2Vzcy1rZXkK userData: c29tZSBkYXRhIHRvIGJvb3RzdHJhcCB0aGUgVk0K --- apiVersion: machine.sapcloud.io/v1alpha1 kind: AWSMachineClass metadata: name: shoot--foo--bar-cpu-worker-z1-3db65 namespace: shoot--foo--bar spec: ami: ami-0123456789 # Your controller must map the stated version to the provider specific machine image information, in the AWS case the AMI. 
blockDevices: - ebs: volumeSize: 20 volumeType: gp2 iam: name: shoot--foo--bar-nodes keyName: shoot--foo--bar-ssh-publickey machineType: m4.large networkInterfaces: - securityGroupIDs: - sg-1234567890 subnetID: subnet-01234 region: eu-west-1 secretRef: name: shoot--foo--bar-cpu-worker-z1-3db65 namespace: shoot--foo--bar tags: kubernetes.io/cluster/shoot--foo--bar: \"1\" kubernetes.io/role/node: \"1\" --- apiVersion: machine.sapcloud.io/v1alpha1 kind: MachineDeployment metadata: name: shoot--foo--bar-cpu-worker-z1 namespace: shoot--foo--bar spec: replicas: 2 selector: matchLabels: name: shoot--foo--bar-cpu-worker-z1 strategy: type: RollingUpdate rollingUpdate: maxSurge: 1 maxUnavailable: 0 template: metadata: labels: name: shoot--foo--bar-cpu-worker-z1 spec: class: kind: AWSMachineClass name: shoot--foo--bar-cpu-worker-z1-3db65 for the first availability zone eu-west-1b, and\napiVersion: v1 kind: Secret metadata: name: shoot--foo--bar-cpu-worker-z2-5z6as namespace: shoot--foo--bar labels: gardener.cloud/purpose: machineclass type: Opaque data: providerAccessKeyId: eW91ci1hd3MtYWNjZXNzLWtleS1pZAo= providerSecretAccessKey: eW91ci1hd3Mtc2VjcmV0LWFjY2Vzcy1rZXkK userData: c29tZSBkYXRhIHRvIGJvb3RzdHJhcCB0aGUgVk0K --- apiVersion: machine.sapcloud.io/v1alpha1 kind: AWSMachineClass metadata: name: shoot--foo--bar-cpu-worker-z2-5z6as namespace: shoot--foo--bar spec: ami: ami-0123456789 # Your controller must map the stated version to the provider specific machine image information, in the AWS case the AMI. 
blockDevices: - ebs: volumeSize: 20 volumeType: gp2 iam: name: shoot--foo--bar-nodes keyName: shoot--foo--bar-ssh-publickey machineType: m4.large networkInterfaces: - securityGroupIDs: - sg-1234567890 subnetID: subnet-0123a region: eu-west-1 secretRef: name: shoot--foo--bar-cpu-worker-z2-5z6as namespace: shoot--foo--bar tags: kubernetes.io/cluster/shoot--foo--bar: \"1\" kubernetes.io/role/node: \"1\" --- apiVersion: machine.sapcloud.io/v1alpha1 kind: MachineDeployment metadata: name: shoot--foo--bar-cpu-worker-z2 namespace: shoot--foo--bar spec: replicas: 1 selector: matchLabels: name: shoot--foo--bar-cpu-worker-z2 strategy: type: RollingUpdate rollingUpdate: maxSurge: 1 maxUnavailable: 0 template: metadata: labels: name: shoot--foo--bar-cpu-worker-z2 spec: class: kind: AWSMachineClass name: shoot--foo--bar-cpu-worker-z2-5z6as for the second availability zone eu-west-1c.\nAnother convention is the 5-letter hash at the end of the machine class names. Most controllers compute a checksum out of the specification of the machine class. Any change to the value of the nodeAgentSecretName field must result in a change of the machine class name. The checksum in the machine class name helps to trigger a rolling update of the worker nodes if, for example, the machine image version changes. In this case, a new checksum will be generated which results in the creation of a new machine class. The MachineDeployment’s machine class reference (.spec.template.spec.class.name) is updated, which triggers the rolling update process in the machine-controller-manager. All of this is only a convention that eases writing the controller; you can do it completely differently if you desire, as long as you make sure that the described behaviours are implemented correctly.\nAfter the machine classes and machine deployments have been created, the machine-controller-manager will start talking to the provider’s IaaS API and create the virtual machines. 
Gardener makes sure that the content of the Secret referenced in the userDataSecretRef field, which is used to bootstrap the machines, contains the required configuration for installing the kubelet and registering the VM as a worker node in the shoot cluster. The Worker extension controller shall wait until all the created MachineDeployments indicate healthiness/readiness before it ends the control loop.\nDoes Gardener need some information that must be returned back? Another important benefit of the machine-controller-manager’s design principles (extending the Kubernetes API using CRDs) is that the cluster-autoscaler can be used without any provider-specific implementation. We have forked the upstream Kubernetes community’s cluster-autoscaler and extended it so that it understands the machine API. We will merge it back into the community’s version once it has been adapted properly.\nOur cluster-autoscaler only needs to know the minimum and maximum number of replicas per MachineDeployment and is then ready to act. It does not need to talk to the provider APIs; it just modifies the .spec.replicas field in the MachineDeployment object. Gardener deploys this autoscaler if there is at least one worker pool that specifies max\u003emin. In order to know how it needs to configure it, the provider-specific Worker extension controller must expose which MachineDeployments it has created and what the min/max numbers should look like.\nConsequently, your controller should write this information into the Worker resource’s .status.machineDeployments field. 
It should also update the .status.machineDeploymentsLastUpdateTime field along with .status.machineDeployments, so that Gardener is able to deploy Cluster-Autoscaler right after the status is updated with the latest MachineDeployments and does not wait for the reconciliation to be completed:\n--- apiVersion: extensions.gardener.cloud/v1alpha1 kind: Worker metadata: name: worker namespace: shoot--foo--bar spec: ... status: lastOperation: ... machineDeployments: - name: shoot--foo--bar-cpu-worker-z1 minimum: 2 maximum: 3 - name: shoot--foo--bar-cpu-worker-z2 minimum: 1 maximum: 2 machineDeploymentsLastUpdateTime: \"2023-05-01T12:44:27Z\" In order to support a new worker provider, you need to write a controller that watches all Workers with .spec.type=\u003cmy-provider-name\u003e. You can take a look at the below referenced example implementation for the AWS provider.\nThat sounds like a lot that needs to be done, can you help me? Most of the described behaviour is the same for every provider. The only differences are typically the version/configuration of the provider-specific machine-controller-manager sidecar container and the machine class specification itself. You can take a look at our extension library, especially the worker controller part where you will find a lot of utilities that you can use. Note that there are also utility functions for getting the default sidecar container specification or corresponding VPA container policy in the machinecontrollermanager package called ProviderSidecarContainer and ProviderSidecarVPAContainerPolicy. Also, using the library you only need to implement your provider specifics - all the things that can be handled generically can be taken for free and do not need to be re-implemented. Take a look at the AWS worker controller for an example.\nNon-provider specific information required for worker creation All the providers require further information that is not provider specific but already part of the shoot resource. 
One example of such information is whether the shoot is hibernated or not. If it is hibernated, all the virtual machines should be deleted/terminated, and after that the machine-controller-manager should be scaled down. You can take a look at the AWS worker controller to see how it reads this information and how it is used. As Gardener cannot know which information is required by providers, it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information that is not part of the Worker resource itself.\nReferences and Additional Resources Worker API (Golang Specification) Extension Controller Library Generic Worker Controller Exemplary Implementation for the AWS Provider ","categories":"","description":"","excerpt":"Contract: Worker Resource While the control plane of a shoot cluster …","ref":"/docs/gardener/extensions/worker/","tags":"","title":"Worker"},{"body":"Workerless Shoots Starting from v1.71, users can create a Shoot without any workers, known as a “workerless Shoot”. Previously, worker nodes had to always be included even if users only needed the Kubernetes control plane. 
With workerless Shoots, Gardener will not create any worker nodes or anything related to them.\nHere’s an example manifest for a local workerless Shoot:\napiVersion: core.gardener.cloud/v1beta1 kind: Shoot metadata: name: local namespace: garden-local spec: cloudProfile: name: local region: local provider: type: local kubernetes: version: 1.26.0 ⚠️ It’s important to note that a workerless Shoot cannot be converted to a Shoot with workers or vice versa.\n As part of the control plane, the following components are deployed in the seed cluster for a workerless Shoot:\n etcds kube-apiserver kube-controller-manager gardener-resource-manager logging and monitoring components extension components (if they support workerless Shoots, see here) ","categories":"","description":"What is a Workerless Shoot and how to create one","excerpt":"What is a Workerless Shoot and how to create one","ref":"/docs/gardener/shoot_workerless/","tags":"","title":"Workerless `Shoot`s"},{"body":"Working with Projects Overview Projects are used to group clusters, onboard IaaS resources utilized by them, and organize access control. To work with clusters, first you need to create a project that they will belong to.\nCreating Your First Project Prerequisites You have access to the Gardener Dashboard and have permissions to create projects Steps Log on to the Gardener Dashboard and choose CREATE YOUR FIRST PROJECT.\n Provide a project Name, and optionally a Description and a Purpose, and choose CREATE.\n ⚠️ You will not be able to change the project name later. The rest of the details will be editable.\n Result After completing the steps above, you will arrive at a similar screen: Creating More Projects If you need to create more projects, expand the Projects list dropdown on the left. 
When expanded, it reveals a CREATE PROJECT button that brings up the same dialog as above.\nRotating Your Project’s Secrets After rotating your Gardener credentials and updating the corresponding secret in Gardener, you also need to reconcile all the shoots so that they can start using the updated secret. Updating the secret on its own won’t trigger shoot reconciliation and the shoot will use the old credentials until reconciliation, which is why you need to either trigger reconciliation or wait until it is performed in the next maintenance time window.\nFor more information, see Credentials Rotation for Shoot Clusters.\nDeleting Your Project When you need to delete your project, go to ADMINISTRATION, choose the trash bin icon, and confirm the operation.\n","categories":"","description":"","excerpt":"Working with Projects Overview Projects are used to group clusters, …","ref":"/docs/dashboard/working-with-projects/","tags":"","title":"Working With Projects"}] \ No newline at end of file