
Registry messy, images built with arbitrary versions, no latest tag #76

Open
vlerenc opened this issue May 30, 2022 · 5 comments
Labels
area/ops-productivity: Operator productivity related (how to improve operations); kind/enhancement: Enhancement, improvement, extension; lifecycle/rotten: Nobody worked on this for 12 months (final aging stage)

Comments

@vlerenc
Member

vlerenc commented May 30, 2022

What would you like to be added:
I have difficulties picking the right image: the registry is messy, images are built with arbitrary versions, and there is no latest tag.

What do you recommend:

  • Build images for Kubernetes 1.15 through 1.24, because these are the versions we support (if 1.15 and 1.16 take actual effort, leave them out, but we need 1.23 and 1.24)
  • Set the latest tag on all built images
  • Make sure only the intended images are used in ops, referenced by the ops-guide, and launched by the Dashboard (as the web terminal image)
  • Clean up the registry and remove all images with tags that are no longer updated
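The re-tagging and cleanup steps above could be sketched roughly like this. This is only a hypothetical sketch: the image path is taken from this thread, while the `:0.16.0` stale tag is a placeholder, and commands are printed rather than executed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the re-tag/cleanup steps (DRY_RUN defaults to 1, i.e. print only).
set -u

REGISTRY="eu.gcr.io/gardener-project/gardener/ops-toolbelt"
NEWEST="1.24"   # assumed newest supported version

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$@"; else "$@"; fi
}

# Tag the newest supported build as :latest and push it.
run docker tag "$REGISTRY:$NEWEST" "$REGISTRY:latest"
run docker push "$REGISTRY:latest"

# Inspect existing tags, then delete tags that are no longer updated (GCR example;
# the tag to delete is a placeholder).
run gcloud container images list-tags "$REGISTRY"
run gcloud container images delete "$REGISTRY:0.16.0" --force-delete-tags
```

The `run` wrapper makes the script safe to review before anyone actually touches the registry, which matches the "first make sure the old stuff is no longer referenced" caveat below.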

Why is this needed:

  • To pick the right image for the job.
  • To avoid people using outdated images.

cc @petersutter
cc @neo-liang-sap @plkokanov @jfortin-sap
cc @dguendisch @hendrikKahl @BeckerMax

@vlerenc vlerenc added area/ops-productivity Operator productivity related (how to improve operations) kind/enhancement Enhancement, improvement, extension labels May 30, 2022
@plkokanov
Contributor

Maybe this is a bit off topic, but just to let you know: with the latest release, @petersutter and I decided to remove the kubectl x IaaS CLI version matrix because it generated too many security vulnerabilities that had to be triaged separately.
Additionally, we thought that kubectl 1.22.7 should be enough for now: even though using it with older k8s versions is not recommended due to version skew, it still does the job pretty well. We would only add additional versions if we see that becoming a problem.
As for the IaaS clients, we noticed that on older versions of the ops-toolbelt some of them were not even installed properly. Since no one complained, we concluded that nobody was using those images and decided to remove them. And since gardenctl was removed and there is still no gardenctl-v2, using the IaaS CLIs from the Docker image and having to configure them becomes more of a chore.

Not sure if we could somehow label and deprecate the old images.

@vlerenc
Member Author

vlerenc commented May 30, 2022

On the Kubernetes version:

I find it incomprehensible to add and then remove that feature again, especially because there are incompatibilities and we were already criticised for this by our end users (and had problems ourselves; I at least had). Was that the source of the security vulnerabilities? Can we then find a better way to deal with it? We also support the old Kubernetes versions in Gardener, so what is the issue here with kubectl, if there was one?

On the IaaS CLIs:

Naturally, most of us have the CLIs already installed locally. Also, interactions are rarely required (so I am not insisting or anything). But when we do require the CLIs and are under time pressure, this should be the most convenient (= fastest) way to get to the right clients right away. Also, during ops, people shy away and rather escalate the issue to the next support level instead of taking a look themselves. Making it even harder for them will not help improve the situation. Was that the source of the security vulnerabilities?

On gardenctl(-v2):

I never really understood why I would need gardenctl within the ops-toolbelt. What's the use case? At the time I am launching an ops pod, I have already targeted it, no? It's the one thing that should make use of the ops-toolbelt, not the other way around.

On cleaning up the registry:

I don't understand. One opens the GCR and deletes the images. Done. Where is the problem (other than first making sure the old stuff is no longer referenced, as suggested)?

@vlerenc
Member Author

vlerenc commented May 30, 2022

Anyway, if you want to throw away everything again, then so be it (no veto or whatever from my side, though I have to say that I find the kubectl decision completely incomprehensible, especially for web terminals), but then let me know which is the right image now and please clean up the mess in GCR. All this back and forth makes me dizzy.

@petersutter
Member

> but let me then know which is the right image

There is only one image left, eu.gcr.io/gardener-project/gardener/ops-toolbelt, which is updated with the latest tag. We mentioned it as a breaking change in the release notes: https://github.com/gardener/ops-toolbelt/releases/tag/0.17.0.
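In case it helps others landing here, a quick way to try that image as a throwaway ops pod. The pod name and the `kubectl run` invocation are my own sketch, not an official recipe; the command is printed rather than executed so the snippet is safe anywhere.

```shell
# Sketch: launch a throwaway ops pod from the remaining image.
# Drop the leading `echo` to actually run it against a cluster.
IMAGE="eu.gcr.io/gardener-project/gardener/ops-toolbelt:latest"
echo kubectl run ops-toolbelt --rm -it --image="$IMAGE" --restart=Never -- bash
```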

> and please clean up the mess in GCR

Usually I'm reluctant to delete anything from the registry, but sure, we can do so, as I doubt it was ever used internally. However, we don't know whether it was used outside SAP.

> I find it incomprehensible to add and then remove that feature again

The built images weren't even used in the dashboard (for the web terminal feature). #50 was started, but the dashboard side was never built, and it is not on our roadmap as of now, unless it gets contributed. Looking back, we shouldn't have merged #50 while the dashboard-side feature was not in sight. We also never saw any issues reported regarding the version skew of kubectl.
It's the sheer number of images that we build that is the problem when vulnerabilities are found. Usually those are found in the Ubuntu base image or in the common components, and then we need to look at all the findings for all the images that we build.

> I never really understood why I would need gardenctl within the ops-toolbelt. What's the use case? At the time I am launching an ops pod, I have already targeted it, no?

If you have gardenctl configured in your ops-toolbelt, it would be possible to configure the infra CLIs via gardenctl provider-env. Not sure how it would work the other way around.
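For reference, a minimal sketch of that flow with gardenctl-v2. The garden/project/shoot names are placeholders, and the exact flags and the `bash` shell argument are my assumption of the gardenctl-v2 interface; the guard makes the snippet degrade gracefully where gardenctl is not installed.

```shell
# Hypothetical gardenctl-v2 session (names are placeholders).
# `gardenctl provider-env` prints shell exports (e.g. cloud credentials)
# for the targeted cluster, which the IaaS CLIs then pick up.
if command -v gardenctl >/dev/null 2>&1; then
  gardenctl target --garden my-garden --project my-project --shoot my-shoot
  eval "$(gardenctl provider-env bash)"
  gctl_status="provider env configured"
else
  gctl_status="gardenctl not installed"
fi
echo "$gctl_status"
```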

@vlerenc
Member Author

vlerenc commented Aug 18, 2022

Well, thank you, but that's not nice. The kubectl skew is a problem. That's already bad enough.

As for the vulnerabilities: we may have many images, but they are all built from the same base image, which in most cases contains the vulnerability, so the fix is exactly the same (fix once and rebuild/republish, whether there is a matrix or not).

Anyway, if we don't want to improve / get back to where we once were and/or have no time anyway, then let's close the ticket. I cannot mentally click that button. ;-)

@gardener-robot gardener-robot added the lifecycle/stale Nobody worked on this for 6 months (will further age) label Feb 14, 2023
@gardener-robot gardener-robot added lifecycle/rotten Nobody worked on this for 12 months (final aging stage) and removed lifecycle/stale Nobody worked on this for 6 months (will further age) labels Oct 25, 2023