Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

main.go: enable cleanup of old PVCs #38

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

squat
Copy link
Member

@squat squat commented Jan 17, 2020

This commit enables the controller to clean up the PVCs of Thanos
Receive Pods that are watched by it. Whenever a receiver is deleted, the
controller will spawn a helper container that mounts all PVCs specified
by the StatefulSet for that container, replicate the contents to object
storage, and then rm -rf the contents.

Tested on a kind cluster. A follow up PR will add E2E tests and verify
this functionality.

@squat squat requested review from kakkoyun and metalmatze and removed request for kakkoyun January 17, 2020 16:23
main.go Outdated Show resolved Hide resolved
@squat squat force-pushed the cleanup-pvcs branch 3 times, most recently from 58ba741 to abea992 Compare January 20, 2020 15:49
}

func (c *controller) generateHelper(name string, sts *appsv1.StatefulSet) *corev1.Pod {
initContainerTemplate := corev1.Container{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the right pattern?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some established way to execute multiple containers serially? I would try to stay away from shell scripting as you have so far, but this seems a little hacky.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

init containers by design all execute in series; the rm -rf container could even be a second initcontainer. so I don't think this is much of an abuse. another option is to have multiple jobs that are dispatched one after another, i.e. one backup job, then a cleanup job. However, this seems like a lot of book-keeping for not much benefit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah agreed. Let’s go with this strategy.

main.go Outdated Show resolved Hide resolved
},
},
}
// Inject extra environment variables into the cleanup Pod if provided.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this meant for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we can inject e.g. AWS_SECRET_ACCESS_KEY AWS_ACCESS_KEY_ID, which we don't put inside of the thanos objstore secret

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm .. I feel like we should read the job/pod template from "disk" and mount it into the controller via a configmap, there will be endless customizations people will want to do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this but I am a bit hesitant. I would really want to keep the scope of customizations minimal, especially because there controller has to override several fields:

  • volumes (to specify the pvcs)
  • container volume mounts ( to mount the pvcs)
  • container env (to set the 2 objstore configs, one of which points to the local pvc)
  • container cmd + args (to pass the env var to the cmd)
    In order to override those fields we need to select the container index in a stable way.

@squat
Copy link
Member Author

squat commented Jan 20, 2020

This PR is currently blocked on https://github.com/observatorium/thanos-replicate/blob/eb8dd2444ac8a24b1cf2d458055d1e1c9f727591/scheme.go#L197.

Because there are no labels on the block before the receive shipper uploads them, thanos-replicate skips the block when running in the cleanup job.

Will make a PR to move this forward

@squat squat force-pushed the cleanup-pvcs branch 2 times, most recently from c1a799a to 3da631a Compare January 20, 2020 18:09
@brancz
Copy link
Contributor

brancz commented Jan 21, 2020

I also realized this won’t work anymore with multitsdb, we way want to implement that right away as I think the strategy might be a bit different.

c.cleanupErrors.WithLabelValues(createLabel).Add(0)
c.cleanupErrors.WithLabelValues(decodeLabel).Add(0)
c.cleanupErrors.WithLabelValues(deleteLabel).Add(0)
c.cleanupErrors.WithLabelValues(fetchLabel).Add(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's worth looping bucks over these labels?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the thing is that not every counter uses every label, it's a bit funny. we'd end up with three different loops :/

@squat
Copy link
Member Author

squat commented Jan 28, 2020

I also realized this won’t work anymore with multitsdb, we way want to implement that right away as I think the strategy might be a bit different.

Yeah, that sounds like a plan

squat added 2 commits May 12, 2020 18:26
This commit enables the controller to clean up the PVCs of Thanos
Receive Pods that are watched by it. Whenever a receiver is deleted, the
controller will spawn a helper container that mounts all PVCs specified
by the StatefulSet for that container and `rm -rf`s the contents of them.

Tested on a kind cluster. A follow up PR will add E2E tests and verify
this functionality.
@tekicode
Copy link
Contributor

tekicode commented Oct 3, 2023

This should be closed. Kubernetes now supports discarding PVCs on StatefulSet scale down Here's the 2021 blog on this feature as alpha in Kubernetes 1.23.

In 2023, most of us running Kuberentes should have access to this functionality. https://kubernetes.io/blog/2021/12/16/kubernetes-1-23-statefulset-pvc-auto-deletion/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants