Adjust docker image cleanup tolerances. #238

Open
wants to merge 1 commit into
base: master

nuclearsandwich
Contributor

Jenkins and the cleanup script do their accounting slightly differently,
so there are times when the cleanup script runs and Jenkins still
considers the disk to be too full.
Bumping the cleanup minimum to 60G gives us a pretty comfortable
margin over the 50G minimum for Jenkins. I also bumped the percentage so
we don't flap on and off as frequently.

These changes affect all agents, but I don't think any of them are harmed
by the tuning of these parameters.

As we work on the future agents, we're going to delegate cleanup
responsibility to docker prune commands. We are already doing that for
the Jenkins host, which previously did not have a configured cleanup
script.

@nuclearsandwich nuclearsandwich self-assigned this Aug 14, 2020
@dirk-thomas
Member

The specific numbers for the different nodes (non-ARM as well as ARM) should be taken into consideration here, and the limits should be set based on their disk sizes.

At the moment, e.g. for ARM nodes, which have less than 500 GB of disk space, the cleanup script only kicks in at 50 GB (since 10% of less than 500 GB is less than 50 GB). The configured threshold in Jenkins is also 50 GB, which leaves no margin. So by the time the cleanup script kicks in it is already too late, and Jenkins will take the node offline.
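
To make the arithmetic above concrete, here is a minimal sketch. It assumes the cleanup script starts cleaning once free space drops below the larger of the two limits (the percentage of the disk, or the absolute minimum), which is how the comment above reads; the real `cleanup_docker_images.py` may combine them differently. The function names and the 400 GB disk size are hypothetical, chosen only for illustration.

```python
# Hypothetical helper names; the 50 GB Jenkins threshold comes from this thread.
JENKINS_MIN_FREE_GB = 50


def cleanup_trigger_gb(disk_size_gb, min_free_percent, min_free_space_gb):
    """Free-space level (GB) below which cleanup is assumed to start.

    Assumption: whichever limit is larger is the effective trigger.
    """
    percent_limit_gb = disk_size_gb * min_free_percent / 100
    return max(percent_limit_gb, min_free_space_gb)


def margin_over_jenkins_gb(disk_size_gb, min_free_percent, min_free_space_gb):
    """How much earlier (in GB of free space) cleanup starts than Jenkins reacts."""
    trigger = cleanup_trigger_gb(disk_size_gb, min_free_percent, min_free_space_gb)
    return trigger - JENKINS_MIN_FREE_GB


if __name__ == "__main__":
    disk_gb = 400  # hypothetical node with less than 500 GB of disk
    for percent, space in [(10, 50), (10, 60), (30, 60)]:
        margin = margin_over_jenkins_gb(disk_gb, percent, space)
        print(f"--minimum-free-percent {percent} --minimum-free-space {space}: "
              f"margin {margin:.0f} GB")
```

Under that assumption, the old settings give a 400 GB node no margin at all, while raising only the absolute minimum to 60 already gives 10 GB.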

@@ -132,7 +132,7 @@

 # clean up containers and dangling images https://github.com/docker/docker/issues/928#issuecomment-58619854
 cron {'docker_cleanup_images':
-command => "bash -c \"python3 -u /home/${agent_username}/cleanup_docker_images.py --minimum-free-percent 10 --minimum-free-space 50\"",
+command => "bash -c \"python3 -u /home/${agent_username}/cleanup_docker_images.py --minimum-free-percent 30 --minimum-free-space 60\"",
Member


I don't think 30% is necessary / reasonable. That would waste a lot of disk space rather than using it for caching.
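
A rough sketch of the cost being described here: how much more disk would be kept free at 30% than at 10%, with the 60 GB minimum held fixed. It assumes the effective reservation is the larger of the percentage limit and the absolute minimum. The disk sizes and function names are hypothetical, not the actual sizes of the agents.

```python
def reserved_free_gb(disk_size_gb, min_free_percent, min_free_space_gb):
    """Disk space (GB) kept free, assuming the larger of the two limits applies."""
    return max(disk_size_gb * min_free_percent / 100, min_free_space_gb)


def extra_reserved_gb(disk_size_gb, old_percent=10, new_percent=30,
                      min_free_space_gb=60):
    """Additional GB that stay unused for caching when the percentage is raised."""
    old = reserved_free_gb(disk_size_gb, old_percent, min_free_space_gb)
    new = reserved_free_gb(disk_size_gb, new_percent, min_free_space_gb)
    return new - old


if __name__ == "__main__":
    for disk_gb in (200, 500, 1000):  # hypothetical disk sizes
        print(f"{disk_gb} GB disk: {extra_reserved_gb(disk_gb):.0f} GB "
              f"more kept free at 30% than at 10%")
```

On small disks the 60 GB minimum dominates either way, so the percentage change costs nothing there; the extra reservation only shows up on larger disks, where it grows with disk size.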
