cgroup pre-checks do not support cgroups v2 #232

Open
denniseffing opened this issue Jan 18, 2025 · 2 comments · May be fixed by #233


Summary

The cgroup pre-check task runs grep -E "^{{ cgroup.name }}\s+.*\s+1$" /proc/cgroups to check whether each required cgroup controller is enabled. Because the /proc/cgroups file is meaningless for cgroups v2, this check can fail even when the controller is actually available.

In my case, on a host running an up-to-date Arch Linux, /proc/cgroups does not list the memory controller. However, the file /sys/fs/cgroup/cgroup.controllers reports the enabled controllers correctly.

This is the output of the deprecated /proc/cgroups file on the affected system:

#subsys_name    hierarchy       num_cgroups     enabled
cpu     0       114     1
cpuacct 0       114     1
blkio   0       114     1
devices 0       114     1
freezer 0       114     1
net_cls 0       114     1
perf_event      0       114     1
net_prio        0       114     1
hugetlb 0       114     1
pids    0       114     1
rdma    0       114     1
misc    0       114     1

And this is the output of the /sys/fs/cgroup/cgroup.controllers file on the affected system:

cpuset cpu io memory hugetlb pids rdma misc
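
The mismatch between the two files can be checked by hand. The following is a minimal sketch, assuming the grep pattern quoted in the summary with the templated {{ cgroup.name }} replaced by a shell argument, and \s written as [[:space:]] for portability. The function names v1_reports_enabled and v2_reports_enabled are hypothetical and not part of the role.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the v1 and v2 views of a controller.
# The optional second argument overrides the file path (useful for testing).

# Mirrors the pre-check: a line starting with the controller name whose
# last column ("enabled") is 1.
v1_reports_enabled() {
    # $1 = controller name, $2 = path to a /proc/cgroups-style file
    grep -Eq "^$1[[:space:]]+.*[[:space:]]+1\$" "${2:-/proc/cgroups}"
}

# cgroup.controllers is a single space-separated line of controller names,
# so split it into one name per line and look for an exact match.
v2_reports_enabled() {
    # $1 = controller name, $2 = path to a cgroup.controllers-style file
    tr ' ' '\n' < "${2:-/sys/fs/cgroup/cgroup.controllers}" | grep -Fxq "$1"
}
```

On the affected host, v1_reports_enabled memory would be expected to return non-zero while v2_reports_enabled memory returns zero, which matches the two outputs above.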

Issue Type

  • Bug Report

Controller Environment and Configuration

I am using ansible-role-k3s v3.4.4.

Steps to Reproduce

N/A

Expected Result

cgroups pre-check is successful if the controller is included in /sys/fs/cgroup/cgroup.controllers

Actual Result

cgroups pre-check fails even if the controller is included in /sys/fs/cgroup/cgroup.controllers

fatal: [host3]: FAILED! => {
    "assertion": "k3s_check_cgroup_option.rc == 0",
    "changed": false,
    "evaluated_to": false,
    "msg": "memory cgroup disabled. If you are running on a Raspberry Pi, see:\nhttps://rancher.com/docs/k3s/latest/en/advanced/#enabling-cgroups-for-raspbian-buster\n"
}
denniseffing added a commit to denniseffing/home-infrastructure that referenced this issue Jan 18, 2025
@ToroNZ ToroNZ linked a pull request Jan 23, 2025 that will close this issue

ToroNZ commented Jan 23, 2025

Similar issue checking cpuset cgroup:

fatal: [worker03]: FAILED! => {
    "assertion": "k3s_check_cgroup_option.rc == 0",
    "changed": false,
    "evaluated_to": false,
    "msg": "cpuset cgroup disabled. If you are running Alpine Linux, see:\nhttps://rancher.com/docs/k3s/latest/en/advanced/#additional-preparation-for-alpine-linux-setup\n"
}

BEFORE:

Fedora CoreOS 41.20241215.3.0
6.11.11-300.fc41.x86_64

$ cat /proc/cgroups
#subsys_name    hierarchy       num_cgroups     enabled
cpuset  0       127     1
cpu     0       127     1
cpuacct 0       127     1
blkio   0       127     1
memory  0       127     1
devices 0       127     1
freezer 0       127     1
net_cls 0       127     1
perf_event      0       127     1
net_prio        0       127     1
hugetlb 0       127     1
pids    0       127     1
rdma    0       127     1
misc    0       127     1

AFTER:

Fedora CoreOS 41.20250105.3.0
6.12.7-200.fc41.x86_64

$ cat /proc/cgroups
#subsys_name    hierarchy       num_cgroups     enabled
cpu     0       163     1
cpuacct 0       163     1
blkio   0       163     1
memory  0       163     1
devices 0       163     1
freezer 0       163     1
net_cls 0       163     1
perf_event      0       163     1
net_prio        0       163     1
hugetlb 0       163     1
pids    0       163     1
rdma    0       163     1
misc    0       163     1

Cause

cgroupv1 features are progressively being trimmed out of the kernel by default:

https://lore.kernel.org/lkml/[email protected]/T/
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=af000ce85293b8e608f696f0c6c280bc3a75887f
https://lore.kernel.org/lkml/[email protected]/T/

Workaround

Disable validation checks:
k3s_skip_validation: true

Fix

The availability of the cpuset and memory v2 cgroup controllers can be checked here:

$ cat /sys/fs/cgroup/cgroup.controllers
cpuset cpu io memory hugetlb pids rdma misc
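
One way a v2-aware check could look is sketched below. It is not taken from PR #233. It assumes that a readable /sys/fs/cgroup/cgroup.controllers means the host uses the unified (v2) hierarchy, and it falls back to the existing /proc/cgroups pattern otherwise. The function name cgroup_controller_available and the variables CGROUP2_CONTROLLERS and PROC_CGROUPS are hypothetical.

```shell
#!/bin/sh
# Sketch only: names below are illustrative and do not come from the role.
# Both paths can be overridden, e.g. to test against sample files.
CGROUP2_CONTROLLERS="${CGROUP2_CONTROLLERS:-/sys/fs/cgroup/cgroup.controllers}"
PROC_CGROUPS="${PROC_CGROUPS:-/proc/cgroups}"

cgroup_controller_available() {
    controller="$1"
    if [ -r "$CGROUP2_CONTROLLERS" ]; then
        # cgroup v2: one space-separated line of controller names.
        tr ' ' '\n' < "$CGROUP2_CONTROLLERS" | grep -Fxq "$controller"
    else
        # cgroup v1 fallback: the "enabled" column of /proc/cgroups must be 1.
        grep -Eq "^${controller}[[:space:]]+.*[[:space:]]+1\$" "$PROC_CGROUPS"
    fi
}
```

Calling cgroup_controller_available memory returns zero when the controller is listed, so the result can feed the same rc == 0 assertion the pre-check already uses.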

PR

Fix: #233

@ToroNZ ToroNZ mentioned this issue Jan 23, 2025
@denniseffing
Author

Awesome find, thank you!
