Switch systemd unit to TasksMax #107

Open
bt90 opened this issue Dec 28, 2023 · 13 comments · May be fixed by #108

Comments

@bt90
Contributor

bt90 commented Dec 28, 2023

I'm hitting the same problem as outlined in caddyserver/caddy#1802. The culprit seems to be how systemd handles the LimitNPROC option:

LimitNPROC=512

While caddy itself doesn't occupy that many processes, some Docker containers seem to use the same UID for their processes:

sudo ps -U caddy
    PID TTY          TIME CMD
   4491 ?        00:00:01 mailrise
  36706 ?        00:00:28 postgres
  36760 ?        00:00:01 postgres
  36761 ?        00:00:06 postgres
  36762 ?        00:00:10 postgres
  36763 ?        00:03:55 postgres
  36764 ?        00:00:14 postgres
  36765 ?        00:01:17 postgres
  36766 ?        00:00:00 postgres
1597030 ?        00:00:03 postgres
1599669 ?        00:00:03 postgres
2081581 ?        00:25:43 redis-server
2082548 ?        00:00:36 postgres
2082623 ?        00:00:34 postgres
2654461 ?        00:00:00 start.sh
2654495 ?        00:00:00 Xvfb
2654496 ?        00:00:00 dumb-init
2654497 ?        00:48:58 node
2654671 ?        00:01:16 chrome
2654672 ?        00:01:16 chrome
2654673 ?        00:01:14 chrome
2654674 ?        00:01:14 chrome
2654675 ?        00:01:16 chrome
2654676 ?        00:01:13 chrome
2654677 ?        00:01:15 chrome
2654678 ?        00:01:14 chrome
2654683 ?        00:00:00 chrome_crashpad
2654684 ?        00:00:00 chrome_crashpad
2654685 ?        00:00:00 chrome_crashpad
2654686 ?        00:00:00 chrome_crashpad
2654691 ?        00:00:00 chrome_crashpad
2654692 ?        00:00:00 chrome_crashpad
2654693 ?        00:00:00 chrome_crashpad
2654694 ?        00:00:00 chrome_crashpad
2654703 ?        00:00:00 chrome
2654704 ?        00:00:00 chrome
2654705 ?        00:00:00 chrome
2654706 ?        00:00:00 chrome
2654707 ?        00:00:00 chrome
2654708 ?        00:00:00 chrome
2654709 ?        00:00:00 chrome
2654710 ?        00:00:00 chrome
2654711 ?        00:01:14 chrome
2654712 ?        00:01:13 chrome
2654715 ?        00:00:00 chrome_crashpad
2654717 ?        00:00:00 chrome_crashpad
2654718 ?        00:00:00 chrome_crashpad
2654722 ?        00:00:00 chrome_crashpad
2654723 ?        00:00:00 chrome
2654724 ?        00:00:00 chrome
2654727 ?        00:00:00 chrome
2654728 ?        00:00:00 chrome
2654729 ?        00:00:00 nacl_helper
2654730 ?        00:00:00 nacl_helper
2654732 ?        00:00:00 chrome_crashpad
2654750 ?        00:00:00 chrome_crashpad
2654752 ?        00:00:00 chrome_crashpad
2654753 ?        00:00:00 nacl_helper
2654757 ?        00:00:00 nacl_helper
2654759 ?        00:00:00 chrome_crashpad
2654761 ?        00:00:00 chrome
2654762 ?        00:00:00 chrome
2654767 ?        00:00:00 chrome_crashpad
2654768 ?        00:00:00 chrome_crashpad
2654770 ?        00:00:00 nacl_helper
2654781 ?        00:00:00 chrome
2654786 ?        00:00:00 chrome
2654796 ?        00:00:00 chrome_crashpad
2654800 ?        00:00:00 chrome
2654802 ?        00:00:00 chrome
2654816 ?        00:00:00 chrome_crashpad
2654817 ?        00:00:16 chrome
2654818 ?        00:00:17 chrome
2654821 ?        00:00:00 chrome
2654822 ?        00:00:00 chrome
2654823 ?        00:00:17 chrome
2654824 ?        00:00:17 chrome
2654828 ?        00:00:00 nacl_helper
2654881 ?        00:00:17 chrome
2654884 ?        00:00:00 nacl_helper
2654885 ?        00:00:16 chrome
2654886 ?        00:00:17 chrome
2654901 ?        00:00:00 nacl_helper
2654907 ?        00:00:17 chrome
2654910 ?        00:00:17 chrome
2654916 ?        00:00:00 nacl_helper
2654922 ?        00:00:17 chrome
2654985 ?        00:00:19 chrome
2654999 ?        00:00:00 nacl_helper
2655029 ?        00:00:05 chrome
2655048 ?        00:00:17 chrome
2655053 ?        00:00:05 chrome
2655063 ?        00:00:16 chrome
2655065 ?        00:00:17 chrome
2655066 ?        00:00:17 chrome
2655079 ?        00:00:17 chrome
2655080 ?        00:00:16 chrome
2655085 ?        00:00:05 chrome
2655089 ?        00:00:05 chrome
2655092 ?        00:00:17 chrome
2655096 ?        00:00:05 chrome
2655097 ?        00:00:05 chrome
2655105 ?        00:00:05 chrome
2655129 ?        00:00:05 chrome
2655136 ?        00:00:05 chrome
2655179 ?        00:00:05 chrome
2655180 ?        00:00:20 chrome
2655186 ?        00:00:17 chrome
2655199 ?        00:00:05 chrome
2655223 ?        00:00:05 chrome
2655315 ?        00:00:05 chrome
2655323 ?        00:00:05 chrome
2655330 ?        00:00:05 chrome
2655337 ?        00:00:05 chrome
2655341 ?        00:00:05 chrome
2655346 ?        00:00:05 chrome
2655385 ?        00:00:05 chrome
2655391 ?        00:00:05 chrome

The systemd documentation notes that TasksMax should be preferred over LimitNPROC:

Note that LimitNPROC= will limit the number of processes from one (real) UID and not the number of processes started (forked) by the service. Therefore the limit is cumulative for all processes running under the same UID. Please also note that the LimitNPROC= will not be enforced if the service is running as root (and not dropping privileges). Due to these limitations, TasksMax= (see systemd.resource-control(5)) is typically a better choice than LimitNPROC=.

https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html#Process%20Properties

@bt90 bt90 linked a pull request Dec 28, 2023 that will close this issue
@bt90
Contributor Author

bt90 commented Dec 28, 2023

The limit got raised to a generous value of 512 in caddyserver/caddy#1825 in order to solve caddyserver/caddy#1723. But it's still possible to hit the limit due to misattribution of other container processes with the same UID.

The TasksMax option solves this by only limiting the number of processes started as part of the service, which is what we actually want to achieve.

@francislavoie
Member

I'm confused. Why would you be running Docker using the caddy user?

Anyway this seems to make sense on paper, but I'd like @carlwgeorge to review as well.

@bt90
Contributor Author

bt90 commented Dec 28, 2023

The problem is that the numeric UID of the caddy user happens to overlap with one or more users inside Docker containers.

On my host, the caddy user has the UID 999. If the user in a container happens to have the same UID, its processes are counted against the same per-UID limit, so the limit is reached even though Caddy itself runs only a few tasks.
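
To see where the processes sharing a UID actually live, a rough sketch like the following could help. It groups thread counts by cgroup for one numeric UID. The function name `threads_by_cgroup` is made up for illustration, and the input format assumes the procps `ps` output columns `uid`, `nlwp` and `cgroup`.

```shell
#!/bin/sh
# Hypothetical helper: sum thread counts (nlwp) per cgroup for one numeric UID.
# Expects lines of the form "<uid> <nlwp> <cgroup>" on stdin.
threads_by_cgroup() {
    awk -v uid="$1" '
        $1 == uid { total[$3] += $2 }
        END { for (c in total) print total[c], c }
    '
}

# Possible usage on a live host (assumes procps ps, not run here):
#   ps -e -o uid=,nlwp=,cgroup= | threads_by_cgroup "$(id -u caddy)" | sort -rn
```

Cgroups other than the one for caddy.service showing up in the output would indicate that containers or other services share the UID.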

@bt90
Contributor Author

bt90 commented Dec 28, 2023

The limit seems to include threads as explained in setrlimit(2):

The maximum number of processes (or, more precisely on Linux, threads) that can be created for the real user ID of the calling process. Upon encountering this limit, fork(2) fails with the error EAGAIN.

I used the following command to determine the current value:

sudo ps -U caddy -h -o nlwp | awk '{total += $1} END {print total}'

This yields 749 on my system with all services and containers running. LimitNPROC=800 works, but the unit fails to start once I switch to LimitNPROC=700.

The task limit is much more reliable and better reflects reality:

● caddy.service - Caddy
     Loaded: loaded (/etc/systemd/system/caddy.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2023-12-28 20:45:05 CET; 8min ago
       Docs: https://caddyserver.com/docs/
   Main PID: 1885185 (caddy)
      Tasks: 8 (limit: 512)
     Memory: 12.7M
        CPU: 413ms
     CGroup: /system.slice/caddy.service
             └─1885185 /usr/bin/caddy run --environ --config /etc/caddy/Caddyfile

1 process with 8 threads (the main thread + 7 more) -> 8 tasks
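
As a rough cross-check of the numbers systemd reports, the same counters can be read from the cgroup's pids controller files. This is only a sketch: the helper name `cgroup_tasks` is invented, and the path in the usage comment assumes a cgroup v2 unified hierarchy mounted at /sys/fs/cgroup.

```shell
#!/bin/sh
# Hypothetical helper: print "<current>/<max>" task counts for a cgroup directory.
# pids.current and pids.max are the cgroup v2 pids controller files.
cgroup_tasks() {
    printf '%s/%s\n' "$(cat "$1/pids.current")" "$(cat "$1/pids.max")"
}

# Possible usage (path is an assumption, not run here):
#   cgroup_tasks /sys/fs/cgroup/system.slice/caddy.service
```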


@bt90
Contributor Author

bt90 commented Dec 29, 2023

The main offender in my case is the browserless/chrome container.

docker top playwright-chrome o user,uid,pid
USER                UID                 PID
caddy               999                 2654461
caddy               999                 2654495
caddy               999                 2654496
caddy               999                 2654497
caddy               999                 2654671
caddy               999                 2654672
caddy               999                 2654673
caddy               999                 2654674
caddy               999                 2654675
caddy               999                 2654676
caddy               999                 2654677
caddy               999                 2654678
caddy               999                 2654683
caddy               999                 2654684
caddy               999                 2654685
caddy               999                 2654686
caddy               999                 2654691
caddy               999                 2654692
caddy               999                 2654693
caddy               999                 2654694
caddy               999                 2654703
caddy               999                 2654704
caddy               999                 2654705
caddy               999                 2654706
caddy               999                 2654707
caddy               999                 2654708
caddy               999                 2654709
caddy               999                 2654710
caddy               999                 2654711
caddy               999                 2654712
caddy               999                 2654715
caddy               999                 2654717
caddy               999                 2654718

The container alone is enough to trip the limitation:

docker top playwright-chrome -o nlwp,pid | tail -n +2 | awk '{total += $1} END {print total}'

yields 717.

@bt90
Contributor Author

bt90 commented Jan 11, 2024

@francislavoie the following systemd documentation PR describes the situation very well: systemd/systemd#23242

@carlwgeorge
Collaborator

@francislavoie Sorry for my delay in getting back to you on this. Adoption of this option should wait until the project is ready to abandon building for RHEL 7. TasksMax was added in systemd 227, but RHEL 7 only has systemd 219. RHEL 8 bumps up to systemd 239. RHEL 7 goes EOL on 2024-06-30, so that may be the ideal time to switch to TasksMax.
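
For anyone unsure which side of that cutoff a host is on, a small sketch of the version check might look like this. The function names are made up for illustration, and the parsing assumes the first line of `systemctl --version` has the form "systemd 239 (...)".

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given systemd version has TasksMax=
# (added in systemd 227 according to the comment above).
supports_tasksmax() {
    [ "$1" -ge 227 ]
}

# Hypothetical helper: print the version number from `systemctl --version`.
systemd_version() {
    systemctl --version | awk 'NR == 1 { print $2 }'
}

# Possible usage (not run here):
#   if supports_tasksmax "$(systemd_version)"; then echo "TasksMax available"; fi
```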

@francislavoie
Member

francislavoie commented Feb 8, 2024

Thanks @carlwgeorge, glad I waited.

Would it be okay if we dropped RHEL 7 support early then? It just would mean it wouldn't receive this one last release before official EOL I guess.

Does COPR have recent download stats that would give us an idea how much it's still being used?

@bt90
Contributor Author

bt90 commented Feb 8, 2024

Note that users would still be able to work around it using systemctl edit caddy.
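
A minimal sketch of what such an override might contain, assuming the goal is to lift the per-UID limit and cap only the service's own tasks. The helper name `caddy_override` is invented, the value 512 mirrors the limit discussed in this thread, and the drop-in path in the usage comment is the usual location for unit overrides rather than something confirmed here.

```shell
#!/bin/sh
# Hypothetical helper: print a drop-in that disables the per-UID process limit
# and limits the tasks of the service's own cgroup instead.
caddy_override() {
    cat <<'EOF'
[Service]
LimitNPROC=infinity
TasksMax=512
EOF
}

# Possible usage (requires root, not run here):
#   sudo mkdir -p /etc/systemd/system/caddy.service.d
#   caddy_override | sudo tee /etc/systemd/system/caddy.service.d/override.conf
#   sudo systemctl daemon-reload && sudo systemctl restart caddy
```

On a systemd older than 227 the TasksMax= line would not be understood, so the override would only make sense with the LimitNPROC= line there.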

@francislavoie
Member

Yeah, understood. I'd just rather not block merging this for everyone else who would benefit, while waiting for one old distribution to cycle out.

@carlwgeorge
Collaborator

Would it be okay if we dropped RHEL 7 support early then? It just would mean it wouldn't receive this one last release before official EOL I guess.

I don't have a problem with the project dropping support for RHEL 7 early, I just would like it to be an explicit decision, not a "whoops". Doing it intentionally would also be less disruptive for RHEL 7 caddy users, as we wouldn't ship an update to them with an incompatible option. Ideally we would inform them with some kind of announcement that there will be no more RHEL 7 caddy updates.

Does COPR have recent download stats that would give us an idea how much it's still being used?

RHEL 7 and its derivatives are still pretty popular, more so than they should be this late into their lifecycles. Here are the download stats from COPR.

  • EPEL 7: 16,491
  • EPEL 8: 11,145
  • EPEL 9: 26,813

Note that users would still be able to work around it using systemctl edit caddy.

Just like people that want to start using TasksMax now can, before the project makes it the default. With the broad range of systemd versions in the wild, it makes more sense to keep the default unit using directives that are available on all distributions that the project targets with the apt and rpm repos.

@bt90
Contributor Author

bt90 commented Feb 8, 2024

We could also simply drop LimitNPROC to be honest.

@francislavoie
Member

francislavoie commented Feb 8, 2024

EPEL 7: 16,491

Oof, yeah that's not as low as I'd hoped.

We could also simply drop LimitNPROC to be honest.

Yeah, I'd be okay with that too in the short term.

I don't really think we need to worry about Go running wild, it's a well-behaved runtime.
