-
-
Notifications
You must be signed in to change notification settings - Fork 506
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nix: fix auto-allocate-uids #1335
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I tried to accommodate auto-allocate-uids
when making these changes but missed this detail. I’m surprised it has users since my experience was that it was pretty broken, but I definitely don’t want to make it harder for people to try out experimental features.
I didn’t realize that the nixbld
group still has to exist for auto-allocate-uids
; I guess for ownership of store paths being built or something?
c3fe32d
to
7f6a969
Compare
This turned out to be a bit more involved than I originally anticipated. Whilst the original version of the PR did evaluate without errors, if it activated it would have probably broken everything, and by sheer coincidence I was on slow Wi-Fi and it needed to download some big updated packages, which just didn't work. So it never activated, which would have probably totally broken my computer and deleted the build group or something. Now, I tested both options on my system, toggling between And also, in auto-allocate-uids.mp4 |
Hold off updating nix-darwin past LnL7/nix-darwin#1313 until LnL7/nix-darwin#1335 is resolved, because of auto-allocate-uids.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for being so slow at getting back to this one.
modules/users/default.nix
Outdated
echo "deleting user ${name}..." >&2 | ||
dscl . -delete ${escapeShellArg "/Users/${name}"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about this some more I’m wondering if we should just keep the warning here for now to unblock this without risking clobbering system users.
cc @Enzime – do you think it’s safe to remove this conditional here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I think I’d be comfortable with the conservative change of allowing 350 < UID ≤ (350 + maxBuildUsers) as well as UID > 501 here. I think the idea is to avoid deleting system users.
7f6a969
to
494257c
Compare
Ok, I updated it so now the group is properly emptied when I didn't cap This also has the side effect of fixing another bug I ran across whilst working on this: if you raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing the fancier approach! 💜 And yeah, fixing nrBuildUsers
adjustment was my original motivation for wanting to pre‐fill knownUsers
that happened to dovetail nicely with the problems here.
I didn't cap
nrBuildUsers
in the config though, because I think it can be considered a breaking API change, which would need a new state version and everything. So, a potential error can arise ifnrBuildUsers > 128
, ifauto-allocate-uids
is then turned on, it will only delete the first 128 users, which could be seen as problematic. It wouldn't break anything, since none of the users are used anyway, but it's not nice to have those users around not doing anything and polluting the system. For now I think they can be deleted manually, and it shouldn't be too much of a big issue sincenrBuildUsers
is mostly set to the number of CPU cores, and no Mac has ever had more than 128 CPU cores (at least I don't think it has...? Unless maybe Hackintosh or VM...). But in the end I did it this way to avoid a breaking change.
As far as I’m aware, the most parallel Mac ever was the Mac Pro with a Xeon W-3275M, with 28 cores and 56 threads. There are rumours of a 32‐core Apple Silicon chip on the horizon, but that wouldn’t have SMT. I have no idea if max-jobs = auto
uses cores or threads, though. In practice I think 32 jobs should be more than enough for almost everybody, because with that many cores you also probably want to have those builds using parallelism themselves.
Evaluation‐time errors are a relatively innocuous kind of backwards incompatibility and I think it would be okay here. The state version is mostly for things where there’s a backwards‐incompatible change in state managed by the system; say, defaulting to a newer database version with an incompatible format, or otherwise changing a default or internal format in a way that could break an existing system at activation.
Tightening up evaluation checks can’t really brick someone’s system in that way, so I think constraining the range of nrBuildUsers
to avoid any mishaps is a good idea. Since build UIDs start at 351 these days, and going above 400 has proven dodgy in the past, we probably want the default cap to be < 50. Apple already uses GID 395, so I’d feel hesitant going above 44. Perhaps we could bound the maximum at a nice round 40 or even 36 and wait to see if that’s a problem for anyone? It’s easy for us to expand the maximum range later, especially if we move away from the knownUsers
model, but shrinking it is tough, as you’ve found, so I’d err on the side of conservatism.
(People who have a custom ids.uids.nixbld
set would be overly constrained by any of these values. But if they know what they’re doing then they can override the type, and if there’s a legitimate use case for high values I’d rather get a bug report about it now than only find out about it when something else breaks.)
# between `_nixbld1` and `_nixbld128` is considered "managed" by | ||
# nix-darwin, and will be created/deleted as `nrBuildUsers` increases | ||
# and decreases (or is set to 0 in case of `auto-allocate-uids`. | ||
nixbldUserNames = map (x: "_nixbld${toString x}") (range 1 128); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nixbldUserNames = map (x: "_nixbld${toString x}") (range 1 128); | |
nixbldUserNames = map (x: "_nixbld${toString x}") (range 1 maxBuildUsers); |
and use that for a lib.types.ints.between 1 maxBuildUsers
type in nrBuildUsers
.
modules/users/default.nix
Outdated
echo "deleting user ${name}..." >&2 | ||
dscl . -delete ${escapeShellArg "/Users/${name}"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I think I’d be comfortable with the conservative change of allowing 350 < UID ≤ (350 + maxBuildUsers) as well as UID > 501 here. I think the idea is to avoid deleting system users.
After thinking about it a little more: Since the installers don’t really cap the value and someone who installs with 64 users and then gets the default value of 32 would have the users from 32 to I’ll sleep on it, but let me know what you think. One concern I have is whether checking the existence of all of those users might meaningfully slow down activation… |
37bd408
to
554455d
Compare
Ok, I set the deletion check to be like before, checking if UID > 501, but now also if it's between One scenario which I can think of in which this can break is if someone decides that they want to move their UID range to something else, say they are currently on base UID of 350, but want to go to 30000 or something. The new
Makes sense, so basically cleaning up the installer's leftover unused users. Although I wonder what the upper limit for this should really be? I set it as 128 for now, since it seemed like a good number, but ideally there wouldn't be any number at all and any user of the build group would be considered "managed".
Here's a totally unscientific benchmark of 0, 40, 64, and 128 users, based purely off of screen recording my terminal (measured first frame of "setting up user launchd services" to drop to shell + 5 frames). I think 128 users could be considered a bit too slow, but if you think so, I can just set it to 40 or 64 users, which are a bit faster. Of course 0 users is fastest, but currently most people will have 32 in knownUsers. Oh also in all of these recordings 0 users actually exist ( activation.mp4Also now manually setting |
554455d
to
bf27f60
Compare
Fixes auto-allocate-uids not working due to errors during activation checks. Also fixes a bug where raising `nrBuildUsers` would create new `_nixbld` users, but lowering it wouldn't then remove those users.
bf27f60
to
3d43c10
Compare
A few days ago, there was a change 1 that removed the
nix.configureBuildUsers
option, and made it so that the build users and group was always managed. Unfortunately this broke theauto-allocate-uids
option:configureBuildUsers
(internal variable) is set to false ifauto-allocate-uids
is set to true. LinkThe users and groups are configured when
configureBuildUsers
is true (soauto-allocate-uids
is false)... Link...but the users and groups are added to
knownUsers
andknownGroups
regardless... Link...which leads to the assertions on Line 798 always being false, and also leads to nix-darwin attempt to delete the
nixbld
group. LinkThe error shown when rebuilding with the problematic change and
auto-allocate-uids
enabled is this:This PR fixes both of these issues (failed assertion and attempt to delete
nixbld
group, which is still necessary forauto-allocate-uids
despite no users being in the group), by only adding the user assertions whenconfigureBuildUsers
is true, and updating theusers.knownUsers
to also only be set in that case. Additionally, thenixbld
group is now always created.Footnotes
Commit adc989f ↩