armbian-firstlogin: power-loss recovery and atomic writes#9954
armbian-firstlogin: power-loss recovery and atomic writes#9954mingzhangqun wants to merge 3 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds atomic_write and applies staged atomic file replacements across firstlogin/firstrun; makes locale, netplan, shell defaults, sudoers, SSH edits atomic; and adds user-creation and marker coordination logic to tolerate partial runs and power loss. ChangesFirst-login Script Crash and Power-Loss Safety
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/bsp/common/usr/lib/armbian/armbian-firstlogin`:
- Around line 1114-1118: The rm -f /root/.not_logged_in_yet deletion is outside
the main guard (if [[ -f /root/.not_logged_in_yet && -n "$(tty)" ]]) and can
remove the marker when the setup block (lines ~724-1108) is skipped; restore
robustness by relocating the marker removal inside that main guard or by adding
the same condition before deleting (check -f /root/.not_logged_in_yet and -n
"$(tty)" or equivalent variable used in the script) so that
/root/.not_logged_in_yet is only removed after the setup logic in the main block
(the same guarded section that performs the setup steps) has completed.
- Around line 18-25: atomic_write currently treats an empty string mode the same
as "unset" using [[ -n "$mode"]], causing fallback to --reference when the dest
may not exist; change the check to detect whether a second argument was actually
passed (e.g. if [ "$#" -ge 2 ]), and if provided call chmod "$mode" and
fail/return if that chmod fails (do not fall through to --reference), otherwise
(mode not provided) only attempt chmod --reference="$dest" when the destination
exists (silently skip if it doesn't); also audit call sites that create new
files (e.g. sudoers/netplan writers) and pass an explicit mode when the
destination may not exist.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cc4b89eb-0168-4f28-b71d-82a08570e3ce
📒 Files selected for processing (1)
packages/bsp/common/usr/lib/armbian/armbian-firstlogin
ba40aa9 to
f2e9a4d
Compare
|
Why not just use the |
|
Hi @adeepn Thank you for your reply. |
my bad: I was reading BSD version man page, which does handle flush and renames :) But in that case I'd suggest having the function mirror install's call arguments, so the function's semantics are clear from the call site: anyone reading the code should immediately see what it does, rather than having to guess that it appends .tmp to the name. |
|
Good point about readability — I've updated |
|
since you prepared "atomic-install", please re-fix #9937 |
| chmod --reference="$dest" "$src" | ||
| fi | ||
| mv -f "$src" "$dest" | ||
| sync -f "$dest" 2>/dev/null || sync "$dest" 2>/dev/null |
There was a problem hiding this comment.
I was about to write "thou shall not use sync rather sync --file-system", but then seeing sync --file-system here already!
Nice!
And to be honest, I was made aware of the -f option today when reading up on https://www.linux.org/threads/sync-vs-fsync-%E2%80%94-making-sure-your-writes-actually-hit-the-disk.64365/.
| #Enable Wlan, disable Eth | ||
| local _f="/etc/netplan/30-${DEVTYPE}${CONFIG_NAME}.yaml" | ||
| echo -e "$(createYAML)" > "${_f}.tmp" | ||
| atomic_write "${_f}.tmp" "$_f" 600 |
There was a problem hiding this comment.
Style comment; please use ${VAR} everywhere, not just where it's really really needed.
|
|
||
| # change shell for future users | ||
| sed -i "s|^SHELL=.*|SHELL=${SHELL_PATH}|" /etc/default/useradd | ||
| sync -f /etc/default/useradd 2>/dev/null |
There was a problem hiding this comment.
The atomic_write() is a very OK way to do these things, as sed -i does have some tendency to bork things up. Why not use that here as well?
| # change default locales | ||
| sed -i "s/=.*/=$(echo ${LOCALES} | cut -d" " -f1)/g" /etc/default/locale | ||
| # change default locales (atomic write) | ||
| if sed -i "s/=.*/=$(echo ${LOCALES} | cut -d" " -f1)/g" /etc/default/locale; then |
There was a problem hiding this comment.
Please do the atomic_write() here as well
| if ! grep -q '^root\s\+ALL' /etc/sudoers 2>/dev/null; then | ||
| # sudoers corrupted (power-loss), restore from dpkg default | ||
| local _f="/etc/sudoers" | ||
| cat > "${_f}.tmp" << 'SUEOF' |
There was a problem hiding this comment.
Be aware that herefiles with a quoted end tag ('EOL' or \EOL) will not do any shell expansion inside of the herefile content. Best to add a remark for future users that want to modify this part.
There was a problem hiding this comment.
Done — added a comment above the heredoc in the latest commit:
# Note: quoted heredoc tag ('SUEOF') disables shell variable expansion;
# all content is literal — do not add $VAR references inside without unquoting.
cat > "${_f}.tmp" << 'SUEOF'| REPEATS=$((REPEATS - 1)) | ||
| fi | ||
| [[ "$REPEATS" -eq 0 ]] && logout | ||
| [[ "$REPEATS" -eq 0 ]] && exit 0 |
There was a problem hiding this comment.
What is the reason to do an exit here instead of a logout ?
There was a problem hiding this comment.
The script runs as a child process (has #!/bin/bash shebang, executed not sourced), so it's not a login shell. logout is a bash builtin restricted to interactive login shells and can behave unexpectedly outside that context. exit 0 exits the script process explicitly and works regardless of how the script is invoked. Using exit 0 here makes the retry-exhaustion path correct and portable.
| sed -e '/^#AcceptEnv LANG/ s/^#//' -i /etc/ssh/sshd_config | ||
| # re-enable passing locale environment via ssh and allow root password login (atomic write) | ||
| _f="/etc/ssh/sshd_config" | ||
| if sed -e '/^#AcceptEnv LANG/ s/^#//' -e 's/^#PermitRootLogin.*/PermitRootLogin yes/' "$_f" > "${_f}.tmp"; then |
There was a problem hiding this comment.
Perhaps make this a little more permissive with:
sed -E "
/^[[:blank:]]*#[[:blank:]]*acceptenv[[:blank:]]+lang/I {
i #[NOTE: Modified by ${0##*/}]
s/^[[:blank:]]**#//;
}
/^[[:blank:]]*(#[[:blank:]]*)?permitrootlogin[[:blank:]]+/I {
i #[NOTE: Added by ${0##*/}]
i PermitRootLogin yes
d;
}" "${_f} > ...
| sed -e '/^#AcceptEnv LANG/ s/^#//' -i /etc/ssh/sshd_config | ||
| # re-enable passing locale environment via ssh and allow root password login (atomic write) | ||
| _f="/etc/ssh/sshd_config" | ||
| if sed -e '/^#AcceptEnv LANG/ s/^#//' -e 's/^#PermitRootLogin.*/PermitRootLogin yes/' "$_f" > "${_f}.tmp"; then |
There was a problem hiding this comment.
Not so sure about the PermitRootLogin yes here, what's the reason to have that in? For "hardened" configurations, if root login is to be permitted, at least add something additional like PermitRootLogin prohibit-password (which would require a pubkey foir root logins instead of a password) to prevent brute force attacks, or force to have both password and a pubkey.
There was a problem hiding this comment.
Good point — I've removed PermitRootLogin yes in the latest commit. It was not in the original upstream code and is out of scope for this power-loss safety PR. The sshd_config edit now only uncomments AcceptEnv LANG (using the more permissive sed -E pattern you suggested), which is what the original code did.
Add atomic_write() helper that writes to a temp file, syncs, then renames — surviving power loss at any point. Use it for all armbianEnv.txt modifications (hostname, MAC, root password). Guard atomic_write permission handling: only chmod --reference when the destination exists; use explicit mode for new files. Guard firstlogin marker deletion behind the same tty check as the main setup block so non-interactive invocations don't remove the marker without completing setup.
Change atomic_write() signature from (dest, [mode]) to (src, dest, [mode]) so call sites clearly show the data flow without relying on an implicit .tmp naming convention. Use local variables to avoid repeating long paths. Consolidate separate chmod calls into the mode parameter where possible.
Move atomic_write() from armbian-firstlogin into armbian-common so both armbian-firstrun and armbian-firstlogin share one implementation, and refactor the inline atomic-write logic in armbian-firstrun's MAC randomization (armbian#9937) to use it. Also address review feedback on armbian#9954: - Use ${VAR} notation consistently in the newly added code - Apply atomic_write() to /etc/default/useradd, /etc/adduser.conf and /etc/default/locale (previously left with plain sed -i + sync) - Add a note about the quoted heredoc tag ('SUEOF') disabling variable expansion - Drop the PermitRootLogin yes change from sshd_config editing (not in the original code, and out of scope for this power-loss safety PR) - Use a more permissive sed -E pattern for uncommenting AcceptEnv LANG
fe1b713 to
e9b0b9d
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/bsp/common/usr/lib/armbian/armbian-common`:
- Line 44: The current fallback call uses 'sync "${dest}"' which fails on
systems where sync has no file argument; update the fallback so if 'sync -f
"${dest}" 2>/dev/null' fails, it calls plain 'sync' (no arguments) instead
(i.e., change the expression that references the command string 'sync -f
"${dest}" 2>/dev/null || sync "${dest}" 2>/dev/null' to fall back to a bare
'sync' so older systems actually flush to disk).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a9144ad4-1905-4cd5-bdba-eff0d52a669d
📒 Files selected for processing (3)
packages/bsp/common/usr/lib/armbian/armbian-commonpackages/bsp/common/usr/lib/armbian/armbian-firstloginpackages/bsp/common/usr/lib/armbian/armbian-firstrun
| chmod --reference="${dest}" "${src}" | ||
| fi | ||
| mv -f "${src}" "${dest}" | ||
| sync -f "${dest}" 2>/dev/null || sync "${dest}" 2>/dev/null |
There was a problem hiding this comment.
Correct the sync fallback to work on older systems.
The fallback sync "${dest}" will fail on systems where sync does not support the -f option, because standard sync (without -f or --file-system) does not accept file arguments. The error is suppressed by 2>/dev/null, so the sync silently fails, leaving data potentially unflushed and defeating the crash-safety goal of this PR.
🔧 Proposed fix
- sync -f "${dest}" 2>/dev/null || sync "${dest}" 2>/dev/null
+ sync -f "${dest}" 2>/dev/null || sync📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| sync -f "${dest}" 2>/dev/null || sync "${dest}" 2>/dev/null | |
| sync -f "${dest}" 2>/dev/null || sync |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/bsp/common/usr/lib/armbian/armbian-common` at line 44, The current
fallback call uses 'sync "${dest}"' which fails on systems where sync has no
file argument; update the fallback so if 'sync -f "${dest}" 2>/dev/null' fails,
it calls plain 'sync' (no arguments) instead (i.e., change the expression that
references the command string 'sync -f "${dest}" 2>/dev/null || sync "${dest}"
2>/dev/null' to fall back to a bare 'sync' so older systems actually flush to
disk).
Summary
sed -iwith atomic write pattern (sedto temp file →sync→mv) for critical config files (sudoers, sshd_config, locale.gen, netplan)chmod --referenceto preserve original file permissions in atomic writes (per feedback on armbian-firstrun: use atomic write for armbianEnv.txt MAC randomization #9937).not_logged_in_yetdeletion to end of script so any interrupted step causes re-runlogout→exit 0(correct for subshell context)$(tty)in testBackground
On embedded devices (e.g. reComputer RK3576), power loss during first boot is a real scenario. The stock
armbian-firstloginhas several issues:sed -itruncates and rewrites — power loss during this window leaves zero-filled filesTest plan
armbianEnv.txt,sudoers,sshd_configare valid after rebootSummary by CodeRabbit