@pprindeville
Member

If named gets stopped, then started again, but isc-dhcpd isn't also restarted, then we want named to at least have the existing content.

📦 Package Details

Maintainer: @nmeyerhans

Description:
Save domains we serve dynamically so they're reloaded if the server gets started after a stop.


🧪 Run Testing Details

  • OpenWrt Version: HEAD
  • OpenWrt Target/Subtarget: x86_64/generic
  • OpenWrt Device: supermicro-sys-5018d-fn8t

✅ Formalities

  • I have reviewed the CONTRIBUTING.md file for detailed contributing guidelines.

If your PR contains a patch:

  • It can be applied using git am
  • It has been refreshed to avoid offsets, fuzzes, etc., using
  • It is structured in a way that it is potentially upstreamable

cc: @Alphix @systemcrash

@pprindeville
Member Author

Just verified that if you do an opkg install on a running system, then the update is seamless and you don't need to also restart dhcpd.

@Alphix
Contributor

Alphix commented Dec 2, 2025

Shouldn't that just be "rndc stop"? It'll save the dynamic zones and it'll shutdown bind cleanly... Avoiding things like an nsupdate coming in between sync and shutdown...(IIRC)

@pprindeville
Member Author

> Shouldn't that just be "rndc stop"? It'll save the dynamic zones and it'll shutdown bind cleanly... Avoiding things like an nsupdate coming in between shutdown and sync...(IIRC)

We're running the server with -f as a foreground child of procd, so killing the process out from under procd is going to cause issues...

@nmeyerhans
Contributor

nmeyerhans commented Dec 2, 2025

> > Shouldn't that just be "rndc stop"? It'll save the dynamic zones and it'll shutdown bind cleanly... Avoiding things like an nsupdate coming in between shutdown and sync...(IIRC)
>
> We're running the server with -f as a foreground child of procd, so killing the process out from under procd is going to cause issues...

We may want to switch to a pattern similar to the one described here rather than run named in the foreground. We can then call rndc stop in the cleanup function.

@systemcrash
Contributor

Foreground might not be desirable in managing process lifetimes but it’s an effect of running procd which does all the lifting for us.

@Alphix
Contributor

Alphix commented Dec 2, 2025

> > Shouldn't that just be "rndc stop"? It'll save the dynamic zones and it'll shutdown bind cleanly... Avoiding things like an nsupdate coming in between shutdown and sync...(IIRC)
>
> We're running the server with -f as a foreground child of procd, so killing the process out from under procd is going to cause issues...

I haven't tested it myself, what issues does it cause? Like, ugly log messages or what are the issues?

@nmeyerhans
Contributor

> Foreground might not be desirable in managing process lifetimes but it’s an effect of running procd which does all the lifting for us.

We need a process in the foreground, but it doesn't need to be named. As described in the doc I linked to, if we need to do anything at all complex in the wrapper script, then using the shell as the foreground process is an established pattern.
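
For illustration, the wrapper pattern mentioned here might look roughly like the sketch below. The linked document is not reproduced in this thread, so this is a guess at the general shape rather than its actual content. `supervise`, `start_cmd`, and `stop_cmd` are hypothetical names; the real named command line would come from the init script.

```shell
#!/bin/sh
# Hypothetical wrapper: the shell is the foreground process that procd
# supervises, and the daemon runs as a child of the shell.

supervise() {
    # $1: command that runs the daemon without daemonizing (e.g. "named -f")
    # $2: command that asks the daemon to stop cleanly (e.g. "rndc stop")
    start_cmd=$1
    stop_cmd=$2
    rc=0

    # Start the daemon as a background job of this shell and remember its PID.
    $start_cmd &
    child=$!

    # When procd signals the wrapper, request a clean shutdown instead of
    # letting the signal kill the daemon directly.
    trap '$stop_cmd' TERM INT

    # wait returns early when a trapped signal arrives, so keep waiting
    # until the child has really exited and been reaped.
    while kill -0 "$child" 2>/dev/null; do
        rc=0
        wait "$child" || rc=$?
    done

    trap - TERM INT
    return "$rc"
}
```

Under these assumptions the procd command would be the wrapper itself, which would end with something like `supervise "named -f" "rndc stop"`. Whether procd's stop timeout leaves enough time for the flush is not covered by this sketch.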

@pprindeville
Member Author

@nmeyerhans Should I merge and someone can change the wrapper if they're inclined? Looking through the sources I couldn't find a .init script that did its own backgrounding of the daemon.

@systemcrash
Contributor

Let the usual homie procd do its thing.

@Alphix
Contributor

Alphix commented Dec 3, 2025

> Let the usual homie procd do its thing.

We're talking about potential data loss...(and yes, I understand this PR improves things, but we should aim for a complete fix)

@nmeyerhans
Contributor

> @nmeyerhans Should I merge and someone can change the wrapper if they're inclined? Looking through the sources I couldn't find a .init script that did its own backgrounding of the daemon.

I don't think we should merge as-is. The potential for data loss due to the race condition is something we should avoid. The procd docs suggest that stopping the service within the stop_service() function is appropriate. "stop_service() is only needed when you need special things to stop your service" would seem applicable here. Do we know for sure that rndc stop in this function causes problems? If so, what sort of problems?
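
As a rough sketch of what such a stop_service() could look like: the version below goes one step beyond what the thread discusses by waiting for named to exit. It assumes that `rndc stop` can return before named has finished flushing, and relies on the documented `-p` option, which makes rndc report named's process ID. The exact output format of `-p` is not assumed (only the digits are kept), and the 10-second limit is an arbitrary choice.

```shell
stop_service() {
    local pid tries=0

    # Ask named to save dynamic zones and shut down; -p makes rndc report
    # named's PID. Keep only the digits so the exact wording does not matter.
    pid=$(rndc stop -p | tr -cd '0-9')
    [ -n "$pid" ] || return 1

    # Poll until named has exited, giving up after about 10 seconds
    # (an arbitrary limit chosen for this sketch).
    while kill -0 "$pid" 2>/dev/null && [ "$tries" -lt 10 ]; do
        sleep 1
        tries=$((tries + 1))
    done
}
```

If rndc cannot reach named, the function returns non-zero without waiting; procd's normal stop handling would still apply afterwards.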

@Alphix
Contributor

Alphix commented Dec 4, 2025

"rndc freeze" might be a workable alternative (assuming that it doesn't return until everything has been synced to disk and that the frozen state isn't remembered across restarts)... But I'd still like to know what issues "rndc stop" causes with procd?
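
A minimal sketch of the freeze variant, for comparison. Per the rndc documentation, `freeze` with no zone argument suspends updates to all dynamic zones and syncs their journals into the zone files. The two conditions raised in the comment (that it blocks until the sync is done, and that the frozen state does not survive a restart) are not verified here and remain assumptions.

```shell
stop_service() {
    # Freeze all dynamic zones: further dynamic updates are refused and
    # journal changes are written into the zone files.
    if ! rndc freeze; then
        echo "named: rndc freeze failed; journals may not be synced" >&2
        return 1
    fi
    # procd then stops named through its usual signal-based path.
}
```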

@nmeyerhans
Contributor

I've just tested this with rndc stop and I don't see any sign of this being an issue:

root@OpenWrt:/etc/config# /etc/init.d/named start
root@OpenWrt:/etc/config# /etc/init.d/named status
running
root@OpenWrt:/etc/config# grep -A2 stop_service /etc/init.d/named
stop_service() {
        rndc stop
}
root@OpenWrt:/etc/config# /etc/init.d/named stop
root@OpenWrt:/etc/config# /etc/init.d/named status
inactive
root@OpenWrt:/etc/config# logread | tail
Thu Dec  4 17:44:03 2025 daemon.info named[3860]: managed-keys-zone: Key 20326 for zone . is now trusted (acceptance timer complete)
Thu Dec  4 17:44:03 2025 daemon.info named[3860]: managed-keys-zone: Key 38696 for zone . is now trusted (acceptance timer complete)
Thu Dec  4 17:44:25 2025 daemon.info named[3860]: received control channel command 'stop'
Thu Dec  4 17:44:25 2025 daemon.info named[3860]: no longer listening on 127.0.0.1#53
Thu Dec  4 17:44:25 2025 daemon.info named[3860]: no longer listening on 172.18.0.202#53
Thu Dec  4 17:44:25 2025 daemon.info named[3860]: no longer listening on ::1#53
Thu Dec  4 17:44:25 2025 daemon.info named[3860]: no longer listening on fe80::5054:ff:fe0a:506%5#53
Thu Dec  4 17:44:25 2025 daemon.notice named[3860]: stopping command channel on 127.0.0.1#953
Thu Dec  4 17:44:25 2025 daemon.notice named[3860]: stopping command channel on ::1#953
Thu Dec  4 17:44:25 2025 daemon.info named[3860]: shutting down: flushing changes

If rndc stop is executed outside the procd script, then procd restarts named as expected. But when it is invoked within the stop_service handler, the expected state is inactive, and procd does not automatically restart it. Restarting it manually with /etc/init.d/named start starts it again as expected.

@pprindeville
Member Author

So... merge?

If named gets stopped, then started again, but isc-dhcpd isn't also
restarted, then we want named to at least have the existing content.

Signed-off-by: Philip Prindeville <[email protected]>
@pprindeville pprindeville merged commit 605a457 into openwrt:master Dec 6, 2025
13 checks passed
@pprindeville pprindeville deleted the bind-save-domains branch December 6, 2025 21:05

4 participants