-
Notifications
You must be signed in to change notification settings - Fork 173
fence_sbd: get devices from SBD_DEVICE env variable if devices parameter isnt set #591
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fence_sbd: get devices from SBD_DEVICE env variable if devices parameter isnt set #591
Conversation
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/1/input |
|
Guess we have to carefully think if this behavior is desirable. Klaus |
|
|
You have to run |
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/2/input |
gao-yan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @liangxin1300 !
Guess we have to carefully think if this behavior is desirable. Since it is possible to have nodes that do support running the SBD daemon (e.g. proper watchdog-device) and others that don't I'd rather have the fence-agent behave the same way regardless on which node it is running.
It still makes sense to respect explicitly specified "devices" especially since there might be the cases where some of the cluster nodes are not running sbd daemon but can be a fencing executor. But OTOH IMO, it also makes sense to be possible to omit the parameter for the normal cases where all nodes are configured to be running sbd daemon. So that by falling back to sysconfig, it'll avoid unnecessary duplication or even inconsistency with what's configured in the sysconfig.
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/3/input |
f542fab to
4aff0dc
Compare
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/4/input |
4aff0dc to
f4222cb
Compare
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/5/input |
f4222cb to
bda9d27
Compare
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/6/input |
|
Puh ... that is a while back but was there a reason to explicitly parse sbd-config? Not 100% sure but I guess systemd is executing it as shell-script (if not we would at least have consistent behavior) so we might do here as well. And shouldn't it even be in the environment of pacemaker and thus be inherited to the fence-agents? |
SBD is not an easy thing to set up. It involves setups of watchdog & block devices, on-disk metadata, sysconfig file, CIB configuration... When the other sbd fence agent was implemented, I believe it was exactly for the purpose of easing the setup and more importantly ensuring the consistency by reading already known configuration from sysconfig rather than unnecessarily repeating them in the CIB.
Unlikely so. It seems to do the parsing by itself:
The python module "configparser" could be a helper. Although it was originally meant for parsing ini files, but apparently it's used by us as well such as:
It'd probably be asking too much of pacemaker if it was purely for the purpose of passing on the environment variables to the fence agent? It might be more convincing if pacemaker itself should care about any of the sbd settings.
Sounds sensible.
It might arguably be a change of behavior but it wouldn't affect existing setups. I believe all of them have always had "devices" explicitly configured. I'd rather not unnecessarily complicate this. This agent handles fencing via disk-based sbd. It seems very strait-forward to me that it should try finding the device list from sysconfig if "devices" is not explicitly specified. Of course if SBD_DEVICE is not even configured in sysconfig, an error should be dropped.
Separate sbd resources configured for different nodes with specific "devices" or without it, together with location constraints should do? |
bda9d27 to
3afaf47
Compare
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/7/input |
|
retest this please |
I guess that was a misunderstanding. I meant that the environment that is currently passed to the fence-agents by pacemaker should already contain the definitions from /etc/sysconfig/sbd. Didn't check though. |
Ah, you are right. Somehow I forgot pacemaker already retrieves sbd sysconfig. In that case, it'd be even better to just reference the environment variable SBD_DEVICE in here. |
3afaf47 to
794b26e
Compare
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/9/input |
794b26e to
744d534
Compare
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/10/input |
|
The CI fail has nothing to do with the code, so no need to worry about it. |
|
Was thinking of a smart way to check for a locally running sbd daemon. |
But as long as there should be the use cases where a node is not running sbd daemon but could be an sbd fencing executor, either fence agent or sbd CLI should be able to handle hanging block devices by itself anyway such as ClusterLabs/sbd#148 ?
Checking with systemd might be more straight-forward but of course technically less generic and more of a dependency. OTOH doing it via sbd.pid would of course require logic in configure.ac dealing with |
|
Guess we should have checking for the daemon in this PR. @oalbrigt what would you prefer? |
|
Yeah. I guess it would make sense to check if sbd is configured and running in this case. |
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/11/input |
|
Add a commit: Please take a look Thanks! |
agents/sbd/fence_sbd.py
Outdated
| if "--devices" not in options: | ||
| dev_list = os.getenv("SBD_DEVICE") | ||
| if dev_list: | ||
| if sbd_daemon_is_running() and dev_list: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If nothing appropriate is coming from SBD_DEVICE we don't have to check for the process.
Thus I'd turn the check around.
… enviroment variable And add @SBDPID_PATH@ for the sbd daemon pid file path
ec93e39 to
06457f9
Compare
|
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-591/12/input |
|
Looking good from my POV. |
|
Thanks. |
…ter isnt set (ClusterLabs#591) * fence_sbd: check if the sbd daemon is running before using SBD_DEVICE enviroment variable
…ter isnt set (ClusterLabs#591) * fence_sbd: check if the sbd daemon is running before using SBD_DEVICE enviroment variable
from environment