-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rbd:check lock ownership in tcmu_rbd_get_lock_tag #452
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: tangwenji <[email protected]>
Hey, Good to have you back working on runner! I am not sure I understood the bug. Back story on why we have that code: The reason for that callout is ESX can send a RTPG to any path to the device. It then expects that the data returned for all ports is current and the same through all paths. In newer version of SPC (4r37) there is a note about this, but ESX already does what it does. Windows has a similar issue where it might send a RTPG through different paths and if they do not match you end up with a inconsistent multiapth device state. Linux will always send a RTPG through each path and only uses the info for the local port so it does not have a similar issues. So the problem with the patch is:
Do you mean we had the lock, it got taken and now we are blacklisted so rbd/ceph request will fail? If so, tcmu_rbd_get_lock_tag should have got a -ESHUTDOWN err from one of the rbd/ceph calls indicating it is blacklisted. It then should have returned TCMU_STS_FENCED. tcmur_device_get_lock_tag should then reopen the device to clear the blacklisting and we should retry the get_lock_tag_id operation and it should work. Is that not working?
Ok, so I might be misunderstanding the problem. The above comment sounds like there an issue where the client has the lock (tag in metadata indicates that client), runner restarts due to crash, then on startup we do not have the lock anymore, but the metadata is invalid because it still references the old client? We then incorrectly report we have the lock because the now stale metadata tag info still points to our target port group, so no STPG is ever sent to this patch to activate it? I see that bug in the code. If that is your bug, how about something like the attached patch which detects the problem and cleans up. |
Yes,I mean this。
You mean after apply with my patch,then your patch has bug? |
I think that when tcmu starts, no locks should be implicitly acquired, regardless of whether there is a lock before. The lock can only be obtained explicitly by issuing the stpg command through the Initiator. |
In the two TCMUs of HA, the alua group id of the same image cannot be set to the same value? |
I meant I see the bug you were describing.
The problem is that if tcmu-runner crashes, the metadata and rbd/ceph still thinks there is a lock held (rbd_lock_get_owners will return the holder) . So you need to either re-acquire the lock or clean it up so the other nodes do not misreport the state. My patch re-acquired it. To clean it up, we still need to grab the lock to clean up the metadata because we must hold that to access the metadata to prevent races where other nodes might be trying to grab the lock and update it.
I did not understand the question. If you have 2 node, each node has 1 iscsi target port group with 1 enabled portal in the group. image1234 is exported through both nodes, node2 has group id 2 and node3 has group id 3. Does that answer your question? I attached a targetcli config .json file here: [edited the node numbering] You would run this one node2 and then run a similar config file on node3 (this file would have the tpg local to that node enabled instead). Here is what it looks like from targetcli ls on the node where the portal 192.168.32.101 is enabled.
|
I understanded, thanks!! |
Are you ok with my patch? Does it fix the issue you hit? Do you use explicit failover? I think for the next release I am going to mark it experimental because our QE team did not get to fully test it. |
@mikechristie I use explicit failover,but it may have some bugs. |
Just a status on this. I found some bugs with my patch, and am fixing. There also seems to be a bug for the case where tcmu-runner is stopped gracefully with systemctl stop tcmu-runner This will start the paths in Standby. The problem is if Linux initiators/multipath had last seen the state where path1 == AO and path2 == Standby and IO was in flight during the tcmu-runner stop. On start up the IO will be failed with an error indicating the path is in standby. That is failed from the SCSI layer to the multipath layer. If multipathd tests the path before dm-multipath has tried to activate a new path (run scsi_dh_activate to send a STPG), then the test IO will succeed and the original path will just be used again. without sending a STPG. The IO will again be failed with a standby error and and we might repeat this until the initiator is rebooted. A similar error can happen on Windows 2016. Updating my patch to handle both the unclean and clean tcmu-runner shutdown cases. |
The tcmu service restarts abnormally when the lock is obtained, the initiator sends an rtpg command to get the alua status is AO, but since the current client is not the ownership of the lock, the IO cannot be executed.
The ownership of the lock is the client before the tcmu service restarts, and it is also stored in the rbd metadata.
Signed-off-by: tangwenji [email protected]