
possible lock starvation in lock conflict scenario #1126

Open
skmprabhu252 opened this issue May 6, 2024 · 2 comments
Labels
Analyzing · Need Info (Need more information from the reporter)

Comments

@skmprabhu252

Let's say:

  1. Process-1 takes a blocking read lock and then sleeps for 5 seconds while holding it.
  2. Process-2 requests a blocking write lock, and the lock will be added to the blocked lock list since Ganesha finds a conflicting lock in the internal structure. The sbd_grant_type is set to STATE_GRANT_INTERNAL.
  3. Process-3 continuously sends read lock requests and unlocks in a loop.
  4. Process-4 from NFS client-2 continuously sends read lock requests and unlocks in a loop.

The problem is that the write lock request from Process-2 may starve for a long time if there is continuous conflict in the FSAL. I suggest upgrading STATE_GRANT_INTERNAL to STATE_GRANT_POLL after the first failed grant attempt.
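A minimal client-side reproduction sketch of this scenario, using POSIX fcntl() byte-range locks (the file path, role names, and timings here are my assumptions for illustration, not taken from the report; one instance per "process" above, all pointed at the same NFS-mounted file):

        /* Hypothetical reproduction sketch: "holder" = Process-1,
         * "writer" = Process-2, anything else = Process-3/4. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>

        static int set_lock(int fd, short type, int cmd)
        {
                struct flock fl;

                memset(&fl, 0, sizeof(fl));
                fl.l_type = type;       /* F_RDLCK, F_WRLCK or F_UNLCK */
                fl.l_whence = SEEK_SET; /* l_start = l_len = 0: whole file */
                return fcntl(fd, cmd, &fl);
        }

        int main(int argc, char **argv)
        {
                int fd;

                if (argc < 2)
                        return 1;

                fd = open("/mnt/nfs/lockfile", O_RDWR); /* assumed path */
                if (fd == -1) {
                        perror("open");
                        return 1;
                }

                if (strcmp(argv[1], "holder") == 0) {
                        /* Process-1: blocking read lock, held 5 seconds */
                        set_lock(fd, F_RDLCK, F_SETLKW);
                        sleep(5);
                        set_lock(fd, F_UNLCK, F_SETLK);
                } else if (strcmp(argv[1], "writer") == 0) {
                        /* Process-2: blocking write lock; this is the
                         * request that starves under continuous conflict */
                        set_lock(fd, F_WRLCK, F_SETLKW);
                        set_lock(fd, F_UNLCK, F_SETLK);
                } else {
                        /* Process-3/4: tight read lock/unlock loop that
                         * keeps the file covered by shared locks */
                        for (;;) {
                                set_lock(fd, F_RDLCK, F_SETLKW);
                                set_lock(fd, F_UNLCK, F_SETLK);
                        }
                }
                return 0;
        }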

In try_to_grant_lock()

                blocked = lock_entry->sle_blocked;
                lock_entry->sle_blocked = STATE_GRANTING;
                if (lock_entry->sle_block_data->sbd_grant_type ==
                    STATE_GRANT_NONE)
                        lock_entry->sle_block_data->sbd_grant_type =
                            STATE_GRANT_INTERNAL;

                status = call_back(lock_entry->sle_obj,
                                   lock_entry);

                if (status == STATE_LOCK_BLOCKED) {
                        /* The lock is still blocked, restore its type and
                         * leave it in the list.
                         */
                        lock_entry->sle_blocked = blocked;
                        lock_entry->sle_block_data->sbd_grant_type =
                                                        STATE_GRANT_NONE;
                        LogEntry("Granting callback left lock still blocked",
                                 lock_entry);
                        return;
                }

In the above code, I guess we need to change

                        lock_entry->sle_block_data->sbd_grant_type =
                                                        STATE_GRANT_NONE;

to

                        lock_entry->sle_block_data->sbd_grant_type =
                                                        STATE_GRANT_POLL;
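
Put together, the STATE_LOCK_BLOCKED branch with the suggested change applied would read roughly as follows (a sketch against the snippet above, not a tested patch):

                if (status == STATE_LOCK_BLOCKED) {
                        /* The lock is still blocked. Instead of resetting
                         * the grant type to STATE_GRANT_NONE, fall back to
                         * polling so the blocked lock keeps being retried
                         * even when the internal grant path loses every
                         * race against the read lock/unlock loops.
                         */
                        lock_entry->sle_blocked = blocked;
                        lock_entry->sle_block_data->sbd_grant_type =
                                                        STATE_GRANT_POLL;
                        LogEntry("Granting callback left lock still blocked",
                                 lock_entry);
                        return;
                }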
@ffilz (Member) commented May 6, 2024

Hmm, I'd have to think about this. We haven't really addressed lock fairness, and there is a question of how much energy to put into NFSv3. What does GPFS itself do in this scenario? Also, is this an actual customer scenario or just a torture test that QA came up with?

@ffilz added the Analyzing and Need Info (Need more information from the reporter) labels on May 6, 2024
@skmprabhu252 (Author)

This is a customer scenario where performance is degraded due to frequent lock conflicts. Ganesha requests a non-blocking lock instead of a blocking lock, causing the GPFS FSAL to return an error. If it were a blocking lock, the request might be queued and granted with priority.
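
In POSIX terms, the distinction being described is roughly the following (illustrative only; the actual Ganesha-to-FSAL call path is different, and this helper is hypothetical):

        #include <errno.h>
        #include <fcntl.h>
        #include <string.h>

        /* Illustrative contrast between the two request styles; the real
         * GPFS FSAL interface is not fcntl(). */
        static int request_write_lock(int fd, int blocking)
        {
                struct flock fl;

                memset(&fl, 0, sizeof(fl));
                fl.l_type = F_WRLCK;
                fl.l_whence = SEEK_SET; /* whole file */

                if (blocking)
                        /* F_SETLKW queues the waiter until the conflict
                         * clears, so it can be granted ahead of later
                         * requests */
                        return fcntl(fd, F_SETLKW, &fl);

                /* F_SETLK fails immediately on conflict; the caller sees
                 * EAGAIN/EACCES and must retry from scratch, losing every
                 * race under continuous load */
                if (fcntl(fd, F_SETLK, &fl) == -1 &&
                    (errno == EAGAIN || errno == EACCES))
                        return -1;
                return 0;
        }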
