v2.3.3 deadlock: pool Close and query sessions hang when BeforeClose cleanup and search-path locking contend #4917

@pskrbasu

Description

In v2.3.3, two concurrency changes can deadlock the Steampipe service and block dynamic connection refresh:

  1. The pgx BeforeClose hook now locks sessionsMutex to clean up session entries. When pool.Close() runs during service stop/restart or pool reset, it waits for every connection to run this hook. If any goroutine holds sessionsMutex (e.g., during AcquireSession), Close() can hang. Symptoms: steampipe service stop/restart appears stuck; Postgres shows backends in BIND; dynamic connection schemas never materialize (“relation … does not exist”).
  2. GetRequiredSessionSearchPath/SetRequiredSessionSearchPath now use an exclusive searchPathMutex. During concurrent query setup (ensureSessionSearchPath), one goroutine holding this lock can block the others, which then appear stuck in BIND state. If a shutdown or refresh coincides, the BeforeClose hook waits on sessionsMutex, compounding the deadlock.

Reports from containerized deployments (v2.3.3) show:

  • Queries against dynamically created connections return “relation … does not exist”.
  • Service stop/restart hangs; requires pkill to terminate.
  • Postgres activity shows connections stuck in BIND.

What to fix:

  • Make BeforeClose cleanup non-blocking (try-lock or defer cleanup) so pool.Close() can’t hang.
  • Use RW locks or otherwise avoid exclusive locks for search-path reads to prevent query/session setup stalls.
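
Both suggestions can be sketched together as follows. This is an illustration under stated assumptions, not a proposed patch: the types and methods (`sessionStore`, `beforeClose`, `drainPendingLocked`, `searchPathState`) are hypothetical, and the only library features relied on are `sync.Mutex.TryLock` (Go 1.18+) and `sync.RWMutex`.

```go
package main

import (
	"fmt"
	"sync"
)

// sessionStore is a hypothetical stand-in for the session bookkeeping.
type sessionStore struct {
	sessionsMutex sync.Mutex
	sessions      map[uint32]struct{}

	pendingMutex sync.Mutex
	pending      []uint32 // PIDs whose cleanup was deferred
}

// beforeClose never blocks on sessionsMutex. It returns true if the entry
// was removed immediately and false if the cleanup was deferred.
func (s *sessionStore) beforeClose(pid uint32) bool {
	if s.sessionsMutex.TryLock() {
		defer s.sessionsMutex.Unlock()
		delete(s.sessions, pid)
		return true
	}
	// The lock is busy: record the PID and return so a pool close can proceed.
	s.pendingMutex.Lock()
	s.pending = append(s.pending, pid)
	s.pendingMutex.Unlock()
	return false
}

// drainPendingLocked removes deferred entries.
// The caller must already hold sessionsMutex.
func (s *sessionStore) drainPendingLocked() {
	s.pendingMutex.Lock()
	defer s.pendingMutex.Unlock()
	for _, pid := range s.pending {
		delete(s.sessions, pid)
	}
	s.pending = nil
}

// searchPathState guards the required search path with a read/write lock,
// so concurrent readers do not serialize behind one another.
type searchPathState struct {
	mu   sync.RWMutex
	path []string
}

// get returns a copy of the path under a shared (read) lock.
func (p *searchPathState) get() []string {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return append([]string(nil), p.path...)
}

// set replaces the path under an exclusive (write) lock.
func (p *searchPathState) set(path []string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.path = append([]string(nil), path...)
}

func main() {
	s := &sessionStore{sessions: map[uint32]struct{}{7: {}}}

	s.sessionsMutex.Lock() // simulate a busy lock holder
	fmt.Println("cleaned immediately:", s.beforeClose(7))
	s.drainPendingLocked() // the holder drains deferred work before unlocking
	s.sessionsMutex.Unlock()
	fmt.Println("remaining sessions:", len(s.sessions))

	sp := &searchPathState{}
	sp.set([]string{"public", "aws"})
	fmt.Println("search path:", sp.get())
}
```

One trade-off of the deferred approach is that a PID queued just after the holder drains stays in `pending` until the next drain, so a real fix would need to drain on every path that takes `sessionsMutex` or periodically. The read/write lock only helps if the getter truly does not mutate shared state.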

Impact: Service stop/restart can hang; dynamic schema refresh fails; queries time out. Observed in 2.3.3; 2.3.2 reportedly OK.

Labels

bug: Something isn't working
