Can't import pool read-write after PSU failure, but can read-only #17820
-
We recently had a dual PSU failure in a JBOD. We replaced one of the failed PSUs with a spare while waiting on the replacement. When we try to import the zpool (either just …), the read-write import fails, although a read-only import works. I suspect there might still be an underlying hardware issue (we are working on getting the power situation fully redundant and checking with the hardware vendor for other signs of failure), but I was hoping to run this past the community and see if there are any suggestions. Let me know if additional information is needed or would be helpful.
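(For reference, a minimal sketch of the two kinds of import attempt being described; the pool name and device directory are placeholders, not taken from the actual system.)

```sh
# Read-write import: the one that fails after the PSU incident.
zpool import -d /dev/disk/by-id tank

# Read-only import: this one succeeds.
zpool import -d /dev/disk/by-id -o readonly=on tank
```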
Replies: 3 comments
-
I've been poking at this more, and as best I can tell the underlying hardware is just fine. Tried one rewind to a previous txg with no luck, but since the failure seems to be on writes it makes sense that a rewind wouldn't fix it (I think). Forgot to add this: …
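(A sketch of what a rewind import attempt like the one mentioned above can look like; the exact flags, pool name, and txg here are assumptions for illustration, not the command that was actually run.)

```sh
# Recovery-mode import: ask ZFS to try discarding the last few transactions.
# -n makes this a dry run, so nothing on disk is modified.
zpool import -d /dev/disk/by-id -F -n tank

# An explicit rewind to a particular txg is also possible (use with care):
# zpool import -d /dev/disk/by-id -T <txg> tank
```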
-
This turns out to be a fascinating one. This is the first time I've seen this, but what happened here is that the vdev config is larger than fits in the label, and so the vdevs all fail to sync their labels out. Because they don't even issue the write, the parent IO (for the top-level vdev) sees good_writes == 0 and sets the error to be EIO. The error message for this is terrible, which is probably something we should fix.

The extreme size of the vdev config is the result of a combination of two things. First, there are 240 disks in a single top-level vdev, which is a lot. Second, the disks have a path, devid, phys_path, and enclosure path all configured.

The thing that triggered the problem to start happening is that, after the outage/reboot, ZFS is trying to set the phys_path for every single disk in the vdev; the config already in the label doesn't have it set for each disk, and the newly generated one does. Weirdly, it is setting them all to the same value: a single entry in …

The way we worked around the issue is by setting …
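(If anyone wants to look at this on their own pool: 'zdb -l' on a member disk dumps the on-disk label nvlists, including the vdev tree with every child's path, devid, and phys_path entries, which is the config described above. The device path below is a placeholder.)

```sh
# Dump the label nvlists from one member disk; the vdev_tree section lists
# every child of that top-level vdev along with its path, devid, phys_path
# and related entries, which is what can outgrow the space in the label.
zdb -l /dev/disk/by-id/SOME-MEMBER-DISK-part1

# Quick check for the "every disk has the same phys_path" symptom:
zdb -l /dev/disk/by-id/SOME-MEMBER-DISK-part1 | grep phys_path | sort | uniq -c
```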
-
Really big thanks to Paul on this. It boils down to my fault for creating a really large dRAID, apparently nearing the maximum size. We have a plan to lay out the pool better. Just wanted to add a few more bits of info here. The pool was created on 2022-11-02. I'm not sure which version of ZFS we were running then, but it has been through a few upgrades since. The child disks each ended up with a few keys, which really increased the size of the label. Unfortunately I do not have a full capture of the zdb output, but I did grab this: …

Looking at some other pools on other systems, some very old, it looks like we have some values in 'phys_path' that are really confusing. In the output above, the 'phys_path' value is the guid of the zpool. On all the EL8/9 systems I could find, '/usr/lib/udev/rules.d/13-dm-disk.rules' will create a '/dev/disk/by-uuid/ZPOOL_GUID' symlink pointing to the most recently added disk when triggered.
So all the disks that I could see in the pool had the same 'phys_path' value, which was a pointer to one disk in the pool. Looking at other, older pools, they have some 'phys_path' values that must be from a test another admin did 5+ years ago; there is no current entry for them on the system, so it seems like the value is not being updated.
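(A quick way to check for the by-uuid symlink behaviour described above; the pool GUID and disk name below are placeholders.)

```sh
# See whether a by-uuid entry named after the pool GUID exists, and which
# member disk it currently points at.
ls -l /dev/disk/by-uuid/ | grep 1234567890123456789

# Inspect the udev properties for one member disk; DEVLINKS shows every
# symlink udev has created for the device, which is a quick way to see
# where a given disk is being referenced from.
udevadm info --query=property /dev/disk/by-id/SOME-MEMBER-DISK
```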