@tonyhutter tonyhutter commented Oct 20, 2025

Motivation and Context

Proposed patch set for zfs-2.2.9.

Description

Add 6.17 kernel support, and a few fixes. Also update CI code to latest version.

How Has This Been Tested?

CI will test

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

mcmilk and others added 30 commits October 16, 2025 16:45
FreeBSD has provided CI images for some time. These images are
based on nuageinit, which does not support fqdn or sudo, for
example, so some workarounds are currently needed to get them
working.

The FreeBSD images should become more compatible with cloud-init
in the near future, at which point the workarounds can be removed.

These versions are used for testing:
- freebsd13-4r (RELEASE)
- freebsd14-3s (STABLE)
- freebsd15-0c (CURRENT)

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Tino Reichardt <[email protected]>
Closes openzfs#17462
When running ztest under the CI a common failure mode is for the
underlying filesystem to run out of available free space.  Since
the storage associated with a GitHub-hosted runner is fixed, we
instead create a pool and use a compressed ZFS dataset to store
the ztest vdev files.  This significantly increases the available
capacity since the data written by ztest is highly compressible.
A compression ratio of over 40:1 is conservatively achieved using
the default lz4 compression.  Autotrimming is enabled to ensure
freed blocks are discarded from the backing cipool vdev file.

Reviewed-by: Tino Reichardt <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#17501
FreeBSD 13.4 is EOL since June 30, 2025.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Closes openzfs#17519
The package ksh93 is replaced by ksh now.
This works for FreeBSD 13 and 14 also.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Tino Reichardt <[email protected]>
Closes openzfs#17523
Testing on CentOS Stream provides several months advance notice of
changes coming to the RHEL kernel.  This should help OpenZFS be
proactive instead of reactive to new RHEL minor versions.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by: Carl George <[email protected]>
ZFS-CI-Type: full
Closes openzfs#16904
Closes openzfs#17526
The latest Debian 11 image includes bullseye-backports as a default
repository in the /etc/apt/sources.list.  However, this repository
has gone end of life which effectively breaks the default install.

We shouldn't need anything in backports so let's unconditionally
remove backports on all Debian builders to resolve the issue.

Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#17569
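The removal can be tried out on a scratch copy of a sources file. This is a minimal sketch, not the command used by the CI scripts: the helper name `drop_backports` and the sample repository lines are assumptions made for illustration.

```sh
#!/bin/sh
# Delete every line mentioning "backports" from an apt sources file.
# The path is a parameter so the sketch can be run against a scratch
# copy instead of the real /etc/apt/sources.list.
drop_backports() {
    sed '/backports/d' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Try it on a temporary file with one regular and one backports entry.
list=$(mktemp)
printf '%s\n' \
    'deb http://deb.debian.org/debian bullseye main' \
    'deb http://deb.debian.org/debian bullseye-backports main' > "$list"
drop_backports "$list"
cat "$list"    # only the "bullseye main" line remains
rm -f "$list"
```

Writing to a temporary file and renaming it avoids relying on `sed -i`, whose syntax differs between GNU and BSD sed.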
This check is currently limited to detecting mismatches that occur within
the same stack frame. It does not detect mismatches across stack frames.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#17352
Adjust the regexes to match the test line with timestamps, then remove
them for the summary. The internal timestamp is still in the full logs.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#17045
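The idea of matching result lines with an optional timestamp and then dropping it can be sketched with sed. The `HH:MM:SS.ss ` prefix, the sample `Test:` lines, and the function name `summarize` are assumptions for illustration; the actual log format and regexes in the summary script may differ.

```sh
#!/bin/sh
# Keep only test-result lines, with or without a leading timestamp, and
# strip the timestamp so the summary looks the same either way.
# The "HH:MM:SS.ss " prefix is an assumed format, not the real one.
summarize() {
    sed -n -E 's/^([0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]+ )?(Test: .*)$/\2/p'
}

printf '%s\n' \
    '12:00:01.50 Test: /tests/a [00:01] [PASS]' \
    'Test: /tests/b [00:02] [FAIL]' \
    'unrelated log noise' | summarize
```

The full log is left untouched, so the timestamps remain available there.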
Chase URL change from the FreeBSD project.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Colin Percival <[email protected]>
Closes openzfs#17617
In the past there have been times when we need to generate new RPMs
for an existing ZFS release.  Typically this happens when a new RHEL
version comes out and the kernel symbols no longer match.  To get
users to auto-update we just bump the patch number.  For example, we
had to create zfs-2.1.13-1 for EL8.8 and zfs-2.1.13-2 for EL8.9.

This commit adds an optional patch level text box to the GitHub
package builder runner.

This commit also uses `hostnamectl` instead of `hostname`, when
available, for F42+ compatibility.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes openzfs#17638
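The "if available" fallback can be sketched as a simple availability check. The function name `pick_hostname_tool` is hypothetical, and the sketch only reports which tool would be chosen rather than changing the hostname.

```sh
#!/bin/sh
# Prefer hostnamectl when it is installed; otherwise fall back to the
# traditional hostname command.
pick_hostname_tool() {
    if command -v hostnamectl >/dev/null 2>&1; then
        echo "hostnamectl"
    else
        echo "hostname"
    fi
}

echo "would use: $(pick_hostname_tool)"
```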
This commit adds Debian 13 (Trixie) to the checked operating
systems. The image needs to be run with UEFI support.

Current Debian version overview:
- Debian 11 (Bullseye) -> "oldoldstable"
- Debian 12 (Bookworm) -> "oldstable"
- Debian 13 (Trixie) -> new "stable"

The CI will be run on Debian 12 and Debian 13 now.
Debian 11 is kept, but won't be used automatically.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Tino Reichardt <[email protected]>
Closes openzfs#17648
We've seen Fedora 42 still setting up after 10 min.  Change the timeout
to 15 min.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes openzfs#17697
GitHub creates a merge commit on top of the real head, so the check
on HEAD will fail regardless.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#17695
Otherwise it might become `if [ == "" ]` which is ill-formed.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#17695
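The quoting problem is easy to reproduce in isolation. This is a minimal sketch with a hypothetical helper `is_empty`, not the CI script itself; it uses `=`, the POSIX spelling of `==`.

```sh
#!/bin/sh
# Report whether a value is empty.
# Quoting "$1" keeps the test well-formed even when the value is empty;
# unquoted, [ $1 = "" ] would expand to [ = "" ], which is ill-formed.
is_empty() {
    if [ "$1" = "" ]; then
        echo "empty"
    else
        echo "set"
    fi
}

is_empty ""        # prints "empty"
is_empty "abc123"  # prints "set"
```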
- Increase qemu-1-setup.sh timeout to 20min since it sometimes
  fails to complete after 15min.

- Timestamp all qemu-1-setup.sh lines to look for hangs.

- Add a 'watchdog' process to print out the top running process every
  30sec to help with debugging.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes openzfs#17714
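Timestamping each output line can be sketched as a small filter. The function name `timestamp_lines` and the timestamp format are assumptions; the actual change to qemu-1-setup.sh may be implemented differently.

```sh
#!/bin/sh
# Prefix every line read from stdin with the current wall-clock time, so
# a long gap between consecutive timestamps shows where a script stalled.
timestamp_lines() {
    while IFS= read -r line; do
        printf '[%s] %s\n' "$(date '+%H:%M:%S')" "$line"
    done
}

printf 'step one\nstep two\n' | timestamp_lines
```

A setup script could pipe its whole output through such a filter, which is what makes hangs visible after the fact.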
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Closes openzfs#17749
When updating a Fedora instance to an experimental kernel make sure
to include the matching versioned perf and bpftool packages.  This
helps ensure there are no unexpected conflicts which would prevent
the new packages from being installed.

Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#17791
The Buildbot CI infrastructure has been fully replaced by GitHub
Actions.  Remove any lingering references from the repository.

Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#17794
Signed-off-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Closes openzfs#17795
Add a -O option to zfs-test.sh to dump debug information on test
timeout.  The debug info includes:

- 30 lines from 'top'
- /proc/<PID>/stack output of process with highest CPU usage
- Last lines strace-ing process with highest CPU usage
- /proc/sysrq-trigger kernel stack traces

All debug information gets dumped to /dev/kmsg (Linux only).

In addition, print out the VM console lines from the "Setup Testing
Machines" step.  We have often seen VMs time out at this step without
knowing why.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes openzfs#17753
Signed-off-by: Shreshth Srivastava <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tino Reichardt <[email protected]>
Closes openzfs#17815
FreeBSD 15.0-ALPHA5 image fails to boot on cloud VMs due to missing
/boot/efi mount point, causing the system to drop to single user mode
where SSH cannot start. Work around this by staying on ALPHA4 and
setting IGNORE_OSVERSION=yes to bypass pkg's kernel version mismatch
prompt during bootstrap. This allows CI to proceed with ALPHA4 until we
have a stable FreeBSD 15.0 image.

Signed-off-by: Ameer Hamza <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Closes openzfs#17846
Linux 6.16 by default fails the build on objtool warnings. We have
known and understood objtool warnings we can't fix without
involving Linux maintainers.

To work around this we introduce an objtool wrapper script which
removes the `--Werror` flag.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Attila Fülöp <[email protected]>
Closes openzfs#17456
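The core of such a wrapper is an argument filter. This is a sketch under stated assumptions, not the wrapper shipped in the tree: the function name `strip_werror` and the sample arguments are illustrative, and a real wrapper would exec the real objtool with the filtered list instead of printing it.

```sh
#!/bin/sh
# Print every argument except --Werror, one per line.
strip_werror() {
    for arg in "$@"; do
        case "$arg" in
            --Werror) ;;                 # drop the flag
            *) printf '%s\n' "$arg" ;;   # keep everything else
        esac
    done
}

# Placeholder arguments, chosen only to show the filtering.
strip_werror --flag-a --Werror file.o
```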
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#17443
Avoid calling dbuf_evict_one() from memory reclaim contexts (e.g. Linux
kswapd, FreeBSD pagedaemon). This prevents deadlock caused by reclaim
threads waiting for the dbuf hash lock in the call sequence:
dbuf_evict_one -> dbuf_destroy -> arc_buf_destroy

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Kaitlin Hoang <[email protected]>
Closes openzfs#17561
03987f7 (openzfs#16069) added a workaround to get the blk-mq hardware
context for older kernels that don't cache it in the struct request.
However, this workaround appears to be incomplete.

In 4.19, the rq data context is optional. If it's not initialised, then
the cached rq->cpu will be -1, and so using it to index into mq_map
causes a crash.

Given that the upstream 4.19 is now in extended LTS and rarely seen,
RHEL8 4.18+ has long carried "modern" blk-mq support, and the cached
hardware context has been available since 5.1, I'm not going to huge
lengths to get queue selection correct for the very few people that are
likely to feel it. To that end, we simply call raw_smp_processor_id() to
get a valid CPU id and use that instead.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes openzfs#17597
Update the META file to reflect compatibility with the 6.16
kernel.

Tested with 6.16.0-0-stable of Alpine Linux edge, see
<https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/87929>.

Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Achill Gilgenast <[email protected]>
Closes openzfs#17578
We only have extremely narrow uses, so move it all into a single
function that does only what we need, with and without d_set_d_op().

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#17621
Three cases were discovered where 'zpool add' would fail to
warn when adding vdevs to a pool with a mismatched replication
level.  These are:

  1. When a pool contains mixed file and disk vdevs.
  2. When a pool contains an active dRAID distributed spare
  3. When a pool contains an active hot spare

The lack of warnings is caused by get_replication() assessing
the current pool configuration as inconsistent and disabling
the mismatched replication check for the new pool configuration
after 'zpool add'.  This change updates get_replication() to
be slightly more tolerant in the non-fatal case.

The zpool_add_010_pos.ksh test case was split into separate
tests: zpool_add_warn_create.ksh, zpool_add_warn_degraded.ksh,
and zpool_add_warn_removal.  These tests were extended to
include coverage for dRAID pools and the three scenarios
described above.

Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#17780
behlendorf and others added 8 commits October 20, 2025 14:56
Provide an interface to retrieve the lowest and highest minimum
allocation size for the normal allocation class.  This can be used
by external consumers of the DMU to estimate potential wasted
capacity when setting the recordsize for an object.

The new "min_alloc" and "max_alloc" keys are added to the pool
configuration and used by default_volblocksize() to warn when
an inefficient block size is requested.  For older kmods which
don't yet include the new keys, fall back to the previous logic.

Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#17758
Update the META file to reflect compatibility with the 6.17
kernel.

Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#17789
In all cases, rely on mktemp itself to make the best decision about
where to place the file or directory. In all cases, that decision will
be $TMPDIR, which we have set globally.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Reviewed-by: Igor Kozhukhov <[email protected]>
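The behavior being relied on can be checked directly. This is a minimal sketch that assumes a mktemp which honors $TMPDIR when called without a template (GNU coreutils and FreeBSD do); the scratch directory stands in for the globally set TMPDIR.

```sh
#!/bin/sh
# With no template argument, mktemp chooses the location itself and
# places the new file or directory under $TMPDIR.
scratch=$(mktemp -d)          # stand-in for the globally set TMPDIR
export TMPDIR="$scratch"

tmpfile=$(mktemp)             # file created under $TMPDIR
tmpdir=$(mktemp -d)           # directory created under $TMPDIR
echo "$tmpfile"
echo "$tmpdir"

rm -rf "$scratch"             # clean up everything the sketch created
```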
The zvol blk-mq codepaths would erroneously send FLUSH and TRIM
commands down the read codepath, rather than write.  This fixes
the issue, and updates the zvol_misc_fua test to verify that
sync writes are actually happening.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Ameer Hamza <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes openzfs#17761
Closes openzfs#17765
ZVOLs don't support all block layer IO request types.  Add a check for
the IO types we do support.  Also, remove references to
io_is_secure_erase() since they are not supported on ZVOLs.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes openzfs#17803
Move a bracket '{' to make checkstyle happy

Signed-off-by: Tony Hutter <[email protected]>
I got a newer shellcheck, and it pointed out that read without a target
variable is not POSIX. The var was removed in c3ef9f7, so I put it
back, and now shellcheck complains about an unused var. That complaint is
actually correct, but the var is necessary, so I've added a suppression
for it, which is probably better.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes openzfs#17626
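The pattern described here looks roughly like the following. It is a sketch, not the code from the commit: the function name `skip_header` and the variable name `header` are made up for illustration. SC2034 is shellcheck's "variable appears unused" warning.

```sh
#!/bin/sh
# Discard the first line of stdin and print the rest.
# POSIX read needs at least one variable name, so a placeholder is passed
# even though its value is never used; the directive below silences the
# resulting "appears unused" warning.
skip_header() {
    # shellcheck disable=SC2034
    read -r header
    cat
}

printf 'HEADER\nrow1\nrow2\n' | skip_header
```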
This changes the basic search algorithm from a single search up and down
the tree to a full depth-first traversal to handle conditions where the
tree matches at a higher level but not a lower level.

Normally higher level blocks always point to matching blocks, but there
are cases where this does not happen:

1. Racing block pointer updates from dbuf_write_ready.

   Before f664f1e (openzfs#8946), both dbuf_write_ready and
   dnode_next_offset held dn_struct_rwlock which protected against
   pointer writes from concurrent syncs.

   This no longer applies, so sync context can, for example, clear or fill all
   L1->L0 BPs before the L2->L1 BP and higher BPs are updated.

   dnode_free_range in particular can reach this case and skip over L1
   blocks that need to be dirtied. Later, sync will panic in
   free_children when trying to clear a non-dirty indirect block.

   This case was found with ztest.

2. txg > 0, non-hole case. This is openzfs#11196.

   Freeing blocks/dnodes breaks the assumption that a match at a higher
   level implies a match at a lower level when filtering txg > 0.

   Whenever some but not all L0 blocks are freed, the parent L1 block is
   rewritten. Its updated L2->L1 BP reflects a newer birth txg.

   Later when searching by txg, if the L1 block matches since the txg is
   newer, it is possible that none of the remaining L1->L0 BPs match if
   none have been updated.

   The same behavior is possible with dnode search at L0.

   This is reachable from dsl_destroy_head for synchronous freeing.
   When this happens open context fails to free objects leaving sync
   context stuck freeing potentially many objects.

   This is also reachable from traverse_pool for extreme rewind where it
   is theoretically possible that datasets not dirtied after txg are
   skipped if the MOS has high enough indirection to trigger this case.

In both of these cases, without backtracking the search ends prematurely,
since an ESRCH result implies no more matches in the entire object.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Akash B <[email protected]>
Signed-off-by: Robert Evans <[email protected]>
Closes openzfs#16025
Closes openzfs#11196
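The effect of backtracking can be shown with a toy two-level model. This is not the dnode_next_offset code and makes no claim about its structure: the function `find_first`, the modes `single` and `backtrack`, and the `<l1_flag>:<l0_flags>` encoding are all invented for illustration. A flag of 1 means "matches the search".

```sh
#!/bin/sh
# Each argument after the mode describes one L1 block as
# "<l1_flag>:<l0_flags>". Prints "<l1_index>.<l0_index>" for the first
# matching L0 entry, or ESRCH when the search gives up.
find_first() {
    mode=$1; shift
    l1=0
    for blk in "$@"; do
        if [ "${blk%%:*}" = 1 ]; then
            rest=${blk#*:}
            l0=0
            while [ -n "$rest" ]; do
                # Check the first remaining L0 flag.
                case "$rest" in
                    1*) echo "$l1.$l0"; return 0 ;;
                esac
                rest=${rest#?}
                l0=$((l0 + 1))
            done
            # The L1 entry matched but none of its children did.
            if [ "$mode" = single ]; then
                echo "ESRCH"; return 1    # no backtracking: give up here
            fi
            # Backtracking: fall through and try the next L1 block.
        fi
        l1=$((l1 + 1))
    done
    echo "ESRCH"; return 1
}

find_first single    0:000 1:000 1:010 || true   # prints ESRCH
find_first backtrack 0:000 1:000 1:010           # prints 2.1
```

In the second L1 block the higher level claims a match that no child satisfies, so the single-pass search stops early while the backtracking search goes on to find the real match in the third block.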
@amotin added the "Status: Code Review Needed" label Oct 22, 2025

amotin commented Oct 22, 2025

It seems to miss something on CentOS:

    LD [M]  /tmp/zfs-build-zfs-9qvGK3GE/BUILD/zfs-2.2.9/module/spl.o
  /bin/sh: line 1: ../scripts/objtool-wrapper: No such file or directory
  make[7]: *** [scripts/Makefile.build:425: /tmp/zfs-build-zfs-9qvGK3GE/BUILD/zfs-2.2.9/module/spl.o] Error 127
  make[7]: *** Deleting file '/tmp/zfs-build-zfs-9qvGK3GE/BUILD/zfs-2.2.9/module/spl.o'

@AttilaFueloep

Most likely commit 8de8e0d ( #17541 ).

@tonyhutter

@amotin @AttilaFueloep thanks for the heads-up, I'll pull that in

AttilaFueloep and others added 4 commits October 22, 2025 10:34
Older kernel versions run make outside of the build directory. This
works since all paths are absolute. Relative paths will fail in such
a scenario.

Use an absolute path to the objtool wrapper as well, since the
relative path breaks the build on older kernels.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Attila Fülöp <[email protected]>
Closes openzfs#17541
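Why an absolute path is robust against a changed working directory can be illustrated in shell. The actual fix lives in the build system; the helper `abs_path` below is a hypothetical sketch of resolving a relative path once, up front.

```sh
#!/bin/sh
# Turn a possibly relative path into an absolute one by resolving its
# directory with cd and pwd. The directory must exist; the file need not.
abs_path() {
    dir=$(cd "$(dirname "$1")" && pwd) || return 1
    printf '%s/%s\n' "$dir" "$(basename "$1")"
}

abs_path ./objtool-wrapper
```

Once resolved, the result stays valid no matter which directory make is later run from, whereas `../scripts/objtool-wrapper` only resolves from one particular directory.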
6cf17f6 (openzfs#17456) introduced a change to `configure.ac` which
breaks the patching done in the Debian packages DKMS source
installation phase. This results in a failed module build.

Adapt the awk script doing the patching to handle the added
`AC_CONFIG_FILE` entry.

Reviewed-by: Brian Behlendorf <[email protected]>
Tested-by: Shengqi Chen <[email protected]>
Signed-off-by: Attila Fülöp <[email protected]>
Closes openzfs#17633
Closes openzfs#17646
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Attila Fülöp <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes openzfs#17633
Closes openzfs#17646
META file and changelog updated.

Signed-off-by: Tony Hutter <[email protected]>

amotin commented Oct 22, 2025

Just for a note, we have this PR open for 2.2: #17583 . I wonder if it would be cleaner to include the dependencies (if possible) rather than rewrite, though I don't have significant interest in this branch.
