Hi,
during evaluation of btrfs I noticed that the following scenario causes a broken file system:
- create a RAID1 file system on two drives, disk1 and disk2
- write data
- remove disk2, replace it with a new drive using btrfs replace, and wait for completion
- remove disk1
- mounting fails
Running btrfs balance after replacing disk2 (and before removing disk1) prevents the issue. But this is not (clearly) documented in man btrfs-replace, which states "On a live filesystem, duplicate the data to the target device which is currently stored on the source device."
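For reference, the workaround can be wrapped in a small shell helper. This is only a sketch: the function name rebalance_to_raid1 is made up for illustration, and the balance filters are the ones from the commented-out line in the reproduction script further down. It needs root and a mounted file system.

```shell
# Hypothetical helper (name is not part of btrfs-progs): convert any chunks
# that are not RAID1 back to RAID1 after a degraded replace.
# The "soft" modifier leaves chunks that already have the target profile
# untouched, so only the leftover chunks are rewritten.
rebalance_to_raid1() {
    # $1: mount point of the btrfs file system
    btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft "$1"
}
```

In the reproduction script this would be called as rebalance_to_raid1 mnt right after btrfs replace start -B has returned.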
Tested on Debian Trixie running Linux Kernel 6.12.63 and btrfs-progs 6.14-1.
Is this the intended behavior? If so, the documentation should be updated (I can provide a pull request). If not, this should be fixed.
The following script reproduces this issue (using loop devices, but this was also reproduced on real hardware):
set -eu
cd /tmp
mkdir -p mnt
# Create two 10GiB disks
rm -f disk0; truncate -s 10G disk0
rm -f disk1; truncate -s 10G disk1
# Btrfs needs to see both devices when mounting
losetup /dev/loop0 disk0
losetup /dev/loop1 disk1
# Initialize btrfs RAID1 and create a file with random data
mkfs.btrfs --data raid1 --metadata raid1 /dev/loop0 /dev/loop1
mount /dev/loop0 mnt
dd if=/dev/urandom bs=1G count=1 > mnt/data
sha512sum mnt/data > mnt/data.sha512sum
umount mnt
# Destroy data on second disk
rm -f disk1; truncate -s 10G disk1
losetup -d /dev/loop1; losetup /dev/loop1 disk1
# Not necessary, but just to make clear it's not a cache issue
echo 3 > /proc/sys/vm/drop_caches
# Replace second disk (-B waits until replace is complete)
mount -o degraded /dev/loop0 mnt
btrfs replace start -B 2 /dev/loop1 mnt
# btrfs filesystem usage -T mnt
# btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft mnt
# echo write > mnt/test
umount mnt
# Destroy data on first disk
rm -f disk0; truncate -s 10G disk0
losetup -d /dev/loop0; losetup /dev/loop0 disk0
# Not necessary
echo 3 > /proc/sys/vm/drop_caches
# Attempt to mount
mount -o degraded /dev/loop1 mnt
sha512sum -c mnt/data.sha512sum
# Cleanup
umount mnt
losetup -d /dev/loop0; rm disk0
losetup -d /dev/loop1; rm disk1
The mount fails with:
mount: /tmp/mnt: can't read superblock on /dev/loop1.
dmesg(1) may have more information after failed mount system call.
BTRFS info (device loop1): first mount of filesystem 5f1f3583-8c44-4671-9831-02bd8ff743e1
BTRFS info (device loop1): using crc32c (crc32c-intel) checksum algorithm
BTRFS warning (device loop1): devid 1 uuid 2b43640d-819f-42d2-9559-e8e284b84be1 is missing
BTRFS error (device loop1): failed to read chunk root
BTRFS error (device loop1): open_ctree failed: -5
Writing data before the unmount (the commented-out echo write > mnt/test line in the script) changes the result:
mount: /tmp/mnt: wrong fs type, bad option, bad superblock on /dev/loop1, missing codepage or helper program, or other error.
dmesg(1) may have more information after failed mount system call.
BTRFS warning (device loop1): devid 1 uuid b015b488-4a96-4c5a-81f2-cbec50dfe6ca is missing
BTRFS warning (device loop1): chunk 1372585984 missing 1 devices, max tolerance is 0 for writable mount
BTRFS warning (device loop1): writable mount is not allowed due to too many missing devices
BTRFS error (device loop1): open_ctree failed: -22
Mounting with -o degraded,ro works, but an attempt to replace the missing disk then fails with ERROR: ioctl(DEV_REPLACE_START) failed on "mnt": Read-only file system.
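The two open_ctree codes in the kernel logs above are negated Linux errno values: -5 is EIO and -22 is EINVAL. A minimal sketch for pulling them out of dmesg output follows; the function name decode_open_ctree is hypothetical and only these two codes are mapped.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the errno
# name for every "open_ctree failed: -N" message found.
decode_open_ctree() {
    # Extract the numeric code, one per matching line.
    sed -n 's/.*open_ctree failed: -\([0-9][0-9]*\).*/\1/p' |
    while read -r code; do
        case "$code" in
            5)  echo "EIO" ;;      # I/O error (chunk root unreadable)
            22) echo "EINVAL" ;;   # invalid argument (writable mount refused)
            *)  echo "errno $code" ;;
        esac
    done
}
```

Usage would be something like dmesg | decode_open_ctree after the failed mount.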
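Running btrfs filesystem usage -T before the unmount shows the problem: in addition to the RAID1 chunks, there are single-profile data, metadata, and system chunks that exist only on /dev/loop0: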
Overall:
    Device size:                  20.00GiB
    Device allocated:              3.80GiB
    Device unallocated:           16.20GiB
    Device missing:                  0.00B
    Device slack:                    0.00B
    Used:                          2.00GiB
    Free (estimated):             11.47GiB      (min: 8.77GiB)
    Free (statfs, df):             7.96GiB
    Data ratio:                       1.50
    Metadata ratio:                   1.50
    Global reserve:                5.50MiB      (used: 0.00B)
    Multiple profiles:                 yes      (data, metadata, system)

              Data    Data    Metadata  Metadata  System   System
Id Path       single  RAID1   single    RAID1     single   RAID1   Unallocated Total    Slack
-- ---------- ------- ------- --------- --------- -------- ------- ----------- -------- -----
 1 /dev/loop0 1.00GiB 1.00GiB 256.00MiB 256.00MiB 32.00MiB 8.00MiB     7.46GiB 10.00GiB     -
 2 /dev/loop1       - 1.00GiB         - 256.00MiB        - 8.00MiB     8.74GiB 10.00GiB     -
-- ---------- ------- ------- --------- --------- -------- ------- ----------- -------- -----
   Total      1.00GiB 1.00GiB 256.00MiB 256.00MiB 32.00MiB 8.00MiB    16.20GiB 20.00GiB 0.00B
   Used         0.00B 1.00GiB     0.00B   1.14MiB 16.00KiB   0.00B
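A script could detect this state before a disk is pulled by looking at the "Multiple profiles" line of the usage output. The following is a rough sketch only: the function name has_multiple_profiles is invented here, and it assumes the line is printed in the form shown in the output above.

```shell
# Hypothetical helper: read `btrfs filesystem usage` output on stdin and
# succeed (exit status 0) if it reports more than one block group profile.
has_multiple_profiles() {
    grep -Eq '^[[:space:]]*Multiple profiles:[[:space:]]+yes'
}
```

For example, btrfs filesystem usage mnt | has_multiple_profiles && echo "mixed profiles, balance before removing a device" would print the warning for the file system shown above.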
Best,
Simon