ZFS silent corruption bug found: replaces chunks inside copied files with zeroes

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • zfs

    OpenZFS on Linux and FreeBSD

  • I will soon prepare some scripts to do that, but I first want to confirm the core ideas and design something that I will only have to do once; the discussion on https://github.com/openzfs/zfs/issues/15526#issuecomment-182... is still ongoing.

    If you want more detail about my proposed approach, please check the discussion on https://old.reddit.com/r/zfs/comments/182x5wy/with_old_backu...

    I have 18 months of backups, and could go back further if needed, but accessing and processing each backup will take time. I don't want to do that multiple times.

    Anything you can mount, on any filesystem that keeps this metadata, should be usable as an input: anything from NTFS to ZFS snapshots. It may even be possible to use ZIP files (which keep dates and times) to feed a metadata sqlite database.

    Comparing the metadata DB to the actual ZFS filesystem would give you a list of suspicious files, and the most recent backup you could use to restore them.
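
    For illustration, here is a minimal shell sketch of that workflow. The mount point, database name, and schema are placeholders, and the pipe-separated import will break on paths containing `|` or newlines:

    ```sh
    # Index one mounted backup: relative path, size, mtime, content hash.
    sqlite3 meta.db 'CREATE TABLE IF NOT EXISTS backup(path TEXT, size INT, mtime REAL, sha256 TEXT);'
    cd /mnt/backup-2023-11 &&
    find . -type f -printf '%P|%s|%T@|' -exec sh -c 'sha256sum "$1" | cut -d" " -f1' _ {} \; |
      sqlite3 meta.db '.import /dev/stdin backup'   # default .separator is the pipe

    # Index the live filesystem the same way into a table "live"; files whose
    # mtime is unchanged but whose content hash differs are the suspects:
    sqlite3 meta.db 'SELECT b.path FROM backup b JOIN live l USING(path)
                     WHERE l.mtime = b.mtime AND l.sha256 <> b.sha256;'
    ```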

    Alternatively, the backups could be deduplicated while the metadata is being gathered, keeping local copies of all the files, but that adds complexity and storage requirements.

    If you are in a rush, write a script along those lines, but given that the null-byte detection approach now seems flawed, you may have to rewrite it as we learn more details.
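
    For example, a crude version of that detection (the dataset path is a placeholder). It only inspects the leading 4 KiB, so it both misses corruption deeper in a file and flags files that legitimately start with zeros, hence the false positives:

    ```sh
    # Flag files whose first 4 KiB is entirely zero bytes.
    find /tank/data -type f -size +4k -print0 |
    while IFS= read -r -d '' f; do
      if cmp -s -n 4096 "$f" /dev/zero; then
        printf 'suspicious: %s\n' "$f"
      fi
    done
    ```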

    Also, there's no real fix for this bug yet that doesn't introduce other problems. While we're still learning the ins and outs of this >18-year-old bug, I recommend staying on a 2.1 version of ZFS with zfs_dmu_offset_next_sync=0, and betting on the low probability of corruption that allowed this bug to persist for so long.
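
    On Linux, that tunable can be flipped at runtime through the module parameter and persisted with a modprobe option, for example:

    ```sh
    # Takes effect immediately on a running system:
    echo 0 | sudo tee /sys/module/zfs/parameters/zfs_dmu_offset_next_sync

    # Persist across reboots and module reloads:
    echo 'options zfs zfs_dmu_offset_next_sync=0' | sudo tee -a /etc/modprobe.d/zfs.conf
    ```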

    Of course, keep your cold backups (don't delete them!); they can't accumulate corruption if you don't access them.

  • httm

    Interactive, file-level Time Machine-like tool for ZFS/btrfs/nilfs2 (and even actual Time Machine backups!)

  • > It's worth noting that copy_file_range is used by a lot of things.

    Yes, but the trigger feature, block cloning, only landed in the latest 2.2 release. If you immediately hopped on 2.2 and used a system with lots of copy_file_range and FICLONE use, then yes, you may have a problem (like, as you note, on Gentoo, where this problem surfaced).
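
    For context, an explicit reflink copy is one common way FICLONE gets exercised (file names here are hypothetical):

    ```sh
    # Request a clone (FICLONE ioctl) rather than a byte-for-byte copy;
    # newer coreutils cp may also use copy_file_range on its own:
    cp --reflink=always big.dat big.copy
    ```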

    Most people were just hopping on the bandwagon. My distro ships 2.1.5, so I have a 6-month wait until this feature lands; I was just building copy_file_range support into my ZFS apps right before news of this bug hit. [0]

    > There are other things required to trigger the bug that are a lot less common though.

    Exactly. My guess is the incidence of this will be exceedingly rare for the common user/small NAS user/etc. I've run a corruption detector [0], and what I've found mostly indicates false positives. Some are build-artifact fingerprints, which I don't care about and which were deleted with the next build. For the ones with an extant copy on another system, I confirmed the files were a diff match with the origin using `rsync -rincv`, and with what's in snapshots using `httm --map-aliases`. So far, no positive matches (a sketch of that workflow follows the link below).

    [0]: https://github.com/kimono-koans/httm
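
    A sketch of that verification, with hypothetical hosts and paths: `rsync -rincv` does a recursive, checksum-based dry run that itemizes any content differences, and a bare `httm <file>` lists the snapshot versions of a file:

    ```sh
    # Compare suspect data against a known-good copy on another system;
    # -n (dry run) and -c (checksum) make this a read-only content diff:
    rsync -rincv otherhost:/data/ /tank/data/

    # Inspect a suspect file's snapshot history (sizes and mtimes):
    httm /tank/data/suspect.bin
    ```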

  • bmap-tools

    BMAP Tools (discontinued)

  • (>_<) Oh man, I knew about [0] when I posted (which is why I said it just reduces the chance of hitting the bug (by a lot)). But after spending all Saturday JST on it, I went to bed before [1] was posted.

    Skimming through #6958, though, it seems like it's the lesser of two evils compared to #15526... I think? It's less obvious (to me) what the impact of #6958 is: is it silent, undetectable corruption of your precious data, potentially over years, or is it more likely to cause a crash or runtime error?

    Reports like https://github.com/intel/bmap-tools/issues/65 make it seem more like the latter.

    I have to read more, but since the zfs_dmu_offset_next_sync setting was disabled by default until recently, I still suspect (but yeah, don't know for sure) that disabling it is the safest thing we can currently do on unmodified ZFS systems.

  • > If the result is 0 for both bcloneused and bclonesaved then it's safe to say that you don't have silent corruption. [0]
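
    Those are pool-level properties introduced alongside block cloning in 2.2, so the check is a one-liner (pool name hypothetical):

    ```sh
    # Both values stay at 0 if block cloning has never been used on the pool:
    zpool get bcloneused,bclonesaved tank
    ```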

    People are using [`reproducer.sh`](https://gist.github.com/tonyhutter/d69f305508ae3b7ff6e9263b2...) to see if they can reproduce the bug intentionally: [1]

    [This script](https://github.com/0x0177b11f/zfs-issue-15526-check-file) tries to find potentially corrupted files (with some risk of false positives) by searching for zero-byte blocks: [2]

    [0]: https://github.com/openzfs/zfs/issues/15526#issuecomment-181...

    [1]: https://github.com/openzfs/zfs/issues/15526#issuecomment-182...

    [2]: https://github.com/openzfs/zfs/issues/15526#issuecomment-182...
