The bug that was very occasionally corrupting data on file copies in OpenZFS 2.2.0 has been identified and fixed, and there's a fix for the previous OpenZFS release too.
The OpenZFS development team have put out not one but two new releases of the open-source cross-platform filesystem for Linux and FreeBSD. Version 2.2.2 fixes the problem that showed up in the latest version, which is included in FreeBSD 14 as well as several Linux distros, including Ubuntu 23.10.
There's also a new release in the previous OpenZFS series: version 2.1.14, which applies to FreeBSD back to version 12.
This was necessary because, as we reported a week ago, although it was OpenZFS 2.2.0 that brought the issue to light, it didn't actually cause the problem.
It merely exposed an underlying bug which had been around for years: OpenZFS 2.2.0's new, faster copy function simply made the existing issue much more likely to happen.
The FreeBSD project has published an errata notice, and made fixes available for FreeBSD 12, 13 and 14.
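For readers who want to check whether a particular copy came through intact, a generic integrity check is to compare a cryptographic hash of the source file with that of its copy. The following is a minimal sketch under stated assumptions, not the reproducer used by the OpenZFS developers: the function names `sha256_of` and `copy_matches_source` are made up for illustration, and a matching hash only shows that the two files read back identically at the time of the check.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def copy_matches_source(source, copy):
    """Hypothetical helper: True if the copy is byte-for-byte identical."""
    # Cheap check first: files of different sizes cannot be identical.
    if Path(source).stat().st_size != Path(copy).stat().st_size:
        return False
    return sha256_of(source) == sha256_of(copy)
```

Calling `copy_matches_source("original.img", "duplicate.img")` (both file names are placeholders) returns `False` if any byte differs, including when the two files are the same length.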
The investigation that's been going on since then has revealed more.
The bug was also confirmed in Illumos, the open-source fork of OpenSolaris which has continued development since Oracle killed off the open-source project in 2010.
It looks like Red Hat backported this functionality from Coreutils 9.x to 8.x, and it's been identified in CentOS Stream 9 as well as in the OpenELA source code.
I'd link to the corresponding RHEL code, but sadly they no longer publish it.
RHEL doesn't include OpenZFS, so this data-loss issue will not affect it.
RHEL doesn't even include Btrfs, but Oracle Linux does, although that's no cause for concern here: Btrfs itself is immune to the bug.
What this illustrates is the problem with trying to pin down affected versions.
As we described back in June, Red Hat puts a lot of engineering time and effort into backporting features from newer kernels into its very-long-term supported enterprise kernels.
These optimizations are perfectly safe on the Big Purple Hat's own distro, and indeed its RHELatives such as Oracle and Alma and so on.
Such changes can get picked up by other distros, or even by people hand-building complex bespoke installations.
There's a newer overview of the issue on GitHub, but the investigation into when the bug first appeared is still underway, as the comments there show.
Although bug fix #15571 in these two new OpenZFS releases does resolve the issue, another, newer attempt to fix the issue in a cleaner way is also under investigation as bug fix #15615.
ZFS is a complex filesystem, and this is a complex bug that may have remained hidden for 17 years.
If there is a simpler, cleaner way to fix the issue, that would be a good thing.
This Cyber News was published on go.theregister.com. Publication date: Mon, 04 Dec 2023 16:43:39 +0000