[rant] every time I give it a chance, btrfs lets me down

beleza pura@lemmy.eco.br · 1 month ago

Atemu@lemmy.ml · edit-2 29 days ago

It only works if the hardware doesn’t lie about write barriers. If the drive says it has written some sectors, btrfs assumes that reading any of those sectors will return the written data rather than the data that was there before. What matters here isn’t that the data stays intact forever but that the writes land in order. Once a metadata generation has been written to disk, btrfs waits on the write barrier and only afterwards updates the superblock (the final metadata “root”).

If the system loses power while the metadata generation is being written, all is well because the superblock still points at the old generation as the write barrier hasn’t passed yet. On the next boot, btrfs will simply continue with the previous generation referenced in the superblock which is fully committed.
If the hardware lied about the write barrier before the superblock update though (e.g. for performance reasons) and has only written, say, half of the sectors containing the metadata generation but did write the superblock, that would be an inconsistent state which btrfs cannot trivially recover from.
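
To make the ordering argument concrete, here is a toy simulation in Python. It is only a sketch of the idea described above, not how btrfs is implemented: `SimulatedDisk`, `write_generation`, `write_superblock`, `mount` and `crash_during_commit` are all made-up names, and the "disk" is just two dictionaries (a volatile cache and the persistent media).

```python
class SimulatedDisk:
    """Toy drive: writes sit in a volatile cache until a barrier flushes them."""

    def __init__(self):
        self.media = {}             # survives power loss
        self.cache = {}             # lost on power loss
        self.honest_barrier = True  # False = acknowledge flushes without doing them

    def write(self, sector, data):
        self.cache[sector] = data

    def barrier(self):
        # An honest drive persists everything written so far before returning.
        # A lying drive returns immediately and leaves the data in the cache.
        if self.honest_barrier:
            self.media.update(self.cache)
            self.cache.clear()

    def power_loss(self, surviving=()):
        # Cached sectors listed in `surviving` happened to reach the media
        # (in whatever order the drive chose); everything else is gone.
        for sector in surviving:
            if sector in self.cache:
                self.media[sector] = self.cache[sector]
        self.cache.clear()


def write_generation(disk, generation, blocks):
    for i, data in enumerate(blocks):
        disk.write(("meta", generation, i), data)
    disk.barrier()  # wait until the drive claims the metadata is durable


def write_superblock(disk, generation, block_count):
    disk.write("super", {"generation": generation, "blocks": block_count})


def mount(disk):
    """Return the generation the superblock points at, or None if it is incomplete."""
    superblock = disk.media.get("super")
    if superblock is None:
        return None
    for i in range(superblock["blocks"]):
        if ("meta", superblock["generation"], i) not in disk.media:
            return None  # superblock references metadata that never hit the media
    return superblock["generation"]


def crash_during_commit(honest):
    disk = SimulatedDisk()
    # Generation 1 is committed cleanly.
    write_generation(disk, 1, ["a", "b"])
    write_superblock(disk, 1, 2)
    disk.barrier()

    # Generation 2: power is cut right after the superblock write was issued.
    disk.honest_barrier = honest
    write_generation(disk, 2, ["c", "d"])
    write_superblock(disk, 2, 2)
    # The drive managed to persist the superblock and one metadata block.
    disk.power_loss(surviving=["super", ("meta", 2, 0)])
    return mount(disk)


if __name__ == "__main__":
    print("honest barrier:", crash_during_commit(honest=True))
    print("lying barrier: ", crash_during_commit(honest=False))
```

With an honest barrier the whole of generation 2 is already on the media by the time the superblock can be written, so the mount succeeds whichever superblock survived. With a lying barrier the new superblock can reach the media ahead of the metadata it points to, and the mount finds a generation that is only half there.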

If that promise is broken, there’s nothing btrfs (or ZFS for that matter) can do. Software cannot reliably protect against this failure mode.
You could mitigate it by waiting some amount of time, which would reduce (but not eliminate) the risk that the data before the barrier hasn’t been written yet. But that would also make every commit take that much longer, which would kill performance.

It can reliably protect against power loss (bugs notwithstanding) but only if the hardware doesn’t lie about some basic guarantees.

FuckBigTech347@lemmygrad.ml · 28 days ago

I had a drive where data would get silently corrupted after some time no matter what filesystem was on it. Machine’s RAM tested fine. Turned out the write cache on the drive was bad! I was able to “fix” it by disabling the cache via hdparm until I was able to replace that drive.
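
For anyone wanting to try the same workaround, a rough Python sketch of the hdparm call might look like this. It assumes the `-W` option (hdparm's write-caching switch) is what was used, and `/dev/sdX` is a placeholder for the real device; the helper names are made up for illustration.

```python
import subprocess


def write_cache_command(device, enable):
    """Build the hdparm invocation that switches a drive's write cache on or off."""
    return ["hdparm", "-W", "1" if enable else "0", device]


def set_write_cache(device, enable):
    # Needs root. The setting may not survive a power cycle on every drive,
    # so it may have to be reapplied at boot.
    subprocess.run(write_cache_command(device, enable), check=True)


if __name__ == "__main__":
    # Only prints the command; call set_write_cache() to actually run it.
    print(" ".join(write_cache_command("/dev/sdX", enable=False)))
```

Expect writes to get noticeably slower with the cache off, which is fine as a stopgap until the drive is replaced.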