• 0 Posts
  • 37 Comments
Joined 1 year ago
Cake day: June 15th, 2023




  • It’s not that clear-cut a problem. There seem to be two elements: the kernel driver had a memory safety bug, and a definitions file was deployed incorrectly, triggering the bug. The kernel driver definitely deserves a lot of scrutiny, and static analysis should have told them this bug existed. The live updates are a bit different since this is a real-time response system. If malware starts actively exploiting a software vulnerability, they can’t wait for distribution maintainers to package their mitigation - it has to be deployed ASAP. They certainly should roll out definitions progressively and monitor for anything anomalous, but it has to be quick or the malware could beat them to it.

    This is more a code safety issue than a CI/CD strategy issue. The bug was in the driver all along, but it had never been triggered before, so it passed the tests and got rolled out to everyone. Critical code like this ought to be written in a memory-safe language like Rust.
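
    To make the "roll out progressively and monitor" point concrete, here is a minimal sketch of a staged rollout gate. Everything in it is an assumption for illustration: the function names, the stage percentages and the crash-rate threshold are hypothetical and say nothing about how Crowdstrike actually ships channel files.

```python
import hashlib

def cohort(host_id, buckets=100):
    """Map a host ID to a stable bucket in [0, buckets)."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets

def in_stage(host_id, percent):
    """True if this host should receive the update at the given rollout percentage."""
    return cohort(host_id) < percent

def next_percent(current, crash_rate, max_crash_rate=0.001, stages=(1, 10, 50, 100)):
    """Advance to the next stage, or return 0 (halt) if crashes look anomalous."""
    if crash_rate > max_crash_rate:
        return 0  # stop the rollout; hypothetical threshold
    for stage in stages:
        if stage > current:
            return stage
    return current  # already at the final stage
```

    The stages can be minutes apart rather than days, so this kind of gate does not have to cost much response time.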



  • This doesn’t really answer my question but Crowdstrike do explain a bit here: https://www.crowdstrike.com/blog/technical-details-on-todays-outage/

    These channel files are configuration for the driver and are pushed several times a day. It seems the driver can take a page fault if certain conditions are met. A mistake in a config file triggered this condition and put a lot of machines into a BSOD bootloop.

    I think it makes sense that this was a preexisting bug in the driver which was triggered by an erroneous config. What I still don’t know is if these channel updates have a staged deployment (presumably driver updates do), and what fraction of machines that got the bad update actually had a BSOD.

    Anyway, they should rewrite it in Rust.







  • Is there any reason to keep the existing set-up? If it’s just one drive, you could replace it with another and install Alma or something fresh. Then you could copy over whatever config the old system had to get up and running again. You could swap to the old drive if you needed to revert. If you have a spare machine, you could stand up the fresh setup side-by-side with the old one before swapping over.






  • Ah, that’s the misunderstanding. The original comment was talking about “watching something on another pc”. Like playing a video from a desktop PC on a laptop in another room. So it’s the samba server we want to prevent from sleeping, not the client. Yes it’d be nice to have a 24/7 media server set up, but for the simple case of sharing a file from one PC to another, it’d be nice for the server not to sleep in the middle of it by default.


  • For sure, I don’t know the internals of Samba, but surely the server knows that it’s serving a file no matter how the client accesses it. I don’t think a few dbus messages would cause issues.

    I have my own service that looks at the network traffic via /proc and a few other things. That sends the system to sleep itself if everything looks truly idle.
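
    As a rough illustration of the /proc approach (an assumed sketch, not the actual service described above), the byte counters in /proc/net/dev can be sampled twice and compared. The function names, the ignored loopback interface and the 10 kB/s threshold are all hypothetical.

```python
def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from the contents of /proc/net/dev."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes; field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters

def looks_idle(before, after, seconds, max_bytes_per_s=10_000, ignore=("lo",)):
    """True if no interface moved more than max_bytes_per_s between two samples."""
    for iface, (rx1, tx1) in after.items():
        if iface in ignore or iface not in before:
            continue
        rx0, tx0 = before[iface]
        rate = ((rx1 - rx0) + (tx1 - tx0)) / seconds
        if rate > max_bytes_per_s:
            return False
    return True
```

    A real service would read /proc/net/dev on a timer and combine this with other signals before suspending.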

    I do think it would be nice for a file server like samba to inhibit sleep using the standard interface for it. But yeah, I appreciate there are complications, like video playback is presumably pulling a small extent of a file at a time, so there would have to be some kind of timer before releasing the inhibition or the system would sleep between transfers.

    EDIT: I just took a look; with loglevel set to 3 for smb and smb2 I see log messages like:

    smbd_smb2_read: fnum 1712966762, file my_video.mkv, length=262144 offset=82366464 read=262144
    

    These occur at most 10 seconds apart when playing a video over a share from another host. I don’t see why the smbd daemon couldn’t inhibit sleep until smbd_smb2_read hasn’t run for a minute or so. You could have a script that monitors that log output and does this externally, but it’d be nice to have it built in.
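
    The external script could be sketched roughly like this. It only covers the decision logic: matching the smbd_smb2_read line shown above and tracking how long ago the last read was. The class name, the regex and the 60-second timeout are assumptions, and the log format is taken from the single sample line, so it may differ between Samba versions.

```python
import re

# Pattern based on the one sample log line quoted above (assumed stable).
READ_RE = re.compile(
    r"smbd_smb2_read: fnum \d+, file (?P<file>.+), "
    r"length=(?P<length>\d+) offset=(?P<offset>\d+) read=(?P<read>\d+)"
)

class ReadActivityTracker:
    """Decide whether sleep should be inhibited based on recent smbd reads."""

    def __init__(self, idle_timeout=60.0):
        self.idle_timeout = idle_timeout  # seconds without reads before releasing
        self.last_read = None

    def feed(self, line, now):
        """Record a log line; return True if it was a read event."""
        if READ_RE.search(line) is None:
            return False
        self.last_read = now
        return True

    def should_inhibit(self, now):
        """True while the last read is more recent than idle_timeout."""
        return self.last_read is not None and (now - self.last_read) < self.idle_timeout
```

    A wrapper would feed it lines from the smbd log with timestamps from time.monotonic() and hold an inhibitor while should_inhibit() is true, for example by keeping a systemd-inhibit --what=sleep process running.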


  • Not every program or service on your system

    Of course not, but plenty do when running a task where the user is unlikely to make inputs and also doesn’t want the machine to sleep. Firefox can call org.gnome.SessionManager.Inhibit over dbus with the “video-playing” description, same for VLC. Transmission can call that interface while a transfer is in progress (with a config toggle). It seems a pretty reasonable default for samba to do the same while a long-running file transfer is ongoing.

    [Samba] doesn’t copy your files for you.

    Sure, but it has to know when a transfer is running. It would be nice to have the option to inhibit sleep if the transfer runs for a significant amount of time.
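
    For reference, the org.gnome.SessionManager.Inhibit call mentioned above takes an application ID, a toplevel window ID, a reason string and a flags bitmask (4 means inhibit suspend). The sketch below only builds the equivalent gdbus command line; the helper name and the "samba" app ID are hypothetical, and this is not something Samba does today.

```python
# Flag values from the GNOME SessionManager D-Bus interface.
INHIBIT_LOGOUT = 1
INHIBIT_SWITCH_USER = 2
INHIBIT_SUSPEND = 4
INHIBIT_IDLE = 8

def build_inhibit_command(app_id, reason, flags=INHIBIT_SUSPEND):
    """Return the gdbus argv for Inhibit(app_id, toplevel_xid=0, reason, flags)."""
    return [
        "gdbus", "call", "--session",
        "--dest", "org.gnome.SessionManager",
        "--object-path", "/org/gnome/SessionManager",
        "--method", "org.gnome.SessionManager.Inhibit",
        app_id, "0", reason, str(flags),
    ]
```

    As far as I know the inhibition is dropped once the caller disconnects from the bus, so a one-shot gdbus call would not hold it for long; a real implementation would keep its own connection open and call Uninhibit with the returned cookie when the transfer goes idle.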