• USB enclosures and Linux

    I’m beginning to see why everyone says not to use USB enclosures with Linux. It’s a cursed product category. I’m finding it very hard to find a simple two-bay JBOD enclosure that might work reliably. OTOH I’ve been using 2.5″ USB drives with a server successfully for a year+ now, so the general concept should work. And I’ve been using a new drive in a SABRENT 3.5″ single disk enclosure successfully. But I’m having a hard time finding 3.5″ enclosures that take more than one disk.

    Update 1: see at bottom

    Update 2: see this video which talks favorably about newer USB 3.2 enclosures like the Mediasonic Probox.

    Update 3: see this Reddit post I made.

    Horror stories abound of buggy JMicron chipsets. It’s never quite clear if the reports are real, they’re mostly ranty Amazon reviews. But definitely bad vibes.

    I know for sure there’s one big problem with the USB enclosure product category: power restoration. Most of these enclosures will not turn back on when power is restored until you press a button on the box. WTF? Why would you build hardware like that?

    Also USB enclosures typically have a low power / idle mode when not used for a while. I’m OK with it taking a few seconds for the disks to spin up (although it’d be nice to have control over that). But some of them seem to never come back online or start throwing errors.

    None of these devices I’ve bought recently work out of the box with smartctl. They aren’t identified correctly. smartctl -d sat works around the problem, sort of. There’s also some way to register the device in a database to teach smartctl about it. This identification thing seems to be a problem with the UAS driver in particular, or maybe it’s the 2022 version of smartctl Debian is installing.

    SABRENT EC-KSL3, it works?

    I’ve had decent luck for a few days with a SABRENT EC-KSL3, a $30 single disk enclosure. I don’t love the physical toolless design but the interface seems to work with my Linux box running OMV. It’s fanless aluminum which I think is OK for a single disk.

    The device has a physical rocker switch for power. I need to test it but I assume that means it will come back on when power is restored. I think it’s gone to sleep a few times but seems to come back every time. I saw a couple of resets from the kernel driver early in trying to use it but have gone most of a day now without a single reset. So good?

    The chipset is some sort of JMicron, not sure which.

    Yottamaster PS200U3: it doesn’t work?

    Much worse first experience with a Yottamaster PS200U3, an $80 dual disk enclosure. The physical design is pretty nice, they even included decent tools and extra screws. No fan and it’s plastic inside, so some question about thermals.

    It has pushbutton power you definitely have to press to get it to turn on after a power failure. (Confirmed with support.) But none of that matters because the device is just not working. Ugh!

    The chipset is a JMicron JMS56x.

    I set it up with TrueNAS and my ZFS array disappeared within a few minutes. At least, TrueNAS can’t find it. Looking at the kernel log I see a whole lot of errors from the UAS driver and attempts to reset followed by giving up on the device entirely.

    Whatever the case the disks in this enclosure just do not work with TrueNAS Scale. There could be other sources of errors: the disks themselves, or the virtualization in the middle, or maybe TrueNAS. But I’m inclined to blame the enclosure.

    Here’s an example dmesg log for errors in TrueNAS for the USB disk. This gets triggered just trying to look at the Storage status page in the web UI. The first line shows a reset at 67 seconds, the rest of it is another problem starting at 300 seconds (errors) and then a reset at 330s. I think these things have a 5 minute idle timeout so that doesn’t even explain it.

    [   67.520053] scsi host3: uas_eh_device_reset_handler success
    [  300.078621] sd 3:0:0:1: [sdc] tag#6 uas_eh_abort_handler 0 uas-tag 4 inflight: IN
    [  300.078642] sd 3:0:0:1: [sdc] tag#6 CDB: Read(16) 88 00 00 00 00 00 00 00 22 20 00 00 00 e0 00 00
    [  330.798889] sd 3:0:0:1: [sdc] tag#4 uas_eh_abort_handler 0 uas-tag 3 inflight: IN
    [  330.799100] sd 3:0:0:1: [sdc] tag#4 CDB: ATA command pass through(16) 85 08 0e 00 d5 00 01 00 06 00 4f 00 c2 00 b0 00
    [  330.814733] scsi host3: uas_eh_device_reset_handler start
    [  331.071192] usb 3-1: reset SuperSpeed USB device number 2 using xhci_hcd
    [  331.096325] scsi host3: uas_eh_device_reset_handler success
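
    A log like this is easier to reason about once the timestamps and handler names are pulled out. Here is a minimal Python sketch, assuming only the line format visible in the excerpt above; the regexes and the function name are illustrative, not part of any existing tool.

```python
import re

# Matches lines like "[  300.078621] sd 3:0:0:1: [sdc] tag#6 uas_eh_abort_handler 0 ..."
# Both patterns are assumptions based only on the excerpt above.
LINE_RE = re.compile(r"^\s*\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")
EVENT_RE = re.compile(r"(uas_eh_\w+)(?:\s+(start|success))?")

def uas_events(log_text):
    """Return (seconds, handler, state) tuples for UAS error-handler lines."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        ev = EVENT_RE.search(m.group("msg"))
        if ev:
            events.append((float(m.group("ts")), ev.group(1), ev.group(2)))
    return events

SAMPLE = """\
[   67.520053] scsi host3: uas_eh_device_reset_handler success
[  300.078621] sd 3:0:0:1: [sdc] tag#6 uas_eh_abort_handler 0 uas-tag 4 inflight: IN
[  330.798889] sd 3:0:0:1: [sdc] tag#4 uas_eh_abort_handler 0 uas-tag 3 inflight: IN
[  330.814733] scsi host3: uas_eh_device_reset_handler start
[  331.096325] scsi host3: uas_eh_device_reset_handler success
"""

for ts, handler, state in uas_events(SAMPLE):
    print(f"{ts:10.3f}s  {handler}  {state or ''}")
```

    The two abort lines are about 30.7 seconds apart. That is close to the 30 second default SCSI command timeout on Linux, which may or may not be a coincidence.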
    

    On further testing I can reproduce the problem by trying to open the Storage Dashboard in TrueNAS, it happens even if I try again in just over a minute. My guess is TrueNAS is issuing some commands via UAS that the controller or disks really don’t like and it’s crashing the connection to the drive. Could be a Linux bug, could be a controller bug. I sure hope this problem is isolated to the Yottamaster.

    Update

    I’m beginning to suspect TrueNAS Scale is the issue. This exact same hardware is doing fine with a ZFS pool built inside Proxmox. I made a 4TB virtual disk and shared it into an OpenMediaVault VM and so far it’s working great. Was able to write a bunch of data and run fio tests with no problem, no errors from the kernel. Also smartctl works out of the box on Proxmox.

    Maybe TrueNAS is doing more complex probing on the disk? The TrueNAS docs do say not to use it with USB, so maybe that’s why. I may still try TrueNAS with some different enclosures.

    This Proxmox ZFS + OMV solution is not bad. It’s not ideal from a management perspective, I’d rather the NAS VM knew more about the real hardware. But Proxmox is pretty good at that too. And I/O performance as measured by fio is pretty good. Some numbers:

    Sequential writes on PVE (direct): 727 MiB/s, 186k IOPS. ?!
    Sequential writes via NFS: 70.3 MiB/s, 18.0k IOPS.
    Sequential writes via SMB: 70.2 MiB/s, 18.0k IOPS.

    Random writes on PVE: 26.3 MiB/s, 6734 IOPS.
    Random writes via SMB: 18.6 MiB/s, 4759 IOPS.
    Random writes via NFS: 29.1 MiB/s, 7458 IOPS.

    Basically this tells me it’s fast enough. The actual physical disks are probably about 100-150 MiB/s depending on where on the platter you’re writing. So ZFS is costing us something, and then the network overhead is more, but it’s all reasonably fast for low end hardware.

    Clearly that 727MiB/s number is bogus, I guess ZFS write caching. Weird because I have --end_fsync=1. If I add --fsync=10000 it backs off to about 73MiB/s, more plausible.
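
    One sanity check on these numbers: throughput should equal IOPS times the I/O size, so dividing one by the other gives the implied block size. A small Python sketch using only the figures quoted above (the labels are shorthand, and "186k" is taken as 186,000):

```python
MIB = 1024 * 1024

# (label, MiB/s, IOPS) copied from the fio results above
results = [
    ("seq write, PVE direct", 727.0, 186_000),
    ("seq write, NFS", 70.3, 18_000),
    ("seq write, SMB", 70.2, 18_000),
    ("rand write, PVE", 26.3, 6734),
    ("rand write, SMB", 18.6, 4759),
    ("rand write, NFS", 29.1, 7458),
]

def implied_block_kib(mib_per_s, iops):
    """Average I/O size in KiB implied by a throughput/IOPS pair."""
    return mib_per_s * MIB / iops / 1024

for label, mibs, iops in results:
    print(f"{label:24s} {implied_block_kib(mibs, iops):5.2f} KiB per I/O")
```

    Every row comes out at roughly 4 KiB, which suggests the tests ran at a 4 KiB block size (fio's default), though the post does not give the exact command line. Sequential numbers at 4 KiB say more about per-request overhead than about what the disks can stream.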

  • Samsung SSD firmware updates

    This post is a rant about how bad Samsung’s SSD software is. Summary: the ISOs only boot with legacy BIOS, not UEFI. Samsung Magician for Windows is awful.

    My 256GB Samsung 970 Evo used to be a system drive, then sat idle, and is now a new system drive replacing the failed one in my BOSGAME. But this new (old) drive is also suspect, I got a scary looking error from the kernel last night:

    [28863.608367] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
    [28863.609453] sd 2:0:0:0: [sda] tag#0 Sense Key : Illegal Request [current]
    [28863.610523] sd 2:0:0:0: [sda] tag#0 Add. Sense: Invalid command operation code
    [28863.611537] sd 2:0:0:0: [sda] tag#0 CDB: Write same(16) 93 08 00 00 00 00 00 02 fe 90 00 00 09 70 00 00
    [28863.612526] critical target error, dev sda, sector 196240 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
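
    The CDB bytes in that log say what the kernel actually asked for. Below is a minimal Python sketch that decodes them, assuming the standard WRITE SAME(16) layout from the SCSI block command set (opcode 0x93, UNMAP flag in byte 1, 8-byte LBA, 4-byte block count); the function name is made up for illustration.

```python
def decode_write_same16(cdb_hex):
    """Decode the main fields of a SCSI WRITE SAME(16) CDB given as a hex string."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 16 or cdb[0] != 0x93:
        raise ValueError("not a WRITE SAME(16) CDB")
    return {
        "unmap": bool(cdb[1] & 0x08),                 # UNMAP bit: deallocate instead of writing
        "lba": int.from_bytes(cdb[2:10], "big"),      # starting logical block address
        "blocks": int.from_bytes(cdb[10:14], "big"),  # number of logical blocks
    }

fields = decode_write_same16("93 08 00 00 00 00 00 02 fe 90 00 00 09 70 00 00")
print(fields)
```

    The decoded LBA is 196240, the same sector named in the last log line, and the UNMAP bit is set. That fits the DISCARD op: the kernel tried to discard blocks using WRITE SAME with UNMAP, and the device (or whatever bridge sits in front of it) answered "Invalid command operation code". So this looks more like an unsupported discard method than failing flash, though that is a reading of the log, not a diagnosis.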

    So I checked some error logs with nvme and the firmware version with smartctl. I had firmware 1B2QEXE7 but there might be a 2B2QEXE7. There’s no clear way to update firmware from a running Linux system, so looking for options…

    Updating from ISO

    Samsung makes firmware available as ISO updates. Burn a small USB disk, boot it, update. Only joke’s on me, their ISO images won’t boot on a modern BIOS. Doesn’t seem to work with UEFI at all, secure boot or no. I could get it to boot in legacy mode / MBR, at least in a test VM, but screw that. (BTW the ISO contains a Linux system, lol.)

    So I gave up. I pulled the drive and put it in a USB enclosure to update on my Windows machine. Which didn’t work either.

    Samsung Magician Windows Updates

    Samsung has a consumer management tool called Magician. I thought it used to be pretty good but over the years it has enshittified.

    I had a copy of Magician on my Windows system already, version 5.0.0.790. Which lies about auto-updates. Also the uninstaller fails to handle the very simple case where the program is running in the foreground.

    [Screenshot: Magician claiming 5.0.0 is the latest version]

    The installer is terrible. The 8.1 installer takes most of a minute to put a window on the screen. Which then inquires if I’m a member of a European GDPR country (or Brazil), so I guess this thing is preparing to fully invade my American privacy. (I wonder if Samsung is aware of California law?) Magician installs in the years-deprecated Program Files (x86) directory. It now takes up 677MB of disk space when installed (5.0 was 25MB).

    The usability is terrible. You can’t copy and paste, you have to re-type the serial number off the screen. Don’t mix up those 0s and Os! Also it has stupid formatting for the SMART data, but at least here there’s an export button. Temperature in Kelvin? Really?

    Also the functionality is terrible. I can’t do anything with the USB drive I’m trying to fix. It seems upset that it can’t mount the filesystem (it’s a ZFS Linux system, no duh.) and it refuses to do anything else with the drive because of that. Not a very useful hardware management tool.

    Magician also says another drive is counterfeit, an internal drive I’ve used for seven years to boot Windows. That’s not impossible, but awfully unlikely. This forum thread suggests that Magician has bugs leading to false counterfeits. Either way it locks me out of updating the firmware, and I think I’m two revisions behind. Ugh.

    Magician couldn’t do anything with two of three hard drives I tried. Why is so much software so bad? Do they not give a shit?

    Update: I reinstalled Magician and ran it again and now it says the drive is legit, not counterfeit. (Also a Samsung service person looked up the serial number and said the same.) I upgraded the firmware. Revealing another bug: the firmware update says “your computer will be rebooted”. But it’s not, it shuts it down instead, which caused a brief moment of panic until I thought to hit the power button.

    Did I mention Magician has some weird input thing where it blocks my global screenshot hotkey if the window has focus? How do you even do that?!

    Back to the ISO

    So I tried again to boot the update tool from the ISO image. I failed. First I tried setting my BIOS to legacy boot and booting that way: it didn’t recognize the USB I burned with UNetbootin (per Samsung’s instructions) as bootable. Then I tried using Ventoy to boot it as an ISO. No such luck.

    The only thing I’ve seen booting this ISO is Proxmox with the legacy BIOS. I suppose I could pass through the SSD to a virtual machine, maybe via USB, and update that way. But yikes that seems risky.

    So I gave up. Not updating this firmware. Thanks, Samsung. Between this mess and the disaster of data loss on 980s and 990s I’m unlikely to buy another Samsung drive.

  • OpenMediaVault, exFAT, writeable

    OpenMediaVault, the NAS, now officially supports exFAT. It’s still a lousy choice for a Unix filesystem.

    I had it working with read/write access. But then the write access stopped working. Judging by searches I’m not the first one this has happened to.

    I got it working again by manually editing /etc/openmediavault/config.xml and changing the options for the fstab mount to defaults,nofail,fmask=000,dmask=000. Then I ran omv-salt deploy run fstab to regenerate /etc/fstab and rebooted. Now the Unix directory in /srv has everything with permissions 777 (whee!) and SMB users can write. Total mockery of POSIX file permissions but eh, it’s exFAT.
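
    For anyone puzzling over what fmask and dmask do: exFAT stores no Unix permissions, so the driver invents a mode for every file and directory by starting from 777 and clearing the masked bits. A rough Python sketch of that arithmetic, ignoring the FAT read-only attribute and the uid/gid options (the function name is just for illustration):

```python
import stat

def exfat_mode(mask, is_dir):
    """Permission string a FAT-style mount reports for every file or directory."""
    base = 0o777
    kind = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(kind | (base & ~mask))

print(exfat_mode(0o000, is_dir=False))  # fmask=000
print(exfat_mode(0o000, is_dir=True))   # dmask=000
print(exfat_mode(0o022, is_dir=False))  # a more conservative fmask=022, for comparison
```

    With both masks at 000 everything is world-writable, which is why SMB users can write no matter which Unix account Samba maps them to.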

    Some other advice from the old days was to use exfat-fuse instead of exfat. I think that’s from when the Linux kernel could only read exFAT, not write it.

    The main advice is “don’t use exFAT”. Which fair enough. But it’s nice to pop a USB drive in from a Windows machine or Mac and have it on the NAS.

  • Proxmox notifications: gotify, email

    Proxmox sends emails for important notifications, there’s nice docs. But email sucks in 2024, I have no idea how to get mail like this delivered. The only other option Proxmox supports is Gotify. I wish it had some generic HTTP post or something simple, looks like the Proxmox devs are adding webhook endpoints. ntfy.sh seems popularly requested but can be supported in a webhook. See also apprise and UnifiedPush.

    Gotify

    Gotify is a little self-hosted server that receives notifications and does things with them. It has very nice docs. Gonna tinker by installing it in an LXC in Proxmox. This server should really be hosted elsewhere, ideally a cloud service, but those seem to start at $9/mo so let’s just hack in a container for now.

    Installing was easy with the tteck script. I added tailscale and had to manually wangle the LXC config to allow access to the tunnel device, per usual. With that gotify’s web server was available. The default login is admin/admin.

    AFAICT tteck doesn’t install any config script. There’s a systemd unit to start it and the software is in /opt/gotify. I think /etc/gotify/config.yml would be a reasonable place for a config. It actually seems OK with no config file? It must have reasonable defaults, like sqlite3.

    Getting Gotify up and running with Proxmox isn’t too hard. You create an application, take the token, then set up a notification destination in Proxmox. I also told Proxmox’s notification matcher to deliver to gotify. Test messages post immediately and are visible in the webapp. The Android app works but I don’t like the design: I want something else connected to Gotify and just delivering push messages to my phone.
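
    The application token is also all you need to post messages from your own scripts. A minimal Python sketch, assuming Gotify's documented POST /message endpoint with the token in the X-Gotify-Key header; the host name and token below are placeholders, and the request is only built here, not sent.

```python
import json
import urllib.request

def build_gotify_request(base_url, app_token, title, message, priority=5):
    """Build (but do not send) a POST to Gotify's /message endpoint."""
    body = json.dumps({"title": title, "message": message, "priority": priority})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/message",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Gotify-Key": app_token},
        method="POST",
    )

# Hypothetical host and token, for illustration only.
req = build_gotify_request("http://gotify.example:80", "APP_TOKEN_HERE",
                           "Test", "Hello from a script")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

    Anything that can make an HTTP POST can feed Gotify this way, which is handy for cron jobs and other things on the network that Proxmox doesn't know about.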

    I could also use some other delivery options. Deliver to RSS in particular. There’s a bunch of plugins and contrib stuff. No Atom or RSS! There’s a Slack plugin but it’s 5 years old and doesn’t work with modern Slack tokens.

    Lots of interesting plugins for receiving notifications too. A zillion RSS receivers. Support for webhooks, yay. Also an SMTP gateway. Also appreciate this reverse proxy so I could use a public IP instead of Tailscale.

    I dunno, I’m left thinking the core Gotify message router is pretty good but usability and features are not. The default message view is very limited (no time filtering, doesn’t even show the application). The Android and other delivery options look limited. I’ll keep using Gotify for now because it’s all Proxmox supports but once PVE adds webhooks or something I should re-evaluate.

    Email

    The default configuration has you specify an Internet email address that root mail should be forwarded to. But they don’t do anything to actually help you get email delivered. The default config will gamely just try to SMTP deliver the mail but of course no modern mail system relays mail from strangers. The mail just disappears into a postfix queue until it times out.

    How about local delivery? I first tried making root’s forwarding address root, the empty string, root@localhost, or other variants. None worked: either invalid or a mailer loop. I then fixed it by removing ~root/.forward. Now postfix will write to /var/mail/root. Not ideal but at least it’s not disappearing.

    Another local option is to create a second user account on the PVE host (like “nelson”) and forward root’s mail to it. But I’m trying to avoid that.

    Remote delivery should be possible, in the past at least it was possible to get Gmail to relay mail for you if you were authenticated enough. But every time I look at doing that it gets harder. Screw it.

  • Migrating Proxmox to a new install

    After my Proxmox disk failed I managed to clone the disk before it died entirely and boot from it. But that’s temporary hardware, now I want to set up Proxmox again on a new permanent disk, a 256GB 970 EVO I had lying around.

    I’ve decided to go for a full rebuild, installing PVE and configuring it as if new. Then restoring all my guests from a backup. There’s a simpler path where you just restore /var/lib/pve-cluster/config.db /etc/hosts /etc/hostname but I kind of want to redo the setup. I did this before in January for a different server. This post is my boring, careful notes on exactly what I did.

    Backing up old host

    First thing is to back up everything on the old host disk while the old host is still working. I stopped all the guests and did a full backup of every single one. This is the precious data, what I’ll be restoring.

    Second is to back up /etc and /var/lib/pve-cluster/. Note that /etc/pve is special and backed by a live-mounted database in pve-cluster. I’m backing up the stuff in /etc/pve as if they were normal files (for future diffing) and taking a backup of the database with sqlite3 config.db ".backup 'config.db.backup'".

    After shutdown I’m going to unplug the old host disk for safekeeping, if things go badly I can use it to restore. Note it’s awkward to read the files on it because after you install a new Proxmox you end up with two ZFS pools named rpool that both want to mount to /. zpool import -d -R should help with that but I found it tricky. Instead I’ve backed up the files I want to read to the backup disk.

    Installing new Proxmox PVE server

    Do the regular Proxmox install from USB. There’s very few options there, I love how simple the Proxmox installer is. But this is what I did:

    1. ZFS RAID0 as the filesystem, with the one internal SSD
    2. United States, UTC timezone, US English keyboard
    3. Password and email as nelson@monkey.org even though I’ve never set up outgoing mail
    4. Hostname gvpve. Verify IP address is correct (thank you DHCP).

    Improving Proxmox install

    These are my generic customizations to any new Proxmox install

    1. Add my ssh key to ~root/.ssh/authorized_keys. Don’t disturb the key for root PVE made!
    2. Run the tteck helper script “Proxmox VE Post Install” and accept all defaults. This does a few things to make a new install nicer.
      • enable no-subscription repositories
      • disable the nag message
      • disable the high availability services (pve-ha-lrm, pve-ha-crm, corosync)
      • software update to Proxmox (takes a few minutes)
    3. Reboot
    4. Install some useful software. apt install joe sudo avahi-daemon
    5. Install tailscale. Start with tailscale up --accept-dns=false. Disable key expiry in Tailscale console.

    Restoring my guests

    Proxmox can do all this automatically if you have a cluster, this task just becomes a migration. (Automatic, if you have HA configured.) But it’s nice to do it manually as a recovery exercise.

    1. Plug in the USB disks for backups and USB media and verify with lsblk
    2. Restore /etc/fstab mounts for these disks (look at backup file from old server).
    3. Mount /mnt/backup and /mnt/usb-media
    4. Under Datacenter > Storage, add my Proxmox backup volume
      id: backup, type: Directory, Content: VZDump backup file, Path: /mnt/backup
    5. Remove “backup” as a content for the storage named “local”
    6. Under Datacenter > PVE > Backup, locate the files for the backups I took just before shutting down old host
    7. Restore each one manually (a lot of clicking!). Re-type the guest IDs instead of taking newly assigned ones. Probably unnecessary.
    8. Reboot to start all the relevant guests.

    The restores all went pretty well, Proxmox backs up the configuration of the guest as well as the guest’s disk images. For some reason the config for a TrueNAS VM changed, it added back the ISO for the install media. This could actually break the guest although it’s easy to fix. The vmgenid also changed on every VM.

    Finishing up

    1. Add my Influx DB as a system metrics server
    2. Create a new backup job on Proxmox
      All guests, 9am, keep 6 monthly / 4 weekly / 7 daily
    3. Test a manual backup. MISTAKE. The daily retention policy means this manual backup erased the reference backup from earlier today. I’d already used those to restore all my systems, so it’s probably no big deal, but definitely not the right thing to do.
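
    The mistake in step 3 follows from how daily retention is documented to work: when a day has more than one backup, only the latest counts as that day's backup. A simplified Python sketch of just the keep-daily rule (not Proxmox's actual prune code; the timestamps are hypothetical):

```python
from datetime import datetime

def keep_daily(timestamps, keep):
    """Keep the newest backup from each of the most recent `keep` calendar days."""
    newest_per_day = {}
    for ts in sorted(timestamps):
        newest_per_day[ts.date()] = ts  # later backups on the same day replace earlier ones
    days = sorted(newest_per_day, reverse=True)[:keep]
    return sorted(newest_per_day[d] for d in days)

# Hypothetical timestamps: a reference backup, then a manual test backup the same day.
reference = datetime(2024, 11, 1, 8, 0)
manual_test = datetime(2024, 11, 1, 15, 30)
kept = keep_daily([reference, manual_test], keep=7)
print(kept)
```

    Only the afternoon backup survives. Running the manual test without pruning, or marking the reference backups as protected first, would avoid the problem.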

    I tried diffing the old server’s /etc/pve (on backup disk) to the new one. Didn’t learn anything really. Lots of differences but most of them are what you’d expect, new security keys. Also missing config files because I looked before I restored the guest hosts. I did spot the config change for the TrueNAS VM though, still puzzled by that.

  • Proxmox split disk idea

    A setup I did not pursue, but seemed an interesting idea.

    Proxmox by default installs itself to a disk and takes over the whole disk. You can set it up with ext4 (LVM) or zfs and use that disk usefully for guest disk images, ISOs, backups, etc. But the whole thing belongs to Proxmox.

    What if instead you install Proxmox to a small partition (it only needs 16GB) and then allocate the rest as a ZFS pool? That seems like a clean separation. It also means you can use ext4 for Proxmox which might make recovery simpler.

    I went down this road, following notes similar to this guy. I told Proxmox to install to 32GB ext4. It creates a partition #3 with LVM. Inside is 3.8GB for swap, 13.6GB for root, and two 11.6GB marked “pve-data”. The rest of the disk is then unused.

    So I used fdisk to add a partition of type bf01 (zfs). The next step would be creating a zpool on it, then telling Proxmox to use that as storage for things. But I realized in the end I’d just be replicating more or less what Proxmox does by default with a ZFS install, only with non-standard names and setup for questionable benefit. So I stopped.

    There’s something vexing about systems having very capable but unreliable SSDs as their install drives. I want to use that storage better!

  • Some ZFS exploring, disk recovery

    My BOSGAME N100’s boot SSD failed. So I learned some ZFS to recover the files. I don’t think too much important is on this disk. Proxmox has been faithfully backing up system images to another drive and I have no reason to doubt them. But I’d like to rescue the Proxmox system itself, particularly since I don’t remember what all I customized on it (and don’t have a backup).

    Summary

    This post is more scattered than usual and unlikely to be interesting. This is what I learned:

    • You can boot off a dd clone of the Proxmox ZFS boot disk and it will work fine.
    • My disk failed with random access reads but I could use dd to clone it still. Weird.
    • You can recover the files on a ZFS pool with a system rescue disk. The key command is zpool import which brings the pool online in a new operating system.
    • zpool list and zpool status give you basic info. zdb is useful if you can’t get a pool online.
    • Cheap SSDs are not to be trusted.

    Lots of details

    The big help here was the SystemRescue-ZFS fork. That’s the venerable Linux SystemRescueCD modernized and with ZFS installed for you.

    ZFS docs are confusing. There were a bunch written really well by Sun and/or Oracle in the early days. But that was over a decade ago. OpenZFS has good man pages but I’m missing an overall gestalt of understanding the system. And ZFS is complicated, it’s a comprehensive implementation of a lot of data storage concepts.

    Claude was a big help for me exploring ZFS. So was this StackOverflow post which has some recovery commands to try. And this recovery book was mildly helpful but is mostly about how to recover from partial failures where your pool still exists. My whole pool is gone because I’m using ZFS on a single disk, shame on me. These Proxmox docs for ZFS are also helpful.

    One ZFS concept I’ve learned: ZFS doesn’t really have fsck. It’s constantly fixing itself (hopefully), there’s no big check-the-thing-offline. (scrub is the closest it gets.) A second concept I’m only getting now: ZFS has a notion of “imported” pools, where the ZFS system becomes aware of some storage. It’s a little like mounting a filesystem but it is not just mounting into Unix namespace, it’s also enabling all the access to the pool, datasets, filesystems, etc.

    Another key concept: Proxmox creates a ZFS pool called rpool for its system boot disk. That’s what’s not working for me.

    Using SystemRescue

    SystemRescue is great, and the ZFS fork is perfect for me. Try setfont -d if the console font is too small. Remember that Linux lets you have multiple console windows: Try Ctrl-Alt-F2 to switch to window 2. Even better set a root password and then systemctl stop iptables.service so you can ssh in from a humane terminal emulator.

    Cloning the bad disk

    I suspected hardware failure so first thing I did once I had the rescue system booted was dd if=/dev/sda of=/dev/sdc bs=1M to do a block by block clone of the disk. This worked with no hitches, suggesting maybe the disk is not entirely gone. I unplugged the backup, that may be crucial since ZFS might be confused by having two copies of the same pool.
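
    For reference, that dd command does nothing clever: it reads the source front to back in 1 MiB chunks and writes each chunk to the target. A minimal Python sketch of the same loop, demonstrated on a temporary file instead of a real disk (the function name is made up):

```python
import os
import tempfile

def clone_blocks(src_path, dst_path, block_size=1024 * 1024):
    """Copy src to dst in fixed-size blocks, like `dd bs=1M`. Returns bytes copied."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            copied += len(block)
        dst.flush()
        os.fsync(dst.fileno())  # make sure the data has reached the target
    return copied

# Demonstrate on a small temporary file rather than a real disk.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.img")
    dst = os.path.join(tmp, "dst.img")
    with open(src, "wb") as f:
        f.write(os.urandom(3 * 1024 * 1024 + 123))
    n = clone_blocks(src, dst)
    with open(src, "rb") as a, open(dst, "rb") as b:
        identical = a.read() == b.read()
    print(n, identical)
```

    The access pattern is purely sequential, large reads with one request in flight at a time. That is very different from what a filesystem does when it imports a pool, which may be part of why the clone succeeded on a disk that otherwise misbehaved.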

    Exploring the problem disk with SystemRescue

    I booted SystemRescue and first thing I tried was zpool status. It says no pools available. Same thing with zpool list. Um, is that bad? (I’d briefly tried Proxmox rescue too and it said the same thing). So I asked some AIs to go deeper and got to the ZFS debugger. zdb -lu /dev/sda3 does find some ZFS labels and stuff that I need. So the data is there and at least a little ZFS-shaped.

    Asking more AIs I learned that you have to “import” a pool: presumably this SystemRescue ZFS system has never heard of this ZFS pool and so didn’t import it. Sure enough zpool import finds my missing rpool. It doesn’t want to import it by default because it wasn’t cleanly exported by some previous system. But I can zpool import -f rpool to force it. Which I have done and it takes an alarmingly long time to respond! What’s it doing? After a couple of minutes it gives up and says cannot import 'rpool': one or more devices is currently unavailable. That’s bad right?

    Yes, it’s bad. journalctl has the news. The kernel’s ata2 port, the link to the SSD, is reporting errors. A lot of them, but the bad ones are failed command: READ FPDMA QUEUED, hard resetting link, COMRESET failed, and eventually ata2: reset failed, giving up. At which point zio and sd 1:0:0:0 start complaining about read errors. Also I have to reboot now, /dev/sda is no longer a working device.

    This sure seems like a hardware failure to me! I wonder why the dd worked though? Maybe it’s just random access reads that fail? Or maybe zpool import was trying to write and it’s the writes that are failing? But those errors were triggered by reads… I tried again with zpool import -o readonly... and got the same errors. I think this disk is cooked.

    Exploring the cloned image

    I took a gamble and tried to import the cloned backup disk. It worked immediately! zpool import -o readonly -d /dev/sdc3 -f rpool. Note I’m using the -d option to specify what device to import from because I think I have two different disk devices with an rpool on them (one of which is offline.) Also that readonly turns out to be a lie.

    I think zpool import tries to mount all the filesystems where they were last time. But there was some error about rpool/ROOT/pve-1 not mounting, I assume because it wanted to mount to /. So nothing mounted. I can manually mount other filesystems with zfs mount but I really need pve-1 and it really wants to mount to /.

    In theory zpool import -R /tmp/restore should mount everything under /tmp/restore. But I kept getting confusing errors. I finally got the thing mounted by doing zfs set mountpoint=/tmp/rr rpool/ROOT/pve-1. Which mounted it under /tmp/restore/tmp/rr, not what I wanted, but close enough. (Note this edit was permanent, written to the disk, despite the pool being imported readonly. I had to undo it later.)
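
    The /tmp/restore/tmp/rr result makes sense once you know that the altroot given with -R is prepended to each dataset's mountpoint property. A tiny Python sketch of that rule (the function name is illustrative, and this only models the path arithmetic, not ZFS itself):

```python
import posixpath

def effective_mountpoint(altroot, mountpoint):
    """Where a dataset lands when its pool is imported with -R altroot."""
    return posixpath.normpath(altroot + "/" + mountpoint)

print(effective_mountpoint("/tmp/restore", "/tmp/rr"))  # mountpoint after the zfs set
print(effective_mountpoint("/tmp/restore", "/"))        # the original mountpoint of pve-1
```

    By that rule, leaving the mountpoint at / should have put the root filesystem directly at /tmp/restore, so changing the property was probably not needed for the path itself. Whatever caused the confusing errors was something else.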

    Anyway all this hackery later and I have access to my files! They are mounted and I can make a backup of my Proxmox boot disk.

    Backing up Proxmox PVE itself

    One annoying thing about Proxmox is it has no system for backing up Proxmox itself. Everyone says “don’t customize it, just run guests and back those up” and mostly that works but still, it’d sure be nice to have a backup for situations like this.

    Here’s a five-year-old gist backup script. The discussion there is more valuable than the script. It backs up a bunch of stuff in /etc one by one. Me, I just took all of /etc.

    The Proxmox docs for Recovery are also useful. They suggest backing up /var/lib/pve-cluster/config.db /etc/hosts /etc/hostname.

    Big gotcha here: a lot of configs are in /etc/pve. But those aren’t normal files stored in ZFS. Instead that’s a FUSE filesystem Proxmox runs (keyword pmxcfs) so in my non-running system here /etc/pve is empty and my tar got nothing. Those files are backed by a sqlite database in /var/lib/pve-cluster/, as the docs above discuss. With the system offline you can just copy those files. With it online and live something like sqlite3 config.db ".backup 'config.db.backup'" is safer.
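
    The same online backup is available from Python's standard library through Connection.backup(), if you'd rather script it than shell out to sqlite3. A minimal sketch, run against a throwaway database; the table and row are made up and have nothing to do with the real config.db schema.

```python
import os
import sqlite3
import tempfile

def backup_sqlite(src_path, dst_path):
    """Take a consistent copy of a live SQLite database using the online backup API."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # copies page by page while other connections may stay open
    finally:
        dst.close()
        src.close()

# Demonstrate on a throwaway database; the table and row here are made up.
with tempfile.TemporaryDirectory() as tmp:
    live = os.path.join(tmp, "config.db")
    copy = os.path.join(tmp, "config.db.backup")
    con = sqlite3.connect(live)
    con.execute("CREATE TABLE demo (name TEXT, data TEXT)")
    con.execute("INSERT INTO demo VALUES ('example.cfg', 'hello')")
    con.commit()
    con.close()
    backup_sqlite(live, copy)
    con = sqlite3.connect(copy)
    rows = con.execute("SELECT name, data FROM demo").fetchall()
    con.close()
    print(rows)
```

    The point of going through the backup API instead of cp is that a plain file copy of a database that is being written to can capture a half-finished transaction.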

    Setting up a replacement Proxmox

    Now that I have a backup of my files I’ve got several options to bring something back online.

    1. Try jiggling the SSD a bit, maybe it’s just a loose connection? This problem started after I opened the case, maybe I disturbed it. Nope, no such luck.
    2. Try booting off the clone I made. That might Just Work, actually. Probably should make a second clone first. Yep, this worked!
    3. Reinstall Proxmox on clean disk, then follow those recovery notes to do a quick restore. But I have to re-apply whatever other customizations I did myself, too. And manually restore all the guest hosts.
    4. Reinstall Proxmox and just configure it again. May not be much more work than #3.

    No matter what I do I should replace that SSD, I’m never going to trust it. Was a cheap one anyway.

    New Install

    Option 2 worked so I have a working server again. I ran zpool scrub just to be safe and it found no errors.

    But I’d like to install cleanly on a new SSD. So I ordered a 1TB WD_BLACK SN850X based on these reviews. I used to be a Samsung partisan but they really screwed up with the 980s and 990s. There’s an argument that you should just use a tiny scratch drive to install Proxmox on, or maybe a fancy enterprise-grade SSD instead. Eh, I’ll go with this. (No one makes high quality 512GB SSDs anymore!)

    I will probably go with option 3 on this new install. Reinstall Proxmox, but then clone the config by copying a few files around. If I’m patient I should do option 4 and change the way some things work on my system. This was the first Proxmox install I did and I could do a cleaner job the fourth time around.

    Bonus: Exploring a healthy Proxmox install disk

    Before I worked on the bad disk I took a look at what a good disk looks like in ZFS. Here’s some things I learned from a working Proxmox system with no problems. Proxmox is installed to a single 2TB SSD with default ZFS options.

    The physical disk is /dev/nvme0n1. Proxmox created three GPT partitions. fdisk tells me nvme0n1p1 is about 1M big and is labelled “BIOS boot”. p2 is 1GB big and is “EFI System”. The rest of the disk is in partition 3 labelled “Solaris /usr & Apple ZFS”. lsblk -o +PARTTYPE will show the GUIDs for the types and that partition 3 is 6a898cc3-1dd2-11b2-99a6-080020736631. Which honest-to-goodness has two meanings in this reference table. (Neither of which is “OpenZFS” or “Linux ZFS”, lol.)
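
    To make the double meaning concrete, here is a small Python lookup for the three partition types on this disk. The ZFS GUID is from the lsblk output above; the other two are the standard GPT type GUIDs for BIOS boot and EFI System partitions. The table is deliberately tiny and the function name is made up.

```python
# Well-known GPT partition type GUIDs (lowercase). One GUID can carry several names.
GPT_TYPES = {
    "21686148-6449-6e6f-744e-656564454649": ["BIOS boot"],
    "c12a7328-f81f-11d2-ba4b-00a0c93ec93b": ["EFI System"],
    "6a898cc3-1dd2-11b2-99a6-080020736631": ["Solaris /usr", "Apple ZFS"],
}

def describe(parttype):
    """Return every known name for a GPT partition type GUID."""
    names = GPT_TYPES.get(parttype.lower())
    return " or ".join(names) if names else "unknown"

print(describe("6a898cc3-1dd2-11b2-99a6-080020736631"))
```

    The partition type is only a hint for partitioning tools. ZFS finds its pools by reading the labels on the device, which is why zdb -l is the more trustworthy check.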

    lsblk also shows the SSD partitions and a couple of USB drives I have. And a bunch of /dev/zd* devices, which are the block device names for ZFS volumes (zvols), most likely the VM disk images here. Presumably I shouldn’t mess with them.

    zpool status shows that I have one pool named rpool, that’s what Proxmox created. It has one device in it, nvme-eui.002538b22140b542-part3, which corresponds to my SSD partition. zpool list shows me a basic status of rpool, how much disk space is free and the like.

    zfs list shows me a whole bunch of datasets in ZFS, I think this is the top list of everything in this pool. Most of these are under rpool/data and are either filesystems (for containers) or disk images (for VMs). But there’s also a few extras that I think are related to the PVE host: rpool/ROOT/pve-1 is the important one, also rpool/var-lib-vz. rpool/ROOT and rpool itself exist but seem pretty empty, mostly there to hold the child dataset rpool/ROOT/pve-1.

    zfs list -t snapshot shows snapshots, just snapshots and not the primary datasets.

    mount | grep rpool shows me a bunch of ZFS things mounted in various places. rpool/ROOT/pve-1 is mounted on /, that’s my root filesystem. Most other things are mounted under /rpool and include a bunch of subvolumes for container filesystems.

    zfs get all rpool/ROOT/pve-1 gives me a very deep list of properties for the root filesystem. Not much of interest here but it’s exhaustive.

    zdb -l /dev/nvme0n1p3 is a low-level ZFS debugger, it shows the labels on the disk device. I have a single one, “LABEL 0”, with the name rpool, a hostname that matches my PVE host’s name, and a bunch of stuff I don’t understand. Mostly looking at this because it was helpful with the broken disk since zpool wasn’t even showing it.
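
    The whole tour above could be replayed with a small script, roughly like this sketch. The pool, dataset and partition names are the ones from my system and are only defaults; the function names are made up for illustration. All of these commands are read-only, and anything not installed is skipped.

```python
import shutil
import subprocess


def inspection_commands(pool="rpool", root="rpool/ROOT/pve-1", part="/dev/nvme0n1p3"):
    """Read-only commands from the walkthrough, as argv lists."""
    return [
        ["lsblk", "-o", "+PARTTYPE"],      # partitions plus GPT type GUIDs
        ["zpool", "status", pool],         # pool health and member devices
        ["zpool", "list", pool],           # size and free space
        ["zfs", "list"],                   # filesystems and volumes
        ["zfs", "list", "-t", "snapshot"], # snapshots only
        ["zfs", "get", "all", root],       # every property on the root dataset
        ["zdb", "-l", part],               # on-disk ZFS labels
    ]


def run_all(commands):
    """Run each command in turn, skipping tools that are not on PATH."""
    for argv in commands:
        if shutil.which(argv[0]) is None:
            print(f"skipping (not installed): {' '.join(argv)}")
            continue
        print(f"$ {' '.join(argv)}")
        subprocess.run(argv, check=False)


if __name__ == "__main__":
    run_all(inspection_commands())
```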

  • Linux console fonts are too small

    My eyes are getting older and I have big screens: the default Linux console fonts are too small. And I’ve been spending more time than I’d like in consoles lately. Some ideas here for coping.

    setfont -d is the quick fix. This will double the size of the current font as a one-off thing.

    setfont in general is the tool for changing fonts. But it’s hard to find a nice clear choice for an alternate bigger font, there’s a list of a zillion for different languages. I should go find one good font that’s likely to be available on a system and remember its name.

    dpkg-reconfigure -plow console-setup works on Debian systems and gives you a nicer chooser.
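
    One way to find a big font without scrolling the whole list: a rough sketch that picks the largest font by the pixel dimensions in its filename. It assumes the Debian font directory /usr/share/consolefonts and names like Lat15-TerminusBold32x16.psf.gz; other distros keep fonts elsewhere, and fonts named without an NxM size are ignored.

```python
import re
from pathlib import Path

# Debian's console font directory; an assumption for other distros.
FONT_DIR = Path("/usr/share/consolefonts")


def font_area(name: str) -> int:
    """Glyph area in pixels parsed from a name like 'Lat15-TerminusBold32x16.psf.gz'."""
    m = re.search(r"(\d+)x(\d+)", name)
    return int(m.group(1)) * int(m.group(2)) if m else 0


def biggest_font(names):
    """Return the name with the largest NxM glyph size, or None if none carry a size."""
    sized = [n for n in names if font_area(n) > 0]
    return max(sized, key=font_area) if sized else None


if __name__ == "__main__":
    names = [p.name for p in FONT_DIR.glob("*.psf.gz")] if FONT_DIR.is_dir() else []
    print(biggest_font(names))
```

    The result can then be handed to setfont as a full path under that directory.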

  • Some NAS hardware estimates

    I’m trying to get an idea of what building a 10TB NAS would cost if you’re on a budget. Options range from plugging in a single USB drive into an existing N100 miniPC running TrueNAS up to a dedicated Synology device.

    I’m not an expert on this stuff: comments welcome!

    I’m targeting a working set of roughly 10 TB of data. I’m thinking spinning hard drives (HDDs) make more sense. I like SSDs but the speed is mostly irrelevant in a NAS on gigabit ethernet. Maybe SSDs are more durable, but they are 2-3x more expensive. For RAID options I’m considering setups where one disk can fail and we still can operate.

    Summary

    We can get 10TB usable storage for anywhere from $200 to $1500 depending on how we do it.

    Disks: 10TB usable can be done for $210 with a single disk (no recovery). Or RAID recovery with either 4x4TB or 2x10TB. Both RAID options are about $400 for spinning disks. $1300 to go with SSDs.

    USB enclosure: about $200 for a four bay enclosure. Not needed if we just do one or two drives on the N100.

    Separate PC appliance instead of the N100: about $250-$600, lets us avoid USB, also maybe a simpler commercial solution.

    Update: I ordered two 8TB WD Red Plus drives and a Yottamaster PS200U3 two disk enclosure. Basically the cheapest plausible mirrored setup. Total price $400 plus tax (plus the computer to plug them in to). The 10TB WD Red Plus disks are sort of weird. 12TB are OK but noisier and higher power, so starting smaller. Early indications are the Yottamaster is not reliable, having a lot of trouble with USB errors in TrueNAS Scale.

    N100 with TrueNAS and spinning disks

    Variants of what I’ve set up already, a homebrew inexpensive system. The experts will tell you USB is not suitable for real NAS systems. But I think it could be OK.

    10TB Cheapest: a single 10TB 3.5″ HDD on USB to the N100. 10TB is $210.

    10TB RAID1 (mirror): two 10TB HDDs on the N100. $420.

    10TB RAIDZ1: four 4TB HDDs ($400) in a USB enclosure ($200) connected to N100. $600
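
    The arithmetic behind those three options, as a small sketch. It uses the prices quoted above and the simple rules of thumb that a mirror gives you one disk of space and RAIDZ1 gives you all but one disk; it ignores filesystem overhead and the TB vs TiB difference.

```python
def usable_tb(layout: str, disks: int, size_tb: float) -> float:
    """Rough usable capacity, ignoring ZFS overhead."""
    if layout == "single":
        return disks * size_tb
    if layout == "mirror":
        return size_tb                 # every disk holds the same data
    if layout == "raidz1":
        return (disks - 1) * size_tb   # one disk's worth goes to parity
    raise ValueError(f"unknown layout: {layout}")


# (label, layout, disk count, TB per disk, total price in USD from the list above)
OPTIONS = [
    ("single 10TB", "single", 1, 10, 210),
    ("mirror 2x10TB", "mirror", 2, 10, 420),
    ("raidz1 4x4TB + enclosure", "raidz1", 4, 4, 600),
]

for label, layout, disks, size_tb, price in OPTIONS:
    tb = usable_tb(layout, disks, size_tb)
    print(f"{label}: {tb:.0f} TB usable, ${price / tb:.0f} per usable TB")
```

    Note the four disk RAIDZ1 actually comes out at about 12TB usable, a bit over the 10TB target.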

    N100 with TrueNAS and SSDs

    An SSD option. The disks are 3x the cost of HDDs.

    10TB RAIDZ1: four 4TB SATA 2.5″ SSDs ($1300) in a USB enclosure ($200) connected to N100. $1500

    Synology appliance (no N100)

    A third party commercial alternative to building our own NAS on the N100. Not as capable as TrueNAS but easier to understand and support. No USB involved.

    10TB RAID1: two 10TB drives ($420) in a DS223 ($250). $670

    10TB RAID5: four 4TB drives ($440) in a DS423 ($400). $840

    Build our own appliance

    To avoid USB we build a little mini-PC with SATA disks in it. This is a rough guess only.

    10TB RAID5: four 4TB drives ($400) and a cheap custom mini PC ($500?). $900ish

    Extra costs: Local versioned backup drive

    I think this is easy, just a local disk to copy stuff to.

    16TB: 3.5″ external HDD on USB to the N100. $280

    Extra costs: electricity

    50 watts works out to 438kWh a year or about $200/year.
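
    The math, as a sketch you can rerun with your own numbers. The electricity rate here is backed out of the $200 figure (roughly $0.46/kWh); that rate is an inference, so plug in your own.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_kwh(watts: float) -> float:
    """Energy used in a year by a device drawing a constant number of watts."""
    return watts * HOURS_PER_YEAR / 1000


def annual_cost(watts: float, usd_per_kwh: float) -> float:
    """Yearly electricity cost at a flat rate."""
    return annual_kwh(watts) * usd_per_kwh


# Rate implied by "$200/year at 50 watts"; an assumption, not a quoted tariff.
IMPLIED_RATE = 200 / annual_kwh(50)

print(f"{annual_kwh(50):.0f} kWh/year")
print(f"implied rate: ${IMPLIED_RATE:.3f}/kWh")
print(f"25 W at that rate: ${annual_cost(25, IMPLIED_RATE):.0f}/year")
```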

  • SanDisk Extreme Pro: data loss, smartctl

    SMART

    I have a recently purchased 4TB SanDisk Extreme Pro 55AF, an SSD. (Note SanDisk is a Western Digital brand.) smartctl doesn’t recognize it. I found a workaround for it which is to pass -d sntasmedia to smartctl. Doing that gives plausible output but some stuff looks a bit off (like power on hours). See full SMART output below.

    Getting OpenMediaVault to pass this flag is not possible or at least not easy. There’s a remarkably detailed response here which says they want smartmontools to update its database instead. There’s also a suggested workaround which is to add an entry to /etc/smart_drivedb.h but the format for that seems daunting, I didn’t pursue it (yet).

    Bad drive?

    Unfortunately while researching this SMART problem I learned SanDisk Extreme Pro has been implicated in some serious data loss problems. See here, here, or here for reports. SanDisk has said it’s a firmware issue and put out an update; their tool here says we don’t need an update for this drive (if I have the right serial number). OTOH Tom’s Hardware is saying it’s a hardware flaw. That’s very troubling and knowing all this now I’ll think twice before buying any SanDisk or WD external USBs. Not sure what to do with this disk.

    I have seen one problem with this drive, btw. When using hdparm -t to test throughput it reads for 3 seconds. It tests fine, like 800 MB/s. But if I test a second time it tests super badly, like 60 kilobytes/sec, and there’s an error in dmesg saying it had to reset the drive interface. Then the drive seems to work fine again (until I test throughput again.)

    Jul 23 17:50:32 nchsomv (udev-worker)[7715]: sdb: Spawned process 'serial_id /dev/sdb' [7754] is taking longer than 59s to complete
    Jul 23 17:50:32 nchsomv systemd-udevd[295]: sdb: Worker [7715] processing SEQNUM=2565 is taking a long time
    Jul 23 17:50:32 nchsomv kernel: sd 3:0:0:0: [sdb] tag#11 uas_eh_abort_handler 0 uas-tag 2 inflight: IN
    Jul 23 17:50:32 nchsomv kernel: sd 3:0:0:0: [sdb] tag#11 CDB: ATA command pass through(16) 85 08 0e 00 00 00 01 00 00 00 00 00 00 00 ec 00
    Jul 23 17:50:33 nchsomv kernel: scsi host3: uas_eh_device_reset_handler start
    Jul 23 17:50:33 nchsomv kernel: usb 2-1: reset high-speed USB device number 2 using xhci_hcd
    Jul 23 17:50:33 nchsomv kernel: scsi host3: uas_eh_device_reset_handler success
    Jul 23 17:50:33 nchsomv kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
    

    I’ve only seen this error with a synthetic test, and that via a KVM pass-through USB device. So I just kind of shrugged it off. FWIW we’ve successfully written 1.7TB of data to the disk and read 3.2TB off of it, so it’s not failing all the time. Sure don’t care for this uncertainty though.

    Update: I’ve also seen this error now just reading files from the mounted disk (tar cf - . | pv > /dev/null). It’s possible this was related to the disk waking up from a sleep mode. The earlier hdparm failures were very unlikely to be sleep related. Then I was able to read 2.5TB or more off the disk, both streaming (at 400MB a second) and two parallel random access threads. Without a single error.

    SMART output

    # smartctl -a -d sntasmedia /dev/sdb
    smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-23-amd64] (local build)
    Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF INFORMATION SECTION ===
    Model Number:                       WD_BLACK SN850XE 4000GB
    Serial Number:                      240632803712
    Firmware Version:                   624131EX
    PCI Vendor/Subsystem ID:            0x15b7
    IEEE OUI Identifier:                0x001b44
    Total NVM Capacity:                 4,000,787,030,016 [4.00 TB]
    Unallocated NVM Capacity:           0
    Controller ID:                      8224
    NVMe Version:                       1.4
    Number of Namespaces:               1
    Namespace 1 Size/Capacity:          4,000,787,030,016 [4.00 TB]
    Namespace 1 Formatted LBA Size:     512
    Namespace 1 IEEE EUI-64:            001b44 8b47193cd2
    Local Time is:                      Mon Jul 29 21:55:46 2024 UTC
    Firmware Updates (0x14):            2 Slots, no Reset required
    Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
    Optional NVM Commands (0x00df):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp Verify
    Log Page Attributes (0x1e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg
    Maximum Data Transfer Size:         128 Pages
    Warning  Comp. Temp. Threshold:     91 Celsius
    Critical Comp. Temp. Threshold:     94 Celsius
    Namespace 1 Features (0x02):        NA_Fields
    
    Supported Power States
    St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
     0 +     9.00W    9.00W       -    0  0  0  0        0       0
     1 +     6.00W    6.00W       -    0  0  0  0        0       0
     2 +     4.50W    4.50W       -    0  0  0  0        0       0
     3 -   0.0250W       -        -    3  3  3  3     3100   11900
     4 -   0.0050W       -        -    4  4  4  4     3900   45700
    
    Supported LBA Sizes (NSID 0x1)
    Id Fmt  Data  Metadt  Rel_Perf
     0 +     512       0         2
     1 -    4096       0         1
    
    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    SMART/Health Information (NVMe Log 0x02)
    Critical Warning:                   0x00
    Temperature:                        34 Celsius
    Available Spare:                    100%
    Available Spare Threshold:          10%
    Percentage Used:                    0%
    Data Units Read:                    6,286,318 [3.21 TB]
    Data Units Written:                 3,285,119 [1.68 TB]
    Host Read Commands:                 62,684,905
    Host Write Commands:                33,660,622
    Controller Busy Time:               94
    Power Cycles:                       803
    Power On Hours:                     33
    Unsafe Shutdowns:                   8
    Media and Data Integrity Errors:    0
    Error Information Log Entries:      0
    Warning  Comp. Temperature Time:    0
    Critical Comp. Temperature Time:    0
    
    Warning: NVMe Get Log truncated to 0x200 bytes, 0x200 bytes zero filled
    Error Information (NVMe Log 0x01, 16 of 256 entries)
    No Errors Logged
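
    A quick sanity check on the Data Units counters, as a sketch. In the NVMe spec one data unit is 1000 blocks of 512 bytes, so the raw counts convert to the bracketed TB figures smartctl prints (give or take rounding in the last digit).

```python
# NVMe reports Data Units Read/Written in units of 1000 x 512 bytes.
BYTES_PER_DATA_UNIT = 512 * 1000


def data_units_to_tb(units: int) -> float:
    """Convert an NVMe data unit count to decimal terabytes."""
    return units * BYTES_PER_DATA_UNIT / 1e12


# Raw counters copied from the SMART output above.
read_tb = data_units_to_tb(6_286_318)
written_tb = data_units_to_tb(3_285_119)

print(f"read:    {read_tb:.3f} TB")
print(f"written: {written_tb:.3f} TB")
```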