Nelson's log – Page 11 – A personal work journal

Anbernic default OS
I got an Anbernic RG35XX H, a remarkably inexpensive retrogaming handheld computer. Apparently it’s an Android device at its core but it ships with an OS and GUI that is entirely about launching game emulators. And it’s really a Linux system. I hope to customize or at least upgrade the software, maybe starting with this guide.

The device was shipped to me with a 64 GB MicroSD card that booted. Mostly it runs RetroArch and it had a wide variety of ROMs preloaded. Here’s what’s on it.

The card has six partitions:
```
Number  Start (sector)    End (sector)  Size       Code  Name
   1           73728        96542719   46.0 GiB    0700  Roms
   2        96542720        96608255   32.0 MiB    0700  boot-resource
   3        96608256        96641023   16.0 MiB    0700  env
   4        96641024        96772095   64.0 MiB    0700  boot
   5        96772096       113549311   8.0 GiB     0700  rootfs
   6       113549312       121063423   3.6 GiB     0700  UDISK
```
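The sizes in that listing follow directly from the sector numbers. A quick sanity check, assuming 512-byte sectors (the listing doesn't state the sector size, so that part is an assumption) and inclusive end sectors:

```python
SECTOR_BYTES = 512  # assumed logical sector size

# (name, start sector, end sector) copied from the listing above
PARTITIONS = [
    ("Roms", 73728, 96542719),
    ("boot-resource", 96542720, 96608255),
    ("env", 96608256, 96641023),
    ("boot", 96641024, 96772095),
    ("rootfs", 96772096, 113549311),
    ("UDISK", 113549312, 121063423),
]

def size_bytes(start, end):
    """Partition size in bytes; the end sector is inclusive."""
    return (end - start + 1) * SECTOR_BYTES

def human(n):
    """Format a byte count as MiB or GiB with one decimal place."""
    gib = n / 2**30
    if gib >= 1:
        return f"{gib:.1f} GiB"
    return f"{n / 2**20:.1f} MiB"

sizes = {name: human(size_bytes(s, e)) for name, s, e in PARTITIONS}
for name, text in sizes.items():
    print(f"{name:14s} {text}")
```

The computed sizes line up with the Size column, which suggests the 512-byte assumption is right for this card.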
Partition 1, Roms, is a FAT filesystem with a bunch of presumably unlicensed ROMs on it, naughty naughty. There’s 45 different systems here! There’s 700 ROMs for Atari 2600, for instance, although they stop at the letter H. There are 466 for GBA, but there the whole alphabet is represented. So a sort of demo collection but not carefully curated. Other big directories under Roms include bios, anbernic/bezels, and save. That save directory includes save games created by emulated ROMs I had run.

Partition 2, boot-resource, is a FAT filesystem. It contains a few images in BMP format for battery state, a boot logo, etc.

Partition 3, env, is a mystery. The file command just says it is “data”. The first few bytes include what looks like Linux kernel command line parameters. Probably part of the boot chain.
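One way to peek at a blob like this is to pull out runs of printable ASCII, roughly what the strings tool does. This is a minimal sketch; the sample bytes are made up for illustration and are not the real contents of the env partition.

```python
import re

def printable_strings(blob: bytes, min_len: int = 6):
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, blob)]

# Made-up sample standing in for the first bytes of the partition.
sample = b"\x12\x34\x00\x00console=ttyS0,115200\x00root=/dev/mmcblk0p5\x00\xff\xfe"
print(printable_strings(sample))
```

To try it on the real thing you would read the first few KB of the partition's device node (whatever name it gets on your machine) and pass those bytes in.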

Partition 4, the Boot, looks like an “Android bootimg”. It doesn’t have a filesystem I can mount. Maybe these tools would help.

Partition 5, the RootFS, is an ext4fs that has an ARM32 Ubuntu 18 (!) system. Presumably this is the OS for Anbernic. The directory /mnt/vendor seems to have all the interesting game related stuff in it.
- vendor/deep/retro is the biggest software package, it’s RetroArch.
- vendor/deep also contains PPSSPP (PSP), openBOR (side-scrollers), and drastic (Nintendo DS)
- vendor/bin/game has what look like a bunch of emulators in it, 32 bit ARM executables.
- vendor/ctrl looks like shell scripts for the Anbernic system, things like mounting SD cards or configuring CPU parameters.
- vendor/oem/version.ini is handy, it tells me 20240112.
Partition 6, the UDISK, is an ext4 filesystem. It’s basically empty but has stuff called dmenu and save in it.
2024/11/08
Hacking Proxmox boot process (ZFS)
I’m upgrading my main Proxmox home server from a single SSD with ZFS to two SSDs mirrored in ZFS. Some discussion here. The simple thing is just to reinstall Proxmox new but I’m going to try to do it in-place first. Either way requires careful backups.

The docs on changing a failed boot device are the in-place process I’m following. More or less. It boils down to “set up the new disk in ZFS, then run proxmox-boot-tool to make it bootable”. The blessing process is complicated. Booting with ZFS is always tricky. And Proxmox is rumored to have a special thing where it installs boot loaders on all mirror devices for redundancy. I want that with my new drive! This post is me learning about how it all works.

Summary: this worked very well and is simpler than imagined. Also Proxmox is remarkably well documented. Hurray!

How is Proxmox booting now?

A year ago Proxmox installed something that made it boot from ZFS. Works great. What’s it doing? The challenge is Proxmox has three different ways of booting and I’m not sure which mine is. These docs on using the boot tool are helpful. Some other useful docs here. I should just plug in a monitor and look at the boot process but no, I’m stubborn. The docs say proxmox-boot-tool status will show you the current boot method but the output is not illuminating.
```
System currently booted with uefi
4B96-7CDC is configured with: uefi (versions: 6.5.13-6-pve, 6.8.12-1-pve, 6.8.12-2-pve)
```
Yes, it’s UEFI, I knew that. But is that GRUB? Something else? Looking around for advice in docs and from Phind I got some other ideas to try, looking at the EFI boot options:
```
# efibootmgr -v
BootCurrent: 0002
Timeout: 1 seconds
BootOrder: 0002,0003,0001,0004,0005
Boot0001* UEFI:CD/DVD Drive BBS(129,,0x0)
Boot0002* Linux Boot Manager HD(2,GPT,55f108de-abfa-4dad-b478-641a6d640390,0x800,0x200000)/File(\EFI\SYSTEMD\SYSTEMD-BOOTX64.EFI)
Boot0003* UEFI OS HD(2,GPT,55f108de-abfa-4dad-b478-641a6d640390,0x800,0x200000)/File(\EFI\BOOT\BOOTX64.EFI)..BO
Boot0004* UEFI:Removable Device BBS(130,,0x0)
Boot0005* UEFI:Network Device BBS(131,,0x0)
```
So that says it’s “Linux Boot Manager”, which is systemd (why doesn’t it say so?). Other docs I read suggest ZFS systems boot from a whole Linux system installed in the ESP. That may involve systemd too. In any case it probably does not involve GRUB. I finally hooked up a monitor and rebooted and saw a menu but it didn’t quite look like GRUB. I did see a message go by about EFI Boot Stub.

Bottom line: I think I’m booting with Linux Boot Manager which involves systemd and a Linux kernel on the ESP partition. Not GRUB.
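The same conclusion can be pulled out of the efibootmgr output mechanically: find the entry named by BootCurrent and look at which loader file it points to. A rough sketch, using a trimmed copy of the output above as input; the regular expressions are my own guess at the format, not anything efibootmgr documents.

```python
import re

# Trimmed copy of the `efibootmgr -v` output shown above.
EFIBOOTMGR = r"""BootCurrent: 0002
BootOrder: 0002,0003,0001,0004,0005
Boot0001* UEFI:CD/DVD Drive BBS(129,,0x0)
Boot0002* Linux Boot Manager HD(2,GPT,55f108de-abfa-4dad-b478-641a6d640390,0x800,0x200000)/File(\EFI\SYSTEMD\SYSTEMD-BOOTX64.EFI)
Boot0003* UEFI OS HD(2,GPT,55f108de-abfa-4dad-b478-641a6d640390,0x800,0x200000)/File(\EFI\BOOT\BOOTX64.EFI)..BO
"""

def current_loader(text):
    """Return (label, loader path) for the entry named by BootCurrent."""
    current = re.search(r"^BootCurrent: (\w{4})", text, re.M).group(1)
    entry = re.search(
        rf"^Boot{current}\*? (.*?)\s+HD\(.*?/File\((.*?)\)", text, re.M
    )
    return entry.group(1), entry.group(2)

label, loader = current_loader(EFIBOOTMGR)
print(label, loader)
print("systemd-boot" if "SYSTEMD" in loader.upper() else "something else")
```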

It’s possible to inspect the boot stuff. It’s on partition 2, the ESP partition in a VFAT filesystem. It has stuff in it like EFI/BOOT/BOOTX64.EFI, EFI/proxmox, and loader/entries.

Identifying the disks

/dev/nvme1n1 is my existing Proxmox boot drive, a zpool with just the one disk in it. I want to add /dev/nvme0n1 to the pool and make it a mirror. But there’s many ways to name a device on Linux and you want a stable name. (Fun fact: the drives on 0 and 1 have swapped between reboots without me changing the hardware.) What’s best?

zpool status -v rpool tells me the status of my existing one disk pool.
```
        NAME                               STATE     READ WRITE CKSUM
        rpool                              ONLINE       0     0     0
          nvme-eui.002538b22140b542-part3  ONLINE       0     0     0
```
What is that EUI string? That’s another name for /dev/nvme1n1p3, the “Extended Unique Identifier”. A 64 bit ID, different from the UUIDs, serial numbers, and other identifiers I’ve used before.

I first found this same name with this query: udevadm info --query=all --name=/dev/nvme1n1. But I don’t really know what that tool is. Simpler is ls -l /dev/disk/by-id/ | grep nvme, which shows that nvme-eui.002538b22140b542-part3 is (currently) a symlink to nvme1n1p3. That I understand. So I looked for what linked to nvme0n1 and found that nvme-eui.002538bc3140272b-part3 is the name for the new SSD’s partition 3.
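The ls and grep lookup can also be written as a small function that walks /dev/disk/by-id and reports every stable name that currently resolves to a given device node. A sketch only; the function name is mine and the device path in the example is just the one from this machine.

```python
import os

def stable_names(device, by_id_dir="/dev/disk/by-id"):
    """List the by-id symlink names that currently resolve to `device`."""
    target = os.path.realpath(device)
    names = []
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if os.path.realpath(link) == target:
            names.append(name)
    return names

# Only meaningful on a Linux box that has the by-id directory.
if os.path.isdir("/dev/disk/by-id"):
    print(stable_names("/dev/nvme0n1p3"))
```

Because it resolves the symlinks at call time, the answer stays right even when nvme0 and nvme1 swap between reboots.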

Adding the ZFS mirror disk

OK, time to do surgery on my machine. I am more or less following this guide.

Step 1: clone the partition table to the new disk. sgdisk /dev/nvme1n1 -R /dev/nvme0n1. After doing this the new disk has the same layout as the old one. blkid will show the partitions. (Although confusingly blkid doesn’t show filesystem types: fdisk showed they were labelled as VFAT etc, just like the original disk.) However the UUIDs of each partition are the same too which is not good, we’ll change that soon.

Step 2: give partitions unique UUIDs. sgdisk -G /dev/nvme0n1. Partitions are the same but now they have new IDs.

Step 3: attach the new SSD to the pool. zpool attach rpool nvme-eui.002538b22140b542-part3 nvme-eui.002538bc3140272b-part3. Why attach and not add? I would have guessed zpool add was the command to use, but add expands the pool’s storage with a new vdev (the more general purpose thing, for other configurations and maybe RAID-Z), while attach mirrors an existing device. That’s what the ZFS docs suggest and it’s what Phind told me to do. The command pauses a couple of seconds and then returns. zpool status -v rpool shows what’s going on.
```
  pool: rpool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Oct 27 19:49:41 2024
        114G / 114G scanned, 2.24G / 114G issued at 2.24G/s
        2.31G resilvered, 1.97% done, 00:00:49 to go
config:

        NAME                                 STATE     READ WRITE CKSUM
        rpool                                ONLINE       0     0     0
          mirror-0                           ONLINE       0     0     0
            nvme-eui.002538b22140b542-part3  ONLINE       0     0     0
            nvme-eui.002538bc3140272b-part3  ONLINE       0     0     0  (resilvering)
```
Just a bit later it reports resilvered 116G in 00:01:02 with 0 errors on Sun Oct 27 19:50:43 2024. Nice! These SSDs are fast.
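The "to go" estimate in that status output is simple arithmetic: data not yet issued divided by the current rate. A quick check using the rounded figures from the output (so the result is only approximately what zpool printed):

```python
def eta_seconds(total_gb, issued_gb, rate_gb_per_s):
    """Seconds left if the remaining data is issued at the current rate."""
    return (total_gb - issued_gb) / rate_gb_per_s

# Figures from the status output: 2.24G of 114G issued at 2.24G/s.
remaining = eta_seconds(114, 2.24, 2.24)
percent_done = 100 * 2.24 / 114
print(f"{remaining:.0f} s to go, {percent_done:.2f}% done")
```

That lands within a second or so of the reported 00:00:49 and 1.97%; the small gap is presumably rounding in the displayed numbers.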

At this point the ZFS stuff is all done. The system reboots fine. I did a zpool scrub for good measure and it reported no errors. I’m now getting most of the benefits of ZFS mirroring for data integrity. However there’s still only one boot device: if that fails, I’ll have to recover with some other boot media.

Installing boot stuff on the new disk

Proxmox has extra stuff for making boot more durable. proxmox-boot-tool manages it. See docs here.

Step 1: I verify that the new disk doesn’t have a boot ESP partition. Device names have changed on me after I rebooted, so /dev/nvme0n1p2 is the original bootable partition. /dev/nvme1n1p2 is the new one.
```
# old=/dev/nvme0n1p2
# new=/dev/nvme1n1p2
# mount $old m
# ls m
EFI  loader
# umount m
# mount $new m
mount: /tmp/m: wrong fs type, bad option, bad superblock on /dev/nvme1n1p2, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.
```
Step 2: format the ESP on the new disk. proxmox-boot-tool format $new. After this there’s a brand new VFAT filesystem on the partition, it is empty.

Step 3: get Proxmox to install its boot process on the new disk as well as the old. proxmox-boot-tool init $new. This copies all the kernels, etc to the new disk and enrolls it so updates should be written to it. After running this both disks have nearly identical boot partitions.

Step 4 (optional): refresh. proxmox-boot-tool refresh. Just a sort of verification thing, shouldn’t be necessary, but it writes the kernels again to all enrolled disks.

Rebooting

It works! On reboot it just loads like nothing’s changed.

In the BIOS I now see more EFI boot options. Linux Boot Manager is now on both disks, other entries are also doubled. The EFI boot menu is totally rewritten and has two extra entries.
```
# efibootmgr -v
BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0000,0002,0003,0004,0001,0005,0006
Boot0000* Linux Boot Manager    HD(2,GPT,63a5e06a-3206-4a55-957c-0878a2275a8c,0x800,0x200000)/File(\EFI\SYSTEMD\SYSTEMD-BOOTX64.EFI)
Boot0001* UEFI:CD/DVD Drive     BBS(129,,0x0)
Boot0002* Linux Boot Manager    HD(2,GPT,55f108de-abfa-4dad-b478-641a6d640390,0x800,0x200000)/File(\EFI\SYSTEMD\SYSTEMD-BOOTX64.EFI)
Boot0003* UEFI OS       HD(2,GPT,55f108de-abfa-4dad-b478-641a6d640390,0x800,0x200000)/File(\EFI\BOOT\BOOTX64.EFI)..BO
Boot0004* UEFI OS       HD(2,GPT,63a5e06a-3206-4a55-957c-0878a2275a8c,0x800,0x200000)/File(\EFI\BOOT\BOOTX64.EFI)..BO
Boot0005* UEFI:Removable Device BBS(130,,0x0)
Boot0006* UEFI:Network Device   BBS(131,,0x0)
```
The real question is what happens if the boot disk fails. Will the system still boot? In theory yes, if you have ZFS mirrors and Proxmox’ magic boot setup. I don’t want to pull the M.2 drive in my machine to test it so I don’t know if it works. There’s two genuine looking boot partitions both listed in the EFI boot menus so I think odds are good.
2024/10/27
Things that cause hangs in Proxmox boot / shutdown

I love how Linux has (mostly) fast reboots, 10 seconds or less. So I’ve noticed some slow reboots, a minute or more, that are annoying. What’s happening?

One challenge: when you tell Proxmox to reboot it shuts down all the Proxmox PVE stuff like the Web GUI. That happens immediately. It also asks all the guests to also shut down (maybe at the same time) but that takes longer. Only now the UI is gone so you can’t see what’s happening. I only got some insight by staring at a console plugged in and then reviewing logs.

Another awkward thing: the “shut down all guests” task has no timeout from Proxmox. If one of your guests refuses to shut down I think Proxmox doesn’t reboot, just stays in a mostly-dead state. (Edit: maybe not, the docs mention a 180 second default timeout.) That makes sense in a high availability environment but is not great for my home single server. I don’t think this has ever bitten me but I have everything plugged in to an Internet power switch for just this eventuality.

Slow VM shutdown for a Linux guest

One thing I noticed is shutdowns sometimes take a minute+. The culprit is my big work VM, which takes forever to shut down. Why? It NFS mounts a disk from an NFS server that’s running on the Proxmox host itself. So if the NFS server shuts down first, the client has a hard time unmounting it. (A weakness of NFS.) The unmount does eventually time out after 90 seconds and the shutdown proceeds, but it’s a problem.

The root cause here is me running an NFS server on the PVE host: that’s not a great idea. But there’s a similar possible problem if the NFS server is in one guest, the client in another. Proxmox doesn’t do any sort of dependency inference itself although you can manually configure it. At least you can make a guest dependent on another guest, you can’t make it dependent on a rogue NFS server you installed on PVE itself.

I don’t have a fix for this now. I’m working on moving the NFS server to a guest, will revisit then.

Slow boot of Proxmox

Another problem I had started after I added a USB disk as storage, then unplugged it. On boot Proxmox waits to mount that disk again. It times out after 90 seconds. Fair enough but awkward.

Doubly awkward is it kept happening after I had removed the USB disk from the list of storage options. The disk is no longer referenced, why are we waiting? Turns out the mount job was left enabled. That may be a Proxmox bug? I manually disabled it with systemctl disable mnt-pve-extbackup.mount.
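The unit name isn't arbitrary: systemd derives mount unit names from the mount path, so mnt-pve-extbackup.mount corresponds to a mount at /mnt/pve/extbackup (my assumption about where Proxmox had mounted the disk). Here is a simplified sketch of the escaping rule that `systemd-escape --path --suffix=mount` applies; it skips corner cases like repeated slashes.

```python
def mount_unit_name(path):
    """Simplified version of `systemd-escape --path --suffix=mount`."""
    trimmed = path.strip("/")
    if not trimmed:
        return "-.mount"  # the root filesystem is a special case
    out = []
    for i, ch in enumerate(trimmed):
        if ch == "/":
            out.append("-")
        elif ch.isascii() and (ch.isalnum() or ch in ":_" or (ch == "." and i > 0)):
            out.append(ch)
        else:
            # everything else, including a literal "-", is hex-escaped
            out.extend(f"\\x{b:02x}" for b in ch.encode("utf-8"))
    return "".join(out) + ".mount"

print(mount_unit_name("/mnt/pve/extbackup"))
```

Going the other way is handy when a mystery unit shows up in the boot log: swap the dashes back to slashes to see which path it is waiting on.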

2024/10/27
IPv6 for first time (Ubiquiti USG, Starlink)
I finally decided to enable IPv6 on my Starlink network. And it seems to work? Ran into a lot of confusion with Ubiquiti Unifi OS, maybe also my Windows client. Still tinkering but some notes. Highlights of things I learned:
- Prefix Delegation Size: 56
- Prefix Delegation ID: 1
- Try an Android phone first as a test client
- Reboot Windows before it will work with IPv6
The big question is: what’s going to break now that I’ve enabled this? I think all my devices are now magically IPv6 and I just can’t wait to find out what changes. I enabled this because Starlink’s CGNAT means I often get blamed as if I’m a bot, maybe the IPv6 sites will work better. Already anticipating Youtube TV or Netflix or something pitching a fit though.

Starlink ISP

Starlink operates DHCPv6 and hands out /56 prefixes. These are rumored to be dynamic and changing at least occasionally: Starlink doesn’t promise fixed IPs for residential users. Given they move me between Seattle and Los Angeles POPs with some regularity I’m not entirely surprised they aren’t maintaining a dynamic routing infrastructure for all users. (There’s a business level service with static IPs.)

Starlink didn’t initially offer IPv6. And for awhile they did but had some problems with their DHCP implementation. Internet searches often find old out of date info. Starlink does now document they offer IPv6. Also more info in this FAQ.

Unifi OS router

I had a hell of a time making sense of what my Ubiquiti Security Gateway (USG-3P) was doing. Some notes.

USG Internet configuration

The first part is getting the USG router to talk to the IPv6 Internet. The main configuration for this is in Settings / Internet / Primary (WAN1). You have to set “Advanced” to “Manual” and then enable “DHCPv6”. Also you have to type in a “Prefix Delegation Size”. 56 works for Starlink and most ISPs, that means you get a /56 address allocation from DHCP. That’s a very large address space.

There’s no indication this configuration works when you do it. The Unifi web UI still shows “IPv6 Address” as blank. If you log in to the USG command line and run ip a or ip -6 a, nothing will show up, it looks like there is no IPv6 address assigned to eth0 (the WAN). Well not entirely: you’ll find an address like fe80::e263:daff:fec4:19ff/64 which is the link-local address, but not a routable Internet address.
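That fe80 address is less mysterious than it looks. Link-local addresses of this shape (with ff:fe in the middle of the last 64 bits) are typically built from the interface's MAC address using the modified EUI-64 scheme. A small sketch that checks the address is link-local and reverses the encoding; the helper name is mine.

```python
import ipaddress

def mac_from_eui64(addr):
    """Recover a MAC from a modified-EUI-64 interface ID, or None."""
    iid = ipaddress.IPv6Address(addr).packed[8:]  # low 64 bits
    if iid[3:5] != b"\xff\xfe":
        return None  # not EUI-64 derived (e.g. a privacy address)
    # Flip the universal/local bit back and drop the ff:fe filler.
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:]
    return ":".join(f"{b:02x}" for b in mac)

wan_link_local = "fe80::e263:daff:fec4:19ff"
is_link_local = ipaddress.IPv6Address(wan_link_local).is_link_local
mac = mac_from_eui64(wan_link_local)
print(is_link_local, mac)
```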

The way I realized it was working was running grep dhcp6c /var/log/messages on the USG shell. That shows me a message about IA_PD prefix which contains the actual IPv6 lease assignment from Starlink. Yay! Starlink’s block seems to be 2605:59C0::/28, customers are getting a /56 block from there.
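Python's ipaddress module is a handy way to sanity check these prefixes. This sketch confirms that addresses under 2605:59c8 (where my Windows client's addresses turned up) fall inside that /28, and counts how many /56 customer blocks a /28 can hold. Whether Starlink actually carves up the block this way is a guess on my part.

```python
import ipaddress

starlink_block = ipaddress.ip_network("2605:59c0::/28")
client_prefix = ipaddress.ip_network("2605:59c8::/32")

# Is the prefix my clients use inside Starlink's block?
inside = client_prefix.subnet_of(starlink_block)

# Each extra prefix bit doubles the number of blocks: 2^(56-28).
blocks_per_28 = 2 ** (56 - starlink_block.prefixlen)

print(inside, blocks_per_28)
```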

The other way I realized it was working was running ping6 2001:4860:4860::8888 in the USG shell to ping Google DNS via IPv6. Actual IPv6 packets! There’s a traceroute6 that is fun to try too.

The confusing thing is my router still has no IPv6 address assigned to the WAN. Apparently this is expected. ip route is no help for how things are working either. But ip -6 route does show a route for default via fe80::200:5eff:fe00:101. I presume that fe80 address identifies the Starlink dish or the ethernet port it’s plugged in to. I can’t ping6 it, maybe it doesn’t make sense to ping a link local address.

Anyway, the router is on the IPv6 Internet. Now what about the rest of my clients on my LAN?

USG LAN configuration

Once the router is talking to the IPv6 Internet, the other part is convincing the DHCP server the router is running to give out IPv6 addresses to clients. This setting is in Settings / Networks / Default. There’s an IPv6 button. You set the radio button to “Prefix Delegation” to enable it. You’re also forced to set a “Prefix Delegation ID”, a decimal number. I still have no idea what this really means but the number “1” seems to work. Update: this explanation seems right. I’m given a /56 block from the ISP but internally I’m using a /64 subset. The extra 8 bits are the prefix delegation ID: I chose 01. That probably explains why my IPv6 address block given by Starlink has a :de00 in it but my machines have :de01: in the address, the 01 I picked was ORed in.
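The ORing can be shown concretely. A /56 leaves 8 bits before the /64 boundary, and the Prefix Delegation ID fills them in. A minimal sketch using a prefix from the 2001:db8::/32 documentation range as a stand-in (it is not a real Starlink prefix; I only kept the :de00 shape to match the description above):

```python
import ipaddress

def delegated_subnet(prefix, pd_id, new_prefix=64):
    """Pick the pd_id-th /64 out of a delegated prefix."""
    net = ipaddress.ip_network(prefix)
    extra_bits = new_prefix - net.prefixlen  # 8 for a /56
    if not 0 <= pd_id < 2**extra_bits:
        raise ValueError("prefix delegation ID out of range")
    # OR the ID into the bits just above the /64 boundary.
    base = int(net.network_address) | (pd_id << (128 - new_prefix))
    return ipaddress.IPv6Network((base, new_prefix))

lan = delegated_subnet("2001:db8:0:de00::/56", 1)
print(lan)
```

With 8 bits to play with, IDs 0 through 255 are valid, so a /56 is enough for 256 separate LANs.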

Note there’s an “Advanced” box. I left that to Auto which means it does SLAAC address assignment which is the simple thing you probably want.

After this is configured there is no indication in the Unifi web UI that it worked. But clients doing DHCP will now get v6 addresses if they ask for them. See below for testing clients.

The part of IPv6 I don’t understand is exactly how exposed my network is. AFAICT every device on my LAN is now potentially globally routable with a unique IPv6 address. So what’s keeping me safe? I think the answer is the default IPv6 firewall rules that Ubiquiti has but I really ought to learn more about that. (Update: yes, default firewall rules are “allow inbound traffic associated with existing connections that were made outbound” followed by “drop everything else”. That’s similar to the implicit firewall of IPv4 NAT routing. This port scanner plausibly confirms that my LAN machines have all their inbound ports filtered.)

Client: Android

My Pixel phone seems to work with IPv6 with no effort at all. https://test-ipv6.com/ shows 10/10 everything working. Various “what is my IP” services all show an IPv6 address. All just happened automatically, no reboot or reconfiguration required. Sadly this was the last client I tried but I should have tried it first.

Client: Alpine Linux

I fired up a quick Alpine VM to manually test IPv6. The default installer doesn’t enable IPv6, you have to add it with iface eth0 inet6 auto to the networking config. But doing that everything else works flawlessly. I can ping Google DNS via IPv6. curl https://api6.ipify.org shows me my IPv6 address. Nice!

Client: Windows

So those two clients were easy. Unfortunately the very first client I tried was my Windows 11 desktop. Once I enabled IPv6 the networking interface picked up an IPv6 address via DHCP. Two of them actually, both in 2605:59c8. One of them actually works! From a DOS command line I can ping my router’s IPv6 address. And from the router I can ping6 the one working Windows address.

However nothing else worked. I couldn’t ping Google DNS via IPv6. The browser test at https://test-ipv6.com/ didn’t work, shows nothing at all. Neither did the test at https://whatismyv6.com/.

I finally gave up and rebooted. And presto, now IPv6 is working fully on my Windows machine.

Client: Ubuntu 23.04 VM

An existing workhorse Linux box I had just started working fine on IPv6, no reconfiguration required. May have not started working until I happened to reboot it.

Client: Proxmox Virtual Environment

My Proxmox server does not seem to have IPv6 enabled by default. The docs indicate IPv6 is supported so it’s probably something with how I configured the server with a static IPv4 address at setup. Guest VMs can access IPv6 no problem (see above), not sure about containers. The default bridge (vmbr0) has some IPv6 stuff I didn’t configure so it probably won’t work. Actually a little confused, since that bridge is also used for the VM networking.
2024/10/15
Sharing files from Chrome OS to Android tablet
Challenge today: share two 3GB video files from my Chromebook to my Samsung tablet. So many ways, none of them reliable.
1. USB try 1. Plug in tablet. Popup says “access via Linux? Android?” I chose Linux, probably a mistake. Could never find the device in Linux. It’s there via lsusb but no appearance of file access. Maybe have to mount via command line?
2. USB try 2. Plug in tablet. Nothing happens. Try several times.
3. USB try 3. Reboot Chromebook. Plug in tablet. Device shows up in Files immediately! Drag file to copy, says “preparing” for one minute+, then shows an error
4. USB try 4. Unplug, replug. Try copying again. Get instant error.
1. WiFi try 1. Launch Solid Explorer on tablet, run its FTP server. Install ftp command line client in Linux. Vaguely remember binary and mput commands like it’s 1994. It seems to work but will take 15 minutes. Stalls when one device goes to sleep. No clear way to resume.
1. Bluetooth try 1. Try Quick Share for first file. This kinda just works?! It’s slow, 10+ minutes, but it copies the file and seems to be robust even if the device goes to sleep.
2. Bluetooth try 2. Try Quick Share for second file. Gives error. No diagnosis available, just “try again”.
3. Bluetooth try 3. Reboot both devices. Try Quick Share. Seems to work.
All that is just to get the file over to Android somehow, somewhere. The other challenge is working around how Android has slowly eroded the idea of a filesystem with general purpose access. Solid Explorer works as a file browser through some hack / special permission that lets it browse things. VLC can access things in Downloads reasonably reliably. Fortunately Quick Share put things in a folder named Downloads / Quick Share, so that wasn’t too bad.

I tried using Google to figure out how to do this and found a bunch of outdated advice that didn’t work. Also tried various AIs that suggested various hilarious non-working options, including things like running Android Debugger.

Why is this so bad? Does no one want to copy their photos from their phone to their Chromebook? It’s really dumb the USB doesn’t work. For some reason tablets and phones don’t show up as a normal USB storage device but instead use some weird mobile device system that’s awkward AF.
2024/10/07
More on Doze Stopper
I learned some more about Doze Stopper, the app that seems to be fixing my delayed notification problems on my Pixel phone (Android 14). (Previous discussion.)

First, some observations from using it:
- Notifications are no longer delayed. Problem fixed?!
- One drawback: the app prevents scheduled Do Not Disturb mode
- It doesn’t seem to disrupt battery life much if any
Second, some notes on how it works:
- The app is using alarms to interrupt Doze mode.
- Alarms during doze are delayed by a few minutes. This delay seems configurable in the app, I have it set to 5 minutes, the current minimum. So maybe my notifications are delayed by a few minutes still? I haven’t observed that.
- The app also receives push notifications, I think to enable the alarm.
It’s a pretty minimal app but it seems very good to me. I’m excited I found it.

The lack of battery impact makes me wonder why Google is fooling around with delayed notifications at all. I only get a few notifications a day because I carefully disable everything that’s spammy. Maybe it’d be worse for a normal phone?

Or maybe dozing notifications is just a bad idea that Google could remove from Android. When it was introduced years ago Android had all sorts of background activity consuming battery. Over the years they’ve cracked down on background processing and on network connectivity and maybe the platform is in a place where they could re-enable instant notifications.

One other mystery: Firebase has an option for high priority notifications that should get through even when Doze is active. So why aren’t messages from important apps like Gmail or Signal getting through? Maybe they’re just not setting the important bit? That’d be silly if so.
2024/09/21
Home security cameras: Eufy + Frigate
For years I’ve wanted some lightweight private home security cameras. I’m not too worried about security but it’s reassuring to be able to see video of the house when you’re not home.

I’ve come to really like Eufy cameras. The basic $30 indoor WiFi camera (plus a cheap microSD card) works great with their mobile app. Out of the box you get something that detects motion (including people vs animals) and stores video of events and alerts you on your phone. Also a quick live view browser. Firewall traversal works fine.

There’s no fee for all this service, the caveat being that you are live streaming video clips off of the camera’s storage in your house. There’s a cloud video storage option too if you’re worried about a clever thief disabling your camera (I am not). But what I like about this setup is Eufy isn’t storing video remotely at all. They’ve had a couple of privacy kerfuffles like not disclosing adequately that a still image is briefly uploaded to a cloud server if you have it taking snapshots and sending them to your phone. But they seem like well-meaning mistakes, not total failures (Wyze) or a not-private design by default (most consumer products).

Honestly the Eufy product is good enough on its own that may be all I need. The last piece I have to set up is geofencing so it doesn’t operate when we’re home. I think it supports everything I need (using your phone to detect when you’re home) but I haven’t used it yet.

Frigate

Still, what could one do with more software? Thanks to Aneel on Mastodon I learned about Frigate, an open source video recorder. It works with live continuous cameras and does fancy motion detection, etc and gives you a nice web display of live video and events. It seems quite sophisticated too with advanced, customizable detection algorithms, video area masking, etc. Also cool: it can use Intel QuickSync or a Coral TPU (!) for hardware acceleration of tasks.

I gave it a 15 minute try using a tteck script to install and run it on Proxmox. Got it up and running pretty quick although the hand-edited YAML file feels like a bit of a throwback. You enable an always-on RTSP feed from the Eufy camera and then configure Frigate to connect to it. The tteck scripts nicely set up Intel CPU acceleration for me already although I did have to do some extra bits to enable it and to enable recording.

It seems to work well. It’s a very nice product, the UI is impressive. Monitoring one camera the container is using 20% of a i7-12600k CPU and some tiny fraction of GPU. About 1GB of RAM. I imagine that multiplies with more cameras but this can run on a pretty low powered Intel machine.

I don’t know if I’m going to keep using Frigate or not. I still think the Eufy product does most of what I want. And I don’t love the idea of all this video using up WiFi bandwidth and CPU when nothing is happening. Makes me wonder if there’s some other open source software that can use the intermittent RTSP feed Eufy also offers only for motion events. But if I wanted to customize the object detection or do more fancy video analysis, Frigate definitely seems like a nice tool.
```
mqtt:
  enabled: false
cameras:
  sf4:
    ffmpeg:
      hwaccel_args: preset-vaapi
      inputs:
        - path: rtsp://username:password@192.168.0.252:554/live0
          roles:
            - detect
            - rtmp
    detect:
      width: 1920
      height: 1080
      fps: 5
    record:
      enabled: True

detectors:
  ov:
    type: openvino
    device: CPU
    model:
      path: /openvino-model/FP16/ssdlite_mobilenet_v2.xml
model:
  width: 300
  height: 300
  input_tensor: nhwc
  input_pixel_format: bgr
  labelmap_path: /openvino-model/coco_91cl_bkgr.txt
version: 0.14
```
2024/09/21
Starlink adds an NTP server
This is very cool: Starlink’s dish now has an NTP server serving GPS time. That should be highly accurate. If they implemented it right, it’s going to be your best source of time short of specialty hardware you installed yourself. Certainly better than anything you’ll get from the Internet. See discussion here.

The server advertises itself as stratum 1. That’s a bold promise of high accuracy but is appropriate for a GPS receiver. It also means most NTP clients will prefer it as the time source to anything from the Internet. Which seems right to me.

(FWIW, I’ve synced hosts with chrony for years now over Starlink using the default Ubuntu pool config. Despite the high jitter Chrony does a good job synchronizing, it estimates it’s got accurate time to within 2ms despite 50ms of root delay. )

I added the server a few minutes ago and my machine is already syncing to it. Will update new stats after it has run awhile.

Update 1

3 hours later and it’s definitely running nicely. Here’s a graph of root mean square offset, a measure of clock accuracy. Over the last month, setting clocks over Starlink Internet with the Ubuntu pool got me an RMS of about 275µs … Now it’s more like 100 µs since using Starlink. Here’s a graph of the last 24 hours, the line is where I switched to Starlink’s clock.

Honestly I’m surprised at how well chrony synced time over Starlink Internet. The servers are all 40-80ms away varying every few minutes, a huge amount of jitter. I think chrony has special algorithms to estimate that kind of thing.

Also impressed with how Starlink has improved. A year ago, before the latency improvements, my RMS averaged 427µs. Here’s a graph of the RMS for the last two years, mostly on Starlink ISP.

Honestly all this stuff is astonishingly good timekeeping. 1ms accuracy with consumer parts and slightly wonky Internet connections. It’s nice now to have access to GPS time though!

Update 2

Running for about 24 hours now and doing well. One issue with Dishy as a time server is it often reboots around 3am. Mine rebooted last night and while Chrony coped it definitely threw off time accuracy. The offset went from under 100µs to fluctuating between -1ms and 2ms. The RMS had converged as low as 30µs (very impressive) but reset to 800µs after the 3am reboot. Oddly I don’t see any evidence Chrony gave up on syncing to the clock entirely, but the root delay to it stopped being a steady 2ms and spiked up to 4ms for a couple of hours after the reboot. All of this is well within the bounds of an NTP client to cope but is an example of how it’s not quite as good as having a fully dedicated GPS time receiver. OTOH it’s free!

Some stats after a few hours
```
chronyc> tracking
Reference ID    : C0A86401 (192.168.100.1)
Stratum         : 2
Ref time (UTC)  : Tue Sep 17 20:16:58 2024
System time     : 0.000011511 seconds slow of NTP time
Last offset     : +0.000012743 seconds
RMS offset      : 0.000090009 seconds
Frequency       : 30.078 ppm slow
Residual freq   : +0.000 ppm
Skew            : 0.121 ppm
Root delay      : 0.001979256 seconds
Root dispersion : 0.002787174 seconds
Update interval : 512.1 seconds
Leap status     : Normal
chronyc> sourcestats
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
alphyn.canonical.com       48  28   13h     -0.020      0.111  +2013us  3097us
prod-ntp-3.ntp1.ps5.cano>  55  28   15h     -0.080      0.080    +66us  2833us
prod-ntp-4.ntp1.ps5.cano>  18   8  293m     +0.090      0.651  +2919us  3553us
prod-ntp-5.ntp4.ps5.cano>  64  30   18h     -0.053      0.069   +946us  2962us
ntp3.radio-sunshine.org    64  35   18h     -0.082      0.054  -2298us  2382us
rn5.quickhost.hk           18   8  310m     +0.075      0.326  +2550us  1875us
LAX.CALTICK.NET            34  22  569m     +0.051      0.100   +934us  1678us
time.cloudflare.com        24  17  413m     +0.103      0.153  +1269us  1407us
192.168.100.1              15  10  103m     +0.000      0.118   +173ns   167us
```
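If I wanted to log these numbers over time instead of eyeballing them, something like this would do. It's a minimal sketch that only parses the `tracking` text pasted above; the function names (`parse_tracking`, `seconds_to_us`, `reference_ip`) are made up for illustration and it assumes the output format stays as shown.

```python
import ipaddress

# Sample text copied from the `chronyc tracking` output above.
SAMPLE = """\
Reference ID    : C0A86401 (192.168.100.1)
Stratum         : 2
Ref time (UTC)  : Tue Sep 17 20:16:58 2024
System time     : 0.000011511 seconds slow of NTP time
Last offset     : +0.000012743 seconds
RMS offset      : 0.000090009 seconds
Frequency       : 30.078 ppm slow
Residual freq   : +0.000 ppm
Skew            : 0.121 ppm
Root delay      : 0.001979256 seconds
Root dispersion : 0.002787174 seconds
Update interval : 512.1 seconds
Leap status     : Normal
"""


def parse_tracking(text):
    """Return a dict of field name -> raw string value."""
    fields = {}
    for line in text.splitlines():
        # Split on the first " : " only; the "Ref time" value has colons in it.
        name, sep, value = line.partition(" : ")
        if sep:
            fields[name.strip()] = value.strip()
    return fields


def seconds_to_us(value):
    """Convert a value like '0.000090009 seconds' to microseconds."""
    return float(value.split()[0]) * 1_000_000


def reference_ip(fields):
    """Decode the hex Reference ID, which for an IPv4 source is its address."""
    ref_hex = fields["Reference ID"].split()[0]
    return str(ipaddress.IPv4Address(int(ref_hex, 16)))


fields = parse_tracking(SAMPLE)
print(f"RMS offset: {seconds_to_us(fields['RMS offset']):.1f} us")
print(f"Root delay: {seconds_to_us(fields['Root delay']):.1f} us")
print(f"Syncing to: {reference_ip(fields)}")
```

In real use the text would come from running `chronyc tracking` rather than a pasted string.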
2024/09/17
Vault: towards my own NAS
Working on rationalizing my file storage now that I’m in a Proxmox universe. The idea here is to put all my important stuff on a NAS VM running the following services:
- NFSv4
- Samba
- Syncthing
- Restic with BackRest
I’ve got a basic cut at it working (other than Samba) and it seems good. Some notes on what and why.
Contents

What am I storing on the NAS? Very little right now, but the plan is:
- 14GB of source code. Most of this never changes and it’s mostly small files although some 1GB data blobs sneak in occasionally.
- 7GB of Windows Documents. Random stuff here but it’s helpful to sync and backup.
- < 1GB of files on mobile devices. I use Syncthing to get stuff off my Android devices.
- 4TB of Media archives. NFS and SMB mounted locally. No Syncthing, no backups. Right now I am manually rsyncing them occasionally to the second house. I may migrate this to backups (local only, Backblaze is expensive) and Syncthing.
Access over the Internet

The challenge here is I want my files accessible over the Internet, because I have two houses. My house with Starlink is 50ms away from the house with the server (and Sonic fiber). NFS and Samba are only good for LAN access. Those network file systems sorta work over the Internet but it’s slow and awful.

So instead I’m using Syncthing to maintain a working copy of files on my local machine. That makes my setup a lot like what Dropbox or other cloud drives are doing. I’ve done it for years with important stuff including all my source code and it’s worked fine. A key thing here: I am almost never modifying files in two places at once. Syncthing has a conflict resolution system for that but it’s a little primitive.

I’m using Tailscale for now to access the Syncthing server on the vault. That shouldn’t strictly be necessary.

Storage medium

Where are my files stored? On virtual disks that Proxmox provides. Those virtual disks are allocated on a ZFS pool that Proxmox maintains. That way the NAS VM doesn’t need to use ZFS directly, instead it just puts an ext4 filesystem on the virtual disks and it’s simple.

It’s a little weird that ZFS doesn’t see the individual files and the NAS is unaware of the ZFS pool status. But it turns out almost all the advantages of ZFS operate at the block level. Proxmox creates the virtual disk as a ZFS volume (zvol), a dataset exposed as a block device, and ZFS will happily snapshot and compress and checksum and self-heal those blocks for me. There are a few filesystem-level ZFS features (like ACLs) but I don’t really care about them, so not having them in the NAS is no big loss.

The other option would be to pass through raw disk devices to the VM and let it run a ZFS pool itself. That’s more like what a TrueNAS setup does.

Still on the fence about what physical media I’ll use. So far I’m doing it on a single SSD in the Proxmox server. I think a pair of spinning disks with mirroring makes sense for the media archive but might be a little slow for the NFS clients accessing the source code.

I’m using Restic (with Backrest) for two sets of backups. One backs up everything to local disk, an extra disk in my Proxmox server (see this pattern). The other leaves out the media archives and goes to Backblaze.

Product design

So far I’m at the “hack config files” stage of building this out. And that’s fine. Backrest and Syncthing already offer Web GUIs. I’d like to streamline the NFS and Samba configs so I don’t have to keep re-referencing the same disk volume over and over in several config files but eh, whatever. I could imagine making this into an actual product other people could use but it’d take a lot of configuration and testing. The VM might also make a good NixOS project.

Why not Cockpit for a GUI? It might actually be a good idea, I’ve never tried it.

Why not TrueNAS or OMV? Partly because I don’t like either very much. Also because neither supports Restic or Syncthing. You can install them in the VM but it’s kind of a kludge and really, I’m OK just managing NFS and Samba myself. TrueNAS can also manage a ZFS array which would be a different storage solution than what I’m doing.

As always with hacker projects there are a lot of niceties of monitoring and alerting I should be doing but haven’t set up yet. At least Proxmox ZFS will tell me if something catastrophic happens to the data.

Wrinkles

One problem with any NAS is managing file ownership. The POSIX file model is very clear for ownership and permissions. But SMB kind of screws that up. And so does Syncthing: it really wants to run as a single non-root user. (Although it has some support for multiple owners). For now I’m going to stick with POSIX files but they’re all just owned by me. That’s a good match for my data. OTOH I am allowing NFS clients to be root on the server, so maybe I can do more when Syncthing isn’t involved.

NFS still sucks for clients when the server fails. I’m mounting with noauto,x-systemd.automount,x-systemd.idle-timeout=60,soft,timeo=15,retrans=2 which first of all, lol, why aren’t the defaults adequate? And x-systemd really? But it seems to work; if the server is down the client will throw an error in under a minute.
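For my own reference, here's what each of those options does, written as a little Python sketch that assembles the client's fstab line. The option meanings are my reading of the nfs(5), mount(8) and systemd.mount(5) man pages. The server name `vault`, the export `/export/code` and the mount point `/mnt/code` are placeholders, not my real config.

```python
# Each mount option from the line above, with a short note on what it does.
OPTIONS = [
    ("noauto", "do not mount at boot or on `mount -a`"),
    ("x-systemd.automount", "systemd mounts it on first access instead"),
    ("x-systemd.idle-timeout=60", "systemd unmounts it after 60 s idle"),
    ("soft", "fail the request with an error instead of retrying forever"),
    ("timeo=15", "wait 1.5 s for a reply (the unit is tenths of a second)"),
    ("retrans=2", "retry a request twice before giving up"),
]


def fstab_line(server_export, mount_point, options=OPTIONS):
    """Build one /etc/fstab line for an NFSv4 client mount."""
    opts = ",".join(name for name, _ in options)
    # Fields: device, mount point, type, options, dump, fsck pass
    return f"{server_export} {mount_point} nfs4 {opts} 0 0"


# Hypothetical server and paths, for illustration only.
print(fstab_line("vault:/export/code", "/mnt/code"))
for name, meaning in OPTIONS:
    print(f"  {name}: {meaning}")
```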

Setup details

The Vault VM is set up to use two vCPUs and 2GB of RAM. There’s a 4GB disk for the system volume (including 0.5GB of swap) and then as many user data volumes as I want.

I’m running Debian 12 on the VM with just the basics installed: nfs-kernel-server, samba, syncthing, backrest. The system seems to be using an actual 400MB of RAM, very little. I think the most RAM-hungry thing is Restic, it likes to do a lot of housekeeping when de-duping backups etc. It might make sense to give more RAM to the NAS just for it to cache files it’s serving.

Right now I’m mounting the datasets on virtual disks into /mnt/UUID, big ugly names. Then I have a second fstab entry to bind mount from /mnt/UUID to /export/friendlyname. Trying to reference the datasets everywhere else with the /export name.
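Roughly, each data volume gets a pair of fstab entries like the ones this sketch prints. It assumes the `/mnt/UUID` directory is named after the filesystem UUID and that the volume is ext4 mounted with `defaults`. The UUID and the name `code` below are invented, and `server_fstab_entries` is just an illustrative helper.

```python
def server_fstab_entries(uuid, friendly_name, fstype="ext4"):
    """Return the two /etc/fstab lines for one data volume.

    The first line mounts the virtual disk at /mnt/<uuid>; the second bind
    mounts that directory at /export/<friendly_name>. The bind line has to
    come after the real mount so the source directory is populated.
    """
    raw = f"/mnt/{uuid}"
    exported = f"/export/{friendly_name}"
    return [
        f"UUID={uuid} {raw} {fstype} defaults 0 2",
        f"{raw} {exported} none bind 0 0",
    ]


# Made-up UUID and name, for illustration only.
for entry in server_fstab_entries("0a1b2c3d-1111-2222-3333-444455556666", "code"):
    print(entry)
```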

Some specifics:
- /etc/network/interfaces: modified to add 192.168.240.* subnet for sharing within Proxmox
- /etc/fstab: edited to mount disks into /mnt and bind mount into /export
- /etc/exports: individually names each /export filesystem so NFS will send them
- /etc/default/nfs-kernel-server was modified to be NFSv4 only. Debian has some docs on this but they seem to be badly out of date.
- Backrest config is mostly in ~root/.config/backrest, also the systemd unit file. The GUI manages that.
- Syncthing runs as user nelson via systemd user (with loginctl linger enabled so it runs without me being logged in). The GUI manages configs, they are stored in /home/nelson/.local/state/syncthing
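The /etc/exports item in the list above is the one with the most repetition, so here's a sketch of generating it. This is not a copy of my file: the export names are placeholders, 192.168.240.0/24 is my guess at how the 192.168.240.* subnet would be written, and the option list is an assumption (`no_root_squash` is what lets NFS clients be root on the server).

```python
def exports_lines(names, clients="192.168.240.0/24",
                  options=("rw", "sync", "no_subtree_check", "no_root_squash")):
    """Return one /etc/exports line per /export/<name> filesystem."""
    opts = ",".join(options)
    # exports(5) syntax: <path> <client>(<comma-separated options>)
    return [f"/export/{name} {clients}({opts})" for name in names]


# Placeholder export names, for illustration only.
for line in exports_lines(["code", "documents", "media"]):
    print(line)
```

After editing the real file, `exportfs -ra` makes the NFS server re-read it.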
2024/09/16
Doze Stopper, a fix for Android notifications?

I’m testing a new fix for delayed notifications in Android and so far (24 hours) it’s working well. See also this Reddit post I made.

The fix is a free Google Play store app called Doze Stopper. There’s little explanation of what the app does, but judging by the logs it sets an alarm for itself every 5 minutes, then does something to wake up from Doze mode. The timestamps on the logs don’t quite make sense though.

I’ve deliberately left my phone idle as much as I can for about a day now, then tested it every few hours with a notification from Gmail or Gotify. The notification always appeared within 30 seconds or so. Also the Messages webapp connects immediately now. In the past those tests would often fail: notifications wouldn’t appear, Messages wouldn’t connect. But this is hard to test reliably!

Didn’t notice any significant downside in battery life. Left the phone off the charger overnight for 10 hours and it went from 100% to 74%, about typical from before running this app.

Update Sep 15: left it running for 24 hours with no charging. Phone went from 100% to 44% which is roughly the same as it would without Doze Stopper. Can’t measure precisely but it’s definitely not killing the battery. Also notifications keep working, Gotify notification shows up immediately after an hour+ idle. Some Gmail notifications may not be coming through immediately, not sure what’s up with that.

Update Sep 17: still seems to be helping a lot. It does seem to disrupt Do Not Disturb mode though. I have DND scheduled every night but in the morning I find it’s not enabled on my phone when it should be. I did a test now and DND was disabled sometime between 8 and 16 minutes after a schedule turned it on. I wasn’t watching closely, but there were no log events in Doze Stopper the first 10 minutes: my guess is the alarm or disabling doze is also turning off DND. There is a DND setting for “Alarm can override end time” but I want that enabled for my own alarms!

Note, the developer has a second app called Doze Buster that does the same thing but apparently for older versions of Android. It’s a little confusing, none of this is well documented.

2024/09/14