AI Shit, go away; iocaine to the rescue

As a lot of people do, I have some content that is reachable using webbrowsers. There is the password manager Vaultwarden, an instance of Immich, ForgeJo for some personal git repos, my blog and some other random pages here and there.

All of this never had been a problem, running a webserver is a relatively simple task, no matter if you use apache2 , nginx or any of the other possibilities. And the things mentioned above bring their own daemon to serve the users.

AI crap

And then some idiot somewhere had the idea to ignore every law, every copyright and every normal behaviour and run some shit AI bot. And more idiots followed. And now we have more AI bots than humans generating traffic.

And those AI shit crawlers do not respect any limits. robots.txt, slow servers, anything to keep your meager little site up and alive? Them idiots throw more resources onto them to steal content. No sense at all.

iocaine to the rescue

So them AI bros want to ignore everything and just fetch the whole internet? Without any consideration if thats even wanted? Or legal? There are people who dislike this. I am one of them, but there are some who got annoyed enough to develop tools to fight the AI craziness. One of those tools is iocaine - it says about itself that it is The deadliest poison known to AI.

Feed AI bots sh*t

So you want content? You do not accept any Go away? Then here is content. It is crap, but appearently you don’t care. So have fun.

What iocaine does is (cite from their webpage) “not made for making the Crawlers go away. It is an aggressive defense mechanism that tries its best to take the blunt of the assault, serve them garbage, and keep them off of upstream resources”.

That is, instead of the expensive webapp using a lot of resources that are basically wasted for nothing, iocaine generates a small static page (with some links back to itself, so the crawler shit stays happy). Which takes a hell of a lot less resource than any fullblown app.

iocaine setup

The website has a https://iocaine.madhouse-project.org/documentation/, it is not hard to setup. Still, I had to adjust some things for my setup, as I use [Caddy Docker Proxy}([https://github.com/lucaslorentz/caddy-docker-proxy) nowadays and wanted to keep the config within the docker setup, that is, within the labels.

Caddy container

So my container setup for the caddy itself contains the following extra lines:

    labels:
      caddy_0.email: email@example.com
      caddy_1: (iocaine)
      caddy_1.0_@read: method GET HEAD
      caddy_1.1_reverse_proxy: "@read iocaine:42069"
      "caddy_1.1_reverse_proxy.@fallback": "status 421"
      caddy_1.1_reverse_proxy.handle_response: "@fallback"

This will be translated to the following Caddy config snippet:

(iocaine) {
        @read method GET HEAD
        reverse_proxy @read iocaine:42069 {
                @fallback status 421
                handle_response @fallback
        }
}

Any container that should be protected by iocaine

All the containers that are “behind” the Caddy reverse proxy can now get protected by iocaine with just one more line in their docker-compose.yaml. So now we have

   labels:
      caddy: service.example.com
      caddy.reverse_proxy: "{{upstreams 3000}}"
      caddy.import: iocaine

which translates to

service.example.com {
        import iocaine
        reverse_proxy 172.18.0.6:3000
}

So with one simple extra label for the docker container I have iocaine activated.

Result? ByeBye (most) AI Bots

Looking at the services that got hammered most from those crap bots - deploying this iocaine container and telling Caddy about it solved the problem for me. 98% of the requests from the bots now go to iocaine and no longer hog resources in the actual services.

I wish it wouldn’t be neccessary to run such tools. But as long as we have shitheads doing the AI hype there is no hope. I wish they all would end up in Jail for all their various stealing they do. And someone with a little more brain left would set things up sensibly, then the AI thing could maybe turn out something good and useful.

But currently it is all crap.

Electric Car, Vacation trip

Electric Car

A while ago I got my hands on an electric car - after not having owned a car for most of my life (there really is not much need here). It wasn’t planned nor a goal of mine, but it kind of “came out of talks with my boss”, so now it’s there.

Due to some special rules in the german tax system it turns out really cheap for me - it comes from the company, which allows personal use. So I have to pay taxes on the value of it plus whichever amount of kilometers I have to drive to work. And the latter is what is good for me - I have homeoffice in my contract, so no drive to work, except maybe once a year for something. So no regular trip to calculate, only if I ever really have to do a trip to the office.

And it being an electric car, running costs are also cheap. Way below the outdated tech that needs gas to run, is annoyingly loud and stinks.

The car

In the past I used either car sharing or renting a car when I needed one, depending on what I actually needed. Last times renting I already tried electric variants, so I could compare this with.

What I have now is a “Citroen e-Berlingo XL” from 2023 (so not the latest change from 2024), which is on the huge size for space, but small for battery. It has 7 seats (though I have the last 2 currently taken out, no daily need) and more storage space than I need even on a vacation trip.

The engine (or well, it’s battery) is on the small side - a capacity of 50 kWh means it only has 278km reach according to WLTP. That actually is a good bit less - as usual, those numbers are lying for the producer. Turns out that on highways in a more realistic mode than WLTP (read: real driving) it’s somewhat around 120km before one wants a charger again. But, looking around, at least in Germany that is not a big problem, there are more than enough chargers available.

Driving

Actually driving experience is good. It sure is a huge car (4.7m long, 1.85m high/wide) and feels more like driving a (small) bus, but it is easy to handle. Maximum speed is limited (or the small battery would suck even more) to 135km/h, but that is more than enough. Even on german highways. Did my vacation trip with cruise control set to 115km/h and very few times only went above that manually. Real relaxed driving that was.

Vacation trip

So we had a vacation just recently, and instead of renting a car for the trip we, of course, wanted to take the e-Berlingo. Distance was about double what the car can (realistically!) do, so one charging stop in the middle somewhere was a must. Not having had to charge on highways yet - and entirely new with this car - that made for a bit of nervousness, but it all turned out really good. There are really nice tools like ABRP to plan your trip including charging, which can take live data of availability of charging points into the planning.

And it turned out nice - we reached our planned charging point and found a long queue of cars waiting. But turns out it was all those poor folks that need actual gasoline for their outdated combustion engines. The charging points for EV cars still had enough free space, so we could bypass the queue and directly start charging. We also did not need to repark the car after just a few minutes, we could directly start our break.

With a charging time of approx. 30 minutes using the fast charger, such a break is long enough to get enough energy for the next part of the trip and short enough to not be annoying.

Charging prices

At home it depends. If one has some photovoltaic system to get power, charging is basically free. If not it depends on whichever contract one has, costs will be somewhat between 0 and ~30cent per kWh. Not much, and way below gasoline costs.

Outside, using a fast charger, prices vary depending on where you charge - and with what charging card. Prices between 40 and 70cent / kWh, and the same charging point can vary, just from the card one uses. That is a thing that the EU could actually go and better regulate, similar to the phone regulations it took. Still, the costs are still way below gasoline.

Charging cards

There is a huge amount of different providers available, and all do their own things in pricing and how one can use them. They do have standards (say, the plugs are standardized, by now the way to start charging also), and that enables roaming (use a charging card of one provider at a charging point of another), but other than that, it seems to be random.

That is - if you use card A on a charging point of Provider B you may pay 0.49cents, if you use card C on the same point, it may charge you 0.79cents. And card D isn’t taken at all. Some (the newer ones) you can pay directly by credit card, many you can’t. Some may allow paypal or Google/Apple Pay. So in the end you need more than just one charging card - I collected 8 free ones by now - just to be sure you can find a combination that isn’t hugely overpriced.

From QNAP QTS to TrueNAS Scale

History, Setup

So for quite some time I have a QNAP TS-873x here, equipped with 8 Western Digital Red 10 TB disks, plus 2 WD Blue 500G M2 SSDs. The QNAP itself has an “AMD Embedded R-Series RX-421MD” with 4 cores and was equipped with 48G RAM.

Initially I had been quite happy, the system is nice. It was fast, it was easy to get to run and the setup of things I wanted was simple enough. All in a web interface that tries to imitate a kind of workstation feeling and also tries to hide that it is actually a webinterface.

Natually with that amount of disks I had a RAID6 for the disks, plus RAID1 for the SSDs. And then configured as a big storage pool with the RAID1 as cache. Below the hood QNAP uses MDADM Raid and LVM (if you want, with thin provisioning), in some form of emdedded linux. The interface allows for regular snapshots of your storage with flexible enough schedules to create them, so it all appears pretty good.

QNAP slow

Fast forward some time and it gets annoying. First off you really should have regular raid resyncs scheduled, and while you can set priorities on them and have them low priority, they make the whole system feel very sluggish, quite annoying. And sure, power failure (rare, but can happen) means another full resync run. Also, it appears all of the snapshots are always mounted to some /mnt/snapshot/something place (df on the system gets quite unusable).

Second, the reboot times. QNAP seems to be affected by the “more features, fuck performance” virus, and bloat their OS with more and more features while completly ignoring the performance. Everytime they do an “upgrade” it feels worse. Lately reboot times went up to 10 to 15 minutes - and then it still hadn’t started the virtual machines / docker containers one might run on. Another 5 to 10 minutes for those. Opening the file explorer - ages on calculating what to show. Trying to get the storage setup shown? Go get a coffee, but please fetch the beans directly from the plantation, or you are too fast.

Annoying it was. And no, no broken disks or fan or anything, it all checks out fine.

Replace QNAPs QTS system

So I started looking around what to do. More RAM may help a little bit, but I already had 48G, the system itself appears to only do 64G maximum, so not much chance of it helping enough. Hardware is all fine and working, so software needs to be changed. Sounds hard, but turns out, it is not.

TrueNAS

And I found that multiple people replaced the QNAPs own system with a TrueNAS installation and generally had been happy. Looking further I found that TrueNAS has a variant called Scale - which is based on Debian. Doubly good, that, so I went off checking what I may need for it.

Requirements

Heck, that was a step back. To install TrueNAS you need an HDMI out and a disk to put it on. The one that QTS uses is too small, so no option.

QNAPs internal USB disk — QNAPs original internal USB drive, DOM

So either use one of the SSDs that played cache (and should do so again in TrueNAS, or get the QNAP original replaced.

HDMI out is simple, get a cheap card and put it into one of the two PCIe-4x slots, done. The disk thing looked more complicated, as QNAP uses some “internal usb stick thing”. Turns out it is “just” a USB stick that has an 8+1pin connector. Couldn’t find anything nice as replacement, but hey, there are 9-pin to USB-A adapters.

With that adapter, one can take some random M2 SSD and an M2-to-USB case, plus some cabling, and voila, we have a nice system disk.

USB 9pin to USB-A cable connected to Motherboard and some more cable — 9pin adapter to USB-A connected with some more cable

Obviously there isn’t a good place to put this SSD case and cable, but the QNAP case is large enough to find space and use some cable ties to store it safely. Space enough to get the cable from the side, where the mainboard is to the place I mounted it, so all fine.

Mounted SSD in external case, also shows the video card — Mounted SSD in its external case

The next best M2 SSD was a Western Digital Red with 500G - and while this is WAY too much for TrueNAS, it works. And hey, only using a tiny fraction? Oh so much more cells available internally to use when others break. Or something…

Together with the Asus card mounted I was able to install TrueNAS. Which is simple, their installer is easy enough to follow, just make sure to select the right disk to put it on.

Preserving data during the move

Switching from QNAP QTS to TrueNAS Scale means changing from MDADM Raid with LVM and ext4 on top to ZFS and as such all data on it gets erased. So a backup first is helpful, and I got myself two external Seagate USB Disks of 6TB each - enough for the data I wanted to keep.

Copying things all over took ages, especially as the QNAP backup thingie sucks, it was breaking quite often. Also, for some reason I did not investigate, the performance of it was real bad. It started at a maximum of 50MB/s, but the last terabyte of data was copied at MUCH less than that, and so it took much longer than I anticipated.

Copying back was slow too, but much less so. Of course reading things usually is faster than writing, with it going around 100MB/s most of the time, which is quite a bit more - still not what USB3 can actually do, but I guess the AMD chip doesn’t want to go that fast.

TrueNAS experience

The installation went mostly smooth, the only real trouble had been on my side. Turns out that a bad network cable does NOT help the network setup, who would have thought. Other than that it is the usual set of questions you would expect, a reboot, and then some webinterface.

And here the differences start. The whole system boots up much faster. Not even a third of the time compared to QTS.

One important thing: As TrueNAS scale is Debian based, and hence a linux kernel, it automatically detects and assembles the old RAID arrays that QTS put on. Which TrueNAS can do nothing with, so it helps to manually stop them and wipe the disks.

Afterwards I put ZFS on the disks, with a similar setup to what I had before. The spinning rust are the data disks in a RAIDZ2 setup, the two SSDs are added as cache devices. Unlike MDADM, ZFS does not have a long sync process. Also unlike the MDADM/LVM/EXT4 setup from before, ZFS works different. It manages the raid thing but it also does the volume and filesystem parts. Quite different handling, and I’m still getting used to it, so no, I won’t write some ZFS introduction now.

Features

The two systems can not be compared completly, they are having a pretty different target audience. QNAP is more for the user that wants some network storage that offers a ton of extra features easily available via a clickable interface. While TrueNAS appears more oriented to people that want a fast but reliable storage system. TrueNAS does not offer all the extra bloat the QNAP delivers. Still, you have the ability to run virtual machines and it seems it comes with Rancher, so some kubernetes/container ability is there. It lacks essential features like assigning PCI devices to virtual machines, so is not useful right now, but I assume that will come in a future version.

I am still exploring it all, but I like what I have right now. Still rebuilding my setup to have all shares exported and used again, but the most important are working already.

Rust? Munin? munin-plugin…

My first Rust crate: munin-plugin

Sooo, some time ago I had to rewrite a munin plugin from Shell to Rust, due to the shell version going crazy after some runtime and using up a CPU all for its own. Sure, it only did that on Systems with Oracle Database installed, so that monster seems to be bad (who would have guessed?), but somehow I had to fixup this plugin and wasn’t allowed to drop that wannabe-database.

A while later I wrote a plugin to graph Fibre Channel Host data, and then Network interface statistics, all with a one-second resolution for the graphs, to allow one to zoom in and see every spike. Not have RRD round of the interesting parts.

As one can imagine, that turns out to be a lot of very similar code - after all, most of the difference is in the graph config statements and actual data gathering, but the rest of code is just the same.

As I already know there are more plugins (hello rsyslog statistics) I have to (sometimes re-)write in Rust, I took some time and wrote me a Rust library to make writing munin-plugins in Rust easier. Yay, my first crate on crates.io (and wrote lots of docs for it).

By now I made my 1 second resolution CPU load plugin and the 1 second resolution Network interface plugin use this lib already. To test less complicated plugins with the lib, I took the munin default plugin “load” (Linux variant) and made a Rust version from it, but mostly to see that something as simple as that is also easy to implement: Munin load

I got some idea on how to provide a useful default implementation of the fetch function, so one can write even less code, when using this library.

It is my first library in Rust, so if you see something bad or missing in there, feel free to open issues or pull requests.

Now, having done this, one thing missing: Someone to (re)write munin itself in something that is actually fast… Not munin-node, but munin. Or maybe the RRD usage, but with a few hundred nodes in it, with loads of graphs, we had to adjust munin code and change some timeout or it would commit suicide regularly. And some other code change for not always checking for a rename, or something like it. And only run parts of the default cronjob once an hour, not on every update run. And switch to fetching data over ssh (and munin-async on the nodes). And rrdcached with loads of caching for the trillions of files (currently amounts to ~800G of data).. And it still needs way more CPU than it should. Soo, lots of possible optimizations hidden in there. Though I bet a non-scripting language rewrite might gain the most. (Except, of course, someone needs to do it… :) )

Another shell script moved to rust

Shell? Rust!

Not the first shell script I took and made a rust version of, but probably my largest yet. This time I took my little tm (tmux helper) tool which is (well, was) a bit more than 600 lines of shell, and converted it to Rust.

I got most of the functionality done now, only one major part is missing.

What’s tm?

tm started as a tiny shell script to make handling tmux easier. The first commit in git was in July 2013, but I started writing and using it in 2011. It started out as a kind-of wrapper around ssh, opening tmux windows with an ssh session on some other hosts. It quickly gained support to open multiple ssh sessions in one window, telling tmux to synchronize input (send input to all targets at once), which is great when you have a set of machines that ought to get the same commands.

tm vs clusterssh / mussh

In spirit it is similar to clusterssh or mussh, allowing to run the same command on many hosts at the same time. clusterssh sets out to open new terminals (xterm) per host and gives you an input line, that it sends everywhere. mussh appears to take your command and then send it to all the hosts. Both have disadvantages in my opinion: clusterssh opens lots of xterm windows, and you can not easily switch between multiple sessions, mussh just seems to send things over ssh and be done.

tm instead “just” creates a tmux session, telling it to ssh to the targets, possibly setting the tmux option to send input to all panes. And leaves all the rest of the handling to tmux. So you can

detach a session and reattach later easily,
use tmux great builtin support for copy/paste,
see all output, modify things even for one machine only,
“zoom” in to one machine that needs just ONE bit different (cssh can do this too),
let colleagues also connect to your tmux session, when needed,
easily add more machines to the mix, if needed,
and all the other extra features tmux brings.

More tm

tm also supports just attaching to existing sessions as well as killing sessions, mostly for lazyness (less to type than using tmux directly).

At some point tm gained support for setting up sessions according to some “session file”. It knows two formats now, one is simple and mostly a list of hostnames to open synchronized sessions for. This may contain LIST commands, which let tm execute that command, expected output is list of hostnames (or more LIST commands) for the session. That, combined with the replacement part, lets us have one config file that opens a set of VMs based on tags our Ganeti runs, based on tags. It is simply a LIST command asking for VMs tagged with the replacement arg and up. Very handy. Or also “all VMs on host X”.

The second format is basically “free form tmux commands”. Mostly “commandline tmux call, just drop the tmux in front” collection.

Both of them supporting a crude variable replacement.

Conversion to Rust

Some while ago I started playing with Rust and it somehow ‘clicked’, I do like it. My local git tells me, that I tried starting off with go in 2017, but that appearently did not work out. Fun, everywhere I can read says that Rust ought to be harder to learn.

So by now I have most of the functionality implemented in the Rust version, even if I am sure that the code isn’t a good Rust example. I’m learning, after all, and already have adjusted big parts of it, multiple times, whenever I learn (and understand) something more - and am also sure that this will happen again…

Compatibility with old tm

It turns out that my goal of staying compatible with the behaviour of the old shell script does make some things rather complicated. For example, the LIST commands in session config files - in shell I just execute them commands, and shell deals with variable/parameter expansion, I just set IFS to newline only and read in what I get back. Simple. Because shell is doing a lot of things for me.

Now, in Rust, it is a different thing at all:

Properly splitting the line into shell words, taking care of quoting (can’t simply take whitespace) (there is shlex)
Expanding specials like ~ and $HOME (there is home_dir).
Supporting environment variables in general, tm has some that adjust behaviour of it. Which shell can use globally. Used lazy_static for a similar effect - they aren’t going to change at runtime ever, anyways.

Properly supporting the commandline arguments also turned out to be a bit more work. Rust appearently has multiple crates supporting this, I settled on clap, but as tm supports “getopts”-style as well as free-form arguments (subcommands in clap), it takes a bit to get that interpreted right.

Speed

Most of the time entirely unimportant in the tool that tm is (open a tmux with one to some ssh connections to some places is not exactly hard or time consuming), there are situations, where one can notice that it’s calling out to tmux over and over again, for every single bit to do, and that just takes time: Configurations that open sessions to 20 and more hosts at the same time especially lag in setup time. (My largest setup goes to 443 panes in one window). The compiled Rust version is so much faster there, it’s just great. Nice side effect, that is. And yes, in the end it is also “only” driving tmux, still, it takes less than half the time to do so.

Code, Fun parts

As this is still me learning to write Rust, I am sure the code has lots to improve. Some of which I will sure find on my own, but if you have time, I love PRs (or just mails with hints).

Github

Also the first time I used Github Actions to see how it goes. Letting it build, test, run clippy and also run a code coverage tool (Yay, more than 50% covered…) on it. Unsure my tests are good, I am not used to writing tests for code, but hey, coverage!

Up next

I do have to implement the last missing feature, which is reading the other config file format. A little scared, as that means somehow translating those lines into correct calls within the tmux_interface I am using, not sure that is easy. I could be bad and just shell out to tmux on it all the time, but somehow I don’t like the thought of doing that. Maybe (ab)using the control mode, but then, why would I use tmux_interface, so trying to handle it with that first.

Afterwards I want to gain a new command, to save existing sessions and be able to recreate them easily. Shouldn’t be too hard, tmux has a way to get at that info, somewhere.

Scan for SSH private keys without passphrase

SSH private key scanner (keys without passphrase)

So for policy reasons, customer wanted to ensure that every SSH private key in use by a human on their systems has a passphrase set. And asked us to make sure this is the case.

There is no way in SSH to check this during connection, so client side needs to be looked at. Which means looking at actual files on the system.

Turns out there are multiple formats for the private keys - and I really do not want to implement something able to deal with that on my own.

OpenSSH to the rescue, it ships a little tool ssh-keygen, most commonly known for its ability to generate SSH keys. But it can do much more with keys. One action is interesting here for our case: The ability to print out the public key to a given private key. For a key that is unprotected, this will just work. A key with a passphrase instead leads to it asking you for one.

So we have our way to check if a key is protected by a passphrase. Now we only need to find all possible keys (note, the requirement is not “keys in .ssh/”, but all possible, so we need to scan for them.

But we do not want to run ssh-keygen on just any file, we would like to do it when we are halfway sure, that it is actually a key. Well, turns out, even though SSH has multiple formats, they all appear to have the string PRIVATE KEY somewhere very early (usually first line). And they are tiny - even a 16384bit RSA key is just above 12000 bytes long.

Lets find every file thats less then 13000 bytes and has the magic string in it, and throw it at ssh-keygen - if we get a public key back, flag it. Also, we supply a random (ohwell, hardcoded) passphrase, to avoid it prompting for any.

Scanning the whole system, one will find quite a surprising number of “unprotected” SSH keys. Well, better description possibly “Unprotected RSA private keys”, so the output does need to be checked by a human.

This, of course, can be done in shell, quite simple. So i wrote some Rust code instead, as I am still on my task to try and learn more of it. If you are interested, you can find sshprivscanner and play with it, patches/fixes/whatever welcome.

Funny CPU usage - rewrite it in rust

Munin plugin and it’s CPU usage (and a rewrite in rust)

With my last blog on the Munin plugins CPU usage I complained about Oracle Linux doing something really weird, driving up CPU usage when running a fairly simple Shell script with a loop in.

Turns out, I was wrong. It is not OL7 that makes this problem show up. It appears to be something from the Oracle “Enterprise” Database installed on the system, that makes it go this crazy. I’ve now had this show up on RedHat7 systems too, and the only thing that singles them out is that overpriced index card system on it.

I still don’t know what the actual reason for this is, and honestly, don’t have enough time to dig deep into it. It is not something that a bit of debugging/tracing finds - especially as it does start out all nice, and accumulates more CPU usage over time. Which would suggest some kind of leak leading to more processing needed, or so - but then it is only CPU affected, not memory, and ONLY on systems with that database on. Meh.

Well, I recently (December vacation) got me to look deeper into learning Rust. My first project with that was a multi-threaded milter to do some TLS checks on outgoing mails (kind of fun customer requirements there), and heck, Rust did make that a surprisingly easy task in the end. (Comparing the old, single-threaded C code with my multi-threaded Rust version, a third of the code length doing more, and being way easier to extend with wanted new features is nice).

So my second project was “Replace this shell script with a Rust binary doing the same”. Hell yeah. Didn’t take that long and looks good (well, the result. Not sure about the code. People knowing rust may possibly scratch out eyes when looking at it). Not yet running for that long, but even compared to the shell on systems that did not show the above mentioned bugs (read: Debian, without Oracle foo), uses WAY less CPU (again, mentioned by highly accurate outputs of the top command). So longer term I hope this version won’t run into the same problems as the shell one. Time will tell.

If you are interested in the code, go find it here, and if you happen to know rust and not run away screaming, I’m happy for tips and code fixes, I’m sure this can be improved lots. (At least cargo clippy is happy, so basics are done…)

Update: According to munin, the rust version creates 14 forks/second less than the shell one. And the fork rate change is same on machines with/without the database. That 14 is more than I would have guessed. CPU usage as expected: only on the problem hosts with Oracle Database installed you can see a huge difference, otherwise it is not an easily noticable difference. That is, on an otherwise idle host (munin graph shows average use of low one-digit numbers), one can see a drop of around 1% in the CPU usage graph from munin. Ohwell, poor Shell.

Funny CPU usage

Munin plugin and it’s CPU usage (shell fixup)

So at work we do have a munin server running, and one of the graphs we do for every system is a network statistics one with a resolution of 1 second. That’s a simple enough script to have, and it is working nicely - on 98% of our machines. You just don’t notice the data gatherer at all, so that we also have some other graphs done with a 1 second resolution. For some, this really helps.

Basics

The basic code for this is simple. There is a bunch of stuff to start the background gathering, some to print out the config, and some to hand out the data when munin wants it. Plenty standard.

The interesting bit that goes wrong and uses too much CPU on one Linux Distribution is this:

run_acquire() {
   echo "$$" > ${pidfile}

   while :; do
     TSTAMP=$(date +%s)
     echo ${IFACE}_tx.value ${TSTAMP}:$(cat /sys/class/net/${IFACE}/statistics/tx_bytes ) >> ${cache}
     echo ${IFACE}_rx.value ${TSTAMP}:$(cat /sys/class/net/${IFACE}/statistics/rx_bytes ) >> ${cache}
     # Sleep for the rest of the second
     sleep 0.$(printf '%04d' $((10000 - 10#$(date +%4N))))
   done
}

That code works, and none of Debian wheezy, stretch and buster as well as RedHat 6 or 7 shows anything, it just works, no noticable load generated.

Now, Oracle Linux 7 thinks differently. The above code run there generates between 8 and 15% CPU usage (on fairly recent Intel CPUs, but that shouldn’t matter). (CPU usage measured with the highly accurate use of top and looking what it tells…)

Whyever.

Fixing

Ok, well, the code above isn’t all the nicest shell, actually. There is room for improvement. But beware, the older the bash, the less one can fix it.

So, first of, there are two useless uses of cat. Bash can do that for us, just use the $(< /PATH/TO/FILE ) way.
Oh, Bash5 knows the epoch directly, we can replace the date call for the timestamp and use ${EPOCHSECONDS}
Too bad Bash4 can’t do that. But hey, it’s builtin printf can help out, a nice TSTAMP=$(printf ‘%(%s)T\n’ -1) works.
Unfortunately, Bash4.2 and later, not 4.1, and meh, we have a 4.1 system, so that has to stay with the date call there.

Taking that, we end up with 3 different possible versions, depending on the Bash on the system.

obtain5() {
  ## Purest bash version, Bash can tell us epochs directly
  echo ${IFACE}_tx.value ${EPOCHSECONDS}:$(</sys/class/net/${IFACE}/statistics/tx_bytes) >> ${cache}
  echo ${IFACE}_rx.value ${EPOCHSECONDS}:$(</sys/class/net/${IFACE}/statistics/rx_bytes) >> ${cache}
  # Sleep for the rest of the second
  sleep 0.$(printf '%04d' $((10000 - 10#$(date +%4N))))
}

obtain42() {
  ## Bash cant tell us epochs directly, but the builtin printf can
  TSTAMP=$(printf '%(%s)T\n' -1)
  echo ${IFACE}_tx.value ${TSTAMP}:$(</sys/class/net/${IFACE}/statistics/tx_bytes) >> ${cache}
  echo ${IFACE}_rx.value ${TSTAMP}:$(</sys/class/net/${IFACE}/statistics/rx_bytes) >> ${cache}
  # Sleep for the rest of the second
  sleep 0.$(printf '%04d' $((10000 - 10#$(date +%4N))))
}

obtain41() {
  ## Bash needs help from a tool to get epoch, means one exec() all the time
  TSTAMP=$(date +%s)
  echo ${IFACE}_tx.value ${TSTAMP}:$(</sys/class/net/${IFACE}/statistics/tx_bytes) >> ${cache}
  echo ${IFACE}_rx.value ${TSTAMP}:$(</sys/class/net/${IFACE}/statistics/rx_bytes) >> ${cache}
  # Sleep for the rest of the second
  sleep 0.$(printf '%04d' $((10000 - 10#$(date +%4N))))
}

run_acquire() {
   echo "$$" > ${pidfile}

   case ${BASH_VERSINFO[0]} in
     5) while :; do
          obtain5
        done
        ;;
     4) if [[ ${BASHVERSION[1]} -ge 2 ]]; then
          while :; do
            obtain42
          done
        else
          while :; do
            obtain41
          done
        fi
        ;;
   esac
}

Does it help?

Oh yes, it does. Oracle Linux 7 appears to use Bash 4.2, so uses obtain42 and hey, removing one date and two cat calls, and it has a sane CPU usage of 0 (again, highly accurate number generated from top…). Appears OL7 is doing heck-what-do-i-know extra, when calling other tools, for whatever gains, removing that does help (who would have thought).

(None of RedHat or Oracle Linux has SELinux turned on, so that one shouldn’t bite. But it is clear OL7 doing something extra for everything that bash spawns.)

Debian NEW Queue, Rust packaging

Debian NEW Queue

So for some reason I got myself motivated again to deal with some packages in Debians NEW Queue. We had 420 source packages waiting for some kind of processing when I started, now we are down to something around 10. (Silly, people keep uploading stuff…)

That’s not entirely my own work, others from the team have been active too, but for those few days I went through a lot of stuff waiting. And must say it still feels mostly like it did when I somehow stopped doing much in NEW.

Except - well, I feel that maintainers are much better in preparing their packages, especially that dreaded task of getting the copyright file written seems to be one that is handled much better. Now, thats not supported by any real numbers, just a feeling, but a good one, I think.

Rust

Dealing with NEW meant I got in contact with one part that currently generates some friction between the FTP Team and one group of package maintainers - the Rust team.

Note: this is, of course, entirely written from my point of view. Though with the intention of presenting it as objective as possible. Also, I know what rust is, and have tried a “Hello world” in it, but that’s about my deep knowledge of it…

The problem

Libraries in rust are bundled/shipped/whatever in something called crates, and you manage what your stuff needs and provides with a tool called cargo.

A library (one per crate) can provide multiple features, say a TLS lib can link against gnutls or openssl or some other random implementation. Such features may even be combinable in various different ways, so one can have a high number of possible feature combinations for one crate.

There is a tool called debcargo which helps creating a Debian package out of a crate. And that tool generates so-called feature-packages, one per feature / combination thereof.

Those feature packages are empty packages, only containing a symlink for their /usr/share/doc/… directory, so their size is smaller than the metadata they will produce. Inside the archive and the files generated by it, stuff that every user everywhere has to download and their apt has to process. Additionally, any change of those feature sets means one round through NEW, which is also not ideal.

So, naturally, the FTP Team dislikes those empty feature packages. Really, a lot.

There appears to be a different way. Not having the feature packages, but putting all the combinations into a Provides header. That sometimes works, but has two problems:

It can generate really long Provides: lines. I mean, REALLY REALLY REALLY long. Somewhat around 250kb is the current record. Thats long enough that a tool (not dak itself) broke on it. Sure, that tool needs to be fixed, but still, that’s not nice. Currently preferred from us, though.
Some of the features may need different dependencies (say, gnutls vs openssl), should those conflict with each other, you can not combine them into one package.

Solutions

Currently we do not have a good one. The rust maintainers and the ftp team are talking, exploring various ideas, we will see what will come out.

Devel archive / Component

One of the possible solutions for the feature package problem would be something that another set of packages could also make good use of, I think. The introduction of a new archive or component, meant only for packages that are needed to build something, but where users are discouraged from ever using them.

What?

Well, take golang as an example. While we have a load of golang-something packages in Debian, and they are used for building applications written in go - none of those golang-something are meant to be installed by users. If you use the language and develop in it, the go get way is the one you are expected to use.

So having an archive (or maybe component like main or contrib) that, by default, won’t be activated for users, but only for things like buildds or archive rebuilds, will make one problem (hated metadata bloat) be evaluated wildly different.

It may also allow a more relaxed processing of binary-NEW (easier additions of new feature packages).

But but but

Yes, it is not the most perfect solution. Without taking much energy to think about, it requires

an adjustment in how main is handled. Right now we have the golden rule that main is self contained, that is, things in it may not need anything outside it for building or running. That would need to be adjusted for building. (Go as well as currently rust are always building static binaries, so no library dependencies there).
It would need handling for the release, that is, the release team would need to deal with that archive/component too. We haven’t, yet, talked to them (still, slowly, discussing inside FTP Team). So, no idea how many rusty knives they want to sink into our nice bodies for that idea…

Final

Well, it is still very much open. Had an IRC meeting with the rust people, will have another end of November, it will slowly go forward. And maybe someone comes up with an entire new idea that we all love. Don’t know, time will tell.

SSH known_hosts merge by key

So I just ran again into sshs annoying behaviour of storing the same host key a trillionth time in my .ssh/known_hosts file. And then later on, when it changes (for whatever reason), complaining over and over until one manually fixed all those tons of lines.

Annoying.

So I came up with a little hacky python script that takes one or more files in the known_hosts format and merges them by key. So you end up with one line per key, and as many hostnames, IP addresses and whatnot in front.

Note: Does not work with that annoying HashKnownHosts format, and I have no idea what ssh will say if you use one of those @ tags in there. The first one is think is crap, so I don’t use it anywhere, the second I never had to use, so no idea if it breaks or not.

#!/usr/bin/python

# Copyright (C) 2019 Joerg Jaspert <joerg@debian.org>
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# .
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
# .
# THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
# OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
# IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

import argparse

parser = argparse.ArgumentParser(
    description='Merge ssh known host entries by key',
    epilog="""
Merges entries in given ssh known_hosts file based on the key. One can also merge from multiple files.
The file should NOT use the HashKnownHosts feature.
""")

parser.add_argument('files', type=str, nargs='+', help='files that should be merged')
parser.add_argument('-o', '--output', type=str, nargs='?', help='output file (defaults is STDOUT). Only opened after merge is complete, so can be used for inplace merge.')
args = parser.parse_args()

if args.output:
  import StringIO
  output = StringIO.StringIO()
else:
  import sys
  output = sys.stdout

hostkeys = {}
for kfile in args.files:
  with open(kfile) as kf:
    for line in kf:
      line_splitted = line.rstrip().split(' ')
      hosts = line_splitted.pop(0).split(',')
      key_type = line_splitted.pop(0)
      key = line_splitted[0]
      if not key in hostkeys:
        hostkeys[key]={}
        hostkeys[key]["hosts"] = {}
      hostkeys[key]["key_type"]=key_type
      # Store the host entries, uniquify them
      for entry in hosts:
        hostkeys[key]["hosts"][entry]=1

# And now output it all
for key in hostkeys:
  output.write('%s %s %s\n' %
  (','.join(hostkeys[key]["hosts"]), hostkeys[key]["key_type"], key))

if args.output:
  with open(args.output,'w') as f:
    f.write(output.getvalue())

← Newer Page 1 of 26 Older →