Shredding Fail

A few years ago I blogged about the importance of using a good paper shredder. A crappy shredder is just (bad) security through obscurity, and easily leaks all kinds of info. So I was delighted and horrified to find this package show up in the mail, using shreds as packing protection for the item I had ordered.

 

30 seconds of searching turned up a number of interesting bits (plus a few more with full names which I have kindly omitted):

Let’s count a few of the obvious mistakes here:

  1. Using a cheap, damaged, or simply overloaded shredder that doesn’t even fully cut the paper. I got an inch-wide swath of two pages for free.
  2. Not using a cross-cut shredder. (In part, there is a mix of both types.)
  3. Shredding in the direction of the printed text.
  4. Sending the poorly-shredded output to a random person who bought something from your business.

And of course it would have been better security to use a microcut shredder.

Now, to be fair, even this poor shredding has technically done its job. Other than a few alluring snippets, it’s not worth my time to assemble the rest to see the full details of these banking, business, and health care records. But then I’m a nice guy who isn’t interested in committing fraud or identity theft, which is an unreasonable risk to assume of every customer.

Satellite Radio

In my last blog post, I wrote about how I helped fund the launch of a satellite via KickStarter, and that after launch I would have more to say about receiving radio signals from it. This is that update.

The good news is that SpaceX’s Falcon 9 finally launched last Friday. It’s had two delays since the originally scheduled date of March 30th. No big deal, it happens. Here’s the launch video, showing liftoff, onboard rocketcam, staging, and separation of the Dragon cargo capsule: (YouTube)

This is the CRS-3 flight — the 3rd  Commercial Resupply Service flight hauling cargo to the International Space Station. With the retirement of the Space Shuttle, CRS flights are the only US vehicles going to ISS, although other countries have their own capabilities. This is also the first flight where SpaceX has successfully flown the first stage back to Earth for a landing (albeit in the ocean, for safety, as it’s still experimental). Usually rockets like this are expendable – they just reenter the atmosphere and burn up. But SpaceX plans to fly them back to reuse the hardware and enable lower costs. It’s a pretty stunning concept. There’s no video of CRS-3’s Falcon 9 returning (yet?), just two tweets confirming it was successful. But it just so happens they did a test flight last week that demonstrates the idea: (YouTube)

So back to to satellite I funded: KickSat. The “carrier” was successfully deployed, and the first telemetry packets have been received and decoded. In a couple of weeks (May 4th) it will deploy its 104 “Sprite” nanosatellites, which will each start broadcasting their own unique signals. Things are off to a great start!

The slightly bad news is that I haven’t been completely successful in capturing Kicksat broadcasts with my own ground equipment. Yet.

It’s challenging for a few reasons…

The biggest factor is that the signals are inherently just not very strong, due to power limitations. The KickSat carrier is a 3U cubesat, with limited area for solar-cells. It periodically broadcasts a brief telemetry packet with a 1-watt radio, which is what I’m trying to capture.

And of course the Sprites it carries are even smaller, and will only be transmitting with a mere 10mW of power. The silver area in the Sprite I’m holding below is where the solar cell goes. (It turns out that satellite-grade solar cells are export-restricted under US law, so the sample units shipped without ’em!)

Such faint signals need some modestly specialized (but still off-the-shelf) equipment to enable reception. I’ve got a yagi antenna and low-noise amplifier, as recommended on the KickSat wiki. Successfully making use of it is a little tricky, though. You need to use orbit tracking tools (e.g. gpredict or via Heavens Above) to know when a satellite pass will occur, and where in the local sky its path will be. Yagis are directional, and thus need to be pointed (at least roughly) to the right place. Not every pass is directly overhead, and if it’s too low on the horizon it may be too hard to receive (due to strength or interference).

Additionally, KickSat only broadcasts a short packet once every 30 or 250 seconds (depending on what mode it’s in), and during a pass it’s only above the horizon for a few minutes. That makes for a rather ephemeral indication that it’s even there. Blink and it’s gone! My location has about 4 pass opportunities a day, but not all are useful.

Oh, and did I mention I’m doing this all with a cheap $20 RTL2832U software defined radio?! Heck, the coax cables connecting my dongle to the antenna cost more than the radio itself!

I decided to start off by first trying to catch signals from some other satellites. I went through AMSAT’s status page and DK3WN’s fantastic satellite blog and gathered a list of a couple dozen satellites known to be active on the same 70cm (~435Mhz) band KickSat is using.

My first success was HO-68 (aka Hope Oscar 68 aka XW-1). This is a Chinese satellite launched in 2009, broadcasting a fairly constant 200mW telemetry “beacon” in morse code. Picking out the dotted line in the GQRX waterfall display was pretty easy, even though it’s not exactly on the expected 435.790Mhz frequency due to inaccuracies in my radio and doppler shift(!).

This is what it sounds like: WAV | OGG. Why is the tone shifting? The gradual lowering is doppler shift. The abrupt jumps upward are just from me adjusting the radio tuning to keep it audible. My morse code skills are terrible, but I replayed it enough times to convince myself that it starts out with the expected “BJ1SA XW XW” (the radio sign and initials of its name), per the page describing the signal format. For the lazy, that’s – . . . / . – – – / . – – – – / . . . / . – / – . . – / . – – / – . . – / . – – in morse code.

Next up was FO-29.

Here’s a screengrab of gpredict, showing the predicted path low in the sky, from the southeast to north.

It’s got a stronger 1-watt beacon, which wasn’t too hard to spot. Here I’ve switched from GQRX on OS X to SDRSharp on Windows. It has more features, plugins, and makes it easy to zoom in on a signal. The signal with doppler-shift is readily apparent as a diagonal line.

Audio of FO-29’s beacon: WAV | OGG. Despite the stronger transmitter, the received signal is weaker (probably due to being low on the horizon), and the the morse code is sufficiently fast that I’m not able to decode it by ear. (There’s an app for automatic decoding, but I haven’t tried it yet.)

And lastly… *drumroll*… After a number of unsuccessful attempts to receive KickSat’s signal, I finally caught it today! There was a nearly-overhead pass in the afternoon, so the as-received signal strength was optimal.

I pointed my antenna upwards, tilted it to my clear northeast view, tuned SDRSharp to the 437.505 MHz KickSat frequency, and waited. This is one of the usecases where software radio is really useful… While waiting, I was was recording the entire raw ~2Mhz slice of bandwidth around that frequency. That’s 4.5GB for the 7 minutes I was recording. I actually missed the transmission the first time, as it’s indicated about 23kHz lower than expected (again, due to hardware inaccuracies and doppler shift). But no big deal, I just replayed the recorded data and fine tuned things to the right spot.

Here’s what it sounds like: WAV | OGG. Unlike the steady stream of analog dit-dahs from HO-68 and FO-29, this is just 2 seconds of digital “noise”, like you would hear from a dialup modem. In fact, it’s exactly what you’d hear from an old modem. KickSat is using the 1200bps AFSK-modulated format, which is apparently still widely used in amateur packet radio. There are decoders available to extract the digital data (GQRX has one built in, and SDRSharp output can be piped to the qtmm AFSK1200 decoder).

If you’ve got SDRSharp, here’s the raw IQ data for the packet (ZIP, 60.3MB). I cropped the data to just the relevant portion. Alas, I can’t seem to get the decoder to recognize the packet. 😦 I’m not quite sure what the problem is yet… I’ve successfully decoded other AFSK data with this setup, so I suspect my signal was just too weak/noisy. Could be poor antenna pointing, but this was an easy pass. Some folks have had success with improving SNR with shielding, but I haven’t been able to replicate their results. There are a number of knobs and dials in SDRSharp to adjust manual/automatic gain control, so I might need to tweak that. (Unfortunately difficult, as there are only a few brief chances a day to catch KickSat.) It’s also possible that this is just slightly beyond the capabilities of a $20 RTL2832U dongle. Other options exist. I’d love to get a HackRF SDR board, but they’re not available yet. The FUNcube Dongle Pro+ can be had for about $200, but from comparisons I’m not sure exactly how much better it is in this band, or if it’s worth it. I’d love to hear opinions from hams who know this stuff better, or have tried both!

Amusing aside: while poking around in the 70cm band for interesting things, I stumbled across Santa Clara County’s ARES/RACES packet radio BBS. (Apparently, Ted, that is indeed still a thing!) In fact, FO-29 is actually an orbiting BBS! It’s quaintly amusing in the Internet Age, but when it launched in 1990 it must have been amazing. I had just upgraded to a 2400bps modem and discovered FidoNet and UUCP to reach beyond the local area code BBS systems.

That’s it for now. Over the next few weeks I’ll be refining my equipment and skills, and hope to capture some solid transmissions by the time my Sprite deploys. That will be even tougher to catch, but it’s a fun challenge!

Hacks all the way down

So, I did this thing… It’s a little complicated to explain, but bear with me: I used a Software Defined Radio to capture the radio transmissions from a Bravo Ph esophageal monitor, wrote a browser-based decoder using the AudioData API, and reverse-engineered the broadcast data packets. As a bonus, I hope to do something similar to catch signals from a tiny satellite I helped fund on Kickstarter, which launches this weekend on SpaceX’s CRS-3 mission to the International Space Station.

Phew. Ok, now let’s break that down. 🙂

Software Defined Radio

Software Defined Radio (SDR) is central to all of this. It’s a pretty complex field — and I am by no means an expert — but the simplified basics are not hard to understand. Traditionally, a radio is built for a specific purpose with specific hardware. It’s effectively a black box customized to convert audio/video/data to or from a particular pattern of electromagnetic radiation in some particular part of the spectrum. Each box is different; you need one for satellite TV, one for FM radio, one for WiFi, one for GPS, and so on. There are myriad variations, and devices that might seem similar can actually be completely different. You’ve probably seen news reports about police and fire departments who are responding to the same disaster, but are literally unable to talk to each other because their radio systems are different.

SDR is a radical departure from all that. You still need a piece of hardware that can tune to a relevant slice of the radio spectrum, but it becomes a general-purpose device that relies on software to do all the application-specific bits. For example, you might ask such a device to tune to 66Mhz, and capture 3Mhz of bandwidth on either side. You then feed the result to a software NTSC decoder, et voilà, you’re watching TV (analog channel 4). The hardware doesn’t know anything about the contents of what it’s receiving, since it’s the software that deals with it. If your device can capture more bandwidth, you could even watch multiple channels at the same time. And since it’s just generic data being processed by software, it doesn’t need to happen in real-time. You can record a stream of RF data, and process it in different ways after the fact.

DVB-T dongle (digital TV) based on the Realtek RTL2832U and Elonics E4000 tuner.

Until recently, SDR was only possible with fairly expensive equipment, which made it a niche hobby. But in the 2010-2012 timeframe, some folks discovered that a cheap USB device intended for digital TV reception (“watch TV on your laptop!”) contained surprisingly capable hardware that could be repurposed as a general-purpose SDR. Specifically, the RTL2832U chipset and a variety of tuner chips. For $10 to $20 you could get one of these mass-produced dongles that, with the right software, let you receive and decode all kinds of interesting transmissions from roughly 50Mhz to 1800Mhz.

Here are just a few of the things possible:

There’s a whole world of analog and digital RF data being broadcast around us, which cheap Software Defined Radio hardware makes readily accessible.

The Bravo Ph system

Around the time I was starting to play with SDR, I had gone to my doctor because I was experiencing some of the symptoms of acid-reflux. Or, as it’s more formally known, gastroesophageal reflux disease: GERD. [Spoiler: no big deal, weightloss + antacid and I’m all good.] One of the steps in the diagnosis is monitoring the acidity level in your esophagus over time. This used to involve inserting a  tube into your nose and throat, leaving it there for a few days of measurement, and was generally quite unpleasant. Now they can just attach a tiny wireless sensor in you; it sticks there for about a week, and then gets eliminated naturally. It’s only a couple centimeters in size, and you don’t even notice it:

Bravo pH device compared with the tip of a pencil

During the monitoring period, you carry around a receiver (which basically looks like a giant 1990-era pager). It’s supposed to be kept within 3 feet of you at all times, or else it makes an annoying beep when it loses the sensor’s signal. It records pH measurements every few seconds, and conveniently displays the last reading.

Bravo pH receiver

When the study is over, you return the receiver, your doctor downloads the data, and you get a nice little report with graphs and numbers to help your doctor make a diagnosis.

Capturing Data

So that’s SDR and Bravo Ph. Now, if you’re connecting the obvious dots like I was, you’re wondering if it might be possible to snoop on the sensor’s broadcasts to see what they contain. Indeed it is!

But the first step is finding the signal.

I wasn’t really sure where to start, but some Googling turned up a User’s Guide (doctor’s guide, really) for the system, and buried in an appendix was the info I needed:

Output & Transmission
EIRP: 17.6 μW (-47.53 dBm) at a 3-meter distance
Format: Amplitude-shift keying
Frequency: 433.92 MHz
Rate: 60 ms every 12 seconds

Bingo! All I needed to do was tune my dongle to around 433.92MHz, and look for a bursty signal repeating about every 12 seconds. It was literally as simple as that — here’s a waterfall display from the GQRX app I was using, showing two transmissions (time is the vertical axis, frequency is the horizontal):

Screenshot from GQRX showing signals

And here’s what it sounds like as AM-demodulated audio: MP3 | WAV (Sorry, WordPress doesn’t seem to support inline HTML5 audio.)

Decoding Data

Ok, so now we get signal. But what’s in it? The brief bursts are obviously too fast for our meaty ears to discern meaning, so looking at the waveform in an audio editor was the easiest way to take a first look. I used Audacity on OS X:

Ah, there it is. Digital data. There are clear hi/lo levels, but what’s actually important is the length of the pulses. After examining a few more transmissions, the basic format of the data packet is apparent:

  1. A preamble consisting of 10 500μs hi pulses (each separated by 500μs lo). This likely serves as a clear “beginning of message” indicator, and to establish clock speed.
  2. 48 bits (6 bytes) of data. Each bit is 1 millisecond; with a “0” indicated by 250μs hi followed by 750μs lo, and a “1” indicated by 666μs hi followed by 333μs lo.
  3. A single 500μs hi stop bit.

(Note that I’m slightly rounding the timings to what would seem to be likely values. The actual data is imprecise due to noise and rising/falling edges, on the order of tens-of-microseconds.)

I decoded a few packets by hand, both for fun and to validate the format. But it quickly became tiring to scribble down data like “pppppppppp111110111111101000000100110100011101000001100101s” and then convert it to hex (fbfa04d1d065), so I decided to write a tool to do it. In the browser, of course!

At the time, Firefox supported a simple low-level Audio API that was exactly what I was looking for. In a nutshell, you add a MozAudioAvailable event listener to an <audio> element, and the listener periodically gets an array of sample data as the media plays. I implemented a simple state machine to decode the data, based on a manually set threshold between the hi/lo states (and some fudge-factors to deal with the imprecision/noise previously noted). I’m sure there are more elegant and automatic ways to do this, but simple brute force was enough for my limited needs. The one annoying downside is that this API can only process in real-time(!); there’s no way to ignore playback speed and just get all the data as fast as possible. If you’d like to play around with it, here’s a live demo and the source on Github. (*cough* I’ve been so slow in finishing this post, that the Audio API I’m using has been removed from Firefox 28, in favor of the newer Web Audio API. So you’ll need an older Firefox, or just gaze upon the following screenshot.)

Ok, so now I’ve got a bunch of decoded data values to examine, such as:

fbfa04b4b79b
fbfa04b4a9a9
fbfa049b80eb
fbfa04857a07
fbfa04858af7
fbfa048483ff

What do they mean?

The first 3 bytes (0xFBFA04) are always the same, so that’s presumably a serial number or unique ID (and the manual confirms that during setup, there’s a step to ensure that the receiver is getting the expected sensor ID).

The next two bytes must be the actual pH measurements. They are usually similar to each other, and by graphing the values I can see they follow the trend of the pH values reported on the receiver (which I was writing down when capturing the transmissions). Why two values? The manual says that a measurement is made every 6 seconds but a transmission only every 12, which I assume is done to save power. The pH is roughly obtained by dividing the byte’s value by 25 — but it looks like it’s somewhat non-linear or uncalibrated, as data for the lowest pH values needs to instead be divided by 30 to match what the receiver reports.

The last byte took a bit more effort to figure out. At first glance it appeared fairly random, so I assumed it was some kind of checksum. Validating medical data seems important, after all. Probably something simple to compute for an 8-bit microcontroller, so no fancy FEC or CRC magic… I fiddled around with a few guesses, but graphing the data led me to the answer:

Graph compating pH1/pH2 with checksum

The checksum value (green) looks like a stretched, inverted, and offset version of the average pH (yellow). How about (pH1 + pH2) ^ 0xFF + 7? (All modulo-255, since this is likely an 8-bit microcontroller.) That’s it! It correctly generates the observed checksum for each of the 144 packets I captured.

So with that, I’m able to decode, interpret, and validate the data packets. Neat. It’s not directly useful for anything, but made for a fun experiment.

Afterwards, I got to wondering if there might be some further technical details buried somewhere online to help explain or confirm what I found. I’ve seen patents and FCC filings used to glean data in other cases, so I went to look…

Patent US6689056 has a number of interesting tidbits. It indicates that the microcontroller in the sensor is probably a MicroChip 12C672 (a member of their PIC family, which is similar to the Atmel AVR family familiar to Arduino folks). There’s a basic description of the packet format, but the only detail I hadn’t caught was that the 3-byte header is actually a 2-byte ID and a 1-byte Message Type (I only ever saw one type). It does confirm that the last byte is a checksum, but doesn’t go into how it’s computed.

On the FCC’s website, I found a 4/25/2001 application from Meditronics for the PHZ-BRAVO100, which has a number of close-up photos and an extremely detailed Test Report. It basically confirms what I had found, with additional info on the bit timing and packet format, and also reveals that there is a “transmitter status” message type that’s sent once an hour.

Kicksat

Now let’s shift from inner space to outer space — or at least low Earth orbit. Back in November 2011, a fascinating project appeared on Kickstarter: “KickSat – Your personal spacecraft in space.” Usually satellites are large vehicles that cost millions, but recently this has been made more affordable by using  the small, standardized Cubesat format (a 10cm cube, weighing 1.3kg). KickSat takes this a step further, by packing a Cubesat carrier with tiny “nanosatellites” (3cm square, weighing a few grams). That brings the cost down to just $300 to sponsor a KickSat in orbit, broadcasting a custom callsign and other simple data. They don’t do much, but it helps demonstrate the concept of using a fleet of cheap, simple sensors instead of a single expensive “Cadillac” spacecraft. For example, instead of predicting space weather using reports from a handful of satellites, you might use a huge number of cheap nanosatellites monitoring a wide area.

KickSat Sprite (solar cells and antenna not installed)

As a bonafide space nerd, I jumped at the opportunity. And now, after a long wait, KickSat is poised to launch in just a few days (March 30th), onboard the SpaceX CRS-3 resupply flight to the International Space Station. Assuming all goes well, the KickSat CubeSat carrier will be deployed immediately after 2nd stage cutoff. It orbits by itself for 16 days, to ensure wide clearance from the ISS, and then deploys its 104 KickSats. Including mine, which will be broadcasting “MOZFF“. They’ll orbit for a few weeks, and then burn-up as they reenter the atmosphere. (No space junk!)

Rendering of the KickSat Sprites being deployed

The project is publishing info about the satellite’s transmissions, as well as info on how to set an inexpensive ground station using… That’s right, software defined radio (GNURadio + RTL2832U dongle). I’ve got my equipment ready, and will attempt to capture signals from KickSat while it’s in orbit. More on that after launch!

Attachment history

Earlier today, Dietrich tweeted a great pic as he was getting ready for the 2013 Summit in Brussels, showing his collection of badges from previous summits.

As it happens, I also dug out my badge collection home just yesterday. (Didn’t want to forget it at the office!).

That got me thinking, so I pulled out my drawer full of other Mozilla schwag and took some pictures. It’s an interesting historical look at some of the artifacts Mozilla has made over the years. Hopefully we’ll get some more great stuff at this year’s summit. Because, really, that’s the whole point of it. 😉

Mozilla office history

If you don’t live in the San Francisco Bay area, you may not be aware that the culmination of a major infrastructure project is underway this holiday weekend. The Bay Bridge, one of 5 major Bay-area bridges, is in the middle of a 5 1/2 day closure as it’s transitioned over to its replacement. (The other 4 major bridges, in case you’re wondering, are the
Richmond–San Rafael Bridge, the San Mateo Bridge, the Dumbarton Bridge, aaaand… hmmm… oh, right, the Golden Gate bridge)

The Bay Bridge was originally built in the 1930s, and after the Loma Prieta earthquake in 1989 it became clear it needed to be replaced. One of the flashbulb memories many people have of the quake — in addition to it interrupting the ’89 World Series and the collapse of the double-decker Cypress Street Viaduct — is the failure of a span on the Bay Bridge, with cars driving over it. Since then, the western half of the Bay Bridge has been retrofitted to be earthquake-safe, but the eastern span of the bridge has taken longer to completely replace. This weekend’s work is to transition the connection points, so that tomorrow people will be driving over a completely new bridge that’s been 11 years and $6.4 billion in the making!

(I’m getting to the part where Mozilla ties into this.)

Last Friday @BurritoJustice tweeted a link to a slideshow that dove into the engineering history of the Bay Bridge, complete with photos taken during the construction.

It’s some fantastic engineering porn, and I spent my lunch reading through all of it. I happened to notice that the building in the background of one of the photos looked familiar…

Mozilla’s San Francisco office, in the historic Hills Brothers Coffee building at 2 Harrison Street, is literally next-door to where the western span of the Bay Bridge lands in S.F. It makes for some really great views of the bridge from our top-floor patio:

As well as a first-row seat beneath the giant “HILLS BROS COFFEE” sign atop the building.

It’s this sign that made it easy to spot our building in the engineering history slideshow. The building was constructed in 1926, and the Bay Bridge wasn’t built until 1933-1936, so I was curious to see if the sign was visible in other contemporaneous photos. I started digging though some online resources, and got lucky right away by finding a highres version of that picture at the Library Of Congress:

I skimmed through the other 415 photos in this collection and another 1160 from UC Berkeley (So You Don’t Have To™), and found some other nice shots with the Hills Coffee sign peeking out from the background:






So there you have it. Pics of the Mozilla San Francisco office from both ends of an 80 year span of history.

A change in password manager

In my last post I talked about how password manager currently works. Starting in Firefox 26 (assuming all goes well), it will work a bit differently.

Currently, the password manager must look at each page that loads, and process it to see if it might be a login page (and if so try to fill in a saved username/password). This is kind of dumb and has always annoyed me. For one, it’s not really great for performance. We spend a small amount of time to do this on every page, when only a tiny fraction of pages will actually be login pages. Nor is it suited to the modern web — many pages do interesting things after they’ve initially loaded, but the password manager only attempts to do its work at initial page load. If a page dynamically adds a username/password field to the DOM afterward (in response to, say, the user clicking a “login” button), we never notice and thus can’t offer to fill in a login.

Starting with tomorrow’s Firefox 26 nightly build (or the day after, depending on how merges and build go), this whole process will instead be triggered when an <input type=password> is appended to the DOM — be it from loading static HTML, or when the page dynamically adds one later. This means password manager can skip processing most page loads (i.e., the ones without password fields). It also means it can now fill in logins on some sites that it couldn’t before.

This new functionality comes via bug 355063, with major supporting work from bug 771331 (which added the chrome-only DOMFormHasPassword event) and bug 839961 (which refactored password manager to prepare it for using this event). It’s also interesting to mention that bug 762593 has already added code to use DOMFormHasPassword, in order to log console warnings when a page is using password fields insecurely.

There is one potential downside.

This changes the timing of when passwords are filled, so if a site is setting or clearing the contents of login fields, it’s possible it could blow away a login that previously hadn’t been filled yet. For example, I’ve seen sites clear fields as a poor security measure (I guess to deter password managers?), as well as set fields (to values like “Enter username here”). The latter is a great example of when and why the HTML5 placeholder attribute should be used.

I don’t think this change is a significant risk, and is definitely worth it for the other benefits. But should you run across it, there will be a signon.useDOMFormHasPassword pref that can be flipped to revert to the old code. (This is only a temporary pref, it will go away once we’re sure this isn’t a big problem.) And, again, if you find problems please file a bug containing debug info.

On Firefox’s Password Manager

It’s been a while since I last blogged about Firefox’s password manager. No big deal, it really hasn’t fundamentally changed since since I rewrote it 6 years ago for Firefox 3.0. Well, ok, the storage backend did switch to SQLite, but that’s mostly invisible to users. There’s a noteworthy change coming soon (see next post!), but I figured that would be easier to explain if I first talked about how things currently work. Which I’ve been meaning to do for a long time, ahem.

The first thing you should know is that there is no standard or specification for how login pages work! The browser isn’t involved with the login process, other than to do generic things like loading pages and running javascript. So we basically have to guess about what’s going on in order to fill or save a username/password, and sometimes sites can do things that break this guesswork. I’m actually surprised I don’t get questions about this more often.

So let’s talk about how two of the main functions work — filling in known logins, and saving new logins.

Filling in a known login

The overall process for filling in an existing stored login is simple and brutish.

  1. Use the chrome docloader service and nsIWebProgress to learn when we start loading a new page.
  2. Add a DOMContentLoaded event listener to learn when that page has mostly loaded.
  3. When that event fires…
    1. Check to see if there are any logins stored for this site. If not, we’re done.
    2. Loop through each form element on the page…
      1. Is there an <input type=password> in the form? If not, skip form.
      2. Check to see if any known logins match the particular conditions of this form. If not, skip form.
      3. Check to see if various other conditions prevent us from using the login in this form.
      4. Fill in the username and/or password. Great success!

Phew! But it’s the details of looking at a form where things get complex.

To start with, where do the username and password go? The password is fairly obvious, because we can look for the password-specific input type. (What if there’s more than one? Then we ignore everything after the first.) There’s no unique “username” type, instead we just look for the first regular input field before the password field. At least, that was before HTML5 added more input types. Now we also allow types that could be plausibly be usernames, like <input type=email> (but not types like <input type=color>). Note that this all relies on the order of fields in the DOM — we can’t detect cases where a username is intended to go after the password (thankfully I’ve never seen anyone do this), or cases where other text inputs are inserted between the actual username field and the password (perhaps with a table or CSS to adjust visual ordering).

And then there’s the quirks of the fields themselves. If your username is “alice”, what should happen if the username field already has “bob” filled in? (We don’t change it or fill the password.) Or, more common and depressing, what if the username field already contains “Enter your sign in name here”? In Mongolian? (We treat it like “bob”.) What if the page has <input maxlength=8> but your username is “billvonweiterheimerschmidt”? (We avoid filling anything the user couldn’t have typed.)

And then there’s the quirks of the saved logins. What if the username field already has “ALICE” instead of “alice”? (We ignore case when filling, it’s a little trickier when saving.) Is there a difference between <input name=user> and <input name=login>? (Nope, we stopped using the fieldname in Firefox 3 because it was being used inconsistently, even within a site.) What about a site has both a username+password _and_ a separate password-like PIN? (Surprisingly, we were able to make that work! Depending on the presence of a username field, we prefer one login or the other.)

And then. And then and then and then. Like I said, there’s no spec, and sometimes a site’s usage can break the guesses we make.

Saving a new/changed login

In comparison, the process for saving a login is simpler.

  1. Watch for any form submissions (via a chrome earlyformsubmit observer)
  2. Given a form submission, is there a password field in it? If not, we’re done.
  3. Determine the username and password from the form, and compare with existing logins…
    • If username is new, ask if user wants to save this new login
    • If username exists but the password is different, ask if user wants to change the saved password
    • If username and password are already saved, there’s nothing else to do.

Of course, there are still a number of complicating details!

This whole process is initiated by a form submission. If a site doesn’t actually submit a form (via form.submit() or <button type=submit>), but just runs some JavaScript to process the login, the password manager won’t see it. And thus can’t offer to save a new/changed login for the user. (Note that this is easy for a site to work around — do your work in the form’s onsubmit, but return |false| to cancel the submission).

Oh, and there’s still the same question as before — how to determine which fields are the username and password? We reuse the same algorithm as when filling a page, for consistency. But there are a few wrinkles. The form might be changing a password, so there could be up to 3 relevant password fields (for the cases of: just a new password, old and new passwords, and old + new + confirm.). And the password fields could be in any order! (Surprisingly, again, this works.) The most common problem I’ve seen is an account signup page with other form fields between the username and password, such as “choose a user name, enter your shipping address, set a password”. The password manager will guess wrong and think your zipcode is your username.

Oh, and somewhere in all this I should mention how differences in URLs can prevent filling in a login (or result in saving a seemingly-duplicate login). Clearly google.com and yahoo.com logins should be separate. But we also match on protocol, so that a https://site.com login will not be filled in on http://site.com. And what about http://www.foo.com and foo.com or accounts.foo.com? (We treat them separately.) What about mail.mozilla.com and people.mozilla.com? (Also separate.) What you might not realize is that we also use the form’s action URL (i.e., where the form is submitted to), ever since bug 360493. While this prevented sending your myspace.com password to evilhacker.com, it also breaks things when a site uses slightly different URLs in various login pages, or later changes where their login forms submit to.

Oh, bother.

All the gory details

This is one of the benefits of being Open Source. If you want to see alllll the gory details of how the Firefox password manager works, you can look at the source code. See http://mxr.mozilla.org/mozilla-central/source/toolkit/components/passwordmgr/. In particular, LoginManagerContent.jsm contains the code implementing the stuff discussed in this post, with the main entry points being onContentLoaded() and onFormSubmit().

Finally (!), I’ll mention that the Firefox password manager has some built in logging to help with debugging problems. If you find it not working in some particular case, you can enable the logging and it will often make the problem clear — or at least clearer to a developer when you file a bug!

IRC notifications

I currently use IRSSI, a traditional text-based client for IRC. I’ve used various GUI clients in the past, but found them all to have bugs and quirks that drove me crazy. IRSSI’s been very good to me, but it’s lacking one major feature that every GUI client has — notifications. And so unless I actually *look* at my IRC terminal, I can’t tell if someone’s trying to get my attention. To make things even worse, I run IRSSI under screen on a Mozilla server (so it’s always online even if my laptop isn’t). That makes it tricky to get a notification on my local system when the trigger is on a remote system.

Back in 2010 Justin Dow blogged about one way to get this working. It’s a little complicated… You first configure IRSSI to log notifications to a text file, the run one normal interactive SSH connection to screen+IRSSI, and another SSH connection to pipe the IRSSI log back to a local script that feeds it into Growl. I didn’t want to do that.

I tried using iTerm 2 for a while. Among its many bells and whistles is the ability to do actions (like notifications) upon a regex match. Neat little feature, and was easy to get working for someone saying “dolske” in whatever channel I had active. But background mentions don’t send any obvious string, so that was a huge limitation. (I tried regexing for the ANSI control codes that hilight an IRSSI window number, but that didn’t work reliably.)

I recently read that as of OS X 10.7, the default Terminal.app includes a simple feature — whenever it receives a bell character, it will bounce the Dock icon and annotate the icon with a number. Perfect! I got this all working, so here’s what you can do:

1. Configure Terminal.app. Actually, you don’t need to. This works automatically. But Preferences –> Settings –> Advanced –> Audible Bell may be something you want to disable if you dislike terminals making noise. Or enable it for debugging this.

2. Configure IRSSI to beep on private messages and hilights: /set beep_msg_level MSGS HILIGHT

3. Configure Screen to allow passing through beeps. You can put “vbell off” in ~/.screenrc and/or toggle it interactively via Control-A, Control-G. An easy test for this is to enable Audible Bell (see step 1) and use IRSSI’s /beep command. If you hear a sound, you’re good to go.

Note that IRSSI also has a “/set bell_beeps on|off” toggle. It’s not needed for the above; it seems to be for filtering out bell characters from channel messages, but the docs are somewhat vague (and a quick skim of the source wasn’t any more enlightening).

Design a site like this with WordPress.com
Get started