8 Reasons The Voight-Kampff Machine is shit (and a redesign to fix it)

Distinguishing replicants from humans is a tricky business. Since they are indistinguishable biologically, it requires an empathy test, during which the subject hears empathy-eliciting scenarios and is watched carefully for telltale signs such as, “capillary dilation, the so-called blush response…fluctuation of the pupil…involuntary dilation of the iris.” To aid in this examination, the blade runner uses a portable machine called the Voight-Kampff machine, named, presumably, for its inventors.

The device is the size of a thick laptop computer, and rests flat on the table between the blade runner and subject. When the blade runner prepares the machine for the test, they turn it on, and a small adjustable armature rises from the machine, the end of which is an intricate piece of hardware, housing a powerful camera, glowing red.

The blade runner trains this camera on one of the subject’s eyes. Then, while reading from the playbook of scenarios, they keep watch on a large monitor, which shows a magnified image of the subject’s eye. (Ostensibly, anyway. More on this below.) A small bellows on the subject’s side of the machine raises and lowers. On the blade runner’s side of the machine, a row of lights reflects the volume of the subject’s speech. Three square, white buttons sit to the right of the main monitor. In Leon’s test we see Holden press the leftmost of the three, and the iris in the monitor becomes brighter, illuminated from some unseen light source. The purpose of the other two square buttons is unknown. Two smaller monochrome monitors sit to the left of the main monitor, showing moving but otherwise inscrutable forms of information.

In theory, the system allows the blade runner to more easily watch for the minute telltale changes in the eye and blush response, while keeping a comfortable social distance from the subject. Substandard responses reveal a lack of empathy and thereby a high probability that the subject is a replicant. Simple! But on review, it’s shit. I know this is going to upset fans, so let me enumerate the reasons, and then propose a better solution.

-2. Wouldn’t a genetic test make more sense?

If the replicants are genetically engineered for short lives, wouldn’t a genetic test make more sense? Take a drop of blood and look for markers of incredibly short telomeres or something.

-1. Wouldn’t an fMRI make more sense?

An fMRI would reveal empathic responses in the inferior frontal gyrus, or cognitive responses in the ventromedial prefrontal cortex. (These are the brain structures responsible for these responses.) Certainly more expensive, but more certain.

0. Wouldn’t a metal detector make more sense?

If you are testing employees to detect which ones are the murdery ones and which ones aren’t, you might want to test whether they are bringing a tool of murder with them. Because once they’re found out, they might want to murder you. This scene should be rewritten such that Leon leaps across the desk and strangles Holden, IMHO. It would make him, and other blade runners, seem much more feral and unpredictable.

(OK, those aren’t interface issues but seriously wtf. Onward.)

1. Labels, people

Controls need labels. Especially when the buttons have no natural affordance and the costs of experimentation to discover the function are high. Remembering the functions of unlabeled controls adds to the cognitive load for a user who should be focusing on the person across the table. An illuminated button helps signal the state, so that, at least, is something.

2. It should be less intimidating

The physical design is quite intimidating: The way it puts a barrier in between the blade runner and subject. The fact that all the displays point away from the subject. The weird intricacy of the camera, its ominous HAL-like red glow. Regular readers may note that the eyepiece is red-on-black and pointy. That is to say, it is aposematic. That is to say, it looks evil. That is to say, intimidating.

I’m no emotion-scientist, but I’m pretty sure that if you’re testing for empathy, you don’t want to complicate things by introducing intimidation into the equation. Yes, yes, yes, the machine works by making the subject feel like they have to defend themselves from the accusations in the ethical dilemmas, but that stress should come from the content, not the machine.

2a. Holden should be less intimidating and not tip his hand

While we’re on this point, let me add that Holden should be less intimidating, too. When Holden tells Leon that a tortoise and a turtle are the same thing, (Narrator: They aren’t) he happens to glance down at the machine. At that moment, Leon says, “I’ve never seen a turtle,” a light shines on the pupil and the iris contracts. Holden sees this and then gets all “ok, replicant” and becomes hostile toward Leon.

In case it needs saying: If you are trying to tell whether the person across from you is a murderous replicant, and you suddenly think the answer is yes, you do not tip your hand and let them know what you know. Because they will no longer have a reason to hide their murderyness. Because they will murder you, and then escape, to murder again. That’s like, blade runner 101, HOLDEN.

3. It should display history 

The glance moment points out another flaw in the interface. Holden happens to be looking down at the machine at that moment. If he wasn’t paying attention, he would have missed the signal. The machine needs to display the interview over time, and draw his attention to troublesome moments. That way, when his attention returns to the machine, he can see that something important happened, even if it’s not happening now, and tell at a glance what the thing was.

4. It should track the subject’s eyes

Holden asks Leon to stay very still. But people are bound to involuntarily move as their attention drifts to the content of the empathy dilemmas. Are we going to add noncompliance-guilt to the list of emotional complications? Use visual recognition algorithms and high-resolution cameras to just track the subject’s eyes no matter how they shift in their seat.

5. Really? A bellows?

The bellows doesn’t make much sense either. I don’t believe it could, at the distance it sits from the subject, help detect “capillary dilation” or “ophthalmological measurements”. But it’s certainly creepy and Terry Gilliam-esque. It adds to the pointless intimidation.

6. It should show the actual subject’s eye

The eye color that appears on the monitor (hazel) matches neither Leon’s (a striking blue) nor Rachel’s (a rich brown). Hat tip to Typeset in the Future for this observation. His is a great review.

7. It should visualize things in ways that make it easy to detect differences in key measurements

Even if the inky, dancing black blob is meant to convey some sort of information, the shape is too organic for anyone to make meaningful readings from it. Like seriously, what is this meant to convey?

The spectrograph to the left looks a little more convincing, but it still requires the blade runner to do all the work of recognizing when things are out of expected ranges.

8. The machine should, you know, help them

The machine asks its blade runner to do a lot of work to use it. This is visual work and memory work and even work estimating when things are out of norms. But this is all something the machine could help them with. Fortunately, this is a tractable problem, using the mighty powers of logic and design.

Pupillary diameter

People are notoriously bad at estimating the sizes of things by sight. Computers, however, are good at it. Help the blade runner by providing a measurement of the thing they are watching for: pupillary diameter. (n.b. The script speaks of both iris constriction and pupillary diameter, but these are the same thing.) Keep it convincing and looking cool by having this be an overlay on the live video of the subject’s eye.

So now there’s some precision to work with. But as noted above, we don’t want to burden the user’s memory with having to remember stuff, and we don’t want them to just be glued to the screen, hoping they don’t miss something important. People are terrible at vigilance tasks. Computers are great at them. The machine should track and display the information from the whole session.

Note that the display illustrates radius, but displays diameter. That buys some efficiencies in the final interface.

Now, with the data-over-time, the user can glance to see what’s been happening and a precise comparison of that measurement over time. But, tracking in detail, we quickly run out of screen real estate. So let’s break the display into increments with differing scales.

There may be more useful increments, but microseconds and seconds feel pretty convincing, with the leftmost column compressing gradually over time to show everything from the beginning of the interview. Now the user has a whole picture to look at. But this still burdens them with noticing when these measurements are out of normal human ranges. So, let’s plot the threshold, and note when measurements fall outside of that. In this case, it feels right that replicants display less than normal pupillary dilation, so it’s a lower-boundary threshold. The interface should highlight when the measurement dips below this.
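To make the lower-boundary idea concrete, here is a minimal sketch in Python of how the machine might find the stretches of the interview to highlight. Everything in it is an assumption for illustration: the function name, the sample readings, and especially the 3.5 mm threshold, which is made up and not a real human norm.

```python
from typing import List, Tuple


def below_threshold_spans(
    samples: List[Tuple[float, float]], threshold_mm: float
) -> List[Tuple[float, float]]:
    """Return (start_time, end_time) spans where readings stay below the threshold.

    samples is a time-ordered list of (seconds, diameter_mm) pairs.
    """
    spans = []
    start = None   # time the current below-threshold run began
    last_t = None  # most recent time that was still below threshold
    for t, value in samples:
        if value < threshold_mm:
            if start is None:
                start = t
            last_t = t
        elif start is not None:
            # The reading recovered, so close the current run.
            spans.append((start, last_t))
            start = None
    if start is not None:
        # The interview ended while still below threshold.
        spans.append((start, last_t))
    return spans


# Hypothetical readings: (seconds into interview, pupil diameter in mm)
readings = [(0.0, 4.1), (0.5, 4.0), (1.0, 3.2), (1.5, 3.1), (2.0, 4.2), (2.5, 3.0)]
print(below_threshold_spans(readings, threshold_mm=3.5))
# [(1.0, 1.5), (2.5, 2.5)]
```

Each returned span is a region the display could shade, so a blade runner glancing back at the screen sees that something dipped, and when.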

Blush

I think that covers everything for the pupillary diameter. The other measurement mentioned in the dialogue is capillary dilation of the face, or the “so-called blush response.” As we did for pupillary diameter, let’s also show a measurement of the subject’s skin temperature over time as a line chart. (You might think skin color is a more natural measurement, but for replicants with a darker skin tone than our two pasty examples Leon and Rachel, temperature via infrared is a more reliable metric.) For visual interest, let’s show thumbnails from the video. We can augment the image with degree-of-blush. Reduce the image to high contrast grayscale, use visual recognition to isolate the face, and then provide an overlay to the face that illustrates the degree of blush.

But again, we’re not just looking for blush changes. No, we’re looking for blush compared to human norms for the test. It would look different if we were looking for more blushing in our subject than humans, but since the replicants are less empathetic than humans, we would want to compare and highlight measurements below a threshold. In the thumbnails, the background can be colored to show the median for expected norms, to make comparisons to the face easy. (Shown in the drawing to the right, below.) If the face looks too pale compared to the norm, that’s an indication that we might be looking at a replicant. Or a psychopath.

So now we have solid displays that help the blade runner detect pupillary diameter and blush over time. But it’s not that any diameter changes or blushing is bad. The idea is to detect whether the subject has less of a reaction than norms to what the blade runner is saying. The display should be annotating what the blade runner has said at each moment in time. And since human psychology is a complex thing, it should also track video of the blade runner’s expressions as well, since, as we see above, not all blade runners are able to maintain a poker face. HOLDEN.

Anyway, we can use the same thumbnail display of the face, without augmentation. Below that we can display the waveform (because they look cool), and speech-to-text the words that are being spoken. To ensure that the blade runner’s administration of the test is not unduly influencing the results, let’s add an overlay of the ideal intonation targets. Despite evidence in the film, let’s presume Holden is a trained professional, and he does not stray from those targets, so let’s skip designing the highlight and recourse-for-infraction for now.

Finally, since they’re working from a structured script, we can provide a “chapter” marker at the bottom for easy reference later.

Now we can put it all together, and it looks like this. One last thing we can do to help the blade runner is to highlight when all the signals indicate replicant-ness at once. This signal can’t be too much, or replicants being tested would know from the light on the blade runner’s face when their jig is up, and try to flee. Or murder. HOLDEN.

For this comp, I added a gray overlay to the column where pupillary and blush responses both indicated trouble. A visual designer would find some more elegant treatment.
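The logic behind that gray overlay is simple enough to sketch. Assuming each signal has already been reduced to one true/false flag per time column (true meaning "below the human norm here"), a hypothetical helper can pick out the columns where every signal agrees. The signal names and sample flags below are invented for illustration.

```python
from typing import Dict, List


def columns_where_all_flagged(flags_by_signal: Dict[str, List[bool]]) -> List[int]:
    """Return indexes of time columns in which every signal is flagged."""
    # zip(*...) walks the signals column by column.
    columns = zip(*flags_by_signal.values())
    return [i for i, column in enumerate(columns) if all(column)]


# Hypothetical per-column flags for a four-column stretch of the interview.
flags = {
    "pupil": [False, True, True, False],
    "blush": [False, False, True, True],
}
print(columns_where_all_flagged(flags))
# [2]
```

Only column 2 would get the overlay here, because it is the only one where both pupillary and blush responses indicate trouble.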

If we were redesigning this from scratch, we could specify a wide display to accommodate this width. But if we are trying to squeeze this display into the existing prop from the movie, here’s how we could do it.

Note the added labels for the white squares. I picked some labels that would make sense in the context. “Calibrate” and “record” should be obvious. The idea behind “mark” is an easy button for the blade runner to press when they see something that looks weird, like when doctors manually annotate cardiograph output.

Lying to Leon

There’s one more thing we can add to the machine that would help out, and that’s a display for the subject. Recall the machine is meant to test for replicant-ness, which happens to equate to murdery-ness. A positive result from the machine needs to be handled carefully so what happens to Holden in the movie doesn’t happen. I mentioned making the positive-overlay subtle above, but we can also make a placebo display on the subject’s side of the interface.

The visual hierarchy of this should make the subject feel like its purpose is to help them, but the real purpose is to make them think that everything’s fine. Given the script, I’d say a teleprompt of the empathy dilemma should take up the majority of this display. Oh, they think, this is to help me understand what’s being said, like a closed caption. Below the teleprompt, at a much smaller scale, a bar at the bottom is the real point.

On the left of this bar, a live waveform of the audio in the room helps the subject know that the machine is testing things live. In the middle, we can put one of those bouncy fuiget displays that clutters so many sci-fi interfaces. It’s there to be inscrutable, but convince the subject that the machine is really sophisticated. (Hey, a diegetic fuiget!) Lastly, and this is the important part, an area shows that everything is “within range.” This tells the subject that they can be at ease. This is good for the human subject, because they know they’re innocent. And if it’s a replicant subject, this false comfort protects the blade runner from sudden murder. This text might flicker or change occasionally to something ambiguous like “at range,” to convey that it is responding to real world input, but it would never change to something incriminating.
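The key property of that status area is that the real result never reaches it. Here is a small Python sketch of the idea; the function name, the flicker rate, and the exact wording of the two states are assumptions, not anything specified in the film.

```python
import random

# The only strings the subject-facing display is ever allowed to show.
REASSURING_STATES = ("WITHIN RANGE", "AT RANGE")


def subject_status(actual_result_is_positive: bool, rng: random.Random,
                   flicker_chance: float = 0.1) -> str:
    """Pick the status text for the subject's side of the machine.

    The real test result is accepted but deliberately ignored, so the
    display can never leak it.
    """
    _ = actual_result_is_positive  # intentionally unused
    if rng.random() < flicker_chance:
        return "AT RANGE"      # occasional ambiguous flicker
    return "WITHIN RANGE"      # the default, reassuring state


rng = random.Random(2019)  # seeded only so the demo is repeatable
print({subject_status(True, rng) for _ in range(200)} <= set(REASSURING_STATES))
# True
```

Whether the subject is a human or a replicant, the set of things the display can say is the same.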

This way, once the blade runner has the data to confirm that the subject is a replicant, they can continue to the end of the module as if everything was normal, thank the replicant for their time, and let them leave the room believing they passed the test. Then the results can be sent to the precinct and authorizations returned so retirement can be planned with the added benefit of the element of surprise.

OK

Look, I’m sad about this, too. The Voight-Kampff machine is cool. It fits very well within the art direction of the Blade Runner universe. This coolness burned the machine into my memory when I saw this film the first dozen times, but despite that, it just doesn’t stand up to inspection. It’s not hopeless, but does need a lot of thinkwork and design to make it really fit to task, and convincing to us in the audience.

Remote wingman via EYE-LINK

EYE-LINK is an interface used between a person at a desktop who uses support tools to help another person who is live “in the field” using Zed-Eyes. The working relationship between the two is very like Vika and Jack in Oblivion, or like the A.I. in Sight.

In this scene, we see EYE-LINK used by a pick-up artist, Matt, who acts as a remote “wingman” for pick-up student Harry. Matt has a group video chat interface open with paying customers eager to lurk, comment, and learn from the master.

Harry’s interface

Harry wears a hidden camera and microphone. This is the only tech he seems to have on him, only hearing his wingman’s voice, and only able to communicate back to his wingman by talking generally, talking about something he’s looking at, or using pre-arranged signals.

Tap your beer twice if this is more than a little creepy.

Matt’s interface

Matt has a three-screen setup:

  1. A big screen (similar to the Samsung Series 9 displays) which shows a live video image of Harry’s view.
  2. A smaller transparent information panel for automated analysis, research, and advice.
  3. An extra, laptop-like screen where Matt leads a group video chat with a paying audience, who are watching and snarkily commenting on the wingman scenario. It seems likely that this is not an official part of the EYE-LINK software.
Please make a note of the hilarious and condemning screen names of the peanut gallery: Pie Ape, Popkorn, El Nino, Nixon, Fappucino [sic], Stingray, I_AM_WALDO, and Wigwam.

Harry communicates to Matt by speaking or enacting a crude sign language for the video camera. Matt communicates back to Harry using an audiolink through a headset. Setting up the connection is similar to Skype/Hangouts (even featuring an icon of an archaic laptop). Every first-person EYE-LINK view is characterized by a pixelated gradient at the sides of the screen.

Matt’s wingman support tools

We see that Matt has a number of tools to help him act as a remote wingman for Harry, evident through six main navigation items on his side screen…A home icon, Web, News, Image, Video, and Social Media. The home icon is always bright white, but the section he’s currently viewing is a bolded gray.  

In the Image mode, it runs face recognition on a still image from Matt’s video feed, and provides its best match for further research.


Somehow he can also get information on the event that Harry is attending. In this view, there’s a floor plan of the venue, which Matt can use to instruct Harry.


OK. This is of course a creepy use of this interface, but it’s easy to imagine scenarios where something like the EYE-LINK is used virtuously:

  • A nurse practitioner needing to call on the expertise of a remote, more senior caregiver.
  • An airplane maintenance worker needing to speak to the aircraft engineers about a problem she’s encountering.
  • Paintball players coordinating their game through a centralized team captain.

So with that in mind, let’s review this with the caveat that of course the specific wingman scenario is super creepy.

Analysis: Harry’s feedback

The communication channel back from Harry to Matt doesn’t need to be too rich for these purposes, but there are ways that it could be richer. Of course Harry could pick up his phone and simply type something that Matt could see. But if the communication needed to be undetectable to a casual observer, there are other options. Subvocalization is nascent, but a possibility and mostly-natural for the speaker.

Image courtesy of the NASA Ames Research demo of subvocalization.

If the remote user has time for training, subgestural detection might be another option. This is like subvocal detection, but instead of detecting throat movements used in speech, it would be an armband (like the Myo) that could detect gentle finger presses allowing the user chorded keyboard input which he could use while, say, gripping the beer bottle.


Either way, richer “undetectable” communication mechanisms exist, and could be incorporated.
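As a rough illustration of the chorded-input idea, here is a sketch in Python that turns a set of pressed fingers into a pre-arranged message. The finger-to-bit assignment and the message table are entirely hypothetical; a real armband would have its own API and a much richer vocabulary.

```python
from typing import Iterable

# Hypothetical assignment of one bit per finger.
FINGER_BITS = {"index": 0b0001, "middle": 0b0010, "ring": 0b0100, "pinky": 0b1000}

# Hypothetical pre-arranged chords and what they mean.
CHORD_MESSAGES = {
    0b0001: "yes",
    0b0010: "no",
    0b0011: "say that again",
    0b1100: "get me out of here",
}


def decode_chord(pressed_fingers: Iterable[str]) -> str:
    """Combine the pressed fingers into a bitmask and look up its message."""
    mask = 0
    for finger in pressed_fingers:
        mask |= FINGER_BITS[finger]
    return CHORD_MESSAGES.get(mask, "unrecognized chord")


print(decode_chord({"index", "middle"}))
# say that again
```

Four fingers give fifteen possible chords, which is plenty for a small signal vocabulary that can be tapped out against a beer bottle.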

Analysis: Graphics

One of the refreshing things about the interfaces in Black Mirror generally, and these screens in particular, is how understated they are, especially compared to the Rococo interfaces that populate much of sci-fi. (Compare the two below.)

The color palette is spartan grayscale. The typeface is Helvetica (or adjacent). Nothing 3D, nothing swoopy, no complexity for complexity’s sake.

Analysis: Navigation and layout

The navigation for the information panel is a little confusing. Sure, it looks like lots of websites. But this chunking of information into separate screens requires that Matt hunt for information that’s of interest. Better would be to have a single, dynamic screen, and have the system do real-time parsing, providing suggestions and notifications in the context of the event. If he needed to dive down into some full-screen mode, let it fill the screen with some easy way to return to context.

Also, how did he get to the event view? Is that just a web view? What bar puts its floorplan on its site? There is no primary navigation element that would on first glance explain how he got there, or once there, how he might get back to other screens. The home icon is obscured. (Maybe this is designed by Apple, though, and has some entirely hidden swipe gesture or long press to request the event screen or force a return to home?) It’s really hard to say, and so fails affordance.

Analysis: Group chat

A quick look at any modern group video chat software shows that this is too pared down, with lots of audio and video controls missing, as well as controls for the “meeting.” It’s possible that these appear only if Matt interacts with the cursor on that laptop, but again, affordances.

Analysis: More wingman tools?

There are more tools that would be useful to a wingman’s job, which could be built even now—without the strong AI that this diegesis has. They could be more virtuous, like…

  • Ways to keep Harry calm, focused, and feeling confident.
  • Reminders of general best practices for making a good impression.
  • Automatic privacy blackout when Harry approaches people for conversation.

Or they could be…uh…more questionable. (Here I’ll confess to referencing The Game: Penetrating the Secret Society of Pickup Artists by Neil Strauss, for how a real PUA might handle it.)

  • A transcript of the conversation with key phrases highlit, indicating the “target’s” attitudes and levels of interest.
  • Personality analysis on social media, listing derived topics that these particular “targets” would find engaging.
  • A list of Harry’s practiced “routines” for Matt to quickly review, and suggest. The AI could even highlight its best-guess suggestion.
  • Counts of “indicators of interest.”
  • An overview of Matt’s favored stages of pickup, with an indicator of where Harry is and how well he performed on the prior stages.

Either way, the support that these tools are offering is pretty minimal compared to what could be done, but then again, that kind of fits the story. Yes, the creepiness of the remote wingman support tools is part of the point. But the whole reason the peanut gallery pays for the honor of watching Matt coach Harry is (yes, voyeurism, but also) to witness a master wingman at his work. If the system was too much of a support, the peanut gallery would be less incentivized to pay to see him in action.

Green Laser Scan

In a very brief scene, Theo walks through a security arch on his way into the Ministry of Energy. After waiting in queue, he walks towards a rectangular archway. At his approach, two horizontal green laser lines scan him from head to toe. Theo passes through the arch with no trouble.


Though the archway is quite similar to metal detection technology used in airports today, the addition of the lasers hints at additional data being gathered, such as surface mapping for a face-matching algorithm.

We know that security mostly cares about what’s hidden under clothes or within bodies and bags, rather than confirming the surface that security guards can see, so it’s not likely to be an actual technological requirement of the scan. Rather it is a visual reminder to participants and onlookers that the scan is in progress, and moreover that the Ministry is a secured space.

Though we could argue that the signal could be made more visible, laser light is very eye-catching, and human eyes are most sensitive at around 555 nm. This bright green is close to that peak: it matches the 532 nm output of common green lasers, which are typically pumped by an 808 nm diode. So as an economical but eye-catching signal, this green laser is a perfect choice.

TETVision


The TETVision display is the only display Vika is shown interacting with directly—using gestures and controls—whereas the other screens on the desktop seem to be informational only. This screen is broken up into three main sections:

  1. The left side panel
  2. The main map area
  3. The right side panel

The left side panel

The communications status is at the top of the left side panel and shows Vika the status of whether the desktop is online or offline with the TET as it orbits the Earth. Directly underneath this is the video communications feed for Sally.

Beneath Sally’s video feed is the map legend section, which serves the dual purposes of providing data transfer to the TET and to the Bubbleship as well as a simple legend for the icons used on the map.

The communications controls, which are at the bottom of the left side panel, allow Vika to toggle the audio communications with Jack and with Sally.

The main map area

The largest section is the viewport where the various live feeds are displayed. The main map, which serves as a radar, as well as the remote video feeds she uses to monitor Jack are both in this section of the display.

The right side panel

The panel on the right side of the map contains the video feed controls, which allow Vika to toggle between live footage from the Bubbleship, the TET, and of course, the main map view.

Although never shown in use in the film, the bottom right of the screen houses the tower rotation controls. This unused control is the only indication the capability even exists, so it is unknown whether the tower rotates 360 degrees or whether it’s limited to set points. (More on this below.)

It has robust capabilities


At one point in the movie, Vika is able to use the drones to search for bio trail signatures when Jack is abducted by the scavs.


Vika is also able to detect and decode various types of signals such as the morse code message sent by Jack or the rogue signal sent out by the scavs.


And, probably unbeknownst to Jack and Vika, the TETVision can be controlled remotely from the TET to allow Sally access to the data stored on the desktop—as shown at one point in the movie, when Sally pulls up a past bio trail signature to send drones after Jack and the scavs.

It’s missing a critical layer of data


At the beginning of the film, as Jack heads toward the downed drone 166, he suddenly encounters a dangerous lightning storm and nearly plunges to his death when the Bubbleship loses power. His signature disappears from the TETVision map, but from Vika’s perspective there is no indication as to what could have happened — or that there was any danger to begin with.


Since the weather is unstable and constantly changing, it would have been better to include a weather overlay so that Vika could have notified Jack of the storm—allowing him to fly around it instead of straight into it.
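With a weather layer in place, the system could even do the checking for her. Below is a minimal Python sketch that tests whether a straight flight leg passes through a circular storm cell. It assumes flat map-grid coordinates and a storm modeled as a simple circle; both are simplifications, and all names and numbers are invented for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def path_crosses_storm(start: Point, end: Point,
                       storm_center: Point, storm_radius: float) -> bool:
    """True if the straight segment from start to end enters the storm circle."""
    (x1, y1), (x2, y2) = start, end
    cx, cy = storm_center
    dx, dy = x2 - x1, y2 - y1
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        t = 0.0  # start and end coincide, so just test that one point
    else:
        # Project the storm center onto the segment, clamped to its ends.
        t = ((cx - x1) * dx + (cy - y1) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
    nearest_x, nearest_y = x1 + t * dx, y1 + t * dy
    return math.hypot(cx - nearest_x, cy - nearest_y) <= storm_radius


# Hypothetical leg from the tower (0, 0) toward a downed drone at (10, 0).
print(path_crosses_storm((0, 0), (10, 0), storm_center=(5, 1), storm_radius=2))
# True
```

A true result is the cue for the map to flag the leg, giving Vika the chance to route Jack around the storm.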

It’s got some useless bits


The tower rotation controls are never shown in use in the film, so it’s not clear what benefit rotating the tower would serve. The main purpose of their mission is to ensure the hydro-rigs are secure and functioning properly, not getting an optimal view.


The tower is almost completely surrounded by windows as it is. And since the tower windows already face the hydro-rigs, what would be the benefit of changing vantage points?

It seems that the space could be used for something more beneficial to Vika such as bike, hydro-rig and drone cam feeds. This would provide Vika with more eyes on the ground, allowing her the additional support to keep Jack safe and monitor scav activity.

From a clustering standpoint, it would also fall in line logically with the other feed controls on the right side panel.

And some unnecessary visual feedback


Towards the end of the movie, Sally is trying to find Jack and the scavs. She accesses Vika’s desktop remotely in order to pull up the bio trail records. Although no one is around to see the information, the TETVision displays the process as it happens. Of course, this is necessary for the narrative to progress, but in a real-life situation Sally would only need to see the data on her side—not from the desktop in Tower 49. If they’ve managed interstellar travel, cloning, terraforming, and cognitive reprogramming of alien species, they’re not likely still using VNC. This type of interaction should simply run in the background and not be visible on screen.

Better: Provide useful visuals

When a drone picks up a bio trail signal, a visual of a DNA sequence is displayed. Since the analysis is being conducted by Sally on the TET, it seems that this information isn’t really useful to Vika at all.


From Vika’s point of view it seems like the actual trail would be more important, so why not show a drone cam feed complete with the HUD overlay? She could instantly gain more information by seeing that there are two bio trails—proving that Jack has been captured by the scavs and taken to another location.

Homing Beacon


After following a beacon signal, Jack makes his way through an abandoned building, tracking the source. At one point he stops by a box on the wall, as he sees a couple of cables coming out from the inside of it, and cautiously opens it.

The repeater

I can’t talk much about interactions on this one given that he does not do much with it. But I guess readers might be interested to know about the actual prop used in the movie, so after zooming in on a screen capture and a bit of help from Google I found the actual radio.

When Jack opens the box he finds the repeater device inside. He realizes that it’s connected to the building structure, using it as an antenna, and over their audio connection asks Vika to decrypt the signal.

The desktop interface

Although this sequence centers around the transmission from the repeater, most of the interactions take place on Vika’s desktop interface. A modal window on the display shows her two slightly different waveforms that overlap one another. But it’s not clear at all why the display shows two signals instead of just one, let alone what the second signal means.

After Jack identifies it as a repeater and asks her to decrypt the signal, Vika touches a DECODE button on her screen. With a flourish of orange and white, the display changes to reveal a new panel of information, providing a LATITUDE INPUT and LONGITUDE INPUT, which eventually resolve to 41.146576 -73.975739. (Which, for the curious, resolves to Stelfer Trading Company in Fairfield, Connecticut here on Earth. Hi, M. Stelfer!) Vika says, “It’s a set of coordinates. Grid 17. It’s a goddamn homing beacon.”

At the control tower Vika was already tracking the signal through her desktop interface. As she hears Jack’s request, she presses the decrypt button at the top of the signal window to start the process.

When you look at the display, the decrypt button is already there for her to press. So either the computer already knows there is an encryption going on, or the user can press the decrypt button at any time, regardless of whether the signal is encrypted or not. In both cases, it’s bad interaction design.

An issue of agentive tech

If the computer already knows that the signal is encrypted, why doesn’t it tell her that? It should automatically handle the decryption, alert her that it was decrypted, and show the lat/long results on the screen. If it’s wrong, she can dismiss it. But let’s not rely on her consultation of a stoic guru just to find out. (It doesn’t even make sense from the TET’s perspective.) In this way you simplify the interface—as you no longer need a “decrypt” button—and help Vika and Jack with their goals more effectively.

Needs more states

From the sequence you can tell that the decrypt button has only two states, OFF and ON. To improve the interface, we’d want to have a few more states, indicating CONFIDENCE, PROCESSING, and of course if it’s wrong, the opportunity to DISMISS. Each of these would need specific designing for microinteractions, but these two states aren’t enough.
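To make the extra states concrete, here is a minimal sketch of the decrypt control as a small state machine. Everything in it is my own assumption for illustration (the state names, the event names, the transition table); nothing like it is shown in the film.

```python
from enum import Enum, auto


class DecryptState(Enum):
    IDLE = auto()        # no encrypted signal detected
    PROCESSING = auto()  # decryption under way
    RESULT = auto()      # decoded payload shown, with a confidence readout
    DISMISSED = auto()   # operator rejected the result


# Allowed transitions. Event names are hypothetical.
TRANSITIONS = {
    (DecryptState.IDLE, "signal_detected"): DecryptState.PROCESSING,
    (DecryptState.PROCESSING, "decoded"): DecryptState.RESULT,
    (DecryptState.PROCESSING, "signal_lost"): DecryptState.IDLE,
    (DecryptState.RESULT, "dismiss"): DecryptState.DISMISSED,
    (DecryptState.DISMISSED, "signal_detected"): DecryptState.PROCESSING,
}


def next_state(state, event):
    """Return the state after `event`, or stay put if the event does not apply."""
    return TRANSITIONS.get((state, event), state)
```

Note that nothing here requires a button press to get from IDLE to PROCESSING. The system starts on its own when it detects an encrypted signal, and Vika's only required input is the optional dismissal.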

What if those weren’t coordinates?

When Vika presses the decrypt button we can see it expands the bottom part of the window, adding some encryption-related info. And at the very bottom of the interface there are a couple of labels that read LONGITUDE INPUT and LATITUDE INPUT. Not the best names, though, since it’s easy to mistake these for the coordinates of the signal source rather than the contents of the message itself. The numbers there start to change as the computer seems to be decoding the signal from the repeater and correcting the data in real time.

But the strange bit is those same coordinate inputs. It seems as if the computer already knows, before it finishes decrypting, that the signal is transmitting a set of longitude and latitude coordinates. I mean, what if the encrypted data wasn’t coordinates at all…say, an entry code to some scav station? It’s possible that there is some metadata in the signal that conveys this information, but if that was immediately available, again, the system should have told them.

Finally, there is no feedback whatsoever about the time needed to complete the decryption. It doesn’t do much harm here as it’s pretty fast, but I’m guessing that more complex transmissions might pass the threshold of attention, at which point it would become an issue.

What is out there?

This is the first thing Jack asks once he knows about the encrypted coordinates. And the interface designers thought about that one too, and placed a small button next to the coordinate labels. That button leads to another window with the map display. But not only that: if you look closely you can see that the button label also changes. While at first it reads MAP, after a few seconds the label changes to GRID, followed later by the number 17. And it keeps looping between those last two.


The changing labels are a way to add more info on the same screen real estate. If Vika happens to know the surroundings of sector 17 she could have told Jack there was nothing there without even looking at the map. In the next sequence we see Vika scrolling around the map view—hopefully it opened right at those coordinates, but even if she’s scrolling around to see if there’s anything of interest there, I’ll note that the location does not have a drop pin to let her re-orient.

Losing the signal

Just as Jack is cutting one of the wires from the repeater to shut down the transmission we get a view of the desktop interface again. The modal window that Vika was using to track and decode the signal suddenly closes. This is a nice use of affordances, as the animation itself shows Vika that the signal was interrupted from the source. A more common trope is a big “no signal” label, so this is nice to see.

image06
After Vika finishes the decryption of the coordinates from the signal, Jack takes his pliers to cut the wires going from the repeater to the building structure to shut down the transmission.
image02
Jack decides to shut down the transmission from the repeater. As he does so, the desktop closes the window that Vika was using to track the signal, emphasizing the action with a short sound warning.

The only issue I can see is that in some cases Vika would end up opening the modal window again immediately if she was in the middle of work. The computer should store the signal in memory and switch automatically from LIVE FEED to CACHE so she could continue.
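A rough sketch of that LIVE FEED to CACHE behavior, assuming the signal arrives as a stream of samples and that keeping only the most recent ones is acceptable. The class name, method names, and default capacity are all hypothetical.

```python
from collections import deque


class SignalWindow:
    """Keeps recent samples so the view can outlive the live signal."""

    def __init__(self, capacity=1000):
        # A bounded deque drops the oldest sample once it is full.
        self.buffer = deque(maxlen=capacity)
        self.mode = "LIVE FEED"

    def on_sample(self, sample):
        self.buffer.append(sample)
        self.mode = "LIVE FEED"

    def on_signal_lost(self):
        # Instead of closing the window, fall back to the stored copy.
        self.mode = "CACHE"

    def samples(self):
        return list(self.buffer)
```

The mode label does the job that the closing animation does in the film: it tells Vika the source went away, but without taking her work away with it.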

Mostly useable

So the desktop interface definitely has its issues, but it also has a few well-considered details. The main problem is that it withholds what it knows about the encryption from Vika. It shouldn’t. On the other hand, the interfaces have some clever information design, such as the space-saving labels and the animation which embodies the facts about the signal.

DuoMento, improved

Forgive me, as I am but a humble interaction designer (i.e., neither a professional visual designer nor video editor) but here’s my shot at a redesigned DuoMento, taking into account everything I’d noted in the review.

  • There’s only one click for Carl to initiate this test.
  • To decrease the risk of a false positive, this interface draws from a large category of concrete, visual and visceral concepts to be sent telepathically, and displays them visually.
  • It contrasts Carl’s brainwave frequencies (smooth and controlled) with Johnny’s (spiky and chaotic).
  • It reads both the brain of the sender and the receiver for some crude images from their visual cortex. (It would be better at this stage to have the actors wear some glowing attachment near the crown of the head to show how this information was being read.)

DuoMento_improved

These changes are the sort that even in passing would help tell a more convincing narrative by being more believable, and even illustrating how not-psychic Johnny really is.

DuoMento

Carl, a young psychic, has an application at home to practice and hone his mental powers. It’s not named in the film, so I’m going to call it DuoMento. We see DuoMento in use when Carl uses it to try and help Johnny find out if he has any latent psychic talent. (Spoiler alert: It doesn’t work.)

StarshipT_035

Setup

DuoMento challenges its users with blind matching tests. For it, the “thought projector” (Carl) sits in a chair at a desk with a keyboard and a desktop monitor before him. The “thought receiver” (Johnny) sits in a chair facing the thought projector, unable to see either the desktop monitor or the large, wall-mounted screen behind him, which duplicates the image from the desktop monitor. To the receiver’s right hand is a small elevated panel of around 20 white push buttons.

StarshipT_036StarshipT_037

Blind matching

For the test, two Hoyle playing cards appear on the screen side-by-side, face down. Carl presses a key on his keyboard, and one card flips over to reveal its face. Carl concentrates on the face-up card, attempting to project the identity of the card to Johnny. Johnny tries his best to receive the thought. It’s intense.

intense_520

When Johnny feels he has an answer, he says, “I see…Ace of Spades,” and reaches forward and presses a button on the elevated panel. In response, the hidden card flips over as the ace of spades. An overlay appears on top of the two cards indicating if it was a match. Lacking any psychic abilities, Johnny gets a big label reading “NO MATCH,” accompanied by a buzzer sound. Carl resets it to a new card with three clicks on his keyboard.

StarshipT_033

Not very efficient

Why does it take Carl three clicks to reset the cards? You’d think on such a routine task it would be as simple as pressing [space bar]. Maybe you want to prevent accidental activation, but still that’s a key with a modifier, like shift+[space bar]. Best would be if Carl was also a telekinetic. Then he could just mentally push a switch and get some of that practice in. If that switch offered variable resistance it could increase with each…but I digress since he’s just a telepath.

A semi-questionable display

I get why there’s a side-by-side pair of cards. People are much better at these sorts of comparison tasks when objects are side-by-side. But ultimately, it conveys the wrong thing. Having a face down card that flips over implies that that face-down card is the one that Johnny’s trying to guess. But it’s not. The one that’s already turned over is the one he’s trying to guess. Better would be a graphic that implies he’s filling in the blank.

better_duomento_520

Better still are two separate screens: One for the projector with a single card displayed, and a second for the receiver with this same graphic prompting him to guess. This would require a little different setup when shooting the scene, with over-the-shoulder shots for each showing the different screen. But audiences are sophisticated enough to get that now. Different screens can show different things.

Mismatched inputs?

At first it seems like Johnny’s input panel is insufficient for the task. After all, there are 52 cards in a standard deck of cards and only 20 buttons. But having a set of 13 keys for the card ranks and 4 for the suit is easy enough, reduces the number of keys, and might even let him answer only the part he’s confident in if the image hasn’t quite come through.
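The arithmetic behind that is simple enough to sketch: 13 rank keys plus 4 suit keys is 17 buttons, which fits on the 20-button panel and still covers all 52 cards. The function below is my own illustration (its name and return shape are assumptions) of how a partial answer could be scored.

```python
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Clubs", "Diamonds", "Hearts", "Spades"]


def score_guess(card, rank=None, suit=None):
    """Compare a possibly partial guess against the projected card.

    `card` is a (rank, suit) tuple. A part the receiver did not press
    is reported as None rather than as a miss.
    """
    actual_rank, actual_suit = card
    return {
        "rank": None if rank is None else rank == actual_rank,
        "suit": None if suit is None else suit == actual_suit,
    }
```

So if Johnny only felt sure about "Spades," he could press that one key and be scored on the suit alone.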

StarshipT_039

Does it help test for “sensitivity”?

Psychic powers are real in the world of Starship Troopers, so we’re not going to question that. Instead the question at hand will be: Is this the best test for psychic sensitivity?

Visual cheating

I do wonder whether having a lit screen gives the receiver a reflection in the projector’s eyes to detect, even if unconsciously. An eagle-eyed receiver might be able to spot a color, or the difference between a face card and a number card. Better would be some way for the projector to cover his eyes while reading the subject, and dim that screen afterward.

The risk of false positives

More importantly, such a test would want to eliminate the chance that the receiver guessed correctly by chance. The more constrained and familiar the range of options, the more likely they are to get a false positive, which wouldn’t help anything except confidence, and even that would be false. I get that when designing skills-building interfaces, you want to start easy and get progressively more challenging. But it makes more sense to constrain the concepts being projected to things that are more concrete and progress to greater abstraction or more nuance. Start with “fire,” perhaps, and advance to “flicker” or “warmth.” For such thoughts, a video cue of a word randomly selected from that pool of concepts would make the most sense. And for cinematic directness (Starship Troopers was nothing if not direct) you should overlay the word onto the video cue as well.
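To put rough numbers on the false-positive risk, here is a small sketch that assumes every guess is an independent, uniformly random pick from the available options (that is, zero psychic ability). The function name is mine; the math is the ordinary binomial tail.

```python
from math import comb


def chance_of_at_least(hits, trials, options):
    """Probability of `hits` or more correct guesses in `trials` tries,
    guessing uniformly at random among `options` possibilities."""
    p = 1 / options
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# One or more lucky hits in ten tries, for three pool sizes.
for options in (4, 52, 1000):
    print(options, round(chance_of_at_least(1, 10, options), 3))
```

Guessing only the suit (4 options) gives roughly a 0.94 chance of at least one lucky hit in ten tries, and a full 52-card deck still gives roughly 0.18. A pool of a thousand concepts drops that to about 0.01, which is the argument for a large pool in a nutshell.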

fireloop1

Better input

The next design challenge then becomes how does the receiver provide to the system what, if anything, they’re receiving. Since the concepts would be open-ended, you need a language-input mechanism: ANSI keyboard for typing, or voice recognition.

Additionally, I’d add a brain-reading interface that was able to read his brain as he was attempting to receive. Then it could detect for the right state of mind, e.g. an alpha state, as well as areas of the brain that are being activated. Cinematically you could show a brain map, indicating the brain state in a range and the areas of the brain being activated. Having the map on hand for Johnny would let him know to relax and get into a receptive state. If Carl had the same map he could help prompt him.

In a movie you’d probably also want a crude image feed being “read” from Johnny’s thoughts. It might charmingly be some dumb, non-fire things, like scenes from his last jump ball game, Carmen’s face and cleavage, and to Carl’s shame, a recollection of the public humiliation suffered recently at his hand.

But if this interface (and telepathy) was real, you wouldn’t want to show that to Johnny, as it might cause distracting feedback loops, and you wouldn’t want to show it to Carl lest he betray when Johnny is getting close, and encourage Johnny’s zeroing in on the concept through subtle social cues instead of the desired psychic ones. Since it’s not real, let’s comp it up next more cinematically.

FedPaint

Fedpaint_big

Students in Starship Troopers academy have access to desktop computing environments during class, including a drawing and animation program called “Fedpaint,” that had a number of very forward-looking features.

The screen is housed in a metal bezel that is attached to the desk, and can be left flat or angled slightly per the user’s preference. A few hardware buttons sit in a row at the bottom of the bezel. (Quick industrial design aside: Those buttons belong at the top of the bezel.) The input device is a stylus. (Styli had been in use in personal digital assistants for over a decade when the film came out, but I don’t think they had been sold as the primary input for a PC.) When we first see Johnny using the computer, he is ignoring his citizenship lesson and using Fedpaint instead.

StarshipT_013

The main part of the interface is a canvas. Running along the left and bottom edges are a complex tool palette and color picker that is vaguely reminiscent of Windows 3.0 WIMP applications. It’s easy to tell which category and tool is selected. (What color is selected is unclear.) I’d even say that most of the icons, while a little ham-handed and completely lacking labels, convey what they would do pretty clearly. The tools also seem to be clustered logically with categories across the top left, tools in the middle left, a color palette in the lower left corner, and file operations across the bottom. That’s some reasonable and reasonably convincing layout design for a movie interface. Nowadays a designer might argue to hide the menus when not in use to maximize the canvas real estate, but the most common OS paradigm at the time was Windows 95, and the most advanced paint program, i.e. Photoshop, looked like this. (Major thanks to Hongkiat for keeping their museum of Photoshop interfaces.)

Using the stylus, Johnny sketches a flirty animation for Carmen. He draws each of their profiles in white lines. He then adds some flat color and animates the profiles (not shown onscreen) such that the faces get closer, their eyes close, and their mouths open in readiness of a kiss. He then sends it to her.

On her desk she receives a notification. (We don’t get to see it. Was she already in the program? Did the notification jump her there?) Carmen grabs her stylus and responds by adding to the animation. She sends the file back to him. He opens it and it plays automatically. In her version of the animation, the profiles approach as before, but as they near for a kiss, the female profile blows a bubble gum bubble that gets so large it pops and covers the face of the male.

StarshipT_019

What’s nice about this interface is that the narrative seems to have driven some innovation in its design. It’s half gee-whiz-circa-1997 of course but half character development as it tells us that Johnny likes Carmen, and Carmen is a bit playfully stand-offish in response. To make this work well narratively, communication of the animation back and forth had to be seamless, and that seems to be the reason we see the communication tools built right into the interface. If ever there was a case for why scenario-driven design for personas works, this is it.

What’s frustrating is that they skipped over the hard part. How does Johnny apply the color? A paint bucket tool is a reasonable guess, but it’s also error prone. How did he specify the number of frames and their speed? How did he ensure that the motion felt relatively smooth and communicative? Anyone who’s worked with an animation program knows that these aren’t trivial matters, and Starship Troopers took the narrative route. Probably best for the story, but less for my analysis purposes.

Still, the stylus-driven direct manipulation, the unique layout, and easy, social sharing were big innovations for the time. I don’t know that there’s much to learn from this today, since our OS metaphors have advanced enough to make this seem quaint at best, and social integration is now the norm. But credit where it’s due, this interface was ahead of its time.

Her: interface components (2/8)

Depending on how you slice things, the OS1 interface consists of five components and three (and a half) capabilities.

Her-earpiece

1. An Earpiece

The earpiece is small and wireless, just large enough to fit snugly in the ear and provide an easy handle for pulling out again. It has two modes. When the earpiece is in Theodore’s ear, it’s in private mode, hearable only by him. When the earpiece is out, the speaker is as loud as a human speaking at room volume. It can produce both voice and other sounds, offering a few beeps and boops to signal needing attention and changes in the mode.

Her-cameo

2. Cameo phone

I think I have to make up a name for this device, and “cameo phone” seems to fit. This small, hand-sized, bi-fold device has one camera on the outside and one on the inside of the recto, and a display screen on the inside of the verso. It folds along its long edge, unlike the old clamshell phones. It has smartphone capabilities. It wirelessly communicates with the internet. Theodore occasionally slides his finger left to right across the wood, so it has some touch-gesture sensitivity. A stripe around the outside-edge of the cameo can glow red to act as a visual signal to get its user’s attention. This is quite useful when the cameo is folded up and sitting on a nightstand, for instance.

Theodore uses Samantha almost exclusively through the earpiece and cameo phone, and it is this that makes OS1 a wearable system.

3. A beauty-mark camera

Only present for the surrogate sex scene, this small wireless (are we at the point when we can stop specifying that?) camera affixes to the skin and has the appearance of a beauty mark.

4. (Unseen) microphones

Whether in the cameo phone, the desktop screen, or ubiquitously throughout the environment, OS1 can hear Theodore speak wherever he is over the course of the film.

5. Desktop screen

Theodore only uses a large monitor for OS1 on his desktop a few times. It is simply another access point as far as OS1 is concerned. Really, there’s nothing remarkable about this screen. It is notable that there’s no keyboard. All input is provided by either voice, camera, or a touch gesture on the cameo.

Her-install01

If those are components to the interface, they provide the medium for her 3.5 capabilities.

Her capabilities

1. Voice interface

Users can speak to OS1 in fully-natural language, as if speaking to another person. OS1 speaks back with fully-human spoken articulation. Theodore’s older OS had a voice interface, but because of its lack of artificial intelligence driving it, the interactions were limited to constrained commands like, “Read email.”

2. Computer vision

Samantha can process what she sees through the camera lens of the cameo perfectly. She recognizes distinct objects, people, and gestures at the physical and pragmatic level. I don’t think we ever see things from Samantha’s perspective, but we do have a few quick close ups of the camera lens.

3. Artificial Intelligence

The most salient aspect of the interface is that OS1 is a fully realized “Strong” artificial intelligence.

It would be like me to try and get to some painfully-crafted definition of what counts as either an artificial intelligence or sentience, but in this case we don’t really need a tight definition to help suss out whether or not Samantha is one. That’s the central conceit of the film, and the evidence is just overwhelming.

  • She has a human command of language.
  • She’s fully versed in the nuances of human emotion (and Theodore has a glut of them to engage).
  • She has emotions and can fairly be described as emotional. She has a sexual drive.
  • She has existential crises and a rich theory of mind. At one point she dreamily asks Theodore “What’s it like to be alive in that room right now?” as if she was a philosophical teen idly chatting with her boyfriend over the phone.
  • She commits lies of omission in hiding uncomfortable truths.
  • She changes over time. She solves problems. She learns. She creates.
  • She has a sense of humor. When Theodore tells her early on to “read email” in the weird toComputerese (my name for that 1970s dialect of English spoken only between humans and machines) grammar he had been using with his old operating system, Samantha jokingly adopts a robotic voice and replies, “OK. I will read the email for Theodore Twombly” and gets a good laugh out of him before he apologizes.

Pedants will have some fun discussing whether this is apt but I’m moving forward with it as a given. She’s sentient.

3.5 An “operating system”

This item only counts as half a thing because Theodore uses it as an operating system maaaybe twice in the film. Really, this categorization is a MacGuffin to explain why he gets it in the first place, but it has little to no other bearing on the film.

scarlettjoclippy

What’s missing?

Notably missing in OS1 is a face or any other visual anthropomorphic aspect. There’s no Samantha-faced Clippy. Notice that she’s very carefully disembodied. Jonze does not spend screen time close up on her camera lens, like Kubrick did with HAL’s unblinking eye. Had he done so, it would have given us the impression that she’s somewhere behind that eye. But she’s not. Even in the prop design, he makes sure the camera lens itself looks unremarkable, neutral, and unexpressive, and never gets a lingering focus.

Her “organs,” like the cameo and earpiece, don’t even connect together physically at all. Speaking as she does through the earpiece means she doesn’t exist as a voice from some speaker mounted to the wall. She exists across various displays and devices, in some psychological ether between them. For us, she’s a voiceover existing everywhere at once. For Theodore, she’s just a delightful voice in his head. An angel—or possibly a ghost—borne unto him.

This disembodiment (both the design and the cinematic treatment) frees Theodore and the audience from the negative associations of many other sci-fi intelligences, robots, and unfortunate experiments in commercial artificial intelligence that got trapped in the muck of the uncanny valley. One of the main reasons designers have to be careful about invoking the anthropomorphic sense in users is because it will raise expectations of human capabilities that modern technology just can’t match. But OS1 can match and exceed those expectations, since it’s an AI in a work of fiction, so Jonze is free of that constraint.

And having no visual to accompany a human-like voice allows us to imagine our own “perfect” embodiment to the voice. Relying on the imagination to provide the visuals makes the emotional engagement greater, as it does with our crushes on radio personalities, or the unseen monster in a horror movie. Movies can never create as fulfilling an image for an individual audience member as their imagination can. Theodore could picture whatever he wanted to–even if he wanted to–to accompany Samantha’s computer-generated voice. Unfortunately for the audience, Jonze cast Scarlett Johansson, a popular actress whose image we are instantly able to recall upon hearing her husky, sultry voice, so the imagined-perfection is more difficult for us.

This is just the components and capabilities. Tomorrow we’ll look at some of the key interactions with OS1.