Deckard’s Elevator

This is one of those interactions that happens over a few seconds in the movie, but turns out to be quite deep—and broken—on inspection.

When Deckard enters his building’s dark, padded elevator, a flat voice announces, “Voice print identification. Your floor number, please.” He presses a dark panel, which lights up in response. He presses the 9 and 7 keys on a keypad there as he says, “Deckard. 97.” The voice immediately responds, “97. Thank you.” As the elevator moves, the interface confirms the direction of travel with gentle rising tones that correspond to the floor numbers (mod 10), which are shown rising up a 7-segment LED display. We see a green projection of the floor numbers cross Deckard’s face for a bit until, exhausted, he leans against the wall and out of the projection. When he gets to his floor, the door opens and the panel goes dark.

A need for speed

An aside: To make 97 floors in 20 seconds, you have to be traveling at an average of around 47 miles per hour. That’s not unheard of today. Mashable says in a 2014 article about the world’s fastest elevators that the Hitachi elevators in the Guangzhou CTF Finance Centre reach up to 45 miles per hour. But including acceleration and deceleration adds to the total time, so it takes the Hitachi elevators around 43 seconds to go from the ground floor to their 95th floor. If 97 is Deckard’s floor, his elevator has to be accelerating and decelerating incredibly quickly. His body doesn’t appear to be suffering those kinds of Gs, so unless they have managed to upend Newton’s basic laws of motion, something in this scene is not right. As usual, I digress.
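If you want to re-run the digression’s math, it’s a quick back-of-the-envelope computation. A minimal sketch, assuming roughly 4.3 meters per floor and the simplest possible speed profile (both assumptions mine, not the film’s):

```python
# Back-of-the-envelope check on Deckard's elevator, with assumed values.

FLOOR_HEIGHT_M = 4.3      # assumption; not stated in the film
FLOORS_TRAVELED = 97      # per the scene's "97"
TRAVEL_TIME_S = 20.0      # approximate screen time

distance_m = FLOOR_HEIGHT_M * FLOORS_TRAVELED
avg_speed_ms = distance_m / TRAVEL_TIME_S
avg_speed_mph = avg_speed_ms * 2.23694

# Assume the simplest profile: accelerate for half the trip, then
# decelerate. Peak speed is twice the average.
peak_speed_ms = 2 * avg_speed_ms
accel_ms2 = peak_speed_ms / (TRAVEL_TIME_S / 2)

print(f"average speed: {avg_speed_mph:.0f} mph")        # ~47 mph
print(f"required acceleration: {accel_ms2:.1f} m/s^2")  # ~4 m/s^2
# For comparison, even today's high-speed elevators keep acceleration
# near 1 m/s^2 for passenger comfort.
```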

The input control is OK

The panel design is nice, and was surprising in 1982, because few people had ridden in elevators serving nearly a hundred floors. And while most in-elevator panels have a single button per floor, it would have been an overwhelming UI to present the rider of this Blade Runner complex with 100 floor buttons plus the usual open door, close door, and emergency alert buttons. A panel that allows combinatorial input reduces the number of elements that must be displayed and processed by the user, even if it slows things down, introduces cognitive overhead, and adds the need for error handling. Such systems need a “commit” control that lets riders review, edit, and confirm the sequence, to distinguish, say, “97” from “9” and “7.” Not such an issue from the 1st floor, but a frustration from 10–96. It’s not clear those controls are part of this input.

Deckard enters 8675309, just to see what will happen.
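To make the commit problem concrete, here is a minimal sketch of a combinatorial floor-entry handler with an explicit “go” key and a timeout fallback. Everything here (names, timings) is hypothetical, not anything from the film:

```python
import time

class FloorKeypad:
    """Sketch of combinatorial floor entry with an explicit commit
    control, so "97" is distinguishable from "9" followed by "7"."""

    def __init__(self, top_floor=100, timeout_s=3.0):
        self.top_floor = top_floor
        self.timeout_s = timeout_s   # pause that auto-commits an entry
        self.buffer = ""
        self.last_press = 0.0

    def press_digit(self, digit: str) -> None:
        now = time.monotonic()
        # A long pause commits whatever was already entered.
        if self.buffer and now - self.last_press > self.timeout_s:
            self.press_go()
        self.buffer += digit
        self.last_press = now

    def press_clear(self) -> None:
        self.buffer = ""             # error handling: let riders edit

    def press_go(self) -> None:
        if not self.buffer:
            return
        floor = int(self.buffer)
        self.buffer = ""
        if 1 <= floor <= self.top_floor:
            self.dispatch(floor)
        # else: out of range; a real panel should buzz or flash here

    def dispatch(self, floor: int) -> None:
        print(f"Going to floor {floor}")

keypad = FloorKeypad()
keypad.press_digit("9")
keypad.press_digit("7")
keypad.press_go()                    # -> Going to floor 97
```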

I’m a fan of destination dispatch elevator systems, which increase efficiency (with caveats) by asking riders to indicate their floor outside the elevator and letting the algorithm organize passengers into efficient groups, but that only works for banks of elevators. I get the sense Deckard’s building is a little too low-rent for such luxuries. There is just one elevator in his building, and in-elevator controls work fine in that situation, even if they slow things down a bit.

The feedback is OK

The feedback of the floors is kind of nice in that the 7-segment numbers rise up, helping to convey the direction of movement. There is also a subtle, repeating, rising series of tones that accompanies the display. Most modern elevators rely on the numeracy of their passengers and their sense of equilibrium to convey this information, but sure, this is another way to do it. Also, for the visually impaired, it would be nice if the voice system said the floor number when the door opens.

Though the projection is dumb

I’m not sure why the little green projection of the floor numbers runs across Deckard’s face. Is it just a filmmaker’s conceit, like the genetic code that gets projected across the velociraptor’s head in Jurassic Park?

Pictured: Sleepy Deckard. Dumb projection.

Or is it meant to be read as diegetic, that is, that there is a projector in the elevator, spraying the floor numbers across the faces of its riders? True to the New Criticism stance of this blog, I try very hard to presume that everything is diegetic, but I just can’t make that make sense. There would be much better ways to increase the visibility of the floor numbers, and I can’t come up with any other convincing reason why this would exist.

If this was diegetic, the scene would have ended with a shredded projector.

But really, it falls apart on the interaction details

Lastly, this interaction. First, let’s give it credit where credit is due. The elevator speaks clearly and understands Deckard perfectly. No surprise, since it only needs to understand a very limited number of utterances. It’s also nice that it’s polite without being too cheery about it. People in LA circa 2019 may have had a bad day and not have time for that shit.

Where’s the wake word?

But where’s the wake word? This is a phrase like “OK elevator” or “Hey lift” that signals to the natural language system that the user is talking to the elevator and not themselves, or another person in the elevator, or even on the phone. General AI exists in the Blade Runner world, and that might allow an elevator to use contextual cues to suss this out, but there are zero clues in the film that this elevator is sentient.

There are of course other possible, implicit “wake words.” A motion detector, proximity sensor, or even a weight sensor could infer that a human is present and start the elevator listening. But with any of these implicit “wake words,” you’d still need feedback for the user to know when it was listening, and some way to help them regain its attention if they got the first interaction wrong. There are zero affordances for either here. So really, an explicit wake word is the right way to go.

It might be that touching the number panel is the attention signal. Touch it, and the elevator listens for a few seconds. That fits the events in the scene, anyway. The problem with that is the redundancy. (See below.) So if the attention signal is a button press, it should be a dedicated “talk” button rather than a numeric keypad.
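A sketch of that press-to-wake pattern, assuming a single talk button that opens a short listening window (the names and the five-second window are my guesses):

```python
import time

LISTEN_WINDOW_S = 5.0   # assumed: "listens for a few seconds"
_listen_until = 0.0

def on_talk_button() -> None:
    """The explicit attention signal: open a short listening window.
    This is also where listening feedback belongs (LED, start chime)."""
    global _listen_until
    _listen_until = time.monotonic() + LISTEN_WINDOW_S

def on_speech(utterance: str) -> None:
    """Only act on speech inside the window, so idle chatter in the
    car can't send everyone to the basement."""
    if time.monotonic() > _listen_until:
        return                       # not addressed to us; ignore
    print(f"elevator heard: {utterance!r}")
```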

It may be that the elevator is always listening, which is a little dark and would stifle any conversation in the elevator, lest everyone end up stuck in the basement. But this seems very error-prone and unlikely.

Deckard: *Yawns* Elevator: Confirmed. Silent alarm triggered.

This issue is similar to one covered in Make It So, Chapter 5, “Gestural Interfaces,” where I discussed how a user tells a computer when they are communicating with it through gestures, and when they aren’t.

Where are the paralinguistics?

Humans provide lots of signals to one another, outside of the meaning of what is actually being said. These communication signals are called paralinguistics, and one of those that commonly appears in modern voice assistants is feedback that the system is listening. In the Google Assistant, for example, the dots let you know when it’s listening to silence and when it’s hearing your voice, providing implicit confirmation to the user that the system can hear them. (Parsing the words, understanding the meaning, and understanding the intent are separate, subsequent issues.)

Fixing this in Blade Runner could be as simple as turning on a red LED when the elevator is listening, and varying the brightness with Deckard’s volume. Maybe add chimes to indicate the starting-to-listen and no-longer-listening moments. This elevator doesn’t have anything like that, and it ought to.
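A sketch of that fix, assuming we can poll the microphone’s amplitude and set an LED’s brightness (hypothetical hardware hooks):

```python
import math

def rms_level(samples: list[float]) -> float:
    """Root-mean-square level of a chunk of mic samples in [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def led_brightness(samples: list[float], listening: bool) -> float:
    """0.0 when not listening; a dim 'I am listening' glow in silence;
    brighter as the rider speaks louder."""
    if not listening:
        return 0.0
    level = min(1.0, rms_level(samples) * 4.0)  # gain factor is a guess
    return 0.2 + 0.8 * level
```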

Why the redundancy?

Next, why would Deckard need to push buttons to indicate “97” even while he’s saying the same number as part of the voice print? Sure, it could be that the voice print system was added later and Deckard pushes the numbers out of habit. But that bit of backworlding doesn’t buy us much.

It might be a need for redundant, confirming input. This is useful when the feedback is obscure or the stakes are high, but this is a low-stakes situation. If he enters the wrong floor, he just has to enter the correct floor. It is also easy to imagine the elevator understanding a correction mid-ride, like “Oh wait. Elevator, I need some ice. Let’s go to 93 instead.” So this is not an interaction that needs redundancy.

It’s very nice to have the discrete input as accessibility for people who cannot speak, or who have an accent the system cannot recognize, or as graceful degradation in case the speech recognition fails, but Deckard fits none of these cases. He would just enter and speak his floor.

Why the personally identifiable information?

If we were designing a system and we needed, for security, a voice print, we should protect the privacy of the rider by not requiring personally identifiable information. It’s easy to imagine the spoken name being abused by stalkers and identity thieves riding the elevator with him. (And let’s not forget there is a stalker on the elevator with him in this very scene.)

This young woman, for example, would abuse the shit out of such information.

Better would be some generic phrase that stresses the aspects of speech a voiceprint system finds most effective in distinguishing people.

Tucker Saxon has written an article for VoiceIt called “Voiceprint Phrases.” In it he notes that a good voiceprint phrase needs some minimum number of non-repeating phonemes. In their case, it’s ten. A surname and a number are rarely going to provide that. “Deckard. 97,” happens to have exactly 10, but if he lived on the 2nd floor, it wouldn’t. Plus, it includes that personally identifiable information, so it’s a non-starter.
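If you want to sanity-check a candidate phrase yourself, here is a rough sketch of that distinct-phoneme criterion, using the CMU Pronouncing Dictionary via the pronouncing package. Note the caveat in the comments: proper nouns like “Deckard” are likely out of vocabulary.

```python
# Rough check of Saxon's distinct-phoneme criterion, using the CMU
# Pronouncing Dictionary via the `pronouncing` package
# (pip install pronouncing). Proper nouns like "Deckard" are out of
# vocabulary and would need a grapheme-to-phoneme fallback.
import pronouncing

def distinct_phonemes(phrase: str) -> set[str]:
    found = set()
    for word in phrase.lower().split():
        pronunciations = pronouncing.phones_for_word(word)
        if not pronunciations:
            continue  # OOV word; skipped in this sketch
        for phone in pronunciations[0].split():
            found.add(phone.rstrip("012"))  # drop stress markers
    return found

phrase = "Never forget tomorrow is a new day"
print(len(distinct_phonemes(phrase)), "distinct phonemes")
```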

What would be a better voiceprint phrase for this scene? Some of Saxon’s examples in the article include, “Never forget tomorrow is a new day” and “Today is a nice day to go for a walk.” While the system doesn’t care about the meaning of the phrase, the humans using it would be primed by the content, and so it would just add to the dystopia of the scene if Deckard had to utter one of these sunshine-and-rainbows phrases in an elevator that was probably an uncleaned murder scene. But I think we can do one better.

(Hey Tucker, I would love to use VoiceIt’s tools to craft a confirmed voiceprint phrase, but the signup requires that I permit your company to market to me via phone and email even though I’m just a hobbyist user, so…hard no.)

Deckard: Hi, I’m Deckard. My bank card PIN code is 3297. The combination lock to my car spells “myothercarisaspinner” and my computer password is “unicorn.” 97 please.

Here is an alternate interaction that would have solved a lot of these problems.

ELEVATOR: Voice print identification, please.
DECKARD: (sighs) Have you considered life in the offworld colonies?
ELEVATOR: Confirmed. Floor?
DECKARD: 97.

Which is just a punch to the gut considering Deckard is stuck here and he knows he’s stuck, and it’s salt on the wound to have to repeat fucking advertising just to get home for a drink.

So…not great

In total, this scene zooms by and the audience knows how to read it, and for that, it’s fine. (And really, it’s just a setup for the moment that happens right after the elevator door opens. No spoilers.) But on close inspection, from the perspective of modern interaction design, it needs a lot of work.

IQ Testing

When Joe is processed after his arrest, he is taken to a general IQ testing facility. He sits in a chair wearing headphones. A recorded voice asks, “If you have one bucket that holds two gallons, and another bucket that holds five gallons, how many buckets do you have?” Into a microphone he says, incredulous that this is a question, “Two?” The recorded voice says, “Thank you!”


Joe looks to his left and sees another subject trying to put a square blue peg into the middle round hole of a panel, and of course failing. Joe looks to his right to see another subject with a triangular green peg in hand, trying to put it into the round middle hole of his interface. Small colored bulbs above each hole are unlit, but they match the colors of the pegs, so let’s presume they illuminate when the correct peg is inserted. Looking closely, it’s also apparent that the pegs are tethered to the panel so they’re not lost, each directly below its matching hole. So there are lots and lots of cues that would let a subject figure it out. And yet, they do not. The subject to Joe’s right even eyes Joe suspiciously and turns his body to cover his test so Joe won’t try and crib…uh…“answers.”


Comedy

The comedy in the scene comes from how rudimentary these challenges are. Most toddlers could complete the shape test. Even if you couldn’t figure out the shapes, you could match the colors: the blue peg goes in the hole under the blue bulb. Most preschoolers could answer the spoken challenge. It underscores the stupidity of this world that generalized IQ tests for adults test below grade-school levels.

IQ Testing

IQ testing has had a long and problematic history since Binet invented the first such test in 1904 (it has been used to justify racist and eugenic arguments, just for instance), but it can have a rational goal: measuring the intelligence of a set of people (students in a classroom, or applicants to intelligence jobs) to make strategic decisions about aptitude, assistance, and improvement. But intelligence is a very slippery concept, complicated to study, much less test. The good news in this case is that the citizens of Idiocracy don’t have very sophisticated intellects, so very basic tests of intelligence should suffice.

Some nice things

So, that said, the shape test has some nice aspects. The panel is angled so the holes are visible and targetable, without being so vertical that the pegs would be easy to drop while manipulating them. The panel is plenty thick for durability and cleaning. The speech-to-text tech seems to work perfectly, unlike the errors and bad design that riddle most technologies in Idiocracy.


A garden path match

There’s an interesting question of affordances in the device. You can see in the image above that the yellow round peg fits just fine in the square hole. Ordinarily, a designer would want to prevent errors like this by, say, increasing the diameter of the round peg (and its hole) so that it couldn’t be inserted into the square hole. That version of the test would simply measure how long it takes, even by trial and error, to match pegs to their holes, and you could then rank subjects by time-to-completion. But by allowing the round peg to fit in the square hole, you complicate the test with a “garden path” branch where a subject can get lost in what they think is a successful subtask. This makes it harder to compare subjects fairly, because a subject who wanders down this path pays an unfair price in their time-to-complete.

Another complication is that this test has so many different clues. Do subjects notice the tethers? Do they notice the colored bulbs? (What about color-blind subjects?) Having it test cognitive, fine-motor, and perceptual skills all at once seems quite complicated and less likely to enable fair comparisons.

We must always scrutinize IQ tests because people put so much stock in them, often to an individual’s detriment. Designers of these tests ought to instrument them carefully for passive and active feedback about when the test itself is proving to be problematic.

Challenging the “superintelligent”?

A larger failing of the test is that it doesn’t challenge Joe at all. All his results would show is that he’s much, much more intelligent than these tests are built for. Fair enough, there’s nothing in the world of Idiocracy that would indicate a need to test for superintelligence among the population, but this test had to be built by someone(s), generations ago. Could they not at least have made the test work on someone as smart as themselves? That’s all it would need to test Joe. But we live in a world that should be quite cautious about the emergence of a superintelligence. It would be comforting to imagine that we could test for that. Maybe we should include the Millennium Problems at the end of every test. Just in case.


Another Idiot Test

As “luck” would have it, Trump tweeted an IQ test just this morning. (I don’t want to link to it directly and add any fuel to his fire, but you can Google it easily.) It’s an outrageous political video ad. As you watch it:

  • Do you believe that a single anecdote about a troubled, psychotic individual is generalizable to everyone with brown skin? Or even to everyone with brown skin who is not American and seeking legal asylum in the U.S.?
  • Do you ignore the evidence of the past decades (and the last week) that shows it’s conservative white males who are much more of a problem? (Noting that Vox is a liberal-leaning publication, but look at the article’s citations.)
  • Can you tell that the war drums under the ad are there only to make you feel scared, appealing to your emotions with cinematic tricks?
  • Do you uncritically fall for implicature and the slippery slope fallacy?

If the answers to all these are yes, well, sorry. You’ve failed an IQ test put to you by one of the most blatantly racist political ads since Willie Horton. (Not many ads warrant a deathbed statement of regret, but that one did.) Maybe it’s best you take the rest of the week off and treat yourself. Leave town. Take a road trip somewhere. Eat some ice cream.

For the rest of you, congratulations on passing the test. We have 5 days until the election. Kick the racist bastards and the bastards enabling the racist bastards out.

Garden Center

In the center of the kitchen, mounted to the ceiling, is a “Garden Center.” When not in use, it retracts out of reach, but anyone in the family can say “Fruit, please” and the Garden Center drops down to allow fresh grapes to be plucked right off the vine. When done, Marty Jr. tells it to “retract” with a thump on it, and it retracts back up to its resting place near the ceiling.


This is wonderful. It responds to many types of input and keeps healthy, fresh fruit available to the family at any time.

6-Screen TV


When Marty Jr. gets home, he approaches the large video display in the living room, which is displaying a cropped image of “The Gold of Their Bodies (Et l’or de leur corps)” by Paul Gauguin. He speaks to the screen, saying “Art off.” After a bit of static, the screen goes black. He then says, “OK, I want channels 18, 24, 63, 109, 87, and the Weather Channel.” As he says each, a sixth of the screen displays the live feed. The number of the channel appears in the upper left corner for a short while before fading. Marty Jr. then sits down to watch the six channels simultaneously.

Voice control. Perfect recognition. No modality. Spot on. It might dynamically update the screen layout in case he only wanted to watch 2 or 3 channels, but perhaps it is a cheaper system, apropos of the McFly household.
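The dynamic version would be cheap to build, too. A sketch of a layout helper (hypothetical; the film’s TV clearly has a fixed six-way split) that picks the smallest near-square grid for however many channels are requested:

```python
import math

def grid_for(n_channels: int) -> tuple[int, int]:
    """Smallest near-square grid that fits n channels:
    1 -> 1x1, 2 -> 1x2, 3-4 -> 2x2, 5-6 -> 2x3, and so on."""
    cols = math.ceil(math.sqrt(n_channels))
    rows = math.ceil(n_channels / cols)
    return rows, cols

for n in (1, 2, 3, 6):
    print(n, "channels ->", grid_for(n))
```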

Café 80s


Following Dr. Brown’s instructions, Marty heads to Café 80s where the waitstaff consists of television screens mounted on articulated arms which are suspended from the ceiling, allowing them to reach anyplace in the café. Each screen has a shelf on which small items can be delivered to a patron. Each screen features a different celebrity from the 1980s, rendered as a computer talking head and done in a jittery Max Headroom style.

Patrons speak directly to the figure on screen as if it were a human server. With perfect speech recognition, the figures engage in dialogue with the customer to answer questions and take orders. When Marty orders a Pepsi, the waiterbot turns away to attend to other customers, and a small cylinder containing a “Pepsi Perfect” rises from the Pepsi-branded table in front of him. When Marty removes the soda, the delivery cylinder descends quickly back into the table with a whoosh.


Sure. This is functional as a robotic café. The limitations of the café are apparent when a violent gang intrudes, and the café does nothing to help protect its customers or itself, not even calling human officers to intervene.

Fueling stations


Fueling stations are up on a raised platform. Cars can drive or land there and approach a central column. A rotating overhead arm maneuvers a liquid-fuel-dispensing robot into place near the car while a synthesized voice crudely welcomes the driver, delivers a marketing slogan, and announces its actions, e.g., “checking oil” and “checking landing gear.”

This seems like a pretty good robot solution. It’s efficient, and it keeps the pilot informed of status. I presume payment happens just as automatically, but we don’t see it.

The biggest improvement I’d make is to the horribly synthesized voice. Sure, it conveys that this is a robot, but where movies optimize for the first-time user, that crap would get tiring with frequent use. Pilots could also save time out of their day, and do a bit of environmental good, if refueling could happen at home using a technology readily available as an off-the-shelf appliance. But where would one find such a thing?

Iron Man HUD: A Breakdown

So this is going to take a few posts. You see, the next interface that appears in The Avengers is a video conference between Tony Stark in his Iron Man supersuit and his partner in romance and business, Pepper Potts, about switching Stark Tower from the electrical grid to their independent power source. Here’s what a still from the scene looks like.


So on the surface of this scene, it’s a communications interface.

But that chat exists inside of an interface with a conceptual and interaction framework that has been laid down since the original Iron Man movie in 2008, and built upon with each sequel, one in 2010 and one in 2013. (With rumors aplenty for a fourth one…sometime.)

So to review the video chat, I first have to talk about the whole interface, and that has about 6 hours of prologue occurring across 4 years of cinema informing it. So let’s start, as I do with almost every interface, simply by describing it and its components.

Exosuit

Iron Man is the name of the series of superpowered exosuits designed by Tony Stark. They range from the Mark I, a comparatively crude suit of armor built to escape imprisonment by terrorists, through the Mark XLV, the armor seen in Avengers: Age of Ultron. The suit acts as defense against nearly every type of weapon known. It has repulsor beams built into the palms and, in later models, the arc reactor mounted in the chest that can be used to deliver concussive force. It allows the wearer to fly. Offensive weaponry varies between models, but has included a high-powered laser system, an auto-targeting minigun pod, and missiles. The suit can act semi-autonomously or via remote control. One of the models in The Avengers has parts that are seen to self-propel to Tony, targeting a beacon bracelet he wears, and self-assemble around him very quickly.


Immersive display

Though Tony’s head is completely covered, he has a virtual reality display within his helmet. It is a full-field-of-vision, very high-resolution, full-color display that provides stereoscopic imaging. It allows Tony to see the world around him as if he were not wearing the helmet, augmenting the view with goal-, person-, location-, and object-sensitive information.

The display varies a great deal, changing to the needs of the situation. But five icons persist in the lower part of the display; they seem to be suit status, targeting and optics, radar, artificial horizon, and map.

An interpretive view of Tony’s experience, from Iron Man (2008).
A first-person view from within the HUD, Iron Man (2008).

There is much to critique about the readability of the complex layering and translucency, the limits of human perception, and the necessarily- (and strictly-) interpretive nature of what we as audience see, but let me save those three points for a later post. For now it’s enough to log the features as aspects of the system.

Head NUI

Though Tony could use his hands to interact with an interface projected into the augmented reality view around him, his hands are often occupied in controlling flight or in combat. For this reason the means of input are head gestures, eye gestures, and voice. A bit more on each follows.

Elements within the HUD, such as targeting reticles, follow and track his head gestures. Other elements stay locked in place. The HUD can track his gaze perfectly, allowing him to designate targets for his weapons with a fixation. Using this perfect eye tracking, Tony can also speak about something he is looking at, either in the real world or in the interface, and the system understands exactly what he’s talking about.
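In real-world gaze interfaces, “designating with a fixation” is usually implemented as dwell selection. A minimal sketch, with the dwell time and tolerance radius as assumed values:

```python
import math

DWELL_S = 0.4      # how long a fixation must hold (assumed value)
RADIUS_PX = 30.0   # how steady the gaze must be (assumed value)

class DwellSelector:
    """Emit a selection when gaze samples cluster in one spot long
    enough -- the standard real-world stand-in for designating a
    target with a fixation."""

    def __init__(self):
        self.anchor = None   # (x, y, start_time) of current fixation

    def on_gaze(self, x: float, y: float, t: float):
        if self.anchor is None:
            self.anchor = (x, y, t)
            return None
        ax, ay, start = self.anchor
        if math.hypot(x - ax, y - ay) > RADIUS_PX:
            self.anchor = (x, y, t)   # gaze moved on; restart the dwell
            return None
        if t - start >= DWELL_S:
            self.anchor = None        # fire once, then reset
            return (ax, ay)           # fixation held: target designated
        return None
```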

In fact, Tony is able to speak fully natural language commands, and indeed, carry out full-Turing conversations with the suit because of the presence of…

Strong artificial intelligence: JARVIS

An on-board artificial intelligence known as JARVIS handles any information task Tony asks of it, monitors the surroundings, and anticipates informational needs. There is strong evidence that most of the functions of the suit are handled by JARVIS behind the scenes. How crucial the artificial intelligence is to the function of the suit cannot be overstated. It’s difficult to imagine how most of the suit could function as it does without an artificial intelligence behind the scenes facilitating results and even guiding Tony. With this in mind, it is instructive to reframe the AI as the thing being named the Iron Man, with Tony Stark being an onboard manager or, more charitably, a command-and-control center. Who quips.

Next up in the Iron HUD series: Let’s review the functions of the suit.




The Drone

A spherical robot with the number 166 in a dark, smoky environment, hovering above burning debris.

Each drone is a semi-autonomous flying robot armed with large cannons, heavy armor, and a wide array of sensor systems. When in flight mode, the weapon arms retract. The arms extend when the drone senses a threat.

A figure stands amidst debris, lifting a large spherical object that emits bright beams of light in a dark environment.

Each drone is identical in make and temperament, distinguishable only by the large white numbers on its “face.” The armored shell is about a meter in diameter (just smaller than Jack). Internal power is supplied by a small battery-like device that contains enough energy to start a nuclear explosion inside a skyscraper-sized hydrogen distiller. It is not obvious whether the weapons are energy-based or projectile-based.

The HUD

The Drone Interface is a HUD that shows the drone’s vision and secondary information about its decision-making process. The HUD appears on all video from the drone’s primary camera. Labels appear in legible human English.

Video feeds from the drone can be in one of several modes that vary according to what kind of searching the drone is doing. We never see the drone use more than one mode at once. These modes include visual spectrum, thermal imaging, and a special ‘tracking’ mode used to follow Jack’s bio signature.
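The one-mode-at-a-time behavior reads like a simple exclusive mode switch. A sketch of that model (the mode names are inferred from what we see on screen):

```python
from enum import Enum, auto

class VisionMode(Enum):
    VISUAL = auto()     # plain visual spectrum
    THERMAL = auto()    # heat signatures
    TRACKING = auto()   # locked onto a stored bio signature

class DroneCamera:
    """One mode at a time, matching what the film shows: the feed
    switches wholesale rather than compositing modes together."""

    def __init__(self):
        self.mode = VisionMode.VISUAL

    def set_mode(self, mode: VisionMode) -> None:
        self.mode = mode   # exclusive: selecting one drops the others

cam = DroneCamera()
cam.set_mode(VisionMode.THERMAL)
print(cam.mode)   # VisionMode.THERMAL
```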

Occasionally, we also see the Drone’s primary objective on the HUD. These include an overlay on the main view that says “TERMINATE” or “CLEAR”.

A digital overlay displaying targeting data and identifiers, with a focus on the word 'TERMINATE', set against an orange background.

In English, the HUD displays what look to be GPS (or similar) coordinates at the top, the drone’s number (e.g., 185), and the letters A1-XX. The second “X” is greyed out, and this area remains constant between drones regardless of what mode they are in or what their current mission is.

Additional information covers the left and right sides of the Drone’s vision. All information on the HUD changes in real time, and most appears to be status information about the drone itself or its connection to the Home Station and the Tet.

Physical Feedback

For nearby techs (or enemies), the drones have a simple tonal language to express queries, anger, and acknowledgement of commands. This is similar to R2-D2 from Star Wars, or to pets like dogs and cats.

A futuristic robotic sphere with the number 166 displayed on its front, equipped with mechanical arms and a single red eye, set against a dimly lit background.

If people or Maintenance Techs are close enough to see details on the drone, its iris dilates when the drone enters an aggressive mode, then contracts when the drone determines that there is no further threat.

Post-Mission Review

As an overlay on the video feed, this looks like an attempt to more fully immerse the maintenance team in the (artificial) story that the Tet is trying to perpetuate. We never see Vika watch directly through a drone’s eye, but she accesses similar information very easily from the Tet and the Bubbleship.

The most useful situation for this kind of HUD overlay is a post-mission review of a drone’s activity. Post-mission, the HUD would allow the team to understand how the drone was making decisions. Given that the drones appear to be low-level artificial intelligences, this would be useful for getting into a drone’s mind. Jack knows that the drones are temperamental from his encounter at the downed NASA ship, and he would want to make sure that he understands them.

Given how quickly the drone makes decisions, there would not be enough time for Vika to notice that a Drone had made a decision (based on its HUD), then countermand that order. The drone appears to have just enough reaction time for Jack to announce himself before being eliminated.

Futuristic user interface displaying data analysis and terrain information, with orange tones and digital readouts.

If the numbers at the top do correspond to the drone’s current position on the ground, it is surprising that the HUD doesn’t also show the drone’s altitude. The drone’s position in 3D space would be far more useful to a team trying to understand what the drone was up to after a mission. Likely the information shown to the maintenance team is being limited to correspond better to Vika’s 2D command console; the Tet itself surely knows exactly where each drone is.

If the maintenance team is infrequently accessing the Drone HUD, more labeling of information on the active status of the Drones would make the data more useful on quick viewing. Right now, the maintenance team needs to constantly remember what each area means, and what each icon represents. The different data formats are good clues, but more labeling would make everything instantly clear and allow the team to focus on the situation instead of deciphering the interface.
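Labeling is the cheapest fix on the list. A sketch of what a self-describing telemetry frame might look like, including the altitude called out above (field names are my guesses, not anything from the film):

```python
from dataclasses import dataclass, asdict

@dataclass
class TelemetryFrame:
    """One self-describing HUD frame for post-mission review.
    Explicit field names stand in for the unlabeled readouts
    the film shows; all names here are hypothetical."""
    drone_id: int        # e.g., 166
    latitude: float      # degrees
    longitude: float     # degrees
    altitude_m: float    # the value the on-screen HUD omits
    mode: str            # "visual" | "thermal" | "tracking"
    objective: str       # "CLEAR" | "TERMINATE"
    timestamp_s: float   # mission clock

frame = TelemetryFrame(166, 40.78, -73.97, 312.0,
                       "tracking", "CLEAR", 1024.5)
print(asdict(frame))
```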

At the same time, the wealth of information related to the Drone’s operational status means that a review session using freeze-frames could allow a Team to deduce any functional reasons for an unexpected or catastrophic action on the Drone’s part. Thus the suggestion is reinforced that this HUD is meant for post-operation analysis and not in-the-moment error correction.

There is a potential clue (or Tet hand-tip) for the Team here: even a catastrophic failure that resulted in the termination of Jack is acceptable enough to the Tet that it doesn’t offer in-the-moment error correction as an option for the Team. The Tet knows it has plenty of Maintenance Team members in queue. The Maintenance Team does not.

Deceptive, Effectively

The Drone HUD provides useful information to the Maintenance Team for post-mission review. This HUD also works well as a way to make the maintenance team think it has control and understanding over the drone. This deception effectively keeps critical information firmly in the hands of the Tet.

For the Maintenance Team, this deception doesn’t affect their job. What does affect their job is the lack of labels on the data. Better labeling and a more efficient use of space around the edges would make the maintenance team’s life much easier without releasing any extra information from the Tet’s hands.

Perhaps the abundance of information on the display is meant to suggest to the Maintenance Team that other humans will deal with or are dealing with that overabundance in some other setting. If so, these would be impressive lengths for Tet to go to in its serial deception of each instance of the team.

It is worth noting that Oblivion marks one of relatively few cases where an internally-facing HUD with human-readable data can be rationalized as part of the story, rather than simply material for the viewing audience.

Lessons:

  1. Clearly label information
  2. Speak in a language your users understand
  3. Don’t use up space with unnecessary information

Her: interactions (3/8)

If interface is the collection of inputs and outputs, interaction is how a user uses these, along with the system’s programming, over time to achieve goals. The voice interaction described above covers, in fact, most of the interaction Theodore has with Samantha. But there are a few other back-and-forths worth noting.


The setup

When Theodore starts up OS1, after an installation period, a male voice asks him four questions meant to help customize the interface. It’s a funny sequence. The emotionless male voice even interrupts him as he’s trying to thoughtfully answer the personal questions asked of him. As an interaction, it’s pretty bad. Theodore is taken aback by its rudeness. It’s there in the film to help underscore how warm and human Samantha is by comparison, but let’s be clear: we would never want real-world software to ask open-ended, personal questions of a user and then shut them down when they begin to answer. Bad pattern! Bad!

Of course you don’t want Theodore bonding with this introductory AI, so it shouldn’t be too charming. But let’s ask some closed-ended questions instead, so his answers will be short but still telling, and, you know, let him actually finish answering. In fact, there is some brilliant analysis out there about what those closed-ended questions should be.

Seamless transition across devices

Samantha talks to Theodore through the earpiece frequently. When she needs to show him something, she can draw his attention to the cameo phone or a desktop screen. Access to these visual displays helps her overcome one of the most basic challenges of an all-voice interface: people have significant trouble processing aurally presented options. If you’ve ever had to memorize a list of seven items while working your way through an interactive voice response system, you’ll know how painful this can be. Some other user of OS1 who had no visual display might find their OSAI much less useful.


Signaling attention

Theodore isn’t engaging Samantha constantly. Because of this, he needs ways to disengage from interaction. He has lots of them.

  1. Closing the cameo (a partial signal)
  2. Pulling the earpiece out (an unmistakable signal)
  3. Telling her with language that he needs to focus on something else

He also needs a way to engage, and the reverse of these actions works for that: putting the earpiece in and speaking, or opening the cameo.

In addition to all this, Samantha also needs a way to signal when she needs his attention. She has the illuminated band around the outside of the cameo as well as the audible beeps from the earpiece. Both work well.
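Taken together, the attention protocol is small enough to model directly. A sketch under the assumption that each of the signals above maps to one event (hypothetical names throughout):

```python
class OS1Attention:
    """Minimal model of the mutual attention signals just described.
    Theodore engages and disengages via the earpiece and cameo;
    Samantha requests attention through whichever channel fits."""

    def __init__(self):
        self.earpiece_in = False
        self.cameo_open = False

    # --- Theodore's signals ---
    def put_earpiece_in(self):   self.earpiece_in = True
    def pull_earpiece_out(self): self.earpiece_in = False  # unmistakable
    def open_cameo(self):        self.cameo_open = True
    def close_cameo(self):       self.cameo_open = False   # partial signal

    @property
    def engaged(self) -> bool:
        return self.earpiece_in or self.cameo_open

    # --- Samantha's signal ---
    def request_attention(self):
        if self.earpiece_in:
            print("earpiece: soft beep")       # private channel
        else:
            print("cameo: edge band glows")    # visible on a nightstand
```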

Through all these ways, OS1 has signaling attention covered, and it’s not an easy interaction to get right. So the daily interactions with OS1 are pretty good. But we can also evaluate it for its wearableness, which comes up next. (Hint: it’s kind of a mixed bag.)

Her: interface components (2/8)

Depending on how you slice things, the OS1 interface consists of five components and three (and a half) capabilities.


1. An Earpiece

The earpiece is small and wireless, just large enough to fit snugly in the ear and provide an easy handle for pulling it out again. It has two modes. When the earpiece is in Theodore’s ear, it’s in private mode, audible only to him. When the earpiece is out, the speaker is as loud as a human speaking at room volume. It can produce both voice and other sounds, offering a few beeps and boops to signal that it needs attention or that the mode has changed.


2. Cameo phone

I think I have to make up a name for this device, and “cameo phone” seems to fit. This small, hand-sized, bi-fold device has one camera on the outside and one on the inside of the recto, and a display screen on the inside of the verso. It folds along its long edge, unlike the old clamshell phones. It has smartphone capabilities and wirelessly communicates with the internet. Theodore occasionally slides his finger left to right across the wood, so it has some touch-gesture sensitivity. A stripe around the outside edge of the cameo can glow red to act as a visual signal to get its user’s attention. This is quite useful when the cameo is folded up and sitting on a nightstand, for instance.

Theodore uses Samantha almost exclusively through the earpiece and cameo phone, and it is this that makes OS1 a wearable system.

3. A beauty-mark camera

Only present for the surrogate sex scene, this small wireless (are we at the point when we can stop specifying that?) camera affixes to the skin and has the appearance of a beauty mark.

4. (Unseen) microphones

Whether in the cameo phone, the desktop screen, or ubiquitously throughout the environment, OS1 can hear Theodore speak wherever he is over the course of the film.

5. Desktop screen

Theodore only uses a large monitor for OS1 on his desktop a few times. It is simply another access point as far as OS1 is concerned. Really, there’s nothing remarkable about this screen. It is notable that there’s no keyboard. All input is provided by either voice, camera, or a touch gesture on the cameo.


If those are the components of the interface, they provide the medium for her 3.5 capabilities.

Her capabilities

1. Voice interface

Users can speak to OS1 in fully natural language, as if speaking to another person. OS1 speaks back with fully human spoken articulation. Theodore’s older OS had a voice interface, but because it lacked an artificial intelligence driving it, the interactions were limited to constrained commands like “Read email.”

2. Computer vision

Samantha can process what she sees through the camera lens of the cameo perfectly. She recognizes distinct objects, people, and gestures at the physical and pragmatic level. I don’t think we ever see things from Samantha’s perspective, but we do have a few quick close-ups of the camera lens.

3. Artificial Intelligence

The most salient aspect of the interface is that OS1 is a fully realized “Strong” artificial intelligence.

You might like me to try and get to some painfully crafted definition of what counts as either artificial intelligence or sentience, but in this case we don’t really need a tight definition to suss out whether or not Samantha is one. That’s the central conceit of the film, and the evidence is just overwhelming.

  • She has a human command of language.
  • She’s fully versed in the nuances of human emotion (and Theodore has a glut of them to engage).
  • She has emotions and can fairly be described as emotional. She has a sexual drive.
  • She has existential crises and a rich theory of mind. At one point she dreamily asks Theodore “What’s it like to be alive in that room right now?” as if she was a philosophical teen idly chatting with her boyfriend over the phone.
  • She commits lies of omission in hiding uncomfortable truths.
  • She changes over time. She solves problems. She learns. She creates.
  • She has a sense of humor. When Theodore tells her early on to “read email” in the weird toComputerese (my name for that 1970s dialect of English spoken only between humans and machines) grammar he had been using with his old operating system, Samantha jokingly adopts a robotic voice and replies, “OK. I will read the email for Theodore Twombly” and gets a good laugh out of him before he apologizes.

Pedants will have some fun discussing whether this is apt but I’m moving forward with it as a given. She’s sentient.

3.5 An “operating system”

This item only counts as half a thing because Theodore uses it as an operating system maaaybe twice in the film. Really, this categorization is a MacGuffin to explain why he gets it in the first place, but it has little to no other bearing on the film.


What’s missing?

Notably missing in OS1 is a face or any other visual anthropomorphic aspect. There’s no Samantha-faced Clippy. Notice that she’s very carefully disembodied. Jonze does not spend screen time close up on her camera lens, like Kubrick did with HAL’s unblinking eye. Had he done so, it would have given us the impression that she’s somewhere behind that eye. But she’s not. Even in the prop design, he makes sure the camera lens itself looks unremarkable, neutral, and unexpressive, and never gets a lingering focus.

Her “organs,” like the cameo and earpiece, don’t even connect together physically at all. Speaking as she does through the earpiece means she doesn’t exist as a voice from some speaker mounted to the wall. She exists across various displays and devices, in some psychological ether between them. For us, she’s a voiceover existing everywhere at once. For Theodore, she’s just a delightful voice in his head. An angel—or possibly a ghost—borne unto him.

This disembodiment (both the design and the cinematic treatment) frees Theodore and the audience from the negative associations of many other sci-fi intelligences, robots, and unfortunate experiments in commercial artificial intelligence that got trapped in the muck of the uncanny valley. One of the main reasons designers have to be careful about invoking the anthropomorphic sense in users is that it raises expectations of human capabilities that modern technology just can’t match. But OS1 can match and exceed those expectations, since it’s an AI in a work of fiction, so Jonze is free of that constraint.

And having no visual to accompany a human-like voice allows users to imagine their own “perfect” embodiment for the voice. Relying on the imagination to provide the visuals makes the emotional engagement greater, as it does with our crushes on radio personalities, or the unseen monster in a horror movie. Movies can never create as fulfilling an image for an individual audience member as their imagination can. Theodore could picture whatever he wanted to, if he wanted to picture anything at all, to accompany Samantha’s computer-generated voice. Unfortunately for the audience, Jonze cast Scarlett Johansson, a popular actress whose image we instantly recall upon hearing her husky, sultry voice, so the imagined perfection is more difficult for us.

This is just the components and capabilities. Tomorrow we’ll look at some of the key interactions with OS1.