These examples, although fictional, demonstrate that “3D” can be used in different ways.
In Jurassic Park and Hackers, 3D graphics are used to create a richer display with more information density, though it is not photorealistic. The Jurassic Park file browser is primarily a symbolic 2D representation of the file system hierarchy, projected onto a perspective ground plane to make more elements visible at once. The third dimension is used to indicate the number of sub-elements or their size. In Hackers, the City of Text towers most likely represent the actual contents of each physical disk drive in the corresponding real-world location, and the pulses and colors indicate levels of activity or threat.
The Corridor in Disclosure, and its VirtuGood 6500 close copy in Community, instead create a more photorealistic virtual world. The file system becomes a building or landscape, and the users are embodied within the virtual world as avatars. Like the pre-computer memory palace, this should take advantage of the human ability to remember and navigate our way around. But The Corridor blows it by putting all the files within one room, and representing them as sheets of paper within identical filing cabinets. Walking through the 3D architecture becomes a pretty but time-wasting diversion.
I’m personally disappointed not to find any true computer memory palaces, whether fictional or real. As mentioned in the introduction, an essential characteristic of the memory palace is that each item be stored in a unique location, visually distinct from any other. None of the 3D file systems I’ve been able to find do this, instead using generic icons throughout. Computers are actually quite good at creating almost infinite variations in appearance, e.g. fractals in 2D and various CGI landscapes and underwater environments in 3D. A computer memory palace would at least be more interesting to look at.
Where are they today?
Since the 1990s the 3D file browser has seemingly faded away, both in reality and in film/TV. Let’s (briefly) think about why.
The SGI 3D file browser shown in Jurassic Park was not the only one to be released as a real piece of software. Although personal computers could easily run such a 3D file browser by the year 2000, and mobile phones a few years later, the systems we actually use have remained two dimensional. The only widespread use of 3D spatial organisation that I’m aware of is the Apple Time Machine backup software, which uses distance from the viewer to represent increasing age. It’s a linear sequence of 2D desktops rather than allowing true three dimensional movement in any direction. Even native 3D systems like the Oculus Quest present a 2D GUI wrapped around the user in a cylinder.
We don’t have our files arranged into 3D buildings or worlds, but there have been other developments since the first 2D file browsers. Keyword search is now built into most GUI desktops. Photo collections can be viewed by timeline, or by geographical location; and music collections arranged by genre, artist, or album. So one likely reason why we don’t have real world 3D file browsers is that in themselves they don’t provide enough of an advantage over the existing 2D GUIs to make changing worthwhile.
User interfaces in film and TV are not constrained by reality or practicality, so their absence must be due to other reasons. Sometimes real world interface trends affect what we see on the screen, for instance the replacement of command line interfaces by graphical ones, but for file browsing we’re still using the 2D GUI browsers from the 1990s. And it’s not because of technical difficulty or expense, because we’ve seen that 1990s feature-film 3D effects can now be created on the budget of a sitcom episode. An example is the 2008 film Iron Man, already mentioned for using a 3D trashcan within Tony Stark’s CAD software system. Later in the film, Pepper needs to copy some files from the corporate PC of evil executive Obadiah Stane. As in the earlier films covered in this review, Stark Industries is portrayed as an advanced technology company, so this PC also has a custom GUI created for the film. Here though there is only a very slight use of 3D to arrange flat file icons in order; otherwise it closely resembles existing 2D desktops. The filmmakers could have inserted a 3D file browser with perhaps volumetric projection to match Tony’s 3D CAD system but chose not to.
Pepper selects a folder in the text list at left and it is also highlighted in the graphical list of overlaid translucent icons at right. Iron Man (2008)
Copying computer files (or more dramatically “the data”) still happens in science fiction or near future film settings, but also has become more common in everyday life with the spread of personal computers and now smartphones worldwide. In my opinion, this is the most likely reason why we don’t see 3D and VR file browsers any more: we the audience know how to copy files and search for them, and won’t be impressed by attempts to make it “high tech” with fanciful user interfaces. File systems and browsers have become, well, boring. So we can look back on these cinematic dalliances with 3D file management fondly, but recognize it as a thing we tried for a while, and learned from, but eventually put down.
All of these build on the given that vibranium is a very powerful substance and that Wakanda’s scientists have managed to gain a very, very sophisticated control over it.
In the Talon
This table is about a meter square, and raised off the floor around knee-height. As Okoye and T’Challa approach the traffickers in the Sambisa Forest, T’Challa approaches the table and it springs to life, showing him a real-time model of the traffickers’ vehicle train. T’Challa picks up the model of the small transport truck and with a finger, wipes off its roof, revealing that there are over a dozen people huddled within. One of the figures glows amber. (It’s Nakia.) He places the truck back into the display, and the display collapses back to inert sand.
A quick critique of this interaction. The sand highlights Nakia for T’Challa, but why did it wait for him to find her truck and wipe off the top of it to look inside? It knew his goal (find Nakia), could clearly conduct a scan into the vehicle, and understood the context (she’s in one of those trucks), so it should not have waited for him to pick up each truck and scrape off its roof to check which one she was in. The interface should have drawn his attention to the truck it knew she was in. This is a “stoic guru” mistake that I’ve critiqued before. You know, the computer knows all, but only tells you when you ask it. It is much more sensible for the transport truck to be glowing from the moment the table goes live, as in the comp below.
Designers: Don’t wait for users to ask just the right thing at the right time.
Otherwise, this is a good high-tech use of the sand table for the more common meaning of “sand table,” which is a 3-dimensional surface for understanding a theatre of conflict. It doesn’t really help him run through scenarios, testing various tactics, but T’Challa is a warrior king, he can do all that in his head.
The interaction also nicely blurs the line between display and gestural interactive tool, in the same way that the Prometheus astrometrics display did. Like that other example, it would be useful for the display to distinguish when it is representing reality, and when the display is being interrupted or modified. Also, T’Challa is nice enough to put the truck back where it “belongs,” but a design would also need to handle how to respond when T’Challa put the truck back in the wrong place, or, say, crushed the truck model with his hand in fury.
In Prometheus it was an Earth, not a truck, but still focused on Africa.
Shuri’s lab
The largest table we see in the movie is in Shuri’s lab. After Black Panther challenges Killmonger and engages in battle outside the capital city, Shuri, Nakia, and Agent Ross rush down to the lab. As they approach an edge-lit hexagonal table, the vibranium sand lowers to reveal 3D-printed armor and weaponry for Shuri and Nakia to join the fight. (It’s not like modern 3D printing, though: these are powered weapons and kimoyo beads, items with very sophisticated functionality.)
Shuri outfits Ross with kimoyo beads from the print and takes off to join the fight. In the lab, the table creates a seat for Ross to remote-pilot the Royal Talon. Up on the flight deck, Shuri throws a control bead onto the Talon, and an AI in the lab named Griot announces to Agent Ross, “Remote piloting system activated.” (Hey, Trevor Noah, we hear you there!)
Around the seat, a volumetric projection of the Talon appears around him, including a 360° display just beyond the windshield that gives him a very immersive remote flying experience. We hear Shuri’s voice explain to Ross “I made it American Style for you. Get in!”
Ross sits down, grabs joystick controls, and begins remote-chasing down the cargo ships that are carrying munitions to Killmonger’s War Dogs around the world. (The piloting controls and HUD for Ross are a separate issue, and will be handled in their own post.)
The moment that Ross pilots the Talon through the last cargo ship, the volumetric projection disappears and the piloting seat returns to sand, ungraciously plopping Ross down to the floor level of the lab.
It is in this shot that we realize that the dark tiles of the lab’s floor are all recessed vibranium sand tables. I can count seven in the shot. So the lab is full of them.
Display material
Let’s talk for a bit about the display choices. Vibranium can change to display any color and a shape down to a fine level of detail. See the screen cap below for an example of perfectly lifelike (if scaled) representation.
This is a vibranium-powered volumetric display. It raises the gaze matching issues we’ve seen before.
So why would it be designed so that in most cases, the display is sparkly and black like black tourmaline? Wouldn’t the truck that T’Challa picks up be most useful if it was photographically rendered? Wouldn’t the remote piloting chair be more comfortable if it had pleather- and silicone-like surfaces?
Extradiegetically, I understand the reason is because art direction. We want Wakandan tech to be visibly different than other tech in the MCU, and having it look like vibranium dust ties it back to that key plot element.
But, per the stance of this blog, I try to look for a diegetic reason. It might be a deliberate reminder of the resource on which their technological fortunes are built. And as the Okoye VP above shows, they aren’t purists about it. When detail is needed, it’s included. So perhaps this is it. That implies a great deal of sophistication on the part of the displays to know when photorealism is needed and when it is not, but the presence of Griot there tells us that they have something approaching general AI.
Missing interactions
So, just like I had to do for the Royal Talon, I have to throw my hands up about reviewing the interactions with the sand tables, because we don’t see the interactions that would give these results.
How were the mission goals communicated to the Royal Talon table? Is it programmed to activate when someone approaches it, or did T’Challa issue a mental command? How did Shuri specify those weapons and that armor? What did she do to make the ship “American style” for Ross? Is that a template? Was it Griot’s interpretation of her intention? Why did the remote piloting seat vanish the moment the mission was complete? Was this something Shuri set up in advance, or Griot’s way of telling Agent Ross to GTFO for his own safety? How does someone in the lab instruct a floor tile to leap up and become a table and do stuff? It’s almost certainly via mental commands through the kimoyo beads, but that’s conjecture. The film really provides little evidence.
On the one hand, this is appropriate for us mere non-Wakandans observing the most technologically advanced society on earth. Much of it would feel like inexplicable magic to us.
On the other, sci-fi routinely introduces us to advanced technologies, and doesn’t always eschew the explanatory interactions, so the absence is notable here. It’s magic.
Black Lives Matter
Each post in the Black Panther review is followed by actions that you can take to support black lives.
In the last post we grieved Chadwick Boseman’s passing. This week we’re grieving the loss of Ruth Bader Ginsburg. May her memory be a blessing. With her loss, the GOP is ratcheting up its outrageous hypocrisy by reversing a precedent that they themselves established when Obama was president. The “Moscow Mitch Rule” (oh, oops, sorry) “McConnell Rule” was that new Justices should not be appointed within a year of a general election, so the people’s voice can be taken into account. Of course, the bastards are just ignoring that now and trying to ram through one of their own before election day. This Justice will certainly be a conservative, and we know with this administration that means reactionary, loyal to tiny-hand Twittler, and racist as a Jim Crow law.
There are a few arrows in citizens’ quivers to stop this. One is to convince at least 4 Republican Senators to reject this outright hypocrisy, put country over party, and adhere to the McConnell rule.
To help put pressure where it might work, you can leave voicemails with Republican Senators who may be mulling whether to put country over party. Those 6 Senators’ names and numbers are below. Here’s a script for your message:
Hello, my name is ______. In 2016, Mitch McConnell created the principle of not confirming a Supreme Court Justice in an election year until after the next inauguration. For the legitimacy of the Court in the eyes of the people, I’m asking Senator ________ to uphold that principle by refusing to confirm a new Justice until after a new President is installed. Thank you.
—You, hopefully
Lisa Murkowski, Alaska: (202) 224-6665
Mitt Romney, Utah: (202) 224-5251
Susan Collins, Maine: (202) 224-2523
Martha McSally, Arizona: (202) 224-2235
Cory Gardner, Colorado: (202) 224-5941
Chuck Grassley, Iowa: (202) 224-3744
I’ve made my calls and left my messages. Can you do the same to stop the hypocritical Trumpian power grab that would tip the Supreme Court for generations?
We’re actually done with all of the artifacts from Doctor Strange. But there’s one last kind-of interface that’s worth talking about, and that’s when Strange assists with surgery on his own body.
After being shot with a soul-arrow by the zealot, Strange is in bad shape. He needs medical attention. He recovers his sling ring and creates a portal to the emergency room where he once worked. Stumbling with the pain, he manages to find Dr. Palmer and tell her he has a cardiac tamponade. They head to the operating theater and get Strange on the table.
When Strange passes out, his “spirit” is ejected from his body as an astral projection. Once he realizes what’s happened, he gathers his wits and turns to observe the procedure.
When Dr. Palmer approaches his body with a pericardiocentesis needle, Strange manifests so she can sense him and recommends that she aim “just a little higher.” At first she is understandably scared, but once he explains what’s happening, she gets back to business, and he acts as a virtual coach.
In this role he points at the place she should insert the needle, and illuminates the chest cavity from within so she can kind of see the organ she’s targeting and the surrounding tissue. She asks him, “What were you stabbed with?” and he must confess, “I don’t know.”
Things go off the rails when the zealot who stabbed him shows up also as an astral projection and begins to fight Strange, but that’s where we can leave off the narrative and focus on everything up to this point as an interface.
Imagine with me, if you will, that this is not magic, but a kind of augmented reality available to the doctor. Strange is an unusual character in that he is both one of the world’s great surgeons and the patient in the scene, so let’s tease apart each.
An augmented reality coach
Realize that Dr. Palmer is getting assistance from one of the world’s greatest surgeons, rendered as a volumetric projection (“hologram” in vernacular). She can talk to him as if he was there to get his advice, and, I presume, even dismiss him if she believes he was wrong. Wouldn’t doctors working in new domains relish the opportunity to get advice from experts until they have built their own mastery?
Two notes to extend this idea.
In the spirit of evidence-based medicine and big data, we must admit that it would be better to have diagnoses and advice based on the entirety of the medical record and current, ethical best practices, not just one individual expert. But if an individual doctor prefers to have that information delivered through an avatar of a favored mentor, why not?
The second note to anyone thinking of this as a real world model for an AR assistant: I would expect a fully realized solution to include augmentations other than just a human, of course, such as ideal angles for incisions, depth meters, and life signs.
A (crude) body visualization
One of the challenges surgeons have when working with internal damage is that the body is largely opaque. They have to use visualization tools like radiographs and (very) educated guesses to diagnose and treat what’s going on inside these fleshy boxes of ours. How awesome that the AR coach can help illuminate (in both senses) the body to help Dr. Palmer perform the procedure correctly?
Admittedly, what we see in Doctor Strange is a crude version. This same x-ray vision appeared with more clarity and higher resolution in two other films, as cited in the Medical chapter of Make It So. In Lost in Space, the medical table projects a real-time volumetric scan of the organs inside Judy’s body into the eyes of the observers.
In Chrysalis, Dr. Bruger sees a volumetric display of the patient on which she is teleoperating.
But despite its low resolution, I wanted to draw it out as another awesome and somewhat subtle part of the way this AR assistant helps the doctor.
A queryable patient avatar
Lastly, consider that Dr. Palmer is able to ask her patient what happened to him. Of course in the real world passed-out patients aren’t able to answer questions, but understanding the events that led to a crisis is important. I can imagine several sci-fi ways that this information might be retrievable from the world.
Trace evidence on the patient’s body: High-resolution sensors throughout the operating theater could have automatically run forensic analysis on the patient the moment they entered the room to determine type of wound and likely causes, such as microscopic detection of soot in entrance wounds.
Environmental sensors: If the accident happened in a place with sensors that are queryable, then the assistant could look at video footage, or listen in to microphones in the environment to help piece together what happened. Of course the notion of a queryable technological panopticon has massive privacy issues which cannot be overlooked, but if the information is available to medical professionals, it would be tragic to ignore it in genuine crises.
Human witnesses can provide informative narratives. Witnesses and first responders may be on record already. But in looking at the environmental sensors, the assistant might be able to instantly reach out to those who have not. Imagine one of these witnesses, shaken by the event he saw, on a commute home. His phone buzzes and it is the assistant saying, “Hello, Mr. Mackinnon. Records indicate that you were witness to a violent crime today, and your account of the event is needed for the victim, who is currently in surgery. Can you take a moment to answer some questions?”
Patient preferences should be automatically exposed and incorporated via the assistant as well. If the patient was a Jehovah’s Witness, for instance, then their desire not to have a blood transfusion should be raised in whatever form the assistant takes.
A surgical assistant could automatically query all of these sources to form a hypothesis of what happened and advise the procedure. This could be available to the doctor for the asking, volunteered by the assistant at a lull in more critical action, or offered by the assistant as a preventative. I suspect it’s more likely the doctor would ask the assistant than the patient, e.g. “OK, ERbot, what happened to this guy?” but if the doctor prefers, she should be able to ask in the second person, as Dr. Palmer does in the scene, and the system should reply appropriately.
Sure, in this context, it’s magic, but since we can imagine how it could be done with technology, this scene gives us a very dense set of inspirational ideas for the future of surgical assistants.
We see a completely new mode for the Eye in the Dark Dimension. With a flourish of his right hand over his left forearm, a band of green lines begins orbiting his forearm just below his wrist. (Another orbits just below his elbow, just off-camera in the animated gif.) The band signals that Strange has set this point in time as a “save point,” like in a video game. From that point forward, when he dies, time resets and he is returned here, alive and well, though he and anyone else in the loop are aware that it happened.
In the scene he’s confronting a hostile god-like creature on its own mystical turf, so he dies a lot.
An interesting moment happens when Strange is hopping from the blue-ringed planetoid to the one close to the giant Dormammu face. He glances down at his wrist, making sure that his savepoint was set. It’s a nice tell, letting us know that Strange is nervous about facing the giant, Galactus-sized primordial evil that is Dormammu. This nervousness ties right into the analysis of this display. If we changed the design, we could put him more at ease when using this life-critical interface.
Initiating gesture
The initiating gesture doesn’t read as “set a savepoint.” This doesn’t show itself as a problem in this scene, but if the gesture did have some sort of semantic meaning, it would make it easier for Strange to recall and perform correctly. Maybe if his wrist twist transitioned from splayed fingers to pointing with his index finger at his wrist…OK, that’s a little too on the nose, so maybe…toward the ground, it would help symbolize the here & now that is the savepoint. It would be easier for Strange to recall and feel assured that he’d done the right thing.
I have questions about the extents of the time loop effect. Is it the whole Dark Dimension? Is it also Earth? Is it the Universe? Is it just a sphere, like the other modes of the Eye? How does he set these? There’s not enough information in the movie to backworld this, but unless the answer is “it affects everything” there seem to be some variables missing in the initiating gesture.
Savepoint-active signal
But where the initiating gesture doesn’t appear to be a problem in the scene, the wrist-glance indicates that the display is. Note that, other than being on the left forearm instead of the right, the bands look identical to the ones in the Tibet and Hong Kong modes. (Compare the Tibet screenshot below.) If Strange is relying on the display to ensure that his savepoint was set, having it look identical is not as helpful as it would be if the visual was unique. “Wait,” he might think, “Am I in the right mode, here?”
In a redesign, I would select an animated display that was not a loop, but an indication that time was passing. It can’t be as literal as a clock of course. But something that used animation to suggest time was progressing linearly from a point. Maybe something like the binary clock from Mission to Mars (see below), rendered in the graphic language of the Eye. Maybe make it base-3 to seem not so technological.
Seeing a display that is still, on invocation—that becomes animated upon initialization—would mean that all he has to do is glance to confirm the unique display is in motion. “Yes, it’s working. I’m in the Groundhog Day mode, and the savepoint is set.”
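For what it’s worth, the base-3 readout suggested above is trivial to compute. Here’s a minimal sketch (entirely my own illustration, not anything established by the film or by Mission to Mars) that converts elapsed seconds since the savepoint into base-3 digits, which could drive such an animated display:

```python
def to_base3(seconds, width=8):
    """Elapsed time as a list of base-3 digits, most significant
    first, padded to a fixed display width."""
    digits = []
    for _ in range(width):
        digits.append(seconds % 3)
        seconds //= 3
    return digits[::-1]

# Five seconds after the savepoint is set, the display would read
# [0, 0, 0, 0, 0, 0, 1, 2] -- digits ticking steadily upward, so a
# glance confirms that linear time is passing from a fixed point.
```

The fixed width matters for the design: the digits always occupy the same positions, so the motion of the low-order digit alone signals “savepoint set, time progressing.”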
In the prior three posts, I’ve discussed the goods-and-bads of the Eye of Agamotto in the Tibet mode. (I thought I could squeeze the Hong Kong and the Dark Dimension modes into one post, but it turns out this one was just too long. Keep reading. You’ll see.) In this post we examine a mode that looks like the Tibet mode, but is actually quite different.
Hong Kong mode
Near the film’s climax, Strange uses the Eye to reverse Kaecilius’ destruction of the Hong Kong Sanctum Sanctorum (and much of the surrounding cityscape). In this scene, Kaecilius leaps at Strange, and Strange “freezes” Kaecilius in midair with the saucer. It’s done more quickly, but similarly to how he “freezes” the apple into a controlled-time mode in Tibet.
But then we see something different, and it complicates everything. As Strange twists the saucer counterclockwise, the cityscape around him—not just Kaecilius—begins to reverse slowly. (And unlike in Tibet, the saucer keeps spinning clockwise underneath his hand.) Then the rate of reversal accelerates, and even continues in its reversal after Strange drops his gesture and engages in a fight with Kaecilius, who somehow escapes the reversing time stream to join Strange and Mordo in the “observer” time stream.
So in this mode, the saucer is working much more like a shuttle wheel with no snap-back feature.
A shuttle wheel, as you’ll recall from the first post, doesn’t specify an absolute value along a range like a jog dial does. A shuttle wheel indicates a direction and rate of change. A little to the left is slow reverse. Far to the left is fast reverse. Nearly all of the shuttle wheels we use in the real world have snap-back features, because if you were just going to leave it reversing and pay attention to something else, you might as well use another control to get to the absolute beginning, like a jog dial. But, since Strange is scrubbing an endless “video stream” (that is, time), and he can pull people and things out of the manipulated-stream and into the observer-stream and do stuff, not having a snap-back makes sense.
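To make the jog/shuttle distinction concrete, here’s a minimal sketch of the two control mappings. All the names and constants are my own assumptions for illustration, not anything specified by the film or by real editing hardware:

```python
# A jog dial maps control angle to an absolute offset in the stream.
def jog_offset(angle_deg, seconds_per_degree=1.0):
    """Absolute position in the timeline for a given dial angle."""
    return angle_deg * seconds_per_degree

# A shuttle wheel maps deflection from center to direction and rate.
def shuttle_rate(deflection_deg, max_deflection_deg=90.0, max_rate=32.0):
    """Playback rate: sign gives direction, magnitude gives speed."""
    fraction = max(-1.0, min(1.0, deflection_deg / max_deflection_deg))
    return fraction * max_rate

# With no snap-back, shuttle_rate(-45) keeps time reversing at half
# speed even after the hand is withdrawn, until the rate is changed.
```

Note how the shuttle output persisting after the gesture ends is exactly the no-snap-back behavior Strange relies on when he drops the saucer and joins the fight.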
For the Tibet mode I argued for a chapter ring to provide some context and information about the range of values he’s scrubbing. So for shuttling along the past in the Hong Kong mode, I don’t think a chapter ring or content overview makes sense, but it would help to know the following.
The rate of change
The direction of change
The shifted datetime
The datetime difference from when he started
In the scene that information is kind of obvious from the environment, so I can see the argument for not having it. But if he was in some largely-unchanging environment, like a panic room or an underground cave or a Sanctum Sanctorum, knowing that information would save him from letting the shuttle go too far and finding himself in the Ordovician. A “home” button might also help to quickly recover from mistakes. Adding these signals would also help distinguish the two modes. They work differently, so they should look different. As it stands, they look identical.
He still (probably) needs future branches
Can Strange scrub the future this way? We don’t see it in the movie. But if so, we have many of the same questions as the Tibet mode future scrubber: Which timeline are we viewing & how probable is it? What other probabilities exist and how does he compare them? This argues for the addition of the future branches from that design.
Selecting the mode
So how does Strange specify the jog dial or shuttle wheel mode?
One cop-out answer is a mental command from Strange. It’s a cop-out because if the Eye responds to mental commands, this whole design exercise is moot, and we’re here to critique, practice, and learn. Not only that, but physical interfaces are more cinegenic, so better to make a concrete interaction for the film.
You might think we could modify the opening finger-tut (see the animated gif, below). But it turns out we need that for another reason: specifying the center and radius-of-effect.
Center and radius-of-effect
In Tibet, the Eye appears to affect just an apple and a tome. But since we see it affecting a whole area in Hong Kong, let’s presume the Eye affects time in a sphere. For the apple and tome, it was affecting a small sphere that included the table, too, it’s just that the table didn’t change in the spans of time we see. So if it works in spheres, how is the center and the radius of the sphere set?
Center
Let’s say the Eye does some simple gaze monitoring to find the salient object at his locus of attention. Then it can center the effect on the thing and automatically set the radius of effect to the thing’s size across likely-to-be-scrubbed extents. In Tibet, it’s easy. Apple? Check. Tome? Check. In Hong Kong, he’s focusing on the Sanctum, and its image recognition is smart enough to understand the concept of “this building.”
Radius
But the Hong Kong radius stretches out beyond his line of sight, affecting something with a very vague visual and even conceptual definition, that is, “the wrecked neighborhood.” So auto-setting these variables wouldn’t work without reconceiving the Eye as a general artificial intelligence. That would have some massive repercussions throughout the diegesis, so let’s avoid that.
If it’s a manual control, how does he do it? Watch the animated gif above carefully and you’ll see he’s got two steps to the “turn Eye on” tut: opening the eye by making an eye shape, and after the aperture opens, spreading his hands apart, or kind of expanding the Eye. In Tibet that spreading motion is slow and close. In Hong Kong it’s faster and farther. That’s enough evidence to say the spread*speed determines the radius. We run into the scales problem of apple-versus-neighborhood that we had in determining the time extents, but make it logarithmic and add some visual feedback and he should be able to pick arbitrary sizes with precision.
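Here’s a sketch of what that logarithmic mapping could look like. Every name and constant is my own invention for illustration; the film gives us only the qualitative “slow and close” versus “faster and farther” evidence:

```python
import math

def effect_radius(spread_m, speed_m_per_s,
                  base_radius_m=0.1, scale=2.0):
    """Map the spread gesture's distance x speed to a sphere radius.

    The radius grows exponentially with the control value, so equal
    increments of effort feel logarithmic to the user. That lets one
    gesture span apple-sized and neighborhood-sized extents.
    """
    control = spread_m * speed_m_per_s  # the spread*speed product
    return base_radius_m * math.exp(scale * control)

# A slow, close spread (0.3 m at 0.5 m/s) yields a radius around
# 0.13 m -- apple scale. A fast, wide spread (1.5 m at 3 m/s)
# yields roughly 800 m -- neighborhood scale.
```

The exponential curve is the design point: small gestures give fine control over small spheres, while big fast gestures reach city-block extents without requiring impossible precision.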
So…back to mode selection
So if we’re committing the “turn on” gesture to specifying the center-and-radius, the only other gesture left is the saucer creation. For a quick reminder, here’s how it works in Tibet.
Since the circle works pretty well for a jog dial, let’s leave this for Tibet mode. A contrasting but related gesture would be to have Strange hold his right hand flat, in a sagittal plane, with the palm facing to his left. (See an illustration, below.) Then he can tilt his hand inside the saucer to reverse or fast forward time, and withdraw his hand from the saucer graphic to leave time moving at the adjusted rate. Let the speed of the saucer indicate speed of change. To map to a clock, tilting to the left would reverse time, and tilting to the right would advance it.
How the datetime could be shown is an exercise for the reader.
The yank out
There’s one more function we see twice in the Hong Kong scene. Strange is able to pull Mordo and Wong from the reversing time stream by thrusting the saucer toward them. This is a goofy choice of a gesture that makes no semantic sense. It would make much more sense for Strange to keep his saucer hand extended, and use his left hand to pull them from the reversing stream.
Whew.
So one of the nice things about this movie interface is that while it doesn’t hold up under the close scrutiny of this blog, the interface to the Eye of Agamotto works while watching the film. The audience sees the apple happen, and gets that gestures + glowing green circle = adjusting time. For that, it works.
That said, we can see improvements that would not affect the script, would not require much more of the actors, and would not add too much to post. It could be more consistent and believable.
But we’re not done yet. There’s one other function shown by the Eye of Agamotto when Strange takes it into the Dark Dimension, which is the final mode of the Eye, up next.
This is one of those sci-fi interactions that seems simple when you view it, but then on analysis it turns out to be anything but. So set aside some time, this analysis will be one of the longer ones even broken into four parts.
The Eye of Agamotto is a medallion, hung on a braided leather strap, that (spoiler) contains the emerald Time Infinity Stone. It is made of brass, about a hand’s breadth across, in the shape of a stylized eye covered by the same mystical sigils seen on the rose window of the New York Sanctum, and on the portal door from Kamar-Taj to the same.
World builders may rightly ask why this universe-altering artifact bears a sigil belonging to just one of the Sanctums.
We see the Eye used in three different places in the film, and in each place it works a little differently.
The Tibet Mode
The Hong Kong Modes
The Dark Dimension Mode
The Tibet Mode
When the film begins, the Eye is under the protection of the Masters of the Mystic Arts in Kamar-Taj, where there’s even a user manual. Unfortunately it’s in mysticalese (or is it Tibetan? See comments) so we can’t read it to understand what it says. But we do get a couple of full-screen shots. Are there any cryptanalysts in the readership who can decipher the text?
They really should put the warnings before the spells.
The power button
Strange opens the old tome and reads “First, open the eye of Agamotto.” The instructions show him how to finger-tut a diamond shape with both hands and spread them apart. In response the lid of the eye opens, revealing a bright green glow within. At the same time the components of the sigil rotate around the eye until they become an upper and lower lid. The green glow of this “on state” persists as long as Strange is in time manipulation mode.
Once it’s turned on, he puts the heels of his palms together, fingers splayed out, and turns them clockwise to create a mystical green circle in the air before him. At the same time two other, softer green bands spin around his forearm and elbow. Thrusting his right hand toward the circle while withdrawing his left hand behind the other, he transfers control of the circle to just his right hand, where it follows the position of his palm and the rotation of his wrist as if it were a saucer mystically glued there.
Then he can twist his wrist clockwise while letting his fingers close to a fist, and the object on which he focuses ages. When he does this to an apple, we see it with progressively more chomps out of it until it is a core that dries and shrivels. Twisting his wrist counterclockwise, the focused object reverses aging, becoming younger in staggered increments. With his middle finger upright, the object reverts to its “natural” age.
Pausing and playing
At one point he wants to stop practicing with the apple and try it on the tome whose pages were ripped out. He relaxes his right hand and the green saucer disappears, allowing him to handle the apple and the tome without changing their ages. To reinstate the saucer, he extends his fingers and gives his hand a shake, and it fades back into place.
Tibet Mode Analysis: The best control type
The Eye has a lot of goodness to it. Time has long been mapped to circles in sun dials and clock faces, so the circle controls fit thematically quite well. The gestural components make similar sense. The direction of wrist twist coincides with the movement of clock hands, so it feels familiar. Also we naturally look at and point at objects of focus, so using the extended arm gesture combined with gaze monitoring fits the sense of control. Lastly, those bands and saucers look really cool, both mystical in pattern and vaguely technological with the screen-green glow.
Readers of the blog know that it rarely just ends after compliments. To discuss the more challenging aspects of this interaction with the Eye, it’s useful to think of it as a gestural video scrubber for security footage, with the hand twist working like a jog wheel. Not familiar with that type of control? It’s a specialized dial, often used by video editors to scroll back and forth over video footage, to find particular sequences or frames. Here’s a quick show-and-tell by YouTube user BrainEatingZombie.
Is this the right kind of control?
There are other options to consider for the dial types of the Eye. What we see in the movie is a jog dial with hard stops, like you might use for an analogue volume control. The absolute position of the control maps to a point in a range of values. The wheel stops at the extents of the values: for volume controls, complete silence on one end and max volume at the other.
But another type is a shuttle wheel. This kind of dial has a resting position. You can turn it clockwise or counterclockwise, and when you let go, it will spring back to the resting position. While it is being turned, it enacts a change. The greater the turn, the faster the change. Like a variable fast-forward/reverse control. If we used this for a volume control: a small turn to the left means, “Keep lowering the volume a little bit as long as I hold the dial here.” A larger turn to the left means, “Get quieter faster.” In the case of the Eye, Strange could turn his hand a little to go back in time slowly, and fully to reverse quickly. This solves some mapping problems (discussed below) but raises new issues when the object just doesn’t change that much across time, like the tome. Rewinding the tome, Strange would start slow, see no change, then gradually increase speed (with no feedback from the tome to know how fast he was going) and suddenly he’d fly way past a point of interest. If he was looking for just the state change, then we’ve wasted his time by requiring him to scroll to find it. If he’s looking for details in the moment of change, the shuttle won’t help him zoom in on that detail, either.
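To make the contrast concrete, here’s a minimal sketch of the two dial types in Python. The function names, the 270° range, and the 32× maximum shuttle speed are illustrative assumptions, not anything from the film or from real editing hardware.

```python
# A jog dial with hard stops maps an absolute angle to an absolute value;
# a shuttle wheel maps deflection from its resting position to a rate.
# All ranges here are assumptions for illustration.

def jog_dial_position(angle_deg, min_value, max_value, max_angle=270):
    """Jog dial with hard stops: absolute angle maps to an absolute value
    in [min_value, max_value]."""
    angle = max(0, min(angle_deg, max_angle))
    return min_value + (max_value - min_value) * angle / max_angle

def shuttle_rate(deflection_deg, max_deflection=90, max_rate=32.0):
    """Shuttle wheel: deflection from rest maps to a rate of change.
    Positive = fast-forward, negative = reverse, zero at rest."""
    d = max(-max_deflection, min(deflection_deg, max_deflection))
    return max_rate * (d / max_deflection)
```

Note the difference in what letting go means: the jog dial stays where you left it, while the shuttle springs back to rest and the change stops. That spring-back is exactly why the shuttle can’t help Strange zoom in on a detail once he’s found it.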
There are also free-spin jog wheels, which can specify absolute or relative values, but since Strange’s wrist is not free-spinning, this is a nonstarter. So I’ll make the call and say that what we see in the film, the jog dial, is the right kind of control.
So if a jog dial is the right type of dial, and you start thinking of the Eye in terms of it being a video scrubber, it’s tackling a common enough problem: Scouring a variable range of data for things of interest. In fact, you can imagine that something like this is possible with sophisticated object recognition analyzing security footage.
The investigator scrubs the video back in time to when the Mona Lisa, which since has gone missing, reappears on the wall.
INVESTIGATOR
Show me what happened—across all cameras in Paris—to that priceless object…
She points at the painting in the video.
…there.
So, sure, we’re not going to be manipulating time any…uh…time soon, but this pattern can extend beyond the magic items of a movie.
The scrubber metaphor brings us nearly all the issues we have to consider.
What are the extents of the time frame?
How are they mapped to gestures?
What is the right display?
What about the probabilistic nature of the future?
What are the extents of the time frame?
Think about the mapping issues here. Time goes forever in each direction. But the human wrist can only twist about 270 degrees: 90° pronation (thumb down) and 180° supination (thumb away from the body, or palm up). So how do you map the limited degrees of twist to unlimited time, especially considering that the “upright” hand is anchored to now?
The conceptually simplest mapping would be something like minutes-to-degree, where full pronation of the right hand would go back 90 minutes and full supination 2 hours into the future. (Noting the weirdness that the left hand would be more past-oriented and the right hand more future-oriented.) Let’s call this controlled extents to distinguish it from auto-extents, discussed later.
What if -90/+180 minutes is not enough time to encompass the lifespan of the object at hand? Or what if that’s way too much time? The scale of those extents could be modified by a second gesture, such as the distance of the left hand from the right. So when the left hand was very far back, the extents might be -90/+180 years. When the left hand was touching the right, the extents might be -90/+180 milliseconds, to find detail in very fast-moving events. This kind-of backworlds the gestures seen in the film.
That’s simple and quite powerful, but doesn’t wholly fit the content, for a couple of reasons. The first is that time scales can vary so much between objects. Even -90/+180 years might be insufficient. What if Strange was scrubbing the timeline of a Yareta plant (which can live to be 3,000 years old) or a meteorite? Things exist at greatly differing time scales. To solve that, you might just say OK, let’s set the scale to accommodate geologic or astronomic time spans. But now, to select meaningfully between the apple and the tome, his hand must move mere nanometers, which would be hard for Strange to get right. Adding a logarithmic time scale to that slider control might help, but it still only provides precision at the “now” end of the spectrum.
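For the curious, here’s what those controlled-extents mappings might look like as code. This is a sketch under the assumptions already stated (about 90° of pronation, 180° of supination, the upright hand anchored to now); the function names and defaults are mine, not the film’s.

```python
# Assumed wrist limits from the discussion: -90° (full pronation, past)
# to +180° (full supination, future), with 0° anchored to "now".
PRONATION_LIMIT = -90    # degrees, full thumb-down twist
SUPINATION_LIMIT = 180   # degrees, full palm-up twist

def wrist_to_time_linear(angle_deg, minutes_per_degree=1.0):
    """Controlled extents, linear: at 1 min/degree this gives the
    -90/+180 minute range described above."""
    angle = max(PRONATION_LIMIT, min(angle_deg, SUPINATION_LIMIT))
    return angle * minutes_per_degree  # offset from now, in minutes

def wrist_to_time_log(angle_deg, max_minutes):
    """Logarithmic alternative: equal twists multiply the offset rather
    than add to it, giving fine precision near now but, as noted above,
    only near the now end of the spectrum."""
    if angle_deg == 0:
        return 0.0
    limit = SUPINATION_LIMIT if angle_deg > 0 else PRONATION_LIMIT
    frac = min(abs(angle_deg), abs(limit)) / abs(limit)
    offset = (1 + max_minutes) ** frac - 1
    return offset if angle_deg > 0 else -offset
```

In the logarithmic version, a small twist moves minutes while a full twist spans the whole extent, which is exactly the trade-off the paragraph worries about: precision near now, vast jumps per degree at the stops.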
If you design a thing with arbitrary time mapping you also have to decide what to do when the object no longer exists prior to the time request. If Strange tried to turn the apple back 50 years, what would be shown? How would you help him elegantly focus on the beginning point of the apple and at the same time understand that the apple didn’t exist 50 years ago?
So letting Strange control the extents arbitrarily is either very constrained or quite a bit more complicated than the movie shows.
Could the extents be automatically set per the focus?
Could the extents be set automatically at the beginning and end of the object in question? Those can be fuzzy concepts, but for the apple there are certainly points in time at which we say “definitely a bud and not a fruit” and “definitely inedible decayed biomass.” So those could be its extents.
The extents for the tome are fuzzier. Its beginning might be when its blank vellum pages were bound and its cover decorated. But the future doesn’t have as clean an endpoint. Pages can be torn out. The cover and binding could be removed for a while and the pages scattered, but then mostly brought together with other pages added and rebound. When does it stop being itself? What’s its endpoint? Suddenly the Eye has to have a powerful and philosophically advanced AI just to reconcile Theseus’ paradox for any object it was pointed at, to the satisfaction of the sorcerer using it and in the context in which it was being examined. Not simple and not in evidence.
Auto-extents could also get into very weird mapping. If an object were created last week, each single degree of right-hand pronation would reverse time by about 2 hours; but if it were fated to last a millennium, each single degree of right-hand supination would advance time by about 5 years. And for the overwhelming bulk of that display, the book wouldn’t change much at all, so the differences in the time mapping between the two would not be apparent to the user and could cause great confusion.
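That mismatch is easy to verify with a little arithmetic. This sketch assumes 90° of pronation spans the object’s entire past and 180° of supination its entire future; the week-old, millennium-fated book is the hypothetical from the paragraph above.

```python
# Per-degree resolution under auto-extents for the hypothetical book:
# created last week, fated to last a millennium. Illustrative figures.

PAST_HOURS = 7 * 24     # one week of existence so far
FUTURE_YEARS = 1000     # a millennium yet to come

hours_per_degree = PAST_HOURS / 90     # past mapped to 90° of pronation
years_per_degree = FUTURE_YEARS / 180  # future mapped to 180° of supination

print(round(hours_per_degree, 2))  # 1.87 -- about 2 hours per degree
print(round(years_per_degree, 2))  # 5.56 -- about 5 years per degree
```

So an identical one-degree twist means roughly two hours in one direction and five-plus years in the other, with nothing in the display to signal the difference.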
So setting extents automatically is not a simple answer either. But between the two, starting with the extents automatically saves him the work of finding the interesting bits. (Presuming we can solve that tricky end-point problem. Ideas?) Which takes us to the question of the best display, which I’ll cover in the next post.
Mordo wears the Vaulting Boots of Valtor throughout the movie and first demonstrates their use to Dr. Strange when they are sparring. The Boots allow the user to walk, run, or jump on air as if it were solid ground.
When activated, the sole of each boot creates a circular field of force in anticipation of a footfall in midair, as if creating free-floating stepping stones.
How might this work as tech?
The main interaction design challenge is how the wearer indicates where he wants a stepping-stone to appear. The best solution is to let Mordo’s footfall location and motion inform the boots when and where he expects there to be a solid surface. (Anyone who has stumbled while misjudging the height or location of a step on a stairway knows how differently you treat a step where you expect there to be solid footing.)
If this were a technological device, sensors within the boots would retain a detailed history of the wearer’s stride for all possible speeds and distances of movement. The boots would detect muscle tension and flexion combined with the owner’s direction and velocity to accurately predict the placement of each step and then insert an appropriately elevated and angled stepping stone. The boots would know the difference between each of these styles of movement, walking, running, and sprinting and behave accordingly.
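As a toy illustration of that prediction idea, here’s the simplest possible extrapolator. Real boots would fuse pressure, flexion, and inertial data as described above; this just continues the vector of the last stride, which is enough to show the principle. Everything here is an assumption about how such boots might work, not anything established by the film.

```python
# Toy step predictor: assume a steady stride, so the next footfall
# continues the vector from the previous step to the last one.
# Coordinates are (forward meters, height meters), purely illustrative.

def predict_next_footfall(prev_step, last_step):
    """Extrapolate the next footfall from the last two placements."""
    return tuple(l + (l - p) for p, l in zip(prev_step, last_step))

# Walking up and forward: each step +0.7 m forward, +0.2 m up.
print(predict_next_footfall((0.0, 0.0), (0.7, 0.2)))  # (1.4, 0.4)
```

The boots would place the force-field stepping stone at the predicted point, then correct from the sensor history when the actual footfall deviates.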
As a result, Mordo could always remain upright and stable regardless of his intended direction or how high he had climbed. And while Mordo may be a sorcerer with exceptional physical training, he isn’t superhuman. With the power of the boots he is only able to run and step as high as he could normally if for example he was taking a set of stairs two or three at a time.
As a magical device, the intelligence imbued in the boots is limited to the awareness of the intent of the sorcerer and knows where to place each force-field stepping-stone.
The glowing bits
As each step lands, the placement of the boot results in a brief energy discharge in the shape of a brilliant glowing gold circle. Is this a bug in combat, or a feature? The blog has before called out how glowing bits on a warrior make them an easier target, but it’s worth noting that Mordo’s feet are actually on individual stepping stones for less than half a second. He leaves them behind as he goes. If someone targeted the circles themselves, they’d mostly be targeting where he was rather than where he is, so I’d count it as a distracting feature. As long as he wasn’t being targeted with a long-distance area-of-effect weapon.
Activation?
When describing them to Strange, Mordo demonstrates the effect with a subtle kick. It’s not clear if he’s activating the boots or just demonstrating that they have inherent magical powers.
These boots are awesome. They would require a lot of practice to get used to, but after some tumbles a user could always acquire the high ground on an opponent and they would never need a ladder to change a light bulb. What’s not known is what would happen if the user tried to do parkour style moves where a step would be perpendicular to the ground. Could Mordo walk on walls or the ceiling of a room?
More!
It would be cool to know more about these boots. Could Mordo climb to a given height and then just stand there, or is each step a limited-duration effect? Could the boots be used offensively as a kind of boot-sized force field? In a fight, Mordo could lash out with a sidekick/step that stops an onrushing attacker, not unlike hitting a brick wall.
Since he’s heavily set up to be the Big Bad in the sequel, we’ll likely see more of these relics, and get some more of these questions answered.
Hover technology is a thing in 2015 (as imagined in 1985), and it appears in many places.
Hoverboards
When Marty has trouble with Griff Tannen, he borrows a young girl’s hover scooter and breaks off its handlebar. He’s able to put his skateboarding skills to use on the resulting hoverboard.
Griff and his gang chase Marty on their own hoverboards. Griff has a top-of-the-line hoverboard labeled a Pit Bull. Though Marty clearly has to manually supply forward momentum to his, Griff’s has miniature swivel-mount jet engines that (seem to) respond to the way he shifts his weight on the board.
Hovertraction
George requires traction for a back problem, but this doesn’t ground him. A hover device clamps his ankles in place and responds to foot motions to move him around.
Hover tech is ideal for leaning control, like what steers a Segway. That’s just what seems to be at work in the hoverboard and hovertraction devices. Lean in the direction you wish to travel, just like walking. No modality, just new skills to learn.
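Here’s a minimal sketch of that leaning control. The gain and dead-zone values are illustrative assumptions, not anything from the films or from real Segway firmware.

```python
# Lean-to-travel control, Segway-style: pitch angle (leaning forward or
# back) maps to acceleration, so standing upright holds your current
# speed. A small dead zone keeps ordinary balance shifts from moving
# the board. Gains are illustrative assumptions.

def lean_to_acceleration(pitch_deg, gain=0.5, dead_zone_deg=2.0):
    """Map lean angle to acceleration; leans inside the dead zone are
    ignored, and beyond it acceleration grows linearly with lean."""
    if abs(pitch_deg) <= dead_zone_deg:
        return 0.0
    sign = 1 if pitch_deg > 0 else -1
    return gain * (pitch_deg - sign * dead_zone_deg)

print(lean_to_acceleration(1.0))   # 0.0 -- inside the dead zone
print(lean_to_acceleration(10.0))  # 4.0 -- lean forward to speed up
```

The appeal is exactly what the paragraph says: the control reuses the body’s existing walking reflex, so there’s no mode to learn, only a skill to refine.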
So this is going to take a few posts. You see, the next interface that appears in The Avengers is a video conference between Tony Stark in his Iron Man supersuit and his partner in romance and business, Pepper Potts, about switching Stark Tower from the electrical grid to their independent power source. Here’s what a still from the scene looks like.
So on the surface of this scene, it’s a communications interface.
But that chat exists inside of an interface with a conceptual and interaction framework that has been laid down since the original Iron Man movie in 2008, and built upon with each sequel, one in 2010 and one in 2013. (With rumors aplenty for a fourth one…sometime.)
So to review the video chat, I first have to talk about the whole interface, and that has about 6 hours of prologue occurring across 4 years of cinema informing it. So let’s start, as I do with almost every interface, simply by describing it and its components.
Exosuit
The Iron Man is the name of the series of superpowered exosuits designed by Tony Stark. They range from the Mark I, a comparatively crude suit of armor built to escape imprisonment by terrorists, through the Mark XLVI, the armor seen in The Avengers: Age of Ultron. The suit acts as defense against nearly every type of weapon known. It has repulsor beams built into the palms and, in later models, the arc reactor mounted in the chest that can be used to deliver concussive force. It allows the wearer to fly. Offensive weaponry varies between models, but has included a high-powered laser system, an auto-targeting minigun pod, and missiles. The suit can act semi-autonomously or via remote control. One of the models in The Avengers has parts that are seen to self-propel to Tony, targeting a beacon bracelet he wears, and self-assemble around him very quickly.
Immersive display
Though Tony’s head is completely covered, he has a virtual reality display within his helmet. It is a full-field-of-vision, very high-resolution, full-color display that provides stereoscopic imaging. It allows Tony to see the world around him as if he were not wearing the helmet, and augments the view with goal-, person-, location-, and object-sensitive information.
The display varies a great deal, changing to the needs of the situation. But five icons persist in the lower part of the display, which seem to be: suit status, targeting and optics, radar, artificial horizon, and map.
An interpretive view of Tony’s experience, from Iron Man (2008).
A first-person view from within the HUD, Iron Man (2008).
There is much to critique about the readability of the complex layering and translucency, the limits of human perception, and the necessarily- (and strictly-) interpretive nature of what we as audience see, but let me save those three points for a later post. For now it’s enough to log the features as aspects of the system.
Though Tony could use his hands to interact with an interface projected into the augmented reality view around him, his hands are often occupied in controlling flight or in combat. For this reason the means of input are head gesture, eye gesture, and voice input. A bit more on each follows.
Elements within the HUD such as reticles around his eyes follow and track his head gestures. Other elements stay locked in place. The HUD can track his gaze perfectly, allowing him to designate targets for his weapons with a fixation. Using this perfect eye tracking, Tony can also speak about something he is looking at, either in the real world or in the interface, and the system understands exactly what he’s talking about.
In fact, Tony is able to speak fully natural language commands, and indeed, carry out full-Turing conversations with the suit because of the presence of…
Strong artificial intelligence: JARVIS
An on-board artificial intelligence known as JARVIS handles any information task Tony asks of it, monitors the surroundings, and anticipates informational needs. There is strong evidence that most of the functions of the suit are handled by JARVIS behind the scenes. The importance of the artificial intelligence to the function of the suit cannot be overstated: it’s difficult to imagine how most of the suit could function as it does without an AI behind the scenes facilitating results and even guiding Tony. With this in mind it is instructive to reframe the AI as the thing actually named the Iron Man, with Tony Stark being an onboard manager, or, more charitably, a command-and-control center. Who quips.
Several times throughout the movie, Loki places the point of the glaive on a victim’s chest near their heart, and a blue fog passes from the stone to infect them: an electric blackness creeps upward along their skin from their chest until it reaches their eyes, which turn fully black for a moment before becoming the same ice blue of the glaive’s stone, and we see that the victim is now enthralled into Loki’s servitude.
You have heart.
The glaive is very, very terribly designed for this purpose.
It freaks the victim out (or should, anyway)
Look at that damned thing. It looks like an elven shiv. A can opener for human flesh. When a victim sees it coming, they will reasonably presume it’s going to split them like a fresh-caught fish, and do whatever they can to flail away from it. See how Loki has to grab Hawkeye by the wrist? That’s because short of some sort of hypnosis, Hawkeye would not just stand there like that with Orcrist slicing towards his sternum. We have to backworld some sort of pre-enthrallment mind effect to explain why he’s not jerking in the other direction. As all great propaganda and persuasion masters know, you can’t approach as a threat, or the victim’s fight-or-flight might kick in and slam that window shut for winning their hearts and minds.
It might, in fact, slice the target open
Even if there’s some mystical roofie thing going on to calm the victim, if Loki had too much force behind his approach, or someone bumped either of them, the glaive could go into the victim, causing a shock of pain that might wake them up before the enthrallment could take place. Or worse, it could actually damage the heart and kill the victim, which is counter to Loki’s goal.
It requires precision, control, and time
To avoid the disheartening of an intended victim, then, Loki has to grab them, momentarily hypnotize them into calmness, carefully ease the thing up to the target, and hold it and them in place for a few seconds. Imagine a button on a keyboard that had to be touched with feather pressure, or it would brick the machine. This would not be a great keyboard. All these are expensive dependencies, and the time it takes is time for onlookers to intervene (or to somehow incapacitate the victim to save them).
It tips its hand
OK, fine, the glowing-blue eyes might be an unavoidable side effect of the “tech” (and yes, I understand its very valuable narrative purpose in signaling enthrallment), but if you were designing an enthrallment tech, you’d want to avoid such an obvious “tell,” especially right there in the main location people target when looking at other people.
A redesign
So there are a lot of ways this is less than ideal. Fortunately we don’t have to call iGlaive and tell them to shutter operations. I think we can fix this in one of a few ways.
Soften the industrial design? No.
The glaive needs to stay looking evil, and being sharp and pointy helps with that.
1. Have the glaive pull them in
A cinematic hack might be to visually imply that the glaive helps with these problems. Imagine Loki approaching Hawkeye with the glaive outstretched, and the blue fog appears and pulls Hawkeye towards its point. The point of contact can glow slightly, implying some protection, and the crystal can glow to do its enthralling. Now it’s a feature, not a bug.
2. Go broadside
If for some plot or cinematic reason that wouldn’t work, you could have Loki use the broad side of the glaive against the chest of the person. Slapping it like an oar onto someone would be a fast gesture that wouldn’t need a lot of precision to get the crystal near the heart. It could even enable sneakier attacks from the side. It might prove cinematically problematic when enthralling a female character, but since that doesn’t happen on screen in The Avengers, it’s moot.
3. A new gesture
If Loki isn’t the broadside sort, you could keep the staff the same and redesign the gesture. The mind is the thing enthralled, so it’s tempting to have it located on a forehead or neck, but we can’t have Loki gesturing to the victim’s head, because then we lose the awesome moment near the climax when Loki tries and fails to enthrall Stark on his chest reactor. So let’s keep it cardiac. Maybe we can change the relationship of the glaive to the victim.
Imagine if he lays the glaive across his left forearm (or better: cuts into his own skin, which would explain why he doesn’t just keep enthralling everyone in sight), which begins to glow with the blue fog, and he uses a pointing index finger to tap the victim’s heart. A finger-to-sternum interaction would telegraph a lot less danger, risk fewer victims’ lives, and enable speed with less apparent precision required. As above, it might be problematic to enthrall a woman without the audience going OMG BOOBS, but again, we’re saved from that problem by the script.
In many ways this is my favorite of the redesigns. It’s a Natural User Interface. With blue fog.
Any of those tweaks might help us believe in the interaction, and the lesson is useful for us to keep in mind: requiring great precision of our users only slows them down and keeps them focused on the interface rather than their goals.