In this blog post, I will be comparing and analyzing two VUIs, one fictional (Star Trek android Data) and one real (Amazon Alexa).
1. Explain the main voice interfaces for the Star Trek android and Amazon Alexa. How are they similar? How are they different?
Both interfaces respond to voice input and have a robotic audio output. Both are also assigned a gender based on their voice and, in Data’s case, his body shape. However, Data has a humanoid form and can convey more information through gestures and (somewhat limited) facial expressions. Additionally, Data has the capacity to form friendships, a desire for autonomy, and the ability to create and pursue his own goals. When Data is ordered to transfer and submit to experimentation, he refuses. If Alexa were given a similar command, I don’t think it would have the capacity, or even the desire, to refuse.
2. Compare and contrast the expectations the users/audiences had of each of these voice interfaces. What were the expected behaviors of those voice interfaces? What happened when those expected behaviors were not met? What were the reactions of those users/audiences?
As mentioned in Design for Voice Interfaces, the more human-like an interface is, the higher the expectations of its users or audience. Since Data has a humanoid form and acts like a human in many ways, he is expected to have near-human functionality: he must be able to hold sustained, intelligible conversation, understand most social nuances, and so on. Alexa’s responses, by contrast, are fairly limited, so users quickly learn that Alexa cannot be held to very high standards and lower their expectations accordingly.
3. The uncanny valley is a theory that describes a “level of realism in artificial life forms in which the human observer has a negative reaction.” At which point for each of these interfaces (Star Trek android and the Echo) did the user/audience encounter the uncanny valley?
I personally react negatively when Alexa sings. I can’t fathom who decided to program her to do that. It’s not robotic enough to be incorporated into a song in an interesting way, and it’s not realistic enough to cross the threshold of being pleasant to listen to. I don’t really know if that qualifies as the uncanny valley. For Data, it’s a little strange for me to really think of him being intimate with a human. I can buy him having friends and close relationships, but the idea of him engaging in any kind of intimacy is bizarre to me.
4. Did you find these reactions surprising? If so, why? If not, why not?
I don’t think these reactions are necessarily surprising. Alexa has distinct characteristics that make it clear to the user that she’s not human. She’s not attempting to convince me that she is a human, so I can’t be too creeped out by any of her behavior. On the other hand, while Data’s skin, eyes, and speech are distinctly inhuman, he also has a lot of human characteristics, and we are clearly meant to think of him as an autonomous being. Still, something in me rebels against the idea of Data actually being romantically involved with someone, maybe because he sounds like a robot, maybe because he’s somewhat childlike.
5. These sources of popular culture are artifacts of different eras. How do you think the public’s expectations of voice interfaces have evolved since then? (Star Trek original series was 1966; Amazon Echo was 2014.)
I think people used to have higher expectations for voice interfaces because the technology didn’t exist yet, so it was easy to imagine something as functional as Data or even the computer in Star Trek. Now that people have a better idea of the kind of technology required to build a voice interface, they also understand its limitations.
6. Each of these voice interfaces have a “life” of their own…or do they? What is your perspective on the identities and personalities of Data and Alexa? If you had to rank them on a 1-10 scale of mechanistic (1) to humanistic (10), where would each of them stand, and why?
I think Data is a ten, or close to it, on that scale, because he is an autonomous being with interests, goals, and meaningful relationships that closely resemble a human’s. Alexa is probably around a three or four. She shows a bit of personality when she interacts with users, but on the whole she lacks many of the characteristics that would make users perceive her as a fully realized, human-like being.
7. In what circumstances would it be better to design a voice interface that was emotionally appealing? When would it be better to design something more utilitarian?
It would probably be best to design an emotionally appealing voice interface for situations where the user needs assurance and comfort, such as a patient in a hospital or a customer in a store. Something more utilitarian would be better in contexts where focus and efficiency matter most, such as a surgeon working in an operating room.
8. Now that I’ve poured all of this content into your brain, what’s one major takeaway you’ve learned about designing voice interfaces?
I think it’s important to consider what a voice interface is useful for and tailor its design to that, rather than using one because it’s trendy or seems “cool.” There are definitely times when a voice interface is extremely useful, and I find myself using my Echo Dot more than I expected, but as pointed out in Design for Voice Interfaces, a voice interface is not always appropriate, and sometimes it needs to be paired with another output to be most effective. Still, I think it can be a powerful tool for accessibility, and I’m excited to think of the tools a voice interface can help create.