First Impressions of a Voice User Interface

I bought my dad an Amazon Echo Dot a few years ago for Father’s Day, but I had never extensively interacted with it before. When I visited my parents, I tried asking for facts about dogs, but Alexa never seemed to know what I was talking about. Now, having used one in class and interacted with my own at home, I’ve learned a lot about how these devices work.

Initial setup was pretty rocky, but I think a large part of that was the classroom environment. We had three pairs of people trying to connect to three different devices at the same time in a relatively small space. The step we got stuck on was selecting the correct WiFi network for the correct device: during setup, each Echo broadcasts its own temporary WiFi network, and with three devices in the room, we had to guess which of the three networks belonged to ours. We guessed wrong and ended up having to move to a different spot. By the time I set up my own at home, though, I was an expert.

Once the device was successfully set up, it was easily able to handle simple tasks: it accurately reported the weather for the correct location, set an alarm, and even thoughtfully asked whether the user would like the alarm set for all weekdays. It was able to play a song, but then the volume was too loud for the device to hear its own name, and the user had to press the button on top to force it to listen to a request. We were able to try several of Alexa’s skills, including games like Jeopardy, Question of the Day, and Rock, Paper, Scissors.

Rock, Paper, Scissors was a bit odd to play. The device doesn’t have hands or a visual display, so there was no way to know which item Alexa chose other than her telling us. Out of seven rounds, we won six and tied with Alexa on one. I started to suspect that Alexa was letting me win, since I just had to trust what she was saying and couldn’t see anything.
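
Out of curiosity, I later did a quick sanity check on how unlikely that streak would be if Alexa were really choosing uniformly at random. A back-of-the-envelope sketch in plain Python (nothing Alexa-specific here):

```python
from math import comb

# Against a uniformly random opponent, each round is an independent
# win / tie / loss with probability 1/3 each.
p = 1 / 3

# Exactly 6 wins and 1 tie in 7 rounds: choose which round was the tie.
p_six_wins_one_tie = comb(7, 1) * p**7
print(f"exactly 6 wins + 1 tie: {p_six_wins_one_tie:.2%}")  # ~0.32%

# A weaker condition: simply never losing across all 7 rounds.
p_no_losses = (2 / 3) ** 7
print(f"no losses in 7 rounds:  {p_no_losses:.2%}")         # ~5.85%
```

Either we got genuinely lucky, or the skill shades its choices in the player’s favor. I can’t tell from the outside, which is exactly the trust problem with a voice-only game.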

Alexa was also able to multitask, playing Rock, Paper, Scissors in the middle of making a phone call. This can be seen in the video below.

There were a few issues that stemmed from Alexa mishearing or misunderstanding questions and commands. Sometimes she would cut the user off mid-question, or she would answer the wrong question. Another issue was that sometimes another group would speak too loudly and the device we were using would respond to their questions or commands. This can also be seen in the video: another person started playing Rock, Paper, Scissors with their device, but our device heard the command too and started playing as well.

I think the use of the light to signal what the device is doing (listening, processing a question or command, making a phone call, etc.) is very effective. When making a phone call, the light changing to green was the only sign that it was doing anything. When the device heard the other group’s command, the light changed to blue, letting us know that it was listening, so we understood why it suddenly started playing Rock, Paper, Scissors. However, I wonder what the experience would be like for a non-sighted user.

The main usability issue is that certain commands have to be worded in specific ways for the device to understand them. For example, when we were playing Jeopardy, we kept trying to skip a clue by saying variations of “skip this question.” The device kept insisting that we could only answer in the form of a question and did not register that we weren’t trying to give an answer at all. After a few attempts, the device, apparently frustrated with us, reminded us of the correct prompt for skipping a clue.
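
My rough mental model of why this happens: a voice skill maps whatever you say onto a small, fixed set of intents, each defined by sample phrasings, and anything that doesn’t match falls through to a default handler. A toy sketch of that idea in Python (the intent names and sample phrases are invented for illustration, not Alexa’s actual implementation):

```python
# A much-simplified stand-in for a skill's intent matching.
SKILL_INTENTS = {
    # Hypothetical sample utterances; the real skill's prompts differ.
    "SkipClueIntent":   {"next clue", "skip that clue", "pass"},
    "RepeatClueIntent": {"repeat the clue", "say that again"},
}

def match_intent(utterance: str) -> str:
    text = utterance.lower().strip()
    for intent, samples in SKILL_INTENTS.items():
        if text in samples:  # only near-exact phrasings are recognized
            return intent
    # Everything else falls through to the default handler, which in
    # Jeopardy assumes you are trying to answer the clue.
    return "AnswerIntent"

print(match_intent("skip that clue"))      # SkipClueIntent
print(match_intent("skip this question"))  # AnswerIntent (cue the scolding)
```

Real platforms do fuzzier matching than literal string comparison, but the constraint is the same: if your phrasing isn’t close to one of the skill’s registered utterances, you land in the default handler.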

I do think the Alexa app does a good job of giving users ideas of how to word questions and commands by offering plenty of suggestions about what to try. I learned a lot about how to communicate with Alexa by looking through the app. The device itself also tries to anticipate what you might want and tells you ahead of time which commands to try. For example, when I earned a free game through Question of the Day, the device told me what to say in order to access it. However, it would be easy to forget commands, since there’s no easy reference to consult. I generally rely on Alexa to remind me of my choices, but maybe there’s a way to ask for them.

Another thing that could be improved: when the device was making the phone call, we weren’t certain that it was actually doing so, because the phone didn’t switch to the typical “call” screen. The device’s light did change, so we understood that it was doing something, but we were a little confused because we didn’t hear anything. When I tried this again on my personal device at home, I realized that it plays a dialing tone that makes it clear a call is being placed, so maybe we just couldn’t hear it in the classroom because the device’s volume was too low.

All in all, I think I’ve gotten a lot better at using these devices and understanding how they work. They have a lot more utility than I thought they did, and I’d like to try more of the health and fitness skills to see whether they’re really practical to use.
