What is a Voice User Interface?
A Voice User Interface, or VUI (pronounced “voo-e”), is simply an application that the user interacts with by speaking. Most of us know voice interfaces from automated phone systems. Sadly, a large number of phone systems have badly designed interfaces, often because the developers who build them don’t understand how to design for a voice experience.
Designing the Blueprint
While it’s very tempting to design your voice experience using a flow-chart, it’s not the best way to achieve a great experience. It’s important to have a flow-chart explaining the workflow or process, but the conversational prompts shouldn’t be built around it. Remember that speech happens in context and is not based on logic. If you design your voice interface only from call-flows or flow-charts, the result is unlikely to feel conversational.
For example, in a conversation, we always go forward and never backward. Yet how many automated systems have you heard with prompts such as, “If you want to return to the main menu, press 1”? The key ingredient in making a design naturally conversational is an understanding of linguistics. In conversational design, the focus is on building an interaction based on the way people speak, not how they write. Many voice interfaces are scripted the way we write rather than the way we speak.
Designing a VUI dialog
So how do we get started? In voice interfaces, we don’t have wireframes: we have sample dialogs. But before we write those, let’s start with user stories. Start by describing scenarios in which your users will find the interface useful. Identify the major scenarios that describe the purpose of your interface and the ways your users can interact with it.
Next, start crafting a sample dialog for your scenarios. A sample dialog is a script that shows a conversational flow between the system and the user. You can think of this as the VUI version of a wireframing process. A sample dialog can be something you show your client to get feedback before you start prototyping.
Here’s a sample dialog:
System: Thanks for calling BankOMatic. I looked up your number and found an account. I see you have a payment that’s due today. Are you calling about that?
System: OK. For security, tell me the last digits of your social security.
System: Thanks. Let me look it up…
Found it. You have a payment of $40.98 that’s due today. Would you like to make a payment now? Just say yes or no.
System: All right. I see you have a VISA card ending in 789 on file. Would you like me to use that?
Caller: Yes, please.
System: Just a second…. All done! I’ve put that through. Just so you know – it might take up to 48 hours to process that. Now – if that’s all you needed, feel free to hang up and thanks
The dialog above follows a conversational design approach. In a conversation, we tend to use contractions (e.g., ‘you’re’, ‘I’m’). We do this unconsciously, so it’s important to realize that the way we speak is different from the way we write. Keep that in mind when designing your VUIs.
Another idea to keep in mind is the discourse marker. A discourse marker is a word or phrase that you can use to connect and arrange what you say or write (e.g., ‘anyway’, ‘now’). In the dialog above, I have used the word ‘Now’ to make an obvious transition from one idea to another.
Whether you’re building a speech application or a chatbot, it’s important to understand the persona you’re trying to create. Your voice system’s persona, or personality, is an extension of your brand’s image and plays an important role.
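As a rough illustration of the point about contractions, a draft prompt can be checked for written-style phrasing before it is recorded. The sketch below is only an assumption about how such a check might look: the `CONTRACTIONS` table and the `find_written_style` function are hypothetical names, and the phrase list is far from complete.

```python
import re

# Hypothetical, deliberately small table of written-style phrases
# and the contractions people normally use when speaking.
CONTRACTIONS = {
    "you are": "you're",
    "i am": "I'm",
    "that is": "that's",
    "do not": "don't",
    "it is": "it's",
}


def find_written_style(prompt: str) -> list[tuple[str, str]]:
    """Return (phrase, suggested contraction) pairs found in a prompt."""
    hits = []
    for phrase, contraction in CONTRACTIONS.items():
        # Whole-word, case-insensitive match so "I am" and "i am" both count.
        if re.search(rf"\b{re.escape(phrase)}\b", prompt, flags=re.IGNORECASE):
            hits.append((phrase, contraction))
    return hits
```

Calling `find_written_style("I am sorry, that is not a valid date.")` flags both "i am" and "that is", while the spoken-style version "I'm sorry, that's not a valid date." comes back clean. A check like this is no substitute for reading the prompt aloud, but it can catch the most obvious cases.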
A great persona in voice is not about just having a pretty voice. It’s also about connecting with the user on the other end. When we hear a voice we unconsciously make a lot of assumptions about that person. These assumptions include how intelligent that person might be or which region or country they’re from. When designing a persona, here are some things you should think about:
- Role: What’s the role of the application? Is it an assistant helping get things done? Is it a stock advisor? Bank clerk?
- Company Brand: The persona you select should represent your brand’s image.
- Target Audience: A good persona should be familiar to your users. For a compelling persona, consider your users’ demographics, attitudes, and lifestyle.
Documenting and Prototyping a VUI
Prototyping for voice can be tricky, since not many voice prototyping tools are readily available. Some available tools are Sayspring and API.AI, but I can’t comment on them because I prototype with internal tools.
When it comes to documenting your designs, here is what you should think about. Think of each interaction as a series of states. Each state should have:
- State Name – The unique state name you can reference back to.
- State Type – The type of the state (e.g., transfer state, data lookup state).
- Initial Prompt – Your main initial prompt that your user will hear.
- Error Prompt(s) – The prompts you’ll play if the user says something out of context.
- State Conditions – Conditions that point to the next state on success (e.g., the user selects order status and the condition points to the next state, ‘OrderStatusPrompt’).
State Name: WelcomeAlexa
State Type: Presentation
Initial Prompt: Welcome to Alexa Skills. What can I help you with?
Error Prompt: Sorry – I didn’t understand. You can tell me things like….
State Conditions: Order Status -> OrderStatusPrompt
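The state fields listed above could be captured in code roughly as follows. This is a minimal Python sketch, not the format of any particular voice platform: the `DialogState` class, its field names, and the `next_state` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DialogState:
    """One documented state in a voice interaction (hypothetical structure)."""
    name: str                  # unique name other states can reference
    state_type: str            # e.g. "Presentation", "Transfer", "DataLookup"
    initial_prompt: str        # the first thing the user hears in this state
    error_prompts: list[str] = field(default_factory=list)
    # Maps what the user asked for to the name of the next state.
    conditions: dict[str, str] = field(default_factory=dict)

    def next_state(self, user_choice: str) -> Optional[str]:
        """Return the name of the next state, or None if nothing matches."""
        return self.conditions.get(user_choice)


# The example state from the article, expressed with this structure.
welcome = DialogState(
    name="WelcomeAlexa",
    state_type="Presentation",
    initial_prompt="Welcome to Alexa Skills. What can I help you with?",
    error_prompts=["Sorry, I didn't understand. You can tell me things like..."],
    conditions={"Order Status": "OrderStatusPrompt"},
)
```

Here `welcome.next_state("Order Status")` returns `"OrderStatusPrompt"`, and an unrecognized choice returns `None`, which is the cue to play one of the error prompts.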
When it comes to designing error strategies, one powerful approach is to move from general to specific. Let’s have a look at this:
System: What’s your date of birth?
System: Just tell me your date of birth using two digits for the month, two digits for the day, and four for the year.
As you can see, the error strategy escalates from general to specific rather than giving all the information right away. In other words, your errors should be progressive and helpful. Another approach is known as rapid re-prompt. In this approach, we don’t give away all the details; instead, the system responds with something simple, such as, “I’m sorry?”
Here’s an example:
Bot: How can I help you?
Bot: I’m sorry?
The disadvantage is that the user doesn’t get detailed guidance at first. The best place to use this approach is after open-ended prompts, such as ‘How can I help you?’, where the user often doesn’t even perceive the re-prompt as an error.
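Both error strategies can be sketched with one small helper that picks a prompt based on how many times in a row the system has failed to understand the user. This is an illustrative assumption rather than a prescribed implementation: `pick_error_prompt` and `DOB_ERROR_PROMPTS` are hypothetical names, and using a rapid re-prompt as the first level of the escalation is just one possible combination.

```python
def pick_error_prompt(error_prompts: list[str], error_count: int) -> str:
    """Pick the prompt for the Nth consecutive error (counting from 1).

    Prompts are ordered from general to specific; once the list is
    exhausted, the most specific prompt is repeated.
    """
    if not error_prompts:
        raise ValueError("at least one error prompt is required")
    if error_count < 1:
        raise ValueError("error_count starts at 1")
    index = min(error_count, len(error_prompts)) - 1
    return error_prompts[index]


# Hypothetical escalation for the date-of-birth question:
# a rapid re-prompt first, then the detailed instructions.
DOB_ERROR_PROMPTS = [
    "I'm sorry?",
    "Just tell me your date of birth using two digits for the month, "
    "two digits for the day, and four for the year.",
]
```

On the first misunderstanding, `pick_error_prompt(DOB_ERROR_PROMPTS, 1)` returns the short re-prompt; on the second and any later one, it returns the detailed instructions.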
There are many ways you can design a great experience for voice. Remember that the end focus is on users like you and me. So, it’s important to understand not only the target users but also the context in which the dialogs will appear.
In other words, study the way we speak and write. Study user-centered design processes and learn how to approach the challenge in a human way. It’s rarely the limitations of speech technology that are responsible for a horrible voice experience. It’s usually designers not knowing how to apply these processes that results in a less-than-desirable voice interface. Hopefully, this article will help you design for mere mortals.
Originally published at: https://www.justinmind.com/blog/voice-user-experience-design-and-prototyping-for-mere-mortals/