Jane Muder's profile

Grubhub Alexa skill VUI design process

Prototyping, designing, testing, & launching 
‘Reorder with Grubhub,’ an Amazon Alexa VUI skill

Project roles: Lead voice user interface (VUI) designer, head of content

View the complementary product marketing & FAQ page I created at https://www.grubhub.com/alexa
On March 15, 2017, Grubhub celebrated the launch of our first voice-controlled ordering system, ‘Reorder with Grubhub’ for Amazon Alexa. Designed for frequent Grubhub users, this skill evolves the ordering experience by enabling diners to reorder their favorite dishes without lifting a finger. In this post, I’ll walk you through basic skill functionality, my prototyping, design, and testing processes, and the run-up to launch.

(Note: ‘Reorder with Seamless’ for Amazon Alexa launched next, in May 2017. Much of what’s covered in this post applies equally to that skill. However, its user accounts and order data are unique to the Seamless platform. In other words, they do not cross over to the Grubhub skill or VUI, and the reverse is also true.)


An overview of the Grubhub skill

If you’re a current Grubhub user and you’ve enabled our skill in the Amazon Alexa store, you’ll be able to use the Alexa voice user interface (VUI) to reorder from your Grubhub Order History. You can also manage and update your default payment method and delivery address. This diagram, which will seem more intuitive as you read on, gives an overview of how the Alexa skill works:
Phrases, slot values, and intents: VUI building blocks

Let’s look more closely at what’s happening between a typical Grubhub Alexa skill user and the voice user interface (VUI). Designing a voice-driven skill like this requires that we anticipate and build a comprehensive phrase library, encompassing all possible prompts, questions, and commands a user may utter when interacting with the VUI.

Each phrase I've included in the phrase library contains a few different components: utterances, slot values, and intents. Certain phrases also include one or two more components critical to successful exchanges of information between the user and the Alexa VUI. These components are a) the “wake word,” which signals the Amazon Echo or other device with Amazon Alexa to pay attention to the user's speech — and b) the “skill name,” which invokes a specific skill.

To understand how components are combined into phrases, and how phrases come together to build the library, let’s look at a sample exchange between a Grubhub user and their Amazon Echo or other Alexa-enabled device. When the user wants to talk to Alexa, they say the “wake word” associated with the device (e.g., “Alexa”).

Here’s an example of what the user might say to initiate the exchange: “Alexa, ask Grubhub to reorder from Blue Ribbon Sushi.”
If the user is a Grubhub customer who has ordered from Blue Ribbon Sushi recently, and Grubhub indicates that the order’s available, Alexa will respond with the next step in the reordering process.
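To make the anatomy concrete, here’s the sample phrase “Alexa, ask Grubhub to reorder from Blue Ribbon Sushi” broken into its components. This is a sketch; the field names are mine, not part of the Alexa platform:

```python
# Component breakdown of the sample phrase
# "Alexa, ask Grubhub to reorder from Blue Ribbon Sushi"
phrase = {
    "wake_word": "Alexa",        # signals the device to pay attention
    "skill_name": "Grubhub",     # invokes the Grubhub skill
    "utterance": "reorder from Blue Ribbon Sushi",  # the user's request
    "slots": {"RestaurantName": "Blue Ribbon Sushi"},  # the variable slot value
    "intent": "ReorderByRestaurant",  # the intent group this phrase maps to
}
print(phrase["intent"])
```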

Phrases are also grouped together based on the specific sort of action or response they trigger, and sometimes, other details such as the slot value they contain — for example, RestaurantName or OrderDate. These groups of phrases are called “intents,” and when they're all put together, they function as a language rubric and a phrase library for the Grubhub skill. Currently, the library contains hundreds of phrases and dozens of intents.

When formatted as an intent, a sample phrase will look like this:
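A sketch in the classic Alexa sample-utterance notation, where the intent name prefixes the phrase and the slot appears in braces (the exact formatting in our codebase may differ):

```
ReorderByRestaurant reorder from {RestaurantName}
```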
Note that in intent format, we omit the “wake word” (i.e., Alexa) and the skill name (i.e., Grubhub), since these aspects of the exchange are handled elsewhere in the code. Likewise, the restaurant uttered above (i.e., Blue Ribbon Sushi) has been replaced with the generic RestaurantName slot value string so the user can request any restaurant from his or her Grubhub order history.

When a phrase is formatted as an intent, we include a prefix, in this example “ReorderByRestaurant.” “ReorderByRestaurant” is the actual “intent.” In fact, every single phrase in the Grubhub skill library that a) kicks off the reordering process and b) requests a specific reorder via the RestaurantName slot value must be mapped to the “ReorderByRestaurant” intent before it can be placed in the codebase. Why? Because the intent performs the critical heavy lifting, and this is true for all Amazon Alexa skills. Rather than requiring Alexa to parse each word the user utters and match it to one of hundreds or even thousands of unique phrases in the codebase, the intent works as a shortcut, enabling Alexa to instantly recognize the user’s request and respond without delay.

Here, each time the user utters one of the phrases from the “ReorderByRestaurant” intent group, Alexa passes the intent and slot value to Grubhub, which returns the response. If the restaurant mentioned matches an available order in the user’s history, Alexa responds by reading out the order, including items, quantities, and total price, and asks the user to confirm the reorder. The example phrase we’ve been using, “Alexa, ask Grubhub to reorder from Blue Ribbon Sushi,” is just one way the user can trigger this specific response. They could also say “Alexa, ask Grubhub to read my last order from Burger King,” or “Alexa, ask Grubhub to gimme the order from Chopt.”
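A simplified sketch of this slot-matching logic. The function name and the order data are illustrative, not Grubhub’s actual backend:

```python
# Hypothetical order history for one user, most recent first
# (restaurant names come from the examples in this post; dishes and prices are invented)
ORDER_HISTORY = [
    {"restaurant": "Blue Ribbon Sushi",
     "items": [("Spicy Tuna Roll", 2)],
     "total": 24.50},
    {"restaurant": "Burger King",
     "items": [("Whopper", 1)],
     "total": 8.99},
]

def handle_reorder_by_restaurant(slots):
    """Match the RestaurantName slot against available reorders."""
    name = slots.get("RestaurantName", "").lower()
    for order in ORDER_HISTORY:
        if order["restaurant"].lower() == name:
            items = ", ".join(f"{qty} x {item}" for item, qty in order["items"])
            return (f"Your last order from {order['restaurant']} was {items}, "
                    f"for a total of ${order['total']:.2f}. "
                    "Would you like to reorder it?")
    return "Sorry, I couldn't find a recent order from that restaurant."

print(handle_reorder_by_restaurant({"RestaurantName": "Blue Ribbon Sushi"}))
```

Because Alexa hands us a normalized slot value rather than raw speech, the backend only has to compare restaurant names, not parse full sentences.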

In the Grubhub skill codebase, an intent group might look something like this:
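A sketch of such a group in the classic sample-utterance notation, built from the example phrases above (the last line is an illustrative addition):

```
ReorderByRestaurant reorder from {RestaurantName}
ReorderByRestaurant read my last order from {RestaurantName}
ReorderByRestaurant gimme the order from {RestaurantName}
ReorderByRestaurant order my usual from {RestaurantName}
```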
The brief list displayed above is only a sample; the actual “ReorderByRestaurant” intent group contains between one and two dozen phrases. Intent groups must be robust to give every user maximum flexibility in interacting with the Grubhub skill. To make this flexibility possible, I developed each intent group to match natural user speech patterns and ordering vocabularies.
Prototyping the voice user interface (VUI)

Reorder with Grubhub is modeled on natural human language patterns and conventions, which required me to make some basic assumptions about how Alexa would prompt and respond to users. Alexa’s responses needed to mimic the phrasing and flow of a human conversation partner, ensuring an engaging and realistic ordering experience for every customer.

For example, if a user says a phrase that suggests a desire to order food, but doesn’t indicate a preference for a specific order or restaurant, I wanted Alexa to respond in a naturalistic way, prompting the user to provide the missing information and continue with the request. So if a user says “Alexa, tell Grubhub I’m hungry,” Alexa would respond, “Great, here are your available re-orders,” followed by a readout of that user’s three most recent Grubhub orders.
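That fallback prompt can be sketched as a handler that trims the readout to the three most recent orders. Again, the function name and data are hypothetical:

```python
def handle_hungry_intent(order_history):
    """Prompt with available reorders when the user hasn't named a restaurant."""
    recent = order_history[:3]  # the skill reads out only the three most recent orders
    names = ", ".join(order["restaurant"] for order in recent)
    return f"Great, here are your available re-orders: {names}."

# Restaurant names from this post, plus one illustrative extra
print(handle_hungry_intent([
    {"restaurant": "Blue Ribbon Sushi"},
    {"restaurant": "Burger King"},
    {"restaurant": "Chopt"},
    {"restaurant": "Shake Shack"},  # hypothetical fourth order, trimmed from the readout
]))
```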

This excerpt from the skill VUI flow illustrates how this works. Alexa’s responses are denoted in the yellow boxes. Note the conversational style and tone of the content. This is intentional — the “voice” of Grubhub needs to show warmth and friendliness.
Of course, building a VUI for this Alexa skill by guessing what real-life users will say and do is never as accurate as testing our design assumptions with actual Grubhub users. Thus, we tested the skill by observing real Grubhub customers as they interacted with the VUI in multiple real-time situations.
Testing the prototype with real users

Once my team developed a working prototype of our skill, we recruited loyal Grubhub users of different backgrounds and lifestyles as our test subjects. Each user sat with one of our product managers and walked through the different tasks within our skill. I observed users as they initiated voice-driven ordering with Grubhub, decided what to reorder, and managed their account settings, which gave me immediate feedback about what was and wasn’t working. By encouraging users to tell us exactly what they thought of their experience, I was able to collect subjective feedback that also proved valuable to the design process.

Testing uncovered these key takeaways:

• The initial phrase library was far too limited. Before releasing the skill, I’d have to build a library that accounted for a much greater diversity of user commands and requests than we offered in the prototype.

• The prototype included too much “filler” language, specifically when Alexa responded to users. It takes just a minute to order food online, so if VUI-based ordering took any longer, it’d be too inconvenient for daily use.

• Some of the exchanges featured more back-and-forths than necessary. Simplifying the flow of information helped. So did paring the process of ordering down to the minimum steps required to ensure an accessible, pleasing, and convenient VUI.

• Sometimes, the Grubhub skill prototype took too long to respond to users. Other times, phrasing was rushed, or too difficult for users to understand. Inserting natural pauses into our code so the conversation felt more human was key in this case.

• Our team tested more complex interactions between the Alexa skill and our users, including requests for orders far in the past and modifications of existing orders. Ultimately, these experiments made the case for keeping exchanges simple, and limited to the three most recent orders in a user’s Grubhub order history.
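On the pacing point above: Alexa responses can be written in SSML, so natural pauses can be inserted with `<break>` tags. A minimal sketch (the helper name is mine, not from the Grubhub codebase):

```python
def with_pauses(sentences, pause_ms=400):
    """Join response sentences with SSML break tags so Alexa pauses naturally."""
    pause = f'<break time="{pause_ms}ms"/>'
    return "<speak>" + pause.join(sentences) + "</speak>"

print(with_pauses([
    "Great, here are your available re-orders.",
    "Blue Ribbon Sushi, Burger King, and Chopt.",
]))
```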

To market, to market

To prepare for the launch of this Alexa skill, my team kicked off a multi-phase, iterative cycle of VUI design and QA. As we continued to refine and re-test the skill, I focused on perfecting the VUI experience and improving the robustness of the intent library. Many improvements were based on findings from user testing sessions, while quite a few others were inspired by members of our team over the project lifecycle.

As the launch date approached, we sent the skill to Amazon’s Alexa team for further testing and approval. The team was very supportive in preparing us for launch and someone from Amazon was always able to answer any questions. (If anyone on the Amazon Alexa team is reading this entry, I’d like to say “Thank you for all your help!”) Once the skill was certified and published to the Alexa skills store, we announced the launch to the community. I'm personally so excited to add voice-driven re-ordering to the Grubhub ordering ecosystem—and I'm proud to welcome you to this new experience.

For more information or FAQs about Reorder with Alexa, visit the product & FAQ page on Grubhub's website. To try the skill yourself, visit Grubhub's detail page in the Alexa skill store.