Inside the Alexa Prize
Amazon is in a pitched battle with its rivals to bring truly conversational AI into our homes. So the company is staging a contest—a mad dash toward an outlandish goal: Cook up a bot capable of small talk.
By JAMES VLAHOS
Feb 27 2018
The first interactor—a muscular man in his fifties with a shaved head and a black V-neck sweater—walks into a conference room and sits in a low-slung blue armchair before a phalanx of video cameras and studio lights. He’s brightly lit. The rest of the room is totally dark. He gazes at a black, hockey-puck-shaped object—an Amazon Echo—on a small table in front of him. “Alexa,” he says, “let’s chat.”
“Good morning, my friend,” a female voice replies with synthetic agreeability, a purplish ring of light pulsing atop the Echo. “Or is it afternoon? I’m not quite sure.”
“It’s morning,” the man replies.
“Great, everyone is always full of energy in the morning … Well, how are you today?”
“I’m great, how are you?”
“Well, I’m always busy, as people keep chatting with me … But listen, this is just hilarious, it’s a recent post on Reddit: ‘Toothpaste is just mouth soap.’ ”
Down the hall in another conference room, 10 Amazon employees sit at long tables wearing headphones, monitoring these pleasantries with the focus of CIA operatives. In yet another room, three men sit in booths cordoned off by black curtains. They, too, wear headphones and have cameras trained on them. Finally, in a control center, members of a video crew monitor all the feeds on a large, tiled screen. Everything must be recorded, because Amazon wants to understand absolutely everything about what’s transpiring today.
This extravagantly staged operation, which took place last November, is the final judging session in a months-long competition. Amazon has challenged 15 teams of some of the world’s best computer science graduate students to build “a socialbot that can converse coherently and engagingly with humans on popular topics for 20 minutes.” If any team succeeds, its members will snare academic glory and the promise of brilliant future careers. (Consider that some of the most impressive alums of the Darpa Grand Challenges, an early set of autonomous vehicle competitions, went on to run the self-driving car divisions of Google, Ford, Uber, and General Motors.) They will also walk away with a $1 million purse—which Amazon has called the Alexa Prize.
Amazon, in case you haven’t noticed, has spent the past few years pursuing voice AI with a voraciousness rivaling that of its conquest of retail. The company has more than 5,000 people working on the Alexa platform. And since just 2015, it has reportedly sold more than 20 million Echoes. One day, Amazon believes, AIs will do much more than merely control lights and playlists. They will drive cars, diagnose diseases, and permeate every niche of our lives. Voice will be the predominant interface, and conversation itself—helpful, informative, companionable, entertaining—will be the ultimate product.
But all this early success and ambition has plunged Amazon off a cliff, and into a wide and treacherous valley. Today Alexa, like all voice assistants, often fails to comprehend the blindingly obvious. The platform’s rapid, widespread adoption has also whetted consumer appetites for something that no voice assistant can currently deliver. Alexa does well enough setting alarms and fulfilling one-off commands, but speech is an inherently social mode of interaction. “People are expecting Alexa to talk to them just like a friend,” says Ashwin Ram, who leads Alexa’s AI research team. Taking part in human conversation—with all its infinite variability, abrupt changes in context, and flashes of connection—is widely recognized as one of the hardest problems in AI, and Amazon has charged into it headlong.
The Alexa Prize is hardly the first contest that has tried to squeeze more humanlike rapport out of the world’s chatbots. Every year for the better part of three decades, a smattering of computer scientists and hobbyists has gathered to compete for something called the Loebner Prize, in which contestants try to trick judges into believing a chatbot is human. That prize has inspired its share of controversy over the years—some AI researchers call it a publicity stunt—along with plenty of wistful, poetic ruminations on what divides humans from machines. But the Alexa Prize is different in a couple of ways. First, the point isn’t to fool anyone into believing that Alexa is a person. Second, the scale of the competition—the sheer human, financial, and computational firepower behind it—is massive. For several months of 2017, during an early phase of the contest, anyone in the US who said “Alexa, let’s chat” to their Amazon voice device was allowed to converse with a randomly selected contest bot; they were then invited to rate the conversation they’d had from one to five stars. The bots had millions of rated interactions, making the Alexa Prize competition, by orders of magnitude, the largest chatbot showdown the world has ever seen.
That showdown culminated last November in a room with a blue armchair and a bunch of lights.
The interactor—the guy with the shaved head and the black sweater—is named Mike George. Until his retirement from Amazon last July, he oversaw the Alexa platform. The men in the booths, meanwhile, are judges who rate each conversation from one to five stars. If a judge thinks that a conversation has gone off the rails, he can press a button on a handheld wand; if a second judge does so, the conversation and the session timer are halted. Nobody knows which bot is which. Not the interactors, not the judges.