Amazon Is Testing an AI Chatbot That Generates Original Dialogue in Real Time

The system may be a first for the customer service market

Amazon is testing two new AI chatbots.
Amazon, Getty Images

Amazon is testing what may be the first commercial customer service chatbot to produce wholly original dialogue in real time using recent breakthroughs in AI language generation.

The retail giant said today that it will deploy the generative chatbot as an aid to human agents for the time being but plans to eventually have it deal with customers directly. The company is also rolling out a separate consumer-facing chatbot that uses a neural network to better match human-authored response templates to customer queries.

The project marks one of the first commercial tests of a new, state-of-the-art natural language processing technology that researchers think could supercharge progress in the field. The type of model involved, which also powers cutting-edge systems like OpenAI’s GPT-2, draws on massive training datasets and predictive text to generate realistic-sounding copy or dialogue.

“It is difficult to determine what types of conversational models other customer service systems are running, but we are unaware of any announced deployments of end-to-end, neural-network-based dialogue models like ours,” wrote Jared Kramer, an applied-science manager on Amazon’s Customer Service Tech team, in a blog post.

Despite these advances in machine learning, most chatbots on the market today still run on automation rather than true AI. They use a flow chart of rules to match customer queries with cookie-cutter responses based on keywords.
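To make the contrast concrete, here is a minimal sketch of the keyword-matching approach described above. The rules, canned replies and function name are hypothetical illustrations, not taken from any real product.

```python
import re

# Hypothetical rules: each pairs a set of trigger keywords with a canned reply.
RULES = [
    ({"refund", "return"}, "Your refund is being processed."),
    ({"cancel"}, "Your order can be cancelled from your account page."),
]
FALLBACK = "Sorry, I did not understand. Connecting you to an agent."


def rule_based_reply(message: str) -> str:
    """Return the canned reply of the first rule whose keywords appear in the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    for keywords, reply in RULES:
        if words & keywords:  # any exact keyword match triggers the rule
            return reply
    return FALLBACK
```

Because the match is on exact words, a message such as "My order was cancelled by mistake" falls through to the fallback reply, which illustrates how brittle this kind of automation can be.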

While programs like Microsoft’s DialoGPT have tapped new language-generation models for chatbots before, they’ve yet to see much commercial application, perhaps because of how unpredictable and randomized they can be.

But Amazon is hoping to eventually safeguard its generative chatbot with the response-ranking AI, according to a research paper detailing the methods. In that scenario, a generative AI would produce a list of possible responses to a given customer question, and a separate neural network would choose the most relevant.
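The generate-then-rank setup can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Amazon's method: the candidate list stands in for the output of a generative model, and a simple word-overlap score stands in for the separate ranking neural network. All names are hypothetical.

```python
import re
from typing import Callable, List


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap_score(query: str, response: str) -> float:
    """Toy relevance score (Jaccard word overlap); a stand-in for a learned ranker."""
    q, r = tokenize(query), tokenize(response)
    if not q or not r:
        return 0.0
    return len(q & r) / len(q | r)


def pick_response(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float] = overlap_score,
) -> str:
    """Return the candidate the scorer rates as most relevant to the query."""
    if not candidates:
        raise ValueError("no candidate responses to rank")
    return max(candidates, key=lambda c: score(query, c))


# Assumed output of a generative model for one customer question.
candidates = [
    "Your refund was issued today.",
    "Your order has been cancelled.",
    "Thanks for contacting us.",
]
best = pick_response("Can you cancel my order?", candidates)
```

Separating generation from ranking means an off-topic candidate can be discarded before it reaches the customer, which is the safeguard the paragraph above describes.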

The randomized testing currently covers two types of interactions: requests for the status of a return refund and order cancellations. Kramer said the new AI chatbots have already significantly outperformed the previous automated systems by an internal metric that accounts for successfully completed transactions and whether customers have to follow up within 24 hours.

Each of Amazon’s models is trained on data that spans 5 million conversation-response pairs culled from 350,000 past interactions, according to the research paper.
