Amazon has filed a patent application with the US Patent and Trademark Office describing a technology that would allow Echo and other Alexa-enabled devices to capture what you say before a wakeword, like “Alexa,” is uttered. Currently, Alexa devices only record and send audio to Amazon servers if a wakeword is detected. Should Amazon decide to develop or implement the technology, an Alexa-enabled device would constantly record and delete what you say using the device’s local memory storage.
The patent application, which was made public today, offers insight into Amazon’s ambitions to expand the capabilities of its voice recognition technology. Alexa devices currently can’t understand commands where the wakeword comes after the command or in the middle of a sentence. But images in the patent application offer “Play some music, Alexa” and “Play some music, Alexa. The Beatles, please” as examples.
“While such phrasings may be natural for a user, current speech processing systems are not configured to handle commands that are not preceded by a wakeword,” wrote the patent application’s authors, Kurt Wesley Piersol and Gabriel Beddingfield. “Offered is a system to correct this problem.”
In a statement, an Amazon spokesperson told BuzzFeed News that, “The technology in this patent is not in use, and referring to the potential use of patents is highly speculative.” The spokesperson added that Amazon files many patent applications that are not ultimately implemented into consumer-facing products, and that patents do not necessarily reflect “current or near-future states of products and services.”
According to the patent application, after a wakeword is detected, Alexa may “look backwards” to determine if the command came before the wakeword, and use pauses in speech to identify the beginning of the command. At that point, the audio would be sent from the device to Amazon servers for full processing. The idea is similar to Apple’s Live Photos technology, which captures 1.5 seconds before and after the camera shutter is pressed.
This process, the patent application states, is designed so that all captured speech doesn’t need to be sent to Amazon, “thus addressing privacy concerns associated with an ‘always-on’ speech processing system.” Additionally, the device may be configured to store only 10 to 30 seconds of audio at a time “to avoid capturing too much speech and causing privacy concerns.”
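The behavior described above (a short rolling buffer kept on the device, plus a backward search for a pause once the wakeword is heard) can be illustrated with a minimal sketch. This is not Amazon's implementation. The names `Frame` and `PreWakewordBuffer`, the `is_speech` flag (standing in for the output of some voice-activity detector), and the buffer and pause sizes are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Frame:
    text: str        # stand-in for a short chunk of audio samples (hypothetical)
    is_speech: bool  # assumed output of a voice-activity detector


class PreWakewordBuffer:
    """Hypothetical rolling buffer that keeps only the most recent frames."""

    def __init__(self, max_frames: int, pause_frames: int):
        # A deque with maxlen silently drops the oldest frame when full,
        # which models "constantly record and delete" on local memory.
        self.frames = deque(maxlen=max_frames)
        # Number of consecutive silent frames treated as a pause (assumption).
        self.pause_frames = pause_frames

    def push(self, frame: Frame) -> None:
        self.frames.append(frame)

    def look_back(self) -> list:
        """Call when the newest frame is the wakeword.

        Walks backwards until it finds a long enough pause, then returns
        the speech frames from the end of that pause up to the wakeword.
        """
        frames = list(self.frames)
        silent_run = 0
        start = 0  # if no pause is found, fall back to the whole buffer
        for i in range(len(frames) - 1, -1, -1):
            if frames[i].is_speech:
                silent_run = 0
            else:
                silent_run += 1
                if silent_run >= self.pause_frames:
                    start = i + silent_run  # first frame after the pause
                    break
        return [f for f in frames[start:] if f.is_speech]


buf = PreWakewordBuffer(max_frames=8, pause_frames=2)
words = ["so", "anyway", None, None, "play", None, "some", "music", "Alexa"]
for w in words:  # None marks a silent frame
    buf.push(Frame(text=w or "", is_speech=w is not None))

command = [f.text for f in buf.look_back()]
print(command)
```

In this sketch only the frames returned by `look_back` would be candidates for sending to a server; everything older than the buffer's capacity has already been overwritten on the device.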
Still, the potential technology means that Alexa devices would be recording more conversations and more audio, including speech they mistake for commands. While Amazon’s speech recognition software is constantly being improved, Echo and third-party Alexa devices still accidentally record plenty of audio snippets triggered by false wakewords. This could expose more customer recordings to the Amazon employees and contractors tasked with reviewing and annotating audio files to improve Alexa’s technology, as well as to others. In May 2018, an Amazon Echo inadvertently shared portions of a woman’s private conversation as a voice message to one of her husband’s employees. It’s also unclear whether customers would be able to turn off pre-wakeword recording.
Privacy concerns aside, the patent application is just a part of Amazon’s attempt to make voice computing more natural, and more human. But in order for Amazon to do so, it appears that we’ll have to let Alexa listen in on more of our conversations.