IoT Interface Design: The Voices of…

Smart Communication

Voice is the next big platform, and it may very well deliver the smart home hub everyone is waiting for. How does this new interface model fit in with the rest of the digital world? And what does the voice market look like today?

by Marcel Weiss

The Amazon Fire Phone was a spectacular flop and, in hindsight, one of the best things that could have happened to Amazon. Instead of perhaps gaining a comfortable third place in mobile, Amazon was relegated to being “just another app” on someone else’s platform. While Apple and Google were hard at work dividing up the gargantuan mobile market between them and crushing former operating systems giant Microsoft, Amazon could look elsewhere. Unencumbered by mobile, Amazon built and released the first version of a new kind of standalone gadget in late 2014. Thus Echo, the leading contender for the title of “next big thing,” was born, and it took Google until 2016 and Apple until 2017 to bring their own smart speakers to market. Though Apple and Google continue to dominate mobile, Amazon seems poised to rule in the home. This is a big deal for the Internet of Things. IoT is often seen as the next big thing after mobile and, in the end consumer market, the biggest missing piece of this puzzle has always been the glue that holds it all together: the essential platform and, with it, the front-end interface.

[Figure: Spansion FM4 voice control system diagram]

Not that easy: microcontroller units (MCUs) play a crucial role within complex voice server systems

Let’s take a step back. Is voice the next big thing? Will voice finally be the harbinger of a truly smart home? Will voice be the natural interface for smart homes? Or, more precisely, will smart speakers turn into the smart home hubs holding it all together? Voice is a more natural way of interacting, or “communicating,” with devices than using a graphical user interface (GUI).

On the other hand, voice is also very much constrained. It is not the best way to do something that consists of several steps, or that requires a lot of data points or options. This lays bare the fact that pure voice as an interface, meaning no screens, is more akin to a terminal than to a GUI within the larger interface family.

We are entering the age of additive interfaces

Which brings us to another trend in interfaces. Think about their future, which is more than “just another aspect” of the future of technology. We have to do away with a rather dominant assumption: that one type of interface replaces another. This is not true. After the big, and ongoing, switch from mouse plus keyboard to multi-touch (plus keyboard), we are entering the age of additive interfaces. Smartphones and multi-touch are not going anywhere; they’re here to stay. But now they’ve got company. Voice is clearly an additive interface: it makes some interactions easier but is no better suited than a GUI for others. And a new additive interface is rising: augmented reality.

What uses are more suited to voice? Broadly speaking, you can split user interactions with smart devices into two segments: the initial setup phase and day-to-day usage. Setup, at the very least, requires authenticating yourself, or the services and networks that connect to the device. Even if everything is kept in the default state, this often proves more complex for users than the second phase, the actual use. A GUI makes more sense for this first phase, and that’s why every voice system is accompanied by an app for setting things up. The second phase involves repeated actions, and this creates space for voice. Think about it in the context of the smart home: turn on your lights by voice, trigger predefined “scenes” by voice (turn lights off and turn TV on, for example), control connected kitchen appliances by voice while cooking, and so on.
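The split between a one-time setup phase and repeated day-to-day use can be illustrated with a small sketch. It is not how any particular voice platform works; the `SmartHome` class, its methods, and the “movie night” phrase are all hypothetical names chosen for illustration, and the spoken phrase is assumed to arrive already transcribed to text.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Device:
    """A hypothetical smart home device with a simple on/off state."""
    name: str
    is_on: bool = False


@dataclass
class SmartHome:
    """Maps short phrases to predefined scenes (illustrative only)."""
    devices: Dict[str, Device] = field(default_factory=dict)
    scenes: Dict[str, Dict[str, bool]] = field(default_factory=dict)

    def add_device(self, name: str) -> None:
        self.devices[name] = Device(name)

    def define_scene(self, phrase: str, states: Dict[str, bool]) -> None:
        # Setup phase: done once, typically in a companion app with a GUI.
        self.scenes[phrase.lower()] = states

    def handle_utterance(self, text: str) -> bool:
        # Day-to-day phase: one short phrase triggers the whole scene.
        states = self.scenes.get(text.strip().lower())
        if states is None:
            return False  # unknown phrase, nothing changes
        for device_name, is_on in states.items():
            self.devices[device_name].is_on = is_on
        return True


# Setup (the GUI-friendly part)
home = SmartHome()
home.add_device("lights")
home.add_device("tv")
home.define_scene("movie night", {"lights": False, "tv": True})
home.devices["lights"].is_on = True

# Daily use (the voice-friendly part)
handled = home.handle_utterance("Movie night")
```

The multi-step configuration lives in `define_scene`, while the repeated interaction collapses into a single phrase, which is the kind of task the article argues voice handles well.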

Voice is only a subset of conversational interfaces

A new catchphrase, conversational commerce, emerged in 2016. Broadly speaking, this is regular e-commerce performed in a “chatty” way using a conversational interface. That interface may be a mobile chat app using text, or voice through a smart speaker. The interesting aspect from a developer’s perspective is that once you have built your bot/AI, it doesn’t matter much whether your customers engage through text or audio. Considering voice as a subset of conversational interfaces immediately increases its usefulness.

Conversational commerce is where Amazon is making money with Alexa. Voice is perfect for simple, repeated actions such as reordering consumables. By connecting Alexa with Prime, Amazon has built a more convenient way to buy consumables and other products, and through its growing Echo speaker family it is the biggest player in standalone voice gadgets.

Voice is still an open field, especially in non-English-speaking markets. While Apple’s Siri supports 21 languages (localized for 36 countries), Microsoft’s Cortana supports only eight, Google Assistant four, and Amazon’s Alexa currently only two (English and German). Amazon Echo has a three-year head start; Google Home, the first to follow, was released two years later in 2016; last summer, Alibaba released its Echo-like speaker Tmall Genie; and, barring any further delays, Apple’s HomePod should be hitting the first homes around the time you’re reading this.
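The developer’s point made above, that a bot does not care much whether the customer types or speaks, can be sketched as follows. This is a minimal illustration under stated assumptions: `reorder_bot`, `handle_chat`, `handle_voice`, and the `transcribe` callable are hypothetical names, and the speech-to-text step is replaced by a stand-in that simply decodes bytes.

```python
from typing import Callable, Dict


def reorder_bot(text: str, last_orders: Dict[str, int]) -> str:
    """Hypothetical bot core: it only ever sees text, never the channel."""
    words = text.lower().replace("?", "").replace(".", "").split()
    if "reorder" in words:
        for item, quantity in last_orders.items():
            if item in words:
                return f"Ordering {quantity} x {item} again."
        return "Which item should I reorder?"
    return "Sorry, I can only reorder items."


def handle_chat(message: str, last_orders: Dict[str, int]) -> str:
    # Text channel: the message goes straight to the bot.
    return reorder_bot(message, last_orders)


def handle_voice(audio: bytes, last_orders: Dict[str, int],
                 transcribe: Callable[[bytes], str]) -> str:
    # Voice channel: speech is converted to text first, then the same bot runs.
    # `transcribe` stands in for whatever speech-to-text service is used.
    return reorder_bot(transcribe(audio), last_orders)


def fake_transcribe(audio: bytes) -> str:
    # Stand-in for a real speech recognizer (assumption for this sketch).
    return audio.decode("utf-8")


orders = {"detergent": 2}
chat_reply = handle_chat("Please reorder detergent.", orders)
voice_reply = handle_voice(b"reorder detergent", orders, fake_transcribe)
```

Both channels end up in the same function, so the bot’s logic is written once and only the thin adapter in front of it differs per channel.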

[Figure: speech system logos]

Mixed results: Samsung’s Bixby is struggling to catch up with Amazon’s Alexa

While there are, by far, more phones with Siri or Assistant onboard, making the potential customer base for those smart assistants bigger, it is the far smaller smart-speaker market that matters. This is especially true for the Internet of Things. A dedicated, communal voice-based device can become a smart home hub, but a smartphone, which is personal, cannot.

Amazon is building its own operating system

Alexa is the best example to study because, first, voice is not bound to screenless speakers and, second, Amazon is a fierce platform company determined not to lose out to any rival consumer-centric platform plays. Amazon has introduced Echo Look, an Echo with a cloud- and AI-connected Style Check camera, and Echo Show, the first Echo with a screen. More importantly, Alexa has been the “secret” star at the influential CES show for two years in a row.

[Figure: CES 2017 show floor]

Showstopper: Alexa has been the secret star of CES for two years in a row

Through 2016 and 2017, numerous companies announced Alexa-enabled gadgets. Amazon is effectively making its own operating system: voice-based and in the cloud. And many of the other big companies are joining in, because no one wants to miss out. Facebook may use Messenger and WhatsApp to build its own voice platform. Samsung is now trying to push Bixby (formerly Viv), from the people behind Siri, but with mixed results: the dedicated Bixby buttons on the Galaxy S8 and Note 8 are universally hated by users. In our previous issue, we showed how even a former giant like Nokia is getting in on the action: “Corporate Comeback with IoT: Reinventing Nokia… Again.”

The application level is even more interesting than the OS/platform level. What do the emerging voice markets mean for current and new brands and manufacturers?

Premium speaker brand Sonos is the first example pointing to where we are heading in the voice space. Sonos One supports Alexa right out of the gate and will add Google Assistant in 2018. Sonos’ implicit modus operandi is: we’ll build the best speakers and will support all major voice platforms that allow integration.

[Figure: IKEA smart home]

Northern lights: Ikea’s Tradfri home lighting system can be voice controlled via Alexa or Apple HomeKit

Homeware company Ikea is doing something similar with its first smart home product family. Its Tradfri smart lighting system supports Alexa and Apple HomeKit, and can connect with the Philips Hue gateway. What Sonos and Ikea are doing is referred to by economists as “multi-homing”: they support several platforms simultaneously and thereby slightly decrease the market power of those platforms.

In the sci-fi TV series Extant, Halle Berry plays a near-future astronaut. Anyone interested in interfaces, the Internet of Things, and where things are heading in the consumer space should take a look, at least at the pilot episode. It provides a well-thought-through picture of where we are heading: a world where voice tech is everywhere and any surface can be an interface, invisibly connected and equally invisibly user-authenticated. FaceID, anyone? That world is approaching fast.
