The Hobgoblin cooktop experience
We’re always thinking about what’s coming next, and investing in building our capabilities to get ahead of the next major technology trends. One of the areas we’ve been most excited about is the evolution of human-digital interfaces. Voice user interfaces (VUIs) have been the biggest development in that space recently, breaking the screen barrier and making interactions between humans and technology more natural and intuitive. We think the next advance will be to give machines more contextual understanding, so they know not just what you’re saying but where you’re pointing, where you’re looking, what’s around you, and maybe even your emotional state. Our Hobgoblin Cooktop demonstration introduces a user interface that enables command of a kitchen cooktop using a combination of voice recognition, gaze detection, and analog burner knobs that can also be controlled digitally.
We chose to demonstrate this novel user interface in a cooktop because it’s a setting that many of us know intimately and where the bar to beat is high. Many of us use our cooktops frequently and they help us complete complex tasks reliably. The kitchen is also such a central part of our lives. It’s where we start the day with a cup of coffee, where we prepare meals for family, and where guests often gather. Kitchens are the center of day-to-day living—so the space and the appliances within should be inviting, aesthetically pleasing, and functional, and any technology involved should be helpful yet unintrusive.
This forms a perfect backdrop for testing our hypothesis that a natural, contextually aware UI can give users more control and more enjoyment of a product without getting in the way.
Use-specific voice control
The first task was to build an appropriate voice user interface (VUI). To reduce architecture cost and complexity, our first version of the cooktop VUI used off-the-shelf microphone and speaker hardware.
As part of an ongoing effort to evaluate off-the-shelf VUI software toolkits for our customers, we decided to use an open-source speech recognition toolkit. Many customer applications require the power of a cloud-based speech and natural language understanding platform (like Oracle Digital Assistant, Microsoft LUIS, or Amazon Alexa), but we are also exploring software development kits (SDKs) that can be completely embedded in a device. Embedded SDKs work well where privacy or bandwidth is precious and there is sufficient on-board computing power to support a simpler set of voice interactions. Hobgoblin’s VUI is based on a simple, constrained set of easy-to-use command phrases that can control each burner, e.g. “Turn the front left burner to medium high.”
An audio front end with noise cancellation and speaker isolation was also integrated for use in noisy environments.
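To illustrate how small such a constrained command set can be, here is a minimal sketch of a phrase matcher that runs on the text a speech recognizer returns. It is not the toolkit's grammar format or our production code. The burner names other than "front left", the level names other than "medium high", and the 0 to 9 power scale are assumptions made for the example.

```python
import re

# Hypothetical vocabulary: the article only confirms "front left" and "medium high".
BURNERS = ("front left", "front right", "rear left", "rear right")
LEVELS = {"off": 0, "low": 2, "medium low": 4, "medium": 5, "medium high": 7, "high": 9}

# Longest level names first so "medium high" is not cut short at "medium".
_LEVEL_ALT = "|".join(sorted(LEVELS, key=len, reverse=True))
_BURNER_ALT = "|".join(BURNERS)
COMMAND_RE = re.compile(
    rf"turn (?:the )?(?P<burner>{_BURNER_ALT}) burner (?:to )?(?P<level>{_LEVEL_ALT})\b"
)


def parse_command(transcript):
    """Return (burner, level) for a recognized phrase, or None if nothing matches."""
    # Lowercase and turn punctuation and hyphens into single spaces.
    text = re.sub(r"[^a-z]+", " ", transcript.lower()).strip()
    match = COMMAND_RE.search(text)
    if match is None:
        return None
    return match.group("burner"), LEVELS[match.group("level")]
```

Anything that does not fit the pattern returns None, which keeps the device from acting on unrelated kitchen conversation.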
Moving beyond the wake word
While a traditional “wake word” can be leveraged to get the cooktop’s attention, saying a wake word can increase the cognitive load of conversing with smart devices. We explored more natural ways to get the cooktop’s attention and we’re excited about the possibilities. One insight we had was that due to the limited number of potential commands, it’s possible for the device to pay attention only when the conversation is relevant. For example, the cooktop could wake up if it hears words like “burner” or “heat” or “medium,” and then determine if the surrounding words constitute a command.
Taking that concept further, we can leverage the gaze detection abilities of the cooktop to determine whether the user is looking at the device and use that to more accurately infer intent.
The result would bring the experience closer in some ways to what you would expect when talking to another human. If you’re talking about something relevant to the cooktop, it could pay attention. If you’re looking at it, it could pay attention.
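The attention logic described above can be sketched as a simple gate that runs before any command parsing. This is an illustrative assumption about how the two signals could be combined, not a description of the shipped behavior. The trigger words come from the example above, with the plural "burners" added for illustration, and the `user_is_looking` flag is assumed to come from the gaze detector.

```python
import re

# Trigger words from the example in the text, plus the plural "burners" (an assumption).
TRIGGER_WORDS = frozenset({"burner", "burners", "heat", "medium"})


def should_attend(transcript, user_is_looking=False):
    """Decide whether the cooktop should try to interpret this utterance.

    The device pays attention if the speech mentions something relevant to
    the cooktop, or if the user is looking at the cooktop.
    """
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    mentions_cooktop = not TRIGGER_WORDS.isdisjoint(words)
    return mentions_cooktop or user_is_looking
```

Only utterances that pass this gate would be handed to the command matcher, which then decides whether the surrounding words form a valid command.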
The power of gaze + voice
State-of-the-art VUI platforms and SDKs are just starting to rely on context to interpret users’ intent in their speech. To date, the main context supported by commercial platforms is temporal: remembering what was said earlier in a conversation (a “multi-turn interaction”) to disambiguate non-specific terms. We have been exploring the use of computer-vision-based sensing as context, and for this cooktop demonstration we augmented the VUI with gaze tracking to create what feels like a magical interaction. The cooktop infers which burner is being addressed in a voice command by using its camera to track which burner you’re looking at. This way, when the system detects a person standing in front of it looking at a burner, commands can omit the burner designation, e.g. “turn that burner on,” or consist of just a level, like “medium high.”
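In code, using gaze as context amounts to filling in the missing slot of a voice command. The sketch below is a hypothetical illustration. The function name is invented for the example, and the rule that an explicitly spoken burner takes priority over gaze is an assumption, not something the demonstration is documented to do.

```python
from typing import Optional


def resolve_burner(spoken_burner: Optional[str], gazed_burner: Optional[str]) -> Optional[str]:
    """Pick the burner a command applies to.

    spoken_burner: burner named in the utterance, or None for "that burner".
    gazed_burner:  burner the user is looking at, or None if unknown.
    Returns None when neither source identifies a burner, in which case the
    device would need to ask for clarification.
    """
    if spoken_burner is not None:
        return spoken_burner  # assumption: explicit speech wins over gaze
    return gazed_burner
```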
Our custom gaze detection algorithm uses a three-dimensional camera and a cascade of three AI computer vision systems. In the first stage, an edge-optimized neural network finds and isolates individual humans and their heads in the camera view. The second stage zooms in to find individual eyes, and the third stage infers a 3D gaze direction from each eye. The gaze information is fused with the 3D head position and the known positions of the burners to establish which burner the subject is looking at.
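The final fusion step can be pictured as intersecting a gaze ray with the cooktop surface and picking the closest burner. The sketch below shows that geometry only. It is not our algorithm, and it assumes a single combined gaze direction rather than one per eye. The coordinate frame (cooktop surface at z = 0, meters), the burner positions, and the 0.15 m tolerance are all assumptions made for illustration.

```python
import math

# Hypothetical burner centers (x, y) in meters on the cooktop plane z = 0.
BURNERS = {
    "front left": (-0.2, 0.15),
    "front right": (0.2, 0.15),
    "rear left": (-0.2, 0.45),
    "rear right": (0.2, 0.45),
}
MAX_MISS_M = 0.15  # assumed tolerance between gaze point and burner center


def gazed_burner(head_pos, gaze_dir):
    """Return the burner the gaze ray lands on, or None.

    head_pos: (x, y, z) of the head, with z the height above the cooktop.
    gaze_dir: (dx, dy, dz) gaze direction; it does not need to be normalized.
    """
    hx, hy, hz = head_pos
    dx, dy, dz = gaze_dir
    if hz <= 0 or dz >= 0:
        return None  # a level or upward gaze never reaches the cooktop plane
    t = -hz / dz  # ray parameter where the gaze meets z = 0
    point = (hx + t * dx, hy + t * dy)
    name = min(BURNERS, key=lambda b: math.dist(point, BURNERS[b]))
    if math.dist(point, BURNERS[name]) > MAX_MISS_M:
        return None  # looking at the cooktop area, but not at a burner
    return name
```

The tolerance is what lets the system distinguish looking at a burner from merely looking toward the kitchen counter.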
Reactive physical control interfaces (RPCIs)
Touch is a fundamental aspect of human nature, which is why physical user interfaces are so intuitive and powerful. We have observed that too often when a traditionally stand-alone product gets redesigned to be connected and IoT-enabled, or voice-enabled, its physical user interface suffers. This is because physical controls, such as the knobs that control a cooktop burner, not only provide control but also reflect the state of the system via their position. But when a secondary means of control is added (like VUI or a phone app), the system state can change out from under the physical UI. Designers have typically worked around this by switching to touchscreens, or physically stateless controls like momentary buttons.
But what if that tradeoff weren’t necessary? In the Hobgoblin Cooktop we have introduced what we call Reactive Physical Control Interfaces (RPCIs). RPCIs are physical interfaces such as switches, knobs, sliders, and buttons that keep all of the convenience of their analog counterparts and react (that is, move themselves) when their function is controlled by digital means.
The physical knobs on the cooktop demonstrate this concept. The burners can be turned up or down with physical knobs, but if controlled with voice, the knobs will self-actuate to reflect the new burner state.
To achieve the feel of a premium cooktop knob while enabling self-actuation, we used high-resolution stepper motors with low detent torque, paired with quadrature encoders. Because the knobs are digitally controlled, the detents no longer have to come from the mechanical design; we can generate them in software. This allowed us to place detents at different rotational positions, adjust the feel (smooth or abrupt), and change the strength, all by simply changing the software. Future applications could take this one step further by changing the detents dynamically to provide extra tactile feedback, or by allowing users to set their own preferred detents.
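One way to picture a software-defined detent is as a restoring torque that pulls the knob toward the nearest detent angle. The sketch below is a simplified model for illustration, not our firmware, which runs on a microcontroller. The tanh torque profile and the parameter names `strength` and `sharpness` are assumptions chosen to show how position, feel, and strength can each become a software setting.

```python
import math


def detent_torque(angle_deg, detents_deg, strength=1.0, sharpness=0.5):
    """Torque command that pulls the knob toward its nearest detent.

    angle_deg:   current knob angle read from the encoder.
    detents_deg: list of detent positions, freely placed in software.
    strength:    peak torque magnitude (arbitrary units).
    sharpness:   small values feel smooth, large values feel abrupt.
    A negative result pushes toward smaller angles, positive toward larger.
    """
    nearest = min(detents_deg, key=lambda d: abs(angle_deg - d))
    offset = angle_deg - nearest
    # tanh saturates at +/-1, so the torque never exceeds `strength`.
    return -strength * math.tanh(sharpness * offset)
```

Changing the list of detent angles moves the clicks, and changing `sharpness` or `strength` alters how they feel, with no mechanical redesign.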
Another consideration at the intersection of hardware and software is the timing accuracy of the motor controller, which ensures the knobs’ self-actuation is smooth and quiet. To guarantee that accuracy, we chose to offload the real-time operations for the knobs to a separate microcontroller. Precise timing and a fast processor matter most when spinning the motors quickly or at high step or microstep resolution, and when several knobs must move at once, for example all four in response to “Alexa, turn all burners off.”
These are just some examples of the challenges and opportunities that come with RPCIs, and we’re excited to explore the concept further.
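A rough calculation shows why the timing budget gets tight. The sketch below computes the time between step pulses for one motor. The defaults of 200 full steps per revolution and 16 microsteps are common stepper values assumed for the example, not the specifications of the motors in the demonstration.

```python
def step_interval_us(rpm, full_steps_per_rev=200, microsteps=16):
    """Microseconds between step pulses for one motor at the given speed."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    pulses_per_second = rpm / 60 * full_steps_per_rev * microsteps
    return 1_000_000 / pulses_per_second
```

Under these assumptions, a knob turning at 60 rpm needs a pulse every 312.5 microseconds, and four knobs moving together need four such pulse streams kept steady at the same time. Jitter at that scale is hard to avoid on a processor that is also running speech and vision workloads, which is the motivation for a dedicated microcontroller.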
A variant of the demonstration integrates with Alexa Voice Service (AVS), turning the Hobgoblin Cooktop into an Alexa-capable device. In this variant we replaced the voice control interface with Amazon’s AVS SDK, but the hardware architecture is unchanged. By tapping into the ecosystem of an existing voice service like Alexa, the user has instant access to all of the evolving functionality that comes with it (timers, weather, music, smart home control, skills, etc.), and from a development perspective that functionality comes with minimal additional effort.
AVS is pioneering the trend of intuitive “human” interfaces, not just in the connected home, but now in industrial, automotive, and many other markets. As a member of Amazon’s AVS Consulting and Professional Services (CPS) partnership program, Synapse has a unique ability to develop Alexa-integrated devices.
We see the huge potential that improved human-digital interfaces have to give us all more intuitive control over the complex technologies around us. The Hobgoblin Cooktop explores use cases in the home, but we see equally exciting opportunities in industrial settings, where quick, intuitive command of functionality can save time, reduce costs, and improve safety.
We’re encouraged by what we’ve learned through this development and plan to continue making the human-digital conversation more natural and intuitive through future projects.