Enabling the In-Car Cocktail Party

Innovation Blog | May 9, 2018

It’s not what you think – we’re not talking about drinking and driving! Creating an automotive speech recognition system that works well under all conditions has always been a very challenging proposition. The car’s acoustic environment has lots of loud, unpredictable sounds that compete with the driver’s voice – like wind noise, road noise, and traffic. You may be able to guess another big noise source from inside the car, especially if you’re a parent – the other occupants. Whether it’s children, companions, or colleagues, it’s not always easy to stop all chatter just so the car has a chance of recognizing what you say to it. Recognizing one speaker in a crowd is the so-called “cocktail party problem,” and it’s been a very difficult one for computers to solve, especially within the car.

This is why we’ve developed technology that can distinguish and separate multiple simultaneous voices, making the vehicle’s speech recognition more robust, useful, and accurate. Allowing the car to recognize hands-free commands without requiring other conversations to stop provides more natural speech interaction and a better user experience for everyone in the car. And since this solution only needs a single microphone, it doesn’t add hardware cost to the car. Great – but according to the US Census Bureau’s latest data, over 76 percent of commuters drive alone and therefore don’t need to worry about competing voices.

Yes, but that’s today. In a few short years, perhaps even within one or two traditional automotive design cycles, mobility will be drastically transformed. Mobility as a Service (MaaS) is taking off and promises to let people dynamically choose their mode of transport, with on-demand vehicle subscriptions and multiple car- or ride-sharing models. Far from removing the need to talk to your car, we believe self-driving technology will create even more opportunities to talk to, direct, and control your car. The future is multi-passenger – and conversational.

Digital assistants like Amazon Alexa and Google Assistant are the hottest thing since sliced bread. As people gain exposure to digital assistants and grow more comfortable with their reliability, they will increasingly rely on them to perform tasks in the car – tasks that don’t rely on their eyes or hands. That’s why we don’t just need digital assistants in our cars; we need ones that can listen to each one of us, even when we’re all talking.

Jacek Spiewla, Sr. Manager, User Experience

Jacek holds a Master’s in Human-Computer Interaction from the University of Michigan, and has a deep background in speech/audio processing technology, as well as voice user interface design. He is responsible for strategic planning activities and coordinating UX projects.