Many people are remarkably good at focusing their attention on one person or one voice in a multi-speaker scenario, and ‘muting’ other people and background noise. This is known as the cocktail party effect. For other people, separating audio sources is a challenge.
In this presentation I will focus on solving this problem with deep neural networks and TensorFlow. I will share technical and implementation details with the audience, and talk about the gains, pain points, and merits of the solutions as they relate to:
* Preparing, transforming and augmenting relevant data for speech separation and noise removal.
* Creating, training and optimizing various neural network architectures.
* Hardware options for running networks on tiny devices.
* The end goal: real-time speech separation on a small embedded platform.
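As a rough illustration of the data augmentation topic in the list above, one common way to build training material for noise removal is to mix clean speech with noise at a chosen signal-to-noise ratio (SNR). The sketch below is not taken from the talk. The function names `rms` and `mix_at_snr`, the 16 kHz sample rate, and the synthetic sine-wave "speech" are all assumptions made for illustration, and it uses only the Python standard library rather than TensorFlow.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise ratio is `snr_db`, then add it to `speech`.

    Returns (mixture, scaled_noise). Both inputs must have the same length.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise is silent; cannot set an SNR")
    # SNR(dB) = 20 * log10(rms_speech / rms_noise), solved for the target noise RMS.
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20.0))
    gain = target_noise_rms / noise_rms
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled)]
    return mixture, scaled


# Hypothetical one-second example at an assumed 16 kHz sample rate.
sample_rate = 16000
rng = random.Random(0)  # fixed seed so the example is repeatable
# A 220 Hz sine wave stands in for clean speech; Gaussian noise stands in for background noise.
speech = [0.5 * math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]
mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=5.0)
```

In a training setup of this kind, the mixture would typically serve as the network input and the clean speech as the target, with the SNR varied across examples.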
I will present a vision of future smart air pods, smart headsets and smart hearing aids that will be running deep neural networks.
Participants will gain insight into some of the latest advances and limitations in speech separation with deep neural networks on embedded devices with regard to:
* Data transformation and augmentation.
* Deep neural network models for speech separation and for removing noise.
* Training smaller and faster neural networks.
* Creating a real-time speech separation pipeline.
Christian Grant is a deep learning engineer, data scientist, big data architect, and self-driving car engineer. His focus is on combining AI, deep learning, autonomous car technologies, IoT, Android, big data and real-time streaming. He has extensive experience working on projects for Fortune 500 companies, including Demant, Philips Electronics, Deloitte, IBM, Hitachi, Allianz, Bombardier Aerospace, Boeing, and Maersk. He is currently working on speech separation and audio processing solutions with TensorFlow, Keras and TensorFlow Lite on embedded devices.