Speech recognition from brain scans

This is a hybrid meeting. Please find the Teams link in the abstract.

Decoding inner monologues from brain signals could be hugely impactful, with the potential to significantly improve the lives of those who currently cannot communicate. In the past five years, there have been several notable breakthroughs in this area, using deep learning to decode brain signals directly into meaningful text through both surgically invasive and non-invasive methods. In this talk, I will discuss these breakthroughs and the current state of the art in brain-to-speech decoding from the perspective of a deep learning researcher. I will also discuss the primary challenges we face in this field and our plans to overcome them. While the implications of successful decoding are vast, the field must also navigate complex ethical considerations. Looking ahead, this work not only has the potential to improve brain-computer interfaces (BCIs) for communication, but also extends into many broader applications.

Dulhan is a DPhil student in the Autonomous Intelligent Machines and Systems (AIMS) CDT. At PNPL, his work focuses on leveraging deep learning to find efficient representations of brain signals for downstream tasks (e.g. phoneme recognition from heard-speech brain data). Prior to joining PNPL, he worked on multi-agent RL and reasoning with graph neural networks at the University of Cambridge. Before this, he completed his BSc in Computer Science at the University of Southampton, where he researched computer vision systems for visual navigation. He has also worked on large language models at Speechmatics and on developing assembly-level machine learning kernels for new hardware at Arm.

Teams link: teams.microsoft.com/l/meetup-join/19%3ameeting_NWM3YWRlM2EtNDBmZC00YzJkLThjYmEtNTNkM2FlYmY4N2M2%40thread.v2/0?context=%7b%22Tid%22%3a%22cc95de1b-97f5-4f93-b4ba-fe68b852cf91%22%2c%22Oid%22%3a%222d6d82c4-6b2c-4f77-b979-7c49923c3b36%22%7d