On 28th November OxTalks will move to the new Halo platform and will become 'Oxford Events' (full details are available on the Staff Gateway).
There will be an OxTalks freeze beginning on Friday 14th November. This means you will need to publish any of your known events to OxTalks by then as there will be no facility to publish or edit events in that fortnight. During the freeze, all events will be migrated to the new Oxford Events site. It will still be possible to view events on OxTalks during this time.
If you have any questions, please contact halo@digital.ox.ac.uk
In complex room settings, machine listening systems may experience a degradation in performance due to factors like room reverberations, background noise, and unwanted sounds. Concurrently, machine vision systems can suffer from issues like visual occlusions, insufficient lighting, and background clutter. Combining audio and visual data has the potential to overcome these limitations and enhance machine perception in complex audio-visual environments. In this talk, we will first discuss the machine cocktail party problem, and the development of speech source separation algorithms for extracting individual speech sources from sound mixtures. We will then discuss selected works related to audio-visual speech separation. This encompasses the fusion of audio-visual data for speech source separation, employing techniques such as Gaussian mixture models, dictionary learning, and deep learning.