no code implementations • 12 Aug 2014 • Antoine Deleforge, Radu Horaud, Yoav Schechner, Laurent Girin
Indeed, we demonstrate that the method can be used for audio-visual fusion: it maps speech signals onto images, thereby spatially aligning the audio and visual modalities and making it possible to discriminate between speaking and non-speaking faces.