Search Results for author: Davide Berghi

Found 5 papers, 2 papers with code

Leveraging Visual Supervision for Array-based Active Speaker Detection and Localization

1 code implementation • 21 Dec 2023 • Davide Berghi, Philip J. B. Jackson

The multichannel audio "student" network is trained to generate the same results.

Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection

1 code implementation • 14 Dec 2023 • Davide Berghi, Peipei Wu, Jinzheng Zhao, Wenwu Wang, Philip J. B. Jackson

Sound event localization and detection (SELD) combines two subtasks: sound event detection (SED) and direction of arrival (DOA) estimation.

Data Augmentation • Event Detection • +2

Audio Inputs for Active Speaker Detection and Localization via Microphone Array

no code implementations • 27 Jul 2023 • Davide Berghi, Philip J. B. Jackson

This study considers the problem of detecting and locating an active talker's horizontal position from multichannel audio captured by a microphone array.

Tragic Talkers: A Shakespearean Sound- and Light-Field Dataset for Audio-Visual Machine Learning Research

no code implementations • 4 Dec 2022 • Davide Berghi, Marco Volino, Philip J. B. Jackson

This is partly due to the lack of available datasets enabling audio-visual research in this direction.

Visually Supervised Speaker Detection and Localization via Microphone Array

no code implementations • 7 Mar 2022 • Davide Berghi, Adrian Hilton, Philip J. B. Jackson

We propose to generate weak labels using a pre-trained active speaker detector on pre-extracted face tracks.
