no code implementations • 4 Dec 2023 • Martin Strauss, Nicola Pia, Nagashree K. S. Rao, Bernd Edler
This paper proposes SEFGAN, a Deep Neural Network (DNN) combining maximum likelihood training and Generative Adversarial Networks (GANs) for efficient speech enhancement (SE).
no code implementations • 30 May 2023 • Luca Resti, Martin Strauss, Matteo Torcoli, Emanuël Habets, Bernd Edler
When individual audio stems are unavailable from production, Dialogue Separation (DS) can be applied to the final audio mixture to obtain estimates of these stems.
no code implementations • 21 Oct 2022 • Martin Strauss, Matteo Torcoli, Bernd Edler
Deep generative models for Speech Enhancement (SE) have received increasing attention in recent years.
no code implementations • 28 Jan 2022 • Kishan Gupta, Srikanth Korse, Bernd Edler, Guillaume Fuchs
Frequency domain processing, and in particular the use of Modified Discrete Cosine Transform (MDCT), is the most widespread approach to audio coding.
no code implementations • 16 Jun 2021 • Martin Strauss, Jouni Paulus, Matteo Torcoli, Bernd Edler
The music separation models are selected as they share the number of channels (2) and sampling rate (44.1 kHz or higher) with the considered broadcast content, and vocals separation in music is considered as a parallel for dialog separation in the target application domain.
no code implementations • 16 Jun 2021 • Martin Strauss, Bernd Edler
Speech enhancement involves the distinction of a target speech signal from an intrusive background.
1 code implementation • IEEE/ACM Transactions on Audio, Speech, and Language Processing 2018 • Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël Habets
Estimating the maximum number of concurrent speakers from single-channel mixtures is a challenging problem and an essential first step to address various audio-based tasks such as blind source separation, speaker diarization, and audio surveillance.
1 code implementation • 12 Dec 2017 • Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël A. P. Habets
The task of estimating the maximum number of concurrent speakers from single-channel mixtures is important for various audio-based applications, such as blind source separation, speaker diarisation, audio surveillance, or auditory scene classification.
Audio and Speech Processing • Sound