Search Results for author: Bernd Edler

Found 8 papers, 2 papers with code

SEFGAN: Harvesting the Power of Normalizing Flows and GANs for Efficient High-Quality Speech Enhancement

no code implementations • 4 Dec 2023 • Martin Strauss, Nicola Pia, Nagashree K. S. Rao, Bernd Edler

This paper proposes SEFGAN, a Deep Neural Network (DNN) combining maximum likelihood training and Generative Adversarial Networks (GANs) for efficient speech enhancement (SE).

Audio Generation, Speech Enhancement

Predicting Preferred Dialogue-to-Background Loudness Difference in Dialogue-Separated Audio

no code implementations • 30 May 2023 • Luca Resti, Martin Strauss, Matteo Torcoli, Emanuël Habets, Bernd Edler

When individual audio stems are unavailable from production, Dialogue Separation (DS) can be applied to the final audio mixture to obtain estimates of these stems.

A DNN Based Post-Filter to Enhance the Quality of Coded Speech in MDCT Domain

no code implementations • 28 Jan 2022 • Kishan Gupta, Srikanth Korse, Bernd Edler, Guillaume Fuchs

Frequency domain processing, and in particular the use of Modified Discrete Cosine Transform (MDCT), is the most widespread approach to audio coding.

Decoder

A Hands-on Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation

no code implementations • 16 Jun 2021 • Martin Strauss, Jouni Paulus, Matteo Torcoli, Bernd Edler

The music separation models are selected as they share the number of channels (2) and sampling rate (44.1 kHz or higher) with the considered broadcast content, and vocals separation in music is considered as a parallel for dialog separation in the target application domain.

Music Source Separation, Transfer Learning

A Flow-Based Neural Network for Time Domain Speech Enhancement

no code implementations • 16 Jun 2021 • Martin Strauss, Bernd Edler

Speech enhancement involves the distinction of a target speech signal from an intrusive background.

Density Estimation, Speech Enhancement

CountNet: Estimating the Number of Concurrent Speakers Using Supervised Learning Speaker Count Estimation

1 code implementation • IEEE/ACM Transactions on Audio, Speech, and Language Processing 2018 • Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël Habets

Estimating the maximum number of concurrent speakers from single-channel mixtures is a challenging problem and an essential first step to address various audio-based tasks such as blind source separation, speaker diarization, and audio surveillance.

Blind Source Separation, Speaker Diarization

Classification vs. Regression in Supervised Learning for Single Channel Speaker Count Estimation

1 code implementation • 12 Dec 2017 • Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël A. P. Habets

The task of estimating the maximum number of concurrent speakers from single channel mixtures is important for various audio-based applications, such as blind source separation, speaker diarisation, audio surveillance or auditory scene classification.

Audio and Speech Processing, Sound
