Search Results for author: Bernd Edler

Found 8 papers, 2 papers with code

SEFGAN: Harvesting the Power of Normalizing Flows and GANs for Efficient High-Quality Speech Enhancement

no code implementations • 4 Dec 2023 • Martin Strauss, Nicola Pia, Nagashree K. S. Rao, Bernd Edler

This paper proposes SEFGAN, a Deep Neural Network (DNN) combining maximum likelihood training and Generative Adversarial Networks (GANs) for efficient speech enhancement (SE).

Audio Generation, Speech Enhancement

Predicting Preferred Dialogue-to-Background Loudness Difference in Dialogue-Separated Audio

no code implementations • 30 May 2023 • Luca Resti, Martin Strauss, Matteo Torcoli, Emanuël Habets, Bernd Edler

When individual audio stems are unavailable from production, Dialogue Separation (DS) can be applied to the final audio mixture to obtain estimates of these stems.

Improved Normalizing Flow-Based Speech Enhancement using an All-pole Gammatone Filterbank for Conditional Input Representation

no code implementations • 21 Oct 2022 • Martin Strauss, Matteo Torcoli, Bernd Edler

Deep generative models for Speech Enhancement (SE) have received increasing attention in recent years.

Speech Enhancement

A DNN Based Post-Filter to Enhance the Quality of Coded Speech in MDCT Domain

no code implementations • 28 Jan 2022 • Kishan Gupta, Srikanth Korse, Bernd Edler, Guillaume Fuchs

Frequency domain processing, and in particular the use of Modified Discrete Cosine Transform (MDCT), is the most widespread approach to audio coding.

Decoder

A Hands-on Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation

no code implementations • 16 Jun 2021 • Martin Strauss, Jouni Paulus, Matteo Torcoli, Bernd Edler

The music separation models are selected as they share the number of channels (2) and sampling rate (44.1 kHz or higher) with the considered broadcast content, and vocals separation in music is considered as a parallel for dialog separation in the target application domain.

Music Source Separation, Transfer Learning

A Flow-Based Neural Network for Time Domain Speech Enhancement

no code implementations • 16 Jun 2021 • Martin Strauss, Bernd Edler

Speech enhancement involves the distinction of a target speech signal from an intrusive background.

Density Estimation, Speech Enhancement

CountNet: Estimating the Number of Concurrent Speakers Using Supervised Learning

1 code implementation • IEEE/ACM Transactions on Audio, Speech, and Language Processing 2018 • Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël Habets

Estimating the maximum number of concurrent speakers from single-channel mixtures is a challenging problem and an essential first step to address various audio-based tasks such as blind source separation, speaker diarization, and audio surveillance.

Blind Source Separation, Speaker Diarization

Classification vs. Regression in Supervised Learning for Single Channel Speaker Count Estimation

1 code implementation • 12 Dec 2017 • Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël A. P. Habets

The task of estimating the maximum number of concurrent speakers from single channel mixtures is important for various audio-based applications, such as blind source separation, speaker diarisation, audio surveillance or auditory scene classification.

Audio and Speech Processing, Sound
