no code implementations • 13 Dec 2023 • Shrishti Saha Shetu, Soumitro Chakrabarty, Oliver Thiergart, Edwin Mabande
This paper introduces an innovative method for reducing the computational complexity of deep neural networks in real-time speech enhancement on resource-constrained devices.
no code implementations • 1 Feb 2022 • Mohamed Elminshawi, Wolfgang Mack, Srikanth Raj Chetupalli, Soumitro Chakrabarty, Emanuël A. P. Habets
However, such studies have been conducted on a few datasets and have not considered recent deep neural network architectures for SS that have shown impressive separation performance.
no code implementations • 9 Nov 2020 • Shrishti Saha Shetu, Soumitro Chakrabarty, Emanuël A. P. Habets
Audio-visual speech enhancement (AVSE) methods use both audio and visual features for speech enhancement, and the use of visual features has been shown to be particularly effective in multi-speaker scenarios.
no code implementations • 20 Nov 2018 • Soumitro Chakrabarty, Emanuël A. P. Habets
In this work, we propose systematically dilating the convolution filters in each convolution layer of the previously proposed CNN, expanding the receptive field of the filters to reduce the computational cost of the method.
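As a rough illustration of why dilation expands the receptive field, the sketch below computes the receptive field of a stack of stride-1, one-dimensional convolutions. The kernel size and dilation rates are hypothetical values chosen for the example and are not taken from the paper; `receptive_field` is an illustrative helper, not part of any library.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in input samples) of stacked stride-1 1-D convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical configuration: three layers, kernel size 2.
plain = receptive_field(2, [1, 1, 1])    # no dilation
dilated = receptive_field(2, [1, 2, 4])  # dilation doubles per layer

print(plain, dilated)
```

With the same number of layers and weights, the dilated stack covers a wider span of the input, which is the general mechanism by which fewer layers can cover the same input extent.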
1 code implementation • IEEE/ACM Transactions on Audio, Speech, and Language Processing 2018 • Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël Habets
Estimating the maximum number of concurrent speakers from single-channel mixtures is a challenging problem and an essential first step to address various audio-based tasks such as blind source separation, speaker diarization, and audio surveillance.
no code implementations • 31 Jul 2018 • Soumitro Chakrabarty, Emanuël A. P. Habets
Supervised learning based methods for source localization, being data driven, can be adapted to different acoustic conditions via training and have been shown to be robust to adverse acoustic environments.
no code implementations • 12 Dec 2017 • Soumitro Chakrabarty, Emanuël A. P. Habets
The problem of multi-speaker localization is formulated as a multi-class multi-label classification problem, which is solved using a convolutional neural network (CNN) based source localization method.
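A minimal sketch of what a multi-class multi-label formulation can look like at the output stage: each candidate direction is its own class with an independent sigmoid score, so several directions can be active at once. The angle grid, the logits, and the 0.5 threshold are assumptions made for illustration; the abstract does not specify how the network outputs are decoded.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_doas(logits, angles_deg, threshold=0.5):
    """Return every candidate angle whose independent class probability
    reaches the threshold (multi-label: zero, one, or many may be active)."""
    probs = [sigmoid(z) for z in logits]
    return [a for a, p in zip(angles_deg, probs) if p >= threshold]


# Hypothetical 5-class angle grid and network outputs (logits).
angles = [0, 45, 90, 135, 180]
logits = [-3.0, 2.0, -1.0, 4.0, -2.5]

active = decode_doas(logits, angles)
print(active)
```

Unlike a softmax over classes, the per-class sigmoids do not compete, which is what allows more than one speaker direction to be reported for the same input.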
1 code implementation • 12 Dec 2017 • Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël A. P. Habets
The task of estimating the maximum number of concurrent speakers from single channel mixtures is important for various audio-based applications, such as blind source separation, speaker diarisation, audio surveillance or auditory scene classification.
Audio and Speech Processing • Sound
1 code implementation • 2 May 2017 • Soumitro Chakrabarty, Emanuël A. P. Habets
Since only the phase component of the input is used, the CNN can be trained with synthesized noise signals, thereby making the preparation of the training data set easier compared to using speech signals.
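To illustrate why a phase-only input makes the choice of training signal less critical, the sketch below extracts the phase from complex-valued coefficients and shows that it is unchanged when the magnitudes are scaled. It assumes the input is a set of complex time-frequency coefficients arranged as microphones by frequency bins; that layout, the sample values, and the `phase_map` helper are hypothetical and not taken from the paper.

```python
import cmath
import math


def phase_map(frame):
    """Phase (radians) of each complex coefficient.

    `frame` is assumed to be a list of microphones, each a list of
    complex frequency-bin coefficients for one time frame.
    """
    return [[cmath.phase(c) for c in mic] for mic in frame]


# Hypothetical coefficients: 2 microphones x 2 frequency bins.
frame = [[1 + 1j, -2j],
         [3 + 0j, -1 + 0j]]

# Same frame with every magnitude scaled by 5; the phases are unaffected.
scaled = [[5 * c for c in mic] for mic in frame]

phases = phase_map(frame)
scaled_phases = phase_map(scaled)
print(phases)
```

Because the magnitude is discarded, the feature depends on the phase of the signal at each microphone rather than on its spectral envelope.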