Search Results for author: Soumitro Chakrabarty

Found 9 papers, 3 papers with code

Ultra Low Complexity Deep Learning Based Noise Suppression

no code implementations • 13 Dec 2023 • Shrishti Saha Shetu, Soumitro Chakrabarty, Oliver Thiergart, Edwin Mabande

This paper introduces an innovative method for reducing the computational complexity of deep neural networks in real-time speech enhancement on resource-constrained devices.

Speech Enhancement

New Insights on Target Speaker Extraction

no code implementations • 1 Feb 2022 • Mohamed Elminshawi, Wolfgang Mack, Srikanth Raj Chetupalli, Soumitro Chakrabarty, Emanuël A. P. Habets

However, such studies have been conducted on a few datasets and have not considered recent deep neural network architectures for SS that have shown impressive separation performance.

Speaker Separation, Target Speaker Extraction

An Empirical Study of Visual Features for DNN based Audio-Visual Speech Enhancement in Multi-talker Environments

no code implementations • 9 Nov 2020 • Shrishti Saha Shetu, Soumitro Chakrabarty, Emanuël A. P. Habets

Audio-visual speech enhancement (AVSE) methods use both audio and visual features for the task of speech enhancement and the use of visual features has been shown to be particularly effective in multi-speaker scenarios.

Optical Flow Estimation, Speech Enhancement

Multi-scale aggregation of phase information for reducing computational cost of CNN based DOA estimation

no code implementations • 20 Nov 2018 • Soumitro Chakrabarty, Emanuël A. P. Habets

In this work, we propose to use systematic dilations of the convolution filters in each of the convolution layers of the previously proposed CNN for expansion of the receptive field of the filters to reduce the computational cost of the method.
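
As a rough illustration of why dilation enlarges the receptive field without adding layers, the sketch below computes the receptive field of a stack of stride-1 convolutions. The kernel size and the dilation schedule are hypothetical values chosen for the example, not the configuration used in the paper.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in input samples) of stacked stride-1 convolutions.

    Each layer with dilation d widens the receptive field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Hypothetical settings: three layers, kernel size 2.
undilated = receptive_field(2, [1, 1, 1])  # no dilation
dilated = receptive_field(2, [1, 2, 4])    # doubling dilation per layer
print(undilated, dilated)
```

With the same number of layers and filter weights, the dilated stack covers a wider input span, which suggests how fewer layers could reach the same coverage.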

CountNet: Estimating the Number of Concurrent Speakers Using Supervised Learning

1 code implementation • IEEE/ACM Transactions on Audio, Speech, and Language Processing 2018 • Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël Habets

Estimating the maximum number of concurrent speakers from single-channel mixtures is a challenging problem and an essential first step to address various audio-based tasks such as blind source separation, speaker diarization, and audio surveillance.

Blind Source Separation, Speaker Diarization

Multi-Speaker DOA Estimation Using Deep Convolutional Networks Trained with Noise Signals

no code implementations • 31 Jul 2018 • Soumitro Chakrabarty, Emanuël A. P. Habets

Supervised learning based methods for source localization, being data driven, can be adapted to different acoustic conditions via training and have been shown to be robust to adverse acoustic environments.

Binary Classification, General Classification

Multi-Speaker Localization Using Convolutional Neural Network Trained with Noise

no code implementations • 12 Dec 2017 • Soumitro Chakrabarty, Emanuël A. P. Habets

The problem of multi-speaker localization is formulated as a multi-class multi-label classification problem, which is solved using a convolutional neural network (CNN) based source localization method.
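
A minimal sketch of what a multi-class multi-label target could look like for this formulation: each direction-of-arrival (DOA) class is one output unit, and every active speaker sets its class to 1. The 5-degree resolution, the 0 to 180 degree range, and the function name are assumptions for illustration and are not taken from the paper.

```python
def doa_multi_hot(doas_deg, resolution_deg=5, max_deg=180):
    """Encode a set of speaker DOAs as a multi-hot target vector.

    Assumed (hypothetical) discretization: classes at 0, 5, ..., 180 degrees.
    """
    n_classes = max_deg // resolution_deg + 1
    target = [0] * n_classes
    for doa in doas_deg:
        if not 0 <= doa <= max_deg:
            raise ValueError(f"DOA {doa} outside [0, {max_deg}] degrees")
        # Snap each DOA to the nearest class on the grid.
        target[round(doa / resolution_deg)] = 1
    return target


# Two simultaneous speakers at 40 and 120 degrees activate two classes.
target = doa_multi_hot([40, 120])
print(len(target), [i for i, v in enumerate(target) if v])
```

Because several entries can be 1 at once, such a target would typically be paired with an independent sigmoid output per class rather than a single softmax.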

General Classification, Multi-Label Classification

Classification vs. Regression in Supervised Learning for Single Channel Speaker Count Estimation

1 code implementation • 12 Dec 2017 • Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël A. P. Habets

The task of estimating the maximum number of concurrent speakers from single channel mixtures is important for various audio-based applications, such as blind source separation, speaker diarisation, audio surveillance or auditory scene classification.

Audio and Speech Processing, Sound

Broadband DOA estimation using Convolutional neural networks trained with noise signals

1 code implementation • 2 May 2017 • Soumitro Chakrabarty, Emanuël A. P. Habets

Since only the phase component of the input is used, the CNN can be trained with synthesized noise signals, thereby making the preparation of the training data set easier compared to using speech signals.

General Classification
