no code implementations • 28 May 2024 • Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng
There is growing research interest in directly translating speech from one language to another, known as end-to-end speech-to-speech translation.
no code implementations • 10 Apr 2024 • Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng
In this paper, we introduce CoVoMix: Conversational Voice Mixture Generation, a novel model for zero-shot, human-like, multi-speaker, multi-round dialogue speech generation.
no code implementations • 16 Nov 2021 • Midia Yousefi, John H. L. Hansen
A long-standing problem in supervised speech separation is finding the correct label for each separated speech signal, referred to as label permutation ambiguity.
no code implementations • 30 Oct 2021 • Midia Yousefi, John H. L. Hansen
Most current speech technology systems are designed to operate well even in the presence of multiple active speakers.
no code implementations • 30 Oct 2021 • Midia Yousefi, John H. L. Hansen
The speaker conditioning process allows the acoustic model to perform computation in the context of target-speaker auxiliary information.
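Such conditioning is often realized by projecting a target-speaker embedding into an affine (scale-and-shift) modulation of the acoustic model's hidden activations. The sketch below illustrates this idea only; the projection matrices, dimensions, and function names are hypothetical and not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def affine_condition(hidden, spk_emb, W_scale, W_bias):
    """Illustrative affine speaker conditioning (hypothetical sketch).

    hidden:  (frames, hidden_dim) acoustic-model activations
    spk_emb: (emb_dim,) target-speaker embedding (e.g. a d-vector)

    The embedding is projected to a per-dimension scale and bias that
    modulate the hidden activations, so subsequent layers compute in
    the context of the target speaker.
    """
    scale = spk_emb @ W_scale  # (hidden_dim,)
    bias = spk_emb @ W_bias    # (hidden_dim,)
    return hidden * (1.0 + scale) + bias

emb_dim, hidden_dim = 8, 16
W_scale = 0.1 * rng.normal(size=(emb_dim, hidden_dim))
W_bias = 0.1 * rng.normal(size=(emb_dim, hidden_dim))
hidden = rng.normal(size=(10, hidden_dim))
spk_emb = rng.normal(size=(emb_dim,))

out = affine_condition(hidden, spk_emb, W_scale, W_bias)
```

With a zero embedding the transform reduces to the identity, so the conditioning acts as a learned perturbation around the unconditioned acoustic model.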
no code implementations • 4 Aug 2019 • Midia Yousefi, Soheil Khorram, John H. L. Hansen
The recently proposed Permutation Invariant Training (PIT) addresses this problem by determining the output-label assignment that minimizes the separation error.