Search Results for author: Midia Yousefi

Found 6 papers, 0 papers with code

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

no code implementations • 28 May 2024 • Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng

There is a rising interest and trend in research towards directly translating speech from one language to another, known as end-to-end speech-to-speech translation.

CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations

no code implementations • 10 Apr 2024 • Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng

In this paper, we introduce CoVoMix: Conversational Voice Mixture Generation, a novel model for zero-shot, human-like, multi-speaker, multi-round dialogue speech generation.

Dialogue Generation

Single-channel speech separation using Soft-minimum Permutation Invariant Training

no code implementations • 16 Nov 2021 • Midia Yousefi, John H. L. Hansen

A long-standing problem in supervised speech separation is finding the correct label for each separated speech signal, referred to as label permutation ambiguity.

Speech Separation
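The label-assignment issue the abstract describes can be made concrete with a small sketch. The snippet below replaces the hard minimum of standard Permutation Invariant Training with a temperature-controlled soft minimum over per-permutation MSE losses, so every assignment contributes to the objective. This is a minimal illustrative formulation, not necessarily the paper's exact loss; the function name and the `tau` parameter are my own assumptions.

```python
import itertools
import numpy as np

def soft_min_pit_mse(estimates, targets, tau=0.1):
    """Illustrative soft-minimum PIT objective (hypothetical formulation).

    Scores every output-to-target permutation with MSE, then combines
    the per-permutation losses with softmin weights exp(-loss/tau):
    small tau approaches the hard minimum of standard PIT, large tau
    approaches a plain average over assignments.
    Shapes: estimates, targets are (num_speakers, num_samples).
    """
    n = len(targets)
    losses = np.array([
        np.mean([(estimates[p] - targets[i]) ** 2
                 for i, p in enumerate(perm)])
        for perm in itertools.permutations(range(n))
    ])
    weights = np.exp(-losses / tau)
    weights /= weights.sum()
    return float(np.dot(weights, losses))
```

At low temperature this behaves like the hard-min PIT loss while keeping all permutations in the gradient path, which is the usual motivation for a softmin relaxation.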

Real-time Speaker counting in a cocktail party scenario using Attention-guided Convolutional Neural Network

no code implementations • 30 Oct 2021 • Midia Yousefi, John H. L. Hansen

Most current speech technology systems are designed to operate well even in the presence of multiple active speakers.

Probabilistic Permutation Invariant Training for Speech Separation

no code implementations • 4 Aug 2019 • Midia Yousefi, Soheil Khorram, John H. L. Hansen

Recently proposed Permutation Invariant Training (PIT) addresses the label permutation ambiguity problem by determining the output-label assignment that minimizes the separation error.

Speech Separation
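The assignment search that PIT performs can be sketched in a few lines: enumerate every output-to-target permutation, score each with a separation loss, and keep the cheapest. This is a minimal sketch assuming an MSE criterion; the function name is my own, and real systems typically embed a spectral or SI-SNR loss inside a full training loop.

```python
import itertools
import numpy as np

def pit_mse(estimates, targets):
    """Standard PIT objective (minimal sketch): evaluate MSE under
    every output-to-target assignment and keep the minimum.
    Shapes: estimates, targets are (num_speakers, num_samples).
    Returns (best_loss, best_permutation)."""
    n = len(targets)
    best_loss, best_perm = None, None
    for perm in itertools.permutations(range(n)):
        # perm[i] is the estimate index assigned to target i
        loss = np.mean([(estimates[p] - targets[i]) ** 2
                        for i, p in enumerate(perm)])
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = float(loss), perm
    return best_loss, best_perm
```

Because the search is exhaustive, the cost grows factorially with the number of speakers, which is tolerable for the two- or three-speaker mixtures typical in this line of work.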
