Search Results for author: Chang Zeng

Found 10 papers, 6 papers with code

Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms

1 code implementation • Interspeech 2023 • Chang Zeng, Xin Wang, Xiaoxiao Miao, Erica Cooper, Junichi Yamagishi

The ability of countermeasure models to generalize from seen speech synthesis methods to unseen ones has been investigated in the ASVspoof challenge.

Speech Synthesis

Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition

no code implementations • 23 Mar 2023 • Haoyu Tang, Zhaoyi Liu, Chang Zeng, Xinfeng Li

To overcome the drawback of universal Transformer models for the application of ASR on edge devices, we propose a solution that can reuse the block in Transformer models for the occasion of the small footprint ASR system, which meets the objective of accommodating resource limitations without compromising recognition accuracy.

Ranked #11 on Speech Recognition on AISHELL-1

Automatic Speech Recognition (ASR) +1

Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification

1 code implementation • 22 Feb 2023 • Meng Liu, Kong Aik Lee, Longbiao Wang, Hanyi Zhang, Chang Zeng, Jianwu Dang

Visual speech (i.e., lip motion) is highly related to auditory speech due to the co-occurrence and synchronization in speech production.

Text-Independent Speaker Verification

XiaoiceSing 2: A High-Fidelity Singing Voice Synthesizer Based on Generative Adversarial Network

1 code implementation • Interspeech 2023 • Chunhui Wang, Chang Zeng, Xing He

XiaoiceSing is a singing voice synthesis (SVS) system that aims at generating 48 kHz singing voices.

Generative Adversarial Network Singing Voice Synthesis

HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation

1 code implementation • 23 Oct 2022 • Chunhui Wang, Chang Zeng, Jun Chen, Xing He

Entertainment-oriented singing voice synthesis (SVS) requires a vocoder to generate high-fidelity (e.g., 48 kHz) audio.

Generative Adversarial Network Singing Voice Synthesis

Deep Spectro-temporal Artifacts for Detecting Synthesized Speech

no code implementations • 11 Oct 2022 • Xiaohui Liu, Meng Liu, Lin Zhang, Linjuan Zhang, Chang Zeng, Kai Li, Nan Li, Kong Aik Lee, Longbiao Wang, Jianwu Dang

The Audio Deep Synthesis Detection (ADD) Challenge has been held to detect generated human-like speech.

Data Augmentation Domain Adaptation +1

Joint Speaker Encoder and Neural Back-end Model for Fully End-to-End Automatic Speaker Verification with Multiple Enrollment Utterances

no code implementations • 1 Sep 2022 • Chang Zeng, Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi

Conventional automatic speaker verification systems can usually be decomposed into a front-end model such as time delay neural network (TDNN) for extracting speaker embeddings and a back-end model such as statistics-based probabilistic linear discriminant analysis (PLDA) or neural network-based neural PLDA (NPLDA) for similarity scoring.

Data Augmentation Speaker Verification

Spoofing-Aware Attention based ASV Back-end with Multiple Enrollment Utterances and a Sampling Strategy for the SASV Challenge 2022

no code implementations • 1 Sep 2022 • Chang Zeng, Lin Zhang, Meng Liu, Junichi Yamagishi

Current state-of-the-art automatic speaker verification (ASV) systems are vulnerable to presentation attacks, and several countermeasures (CMs), which distinguish bona fide trials from spoofing ones, have been explored to protect ASV.

Speaker Verification

Exploring Deep Learning for Joint Audio-Visual Lip Biometrics

1 code implementation • 17 Apr 2021 • Meng Liu, Longbiao Wang, Kong Aik Lee, Hanyi Zhang, Chang Zeng, Jianwu Dang

Audio-visual (AV) lip biometrics is a promising authentication technique that leverages the benefits of both the audio and visual modalities in speech communication.

Speaker Recognition

Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances

1 code implementation • 4 Apr 2021 • Chang Zeng, Xin Wang, Erica Cooper, Xiaoxiao Miao, Junichi Yamagishi

Probabilistic linear discriminant analysis (PLDA) or cosine similarity have been widely used in traditional speaker verification systems as back-end techniques to measure pairwise similarities.

Ranked #1 on Speaker Verification on CN-CELEB

Speaker Verification
