no code implementations • 30 Jun 2023 • Raghuveer Peri, Seyed Omid Sadjadi, Daniel Garcia-Romero
Prior studies on this problem are sparse, and lack a common benchmark for systematic evaluations.
no code implementations • 21 Apr 2022 • Seyed Omid Sadjadi, Craig Greenberg, Elliot Singer, Lisa Mason, Douglas Reynolds
Evaluation results indicate: audio-visual fusion produced substantial gains in performance over audio-only or visual-only systems; top-performing speaker and face recognition systems exhibited comparable performance under the matched domain conditions present in this evaluation; and the use of complex neural network architectures (e.g., ResNet) along with angular losses with margin, data augmentation, and long-duration fine-tuning contributed to notable performance improvements for the audio-only speaker recognition task.
no code implementations • 21 Apr 2022 • Seyed Omid Sadjadi, Craig Greenberg, Elliot Singer, Lisa Mason, Douglas Reynolds
The US National Institute of Standards and Technology (NIST) has been conducting a second iteration of the CTS challenge since August 2020.
no code implementations • 16 Feb 2022 • Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha, Ram D. Sriram
Experimental results indicate that both audio and text-based models improve the emotion recognition performance and that the proposed multimodal solution achieves state-of-the-art results on the IEMOCAP benchmark.
no code implementations • 16 Aug 2021 • Seyed Omid Sadjadi
This document provides a brief description of the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) conversational telephone speech (CTS) Superset.
no code implementations • 5 Aug 2021 • Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha, Ram D. Sriram
Automatic speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
no code implementations • 5 May 2016 • Seyed Omid Sadjadi, Jason Pelecanos, Sriram Ganapathy
We present the recent advances along with an error analysis of the IBM speaker recognition system for conversational speech.
no code implementations • 23 Feb 2016 • Seyed Omid Sadjadi, Sriram Ganapathy, Jason W. Pelecanos
In this paper we describe the recent advancements made in the IBM i-vector speaker recognition system for conversational speech.