Search Results for author: Prashanth Gurunath Shivakumar

Found 15 papers, 3 papers with code

Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks

no code implementations • 5 Jan 2024 • Kevin Everson, Yile Gu, Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-Yi Lee, Ariya Rastrow, Andreas Stolcke

In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text.

In-Context Learning intent-classification +6

Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue

no code implementations • 23 Dec 2023 • Guan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-Yi Lee, Ivan Bulyko

Specifically, our framework serializes tasks in the order of current paralinguistic attribute prediction, response paralinguistic attribute prediction, and response text generation with autoregressive conditioning.

Attribute Language Modelling +4

Discriminative Speech Recognition Rescoring with Pre-trained Language Models

no code implementations • 10 Oct 2023 • Prashanth Gurunath Shivakumar, Jari Kolehmainen, Yile Gu, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko

In this study, we propose and explore several discriminative fine-tuning schemes for pre-trained LMs.

Automatic Speech Recognition (ASR) +1

Personalization for BERT-based Discriminative Speech Recognition Rescoring

no code implementations • 13 Jul 2023 • Jari Kolehmainen, Yile Gu, Aditya Gourav, Prashanth Gurunath Shivakumar, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko

On a test set with personalized named entities, we show that each of these approaches improves word error rate by over 10%, against a neural rescoring baseline.

Decoder speech-recognition +1

Scaling Laws for Discriminative Speech Recognition Rescoring Models

no code implementations • 27 Jun 2023 • Yile Gu, Prashanth Gurunath Shivakumar, Jari Kolehmainen, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko

We study whether this scaling property is also applicable to second-pass rescoring, which is an important component of speech recognition systems.

Speech Recognition

Distillation Strategies for Discriminative Speech Recognition Rescoring

no code implementations • 15 Jun 2023 • Prashanth Gurunath Shivakumar, Jari Kolehmainen, Yile Gu, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko

We also show that the proposed distillation can reduce the WER gap between the student and the teacher by 62% up to 100%.

Language Modelling Re-Ranking +2

Phone Duration Modeling for Speaker Age Estimation in Children

no code implementations • 3 Sep 2021 • Prashanth Gurunath Shivakumar, Somer Bishop, Catherine Lord, Shrikanth Narayanan

In this paper, we propose features specific to children and focus on the speaker's phone duration as an important biomarker of children's age.

Age Estimation regression

End-to-End Neural Systems for Automatic Children Speech Recognition: An Empirical Study

no code implementations • 19 Feb 2021 • Prashanth Gurunath Shivakumar, Shrikanth Narayanan

A key desideratum for inclusive and accessible speech recognition technology is ensuring its robust performance on children's speech.

Speech Recognition

Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords

1 code implementation • 3 Feb 2021 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou, Shrikanth Narayanan

Confusion2vec, motivated by human speech production and perception, is a word vector representation that encodes the ambiguities present in human spoken language in addition to semantic and syntactic information.

Automatic Speech Recognition (ASR) +5

RNN based Incremental Online Spoken Language Understanding

no code implementations • 23 Oct 2019 • Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan

We introduce and analyze different recurrent neural network architectures for incremental and online processing of ASR transcripts and compare them to existing offline systems.

Automatic Speech Recognition (ASR) +8

Behavior Gated Language Models

no code implementations • 31 Aug 2019 • Prashanth Gurunath Shivakumar, Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan

In this work, we draw motivation from psycholinguistics and propose adding behavioral information to the context of language modeling.

Language Modelling

Spoken Language Intent Detection using Confusion2Vec

1 code implementation • 7 Apr 2019 • Prashanth Gurunath Shivakumar, Mu Yang, Panayiotis Georgiou

In this paper, we address spoken language intent detection under the noisy conditions imposed by automatic speech recognition (ASR) systems.

Automatic Speech Recognition (ASR) +3

Confusion2Vec: Towards Enriching Vector Space Word Representations with Representational Ambiguities

no code implementations • 8 Nov 2018 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou

In this paper, we propose a novel word vector representation, Confusion2Vec, which is motivated by human speech production and perception and encodes representational ambiguity.

Automatic Speech Recognition (ASR) +3

Transfer Learning from Adult to Children for Speech Recognition: Evaluation, Analysis and Recommendations

no code implementations • 8 May 2018 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou

Evaluations are presented on (i) comparisons of earlier GMM-HMM and newer DNN models, (ii) the effectiveness of standard adaptation techniques versus transfer learning, and (iii) various adaptation configurations for tackling the variabilities present in children's speech, in terms of (a) acoustic spectral variability and (b) pronunciation variability and linguistic constraints.

Automatic Speech Recognition (ASR) +2

Learning from Past Mistakes: Improving Automatic Speech Recognition Output via Noisy-Clean Phrase Context Modeling

1 code implementation • 7 Feb 2018 • Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, Panayiotis Georgiou

In this work, we model ASR as a phrase-based noisy transformation channel and propose an error correction system that can learn from the aggregate errors of all the independent modules constituting the ASR and attempt to invert them.

Automatic Speech Recognition (ASR) +2
