no code implementations • COLING 2022 • Jie Chi, Peter Bell
This paper seeks to improve the performance of automatic speech recognition (ASR) systems operating on code-switched speech.
no code implementations • 30 May 2024 • Xiaoliang Wu, Chau Luu, Peter Bell, Ajitha Rajan
This paper proposes a fully explainable approach to speaker verification (SV), a task that fundamentally relies on individual speaker characteristics.
no code implementations • 26 May 2024 • Yuanchao Li, Pinzhen Chen, Peter Bell, Catherine Lai
ASR remains unsatisfactory in scenarios where the speaking style diverges from that used to train ASR systems, resulting in erroneous transcripts.
no code implementations • 22 Apr 2024 • Dongge Han, Trevor McInroe, Adam Jelley, Stefano V. Albrecht, Peter Bell, Amos Storkey
We introduce LLM-Personalize, a novel framework with an optimization pipeline designed to personalize LLM planners for household robotics.
no code implementations • 7 Jul 2023 • Sarenne Wallbridge, Peter Bell, Catherine Lai
Speech is a fundamental means of communication that can be seen to provide two channels for transmitting information: the lexical channel of which words are said, and the non-lexical channel of how they are spoken.
no code implementations • 29 May 2023 • Xiaoliang Wu, Peter Bell, Ajitha Rajan
Explainable AI (XAI) techniques have been widely used to help explain and understand the output of deep learning models in fields such as image classification and Natural Language Processing.
no code implementations • 25 May 2023 • Yuanchao Li, Peter Bell, Catherine Lai
In this work, we investigate the relationship between two affective attributes: personality and emotion, from a transfer learning perspective.
no code implementations • 25 May 2023 • Yuanchao Li, Zeyu Zhao, Ondrej Klejch, Peter Bell, Catherine Lai
To overcome this challenge, we investigate how Automatic Speech Recognition (ASR) performs on emotional speech by analyzing the ASR performance on emotion corpora and examining the distribution of word errors and confidence scores in ASR transcripts to gain insight into how emotion affects ASR.
no code implementations • 23 May 2023 • Yaoting Wang, Yuanchao Li, Paul Pu Liang, Louis-Philippe Morency, Peter Bell, Catherine Lai
Fusing multiple modalities has proven effective for multimodal information processing.
no code implementations • 31 Mar 2023 • Ramon Sanabria, Nikolay Bogoychev, Nina Markl, Andrea Carmantini, Ondrej Klejch, Peter Bell
Despite the many advances in English automatic speech recognition (ASR) over the past decades, results are usually reported on test datasets that fail to represent the diversity of English as spoken today around the globe.
no code implementations • 27 Feb 2023 • Xiaoliang Wu, Peter Bell, Ajitha Rajan
We address quality assessment for neural network based ASR by providing explanations that help increase our understanding of the system and ultimately help build trust in the system.
no code implementations • 24 Jan 2023 • Mathias Zinnen, Prathmesh Madhu, Ronak Kosti, Peter Bell, Andreas Maier, Vincent Christlein
The Odeuropa Challenge on Olfactory Object Recognition aims to foster the development of object detection in the visual arts and to promote an olfactory perspective on digital heritage.
no code implementations • 24 Jan 2023 • Mathias Zinnen, Prathmesh Madhu, Peter Bell, Andreas Maier, Vincent Christlein
We investigate the effect of style and category similarity in multiple datasets used for object detection pretraining.
no code implementations • 29 Nov 2022 • Christoph Minixhofer, Ondřej Klejch, Peter Bell
While modern Text-to-Speech (TTS) systems can produce natural-sounding speech, they remain unable to reproduce the full diversity found in natural speech data.
no code implementations • 5 Oct 2022 • Yuanchao Li, Yumnah Mohamied, Peter Bell, Catherine Lai
Self-supervised speech models have grown fast during the past few years and have proven feasible for use in various downstream tasks.
no code implementations • 22 Jun 2022 • Prathmesh Madhu, Tilman Marquart, Ronak Kosti, Dirk Suckow, Peter Bell, Andreas Maier, Vincent Christlein
In this work, we present a novel approach called Image Composition Canvas (ICC++) to compare and retrieve images having similar compositional elements.
no code implementations • 15 Dec 2021 • Christoph Minixhofer, Ondřej Klejch, Peter Bell
In this work, we unify several existing decoding strategies for punctuation prediction in one framework and introduce a novel strategy which utilises multiple predictions at each word across different windows.
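The strategy described above combines multiple overlapping predictions for each word. As a rough illustration of that idea (not the paper's actual implementation), one can average the per-word punctuation probabilities produced by each window a word appears in, then take the highest-scoring label; the label names and probabilities below are invented:

```python
from collections import defaultdict

def combine_window_predictions(window_preds):
    """Average punctuation probabilities for each word position across
    overlapping prediction windows, then pick the argmax label.
    A simplified sketch of using multiple predictions per word."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for window in window_preds:            # each window: {position: {label: prob}}
        for pos, probs in window.items():
            counts[pos] += 1
            for label, p in probs.items():
                sums[pos][label] += p
    # divide by the number of windows covering each position, then argmax
    return {pos: max(labels, key=lambda l: labels[l] / counts[pos])
            for pos, labels in sums.items()}

# two overlapping windows; "" means no punctuation after the word
w1 = {0: {"": 0.9, ",": 0.1}, 1: {"": 0.4, ".": 0.6}}
w2 = {1: {"": 0.7, ".": 0.3}, 2: {"": 0.2, ".": 0.8}}
print(combine_window_predictions([w1, w2]))
```

Word 1 is predicted as a sentence end by the first window alone, but averaging over both windows overturns that decision, which is the kind of disagreement a multi-window strategy can exploit.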
no code implementations • 12 Nov 2021 • Ondrej Klejch, Electra Wallington, Peter Bell
We present a method for cross-lingual training of an ASR system using absolutely no transcribed training data from the target language, and with no phonetic knowledge of the language in question.
no code implementations • 29 Oct 2021 • Yuanchao Li, Peter Bell, Catherine Lai
However, due to the scarcity of emotion labelled data and the difficulty of recognizing emotional speech, it is hard to obtain reliable linguistic features and models in this research area.
no code implementations • 1 May 2021 • Sarenne Wallbridge, Peter Bell, Catherine Lai
People convey information extremely effectively through spoken interaction using multiple channels of information transmission: the lexical channel of what is said, and the non-lexical channel of how it is said.
no code implementations • EACL 2021 • David Wan, Chris Kedzie, Faisal Ladhak, Elsbeth Turcan, Petra Galuščáková, Elena Zotkina, Zhengping Jiang, Peter Bell, Kathleen McKeown
Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for Spoken Language Translation.
no code implementations • 9 Feb 2021 • Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Erfan Loweimi, Peter Bell, Steve Renals
Although the lower layers of a deep neural network learn features which are transferable across datasets, these layers are not transferable within the same dataset.
1 code implementation • 10 Dec 2020 • Prathmesh Madhu, Angel Villar-Corrales, Ronak Kosti, Torsten Bendschus, Corinna Reinhardt, Peter Bell, Andreas Maier, Vincent Christlein
(2) To improve the already strong results further, we created a small dataset (ClassArch) consisting of ancient Greek vase paintings from the 6th–5th century BCE with person and pose annotations.
1 code implementation • 8 Nov 2020 • Shucong Zhang, Erfan Loweimi, Peter Bell, Steve Renals
To the best of our knowledge, we have achieved state-of-the-art end-to-end Transformer based model performance on Switchboard and AMI.
no code implementations • 8 Nov 2020 • Shucong Zhang, Erfan Loweimi, Peter Bell, Steve Renals
Self-attention models such as Transformers, which can capture temporal relationships without being limited by the distance between events, have given competitive speech recognition results.
1 code implementation • 27 Oct 2020 • Chau Luu, Peter Bell, Steve Renals
On a test set of US Supreme Court recordings, we show that by leveraging two additional forms of speaker attribute information derived respectively from the matched training data, and VoxCeleb corpus, we improve the performance of our deep speaker embeddings for both verification and diarization tasks, achieving a relative improvement of 26.2% in DER and 6.7% in EER compared to baselines using speaker labels only.
no code implementations • 19 Oct 2020 • David Wan, Zhengping Jiang, Chris Kedzie, Elsbeth Turcan, Peter Bell, Kathleen McKeown
In this work, we focus on improving ASR output segmentation in the context of low-resource language speech-to-text translation.
1 code implementation • 8 Sep 2020 • Prathmesh Madhu, Tilman Marquart, Ronak Kosti, Peter Bell, Andreas Maier, Vincent Christlein
These compositions are useful in analyzing the interactions in an image to study artists and their artworks.
1 code implementation • 14 Aug 2020 • Peter Bell, Joachim Fainberg, Ondrej Klejch, Jinyu Li, Steve Renals, Pawel Swietojanski
We present a structured overview of adaptation algorithms for neural network-based speech recognition, considering both hybrid hidden Markov model / neural network systems and end-to-end neural network systems, with a focus on speaker adaptation, domain adaptation, and accent adaptation.
no code implementations • 28 May 2020 • Shucong Zhang, Erfan Loweimi, Peter Bell, Steve Renals
Recently, self-attention models such as Transformers have given competitive results compared to recurrent neural network systems in speech recognition.
no code implementations • LREC 2020 • David Wan, Zhengping Jiang, Chris Kedzie, Elsbeth Turcan, Peter Bell, Kathleen McKeown
In this work, we focus on improving ASR output segmentation in the context of low-resource language speech-to-text translation.
1 code implementation • 31 Mar 2020 • Prathmesh Madhu, Ronak Kosti, Lara Mührenberg, Peter Bell, Andreas Maier, Vincent Christlein
We present experiments and analysis on three different models and show that the model trained on domain-related data gives the best performance for character recognition.
1 code implementation • 2 Feb 2020 • Chau Luu, Peter Bell, Steve Renals
The first proposed method, DropClass, works via periodically dropping a random subset of classes from the training data and the output layer throughout training, resulting in a feature extractor trained on many different classification tasks.
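The DropClass idea above, periodically restricting training to a random subset of classes, can be sketched as follows. This is a minimal toy illustration of one such period, not the paper's code; the example data and the `drop_fraction` parameter are invented:

```python
import random

def dropclass_round(examples, labels, num_classes, drop_fraction, rng):
    """One DropClass period: drop a random subset of classes, keep only
    the examples of the remaining classes, and remap their labels so the
    output layer shrinks accordingly. A simplified sketch of the idea."""
    n_drop = int(num_classes * drop_fraction)
    dropped = set(rng.sample(range(num_classes), n_drop))
    kept_classes = [c for c in range(num_classes) if c not in dropped]
    remap = {c: i for i, c in enumerate(kept_classes)}  # new output indices
    kept = [(x, remap[y]) for x, y in zip(examples, labels) if y not in dropped]
    return kept, len(kept_classes)

rng = random.Random(0)
xs = list(range(10))
ys = [i % 5 for i in xs]                 # 5 toy classes, 2 examples each
subset, n_out = dropclass_round(xs, ys, num_classes=5, drop_fraction=0.4, rng=rng)
print(n_out, len(subset))
```

Repeating this each period means the feature extractor is trained on many different classification tasks, which is the effect the method relies on.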
no code implementations • 31 Oct 2019 • Joanna Rownicka, Peter Bell, Steve Renals
We propose a multi-scale octave convolution layer to learn robust speech representations efficiently.
no code implementations • 25 Oct 2019 • Chau Luu, Peter Bell, Steve Renals
Previous work has encouraged domain-invariance in deep speaker embedding by adversarially classifying the dataset or labelled environment to which the generated features belong.
1 code implementation • 23 Oct 2019 • Ondřej Klejch, Joachim Fainberg, Peter Bell, Steve Renals
Speaker adaptive training (SAT) of neural network acoustic models learns models in a way that makes them more suitable for adaptation to test conditions.
no code implementations • 30 Sep 2019 • Joanna Rownicka, Peter Bell, Steve Renals
In this work, we investigate the use of embeddings for speaker-adaptive training of DNNs (DNN-SAT) focusing on a small amount of adaptation data per speaker.
1 code implementation • 30 Sep 2019 • Joachim Fainberg, Ondřej Klejch, Erfan Loweimi, Peter Bell, Steve Renals
Raw waveform acoustic modelling has recently gained interest due to neural networks' ability to learn feature extraction, and the potential for finding better representations for a given scenario than hand-crafted features.
no code implementations • 25 Sep 2019 • Shucong Zhang, Cong-Thanh Do, Rama Doddipatla, Erfan Loweimi, Peter Bell, Steve Renals
Interpreting the top layers as a classifier and the lower layers as a feature extractor, one can hypothesize that unwanted network convergence may occur when the classifier has overfit with respect to the feature extractor.
no code implementations • 27 Jun 2019 • Ondrej Klejch, Joachim Fainberg, Peter Bell, Steve Renals
Acoustic model adaptation to unseen test recordings aims to reduce the mismatch between training and testing conditions.
no code implementations • 30 May 2019 • Joachim Fainberg, Ondřej Klejch, Steve Renals, Peter Bell
This text data can be used for lightly supervised training, in which text matching the audio is selected using an existing speech recognition model.
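The selection step described above, keeping only text that matches the audio according to an existing recognizer, can be illustrated roughly as follows. This is a hedged sketch, not the authors' pipeline: a real system would align hypotheses to candidates and use a proper error-rate measure, whereas here agreement is a simple per-word match rate, and all utterance ids and texts are invented:

```python
def select_lightly_supervised(triples, max_error=0.2):
    """Keep (audio_id, candidate_text) pairs whose candidate text agrees
    closely with an existing recognizer's hypothesis for that audio."""
    selected = []
    for audio_id, candidate, hypothesis in triples:
        c, h = candidate.split(), hypothesis.split()
        # crude disagreement: positional word mismatches plus length difference
        mismatches = sum(a != b for a, b in zip(c, h)) + abs(len(c) - len(h))
        if mismatches / max(len(c), 1) <= max_error:
            selected.append((audio_id, candidate))
    return selected

data = [
    ("utt1", "hello world today", "hello world today"),       # exact match: kept
    ("utt2", "good morning everyone", "good evening nobody"), # 2/3 differ: dropped
]
print(select_lightly_supervised(data))
```

Only well-matching segments survive, so the selected text can serve as approximate transcripts for training.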
no code implementations • 12 Nov 2018 • Joanna Rownicka, Peter Bell, Steve Renals
We analyze the representations learned by deep CNNs and compare them with deep neural network (DNN) representations and i-vectors, in the context of acoustic model adaptation.
no code implementations • 8 Nov 2018 • Bertrand Higy, Peter Bell
End-to-end approaches have recently become popular as a means of simplifying the training and deployment of speech recognition systems.
1 code implementation • 30 Aug 2018 • Ondřej Klejch, Joachim Fainberg, Peter Bell
The performance of automatic speech recognition systems can be improved by adapting an acoustic model to compensate for the mismatch between training and testing conditions, for example by adapting to unseen speakers.
no code implementations • 21 Sep 2017 • Ahmed Ali, Preslav Nakov, Peter Bell, Steve Renals
We study the problem of evaluating automatic speech recognition (ASR) systems that target dialectal speech input.
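ASR evaluation of the kind studied here is conventionally scored with word error rate (WER), the word-level edit distance between the reference and the system output, normalized by reference length. A minimal self-contained sketch of that standard metric (the example sentences are invented):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: Levenshtein distance over words, divided by the
    number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deletion, six reference words
```

Exact word matching is what makes WER problematic for dialectal speech, where several surface forms of a word may all be acceptable; that mismatch is the motivation for studying alternative evaluation here.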
no code implementations • EACL 2017 • Renars Liepins, Ulrich Germann, Guntis Barzdins, Alexandra Birch, Steve Renals, Susanne Weber, Peggy van der Kreeft, Hervé Bourlard, João Prieto, Ondřej Klejch, Peter Bell, Alexandros Lazaridis, Alfonso Mendes, Sebastian Riedel, Mariana S. C. Almeida, Pedro Balage, Shay B. Cohen, Tomasz Dwojak, Philip N. Garner, Andreas Giefer, Marcin Junczys-Dowmunt, Hina Imran, David Nogueira, Ahmed Ali, Sebastião Miranda, Andrei Popescu-Belis, Lesly Miculicich Werlen, Nikos Papasarantopoulos, Abiola Obamuyide, Clive Jones, Fahim Dalvi, Andreas Vlachos, Yang Wang, Sibo Tong, Rico Sennrich, Nikolaos Pappas, Shashi Narayan, Marco Damonte, Nadir Durrani, Sameer Khurana, Ahmed Abdelali, Hassan Sajjad, Stephan Vogel, David Sheppey, Chris Hernon, Jeff Mitchell
We present the first prototype of the SUMMA Platform: an integrated platform for multilingual media monitoring.
no code implementations • 19 Sep 2016 • Ahmed Ali, Peter Bell, James Glass, Yacine Messaoui, Hamdy Mubarak, Steve Renals, Yifan Zhang
For language modelling, we made available over 110M words crawled from the Aljazeera Arabic website Aljazeera.net, covering the 10-year period 2000–2011.
1 code implementation • 23 Sep 2015 • Ahmed Ali, Najim Dehak, Patrick Cardinal, Sameer Khurana, Sree Harsha Yella, James Glass, Peter Bell, Steve Renals
We used these features in a binary classifier to discriminate between Modern Standard Arabic (MSA) and Dialectal Arabic, with an accuracy of 100%.