speech-recognition

999 papers with code • 0 benchmarks • 2 datasets

Benchmarks

These leaderboards are used to track progress in speech-recognition.

No evaluation results yet.

Libraries

Use these libraries to find speech-recognition models and implementations.

espnet/espnet • 16 papers • 7,875 stars

msalhab96/SpeeQ • 11 papers

pytorch/fairseq • 10 papers • 29,251 stars

PaddlePaddle/PaddleSpeech • 10 papers • 10,142 stars

See all 23 libraries.

Datasets

Most implemented papers

Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges

autoliuweijie/FastBERT • 8 Mar 2021

Mobile devices such as smartphones and autonomous vehicles increasingly rely on deep neural networks (DNNs) to execute complex inference tasks such as image classification and speech recognition, among others.

ISyNet: Convolutional Neural Networks design for AI accelerator

mindspore-ai/models • 4 Sep 2021

To address this problem, we propose a measure of the hardware efficiency of a neural architecture search space, the matrix efficiency measure (MEM); a search space comprising hardware-efficient operations; a latency-aware scaling method; and ISyNet, a set of architectures designed to be both fast on specialized neural processing unit (NPU) hardware and accurate.

Robust Speech Recognition via Large-Scale Weak Supervision

openai/whisper • Preprint 2022

We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.

A Simple Way to Initialize Recurrent Networks of Rectified Linear Units

facebookresearch/salina • 3 Apr 2015

Learning long term dependencies in recurrent networks is difficult due to vanishing and exploding gradients.

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

TensorSpeech/TensorFlowASR • 7 May 2020

We demonstrate that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate (WER) of 2.1%/4.6% without an external language model (LM), 1.9%/4.1% with an LM, and 2.9%/7.0% with only 10M parameters on the clean/noisy LibriSpeech test sets.
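The WER figures quoted above follow the usual definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The following is a minimal sketch of that computation, not the evaluation code used by the paper; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; real evaluations usually
    normalize text (case, punctuation) before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.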

Unsupervised Cross-lingual Representation Learning for Speech Recognition

huggingface/transformers • 24 Jun 2020

This paper presents XLSR, which learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages.

An Overview of Multi-Task Learning in Deep Neural Networks

shenweichen/DeepCTR • 15 Jun 2017

Multi-task learning (MTL) has led to successes in many applications of machine learning, from natural language processing and speech recognition to computer vision and drug discovery.

Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss

upskyy/Transformer-Transducer • 7 Feb 2020

We present results on the LibriSpeech dataset showing that limiting the left context for self-attention in the Transformer layers makes decoding computationally tractable for streaming, with only a slight degradation in accuracy.
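Limiting the left context for self-attention amounts to masking the attention matrix so that each frame attends only to a fixed window of earlier frames. The sketch below builds such a mask in plain Python; it is an illustration of the general idea rather than the paper's implementation, and the function name `limited_context_mask` and its parameters `left_context` and `right_context` are hypothetical.

```python
from typing import List


def limited_context_mask(
    num_frames: int, left_context: int, right_context: int = 0
) -> List[List[bool]]:
    """Return a num_frames x num_frames boolean attention mask.

    mask[i][j] is True when query frame i may attend to key frame j,
    i.e. when j lies within [i - left_context, i + right_context].
    With right_context=0 no future frames are visible, which is what
    makes frame-by-frame (streaming) decoding possible.
    """
    if num_frames < 0 or left_context < 0 or right_context < 0:
        raise ValueError("all arguments must be non-negative")
    return [
        [i - left_context <= j <= i + right_context for j in range(num_frames)]
        for i in range(num_frames)
    ]


if __name__ == "__main__":
    # Each frame sees itself and at most one previous frame.
    for row in limited_context_mask(num_frames=4, left_context=1):
        print("".join("x" if allowed else "." for allowed in row))
```

With a fixed window, the work per frame depends on `left_context` rather than on the full utterance length, which is the tractability gain the abstract describes.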

Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition

PaddlePaddle/PaddleSpeech • 10 Dec 2020

In this paper, we present a novel two-pass approach to unify streaming and non-streaming end-to-end (E2E) speech recognition in a single model.

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

microsoft/unilm • 26 Oct 2021

Self-supervised learning (SSL) achieves great success in speech recognition, while limited exploration has been attempted for other speech processing tasks.
