no code implementations • 18 Oct 2023 • Tae Jin Park, He Huang, Ante Jukic, Kunal Dhawan, Krishna C. Puvvada, Nithin Koluguri, Nikolay Karpov, Aleksandr Laptev, Jagadeesh Balam, Boris Ginsburg
We present the NVIDIA NeMo team's multi-channel speech recognition system for the 7th CHiME Challenge Distant Automatic Speech Recognition (DASR) Task, focusing on the development of a multi-channel, multi-speaker speech recognition system tailored to transcribe speech from distributed microphones and microphone arrays.
no code implementations • 4 Oct 2023 • Aleksandr Meister, Matvei Novikov, Nikolay Karpov, Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg
Traditional automatic speech recognition (ASR) models output lower-cased words without punctuation marks, which reduces readability and necessitates a subsequent text processing model to convert ASR transcripts into a proper format.
1 code implementation • 23 Dec 2022 • Vladimir Kondratenko, Artem Sokolov, Nikolay Karpov, Oleg Kutuzov, Nikita Savushkin, Fyodor Minkin
We present a new dataset for speech emotion recognition (SER) tasks called Dusha.
Ranked #1 on Speech Emotion Recognition on Dusha Podcast
1 code implementation • 18 Jun 2021 • Nikolay Karpov, Alexander Denisenko, Fedor Minkin
This paper introduces a novel Russian speech dataset called Golos, a large corpus suitable for speech research.
no code implementations • SEMEVAL 2017 • Nikolay Karpov
In many areas, such as social science, politics, or market research, people need to deal with dataset shift over time.