no code implementations • LREC 2022 • Denis Ivanko, Alexandr Axyonov, Dmitry Ryumin, Alexey Kashevnik, Alexey Karpov
We present a new audio-visual speech corpus (RUSAVIC) recorded in a car environment and designed for noise-robust speech recognition.
1 code implementation • Expert Systems with Applications 2024 • Dmitry Ryumin, Alexandr Axyonov, Elena Ryumina, Denis Ivanko, Alexey Kashevnik, Alexey Karpov
The article introduces a novel audio-visual speech command recognition transformer (AVCRFormer) specifically designed for robust AVSR.
Ranked #1 on Audio-Visual Speech Recognition on LRW
no code implementations • 19 Mar 2024 • Elena Ryumina, Maxim Markitantov, Dmitry Ryumin, Heysem Kaya, Alexey Karpov
Our findings from the challenge demonstrate that the proposed method can potentially form a basis for developing intelligent tools for annotating audio-visual data in the context of basic and compound human emotions.
no code implementations • 19 Mar 2024 • Denis Dresvyanskiy, Maxim Markitantov, Jiawei Yu, Peitong Li, Heysem Kaya, Alexey Karpov
As emotions play a central role in human communication, automatic emotion recognition has attracted increasing attention in the last two decades.
1 code implementation • Expert Systems with Applications 2023 • Elena Ryumina, Maxim Markitantov, Dmitry Ryumin, Alexey Karpov
Earlier psychological and neurological studies suggested that personality type can be determined from the whole face as well as from its sides.
Personality Trait Recognition • Personality Trait Recognition by Face
1 code implementation • Neurocomputing 2022 • Elena Ryumina, Denis Dresvyanskiy, Alexey Karpov
Many researchers have been seeking a robust emotion recognition system for the last two decades.
Ranked #1 on Facial Expression Recognition (FER) on Aff-Wild2
no code implementations • 30th European Signal Processing Conference (EUSIPCO) 2022 • Denis Ivanko, Dmitry Ryumin, Alexey Kashevnik, Alexandr Axyonov, Alexey Karpov
After a comprehensive evaluation, we adapt the developed method and test it on the RUSAVIC corpus, which we recorded in the wild for vehicle drivers.
Ranked #4 on Lipreading on Lip Reading in the Wild
no code implementations • 7 Oct 2020 • Denis Dresvyanskiy, Elena Ryumina, Heysem Kaya, Maxim Markitantov, Alexey Karpov, Wolfgang Minker
In this paper, we present our contribution to the ABAW facial expression challenge.
1 code implementation • 7 Sep 2020 • Gizem Soğancıoğlu, Oxana Verkholyak, Heysem Kaya, Dmitrii Fedotov, Tobias Cadèe, Albert Ali Salah, Alexey Karpov
Acoustic and linguistic analysis for elderly emotion recognition is an under-studied and challenging research direction, but it is essential for the creation of digital assistants for the elderly, as well as for unobtrusive telemonitoring of the elderly in their residences for mental healthcare purposes.
no code implementations • LREC 2020 • Irina Kipyatkova, Alexey Karpov
We achieved a WER of 14.94% on our own corpus of continuous Russian speech, which is a 15% relative reduction with respect to the baseline 3-gram model.
no code implementations • LREC 2020 • Ildar Kagirov, Denis Ivanko, Dmitry Ryumin, Alexander Axyonov, Alexey Karpov
The database includes lexical units (single words and phrases) from Russian sign language within one subject area, namely "food products at the supermarket", and was collected using an MS Kinect 2.0 device in both FullHD video and depth map modes. It provides new opportunities for the lexicographical description of the Russian sign language vocabulary and enhances research in the field of automatic gesture recognition.
no code implementations • WS 2019 • Oleg Akhtiamov, Ingo Siegert, Alexey Karpov, Wolfgang Minker
Mixup is shown to be beneficial for merging acoustic data (extracted features, not raw waveforms) from different domains, which allows us to reach higher classification performance on human-machine AD and to train a multipurpose neural network capable of solving both human-machine and adult-child AD problems.
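For readers unfamiliar with mixup, the following is a minimal sketch of the general technique applied to extracted feature vectors, as the abstract describes. It is not the authors' implementation: the function name `mixup_features`, the `alpha` value, and the one-hot label format are assumptions made for illustration, and the abstract does not state which settings the paper used. The sketch follows the common formulation in which a weight drawn from a Beta(alpha, alpha) distribution blends two examples and their labels.

```python
import random


def mixup_features(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two feature vectors and their one-hot labels (hypothetical helper).

    x_a, x_b: feature vectors of equal length, e.g. from two different corpora.
    y_a, y_b: one-hot (or soft) label vectors of equal length.
    alpha:    Beta distribution parameter; 0.2 is an illustrative assumption.
    """
    if len(x_a) != len(x_b) or len(y_a) != len(y_b):
        raise ValueError("feature and label vectors must have matching lengths")
    # Mixing weight in [0, 1], drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the inputs and of the labels with the same weight.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Usage: mix one example from each of two domains (toy values).
rng = random.Random(0)
x_mix, y_mix, lam = mixup_features(
    [1.0, 2.0], [1.0, 0.0],  # example from domain A, class 0
    [3.0, 4.0], [0.0, 1.0],  # example from domain B, class 1
    rng=rng,
)
```

Because the same weight is applied to the features and the labels, the mixed label remains a valid probability vector, and each mixed feature lies between the two source values.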