1 code implementation • ECCV 2020 • Zhe Niu, Brian Mak
In this paper, we propose novel stochastic modeling of various components of a continuous sign language recognition (CSLR) system that is based on the transformer encoder and connectionist temporal classification (CTC).
Ranked #12 on Sign Language Recognition on RWTH-PHOENIX-Weather 2014 T
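CTC, used in the system above, scores a label sequence by summing the probabilities of all frame-level alignments that collapse to it. A minimal sketch of the CTC forward (alpha) recursion in plain Python — the variable names and the toy example are illustrative, not taken from the paper:

```python
# CTC forward algorithm: probability of a label sequence given
# per-frame class probabilities, summed over all valid alignments.
BLANK = 0

def ctc_prob(frame_probs, labels):
    """frame_probs: list of per-frame probability vectors (index 0 = blank).
    labels: target label sequence (no blanks)."""
    # Interleave blanks: l' = [blank, l1, blank, l2, ..., blank]
    ext = [BLANK]
    for l in labels:
        ext += [l, BLANK]
    S = len(ext)
    # alpha[s]: total probability of all alignment prefixes ending at l'[s]
    alpha = [0.0] * S
    alpha[0] = frame_probs[0][BLANK]
    if S > 1:
        alpha[1] = frame_probs[0][ext[1]]
    for t in range(1, len(frame_probs)):
        new = [0.0] * S
        for s in range(S):
            a = alpha[s]
            if s >= 1:
                a += alpha[s - 1]
            # Skip transition is allowed only between distinct non-blank labels
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                a += alpha[s - 2]
            new[s] = a * frame_probs[t][ext[s]]
        alpha = new
    return alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)

# Toy check: 2 frames, vocab {blank, 'a'}, uniform 0.5/0.5 per frame.
# Alignments collapsing to "a": (a,a), (a,-), (-,a) -> 3 * 0.25 = 0.75
p = ctc_prob([[0.5, 0.5], [0.5, 0.5]], [1])
```

In practice the recursion is run in log space for numerical stability; the plain-probability form above is kept only for readability.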
no code implementations • 2 May 2024 • Zhe Niu, Ronglai Zuo, Brian Mak, Fangyun Wei
The dataset is collected to enrich resources for HKSL and support research in large-vocabulary continuous sign language recognition (SLR) and translation (SLT).
1 code implementation • 10 Jan 2024 • Ronglai Zuo, Fangyun Wei, Brian Mak
Our approach comprises three phases: 1) developing a sign language dictionary encompassing all glosses present in a target sign language dataset; 2) training an isolated sign language recognition model on augmented signs using both conventional classification loss and our novel saliency loss; 3) employing a sliding window approach on the input sign sequence and feeding each sign clip to the well-optimized model for online recognition.
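The sliding-window phase in step 3 can be sketched as follows; the window size, stride, and the dummy clip classifier are illustrative placeholders, not values from the paper:

```python
def sliding_window_recognize(frames, classify_clip, window_size=16, stride=8):
    """Slide a fixed-size window over the frame sequence, classify each
    clip, and collapse consecutive duplicate predictions into one gloss."""
    preds = []
    for start in range(0, max(len(frames) - window_size, 0) + 1, stride):
        clip = frames[start:start + window_size]
        preds.append(classify_clip(clip))
    # Collapse runs of identical predictions (a simple decoding heuristic)
    return [p for i, p in enumerate(preds) if i == 0 or p != preds[i - 1]]

# Dummy clip classifier: predicts the majority frame label in the clip
def majority(clip):
    return max(set(clip), key=clip.count)

out = sliding_window_recognize(["A"] * 20 + ["B"] * 20, majority,
                               window_size=16, stride=8)
```

A real system would replace `majority` with the trained isolated-sign model and feed it stacks of video frames rather than symbolic labels.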
1 code implementation • 9 Jan 2024 • Ronglai Zuo, Fangyun Wei, Zenggui Chen, Brian Mak, Jiaolong Yang, Xin Tong
The objective of this paper is to develop a functional system for translating spoken languages into sign languages, referred to as Spoken2Sign translation.
1 code implementation • CVPR 2023 • Ronglai Zuo, Fangyun Wei, Brian Mak
Sign languages are visual languages that convey information through signers' handshapes, facial expressions, body movements, and so forth.
Ranked #1 on Sign Language Recognition on WLASL-2000
no code implementations • ICCV 2023 • Zhe Niu, Brian Mak
Most lip-to-speech (LTS) synthesis models are trained and evaluated under the assumption that the audio-video pairs in the dataset are perfectly synchronized.
1 code implementation • 26 Dec 2022 • Ronglai Zuo, Brian Mak
The first task enhances the visual module, which is particularly sensitive to insufficient training, from the perspective of consistency.
Ranked #10 on Sign Language Recognition on CSL-Daily
1 code implementation • 2 Nov 2022 • Yutong Chen, Ronglai Zuo, Fangyun Wei, Yu Wu, Shujie Liu, Brian Mak
RGB videos, however, are raw signals with substantial visual redundancy, leading the encoder to overlook the key information for sign language understanding.
Ranked #1 on Sign Language Translation on CSL-Daily
no code implementations • CVPR 2022 • Ronglai Zuo, Brian Mak
The backbone of most deep-learning-based continuous sign language recognition (CSLR) models consists of three parts: a visual module, a sequential module, and an alignment module.
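The three-module backbone described above can be sketched as a minimal pipeline; the component internals here are stand-in stubs (a real system would use a CNN, a transformer or BiLSTM, and CTC decoding, respectively):

```python
class CSLRBackbone:
    """Visual module -> sequential module -> alignment module."""

    def __init__(self, visual, sequential, alignment):
        self.visual = visual          # per-frame feature extractor (e.g. CNN)
        self.sequential = sequential  # temporal model (e.g. transformer/BiLSTM)
        self.alignment = alignment    # maps frame scores to glosses (e.g. CTC)

    def recognize(self, frames):
        feats = [self.visual(f) for f in frames]
        scores = self.sequential(feats)
        return self.alignment(scores)

# Stand-in components for illustration only
visual = lambda f: f * 2                          # "features" = scaled frames
sequential = lambda xs: xs                        # identity temporal model
alignment = lambda xs: [x for i, x in enumerate(xs)
                        if i == 0 or x != xs[i - 1]]  # collapse repeats

model = CSLRBackbone(visual, sequential, alignment)
result = model.recognize([1, 1, 2, 2, 3])
```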
no code implementations • 19 Aug 2020 • Wei Li, Brian Mak
LASER, one of the current state-of-the-art multilingual document embedding models, is based on a bidirectional LSTM neural machine translation model.
no code implementations • 26 Nov 2019 • Zhaoyu Liu, Brian Mak
Speaker similarity remains good for native speech from native speakers.
no code implementations • 29 Jul 2018 • Wei Li, Brian Mak
This paper further adds a distance constraint to the training objective function of NV so that the two embeddings of a parallel document are required to be as close as possible.
Cross-Lingual Document Classification, Document Classification, +5 more
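The distance constraint described above can be sketched as an auxiliary term added to the training objective; the squared-Euclidean penalty and the weight `lam` are illustrative assumptions, not the paper's exact formulation:

```python
def distance_penalty(emb_a, emb_b):
    """Squared Euclidean distance between the two embeddings of a
    parallel document pair."""
    return sum((a - b) ** 2 for a, b in zip(emb_a, emb_b))

def total_loss(task_loss, emb_a, emb_b, lam=0.1):
    """Original objective plus the distance constraint, weighted by lam,
    pushing the two embeddings of a parallel document to be close."""
    return task_loss + lam * distance_penalty(emb_a, emb_b)

loss = total_loss(1.0, [1.0, 2.0], [1.0, 0.0], lam=0.5)
```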
no code implementations • EACL 2017 • Wei Li, Brian Mak
In many natural language processing (NLP) tasks, a document is commonly modeled as a bag of words using the term frequency-inverse document frequency (TF-IDF) vector.
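The TF-IDF representation mentioned above can be computed as follows; this sketch uses raw term counts and `idf = log(N / df)`, one common variant of the weighting:

```python
import math
from collections import Counter

def tfidf(corpus):
    """corpus: list of documents, each a list of tokens.
    Returns one {term: tf-idf weight} dict per document."""
    n_docs = len(corpus)
    # Document frequency: number of documents containing each term
    df = Counter(term for doc in corpus for term in set(doc))
    vectors = []
    for doc in corpus:
        tf = Counter(doc)  # raw term counts
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors

# Terms appearing in every document get weight 0; rarer terms weigh more.
vecs = tfidf([["a", "b"], ["a", "c"]])
```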