1 code implementation • ECCV 2020 • Zhe Niu, Brian Mak
In this paper, we propose novel stochastic modeling of various components of a continuous sign language recognition (CSLR) system that is based on the transformer encoder and connectionist temporal classification (CTC).
Ranked #12 on Sign Language Recognition on RWTH-PHOENIX-Weather 2014 T
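CTC, used in the system above, scores a label sequence by summing the probabilities of all frame-level alignments that collapse to it. A minimal sketch of the CTC forward (alpha) recursion in plain Python — the variable names and the toy example are illustrative, not taken from the paper:

```python
# CTC forward algorithm: probability of a label sequence given
# per-frame class probabilities, summed over all valid alignments.
BLANK = 0

def ctc_prob(frame_probs, labels):
    """frame_probs: list of per-frame probability vectors (index 0 = blank).
    labels: target label sequence (no blanks)."""
    # Interleave blanks: l' = [blank, l1, blank, l2, ..., blank]
    ext = [BLANK]
    for l in labels:
        ext += [l, BLANK]
    S = len(ext)
    # alpha[s]: total probability of all alignment prefixes ending at l'[s]
    alpha = [0.0] * S
    alpha[0] = frame_probs[0][BLANK]
    if S > 1:
        alpha[1] = frame_probs[0][ext[1]]
    for t in range(1, len(frame_probs)):
        new = [0.0] * S
        for s in range(S):
            a = alpha[s]
            if s >= 1:
                a += alpha[s - 1]
            # Skip transition is allowed only between distinct non-blank labels
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                a += alpha[s - 2]
            new[s] = a * frame_probs[t][ext[s]]
        alpha = new
    return alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)

# Toy check: 2 frames, vocab {blank, 'a'}, uniform 0.5/0.5 per frame.
# Alignments collapsing to "a": (a,a), (a,-), (-,a) -> 3 * 0.25 = 0.75
p = ctc_prob([[0.5, 0.5], [0.5, 0.5]], [1])
```

In practice the recursion is run in log space for numerical stability; the plain-probability form above is kept only for readability.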
no code implementations • 2 May 2024 • Zhe Niu, Ronglai Zuo, Brian Mak, Fangyun Wei
The dataset is collected to enrich resources for HKSL and support research in large-vocabulary continuous sign language recognition (SLR) and translation (SLT).
1 code implementation • 10 Jan 2024 • Ronglai Zuo, Fangyun Wei, Brian Mak
Our approach comprises three phases: 1) developing a sign language dictionary encompassing all glosses present in a target sign language dataset; 2) training an isolated sign language recognition model on augmented signs using both conventional classification loss and our novel saliency loss; 3) employing a sliding window approach on the input sign sequence and feeding each sign clip to the well-optimized model for online recognition.
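The sliding-window phase in step 3 can be sketched as follows; the window size, stride, and the dummy clip classifier are illustrative placeholders, not values from the paper:

```python
def sliding_window_recognize(frames, classify_clip, window_size=16, stride=8):
    """Slide a fixed-size window over the frame sequence, classify each
    clip, and collapse consecutive duplicate predictions into one gloss."""
    preds = []
    for start in range(0, max(len(frames) - window_size, 0) + 1, stride):
        clip = frames[start:start + window_size]
        preds.append(classify_clip(clip))
    # Collapse runs of identical predictions (a simple decoding heuristic)
    return [p for i, p in enumerate(preds) if i == 0 or p != preds[i - 1]]

# Dummy clip classifier: predicts the majority frame label in the clip
def majority(clip):
    return max(set(clip), key=clip.count)

out = sliding_window_recognize(["A"] * 20 + ["B"] * 20, majority,
                               window_size=16, stride=8)
```

A real system would replace `majority` with the trained isolated-sign model and feed it stacks of video frames rather than symbolic labels.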
1 code implementation • 9 Jan 2024 • Ronglai Zuo, Fangyun Wei, Zenggui Chen, Brian Mak, Jiaolong Yang, Xin Tong
The objective of this paper is to develop a functional system for translating spoken languages into sign languages, referred to as Spoken2Sign translation.
1 code implementation • CVPR 2023 • Ronglai Zuo, Fangyun Wei, Brian Mak
Sign languages are visual languages that convey information through signers' handshapes, facial expressions, body movements, and so forth.
Ranked #1 on Sign Language Recognition on WLASL-2000
no code implementations • ICCV 2023 • Zhe Niu, Brian Mak
Most lip-to-speech (LTS) synthesis models are trained and evaluated under the assumption that the audio-video pairs in the dataset are perfectly synchronized.
1 code implementation • 26 Dec 2022 • Ronglai Zuo, Brian Mak
The first task enhances the visual module, which is particularly sensitive to insufficient training, from the perspective of consistency.
Ranked #10 on Sign Language Recognition on CSL-Daily
1 code implementation • 2 Nov 2022 • Yutong Chen, Ronglai Zuo, Fangyun Wei, Yu Wu, Shujie Liu, Brian Mak
RGB videos, however, are raw signals with substantial visual redundancy, leading the encoder to overlook the key information for sign language understanding.
Ranked #1 on Sign Language Translation on CSL-Daily
no code implementations • CVPR 2022 • Ronglai Zuo, Brian Mak
The backbone of most deep-learning-based continuous sign language recognition (CSLR) models consists of three parts: a visual module, a sequential module, and an alignment module.
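The three-module backbone described above can be sketched as a minimal pipeline; the component internals here are stand-in stubs (a real system would use a CNN, a transformer or BiLSTM, and CTC decoding, respectively):

```python
class CSLRBackbone:
    """Visual module -> sequential module -> alignment module."""

    def __init__(self, visual, sequential, alignment):
        self.visual = visual          # per-frame feature extractor (e.g. CNN)
        self.sequential = sequential  # temporal model (e.g. transformer/BiLSTM)
        self.alignment = alignment    # maps frame scores to glosses (e.g. CTC)

    def recognize(self, frames):
        feats = [self.visual(f) for f in frames]
        scores = self.sequential(feats)
        return self.alignment(scores)

# Stand-in components for illustration only
visual = lambda f: f * 2                          # "features" = scaled frames
sequential = lambda xs: xs                        # identity temporal model
alignment = lambda xs: [x for i, x in enumerate(xs)
                        if i == 0 or x != xs[i - 1]]  # collapse repeats

model = CSLRBackbone(visual, sequential, alignment)
result = model.recognize([1, 1, 2, 2, 3])
```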
no code implementations • 19 Aug 2020 • Wei Li, Brian Mak
LASER, one of the current state-of-the-art multilingual document embedding models, is based on a bidirectional LSTM neural machine translation model.
no code implementations • 26 Nov 2019 • Zhaoyu Liu, Brian Mak
Speaker similarity remains good for native speech from native speakers.
no code implementations • 29 Jul 2018 • Wei Li, Brian Mak
This paper further adds a distance constraint to the training objective function of NV so that the two embeddings of a parallel document are required to be as close as possible.
Cross-Lingual Document Classification, Document Classification, +5 more
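The distance constraint described above can be sketched as an auxiliary term added to the training objective; the squared-Euclidean penalty and the weight `lam` are illustrative assumptions, not the paper's exact formulation:

```python
def distance_penalty(emb_a, emb_b):
    """Squared Euclidean distance between the two embeddings of a
    parallel document pair."""
    return sum((a - b) ** 2 for a, b in zip(emb_a, emb_b))

def total_loss(task_loss, emb_a, emb_b, lam=0.1):
    """Original objective plus the distance constraint, weighted by lam,
    pushing the two embeddings of a parallel document to be close."""
    return task_loss + lam * distance_penalty(emb_a, emb_b)

loss = total_loss(1.0, [1.0, 2.0], [1.0, 0.0], lam=0.5)
```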
no code implementations • EACL 2017 • Wei Li, Brian Mak
In many natural language processing (NLP) tasks, a document is commonly modeled as a bag of words using the term frequency-inverse document frequency (TF-IDF) vector.
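The TF-IDF representation mentioned above can be computed as follows; this sketch uses raw term counts and `idf = log(N / df)`, one common variant of the weighting:

```python
import math
from collections import Counter

def tfidf(corpus):
    """corpus: list of documents, each a list of tokens.
    Returns one {term: tf-idf weight} dict per document."""
    n_docs = len(corpus)
    # Document frequency: number of documents containing each term
    df = Counter(term for doc in corpus for term in set(doc))
    vectors = []
    for doc in corpus:
        tf = Counter(doc)  # raw term counts
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors

# Terms appearing in every document get weight 0; rarer terms weigh more.
vecs = tfidf([["a", "b"], ["a", "c"]])
```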