Search Results for author: Jiaming Zhou

Found 10 papers, 3 papers with code

Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual Datastores

no code implementations • 6 Jun 2024 • Jiaming Zhou, Shiwan Zhao, Hui Wang, Tian-Hao Zhang, Haoqin Sun, Xuechen Wang, Yong Qin

To address this, we propose a novel kNN-CTC-based code-switching ASR (CS-ASR) framework that employs dual monolingual datastores and a gated datastore selection mechanism to reduce noise interference.

CKGConv: General Graph Convolution with Continuous Kernels

1 code implementation • 21 Apr 2024 • Liheng Ma, Soumyasundar Pal, Yitian Zhang, Jiaming Zhou, Yingxue Zhang, Mark Coates

In this work, we propose a novel and general graph convolution framework by parameterizing the kernels as continuous functions of pseudo-coordinates derived via graph positional encoding.

Graph Classification, Graph Learning, +1

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

1 code implementation • 3 Mar 2024 • Kun-Yu Lin, Henghui Ding, Jiaming Zhou, Yu-Ming Tang, Yi-Xing Peng, Zhilin Zhao, Chen Change Loy, Wei-Shi Zheng

To answer this, we establish a CROSS-domain Open-Vocabulary Action recognition benchmark named XOV-Action, and conduct a comprehensive evaluation of five state-of-the-art CLIP-based video learners under various types of domain gaps.

Open Vocabulary Action Recognition

ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition

no code implementations • 22 Jan 2024 • Jiaming Zhou, Junwei Liang, Kun-Yu Lin, Jinrui Yang, Wei-Shi Zheng

With the proposed ActionHub dataset, we further propose a novel Cross-modality and Cross-action Modeling (CoCo) framework for ZSAR, which consists of a Dual Cross-modality Alignment module and a Cross-action Invariance Mining module.

Action Recognition, Video Description, +1

GeoDeformer: Geometric Deformable Transformer for Action Recognition

no code implementations • 29 Nov 2023 • Jinhui Ye, Jiaming Zhou, Hui Xiong, Junwei Liang

Specifically, at the core of GeoDeformer is the Geometric Deformation Predictor, a module designed to identify and quantify potential spatial and temporal geometric deformations within the given video.

Action Recognition

Towards Weakly Supervised End-to-end Learning for Long-video Action Recognition

no code implementations • 28 Nov 2023 • Jiaming Zhou, Hanjun Li, Kun-Yu Lin, Junwei Liang

To this end, this work aims to build a weakly supervised end-to-end framework for training recognition models on long videos, with only video-level action category labels.

Ranked #1 on Long-video Activity Recognition on Breakfast

Action Classification, Action Recognition, +6

PostRainBench: A comprehensive benchmark and a new model for precipitation forecasting

1 code implementation • 4 Oct 2023 • Yujin Tang, Jiaming Zhou, Xiang Pan, Zeying Gong, Junwei Liang

To address these limitations, we introduce the PostRainBench, a comprehensive multi-variable NWP post-processing benchmark consisting of three datasets for NWP post-processing-based precipitation forecasting.

NWP Post-processing Precipitation Forecasting

CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition

no code implementations • 26 Jul 2023 • Tian-Hao Zhang, Dinghao Zhou, Guiping Zhong, Jiaming Zhou, Baoxiang Li

RNN-T models are widely used in ASR, which rely on the RNN-T loss to achieve length alignment between input audio and target sequence.

Automatic Speech Recognition, speech-recognition, +1

MADI: Inter-domain Matching and Intra-domain Discrimination for Cross-domain Speech Recognition

no code implementations • 22 Feb 2023 • Jiaming Zhou, Shiwan Zhao, Ning Jiang, Guoqing Zhao, Yong Qin

Unsupervised domain adaptation (UDA) aims to improve the performance on the unlabeled target domain by transferring knowledge from the source to the target domain.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +3

Graph-Based High-Order Relation Modeling for Long-Term Action Recognition

no code implementations • CVPR 2021 • Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Wei-Shi Zheng

In this paper, we propose a Graph-based High-order Relation Modeling (GHRM) module to exploit the high-order relations in the long-term actions for long-term action recognition.

Ranked #5 on Video Classification on Breakfast

Action Recognition, Long-video Activity Recognition, +3
