no code implementations • PANDL (COLING) 2022 • Remo Nitschke, Yuwei Wang, Chen Chen, Adarsh Pyarelal, Rebecca Sharp
Natural language (as opposed to structured communication modes such as Morse code) is by far the most common mode of communication between humans, and can thus provide significant insight into both individual mental states and interpersonal dynamics.
no code implementations • 16 May 2024 • Yuchen Hu, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng, Ruizhe Li
Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which aims to predict the ground-truth transcription from the decoded N-best hypotheses.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
1 code implementation • 12 May 2024 • Weiwei Weng, Mahardhika Pratama, Jie Zhang, Chen Chen, Edward Yapp Kien Yee, Ramasamy Savitha
To this end, this article proposes a cross-domain CL approach making possible to deploy a single model in such environments without additional labelling costs.
no code implementations • 9 May 2024 • Chen Chen, Kai Qiao, Jie Yang, Jian Chen, Bin Yan
In this model, the teacher-guided MIM pretraining model is introduced into PCB CT image element segmentation for the first time, and a multi-scale local visual field extraction (MVE) module is proposed to reduce redundancy by focusing on local visual fields.
no code implementations • 9 May 2024 • Zonglin Lyu, Ming Li, Jianbo Jiao, Chen Chen
To address this problem, we propose our unique solution: Frame Interpolation with Consecutive Brownian Bridge Diffusion.
1 code implementation • 2 May 2024 • Wenshuo Chen, Hongru Xiao, Erhang Zhang, Lijie Hu, Lei Wang, Mengyuan Liu, Chen Chen
We present a methodology for constructing an SATO that satisfies the stability of attention and prediction.
no code implementations • 2 May 2024 • Matias Mendieta, Guangyu Sun, Chen Chen
Federated learning (FL) enables multiple clients to train models collectively while preserving data privacy.
no code implementations • 24 Apr 2024 • Jincheng Dai, Zhuowei Huang, Haiyun Jiang, Chen Chen, Deng Cai, Wei Bi, Shuming Shi
Large Language Models (LLMs), despite their impressive performance on a wide range of tasks, require significant GPU memory and consume substantial computational resources.
no code implementations • 21 Apr 2024 • Anthony Bilic, Chen Chen
Binary breast cancer tumor segmentation with Magnetic Resonance Imaging (MRI) data is typically trained and evaluated on private medical data, which makes comparing deep learning approaches difficult.
1 code implementation • 21 Apr 2024 • Sheng Yan, Mengyuan Liu, Yong Wang, Yang Liu, Chen Chen, Hong Liu
In this paper, we address the unexplored question of temporal sentence localization in human motions (TSLM), aiming to locate a target moment from a 3D human motion that semantically corresponds to a text query.
no code implementations • 18 Apr 2024 • Guangyu Sun, Matias Mendieta, Aritra Dutta, Xin Li, Chen Chen
Multi-modal transformers mark significant progress in different domains, but siloed high-quality data hinders their further improvement.
no code implementations • 17 Apr 2024 • Dingkun Zhang, Sijia Li, Chen Chen, Qingsong Xie, Haonan Lu
To this end, we proposed the layer pruning and normalized distillation for compressing diffusion models (LAPTOP-Diff).
no code implementations • 11 Apr 2024 • Haotian Zhang, Haoxuan You, Philipp Dufter, BoWen Zhang, Chen Chen, Hong-You Chen, Tsu-Jui Fu, William Yang Wang, Shih-Fu Chang, Zhe Gan, Yinfei Yang
While Ferret seamlessly integrates regional understanding into the Large Language Model (LLM) to facilitate its referring and grounding capability, it poses certain limitations: constrained by the pre-trained fixed visual encoder and failed to perform well on broader tasks.
Ranked #61 on Visual Question Answering on MM-Vet
1 code implementation • 11 Apr 2024 • Ming Li, Taojiannan Yang, Huafeng Kuang, Jie Wu, Zhaoning Wang, Xuefeng Xiao, Chen Chen
To this end, we propose ControlNet++, a novel approach that improves controllable generation by explicitly optimizing pixel-level cycle consistency between generated images and conditional controls.
no code implementations • 10 Apr 2024 • Aakash Kumar, Chen Chen, Ajmal Mian, Neils Lobo, Mubarak Shah
Our method requires only a small number of 3D points, that can be obtained from a low-cost, low-resolution sensor.
no code implementations • 3 Apr 2024 • Jinbin Huang, Chen Chen, Aditi Mishra, Bum Chul Kwon, Zhicheng Liu, Chris Bryan
Generative image models have emerged as a promising technology to produce realistic images.
no code implementations • 1 Apr 2024 • Chen Chen, Daochang Liu, Chang Xu
Pretrained diffusion models and their outputs are widely accessible due to their exceptional capacity for synthesizing high-quality images and their open-source nature.
1 code implementation • 28 Mar 2024 • Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang, Minwoo Lee, Srijan Das, Chen Chen
To address these challenges, we propose Bidirectional Autoregressive Motion Model (BAMM), a novel text-to-motion generation framework.
no code implementations • 28 Mar 2024 • Yuhang Li, Xin Dong, Chen Chen, Jingtao Li, Yuxin Wen, Michael Spranger, Lingjuan Lyu
Synthetic image data generation represents a promising avenue for training deep learning models, particularly in the realm of transfer learning, where obtaining real images within a specific domain can be prohibitively expensive due to privacy and intellectual property considerations.
1 code implementation • 26 Mar 2024 • Yingfa Chen, Zhengyan Zhang, Xu Han, Chaojun Xiao, Zhiyuan Liu, Chen Chen, Kuai Li, Tao Yang, Maosong Sun
Large language models (LLMs) can make predictions using parametric knowledge--knowledge encoded in the model weights--or contextual knowledge--knowledge presented in the context.
no code implementations • 21 Mar 2024 • Hong Huang, Weiming Zhuang, Chen Chen, Lingjuan Lyu
To address these challenges, we propose FedMef, a novel and memory-efficient federated dynamic pruning framework.
no code implementations • 20 Mar 2024 • Mohammod N. I. Suvon, Prasun C. Tripathi, Wenrui Fan, Shuo Zhou, Xianyuan Liu, Samer Alabed, Venet Osmani, Andrew J. Swift, Chen Chen, Haiping Lu
In response to these limitations, we propose a novel multimodal variational autoencoder ($\text{CardioVAE}_\text{X, G}$) to integrate low-cost chest X-ray (CXR) and electrocardiogram (ECG) modalities with pre-training on a large unlabeled dataset.
no code implementations • 20 Mar 2024 • Ziyao Liu, Huanyi Ye, Chen Chen, Kwok-Yan Lam
Machine Unlearning (MU) has gained considerable attention recently for its potential to achieve Safe AI by removing the influence of specific data from trained machine learning models.
no code implementations • 18 Mar 2024 • Yifan Wang, Yafei Liu, Chufan Shi, Haoling Li, Chen Chen, Haonan Lu, Yujiu Yang
Instruction tuning effectively optimizes Large Language Models (LLMs) for downstream tasks.
1 code implementation • 17 Mar 2024 • Qucheng Peng, Ce Zheng, Chen Chen
Furthermore, the pose estimator's optimization is not exposed to domain shifts, limiting its overall generalization ability.
Domain Generalization Weakly-supervised 3D Human Pose Estimation
no code implementations • 15 Mar 2024 • Chen Chen, Lei LI, Marcel Beetz, Abhirup Banerjee, Ramneek Gupta, Vicente Grau
We present a novel, lightweight dual-attention ECG network designed to capture complex ECG features essential for early HF risk prediction, despite the notable imbalance between low and high-risk groups.
no code implementations • 15 Mar 2024 • Wenrui Fan, Mohammod Naimul Islam Suvon, Shuo Zhou, Xianyuan Liu, Samer Alabed, Venet Osmani, Andrew Swift, Chen Chen, Haiping Lu
Moreover, a novel vision-language Prototypical Contr-astive Learning (ProtoCL) method is adopted in MeDSLIP to enhance the alignment within the anatomical and pathological streams.
no code implementations • 14 Mar 2024 • Lei Wang, Jieming Bian, Letian Zhang, Chen Chen, Jie Xu
Federated learning (FL) allows collaborative machine learning training without sharing private data.
no code implementations • 3 Mar 2024 • Hongjian Liu, Qingsong Xie, Zhijie Deng, Chen Chen, Shixiang Tang, Fueyang Fu, Zheng-Jun Zha, Haonan Lu
In contrast to vanilla consistency distillation (CD) which distills the ordinary differential equation solvers-based sampling process of a pretrained teacher model into a student, SCott explores the possibility and validates the efficacy of integrating stochastic differential equation (SDE) solvers into CD to fully unleash the potential of the teacher.
no code implementations • 2 Mar 2024 • Jian Zhong, Chen Chen, Yuqi Qian, Yiheng Bian, Yuxiong Huang, Zhaohong Bie
This approach is designed to improve the performance of PDS communication networks, adapting to ongoing PDS development and the evolution of PDS services.
no code implementations • 2 Mar 2024 • Jian Zhong, Chen Chen, Qiming Yang, Dafu Liu, Wentao Shen, Chenlin Ji, Zhaohong Bie
This scheme first improves the distribution automation function, EV deployment capability, and traffic operation efficiency by prioritizing the recovery of communication network (CN) and urban traffic network (UTN) loads.
no code implementations • 2 Mar 2024 • Jian Zhong, Chen Chen, Zhaohong Bie, Mohammad Shahidehpour
Indeed, communication network restoration is critical for speedy load recovery through DS automation based microgrid formation.
no code implementations • 2 Mar 2024 • Jian Zhong, Chen Chen, Young-Jin Kim, Yuxiong Huang, Mengjie Teng, Yiheng Bian, Zhaohong Bie
Distribution system (DS) communication failures following extreme events often degrade monitoring and control functions, thus preventing the acquisition of complete global DS component state information, on which existing post-disaster DS restoration methods are based.
1 code implementation • 21 Feb 2024 • Chenyang Song, Xu Han, Zhengyan Zhang, Shengding Hu, Xiyu Shi, Kuai Li, Chen Chen, Zhiyuan Liu, Guangli Li, Tao Yang, Maosong Sun
Some recent efforts have explored introducing ReLU or its variants as the substitutive activation function to help LLMs achieve activation sparsity and inference acceleration, but few can simultaneously obtain high sparsity and comparable model performance.
1 code implementation • 16 Feb 2024 • Ishan Rajendrakumar Dave, Tristan de Blegiers, Chen Chen, Mubarak Shah
Annotating images from LCM significantly increases the burden on medical experts compared to annotating images from high-cost microscopes (HCM).
no code implementations • 15 Feb 2024 • Yu Liu, Zibo Wang, Yifei Zhu, Chen Chen
We also theoretically prove the existence of a fairness-efficiency tradeoff in privacy budgeting.
1 code implementation • 10 Feb 2024 • Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng
Leveraging the rich linguistic knowledge and strong reasoning abilities of LLMs, our new paradigm can integrate the rich information in N-best candidates to generate a higher-quality translation result.
no code implementations • 8 Feb 2024 • Kelly Payette, Céline Steger, Roxane Licandro, Priscille de Dumast, Hongwei Bran Li, Matthew Barkovich, Liu Li, Maik Dannecker, Chen Chen, Cheng Ouyang, Niccolò McConnell, Alina Miron, Yongmin Li, Alena Uus, Irina Grigorescu, Paula Ramirez Gilliland, Md Mahfuzur Rahman Siddiquee, Daguang Xu, Andriy Myronenko, Haoyu Wang, Ziyan Huang, Jin Ye, Mireia Alenyà, Valentin Comte, Oscar Camara, Jean-Baptiste Masson, Astrid Nilsson, Charlotte Godard, Moona Mazher, Abdul Qayyum, Yibo Gao, Hangqi Zhou, Shangqi Gao, Jia Fu, Guiming Dong, Guotai Wang, ZunHyan Rieu, HyeonSik Yang, Minwoo Lee, Szymon Płotka, Michal K. Grzeszczyk, Arkadiusz Sitek, Luisa Vargas Daza, Santiago Usma, Pablo Arbelaez, Wenying Lu, WenHao Zhang, Jing Liang, Romain Valabregue, Anand A. Joshi, Krishna N. Nayak, Richard M. Leahy, Luca Wilhelmi, Aline Dändliker, Hui Ji, Antonio G. Gennari, Anton Jakovčić, Melita Klaić, Ana Adžić, Pavel Marković, Gracia Grabarić, Gregor Kasprian, Gregor Dovjak, Milan Rados, Lana Vasung, Meritxell Bach Cuadra, Andras Jakab
The FeTA Challenge 2022 was able to successfully evaluate and advance generalizability of multi-class fetal brain tissue segmentation algorithms for MRI and it continues to benchmark new algorithms.
no code implementations • 8 Feb 2024 • Chen Chen, Ruizhe Li, Yuchen Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, EnSiong Chng, Chao-Han Huck Yang
Recent studies have successfully shown that large language models (LLMs) can be successfully used for generative error correction (GER) on top of the automatic speech recognition (ASR) output.
Audio-Visual Speech Recognition Automatic Speech Recognition +3
no code implementations • 6 Feb 2024 • Xin Chen, Sukanya Kudva, Yongzheng Dai, Anil Aswani, Chen Chen
The main challenge with the tensor completion problem is a fundamental tension between computation power and the information-theoretic sample complexity rate.
1 code implementation • 4 Feb 2024 • Li Ren, Chen Chen, Liqiang Wang, Kien Hua
As a result of the success of recent pre-trained models trained from larger-scale datasets, it is challenging to adapt the model to the DML tasks in the local data domain while retaining the previously gained knowledge.
Ranked #2 on Image Retrieval on iNaturalist
no code implementations • 4 Feb 2024 • Jincao Yao, Yunpeng Wang, Zhikai Lei, Kai Wang, Xiaoxian Li, Jianhua Zhou, Xiang Hao, Jiafei Shen, Zhenping Wang, Rongrong Ru, Yaqing Chen, Yahan Zhou, Chen Chen, YanMing Zhang, Ping Liang, Dong Xu
After training, ThyGPT could automatically evaluate thyroid nodule and engage in effective communication with physicians through human-computer interaction.
no code implementations • 4 Feb 2024 • Mengyuan Liu, Chen Chen, Songtao Wu, Fanyang Meng, Hong Liu
Recognizing interactive actions, including hand-to-hand interaction and human-to-human interaction, has attracted increasing attention for various applications in the field of video analysis and human-robot interaction.
1 code implementation • 19 Jan 2024 • Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Chao Zhang, Pin-Yu Chen, EnSiong Chng
To this end, we propose to extract a language-space noise embedding from the N-best list to represent the noise conditions of source speech, which can promote the denoising process in GER.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +6
no code implementations • 19 Jan 2024 • Hao Ai, Zidong Cao, Haonan Lu, Chen Chen, Jian Ma, Pengyuan Zhou, Tae-Kyun Kim, Pan Hui, Lin Wang
To this end, we propose a transformer-based 360 image outpainting framework called Dream360, which can generate diverse, high-fidelity, and high-resolution panoramas from user-selected viewports, considering the spherical properties of 360 images.
no code implementations • 14 Jan 2024 • Mingli Zhu, Zihao Zhu, Sihong Chen, Chen Chen, Baoyuan Wu
To tackle overfitting challenge, we design a new ensemble model framework cooperated with data augmentation to boost generalization.
1 code implementation • 3 Jan 2024 • Haopeng Li, Andong Deng, Qiuhong Ke, Jun Liu, Hossein Rahmani, Yulan Guo, Bernt Schiele, Chen Chen
Reasoning over sports videos for question answering is an important task with numerous applications, such as player training and information retrieval.
no code implementations • 2 Jan 2024 • Ziheng Xu, Jianwei Niu, Qingfeng Li, Tao Ren, Chen Chen
In this paper we present NID-SLAM, which significantly improves the performance of neural SLAM in dynamic environments.
1 code implementation • 1 Jan 2024 • Li Ren, Chen Chen, Liqiang Wang, Kien Hua
Our experiments on benchmarks, including the popular CUB-200-2011, CARS196, Stanford Online Products, and In-Shop Clothes Retrieval, show that our learning algorithm significantly improves the existing proxy losses and achieves superior results compared to the existing methods.
2 code implementations • 25 Dec 2023 • Jing Wang, Jinagyun Li, Chen Chen, Yisi Zhang, Haoran Shen, Tianxiang Zhang
In this paper, we propose a novel framework based on the adapter mechanism, namely Adaptive FSS, which can efficiently adapt the existing FSS model to the novel classes.
no code implementations • 21 Dec 2023 • Nazmul Karim, Umar Khalid, Hasan Iqbal, Jing Hua, Chen Chen
To date, editing 3D scenes requires either re-training the model to adapt to various 3D edited scenes or design-specific methods for each special editing type.
1 code implementation • 19 Dec 2023 • Xinshun Wang, Qiongjie Cui, Chen Chen, Mengyuan Liu
The past few years has witnessed the dominance of Graph Convolutional Networks (GCNs) over human motion prediction. Various styles of graph convolutions have been proposed, with each one meticulously designed and incorporated into a carefully-crafted network architecture.
1 code implementation • 14 Dec 2023 • Umar Khalid, Hasan Iqbal, Nazmul Karim, Jing Hua, Chen Chen
Our approach achieves faster editing speeds and superior output quality compared to existing 3D editing models, bridging the gap between textual instructions and high-quality 3D scene editing in latent space.
no code implementations • 10 Dec 2023 • Letian Zhang, Ming Li, Chen Chen, Jie Xu
This poses a paradox as the necessary camera pose must be estimated from the entire dataset, even though the data arrives sequentially and future chunks are inaccessible.
1 code implementation • 6 Dec 2023 • Xinshun Wang, Zhongbin Fang, Xia Li, Xiangtai Li, Chen Chen, Mengyuan Liu
Under this setting, the model can perceive tasks from prompts and accomplish them without any extra task-specific head predictions or model fine-tuning.
1 code implementation • 6 Dec 2023 • Ekkasit Pinyoanuntapong, Pu Wang, Minwoo Lee, Chen Chen
MMM consists of two key components: (1) a motion tokenizer that transforms 3D human motion into a sequence of discrete tokens in latent space, and (2) a conditional masked motion transformer that learns to predict randomly masked motion tokens, conditioned on the pre-computed text tokens.
Ranked #5 on Motion Synthesis on KIT Motion-Language
1 code implementation • 30 Nov 2023 • Tongjia Chen, Hongshan Yu, Zhengeng Yang, Zechuan Li, Wei Sun, Chen Chen
Due to the resource-intensive nature of training vision-language models on expansive video data, a majority of studies have centered on adapting pre-trained image-language models to the video domain.
Ranked #2 on Zero-Shot Action Recognition on Kinetics
no code implementations • 30 Nov 2023 • Zhaoning Wang, Ming Li, Chen Chen
Nonetheless, achieving precise control over 3D generation continues to be an arduous task, as using text to control often leads to missing objects and imprecise locations.
1 code implementation • 28 Nov 2023 • Jian Ma, Chen Chen, Qingsong Xie, Haonan Lu
In this paper, we are inspired to propose a simple plug-and-play language transfer method based on knowledge distillation.
Cross-lingual Text-to-Image Generation Knowledge Distillation +1
no code implementations • 24 Nov 2023 • Cuifeng Shen, Yulu Gan, Chen Chen, Xiongwei Zhu, Lele Cheng, Tingting Gao, Jinzhi Wang
The goal of conditional image-to-video (cI2V) generation is to create a believable new video by beginning with the condition, i. e., one image and text. The previous cI2V generation methods conventionally perform in RGB pixel space, with limitations in modeling motion consistency and visual continuity.
no code implementations • 21 Nov 2023 • Yang Li, Chunhe Xia, Wei Liu, Weidong Zhou, Chen Chen, Tianbo Wang
This article proposes Blockchain-based Federated Learning (FBChain) model for federated learning parameter communication to overcome the above two problems.
no code implementations • 15 Nov 2023 • Yixiu Mao, Hongchang Zhang, Chen Chen, Yi Xu, Xiangyang Ji
Offline reinforcement learning suffers from the out-of-distribution issue and extrapolation error.
no code implementations • 1 Nov 2023 • Mengxia Wu, Min Cao, Yang Bai, Ziyin Zeng, Chen Chen, Liqiang Nie, Min Zhang
In this paper, we make the first empirical study of frame selection for TVR.
no code implementations • 30 Oct 2023 • Youbo Lei, Feifei He, Chen Chen, Yingbin Mo, Si Jia Li, Defeng Xie, Haonan Lu
Due to the success of large-scale visual-language pretraining (VLP) models and the widespread use of image-text retrieval in industry areas, it is now critically necessary to reduce the model size and streamline their mobile-device deployment.
no code implementations • 28 Oct 2023 • Haoran Shen, Yifu Zhang, Wenxuan Wang, Chen Chen, Jing Liu, Shanshan Song, Jiangyun Li
As a pioneering work, a dynamic architecture network for medical volumetric segmentation (i. e. Med-DANet) has achieved a favorable accuracy and efficiency trade-off by dynamically selecting a suitable 2D candidate model from the pre-defined model bank for different slices.
no code implementations • 24 Oct 2023 • Song Wang, Yaochen Zhu, Haochen Liu, Zaiyi Zheng, Chen Chen, Jundong Li
Afterward, we provide an innovative taxonomy of KME techniques based on how the new knowledge is introduced into pre-trained LLMs, and investigate existing KME strategies while analyzing key insights, advantages, and limitations of methods from each category.
1 code implementation • 20 Oct 2023 • Binchi Zhang, Yushun Dong, Chen Chen, Yada Zhu, Minnan Luo, Jundong Li
Fairness-aware graph neural networks (GNNs) have gained a surge of attention as they can reduce the bias of predictions on any demographic group (e. g., female) in graph-based applications.
no code implementations • 17 Oct 2023 • Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Hexin Liu, Sabato Marco Siniscalchi, Eng Siong Chng
In this work, we propose to leverage large language models (LLMs) and lists of hypotheses generated by an ASR to address the CS problem.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 15 Oct 2023 • Chengwei Qin, Chen Chen, Shafiq Joty
Inspired by the learning paradigm of humans, we propose Dynamic Module Expansion and Adaptation (DMEA), which enables the model to dynamically determine the architecture for acquiring new knowledge based on task correlation and select the most similar previous tasks to facilitate adaptation to new tasks.
no code implementations • 12 Oct 2023 • Xingyue Liu, Jiahao Qi, Chen Chen, Kangcheng Bin, Ping Zhong
Moreover, to meet cross-modality discrepancy and orientation discrepancy challenges, we present a hybrid weights decoupling network (HWDNet) to learn the shared discriminative orientation-invariant features.
no code implementations • 11 Oct 2023 • Chen Chen, Junqing Zhang, Yingying Chen
Physical layer key generation based on reciprocal and random wireless channels has been an attractive solution for securing resource-constrained low-power wide-area networks (LPWANs).
no code implementations • 27 Sep 2023 • Jiawen Wang, Quan Chen, Deze Zeng, Zhuo Song, Chen Chen, Minyi Guo
With the collaborative serving mechanism, only part of node representations are updated during the update phase, and the final representations are calculated in the inference phase.
1 code implementation • NeurIPS 2023 • Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Sabato Macro Siniscalchi, Pin-Yu Chen, Eng Siong Chng
We make our results publicly accessible for reproducible pipelines with released pre-trained models, thus providing a new evaluation paradigm for ASR error correction with LLMs.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
1 code implementation • 26 Sep 2023 • Chenyang Song, Xu Han, Zheni Zeng, Kuai Li, Chen Chen, Zhiyuan Liu, Maosong Sun, Tao Yang
First, Static ConPET can adapt former continual learning methods originally designed for relatively smaller models to LLMs through PET and a dynamic replay strategy, which largely reduces the tuning costs and alleviates the over-fitting and forgetting issue.
no code implementations • 26 Sep 2023 • Zhiqian Lan, YuXuan Jiang, Yao Mu, Chen Chen, Shengbo Eben Li
Motion prediction is crucial for autonomous vehicles to operate safely in complex traffic environments.
no code implementations • 25 Sep 2023 • Tongtong Yuan, Xuange Zhang, Kun Liu, Bo Liu, Chen Chen, Jian Jin, Zhenzhen Jiao
Furthermore, we benchmark SOTA models for four multimodal tasks on this newly created dataset, which serve as new baselines for surveillance video-and-language understanding.
1 code implementation • 25 Sep 2023 • Yang Liu, Chen Chen, Can Wang, Xulin King, Mengyuan Liu
The proposed method decouples functions between the decoder and the encoder by introducing a mask regressor, which predicts the masked patch representation from the visible patch representation encoded by the encoder and the decoder reconstructs the target from the predicted masked patch representation.
Ranked #3 on Few-Shot 3D Point Cloud Classification on ModelNet40 10-way (20-shot) (using extra training data)
1 code implementation • NeurIPS 2023 • Jianzhun Shao, Yun Qu, Chen Chen, Hongchang Zhang, Xiangyang Ji
Offline multi-agent reinforcement learning is challenging due to the coupling effect of both distribution shift issue common in offline setting and the high dimension issue common in multi-agent setting, making the action out-of-distribution (OOD) and value overestimation phenomenon excessively severe.
1 code implementation • ICCV 2023 • Lijun Li, Linrui Tian, Xindi Zhang, Qi Wang, Bang Zhang, Mengyuan Liu, Chen Chen
The current interacting hand (IH) datasets are relatively simplistic in terms of background and texture, with hand joints being annotated by a machine annotator, which may result in inaccuracies, and the diversity of pose distribution is limited.
2 code implementations • 12 Sep 2023 • Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng
More information on the tasks, challenges, and leaderboards are available on https://www. soccer-net. org.
1 code implementation • ICCV 2023 • Hao Chen, Chenyuan Qu, Yu Zhang, Chen Chen, Jianbo Jiao
It is understandable as the model is designed to learn paired mapping (e. g. from a noisy image to its clean version).
Ranked #1 on Denoising on CBSD68 sigm75
no code implementations • 8 Sep 2023 • Sijia Li, Chen Chen, Haonan Lu
In this work, we propose a method with a mixture-of-expert (MOE) controllers to align the text-guided capacity of diffusion models with different kinds of human instructions, enabling our model to handle various open-domain image manipulation tasks with natural language instructions.
1 code implementation • 29 Aug 2023 • Umar Khalid, Hasan Iqbal, Saeed Vahidian, Jing Hua, Chen Chen
Machine learning plays a vital role in industrial HRI by enhancing the adaptability and autonomy of robots in complex environments.
no code implementations • 29 Aug 2023 • Tareen Dawood, Chen Chen, Baldeep S. Sidhua, Bram Ruijsink, Justin Goulda, Bradley Porter, Mark K. Elliott, Vishal Mehta, Christopher A. Rinaldi, Esther Puyol-Anton, Reza Razavi, Andrew P. King
The best-performing model in terms of both classification accuracy and the most common calibration measure, expected calibration error (ECE) was the Confidence Weight method, a novel approach that weights the loss of samples to explicitly penalise confident incorrect predictions.
no code implementations • 18 Aug 2023 • Xiaohan Zhang, Xingyu Li, Waqas Sultani, Chen Chen, Safwan Wshah
We attribute this deficiency to the lack of ability to extract the geometric layout of visual features and models' overfitting to low-level details.
1 code implementation • ICCV 2023 • Guangyu Sun, Matias Mendieta, Jun Luo, Shandong Wu, Chen Chen
Personalized Federated Learning (PFL) represents a promising solution for decentralized learning in heterogeneous data environments.
1 code implementation • ICCV 2023 • Jie Hu, Chen Chen, Liujuan Cao, Shengchuan Zhang, Annan Shu, Guannan Jiang, Rongrong Ji
Through extensive experiments conducted on the COCO and Cityscapes datasets, we demonstrate that PAIS is a promising framework for semi-supervised instance segmentation, particularly in cases where labeled data is severely limited.
no code implementations • 7 Aug 2023 • Pengfei Wu, Chen Chen, Dexiang Lai, Jian Zhong
Instead of integrating the constraint violation penalty with the reward function, its actor gradients are estimated by a Lagrange advantage function which is derived from two critic systems based on economic reward and violation cost.
1 code implementation • ICCV 2023 • Qucheng Peng, Ce Zheng, Chen Chen
To this end, we propose a new task, named source-free domain adaptive HPE, which aims to address the challenges of cross-domain learning of HPE without access to source data during the adaptation process.
no code implementations • 26 Jul 2023 • Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu
Existing Graph Convolutional Networks to achieve human motion prediction largely adopt a one-step scheme, which output the prediction straight from history input, failing to exploit human motion patterns.
no code implementations • 25 Jul 2023 • Chen Chen, Bongshin Lee, Yunhai Wang, Yunjeong Chang, Zhicheng Liu
To facilitate the reuse of existing charts, previous research has examined how to obtain a semantic understanding of a chart by deconstructing its visual representation into reusable components, such as encodings.
1 code implementation • 21 Jul 2023 • Jian Ma, Junhao Liang, Chen Chen, Haonan Lu
In this paper, we propose Subject-Diffusion, a novel open-domain personalized image generation model that, in addition to not requiring test-time fine-tuning, also only requires a single reference image to support personalized generation of single- or multi-subject in any domain.
Diffusion Personalization Tuning Free Text-to-Image Generation
1 code implementation • 21 Jul 2023 • Shengnan Hu, Ce Zheng, Zixiang Zhou, Chen Chen, Gita Sukthankar
Human-centric visual understanding is an important desideratum for effective human-robot interaction.
1 code implementation • ICCV 2023 • Ming Li, Jie Wu, Xionghui Wang, Chen Chen, Jie Qin, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan
To this end, we propose AlignDet, a unified pre-training framework that can be adapted to various existing detectors to alleviate the discrepancies.
1 code implementation • 17 Jul 2023 • Che Liu, Sibo Cheng, Chen Chen, Mengyun Qiao, Weitong Zhang, Anand Shah, Wenjia Bai, Rossella Arcucci
The proposed method, named Medical vision-language pre-training with Frozen language models and Latent spAce Geometry optimization (M-FLAG), leverages a frozen language model for training stability and efficiency and introduces a novel orthogonality loss to harmonize the latent space geometry.
no code implementations • 17 Jul 2023 • Jing Ma, Chen Chen, Anil Vullikanti, Ritwick Mishra, Gregory Madden, Daniel Borrajo, Jundong Li
In this paper, we study the problem of causal effect estimation with treatment entangled in a graph.
1 code implementation • 16 Jul 2023 • Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng
Specifically, we design a noise classification (NC) model to produce acoustic embedding as a noise conditioner for guiding the reverse denoising process.
no code implementations • 11 Jul 2023 • Chen Chen, YuFei Wang, Yang Zhang, Quan Z. Sheng, Kwok-Yan Lam
Previous KGC methods typically represent knowledge graph entities and relations as trainable continuous embeddings and fuse the embeddings of the entity $h$ (or $t$) and relation $r$ into hidden representations of query $(h, r, ?
1 code implementation • 6 Jul 2023 • Zhenting Wang, Chen Chen, Lingjuan Lyu, Dimitris N. Metaxas, Shiqing Ma
To address this issue, we propose a method for detecting such unauthorized data usage by planting the injected memorization into the text-to-image diffusion models trained on the protected dataset.
2 code implementations • 4 Jul 2023 • Chen Chen, YuFei Wang, Aixin Sun, Bing Li, Kwok-Yan Lam
However, the fine-tuned PLMs often overwhelmingly focus on the textual information and overlook structural knowledge.
no code implementations • 2 Jul 2023 • Jingjie Guo, Weitong Zhang, Matthew Sinclair, Daniel Rueckert, Chen Chen
In addition, different from most existing TTA methods which restrict the adaptation to batch normalization blocks in the segmentation network only, we further exploit the use of channel and spatial attention blocks for improved adaptability at test time.
no code implementations • 27 Jun 2023 • Weiming Zhuang, Chen Chen, Lingjuan Lyu
The intersection of the Foundation Model (FM) and Federated Learning (FL) provides mutual benefits, presents a unique opportunity to unlock new possibilities in AI research, and address critical challenges in AI and real-world applications.
1 code implementation • 23 Jun 2023 • Tom Tongjia Chen, Hongshan Yu, Zhengeng Yang, Ming Li, Zechuan Li, Jingwen Wang, Wei Miao, Wei Sun, Chen Chen
Affordance-Centric Question-driven Task Completion (AQTC) has been proposed to acquire knowledge from videos to furnish users with comprehensive and systematic instructions.
1 code implementation • 19 Jun 2023 • Linus Kreitner, Johannes C. Paetzold, Nikolaus Rauch, Chen Chen, Ahmed M. Hagag, Alaa E. Fayed, Sobha Sivaprasad, Sebastian Rausch, Julian Weichsel, Bjoern H. Menze, Matthias Harders, Benjamin Knier, Daniel Rueckert, Martin J. Menten
To address this issue, recent work has employed transfer learning, where a segmentation network is trained on synthetic OCTA images and is then applied to real data.
1 code implementation • 18 Jun 2023 • Yuchen Hu, Ruizhe Li, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng
In this work, we investigate the noise-invariant visual modality to strengthen robustness of AVSR, which can adapt to any testing noises while without dependence on noisy training data, a. k. a., unsupervised noise adaptation.
1 code implementation • 18 Jun 2023 • Yuchen Hu, Chen Chen, Ruizhe Li, Heqing Zou, Eng Siong Chng
In this paper, we aim to learn the shared representations across modalities to bridge their gap.
1 code implementation • 17 Jun 2023 • Song Wang, Xingbo Fu, Kaize Ding, Chen Chen, Huiyuan Chen, Jundong Li
In this way, the server can exploit the computational power of all clients and train the model on a larger set of data samples among all clients.
no code implementations • 14 Jun 2023 • Shirui Pan, Linhao Luo, YuFei Wang, Chen Chen, Jiapu Wang, Xindong Wu
In this article, we present a forward-looking roadmap for the unification of LLMs and KGs.
no code implementations • 14 Jun 2023 • Xuechen Mu, Hankz Hankui Zhuo, Chen Chen, Kai Zhang, Chao Yu, Jianye Hao
Exploring sparse reward multi-agent reinforcement learning (MARL) environments with traps in a collaborative manner is a complex task.
1 code implementation • 13 Jun 2023 • Wentao Wu, Aleksei Timofeev, Chen Chen, BoWen Zhang, Kun Duan, Shuangning Liu, Yantao Zheng, Jonathon Shlens, Xianzhi Du, Zhe Gan, Yinfei Yang
Our approach involves employing a named entity recognition model to extract entities from the alt-text, and then using a CLIP model to select the correct entities as labels of the paired image.
1 code implementation • 12 Jun 2023 • Ruoyan Kong, Ruixuan Sun, Charles Chuankai Zhang, Chen Chen, Sneha Patri, Gayathri Gajjela, Joseph A. Konstan
A single digital newsletter usually contains many messages (regions).
1 code implementation • 31 May 2023 • Hasan Iqbal, Umar Khalid, Jing Hua, Chen Chen
It can be challenging to identify brain MRI anomalies using supervised deep-learning techniques due to anatomical heterogeneity and the requirement for pixel-level labeling.
1 code implementation • 30 May 2023 • Nazmul Karim, Umar Khalid, Mohsen Joneidi, Chen Chen, Nazanin Rahnavard
Text-to-Image (T2I) diffusion models have achieved remarkable success in synthesizing high-quality images conditioned on text prompts.
no code implementations • 29 May 2023 • Zhenting Wang, Chen Chen, Yi Zeng, Lingjuan Lyu, Shiqing Ma
To overcome this problem, we first develop an alteration-free and model-agnostic origin attribution method via input reverse-engineering on image generation models, i. e., inverting the input of a particular model for a specific image.
1 code implementation • 26 May 2023 • Chen Chen, Chao-Han Huck Yang, Kai Li, Yuchen Hu, Pin-Jui Ku, Eng Siong Chng
In this work, we introduce S4M, a new efficient speech separation framework based on neural state-space models (SSM).
1 code implementation • 26 May 2023 • Kai Zhang, Jun Yu, Eashan Adhikarla, Rong Zhou, Zhiling Yan, Yixin Liu, Zhengliang Liu, Lifang He, Brian Davison, Xiang Li, Hui Ren, Sunyang Fu, James Zou, Wei Liu, Jing Huang, Chen Chen, Yuyin Zhou, Tianming Liu, Xun Chen, Yong Chen, Quanzheng Li, Hongfang Liu, Lichao Sun
Conventional task- and modality-specific artificial intelligence (AI) models are inflexible in real-world deployment and maintenance for biomedicine.
Ranked #1 on Text Summarization on MeQSum
no code implementations • 25 May 2023 • Lantian Li, Xiaolou Li, Haoyu Jiang, Chen Chen, Ruihai Hou, Dong Wang
A comprehensive study was conducted to compare CN-Celeb-AV with two popular public AVPR benchmark datasets, and the results demonstrated that CN-Celeb-AV is more in line with real-world scenarios and can be regarded as a new benchmark dataset for AVPR research.
1 code implementation • 23 May 2023 • Yang Bai, Min Cao, Daming Gao, Ziqiang Cao, Chen Chen, Zhenfeng Fan, Liqiang Nie, Min Zhang
RA offsets the overfitting risk by introducing a novel positive relation detection task (i. e., learning to distinguish strong and weak positive pairs).
Ranked #2 on Text based Person Retrieval on RSTPReid
no code implementations • 23 May 2023 • Lijun Li, Li'an Zhuo, Bang Zhang, Liefeng Bo, Chen Chen
Hand mesh reconstruction from the monocular image is a challenging task due to its depth ambiguity and severe occlusion, there remains a non-unique mapping between the monocular image and hand mesh.
1 code implementation • 23 May 2023 • Ruichen Wang, Zekang Chen, Chen Chen, Jian Ma, Haonan Lu, Xiaodong Lin
Our approach produces a more semantically accurate synthesis by constraining the attention regions of each token in the prompt to the image.
no code implementations • 22 May 2023 • Yang Bai, Jingyao Wang, Min Cao, Chen Chen, Ziqiang Cao, Liqiang Nie, Min Zhang
Text-based person search (TBPS) aims to retrieve the images of the target person from a large image gallery based on a given natural language description.
no code implementations • 19 May 2023 • Wenxuan Wang, Jing Liu, Xingjian He, Yisi Zhang, Chen Chen, Jiachen Shen, Yan Zhang, Jiangyun Li
Referring image segmentation (RIS) is a fundamental vision-language task that intends to segment a desired object from an image based on a given natural language expression.
1 code implementation • 16 May 2023 • Yuchen Hu, Ruizhe Li, Chen Chen, Heqing Zou, Qiushi Zhu, Eng Siong Chng
However, most existing AVSR approaches simply fuse the audio and visual features by concatenation, without explicit interactions to capture the deep correlations between them, which results in sub-optimal multimodal representations for downstream speech recognition task.
Audio-Visual Speech Recognition Automatic Speech Recognition +3
1 code implementation • 16 May 2023 • Heqing Zou, Meng Shen, Chen Chen, Yuchen Hu, Deepu Rajan, Eng Siong Chng
Multimodal learning aims to imitate human beings to acquire complementary information from multiple modalities for various downstream tasks.
1 code implementation • 8 May 2023 • Liangliang Cao, BoWen Zhang, Chen Chen, Yinfei Yang, Xianzhi Du, Wencong Zhang, Zhiyun Lu, Yantao Zheng
In this paper, we discuss two effective approaches to improve the efficiency and robustness of CLIP training: (1) augmenting the training dataset while maintaining the same number of optimization steps, and (2) filtering out samples that contain text regions in the image.
no code implementations • 2 May 2023 • Xingbo Fu, Chen Chen, Yushun Dong, Anil Vullikanti, Eili Klein, Gregory Madden, Jundong Li
In this paper, we propose a novel problem of antibiogram pattern prediction that aims to predict which patterns will appear in the future.
1 code implementation • 1 May 2023 • Yilei Hua, Wenhan Wu, Ce Zheng, Aidong Lu, Mengyuan Liu, Chen Chen, Shiqian Wu
This paper proposes an attention-based contrastive learning framework for skeleton representation learning, called SkeAttnCLR, which integrates local similarity and global features for skeleton-based action representations.
no code implementations • 28 Apr 2023 • Chen Chen, Junqing Zhang, Tianyu Lu, Magnus Sandell, Liquan Chen
Different from most previous works that adopt iterative optimisation to solve the problem, the proposed DNN-based algorithm directly obtains the BS precoding and IRS phase shifts as the output of the DNN.
1 code implementation • 27 Apr 2023 • Defeng Xie, Ruichen Wang, Jian Ma, Chen Chen, Haonan Lu, Dong Yang, Fobo Shi, Xiaodong Lin
We introduce a new generative system called Edit Everything, which can take image and text inputs and produce image outputs.
1 code implementation • 21 Apr 2023 • Wenxuan Wang, Jing Wang, Chen Chen, Jianbo Jiao, Yuanxiu Cai, Shanshan Song, Jiangyun Li
The research community has witnessed the powerful potential of self-supervised Masked Image Modeling (MIM), which enables the models capable of learning visual representation from unlabeled data.
no code implementations • 21 Apr 2023 • Wenxuan Wang, Jiachen Shen, Chen Chen, Jianbo Jiao, Jing Liu, Yan Zhang, Shanshan Song, Jiangyun Li
In this paper, we present the study on parameter-efficient transfer learning for medical volumetric segmentation and propose a new framework named Med-Tuning based on intra-stage feature enhancement and inter-stage feature interaction.
no code implementations • 18 Apr 2023 • Sicen Guo, Jiahang Li, Yi Feng, Dacheng Zhou, Denghuang Zhang, Chen Chen, Shuai Su, Xingyi Zhu, Qijun Chen, Rui Fan
To foster advancements in this burgeoning field, we have launched an online open-source benchmark suite, referred to as UDTIRI.
no code implementations • 11 Apr 2023 • Yuchen Hu, Chen Chen, Qiushi Zhu, Eng Siong Chng
Second, during finetuning we propose a Transformer-based code predictor to accurately predict clean codes by modeling the global dependency of input noisy representations, which enables discovery and restoration of high-quality clean representations with reduced distortions.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 7 Apr 2023 • Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu
In recent years, Graph Convolutional Networks (GCNs) have been widely used in human motion prediction, but their performance remains unsatisfactory.
Ranked #3 on Human Pose Forecasting on Human3.6M
no code implementations • 6 Apr 2023 • Sijie Zhu, Linjie Yang, Chen Chen, Mubarak Shah, Xiaohui Shen, Heng Wang
Visual Place Recognition (VPR) estimates the location of query images by matching them with images in a reference database.
no code implementations • CVPR 2023 • Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen
Given a background image and a segmented object, the goal is to train a model to predict plausible placements (location and scale) of the object for compositing.
no code implementations • 31 Mar 2023 • Tao Bai, Chen Chen, Lingjuan Lyu, Jun Zhao, Bihan Wen
Recent studies show that models trained by continual learning can achieve the comparable performances as the standard supervised learning and the learning flexibility of continual learning models enables their wide applications in the real world.
3 code implementations • 31 Mar 2023 • Jian Ma, Mingjun Zhao, Chen Chen, Ruichen Wang, Di Niu, Haonan Lu, Xiaodong Lin
Recent breakthroughs in the field of language-guided image generation have yielded impressive achievements, enabling the creation of high-quality and diverse images based on user instructions. Although the synthesis performance is fascinating, one significant limitation of current image generation models is their insufficient ability to generate text coherently within images, particularly for complex glyph structures like Chinese characters.
Optical Character Recognition (OCR) Text-to-Image Generation
2 code implementations • CVPR 2023 • Qitao Zhao, Ce Zheng, Mengyuan Liu, Pichao Wang, Chen Chen
However, in real scenarios, the performance of PoseFormer and its follow-ups is limited by two factors: (a) The length of the input joint sequence; (b) The quality of 2D joint detection.
1 code implementation • CVPR 2023 • Ishan Rajendrakumar Dave, Mamshad Nayeem Rizve, Chen Chen, Mubarak Shah
We observe that these representations complement each other depending on the nature of the action.
1 code implementation • CVPR 2023 • Ce Zheng, Xianpeng Liu, Guo-Jun Qi, Chen Chen
In this paper, we propose a pure transformer architecture named POoling aTtention TransformER (POTTER) for the HMR task from single images.
Ranked #33 on 3D Human Pose Estimation on 3DPW
1 code implementation • ICCV 2023 • Andong Deng, Taojiannan Yang, Chen Chen
The goal of building a benchmark (suite of datasets) is to provide a unified protocol for fair evaluation and thus facilitate the evolution of a specific area.
no code implementations • 23 Mar 2023 • Ce Zheng, Xianpeng Liu, Mengyuan Liu, Tianfu Wu, Guo-Jun Qi, Chen Chen
While image-based HMR methods have achieved impressive results, they often struggle to recover humans in dynamic scenarios, leading to temporal inconsistencies and non-smooth 3D motion predictions due to the absence of human motion.
Ranked #56 on 3D Human Pose Estimation on 3DPW
1 code implementation • ICCV 2023 • Jie Zhang, Chen Chen, Weiming Zhuang, LingJuan Lv
This paper focuses on an under-explored yet important problem: Federated Class-Continual Learning (FCCL), where new classes are dynamically added in federated learning.
1 code implementation • CVPR 2023 • Jianyang Gu, Kai Wang, Hao Luo, Chen Chen, Wei Jiang, Yuqiang Fang, Shanghang Zhang, Yang You, Jian Zhao
Neural Architecture Search (NAS) has been increasingly appealing to the society of object Re-Identification (ReID), for that task-specific architectures significantly improve the retrieval performance.
Ranked #8 on Vehicle Re-Identification on VehicleID Large
no code implementations • 2 Mar 2023 • Chen Chen, Jie Fu, Lingjuan Lyu
AI Generated Content (AIGC) has received tremendous attention within the past few years, with content generated in the format of image, text, audio, video, etc.
no code implementations • 23 Feb 2023 • Chen Chen, Yuchen Hu, Weiwei Weng, Eng Siong Chng
Deep neural network based speech enhancement technique focuses on learning a noisy-to-clean transformation supervised by paired training data.
no code implementations • 23 Feb 2023 • Chen Chen, Yuchen Hu, Heqing Zou, Linhui Sun, Eng Siong Chng
Deep neural network based speech enhancement approaches aim to learn a noisy-to-clean transformation using a supervised learning paradigm.
1 code implementation • 22 Feb 2023 • Yuchen Hu, Chen Chen, Heqing Zou, Xionghu Zhong, Eng Siong Chng
To alleviate this problem, we propose a novel network to unify speech enhancement and separation with gradient modulation to improve noise-robustness.
1 code implementation • 22 Feb 2023 • Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng
In this paper, we propose a simple yet effective approach called gradient remedy (GR) to solve interference between task gradients in noise-robust speech recognition, from perspectives of both angle and magnitude.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +4
no code implementations • 22 Feb 2023 • Bowen Zhao, Chen Chen, Qian-Wei Wang, Anfeng He, Shu-Tao Xia
For challenge B, we point out that the gradient contribution statistics can be a reliable indicator to inspect whether the optimization is dominated by bias-aligned samples.
no code implementations • 19 Feb 2023 • Jie Zhang, Bo Li, Chen Chen, Lingjuan Lyu, Shuang Wu, Shouhong Ding, Chao Wu
In this work, we propose a novel algorithm called Decision Boundary based Federated Adversarial Training (DBFAT), which consists of two components (local re-weighting and global regularization) to improve both accuracy and robustness of FL systems.
1 code implementation • 13 Feb 2023 • Yuchen Liu, Chen Chen, Lingjuan Lyu, Fangzhao Wu, Sai Wu, Gang Chen
In order to address this issue, we propose GAS, a \shorten approach that can successfully adapt existing robust AGRs to non-IID settings.
2 code implementations • ICCV 2023 • Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen
Geospatial technologies are becoming increasingly essential in our world for a wide range of applications, including agriculture, urban planning, and disaster response.
1 code implementation • 6 Feb 2023 • Taojiannan Yang, Yi Zhu, Yusheng Xie, Aston Zhang, Chen Chen, Mu Li
Recent vision transformer based video models mostly follow the ``image pre-training then finetuning" paradigm and have achieved great success on multiple video benchmarks.
Ranked #2 on Action Recognition on Diving-48 (using extra training data)
1 code implementation • 2 Feb 2023 • Junbo Zhao, Xuefei Ning, Enshu Liu, Binxin Ru, Zixuan Zhou, Tianchen Zhao, Chen Chen, Jiajin Zhang, Qingmin Liao, Yu Wang
In the first step, we train different sub-predictors on different types of available low-fidelity information to extract beneficial knowledge as low-fidelity experts.
no code implementations • 1 Feb 2023 • Chen Chen, Dylan Walker, Venkatesh Saligrama
We propose a novel supervised learning approach for political ideology prediction (PIP) that is capable of predicting out-of-distribution inputs.
1 code implementation • 31 Jan 2023 • Ekkasit Pinyoanuntapong, Ayman Ali, Kalvik Jakkala, Pu Wang, Minwoo Lee, Qucheng Peng, Chen Chen, Zhi Sun
mmWave radar-based gait recognition is a novel user identification method that captures human gait biometrics from mmWave radar return signals.
no code implementations • 30 Jan 2023 • Chen Chen, BoWen Zhang, Liangliang Cao, Jiguang Shen, Tom Gunter, Albin Madappally Jose, Alexander Toshev, Jonathon Shlens, Ruoming Pang, Yinfei Yang
We extend the CLIP model and build a sparse text and image representation (STAIR), where the image and text are mapped to a sparse token space.
no code implementations • 30 Jan 2023 • Bowen Zhao, Chen Chen, Shu-Tao Xia
However, we find that two unfavorable defects are concealed in the prevalent adaptation methodologies like test-time batch normalization (BN) and self-learning.
no code implementations • 27 Jan 2023 • Chen Chen, Wei Emma Zhang, Alireza Seyed Shakeri, Makhmoor Fiza
Despite the great development of document summarisation techniques nowadays, factual inconsistencies between the generated summaries and the original texts still occur from time to time.
1 code implementation • 20 Jan 2023 • Zifan Wu, Chao Yu, Chen Chen, Jianye Hao, Hankz Hankui Zhuo
In Model-based Reinforcement Learning (MBRL), model learning is critical since an inaccurate model can bias policy learning via generating misleading samples.
no code implementations • 19 Jan 2023 • Chen Chen, Junqing Zhang, Tianyu Lu, Magnus Sandell, Liquan Chen
Different from most previous works that adopt the iterative optimization to solve the problem, the proposed DNN based algorithm directly obtains the BS precoding and IRS phase shifts as the output of the DNN.
1 code implementation • 6 Jan 2023 • Song Wang, Yushun Dong, Kaize Ding, Chen Chen, Jundong Li
Recent few-shot node classification methods typically learn from classes with abundant labeled nodes (i. e., meta-training classes) and then generalize to classes with limited labeled nodes (i. e., meta-test classes).
no code implementations • 6 Jan 2023 • Zirou Qiu, Chen Chen, Madhav V. Marathe, S. S. Ravi, Daniel J. Rosenkrantz, Richard E. Stearns, Anil Vullikanti
Fixed points of such dynamical systems represent configurations to which the system converges.
no code implementations • CVPR 2023 • YuAn Wang, Kun Yu, Chen Chen, Xiyuan Hu, Silong Peng
To address this issue, we propose a Spatial-Frequency Dynamic Graph method to exploit the relation-aware features in spatial and frequency domains via dynamic graph learning.
1 code implementation • ICCV 2023 • Shaoyu Zhang, Chen Chen, Silong Peng
Specifically, complementary to the object-level classification loss for model discrimination, we design a generalized average precision (GAP) loss to explicitly optimize the global-level score ranking across different objects.
1 code implementation • CVPR 2023 • Sijie Zhu, Linjie Yang, Chen Chen, Mubarak Shah, Xiaohui Shen, Heng Wang
Visual Place Recognition (VPR) estimates the location of query images by matching them with images in a reference database.
no code implementations • CVPR 2023 • Chen Chen, Daochang Liu, Siqi Ma, Surya Nepal, Chang Xu
However, apart from this standard utility, we identify the "reversed utility" as another crucial aspect, which computes the accuracy on generated data of a classifier trained using real data, dubbed as real2gen accuracy (r2g%).
no code implementations • ICCV 2023 • Saeed Vahidian, Sreevatsank Kadaveru, Woonjoon Baek, Weijia Wang, Vyacheslav Kungurtsev, Chen Chen, Mubarak Shah, Bill Lin
Specifically, we aim to investigate how ordered learning principles can contribute to alleviating the heterogeneity effects in FL.
1 code implementation • 16 Dec 2022 • Zeju Li, Konstantinos Kamnitsas, Cheng Ouyang, Chen Chen, Ben Glocker
The results demonstrate that CoLab can guide the segmentation model to map the logits of background samples away from the decision boundary, resulting in significantly improved segmentation accuracy.
3 code implementations • 13 Dec 2022 • Zhe Zhao, Yudong Li, Cheng Hou, Jing Zhao, Rong Tian, Weijie Liu, Yiren Chen, Ningyuan Sun, Haoyan Liu, Weiquan Mao, Han Guo, Weigang Guo, Taiqiang Wu, Tao Zhu, Wenhang Shi, Chen Chen, Shan Huang, Sihong Chen, Liqun Liu, Feifei Li, Xiaoshuai Chen, Xingwu Sun, Zhanhui Kang, Xiaoyong Du, Linlin Shen, Kimmo Yan
The proposed pre-training models of different modalities are showing a rising trend of homogeneity in their model structures, which brings the opportunity to implement different pre-training models within a uniform framework.
no code implementations • 10 Dec 2022 • Chen Chen, Yuchen Hu, Qiang Zhang, Heqing Zou, Beier Zhu, Eng Siong Chng
Audio-visual speech recognition (AVSR) has gained remarkable success for ameliorating the noise-robustness of speech recognition.
1 code implementation • ICCV 2023 • Jun Luo, Matias Mendieta, Chen Chen, Shandong Wu
Based on our observation, in this work, we propose Personalized Global Federated Learning (PGFed), a novel personalized FL framework that enables each client to personalize its own global objective by explicitly and adaptively aggregating the empirical risks of itself and other clients.
1 code implementation • 28 Nov 2022 • Xian Zhong, Zipeng Li, Shuqin Chen, Kui Jiang, Chen Chen, Mang Ye
In this paper, we introduce a novel Refined Semantic enhancement method towards Frequency Diffusion (RSFD), a captioning model that constantly perceives the linguistic representation of the infrequent tokens.
1 code implementation • 28 Nov 2022 • Wenhao Pan, Anil Aswani, Chen Chen
A recent approach, based on integer programming, resolves this tension for nonnegative tensor completion.
no code implementations • 28 Nov 2022 • Chen Chen, Hongyao Tang, Yi Ma, Chao Wang, Qianli Shen, Dong Li, Jianye Hao
The key idea of SA-PP is leveraging discounted stationary state distribution ratios between the learning policy and the offline dataset to modulate the degree of behavior regularization in a state-wise manner, so that pessimism can be implemented in a more appropriate way.
no code implementations • 17 Nov 2022 • Andong Deng, Taojiannan Yang, Chen Chen, Qian Chen, Leslie Neely, Sakiko Oyama
In such cases, automatic recognition systems based on computer vision and machine learning (in particular deep learning) technology can alleviate this issue to a large extent.
1 code implementation • 16 Nov 2022 • Taojiannan Yang, Linjie Yang, Xiaojie Jin, Chen Chen
In this paper, we revisit these training-free metrics and find that: (1) the number of parameters (\#Param), which is the most straightforward training-free metric, is overlooked in previous works but is surprisingly effective, (2) recent training-free metrics largely rely on the \#Param information to rank networks.
1 code implementation • 27 Oct 2022 • Ekkasit Pinyoanuntapong, Ayman Ali, Pu Wang, Minwoo Lee, Chen Chen
Most existing gait recognition methods are appearance-based, which rely on the silhouettes extracted from the video data of human walking activities.
Ranked #6 on Multiview Gait Recognition on CASIA-B
1 code implementation • 21 Oct 2022 • Song Wang, Chen Chen, Jundong Li
Therefore, to adaptively learn node representations across meta-tasks, we propose a novel framework that learns a task-specific structure for each meta-task.
no code implementations • 20 Oct 2022 • Min Cao, Cong Ding, Chen Chen, Junchi Yan, Qi Tian
Based on a natural assumption that images belonging to the same person identity should not match with images belonging to multiple different person identities across views, called the unicity of person matching on the identity level, we propose an end-to-end person unicity matching architecture for learning and refining the person matching relations.
no code implementations • 12 Oct 2022 • Shuo Wang, Chen Qin, Chengyan Wang, Kang Wang, Haoran Wang, Chen Chen, Cheng Ouyang, Xutong Kuang, Chengliang Dai, Yuanhan Mo, Zhang Shi, Chenchen Dai, Xinrong Chen, He Wang, Wenjia Bai
The quality of cardiac magnetic resonance (CMR) imaging is susceptible to respiratory motion artifacts.
no code implementations • 8 Oct 2022 • Zeyu Gao, Yao Mu, Chen Chen, Jingliang Duan, Shengbo Eben Li, Ping Luo, YanFeng Lu
End-to-end autonomous driving provides a feasible way to automatically maximize overall driving system performance by directly mapping the raw pixels from a front-facing camera to control signals.
7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li
The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.
no code implementations • 4 Oct 2022 • Guangyu Sun, Umar Khalid, Matias Mendieta, Taojiannan Yang, Chen Chen
Recently, the use of small pre-trained models has been shown effective in federated learning optimization and improving convergence.
1 code implementation • 30 Sep 2022 • Mahdi Morafah, Saeed Vahidian, Chen Chen, Mubarak Shah, Bill Lin
Though successful, federated learning presents new challenges for machine learning, especially when the issue of data heterogeneity, also known as Non-IID data, arises.
1 code implementation • 21 Sep 2022 • Saeed Vahidian, Mahdi Morafah, Weijia Wang, Vyacheslav Kungurtsev, Chen Chen, Mubarak Shah, Bill Lin
This small set of principal vectors is provided to the server so that the server can directly identify distribution similarities among the clients to form clusters.
1 code implementation • COLING 2022 • Chen Chen, YuFei Wang, Bing Li, Kwok-Yan Lam
To remedy the KG structure information loss from the "flat" text, we further improve the input representations of entities and relations, and the inference algorithm in KG-S2S.
no code implementations • 9 Sep 2022 • Chen Chen, Yue Dai, Josiah Poon, Caren Han
Text-based games(TBG) are complex environments which allow users or computer agents to make textual interactions and achieve game goals. In TBG agent design and training process, balancing the efficiency and performance of the agent models is a major challenge.
no code implementations • 1 Sep 2022 • Wenhan Wu, Yilei Hua, Ce Zheng, Shiqian Wu, Chen Chen, Aidong Lu
Given the unmasked skeleton sequence, the encoder is fine-tuned for the action recognition task.
1 code implementation • 31 Aug 2022 • Xiaoqin Wang, Chen Chen, Rushi Lan, Licheng Liu, Zhenbing Liu, Huiyu Zhou, Xiaonan Luo
Different personalized subspaces are constructed to reflect category-specific attributes for different clusters, adaptively mapping instances within the same cluster to the same Hamming space.
no code implementations • 22 Aug 2022 • Qucheng Peng, Zhengming Ding, Lingjuan Lyu, Lichao Sun, Chen Chen
For the input-level, we design a new data augmentation technique as Phase MixUp, which highlights task-relevant objects in the interpolations, thus enhancing input-level regularization and class consistency for target models.
1 code implementation • 4 Aug 2022 • Cheng Ouyang, Shuo Wang, Chen Chen, Zeju Li, Wenjia Bai, Bernhard Kainz, Daniel Rueckert
In image segmentation, well-calibrated probabilities allow radiologists to identify regions where model-predicted segmentations are unreliable.
no code implementations • 24 Jul 2022 • Xingbo Fu, Binchi Zhang, Yushun Dong, Chen Chen, Jundong Li
Federated Graph Machine Learning (FGML) is a promising solution to tackle this challenge by training graph machine learning models in a federated manner.
1 code implementation • 21 Jul 2022 • Kui Jiang, Zhongyuan Wang, Chen Chen, Zheng Wang, Laizhong Cui, Chia-Wen Lin
Convolutional neural network (CNN) and Transformer have achieved great success in multimedia applications.
1 code implementation • 20 Jul 2022 • Zeju Li, Konstantinos Kamnitsas, Mobarakol Islam, Chen Chen, Ben Glocker
If we could estimate the performance that a pre-trained model would achieve on data from a specific deployment setting, for example a certain clinic, we could judge whether the model could safely be deployed or if its performance degrades unacceptably on the specific data.
no code implementations • 16 Jul 2022 • Jiahao Qi, Zhiqiang Gong, Xingyue Liu, Kangcheng Bin, Chen Chen, YongQian Li, Wei Xue, Yu Zhang, Ping Zhong
Deep learning methodology contributes a lot to the development of hyperspectral image (HSI) analysis community.
1 code implementation • 14 Jul 2022 • Chen Chen, Yisen Wang, Honghua Chen, Xuefeng Yan, Dayong Ren, Yanwen Guo, Haoran Xie, Fu Lee Wang, Mingqiang Wei
Semantic segmentation of point clouds, aiming to assign each point a semantic category, is critical to 3D scene understanding. Despite of significant advances in recent years, most of existing methods still suffer from either the object-level misclassification or the boundary-level ambiguity.
1 code implementation • 6 Jul 2022 • Shruti Vyas, Chen Chen, Mubarak Shah
There are no existing datasets for this problem, therefore we propose GAMa dataset, a large-scale dataset with ground videos and corresponding aerial images.