Search Results for author: Xiaokang Yang

Found 173 papers, 68 papers with code

Layered Neighborhood Expansion for Incremental Multiple Graph Matching

1 code implementation • ECCV 2020 • Zixuan Chen, Zhihui Xie, Junchi Yan Yinqiang Zheng, Xiaokang Yang

In this paper, we treat the graphs as graphs on a super-graph, and propose a novel breadth first search based method for expanding the neighborhood on the super-graph for a new coming graph, such that the matching with the new graph can be efficiently performed within the constructed neighborhood.

Graph Matching

Paper
Code

Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research

no code implementations • 14 May 2024 • Qinglong Cao, Yuntian Chen, Lu Lu, Hao Sun, Zhenzhong Zeng, Xiaokang Yang, Dongxiao Zhang

Our framework paves the way for sustainable and inclusive VLM research, transcending the barriers between academia and industry.

Domain Adaptation

Paper
Add Code

IPAD: Industrial Process Anomaly Detection Dataset

no code implementations • 23 Apr 2024 • Jinfan Liu, Yichao Yan, Junjie Li, Weiming Zhao, Pengzhi Chu, Xingdong Sheng, Yunhui Liu, Xiaokang Yang

Video anomaly detection (VAD) is a challenging task aiming to recognize anomalies in video frames, and existing large-scale VAD researches primarily focus on road traffic and human activity scenes.

Anomaly Detection Video Anomaly Detection +1

Paper
Add Code

Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting

no code implementations • 22 Apr 2024 • Weili Zeng, Yichao Yan, Qi Zhu, Zhuo Chen, Pengzhi Chu, Weiming Zhao, Xiaokang Yang

Text-to-image (T2I) customization aims to create images that embody specific visual concepts delineated in textual descriptions.

Paper
Add Code

Rethinking Clothes Changing Person ReID: Conflicts, Synthesis, and Optimization

no code implementations • 19 Apr 2024 • Junjie Li, Guanshuo Wang, Fufu Yu, Yichao Yan, Qiong Jia, Shouhong Ding, Xingdong Sheng, Yunhui Liu, Xiaokang Yang

However, such improvement sacrifices the performance under the standard protocol, caused by the inner conflict between standard and CC.

Clothes Changing Person Re-Identification

Paper
Add Code

Tendency-driven Mutual Exclusivity for Weakly Supervised Incremental Semantic Segmentation

no code implementations • 18 Apr 2024 • Chongjie Si, Xuehui Wang, Xiaokang Yang, Wei Shen

However, a scenario usually arises where a pixel is concurrently predicted as an old class by the pre-trained segmentation model and a new class by the seed areas.

Incremental Learning Segmentation +1

Paper
Add Code

NTIRE 2024 Challenge on Image Super-Resolution ($\times$4): Methods and Results

1 code implementation • 15 Apr 2024 • Zheng Chen, Zongwei Wu, Eduard Zamfir, Kai Zhang, Yulun Zhang, Radu Timofte, Xiaokang Yang, Hongyuan Yu, Cheng Wan, Yuxin Hong, Zhijuan Huang, Yajun Zou, Yuan Huang, Jiamin Lin, Bingnan Han, Xianyu Guan, Yongsheng Yu, Daoan Zhang, Xuanwu Yin, Kunlong Zuo, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, Hongyu An, Xinfeng Zhang, Zhiyuan Song, Ziyue Dong, Qing Zhao, Xiaogang Xu, Pengxu Wei, Zhi-chao Dou, Gui-ling Wang, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Cansu Korkmaz, A. Murat Tekalp, Yubin Wei, Xiaole Yan, Binren Li, Haonan Chen, Siqi Zhang, Sihan Chen, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi, Anjali Sarvaiya, Pooja Choksy, Jagrit Joshi, Shubh Kawa, Kishor Upla, Sushrut Patwardhan, Raghavendra Ramachandra, Sadat Hossain, Geongi Park, S. M. Nadim Uddin, Hao Xu, Yanhui Guo, Aman Urumbekov, Xingzhuo Yan, Wei Hao, Minghan Fu, Isaac Orais, Samuel Smith, Ying Liu, Wangwang Jia, Qisheng Xu, Kele Xu, Weijun Yuan, Zhan Li, Wenqin Kuang, Ruijin Guan, Ruting Deng, Zhao Zhang, Bo wang, Suiyi Zhao, Yan Luo, Yanyan Wei, Asif Hussain Khan, Christian Micheloni, Niki Martinel

This paper reviews the NTIRE 2024 challenge on image super-resolution ($\times$4), highlighting the solutions proposed and the outcomes obtained.

Image Super-Resolution valid

Paper
Code

Monocular Identity-Conditioned Facial Reflectance Reconstruction

no code implementations • 30 Mar 2024 • Xingyu Ren, Jiankang Deng, Yuhao Cheng, Jia Guo, Chao Ma, Yichao Yan, Wenhan Zhu, Xiaokang Yang

We first learn a high-quality prior for facial reflectance.

3D Face Reconstruction

Paper
Add Code

UAlign: Pushing the Limit of Template-free Retrosynthesis Prediction with Unsupervised SMILES Alignment

1 code implementation • 25 Mar 2024 • Kaipeng Zeng, Bo Yang, Xin Zhao, Yu Zhang, Fan Nie, Xiaokang Yang, Yaohui Jin, Yanyan Xu

Single-step retrosynthesis prediction, a crucial step in the planning process, has witnessed a surge in interest in recent years due to advancements in AI for science.

Graph-to-Sequence molecular representation +3

Paper
Code

EndoGSLAM: Real-Time Dense Reconstruction and Tracking in Endoscopic Surgeries using Gaussian Splatting

no code implementations • 22 Mar 2024 • Kailing Wang, Chen Yang, Yuehao Wang, Sikuang Li, Yan Wang, Qi Dou, Xiaokang Yang, Wei Shen

Precise camera tracking, high-fidelity 3D tissue reconstruction, and real-time online visualization are critical for intrabody medical imaging devices such as endoscopes and capsule robots.

Simultaneous Localization and Mapping

Paper
Add Code

ReGenNet: Towards Human Action-Reaction Synthesis

no code implementations • 18 Mar 2024 • Liang Xu, Yizhou Zhou, Yichao Yan, Xin Jin, Wenhan Zhu, Fengyun Rao, Xiaokang Yang, Wenjun Zeng

Humans constantly interact with their surrounding environments.

Decoder

Paper
Add Code

Boundary Matters: A Bi-Level Active Finetuning Framework

no code implementations • 15 Mar 2024 • Han Lu, Yichen Xie, Xiaokang Yang, Junchi Yan

In this paper, we propose a Bi-Level Active Finetuning framework to select the samples for annotation in one shot, which includes two stages: core sample selection for diversity, and boundary sample selection for uncertainty.

Active Learning Denoising

Paper
Add Code

Comparison of No-Reference Image Quality Models via MAP Estimation in Diffusion Latents

no code implementations • 11 Mar 2024 • Weixia Zhang, Dingquan Li, Guangtao Zhai, Xiaokang Yang, Kede Ma

Contemporary no-reference image quality assessment (NR-IQA) models can effectively quantify the perceived image quality, with high correlations between model predictions and human perceptual scores on fixed test sets.

Image Enhancement No-Reference Image Quality Assessment +1

Paper
Add Code

A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos

1 code implementation • 11 Mar 2024 • Weixia Zhang, Chengguang Zhu, Jingnan Gao, Yichao Yan, Guangtao Zhai, Xiaokang Yang

However, performance evaluation research lags behind the development of talking head generation techniques.

Talking Head Generation

Paper
Code

Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis

1 code implementation • 7 Mar 2024 • Yuanhao Cai, Yixun Liang, Jiahao Wang, Angtian Wang, Yulun Zhang, Xiaokang Yang, Zongwei Zhou, Alan Yuille

X-ray is widely applied for transmission imaging due to its stronger penetration than natural light.

Novel View Synthesis

Paper
Code

ActiveAD: Planning-Oriented Active Learning for End-to-End Autonomous Driving

no code implementations • 5 Mar 2024 • Han Lu, Xiaosong Jia, Yichen Xie, Wenlong Liao, Xiaokang Yang, Junchi Yan

End-to-end differentiable learning for autonomous driving (AD) has recently become a prominent paradigm.

Active Learning Autonomous Driving +1

Paper
Add Code

Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach

no code implementations • 1 Mar 2024 • Han Lu, Siyu Sun, Yichen Xie, Liqing Zhang, Xiaokang Yang, Junchi Yan

In the long-tailed recognition field, the Decoupled Training paradigm has demonstrated remarkable capabilities among various methods.

Representation Learning

Paper
Add Code

CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling

no code implementations • 6 Feb 2024 • Junchao Gong, Lei Bai, Peng Ye, Wanghan Xu, Na Liu, Jianhua Dai, Xiaokang Yang, Wanli Ouyang

Precipitation nowcasting based on radar data plays a crucial role in extreme weather prediction and has broad implications for disaster management.

Management

Paper
Add Code

Poisson Process for Bayesian Optimization

no code implementations • 5 Feb 2024 • Xiaoxing Wang, Jiaxing Li, Chao Xue, Wei Liu, Weifeng Liu, Xiaokang Yang, Junchi Yan, DaCheng Tao

BayesianOptimization(BO) is a sample-efficient black-box optimizer, and extensive methods have been proposed to build the absolute function response of the black-box function through a probabilistic surrogate model, including Tree-structured Parzen Estimator (TPE), random forest (SMAC), and Gaussian process (GP).

Bayesian Optimization Hyperparameter Optimization +2

Paper
Add Code

Few-Shot Class-Incremental Learning with Prior Knowledge

1 code implementation • 2 Feb 2024 • Wenhao Jiang, Duo Li, Menghan Hu, Guangtao Zhai, Xiaokang Yang, Xiao-Ping Zhang

To tackle the issues of catastrophic forgetting and overfitting in few-shot class-incremental learning (FSCIL), previous work has primarily concentrated on preserving the memory of old knowledge during the incremental phase.

Few-Shot Class-Incremental Learning Incremental Learning

Paper
Code

Vision-Informed Flow Image Super-Resolution with Quaternion Spatial Modeling and Dynamic Flow Convolution

no code implementations • 29 Jan 2024 • Qinglong Cao, Zhengqin Xu, Chao Ma, Xiaokang Yang, Yuntian Chen

To tackle this dilemma, we comprehensively consider the flow visual properties, including the unique flow imaging principle and morphological information, and propose the first flow visual property-informed FISR algorithm.

Image Super-Resolution

Paper
Add Code

Uncertainty-aware Sampling for Long-tailed Semi-supervised Learning

1 code implementation • 9 Jan 2024 • Kuo Yang, Duo Li, Menghan Hu, Guangtao Zhai, Xiaokang Yang, Xiao-Ping Zhang

This approach allows the model to perceive the uncertainty of pseudo-labels at different training stages, thereby adaptively adjusting the selection thresholds for different classes.

Pseudo Label

Paper
Code

Inter-X: Towards Versatile Human-Human Interaction Analysis

no code implementations • 26 Dec 2023 • Liang Xu, Xintao Lv, Yichao Yan, Xin Jin, Shuwen Wu, Congsheng Xu, Yifan Liu, Yizhou Zhou, Fengyun Rao, Xingdong Sheng, Yunhui Liu, Wenjun Zeng, Xiaokang Yang

We also equip Inter-X with versatile annotations of more than 34K fine-grained human part-level textual descriptions, semantic interaction categories, interaction order, and the relationship and personality of the subjects.

Paper
Add Code

Efficient Deformable Tissue Reconstruction via Orthogonal Neural Plane

2 code implementations • 23 Dec 2023 • Chen Yang, Kailing Wang, Yuehao Wang, Qi Dou, Xiaokang Yang, Wei Shen

Intraoperative imaging techniques for reconstructing deformable tissues in vivo are pivotal for advanced surgical systems.

Paper
Code

Appeal: Allow Mislabeled Samples the Chance to be Rectified in Partial Label Learning

no code implementations • 18 Dec 2023 • Chongjie Si, Xuehui Wang, Yan Wang, Xiaokang Yang, Wei Shen

In partial label learning (PLL), each instance is associated with a set of candidate labels among which only one is ground-truth.

Partial Label Learning

Paper
Add Code

VidToMe: Video Token Merging for Zero-Shot Video Editing

no code implementations • 17 Dec 2023 • Xirui Li, Chao Ma, Xiaokang Yang, Ming-Hsuan Yang

In this work, we propose a novel approach to enhance temporal consistency in generated videos by merging self-attention tokens across frames.

Video Editing Video Generation

Paper
Add Code

Domain Prompt Learning with Quaternion Networks

no code implementations • 12 Dec 2023 • Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang

Specifically, the proposed method involves using domain-specific vision features from domain-specific foundation models to guide the transformation of generalized contextual embeddings from the language branch into a specialized space within the quaternion networks.

Contrastive Learning

Paper
Add Code

Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors

no code implementations • 8 Dec 2023 • Tongkun Guan, Wei Shen, Xue Yang, Xuehui Wang, Xiaokang Yang

Existing scene text detection methods typically rely on extensive real data for training.

Scene Text Detection Text Detection

Paper
Add Code

Image Super-Resolution with Text Prompt Diffusion

1 code implementation • 24 Nov 2023 • Zheng Chen, Yulun Zhang, Jinjin Gu, Xin Yuan, Linghe Kong, Guihai Chen, Xiaokang Yang

Specifically, we first design a text-image generation pipeline to integrate text into the SR dataset through the text degradation representation and degradation model.

Image Generation Image Super-Resolution +1

Paper
Code

Binarized 3D Whole-body Human Mesh Recovery

1 code implementation • 24 Nov 2023 • Zhiteng Li, Yulun Zhang, Jing Lin, Haotong Qin, Jinjin Gu, Xin Yuan, Linghe Kong, Xiaokang Yang

In this work, we propose a Binarized Dual Residual Network (BiDRN), a novel quantization method to estimate the 3D human body, face, and hands parameters efficiently.

Binarization Human Mesh Recovery +1

Paper
Code

EvaSurf: Efficient View-Aware Implicit Textured Surface Reconstruction on Mobile Devices

no code implementations • 16 Nov 2023 • Jingnan Gao, Zhuo Chen, Yichao Yan, Bowen Pan, Zhe Wang, Jiangjing Lyu, Xiaokang Yang

In our method, we first employ an efficient surface-based model with a multi-view supervision module to ensure accurate mesh reconstruction.

3D Reconstruction Surface Reconstruction

Paper
Add Code

Generalizable Person Search on Open-world User-Generated Video Content

no code implementations • 16 Oct 2023 • Junjie Li, Guanshuo Wang, Yichao Yan, Fufu Yu, Qiong Jia, Jie Qin, Shouhong Ding, Xiaokang Yang

Person search is a challenging task that involves detecting and retrieving individuals from a large set of un-cropped scene images.

Domain Generalization Person Search

Paper
Add Code

Domain-Controlled Prompt Learning

1 code implementation • 30 Sep 2023 • Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang

Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms, leading to suboptimal performance due to the misinterpretation of specific images in natural image patterns.

Paper
Code

Directional Texture Editing for 3D Models

no code implementations • 26 Sep 2023 • Shengqi Liu, Zhuo Chen, Jingnan Gao, Yichao Yan, Wenhan Zhu, Jiangjing Lyu, Xiaokang Yang

However, the inherent complexity of 3D models and the ambiguous text description lead to the challenge in this task.

3D Object Editing

Paper
Add Code

SAM-PARSER: Fine-tuning SAM Efficiently by Parameter Space Reconstruction

no code implementations • 28 Aug 2023 • Zelin Peng, Zhengqin Xu, Zhilin Zeng, Xiaokang Yang, Wei Shen

Most existing fine-tuning methods attempt to bridge the gaps among different scenarios by introducing a set of new parameters to modify SAM's original parameter space.

Segmentation Semantic Segmentation

Paper
Add Code

Dual Aggregation Transformer for Image Super-Resolution

1 code implementation • ICCV 2023 • Zheng Chen, Yulun Zhang, Jinjin Gu, Linghe Kong, Xiaokang Yang, Fisher Yu

Based on the above idea, we propose a novel Transformer model, Dual Aggregation Transformer (DAT), for image SR. Our DAT aggregates features across spatial and channel dimensions, in the inter-block and intra-block dual manner.

Ranked #6 on Image Super-Resolution on Manga109 - 4x upscaling

Image Super-Resolution

325

Paper
Code

Vid2Act: Activate Offline Videos for Visual RL

no code implementations • 6 Jun 2023 • Minting Pan, Yitao Zheng, Wendong Zhang, Yunbo Wang, Xiaokang Yang

Pretraining RL models on offline video datasets is a promising way to improve their training efficiency in online tasks, but challenging due to the inherent mismatch in tasks, dynamics, and behaviors across domains.

Knowledge Distillation

Paper
Add Code

Reflection Invariance Learning for Few-shot Semantic Segmentation

no code implementations • 1 Jun 2023 • Qinglong Cao, Yuntian Chen, Chao Ma, Xiaokang Yang

Few-shot semantic segmentation (FSS) aims to segment objects of unseen classes in query images with only a few annotated support images.

Few-Shot Semantic Segmentation Segmentation +1

Paper
Add Code

Neural LerPlane Representations for Fast 4D Reconstruction of Deformable Tissues

2 code implementations • 31 May 2023 • Chen Yang, Kailing Wang, Yuehao Wang, Xiaokang Yang, Wei Shen

Reconstructing deformable tissues from endoscopic stereo videos in robotic surgery is crucial for various clinical applications.

4D reconstruction

Paper
Code

Few-Shot Rotation-Invariant Aerial Image Semantic Segmentation

1 code implementation • 29 May 2023 • Qinglong Cao, Yuntian Chen, Chao Ma, Xiaokang Yang

Few-shot aerial image segmentation is a challenging task that involves precisely parsing objects in query aerial images with limited annotated support.

Image Segmentation Segmentation +1

Paper
Code

Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning

no code implementations • 24 May 2023 • Qi Wang, Junming Yang, Yunbo Wang, Xin Jin, Wenjun Zeng, Xiaokang Yang

Training offline reinforcement learning (RL) models using visual inputs poses two significant challenges, i. e., the overfitting problem in representation learning and the overestimation bias for expected future rewards.

Offline RL Reinforcement Learning (RL) +2

Paper
Add Code

DynaVol: Unsupervised Learning for Dynamic Scenes through Object-Centric Voxelization

no code implementations • 30 Apr 2023 • Yanpeng Zhao, Siyu Gao, Yunbo Wang, Xiaokang Yang

The voxel features and global features are complementary and are both leveraged by a compositional NeRF decoder for volume rendering.

Decoder Neural Rendering +4

Paper
Add Code

HyperStyle3D: Text-Guided 3D Portrait Stylization via Hypernetworks

no code implementations • 19 Apr 2023 • Zhuo Chen, Xudong Xu, Yichao Yan, Ye Pan, Wenhan Zhu, Wayne Wu, Bo Dai, Xiaokang Yang

While the use of 3D-aware GANs bypasses the requirement of 3D data, we further alleviate the necessity of style images with the CLIP model being the stylization guidance.

Attribute

Paper
Add Code

NeRFVS: Neural Radiance Fields for Free View Synthesis via Geometry Scaffolds

no code implementations • CVPR 2023 • Chen Yang, Peihao Li, Zanwei Zhou, Shanxin Yuan, Bingbing Liu, Xiaokang Yang, Weichao Qiu, Wei Shen

We present NeRFVS, a novel neural radiance fields (NeRF) based method to enable free navigation in a room.

Paper
Add Code

FengWu: Pushing the Skillful Global Medium-range Weather Forecast beyond 10 Days Lead

1 code implementation • 6 Apr 2023 • Kang Chen, Tao Han, Junchao Gong, Lei Bai, Fenghua Ling, Jing-Jia Luo, Xi Chen, Leiming Ma, Tianning Zhang, Rui Su, Yuanzheng Ci, Bin Li, Xiaokang Yang, Wanli Ouyang

We present FengWu, an advanced data-driven global medium-range weather forecast system based on Artificial Intelligence (AI).

Paper
Code

DRAC: Diabetic Retinopathy Analysis Challenge with Ultra-Wide Optical Coherence Tomography Angiography Images

no code implementations • 5 Apr 2023 • Bo Qian, Hao Chen, Xiangning Wang, Haoxuan Che, Gitaek Kwon, Jaeyoung Kim, Sungjin Choi, Seoyoung Shin, Felix Krause, Markus Unterdechler, Junlin Hou, Rui Feng, Yihao Li, Mostafa El Habib Daho, Qiang Wu, Ping Zhang, Xiaokang Yang, Yiyu Cai, Weiping Jia, Huating Li, Bin Sheng

Computer-assisted automatic analysis of diabetic retinopathy (DR) is of great importance in reducing the risks of vision loss and even blindness.

Benchmarking Data Augmentation +1

Paper
Add Code

Head3D: Complete 3D Head Generation via Tri-plane Feature Distillation

no code implementations • 28 Mar 2023 • Yuhao Cheng, Yichao Yan, Wenhan Zhu, Ye Pan, Bowen Pan, Xiaokang Yang

Head generation with diverse identities is an important task in computer vision and computer graphics, widely used in multimedia applications.

Paper
Add Code

Model-Based Reinforcement Learning with Isolated Imaginations

1 code implementation • 27 Mar 2023 • Minting Pan, Xiangming Zhu, Yitao Zheng, Yunbo Wang, Xiaokang Yang

On top of our previous work, we further consider the sparse dependencies between controllable and noncontrollable states, address the training collapse problem of state decoupling, and validate our approach in transfer learning setups.

Autonomous Driving Model-based Reinforcement Learning +3

Paper
Code

Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

1 code implementation • CVPR 2023 • Weixia Zhang, Guangtao Zhai, Ying WEI, Xiaokang Yang, Kede Ma

We aim at advancing blind image quality assessment (BIQA), which predicts the human perception of image quality without any reference information.

Blind Image Quality Assessment Scene Classification

137

Paper
Code

Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm

1 code implementation • CVPR 2023 • Yichen Xie, Han Lu, Junchi Yan, Xiaokang Yang, Masayoshi Tomizuka, Wei Zhan

We propose a novel method called ActiveFT for active finetuning task to select a subset of data distributing similarly with the entire unlabeled pool and maintaining enough diversity by optimizing a parametric model in the continuous space.

Image Classification Semantic Segmentation

Paper
Code

EasyDGL: Encode, Train and Interpret for Continuous-time Dynamic Graph Learning

1 code implementation • 22 Mar 2023 • Chao Chen, Haoyu Geng, Nianzu Yang, Xiaokang Yang, Junchi Yan

Dynamic graphs arise in various real-world applications, and it is often welcomed to model the dynamics directly in continuous time domain for its flexibility.

Dynamic Link Prediction Dynamic Node Classification +4

122

Paper
Code

Improving Masked Autoencoders by Learning Where to Mask

no code implementations • 12 Mar 2023 • Haijian Chen, Wendong Zhang, Yunbo Wang, Xiaokang Yang

Masked image modeling is a promising self-supervised learning method for visual data.

Image Reconstruction Self-Supervised Learning

Paper
Add Code

Predictive Experience Replay for Continual Visual Control and Forecasting

2 code implementations • 12 Mar 2023 • Wendong Zhang, Geng Chen, Xiangming Zhu, Siyu Gao, Yunbo Wang, Xiaokang Yang

In this paper, we present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.

Continual Learning Model-based Reinforcement Learning +2

Paper
Code

Recursive Generalization Transformer for Image Super-Resolution

1 code implementation • 11 Mar 2023 • Zheng Chen, Yulun Zhang, Jinjin Gu, Linghe Kong, Xiaokang Yang

In this work, we propose the Recursive Generalization Transformer (RGT) for image SR, which can capture global spatial information and is suitable for high-resolution images.

Ranked #5 on Image Super-Resolution on Manga109 - 4x upscaling

Image Reconstruction Image Super-Resolution

Paper
Code

Xformer: Hybrid X-Shaped Transformer for Image Denoising

1 code implementation • 11 Mar 2023 • Jiale Zhang, Yulun Zhang, Jinjin Gu, Jiahua Dong, Linghe Kong, Xiaokang Yang

The channel-wise Transformer block performs direct global context interactions across tokens defined by channel dimension.

Decoder Image Denoising

Paper
Code

Graph Signal Sampling for Inductive One-Bit Matrix Completion: a Closed-form Solution

1 code implementation • 8 Feb 2023 • Chao Chen, Haoyu Geng, Gang Zeng, Zhaobing Han, Hua Chai, Xiaokang Yang, Junchi Yan

Inductive one-bit matrix completion is motivated by modern applications such as recommender systems, where new users would appear at test stage with the ratings consisting of only ones and no zeros.

Matrix Completion Recommendation Systems

Paper
Code

Deep Learning of Partial Graph Matching via Differentiable Top-K

1 code implementation • CVPR 2023 • Runzhong Wang, Ziao Guo, Shaofei Jiang, Xiaokang Yang, Junchi Yan

Graph matching (GM) aims at discovering node matching between graphs, by maximizing the node- and edge-wise affinities between the matched elements.

Ranked #1 on Graph Matching on Willow Object Class (F1 score metric)

Graph Matching Stereo Matching

797

Paper
Code

Improving Fairness in Facial Albedo Estimation via Visual-Textual Cues

no code implementations • CVPR 2023 • Xingyu Ren, Jiankang Deng, Chao Ma, Yichao Yan, Xiaokang Yang

Our key insight is that intrinsic semantic attributes such as race, skin color, and age can constrain the albedo map.

3D Face Reconstruction Fairness +1

Paper
Add Code

3D-Aware Face Swapping

no code implementations • CVPR 2023 • Yixuan Li, Chao Ma, Yichao Yan, Wenhan Zhu, Xiaokang Yang

To achieve this, we take advantage of the strong geometry and texture prior of 3D human faces, where the 2D faces are projected into the latent space of a 3D generative model.

Attribute Face Swapping

Paper
Add Code

A K-variate Time Series Is Worth K Words: Evolution of the Vanilla Transformer Architecture for Long-term Multivariate Time Series Forecasting

no code implementations • 6 Dec 2022 • Zanwei Zhou, RuiZhe Zhong, Chen Yang, Yan Wang, Xiaokang Yang, Wei Shen

In this study, we point out that the current tokenization strategy in MTSF Transformer architectures ignores the token uniformity inductive bias of Transformers.

Decoder Inductive Bias +2

Paper
Add Code

Self-supervised Character-to-Character Distillation for Text Recognition

1 code implementation • ICCV 2023 • Tongkun Guan, Wei Shen, Xue Yang, Qi Feng, Zekun Jiang, Xiaokang Yang

Therefore, exploring the robust text feature representations on unlabeled real images by self-supervised learning is a good solution.

Ranked #1 on self-supervised scene text recognition on Scene Text Recognition Benchmarks

Data Augmentation Representation Learning +7

132

Paper
Code

Adv-Attribute: Inconspicuous and Transferable Adversarial Attack on Face Recognition

no code implementations • 13 Oct 2022 • Shuai Jia, Bangjie Yin, Taiping Yao, Shouhong Ding, Chunhua Shen, Xiaokang Yang, Chao Ma

For face recognition attacks, existing methods typically generate the l_p-norm perturbations on pixels, however, resulting in low attack transferability and high vulnerability to denoising defense models.

Adversarial Attack Attribute +2

Paper
Add Code

Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop

1 code implementation • 3 Oct 2022 • Weixia Zhang, Dingquan Li, Xiongkuo Min, Guangtao Zhai, Guodong Guo, Xiaokang Yang, Kede Ma

No-reference image quality assessment (NR-IQA) aims to quantify how humans perceive visual distortions of digital images without access to their undistorted references.

No-Reference Image Quality Assessment NR-IQA

Paper
Code

Perceptual Quality Assessment of Omnidirectional Images

no code implementations • 6 Jul 2022 • Huiyu Duan, Guangtao Zhai, Xiongkuo Min, Yucheng Zhu, Yi Fang, Xiaokang Yang

The original and distorted omnidirectional images, subjective quality ratings, and the head and eye movement data together constitute the OIQA database.

Image Quality Assessment

Paper
Add Code

A Survey on Label-efficient Deep Image Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction

no code implementations • 4 Jul 2022 • Wei Shen, Zelin Peng, Xuehui Wang, Huayu Wang, Jiazhong Cen, Dongsheng Jiang, Lingxi Xie, Xiaokang Yang, Qi Tian

Next, we summarize the existing label-efficient image segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, and cross-image relation.

Image Segmentation Instance Segmentation +2

Paper
Add Code

Iso-Dream: Isolating and Leveraging Noncontrollable Visual Dynamics in World Models

2 code implementations • 27 May 2022 • Minting Pan, Xiangming Zhu, Yunbo Wang, Xiaokang Yang

First, by optimizing the inverse dynamics, we encourage the world model to learn controllable and noncontrollable sources of spatiotemporal changes on isolated state transition branches.

Autonomous Driving Decision Making

Paper
Code

DOTIN: Dropping Task-Irrelevant Nodes for GNNs

no code implementations • 28 Apr 2022 • Shaofeng Zhang, Feng Zhu, Junchi Yan, Rui Zhao, Xiaokang Yang

Scalability is an important consideration for deep graph neural networks.

Graph Classification Graph Learning

Paper
Add Code

Continual Predictive Learning from Videos

1 code implementation • CVPR 2022 • Geng Chen, Wendong Zhang, Han Lu, Siyu Gao, Yunbo Wang, Mingsheng Long, Xiaokang Yang

Can we develop predictive learning algorithms that can deal with more realistic, non-stationary physical environments?

Continual Learning Test-time Adaptation +1

Paper
Code

Confusing Image Quality Assessment: Towards Better Augmented Reality Experience

1 code implementation • 11 Apr 2022 • Huiyu Duan, Xiongkuo Min, Yucheng Zhu, Guangtao Zhai, Xiaokang Yang, Patrick Le Callet

An objective metric termed CFIQA is also proposed to better evaluate the confusing image quality.

Image Quality Assessment

Paper
Code

Modeling Dynamic User Preference via Dictionary Learning for Sequential Recommendation

1 code implementation • IEEE Transactions on Knowledge and Data Engineering 2021 • Chao Chen, Dongsheng Li, Junchi Yan, Xiaokang Yang

Capturing the dynamics in user preference is crucial to better predict user future behaviors because user preferences often drift over time.

Dictionary Learning Sequential Recommendation

Paper
Code

Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation

1 code implementation • 30 Mar 2022 • Chao Chen, Haoyu Geng, Nianzu Yang, Junchi Yan, Daiyue Xue, Jianping Yu, Xiaokang Yang

User interests are usually dynamic in the real world, which poses both theoretical and practical challenges for learning accurate preferences from rich behavior data.

Dynamic Link Prediction Sequential Recommendation

Paper
Code

Exploring Frequency Adversarial Attacks for Face Forgery Detection

no code implementations • CVPR 2022 • Shuai Jia, Chao Ma, Taiping Yao, Bangjie Yin, Shouhong Ding, Xiaokang Yang

In addition, the proposed frequency attack enhances the transferability across face forgery detectors as black-box attacks.

Adversarial Attack Meta-Learning

Paper
Add Code

Analysis Method of Strapdown Inertial Navigation Error Distribution Based on Covariance Matrix Decomposition

1 code implementation • 22 Mar 2022 • Xiaokang Yang, Gongmin Yan, Fan Liu, Bofan Guan, Sihai Li

Compared with the Monte-Carlo method and other method based on covariance matrix, the proposed method uses more complete error model, considers the interaction effect of error sources and can be easily realized with less computation.

Paper
Code

EAutoDet: Efficient Architecture Search for Object Detection

no code implementations • 21 Mar 2022 • Xiaoxing Wang, Jiale Lin, Junchi Yan, Juanping Zhao, Xiaokang Yang

In contrast, this paper introduces an efficient framework, named EAutoDet, that can discover practical backbone and FPN architectures for object detection in 1. 4 GPU-days.

Ranked #30 on Object Detection In Aerial Images on DOTA (using extra training data)

Object object-detection +1

Paper
Add Code

Facial Geometric Detail Recovery via Implicit Representation

1 code implementation • 18 Mar 2022 • Xingyu Ren, Alexandros Lattas, Baris Gecer, Jiankang Deng, Chao Ma, Xiaokang Yang, Stefanos Zafeiriou

Learning a dense 3D model with fine-scale details from a single facial image is highly challenging and ill-posed.

21,487

Paper
Code

ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation

no code implementations • ICCV 2023 • Liang Xu, Ziyang Song, Dongliang Wang, Jing Su, Zhicheng Fang, Chenjing Ding, Weihao Gan, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng, Wei Wu

We present a GAN-based Transformer for general action-conditioned 3D human motion generation, including not only single-person actions but also multi-person interactive actions.

Paper
Add Code

DialogueNeRF: Towards Realistic Avatar Face-to-Face Conversation Video Generation

no code implementations • 15 Mar 2022 • Yichao Yan, Zanwei Zhou, Zi Wang, Jingnan Gao, Xiaokang Yang

In this paper, we propose a novel unified framework based on neural radiance field (NeRF) to address this task.

Talking Head Generation Video Generation

Paper
Add Code

Self-supervised Implicit Glyph Attention for Text Recognition

1 code implementation • CVPR 2023 • Tongkun Guan, Chaochen Gu, Jingzheng Tu, Xue Yang, Qi Feng, Yudi Zhao, Xiaokang Yang, Wei Shen

Supervised attention can alleviate the above issue, but it is character category-specific, which requires extra laborious character-level bounding box annotations and would be memory-intensive when handling languages with larger character categories.

Ranked #2 on Scene Text Recognition on ICDAR 2003

Scene Text Recognition Text Segmentation

102

Paper
Code

NeuroFluid: Fluid Dynamics Grounding with Particle-Driven Neural Radiance Fields

no code implementations • 3 Mar 2022 • Shanyan Guan, Huayu Deng, Yunbo Wang, Xiaokang Yang

Deep learning has shown great potential for modeling the physical dynamics of complex particle systems such as fluids.

Paper
Add Code

DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering

no code implementations • 3 Jan 2022 • Shunyu Yao, RuiZhe Zhong, Yichao Yan, Guangtao Zhai, Xiaokang Yang

Specifically, neural radiance field takes lip movements features and personalized attributes as two disentangled conditions, where lip movements are directly predicted from the audio inputs to achieve lip-synchronized generation.

Neural Rendering Talking Head Generation

Paper
Add Code

Align Representations With Base: A New Approach to Self-Supervised Learning

no code implementations • CVPR 2022 • Shaofeng Zhang, Lyn Qiu, Feng Zhu, Junchi Yan, Hengrui Zhang, Rui Zhao, Hongyang Li, Xiaokang Yang

Existing symmetric contrastive learning methods suffer from collapses (complete and dimensional) or quadratic complexity of objectives.

Contrastive Learning Self-Supervised Learning

Paper
Add Code

End-to-End Reconstruction-Classification Learning for Face Forgery Detection

1 code implementation • CVPR 2022 • Junyi Cao, Chao Ma, Taiping Yao, Shen Chen, Shouhong Ding, Xiaokang Yang

Reconstruction learning over real images enhances the learned representations to be aware of forgery patterns that are even unknown, while classification learning takes the charge of mining the essential discrepancy between real and fake images, facilitating the understanding of forgeries.

Classification Decoder

Paper
Code

Learning Invisible Markers for Hidden Codes in Offline-to-Online Photography

no code implementations • CVPR 2022 • Jun Jia, Zhongpai Gao, Dandan Zhu, Xiongkuo Min, Guangtao Zhai, Xiaokang Yang

In addition, the automatic localization of hidden codes significantly reduces the time of manually correcting geometric distortions for photos, which is a revolutionary innovation for information hiding in mobile applications.

Paper
Add Code

A General Framework for Evaluating Robustness of Combinatorial Optimization Solvers on Graphs

no code implementations • 28 Dec 2021 • Han Lu, Zenan Li, Runzhong Wang, Qibing Ren, Junchi Yan, Xiaokang Yang

Solving combinatorial optimization (CO) on graphs is among the fundamental tasks for upper-stream applications in data mining, machine learning and operations research.

Adversarial Attack Combinatorial Optimization

Paper
Add Code

Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid

1 code implementation • 8 Dec 2021 • Wendong Zhang, Yunbo Wang, Bingbing Ni, Xiaokang Yang

We train the prior learner and the image generator as a unified model without any post-processing.

Image Inpainting Variational Inference

Paper
Code

TAL: Two-stream Adaptive Learning for Generalizable Person Re-identification

no code implementations • 29 Nov 2021 • Yichao Yan, Junjie Li, Shengcai Liao, Jie Qin, Bingbing Ni, Xiaokang Yang

In the meantime, we design an adaptive BN layer in the domain-invariant stream, to approximate the statistics of various unseen domains.

Domain Generalization Generalizable Person Re-identification +1

Paper
Add Code

Consensus Synergizes with Memory: A Simple Approach for Anomaly Segmentation in Urban Scenes

no code implementations • 24 Nov 2021 • Jiazhong Cen, Zenkun Jiang, Lingxi Xie, Qi Tian, Xiaokang Yang, Wei Shen

Anomaly segmentation is a crucial task for safety-critical applications, such as autonomous driving in urban scenes, where the goal is to detect out-of-distribution (OOD) objects with categories which are unseen during training.

Ranked #10 on Anomaly Detection on Fishyscapes L&F

Anomaly Detection Autonomous Driving +1

Paper
Add Code

Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation

1 code implementation • 7 Nov 2021 • Shanyan Guan, Jingwei Xu, Michelle Z. He, Yunbo Wang, Bingbing Ni, Xiaokang Yang

We consider a new problem of adapting a human mesh reconstruction model to out-of-domain streaming videos, where performance of existing SMPL-based models are significantly affected by the distribution shift represented by different camera parameters, bone lengths, backgrounds, and occlusions.

Ranked #1 on 3D Absolute Human Pose Estimation on Surreal

3D Absolute Human Pose Estimation Bilevel Optimization

221

Paper
Code

ZARTS: On Zero-order Optimization for Neural Architecture Search

no code implementations • 10 Oct 2021 • Xiaoxing Wang, Wenxuan Guo, Junchi Yan, Jianlin Su, Xiaokang Yang

Also, we search on the search space of DARTS to compare with peer methods, and our discovered architecture achieves 97. 54% accuracy on CIFAR-10 and 75. 7% top-1 accuracy on ImageNet, which are state-of-the-art performance.

Neural Architecture Search

Paper
Add Code

DAAS: Differentiable Architecture and Augmentation Policy Search

no code implementations • 30 Sep 2021 • Xiaoxing Wang, Xiangxiang Chu, Junchi Yan, Xiaokang Yang

Neural architecture search (NAS) has been an active direction of automatic machine learning (Auto-ML), aiming to explore efficient network structures.

Data Augmentation Neural Architecture Search

Paper
Add Code

Zero-CL: Instance and Feature decorrelation for negative-free symmetric contrastive learning

no code implementations • ICLR 2022 • Shaofeng Zhang, Feng Zhu, Junchi Yan, Rui Zhao, Xiaokang Yang

The proposed two methods (FCL, ICL) can be combined synthetically, called Zero-CL, where ``Zero'' means negative samples are \textbf{zero} relevant, which allows Zero-CL to completely discard negative pairs i. e., with \textbf{zero} negative samples.

Contrastive Learning

Paper
Add Code

$m$-mix: Generating hard negatives via multiple samples mixing for contrastive learning

no code implementations • 29 Sep 2021 • Shaofeng Zhang, Meng Liu, Junchi Yan, Hengrui Zhang, Lingxiao Huang, Pinyan Lu, Xiaokang Yang

Negative pairs are essential in contrastive learning, which plays the role of avoiding degenerate solutions.

Contrastive Learning Graph Classification +2

Paper
Add Code

On Learning to Solve Cardinality Constrained Combinatorial Optimization in One-Shot: A Re-parameterization Approach via Gumbel-Sinkhorn-TopK

no code implementations • 29 Sep 2021 • Runzhong Wang, Li Shen, Yiting Chen, Junchi Yan, Xiaokang Yang, DaCheng Tao

Cardinality constrained combinatorial optimization requires selecting an optimal subset of $k$ elements, and it will be appealing to design data-driven algorithms that perform TopK selection over a probability distribution predicted by a neural network.

Combinatorial Optimization One-Shot Learning +1

Paper
Add Code

Efficient Person Search: An Anchor-Free Approach

4 code implementations • 1 Sep 2021 • Yichao Yan, Jinpeng Li, Jie Qin, Shengcai Liao, Xiaokang Yang

Third, by investigating the advantages of both anchor-based and anchor-free models, we further augment AlignPS with an ROI-Align head, which significantly improves the robustness of re-id features while still keeping our model highly efficient.

Ranked #4 on Person Search on PRW

Person Search

166

Paper
Code

Learning to Track Objects from Unlabeled Videos

1 code implementation • ICCV 2021 • Jilai Zheng, Chao Ma, Houwen Peng, Xiaokang Yang

In this paper, we propose to learn an Unsupervised Single Object Tracker (USOT) from scratch.

Object Discovery Optical Flow Estimation

Paper
Code

Task-Specific Normalization for Continual Learning of Blind Image Quality Models

2 code implementations • 28 Jul 2021 • Weixia Zhang, Kede Ma, Guangtao Zhai, Xiaokang Yang

In this paper, we present a simple yet effective continual learning method for blind image quality assessment (BIQA) with improved quality prediction accuracy, plasticity-stability trade-off, and task-order/-length robustness.

Blind Image Quality Assessment Continual Learning

Paper
Code

Angel's Girl for Blind Painters: an Efficient Painting Navigation System Validated by Multimodal Evaluation Approach

no code implementations • 27 Jul 2021 • Hang Liu, Menghan Hu, Yuzhen Chen, Qingli Li, Guangtao Zhai, Simon X. Yang, Xiao-Ping Zhang, Xiaokang Yang

This work demonstrates that it is practicable for the blind people to feel the world through the brush in their hands.

Paper
Add Code

Local-to-Global Self-Attention in Vision Transformers

no code implementations • 10 Jul 2021 • Jinpeng Li, Yichao Yan, Shengcai Liao, Xiaokang Yang, Ling Shao

Transformers have demonstrated great potential in computer vision tasks.

Image Classification Semantic Segmentation

Paper
Add Code

PointAugmenting: Cross-Modal Augmentation for 3D Object Detection

no code implementations • CVPR 2021 • Chunwei Wang, Chao Ma, Ming Zhu, Xiaokang Yang

On one hand, PointAugmenting decorates point clouds with corresponding point-wise CNN features extracted by pretrained 2D detection models, and then performs 3D object detection over the decorated point clouds.

3D Object Detection Autonomous Driving +4

Paper
Add Code

Exploring Visual Context for Weakly Supervised Person Search

3 code implementations • 19 Jun 2021 • Yichao Yan, Jinpeng Li, Shengcai Liao, Jie Qin, Bingbing Ni, Xiaokang Yang, Ling Shao

This paper inventively considers weakly supervised person search with only bounding box annotations.

Clustering Pedestrian Detection +2

Paper
Code

Context-Aware Image Inpainting with Learned Semantic Priors

1 code implementation • 14 Jun 2021 • Wendong Zhang, Junwei Zhu, Ying Tai, Yunbo Wang, Wenqing Chu, Bingbing Ni, Chengjie Wang, Xiaokang Yang

Based on the semantic priors, we further propose a context-aware image inpainting model, which adaptively integrates global semantics and local features in a unified image generator.

Image Inpainting Knowledge Distillation

Paper
Code

A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs

1 code implementation • NeurIPS 2021 • Runzhong Wang, Zhigang Hua, Gan Liu, Jiayi Zhang, Junchi Yan, Feng Qi, Shuang Yang, Jun Zhou, Xiaokang Yang

Combinatorial Optimization (CO) has been a long-standing challenging research topic featured by its NP-hard nature.

Combinatorial Optimization Scheduling

Paper
Code

Making CNNs Interpretable by Building Dynamic Sequential Decision Forests with Top-down Hierarchy Learning

no code implementations • 5 Jun 2021 • Yilin Wang, Shaozuo Yu, Xiaokang Yang, Wei Shen

In this paper, we propose a generic model transfer scheme to make Convlutional Neural Networks (CNNs) interpretable, while maintaining their high classification accuracy.

Classification

Paper
Add Code

Scalable and Explainable 1-Bit Matrix Completion via Graph Signal Learning

1 code implementation • AAAI 2021 • Chao Chen, Dongsheng Li, Junchi Yan, Hanchi Huang, Xiaokang Yang

One-bit matrix completion is an important class of positiveunlabeled (PU) learning problems where the observations consist of only positive examples, eg, in top-N recommender systems.

Collaborative Ranking Matrix Completion +1

Paper
Code

Learning Multi-Attention Context Graph for Group-Based Re-Identification

1 code implementation • 29 Apr 2021 • Yichao Yan, Jie Qin, Bingbing Ni, Jiaxin Chen, Li Liu, Fan Zhu, Wei-Shi Zheng, Xiaokang Yang, Ling Shao

Extensive experiments on the novel dataset as well as three existing datasets clearly demonstrate the effectiveness of the proposed framework for both group-based re-id tasks.

Person Re-Identification

Paper
Code

Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction

1 code implementation • CVPR 2021 • Shanyan Guan, Jingwei Xu, Yunbo Wang, Bingbing Ni, Xiaokang Yang

This paper considers a new problem of adapting a pre-trained model of human mesh reconstruction to out-of-domain streaming videos.

Ranked #39 on 3D Human Pose Estimation on 3DPW

3D Human Pose Estimation

Paper
Code

IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking

1 code implementation • CVPR 2021 • Shuai Jia, Yibing Song, Chao Ma, Xiaokang Yang

Recently, adversarial attack has been applied to visual object tracking to evaluate the robustness of deep trackers.

Adversarial Attack Image Classification +3

Paper
Code

Learning Comprehensive Motion Representation for Action Recognition

no code implementations • 23 Mar 2021 • Mingyu Wu, Boyuan Jiang, Donghao Luo, Junchi Yan, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xiaokang Yang

For action recognition learning, 2D CNN-based methods are efficient but may yield redundant features due to applying the same 2D convolution kernel to each frame.

Action Recognition

Paper
Add Code

Continual Learning for Blind Image Quality Assessment

1 code implementation • 19 Feb 2021 • Weixia Zhang, Dingquan Li, Chao Ma, Guangtao Zhai, Xiaokang Yang, Kede Ma

In this paper, we formulate continual learning for BIQA, where a model learns continually from a stream of IQA datasets, building on what was learned from previously seen data.

Blind Image Quality Assessment Continual Learning

Paper
Code

Learning Interpretable Deep State Space Model for Probabilistic Time Series Forecasting

no code implementations • 31 Jan 2021 • Longyuan Li, Junchi Yan, Xiaokang Yang, Yaohui Jin

We propose a deep state space model for probabilistic time series forecasting whereby the non-linear emission model and transition model are parameterized by networks and the dependency is modeled by recurrent neural nets.

Decision Making Management +2

Paper
Add Code

Self-supervised representation learning via adaptive hard-positive mining

no code implementations • 1 Jan 2021 • Shaofeng Zhang, Junchi Yan, Xiaokang Yang

Despite their success in perception over the last decade, deep neural networks are also known ravenous to labeled data for training, which limits their applicability to real-world problems.

Contrastive Learning Representation Learning +1

Paper
Add Code

Rethinking Pseudo-labeled Sample Mining for Semi-Supervised Object Detection

no code implementations • 1 Jan 2021 • Duo Li, Sanli Tang, Zhanzhan Cheng, ShiLiang Pu, Yi Niu, Wenming Tan, Fei Wu, Xiaokang Yang

However, the impact of the pseudo-labeled samples' quality as well as the mining strategies for high quality training sample have rarely been studied in SSL.

object-detection Object Detection +1

Paper
Add Code

Graduated Assignment for Joint Multi-Graph Matching and Clustering with Application to Unsupervised Graph Matching Network Learning

1 code implementation • NeurIPS 2020 • Runzhong Wang, Junchi Yan, Xiaokang Yang

This paper considers the setting of jointly matching and clustering multiple graphs belonging to different groups, which naturally rises in many realistic problems.

Ranked #2 on Graph Matching on Willow Object Class

Clustering Graph Matching

797

Paper
Code

Combinatorial Learning of Graph Edit Distance via Dynamic Embedding

no code implementations • CVPR 2021 • Runzhong Wang, Tianqi Zhang, Tianshu Yu, Junchi Yan, Xiaokang Yang

This paper presents a hybrid approach by combing the interpretability of traditional search-based techniques for producing the edit path, as well as the efficiency and adaptivity of deep embedding models to achieve a cost-effective GED solver.

Paper
Add Code

ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation

no code implementations • ICCV 2023 • Xiaoxing Wang, Xiangxiang Chu, Yuda Fan, Zhexi Zhang, Bo Zhang, Xiaokang Yang, Junchi Yan

Albeit being a prevalent architecture searching approach, differentiable architecture search (DARTS) is largely hindered by its substantial memory cost since the entire supernet resides in the memory.

Disentanglement Neural Architecture Search

Paper
Add Code

Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning

no code implementations • 22 Nov 2020 • Weixia Zhang, Chao Ma, Qi Wu, Xiaokang Yang

We then propose to recursively alternate the learning schemes of imitation and exploration to narrow the discrepancy between training and inference.

Imitation Learning Navigate +1

Paper
Add Code

Hierarchical Style-based Networks for Motion Synthesis

no code implementations • ECCV 2020 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Xiaolong Wang, Trevor Darrell

Generating diverse and natural human motion is one of the long-standing goals for creating intelligent characters in the animated world.

Motion Synthesis

Paper
Add Code

Cross-Modality 3D Object Detection

no code implementations • 16 Aug 2020 • Ming Zhu, Chao Ma, Pan Ji, Xiaokang Yang

In this paper, we focus on exploring the fusion of images and point clouds for 3D object detection in view of the complementary nature of the two modalities, i. e., images possess more semantic information while point clouds specialize in distance sensing.

3D Classification 3D Object Detection +4

Paper
Add Code

Robust Tracking against Adversarial Attacks

2 code implementations • ECCV 2020 • Shuai Jia, Chao Ma, Yibing Song, Xiaokang Yang

On one hand, we add the temporal perturbations into the original video sequences as adversarial examples to greatly degrade the tracking performance.

Adversarial Attack

Paper
Code

Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering

1 code implementation • ECCV 2020 • Ruixue Tang, Chao Ma, Wei Emma Zhang, Qi Wu, Xiaokang Yang

However, there are few works studying the data augmentation problem for VQA and none of the existing image based augmentation schemes (such as rotation and flipping) can be directly applied to VQA due to its semantic structure -- an $\langle image, question, answer\rangle$ triplet needs to be maintained correctly.

Adversarial Attack Data Augmentation +2

Paper
Code

Collaborative Learning for Faster StyleGAN Embedding

no code implementations • 3 Jul 2020 • Shanyan Guan, Ying Tai, Bingbing Ni, Feida Zhu, Feiyue Huang, Xiaokang Yang

The latent code of the recent popular model StyleGAN has learned disentangled representations thanks to the multi-layer style-based generator.

Paper
Add Code

Video Prediction via Example Guidance

1 code implementation • ICML 2020 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Trevor Darrell

In video prediction tasks, one major challenge is to capture the multi-modal nature of future contents and dynamics.

Video Prediction

Paper
Code

Uncertainty-Aware Blind Image Quality Assessment in the Laboratory and Wild

1 code implementation • 28 May 2020 • Weixia Zhang, Kede Ma, Guangtao Zhai, Xiaokang Yang

Nevertheless, due to the distributional shift between images simulated in the laboratory and captured in the wild, models trained on databases with synthetic distortions remain particularly weak at handling realistic distortions (and vice versa).

Blind Image Quality Assessment Learning-To-Rank

121

Paper
Code

Permutation Matters: Anisotropic Convolutional Layer for Learning on Point Clouds

1 code implementation • 27 May 2020 • Zhongpai Gao, Guangtao Zhai, Junchi Yan, Xiaokang Yang

Various point neural networks have been developed with isotropic filters or using weighting matrices to overcome the structure inconsistency on point clouds.

Representation Learning Semantic Segmentation

Paper
Code

SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing

5 code implementations • 28 Apr 2020 • Xue Yang, Junchi Yan, Wenlong Liao, Xiaokang Yang, Jin Tang, Tao He

Instance-level denoising on the feature map is performed to enhance the detection to small and cluttered objects.

Ranked #33 on Object Detection In Aerial Images on DOTA (using extra training data)

Denoising object-detection +1

765

Paper
Code

Learning Local Neighboring Structure for Robust 3D Shape Representation

1 code implementation • 21 Apr 2020 • Zhongpai Gao, Junchi Yan, Guangtao Zhai, Juyong Zhang, Yiyan Yang, Xiaokang Yang

Mesh is a powerful data structure for 3D shapes.

3D Shape Reconstruction 3D Shape Representation +1

Paper
Code

Toward Better Understanding of Saliency Prediction in Augmented 360 Degree Videos

no code implementations • 12 Dec 2019 • Yucheng Zhu, Xiongkuo Min, Dandan Zhu, Ke Gu, Jiantao Zhou, Guangtao Zhai, Xiaokang Yang, Wenjun Zhang

The saliency annotations of head and eye movements for both original and augmented videos are collected and together constitute the ARVR dataset.

Object Recognition Optical Flow Estimation +1

Paper
Add Code

Robust Invisible Hyperlinks in Physical Photographs Based on 3D Rendering Attacks

no code implementations • 3 Dec 2019 • Jun Jia, Zhongpai Gao, Kang Chen, Menghan Hu, Guangtao Zhai, Guodong Guo, Xiaokang Yang

To train a robust decoder against the physical distortion from the real world, a distortion network based on 3D rendering is inserted between the encoder and the decoder to simulate the camera imaging process.

Decoder

Paper
Add Code

Neural Graph Matching Network: Learning Lawler's Quadratic Assignment Problem with Extension to Hypergraph and Multiple-graph Matching

1 code implementation • 26 Nov 2019 • Runzhong Wang, Junchi Yan, Xiaokang Yang

We also show how to extend our network to hypergraph matching, and matching of multiple graphs.

Ranked #6 on Graph Matching on SPair-71k

Combinatorial Optimization General Classification +2

797

Paper
Code

Decoding Spiking Mechanism with Dynamic Learning on Neuron Population

no code implementations • 21 Nov 2019 • Zhijie Chen, Junchi Yan, Longyuan Li, Xiaokang Yang

Our model is aimed to reconstruct neuron information while inferring representations of neuron spiking states.

Paper
Add Code

Deep Unsupervised Clustering with Clustered Generator Model

no code implementations • 19 Nov 2019 • Dandan Zhu, Tian Han, Linqi Zhou, Xiaokang Yang, Ying Nian Wu

We propose the clustered generator model for clustering which contains both continuous and discrete latent variables.

Clustering

Paper
Add Code

Composable Semi-parametric Modelling for Long-range Motion Generation

no code implementations • 25 Sep 2019 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Trevor Darrell

Learning diverse and natural behaviors is one of the longstanding goal for creating intelligent characters in the animated world.

Paper
Add Code

Semi-supervised 3D Face Reconstruction with Nonlinear Disentangled Representations

no code implementations • 25 Sep 2019 • Zhongpai Gao, Juyong Zhang, Yudong Guo, Chao Ma, Guangtao Zhai, Xiaokang Yang

Moreover, the identity and expression representations are entangled in these models, which hurdles many facial editing applications.

3D Face Reconstruction Facial Editing

Paper
Add Code

Learning to Blindly Assess Image Quality in the Laboratory and Wild

1 code implementation • 1 Jul 2019 • Weixia Zhang, Kede Ma, Guangtao Zhai, Xiaokang Yang

Computational models for blind image quality assessment (BIQA) are typically trained in well-controlled laboratory environments with limited generalizability to realistically distorted images.

Blind Image Quality Assessment Learning-To-Rank

121

Paper
Code

Reinforcement Learning with Policy Mixture Model for Temporal Point Processes Clustering

no code implementations • 29 May 2019 • Weichang Wu, Junchi Yan, Xiaokang Yang, Hongyuan Zha

Temporal point process is an expressive tool for modeling event sequences over time.

Clustering Point Processes +2

Paper
Add Code

Gated-GAN: Adversarial Gated Networks for Multi-Collection Style Transfer

2 code implementations • 4 Apr 2019 • Xinyuan Chen, Chang Xu, Xiaokang Yang, Li Song, DaCheng Tao

We propose adversarial gated networks (Gated GAN) to transfer multiple styles in a single model.

Decoder Style Transfer

Paper
Code

Learning Context Graph for Person Search

no code implementations • CVPR 2019 • Yichao Yan, Qiang Zhang, Bingbing Ni, Wendong Zhang, Minghao Xu, Xiaokang Yang

Person re-identification has achieved great progress with deep convolutional neural networks.

Graph Learning Person Re-Identification +1

Paper
Add Code

Learning Combinatorial Embedding Networks for Deep Graph Matching

1 code implementation • ICCV 2019 • Runzhong Wang, Junchi Yan, Xiaokang Yang

In addition with its NP-completeness nature, another important challenge is effective modeling of the node-wise and structure-wise affinity across graphs and the resulting objective, to guide the matching procedure effectively finding the true matching against noises.

Graph Embedding Graph Matching

797

Paper
Code

Video Prediction via Selective Sampling

1 code implementation • NeurIPS 2018 • Jingwei Xu, Bingbing Ni, Xiaokang Yang

Most adversarial learning based video prediction methods suffer from image blur, since the commonly used adversarial and regression loss pair work rather in a competitive way than collaboration, yielding compromised blur effect.

Multiple-choice Video Prediction

Paper
Code

Correlation Propagation Networks for Scene Text Detection

no code implementations • 30 Sep 2018 • Zichuan Liu, Guosheng Lin, Wang Ling Goh, Fayao Liu, Chunhua Shen, Xiaokang Yang

In this work, we propose a novel hybrid method for scene text detection namely Correlation Propagation Network (CPN).

Scene Text Detection Text Detection

Paper
Add Code

Deep Regression Tracking with Shrinkage Loss

1 code implementation • ECCV 2018 • Xiankai Lu, Chao Ma, Bingbing Ni, Xiaokang Yang, Ian Reid, Ming-Hsuan Yang

Regression trackers directly learn a mapping from regularly dense samples of target objects to soft labels, which are usually generated by a Gaussian function, to estimate target positions.

regression

Paper
Code

Multiple Granularity Group Interaction Prediction

no code implementations • CVPR 2018 • Taiping Yao, Minsi Wang, Bingbing Ni, Huawei Wei, Xiaokang Yang

Most human activity analysis works (i. e., recognition orãprediction) only focus on a single granularity, i. e., eitherãmodelling global motion based on the coarse level movement such as human trajectories orãforecasting future detailed action based on body partsâ movement such as skeleton motion.

Paper
Add Code

Crowd Counting via Adversarial Cross-Scale Consistency Pursuit

1 code implementation • CVPR 2018 • Zan Shen, Yi Xu, Bingbing Ni, Minsi Wang, Jianguo Hu, Xiaokang Yang

Crowd counting or density estimation is a challenging task in computer vision due to large scale variations, perspective distortions and serious occlusions, etc.

Ranked #4 on Crowd Counting on WorldExpo’10

Crowd Counting Density Estimation

Paper
Code

Structure Preserving Video Prediction

no code implementations • CVPR 2018 • Jingwei Xu, Bingbing Ni, Zefan Li, Shuo Cheng, Xiaokang Yang

Despite recent emergence of adversarial based methods for video prediction, existing algorithms often produce unsatisfied results in image regions with rich structural information (i. e., object boundary) and detailed motion (i. e., articulated body movement).

Object Video Prediction

Paper
Add Code

Fine-Grained Video Captioning for Sports Narrative

no code implementations • CVPR 2018 • Huanyu Yu, Shuo Cheng, Bingbing Ni, Minsi Wang, Jian Zhang, Xiaokang Yang

First, to facilitate this novel research of fine-grained video caption, we collected a novel dataset called Fine-grained Sports Narrative dataset (FSN) that contains 2K sports videos with ground-truth narratives from YouTube. com.

2k Video Captioning

Paper
Add Code

Attention-GAN for Object Transfiguration in Wild Images

no code implementations • ECCV 2018 • Xinyuan Chen, Chang Xu, Xiaokang Yang, DaCheng Tao

This paper studies the object transfiguration problem in wild images.

Object

Paper
Add Code

Decoupled Learning for Factorial Marked Temporal Point Processes

no code implementations • 21 Jan 2018 • Weichang Wu, Junchi Yan, Xiaokang Yang, Hongyuan Zha

In conventional (multi-dimensional) marked temporal point process models, event is often encoded by a single discrete variable i. e. a marker.

Point Processes

Paper
Add Code

Performance Guaranteed Network Acceleration via High-Order Residual Quantization

no code implementations • ICCV 2017 • Zefan Li, Bingbing Ni, Wenjun Zhang, Xiaokang Yang, Wen Gao

Input binarization has shown to be an effective way for network acceleration.

Binarization Quantization +1

Paper
Add Code

Terahertz Security Image Quality Assessment by No-reference Model Observers

no code implementations • 12 Jul 2017 • Menghan Hu, Xiongkuo Min, Guangtao Zhai, Wenhan Zhu, Yucheng Zhu, Zhaodi Wang, Xiaokang Yang, Guang Tian

Subsequently, the existing no-reference IQA algorithms, which were 5 opinion-aware approaches viz., NFERM, GMLF, DIIVINE, BRISQUE and BLIINDS2, and 8 opinion-unaware approaches viz., QAC, SISBLIM, NIQE, FISBLIM, CPBD, S3 and Fish_bb, were executed for the evaluation of the THz security image quality.

Image Quality Assessment

Paper
Add Code

Robust Visual Tracking via Hierarchical Convolutional Features

1 code implementation • 12 Jul 2017 • Chao Ma, Jia-Bin Huang, Xiaokang Yang, Ming-Hsuan Yang

Specifically, we learn adaptive correlation filters on the outputs from each convolutional layer to encode the target appearance.

Object Recognition Visual Tracking

Paper
Code

Adaptive Correlation Filters with Long-Term and Short-Term Memory for Object Tracking

1 code implementation • 7 Jul 2017 • Chao Ma, Jia-Bin Huang, Xiaokang Yang, Ming-Hsuan Yang

Second, we learn a correlation filter over a feature pyramid centered at the estimated target position for predicting scale changes.

Object Tracking Position

155

Paper
Code

Skeleton-aided Articulated Motion Generation

no code implementations • 4 Jul 2017 • Yichao Yan, Jingwei Xu, Bingbing Ni, Xiaokang Yang

This work make the first attempt to generate articulated human motion sequence from a single image.

Ranked #2 on Gesture-to-Gesture Translation on NTU Hand Digit

Gesture-to-Gesture Translation Video Generation

Paper
Add Code

Recurrent Modeling of Interaction Context for Collective Activity Recognition

no code implementations • CVPR 2017 • Minsi Wang, Bingbing Ni, Xiaokang Yang

However, most of the previous activity recognition methods do not offer a flexible and scalable scheme to handle the high order context modeling problem.

Descriptive Group Activity Recognition

Paper
Add Code

Video Segmentation via Multiple Granularity Analysis

no code implementations • CVPR 2017 • Rui Yang, Bingbing Ni, Chao Ma, Yi Xu, Xiaokang Yang

We introduce a Multiple Granularity Analysis framework for video segmentation in a coarse-to-fine manner.

Multiple Instance Learning Segmentation +2

Paper
Add Code

Image Matching via Loopy RNN

no code implementations • 10 Jun 2017 • Donghao Luo, Bingbing Ni, Yichao Yan, Xiaokang Yang

Towards this end, we propose a novel loopy recurrent neural network (Loopy RNN), which is capable of aggregating relationship information of two input images in a progressive/iterative manner and outputting the consolidated matching score in the final iteration.

Paper
Add Code

Depth Structure Preserving Scene Image Generation

no code implementations • 1 Jun 2017 • Wendong Zhang, Bingbing Ni, Yichao Yan, Jingwei Xu, Xiaokang Yang

Key to automatically generate natural scene images is to properly arrange among various spatial elements, especially in the depth direction.

Image Generation Scene Generation

Paper
Add Code

Predicting Human Interaction via Relative Attention Model

no code implementations • 26 May 2017 • Yichao Yan, Bingbing Ni, Xiaokang Yang

Predicting human interaction is challenging as the on-going activity has to be inferred based on a partially observed video.

Paper
Add Code

Modeling The Intensity Function Of Point Process Via Recurrent Neural Networks

2 code implementations • 24 May 2017 • Shuai Xiao, Junchi Yan, Stephen M. Chu, Xiaokang Yang, Hongyuan Zha

In this paper, we model the background by a Recurrent Neural Network (RNN) with its units aligned with time series indexes while the history effect is modeled by another RNN whose units are aligned with asynchronous events to capture the long-range dynamics.

Point Processes Time Series +1

Paper
Code

Joint Modeling of Event Sequence and Time Series with Attentional Twin Recurrent Neural Networks

no code implementations • 24 Mar 2017 • Shuai Xiao, Junchi Yan, Mehrdad Farajtabar, Le Song, Xiaokang Yang, Hongyuan Zha

A variety of real-world processes (over networks) produce sequences of data whose complex temporal dynamics need to be studied.

Point Processes Time Series +1

Paper
Add Code

Person Re-Identification via Recurrent Feature Aggregation

1 code implementation • 23 Jan 2017 • Yichao Yan, Bingbing Ni, Zhichao Song, Chao Ma, Yan Yan, Xiaokang Yang

We address the person re-identification problem by effectively exploiting a globally discriminative feature representation from a sequence of tracked human regions/patches.

Patch Matching Person Re-Identification

Paper
Code

Learning a No-Reference Quality Metric for Single-Image Super-Resolution

2 code implementations • 18 Dec 2016 • Chao Ma, Chih-Yuan Yang, Xiaokang Yang, Ming-Hsuan Yang

Numerous single-image super-resolution algorithms have been proposed in the literature, but few studies address the problem of performance evaluation based on visual perception.

Ranked #7 on Video Quality Assessment on MSU SR-QA Dataset

Image Super-Resolution Video Quality Assessment

142

Paper
Code

Factors in Finetuning Deep Model for Object Detection With Long-Tail Distribution

no code implementations • CVPR 2016 • Wanli Ouyang, Xiaogang Wang, Cong Zhang, Xiaokang Yang

Our analysis and empirical results show that classes with more samples have higher impact on the feature learning.

Object object-detection +1

Paper
Add Code

Progressively Parsing Interactional Objects for Fine Grained Action Detection

no code implementations • CVPR 2016 • Bingbing Ni, Xiaokang Yang, Shenghua Gao

Fine grained video action analysis often requires reliable detection and tracking of various interacting objects and human body parts, denoted as interactional object parsing.

Action Analysis Action Recognition +5

Paper
Add Code

Cascaded Interactional Targeting Network for Egocentric Video Analysis

no code implementations • CVPR 2016 • Yang Zhou, Bingbing Ni, Richang Hong, Xiaokang Yang, Qi Tian

Firstly, a novel EM-like learning framework is proposed to train the pixel-level deep convolutional neural network (DCNN) by seamlessly integrating weakly supervised data (i. e., massive bounding box annotations) with a small set of strongly supervised data (i. e., fully annotated hand segmentation maps) to achieve state-of-the-art hand segmentation performance.

Action Recognition Foreground Segmentation +4

Paper
Add Code

Temporal Action Localization With Pyramid of Score Distribution Features

no code implementations • CVPR 2016 • Jun Yuan, Bingbing Ni, Xiaokang Yang, Ashraf A. Kassim

We investigate the feature design and classification architectures in temporal action localization.

Temporal Action Localization

Paper
Add Code

Factors in Finetuning Deep Model for object detection

no code implementations • 20 Jan 2016 • Wanli Ouyang, Xiaogang Wang, Cong Zhang, Xiaokang Yang

Our analysis and empirical results show that classes with more samples have higher impact on the feature learning.

Object object-detection +1

Paper
Add Code

A Matrix Decomposition Perspective to Multiple Graph Matching

no code implementations • ICCV 2015 • Junchi Yan, Hongteng Xu, Hongyuan Zha, Xiaokang Yang, Huanxi Liu, Stephen Chu

Graph matching has a wide spectrum of real-world applications and in general is known NP-hard.

Graph Matching

Paper
Add Code

Hierarchical Convolutional Features for Visual Tracking

no code implementations • ICCV 2015 • Chao Ma, Jia-Bin Huang, Xiaokang Yang, Ming-Hsuan Yang

The outputs of the last convolutional layers encode the semantic information of targets and such representations are robust to significant appearance variations.

Object Recognition Visual Object Tracking +1

Paper
Add Code

Motion Part Regularization: Improving Action Recognition via Trajectory Selection

no code implementations • CVPR 2015 • Bingbing Ni, Pierre Moulin, Xiaokang Yang, Shuicheng Yan

Inspired by the recent advance in sentence regularization for text classification, we introduce a Motion Part Regularization framework to mining discriminative semi-local groups of dense trajectories.

Action Recognition Sentence +3

Paper
Add Code

Cross-Scene Crowd Counting via Deep Convolutional Neural Networks

no code implementations • CVPR 2015 • Cong Zhang, Hongsheng Li, Xiaogang Wang, Xiaokang Yang

To address this problem, we propose a deep convolutional neural network (CNN) for crowd counting, and it is trained alternatively with two related learning objectives, crowd density and crowd count.

Ranked #15 on Crowd Counting on WorldExpo’10

Crowd Counting

Paper
Add Code

Discrete Hyper-Graph Matching

no code implementations • CVPR 2015 • Junchi Yan, Chao Zhang, Hongyuan Zha, Wei Liu, Xiaokang Yang, Stephen M. Chu

Evaluations on both synthetic and real-world data corroborate the efficiency of our method.

Graph Matching

Paper
Add Code

Long-Term Correlation Tracking

no code implementations • CVPR 2015 • Chao Ma, Xiaokang Yang, Chongyang Zhang, Ming-Hsuan Yang

In this paper, we address the problem of long-term visual tracking where the target objects undergo significant appearance variation due to deformation, abrupt motion, heavy occlusion and out-of-the-view.

Translation Visual Tracking

Paper
Add Code

A General Multi-Graph Matching Approach via Graduated Consistency-regularized Boosting

no code implementations • 20 Feb 2015 • Junchi Yan, Minsu Cho, Hongyuan Zha, Xiaokang Yang, Stephen Chu

We propose multi-graph matching methods to incorporate the two aspects by boosting the affinity score, meanwhile gradually infusing the consistency as a regularizer.

Graph Matching

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.