1 code implementation • 27 May 2024 • Kuan-Chih Huang, Xiangtai Li, Lu Qi, Shuicheng Yan, Ming-Hsuan Yang
This foundational estimation facilitates a detailed, coarse-to-fine segmentation strategy that significantly enhances the precision of object identification and segmentation.
1 code implementation • 23 May 2024 • Ling Yang, Bohan Zeng, Jiaming Liu, Hong Li, Minghao Xu, Wentao Zhang, Shuicheng Yan
Therefore, this work, EditWorld, introduces a new editing task, namely world-instructed image editing, which defines and categorizes the instructions grounded by various world scenarios.
no code implementations • 11 May 2024 • Wang Lin, Jingyuan Chen, Jiaxin Shi, Yichen Zhu, Chen Liang, Junzhong Miao, Tao Jin, Zhou Zhao, Fei Wu, Shuicheng Yan, Hanwang Zhang
We tackle the common challenge of inter-concept visual confusion in compositional concept generation using text-guided diffusion models (TGDMs).
1 code implementation • 3 May 2024 • Kaihang Pan, Siliang Tang, Juncheng Li, Zhaoyu Fan, Wei Chow, Shuicheng Yan, Tat-Seng Chua, Yueting Zhuang, Hanwang Zhang
For multimodal LLMs, the synergy of visual comprehension (textual output) and generation (visual output) presents an ongoing challenge.
1 code implementation • 23 Apr 2024 • Xiangyu Xu, Lijuan Liu, Shuicheng Yan
Existing Transformers for monocular 3D human shape and pose estimation typically have a quadratic computation and memory complexity with respect to the feature length, which hinders the exploitation of fine-grained information in high-resolution features that is beneficial for accurate reconstruction.
Ranked #21 on 3D Human Pose Estimation on 3DPW
1 code implementation • 11 Apr 2024 • Shaocong Long, Qianyu Zhou, Xiangtai Li, Xuequan Lu, Chenhao Ying, Yuan Luo, Lizhuang Ma, Shuicheng Yan
SPR encourages the model to concentrate on objects rather than on context, and consists of two designs: Prior-Free Scanning (PFS) and Domain Context Interchange (DCI).
3 code implementations • 29 Mar 2024 • Yikang Zhou, Tao Zhang, Shunping Ji, Shuicheng Yan, Xiangtai Li
Modern video segmentation methods adopt object queries to perform inter-frame association and demonstrate satisfactory performance in tracking continuously appearing objects despite large-scale motion and transient occlusion.
no code implementations • 27 Mar 2024 • Qiuhong Shen, Zike Wu, Xuanyu Yi, Pan Zhou, Hanwang Zhang, Shuicheng Yan, Xinchao Wang
We tackle the challenge of efficiently reconstructing a 3D asset from a single image at millisecond speed.
no code implementations • 26 Mar 2024 • Longtao Zheng, Zhiyuan Huang, Zhenghai Xue, Xinrun Wang, Bo An, Shuicheng Yan
We have open-sourced the environments, datasets, benchmarks, and interfaces to promote research towards developing general virtual agents for the future.
no code implementations • 14 Mar 2024 • Chaoyang Wang, Xiangtai Li, Henghui Ding, Lu Qi, Jiangning Zhang, Yunhai Tong, Chen Change Loy, Shuicheng Yan
In-context segmentation has drawn more attention with the introduction of vision foundation models.
no code implementations • 13 Mar 2024 • Wentao Jiang, Yige Zhang, Shaozhong Zheng, Si Liu, Shuicheng Yan
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks, a first of its kind in the field.
2 code implementations • 1 Mar 2024 • Tao Zhang, Xiangtai Li, Haobo Yuan, Shunping Ji, Shuicheng Yan
To enable Mamba to process 3-D point cloud data more effectively, we propose a novel Consistent Traverse Serialization method to convert point clouds into 1-D point sequences while ensuring that neighboring points in the sequence are also spatially adjacent.
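The idea of a locality-preserving serialization can be illustrated with a Z-order (Morton) curve; this is a generic sketch, not the paper's Consistent Traverse Serialization, and `morton_code`/`serialize` are hypothetical helper names:

```python
import numpy as np

def morton_code(grid, bits=10):
    """Interleave the bits of quantized (x, y, z) coordinates into a single
    Z-order (Morton) key; spatially nearby cells get nearby keys."""
    code = np.zeros(len(grid), dtype=np.int64)
    for b in range(bits):
        for axis in range(3):
            code |= ((grid[:, axis] >> b) & 1) << (3 * b + axis)
    return code

def serialize(points, bits=10):
    """Quantize points to a voxel grid, then sort by Morton key so that
    consecutive points in the resulting 1-D sequence tend to be spatially
    adjacent -- the property the serialization above is after."""
    mins, maxs = points.min(0), points.max(0)
    grid = ((points - mins) / (maxs - mins + 1e-9) * (2**bits - 1)).astype(np.int64)
    order = np.argsort(morton_code(grid, bits))
    return points[order], order
```

Sorting by such a space-filling-curve key is one standard way to turn an unordered point set into a sequence a 1-D state-space model can consume.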
no code implementations • 19 Feb 2024 • Jiahao Ying, Yixin Cao, Bo Wang, Wei Tang, Yizhe Yang, Shuicheng Yan
The basic idea is to generate unseen and high-quality testing samples based on existing ones to mitigate leakage issues.
no code implementations • 15 Jan 2024 • Youcheng Huang, Wenqiang Lei, Zheng Zhang, Jiancheng Lv, Shuicheng Yan
In this paper, we empirically find that the effects of different contexts upon LLMs in recalling the same knowledge follow a Gaussian-like distribution.
1 code implementation • 4 Jan 2024 • Zhaokun Zhou, Kaiwei Che, Wei Fang, Keyu Tian, Yuesheng Zhu, Shuicheng Yan, Yonghong Tian, Li Yuan
To the best of our knowledge, this is the first time that an SNN achieves 80%+ accuracy on ImageNet.
no code implementations • 20 Nov 2023 • Yanyan Wei, Zhao Zhang, Jiahuan Ren, Xiaogang Xu, Richang Hong, Yi Yang, Shuicheng Yan, Meng Wang
The generalization capability of existing image restoration and enhancement (IRE) methods is constrained by the limited pre-trained datasets, making it difficult to handle agnostic inputs such as different degradation levels and scenarios beyond their design scopes.
1 code implementation • 14 Nov 2023 • Ming Li, Pan Zhou, Jia-Wei Liu, Jussi Keppo, Min Lin, Shuicheng Yan, Xiangyu Xu
We achieve this remarkable speed by devising a new network that directly constructs a 3D triplane from a text prompt.
1 code implementation • 7 Nov 2023 • Lijuan Liu, Xiangyu Xu, Zhijie Lin, Jiabin Liang, Shuicheng Yan
In this work, we explore the challenging problem of recovering garment sewing patterns from daily photos for augmenting these applications.
1 code implementation • 30 Oct 2023 • Tianwen Wei, Liang Zhao, Lichang Zhang, Bo Zhu, Lijie Wang, Haihua Yang, Biye Li, Cheng Cheng, Weiwei Lü, Rui Hu, Chenxia Li, Liu Yang, Xilin Luo, Xuejie Wu, Lunan Liu, Wenjun Cheng, Peng Cheng, Jianhao Zhang, XiaoYu Zhang, Lei Lin, Xiaokun Wang, Yutuan Ma, Chuanhai Dong, Yanqi Sun, Yifu Chen, Yongyi Peng, Xiaojuan Liang, Shuicheng Yan, Han Fang, Yahui Zhou
In this technical report, we present Skywork-13B, a family of large language models (LLMs) trained on a corpus of over 3.2 trillion tokens drawn from both English and Chinese texts.
2 code implementations • NeurIPS 2023 • Zhongzhan Huang, Pan Zhou, Shuicheng Yan, Liang Lin
Besides, we also observe theoretical benefits of LSC coefficient scaling in UNet for the stability of hidden features and gradients, as well as for robustness.
1 code implementation • 17 Oct 2023 • Zihan Qiu, Zhen Liu, Shuicheng Yan, Shanghang Zhang, Jie Fu
It has been shown that semi-parametric methods, which combine standard neural networks with non-parametric components such as external memory modules and data retrieval, are particularly helpful in data scarcity and out-of-distribution (OOD) scenarios.
1 code implementation • 9 Oct 2023 • Bohan Zeng, Shanglin Li, Yutang Feng, Ling Yang, Hong Li, Sicheng Gao, Jiaming Liu, Conghui He, Wentao Zhang, Jianzhuang Liu, Baochang Zhang, Shuicheng Yan
However, the appearance of 3D objects produced by these text-to-3D models is unpredictable, and it is hard for the single-image-to-3D methods to deal with complex images, thus posing a challenge in generating appearance-controllable 3D objects.
no code implementations • 5 Aug 2023 • Qiaosong Qi, Le Zhuo, Aixi Zhang, Yue Liao, Fei Fang, Si Liu, Shuicheng Yan
To address these limitations, we present a novel cascaded motion diffusion model, DiffDance, designed for high-resolution, long-form dance generation.
2 code implementations • 3 Aug 2023 • Keyu Duan, Qian Liu, Tat-Seng Chua, Shuicheng Yan, Wei Tsang Ooi, Qizhe Xie, Junxian He
More recently, with the rapid development of language models (LMs), researchers have focused on leveraging LMs to facilitate the learning of TGs, either by jointly training them in a computationally intensive framework (merging the two stages), or designing complex self-supervised training tasks for feature extraction (enhancing the first stage).
Ranked #1 on Node Property Prediction on ogbn-arxiv
2 code implementations • 8 Jun 2023 • Yang Yue, Bingyi Kang, Xiao Ma, Qisen Yang, Gao Huang, Shiji Song, Shuicheng Yan
OPER is a plug-and-play component for offline RL algorithms.
no code implementations • 5 Jun 2023 • Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, MingYu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai
It is hoped that this competition will attract many researchers in the fields of CV and NLP, and bring new thoughts to the field of Document AI.
1 code implementation • 3 Jun 2023 • Zhihe Lu, Shuicheng Yan, Xinchao Wang
In this paper, we for the first time investigate the efficient multi-grained knowledge reuse for CISS, and propose a novel method, Evolving kNowleDge minING (ENDING), employing a frozen backbone.
1 code implementation • 1 Jun 2023 • Bingyi Kang, Xiao Ma, Yirui Wang, Yang Yue, Shuicheng Yan
Recently, Offline Reinforcement Learning (RL) has achieved remarkable progress with the emergence of various algorithms and datasets.
1 code implementation • NeurIPS 2023 • Bingyi Kang, Xiao Ma, Chao Du, Tianyu Pang, Shuicheng Yan
2) It is incompatible with maximum likelihood-based RL algorithms (e.g., policy gradient methods) as the likelihood of diffusion models is intractable.
1 code implementation • 16 May 2023 • Tianping Zhang, Shaowen Wang, Shuicheng Yan, Jian Li, Qian Liu
Recently, the topic of table pre-training has attracted considerable research interest.
2 code implementations • 3 May 2023 • Chao Du, Tianbo Li, Tianyu Pang, Shuicheng Yan, Min Lin
Sliced-Wasserstein Flow (SWF) is a promising approach to nonparametric generative modeling but has not been widely adopted due to its suboptimal generative quality and lack of conditional modeling capabilities.
no code implementations • 22 Apr 2023 • Moloud Abdar, Meenakshi Kollati, Swaraja Kuraparthi, Farhad Pourpanah, Daniel McDuff, Mohammad Ghavamzadeh, Shuicheng Yan, Abduallah Mohamed, Abbas Khosravi, Erik Cambria, Fatih Porikli
Video captioning (VC) is a fast-moving, cross-disciplinary area of research that bridges work in the fields of computer vision, natural language processing (NLP), linguistics, and human-computer interaction.
1 code implementation • CVPR 2023 • Yunqing Zhao, Chao Du, Milad Abdollahzadeh, Tianyu Pang, Min Lin, Shuicheng Yan, Ngai-Man Cheung
To this end, we propose knowledge truncation to mitigate this issue in FSIG, which is a complementary operation to knowledge preservation and is implemented by a lightweight pruning-based method.
1 code implementation • 13 Apr 2023 • Haozhe Feng, Zhaorui Yang, Hesun Chen, Tianyu Pang, Chao Du, Minfeng Zhu, Wei Chen, Shuicheng Yan
Recently, SFDA has gained popularity due to the need to protect the data privacy of the source domain, but it suffers from catastrophic forgetting on the source domain due to the lack of data.
11 code implementations • 29 Mar 2023 • Weihao Yu, Pan Zhou, Shuicheng Yan, Xinchao Wang
Inspired by the long-range modeling ability of ViTs, large-kernel convolutions are widely studied and adopted recently to enlarge the receptive field and improve model performance, like the remarkable work ConvNeXt which employs 7x7 depthwise convolution.
1 code implementation • ICCV 2023 • ShangHua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
To solve this issue, we propose a Masked Diffusion Transformer (MDT) that introduces a mask latent modeling scheme to explicitly enhance the DPMs' ability to contextual relation learning among object semantic parts in an image.
Ranked #3 on Image Generation on ImageNet 256x256
no code implementations • 1 Mar 2023 • Tianbo Li, Min Lin, Zheyuan Hu, Kunhao Zheng, Giovanni Vignale, Kenji Kawaguchi, A. H. Castro Neto, Kostya S. Novoselov, Shuicheng Yan
Kohn-Sham Density Functional Theory (KS-DFT) has been traditionally solved by the Self-Consistent Field (SCF) method.
1 code implementation • 27 Feb 2023 • Junbin Xiao, Pan Zhou, Angela Yao, Yicong Li, Richang Hong, Shuicheng Yan, Tat-Seng Chua
CoVGT's uniqueness and superiority are three-fold: 1) It proposes a dynamic graph transformer module which encodes video by explicitly capturing the visual objects, their relations and dynamics, for complex spatio-temporal reasoning.
Ranked #14 on Video Question Answering on NExT-QA (using extra training data)
1 code implementation • 9 Feb 2023 • Weichen Yu, Tianyu Pang, Qian Liu, Chao Du, Bingyi Kang, Yan Huang, Min Lin, Shuicheng Yan
With the advance of language models, privacy protection is receiving more attention.
2 code implementations • 9 Feb 2023 • Zekai Wang, Tianyu Pang, Chao Du, Min Lin, Weiwei Liu, Shuicheng Yan
Under the $\ell_\infty$-norm threat model with $\epsilon=8/255$, our models achieve $70.69\%$ and $42.67\%$ robust accuracy on CIFAR-10 and CIFAR-100, respectively, i.e. improving upon previous state-of-the-art models by $+4.58\%$ and $+8.03\%$.
1 code implementation • 3 Feb 2023 • Qingfeng Lan, A. Rupam Mahmood, Shuicheng Yan, Zhongwen Xu
Reinforcement learning (RL) is essentially different from supervised learning and in practice these learned optimizers do not work well even in simple RL tasks.
1 code implementation • 2 Feb 2023 • Minghuan Liu, Tairan He, Weinan Zhang, Shuicheng Yan, Zhongwen Xu
Specifically, we present Adversarial Imitation Learning with Patch Rewards (PatchAIL), which employs a patch-based discriminator to measure the expertise of different local parts from given images and provide patch rewards.
1 code implementation • 28 Jan 2023 • Haozhe Feng, Tianyu Pang, Chao Du, Wei Chen, Shuicheng Yan, Min Lin
BAFFLE is 1) memory-efficient and easily fits uploading bandwidth; 2) compatible with inference-only hardware optimization and model quantization or pruning; and 3) well-suited to trusted execution environments, because the clients in BAFFLE only execute forward propagation and return a set of scalars to the server.
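Forward-only training of this kind typically rests on zeroth-order gradient estimates built purely from loss evaluations; below is a minimal sketch, assuming Gaussian perturbations and a scalar loss (`forward_grad_estimate` is a hypothetical name, not BAFFLE's actual API):

```python
import numpy as np

def forward_grad_estimate(loss, theta, num_probes=256, sigma=1e-3, seed=0):
    """Estimate the gradient of `loss` at `theta` using forward passes only:
    perturb the parameters with Gaussian noise, difference the loss values,
    and average -- no backpropagation is ever run on the client."""
    rng = np.random.default_rng(seed)
    base = loss(theta)
    g = np.zeros_like(theta)
    for _ in range(num_probes):
        u = rng.standard_normal(theta.shape)
        g += (loss(theta + sigma * u) - base) / sigma * u
    return g / num_probes
```

Each client would return only the scalar loss differences, matching the "forward propagation and a set of scalars" protocol described above.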
no code implementations • 27 Jan 2023 • Wanqi Xue, Bo An, Shuicheng Yan, Zhongwen Xu
The complexity of designing reward functions has been a major obstacle to the wide application of deep reinforcement learning (RL) techniques.
1 code implementation • 18 Jan 2023 • Munan Ning, Donghuan Lu, Yujia Xie, Dongdong Chen, Dong Wei, Yefeng Zheng, Yonghong Tian, Shuicheng Yan, Li Yuan
Unsupervised domain adaption has been widely adopted in tasks with scarce annotated data.
no code implementations • ICCV 2023 • Ming Li, Xiangyu Xu, Hehe Fan, Pan Zhou, Jun Liu, Jia-Wei Liu, Jiahe Li, Jussi Keppo, Mike Zheng Shou, Shuicheng Yan
For the first time, we introduce vision Transformers into PPAR by treating a video as a tubelet sequence, and accordingly design two complementary mechanisms, i.e., sparsification and anonymization, to remove privacy from a spatio-temporal perspective.
1 code implementation • CVPR 2023 • Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
In this work, we propose a novel Position-guided Text Prompt (PTP) paradigm to enhance the visual grounding ability of cross-modal models trained with VLP.
Ranked #5 on Zero-Shot Cross-Modal Retrieval on COCO 2014
no code implementations • 9 Nov 2022 • Zhao Zhang, Suiyi Zhao, Xiaojie Jin, Mingliang Xu, Yi Yang, Shuicheng Yan
In this paper, we present an embarrassingly simple yet effective solution to a seemingly impossible mission, low-light image enhancement (LLIE) without access to any task-related data.
no code implementations • 2 Nov 2022 • Huan Zheng, Zhao Zhang, Jicong Fan, Richang Hong, Yi Yang, Shuicheng Yan
Specifically, we present a decoupled interaction module (DIM) that aims for sufficient dual-view information interaction.
7 code implementations • 24 Oct 2022 • Weihao Yu, Chenyang Si, Pan Zhou, Mi Luo, Yichen Zhou, Jiashi Feng, Shuicheng Yan, Xinchao Wang
By simply applying depthwise separable convolutions as the token mixer in the bottom stages and vanilla self-attention in the top stages, the resulting model CAFormer sets a new record on ImageNet-1K: it achieves an accuracy of 85.5% at 224x224 resolution, under normal supervised training without external data or distillation.
Ranked #2 on Domain Generalization on ImageNet-C (using extra training data)
1 code implementation • 20 Oct 2022 • ShangHua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
In this work, we explore a sustainable SSL framework facing two major challenges: i) learning a stronger new SSL model based on an existing pretrained SSL model (the "base" model) in a cost-friendly manner, and ii) making the training of the new model compatible with various base models.
Ranked #1 on Semantic Segmentation on ImageNet-S
no code implementations • 18 Oct 2022 • Wei Qiu, Xiao Ma, Bo An, Svetlana Obraztsova, Shuicheng Yan, Zhongwen Xu
Despite the recent advancement in multi-agent reinforcement learning (MARL), the MARL agents easily overfit the training environment and perform poorly in the evaluation scenarios where other agents behave differently.
no code implementations • 17 Oct 2022 • Yang Yue, Bingyi Kang, Xiao Ma, Zhongwen Xu, Gao Huang, Shuicheng Yan
Therefore, we propose a simple yet effective method to boost offline RL algorithms based on the observation that resampling a dataset keeps the distribution support unchanged.
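Return-weighted resampling of this kind can be sketched as follows; this is a generic illustration assuming exponential weights, and the paper's exact weighting scheme may differ:

```python
import numpy as np

def resample_by_return(returns, temperature=1.0, n=None, seed=0):
    """Sample trajectory indices with probability proportional to
    exp(return / temperature): high-return data is drawn more often, yet
    every trajectory keeps a nonzero probability, so the support of the
    dataset distribution is unchanged."""
    returns = np.asarray(returns, dtype=float)
    w = np.exp((returns - returns.max()) / temperature)  # shift for stability
    p = w / w.sum()
    rng = np.random.default_rng(seed)
    n = len(returns) if n is None else n
    return rng.choice(len(returns), size=n, p=p)
```

Because only sampling frequencies change, any off-the-shelf offline RL algorithm can consume the resampled indices without modification.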
1 code implementation • NeurIPS 2023 • Xiao Ma, Bingyi Kang, Zhongwen Xu, Min Lin, Shuicheng Yan
In this work, we propose a novel MISA framework to approach offline RL from the perspective of Mutual Information between States and Actions in the dataset by directly constraining the policy improvement direction.
1 code implementation • 12 Oct 2022 • Zichen Liu, Siyi Li, Wee Sun Lee, Shuicheng Yan, Zhongwen Xu
Instead of planning with the expensive MCTS, we use the learned model to construct an advantage estimation based on a one-step rollout.
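A one-step-rollout advantage of this kind can be written as A(s, a) = r(s, a) + γV(s′) − V(s); the toy sketch below uses a stand-in learned model and value function (all names here are illustrative):

```python
import numpy as np

def one_step_advantage(model, value_fn, state, actions, gamma=0.99):
    """For each candidate action, roll the learned model forward one step
    and form the advantage estimate A(s, a) = r + gamma * V(s') - V(s),
    avoiding a full MCTS search."""
    v_s = value_fn(state)
    adv = []
    for a in actions:
        next_state, reward = model(state, a)  # learned dynamics + reward heads
        adv.append(reward + gamma * value_fn(next_state) - v_s)
    return np.array(adv)
```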
1 code implementation • 8 Oct 2022 • Yabo Xiao, Xiaojuan Wang, Dongdong Yu, Kai Su, Lei Jin, Mei Song, Shuicheng Yan, Jian Zhao
With the proposed body representation, we further deliver a compact single-stage multi-person pose regression network, termed as AdaptivePose.
1 code implementation • 3 Oct 2022 • Bowen Dong, Pan Zhou, Shuicheng Yan, WangMeng Zuo
For better effectiveness, we divide prompts into two groups: 1) a shared prompt for the whole long-tailed dataset to learn general features and to adapt a pretrained model into target domain; and 2) group-specific prompts to gather group-specific features for the samples which have similar features and also to empower the pretrained model with discrimination ability.
Ranked #1 on Long-tail Learning on CIFAR-100-LT (ρ=100) (using extra training data)
no code implementations • 2 Oct 2022 • Jiahuan Ren, Zhao Zhang, Richang Hong, Mingliang Xu, Yi Yang, Shuicheng Yan
Low-light image enhancement (LLIE) aims at improving the illumination and visibility of dark images with lighting noise.
2 code implementations • 29 Sep 2022 • Zhaokun Zhou, Yuesheng Zhu, Chao He, YaoWei Wang, Shuicheng Yan, Yonghong Tian, Li Yuan
Spikformer (66.3M parameters), comparable in size to SEW-ResNet-152 (60.2M, 69.26%), achieves 74.81% top-1 accuracy on ImageNet using 4 time steps, which is the state of the art among directly trained SNN models.
4 code implementations • 13 Aug 2022 • Xingyu Xie, Pan Zhou, Huan Li, Zhouchen Lin, Shuicheng Yan
Adan first reformulates the vanilla Nesterov acceleration to develop a new Nesterov momentum estimation (NME) method, which avoids the extra overhead of computing gradient at the extrapolation point.
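The NME trick can be sketched in a few lines; this is a simplified illustration of keeping momentum on gradient differences rather than evaluating the gradient at an extrapolated point, not the exact published Adan algorithm (hyperparameter names and defaults are assumptions):

```python
import numpy as np

def adan_like_step(theta, g, g_prev, state, lr=0.01, b1=0.02, b2=0.08, b3=0.01, eps=1e-8):
    """One simplified Adan-style update: momentum on gradients (m), momentum
    on gradient differences (v, the NME trick), and an adaptive second
    moment (n). No extra forward/backward pass at a look-ahead point."""
    diff = g - g_prev
    state["m"] = (1 - b1) * state["m"] + b1 * g
    state["v"] = (1 - b2) * state["v"] + b2 * diff
    u = g + (1 - b2) * diff               # estimated extrapolation-point gradient
    state["n"] = (1 - b3) * state["n"] + b3 * u * u
    update = (state["m"] + (1 - b2) * state["v"]) / (np.sqrt(state["n"]) + eps)
    return theta - lr * update
```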
1 code implementation • 12 Jul 2022 • Junbin Xiao, Pan Zhou, Tat-Seng Chua, Shuicheng Yan
VGT's uniqueness is two-fold: 1) it designs a dynamic graph transformer module which encodes video by explicitly capturing the visual objects, their relations, and dynamics for complex spatio-temporal reasoning; and 2) it exploits disentangled video and text Transformers for relevance comparison between the video and text to perform QA, instead of an entangled cross-modal Transformer for answer classification.
Ranked #4 on Video Question Answering on IntentQA
no code implementations • 25 Jun 2022 • Yang Yue, Bingyi Kang, Zhongwen Xu, Gao Huang, Shuicheng Yan
Recently, visual representation learning has been shown to be effective and promising for boosting sample efficiency in RL.
3 code implementations • 21 Jun 2022 • Jiayi Weng, Min Lin, Shengyi Huang, Bo Liu, Denys Makoviichuk, Viktor Makoviychuk, Zichen Liu, Yufan Song, Ting Luo, Yukun Jiang, Zhongwen Xu, Shuicheng Yan
EnvPool is open-sourced at https://github.com/sail-sg/envpool.
no code implementations • 8 Jun 2022 • Jiachun Pan, Pan Zhou, Shuicheng Yan
To solve these problems, we first theoretically show that, on an auto-encoder with a two/one-layered convolutional encoder/decoder, MRP can capture all discriminative features of each potential semantic class in the pretraining dataset.
no code implementations • 26 May 2022 • Tianyu Pang, Shuicheng Yan, Min Lin
In this paper, we substitute the Slater determinant with a pairwise antisymmetry construction, which is easy to implement and can reduce the computational cost to $O(N^2)$.
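The key algebraic property can be seen in a Vandermonde-style toy (not the paper's actual network): a product over all particle pairs of an odd pair function is antisymmetric under particle exchange, and it costs only O(N²) pair evaluations rather than the O(N³) of a Slater determinant.

```python
from itertools import combinations

def pairwise_antisymmetric(xs, g=lambda a, b: a - b):
    """Product over all pairs of an odd pair function g(a, b) = -g(b, a).
    Swapping any two particles flips the sign of an odd number of factors,
    so the whole product changes sign -- the antisymmetry a fermionic
    wavefunction requires."""
    out = 1.0
    for i, j in combinations(range(len(xs)), 2):
        out *= g(xs[i], xs[j])
    return out
```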
3 code implementations • 25 May 2022 • Chenyang Si, Weihao Yu, Pan Zhou, Yichen Zhou, Xinchao Wang, Shuicheng Yan
Recent studies show that Transformer has strong capability of building long-range dependencies, yet is incompetent in capturing high frequencies that predominantly convey local information.
no code implementations • 30 Apr 2022 • Yangcheng Gao, Zhao Zhang, Richang Hong, Haijun Zhang, Jicong Fan, Shuicheng Yan
To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics to imitate the distribution of real data, so that the performance degradation is alleviated.
1 code implementation • 3 Apr 2022 • Jiawang Bai, Li Yuan, Shu-Tao Xia, Shuicheng Yan, Zhifeng Li, Wei Liu
Inspired by this finding, we first investigate the effects of existing techniques for improving ViT models from a new frequency perspective, and find that the success of some techniques (e.g., RandAugment) can be attributed to the better usage of the high-frequency components.
Ranked #2 on Domain Generalization on Stylized-ImageNet
1 code implementation • 27 Mar 2022 • Pan Zhou, Yichen Zhou, Chenyang Si, Weihao Yu, Teck Khim Ng, Shuicheng Yan
It provides complementary instance supervision to IDS via an extra alignment on local neighbors, and scatters different local-groups separately to increase discriminability.
Ranked #13 on Self-Supervised Image Classification on ImageNet
1 code implementation • 14 Mar 2022 • Bowen Dong, Pan Zhou, Shuicheng Yan, WangMeng Zuo
The few-shot learning ability of vision transformers (ViTs) has rarely been investigated, though it is highly desirable.
1 code implementation • 21 Feb 2022 • Tianyu Pang, Min Lin, Xiao Yang, Jun Zhu, Shuicheng Yan
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
no code implementations • 18 Feb 2022 • Shervin Minaee, Xiaodan Liang, Shuicheng Yan
Augmented reality (AR) is one of the relatively old, yet trending areas in the intersection of computer vision and computer graphics with numerous applications in several areas, from gaming and entertainment, to education and healthcare.
1 code implementation • CVPR 2022 • Zhao Zhang, Huan Zheng, Richang Hong, Mingliang Xu, Shuicheng Yan, Meng Wang
Current low-light image enhancement methods can improve illumination well.
1 code implementation • NeurIPS 2021 • Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
The fine-tuning of pre-trained language models has achieved great success in many NLP fields.
1 code implementation • 16 Dec 2021 • Shiming Chen, Ziming Hong, Wenjin Hou, Guo-Sen Xie, Yibing Song, Jian Zhao, Xinge You, Shuicheng Yan, Ling Shao
Analogously, VAT uses a similar feature augmentation encoder to refine the visual features, which are further applied in a visual→attribute decoder to learn visual-based attribute features.
1 code implementation • 9 Dec 2021 • Yuxuan Liang, Pan Zhou, Roger Zimmermann, Shuicheng Yan
While transformers have shown great potential for video recognition with their strong capability of capturing long-range dependencies, they often suffer from high computational costs induced by applying self-attention to a huge number of 3D tokens.
no code implementations • 8 Dec 2021 • Mingfei Chen, Jianfeng Zhang, Xiangyu Xu, Lijuan Liu, Yujun Cai, Jiashi Feng, Shuicheng Yan
Meanwhile, for achieving higher rendering efficiency, we introduce a progressive rendering pipeline through geometry guidance, which leverages the geometric feature volume and the predicted density values to progressively reduce the number of sampling points and speed up the rendering process.
1 code implementation • NeurIPS 2021 • Pan Zhou, Hanshu Yan, Xiaotong Yuan, Jiashi Feng, Shuicheng Yan
Specifically, we prove that lookahead using SGD as its inner-loop optimizer can better balance the optimization error and generalization error to achieve smaller excess risk than vanilla SGD on (strongly) convex problems and on nonconvex problems satisfying the Polyak-Łojasiewicz condition, as has been observed/proved in neural networks.
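The scheme analyzed here can be sketched in a few lines; this is a generic illustration of lookahead with SGD as the inner-loop optimizer on a known gradient, with illustrative names and hyperparameters:

```python
import numpy as np

def lookahead_sgd(grad, theta0, inner_lr=0.1, alpha=0.5, k=5, outer_steps=40):
    """Lookahead: from the slow weights, run k inner SGD steps to obtain
    fast weights, then move the slow weights a fraction alpha toward them."""
    slow = np.asarray(theta0, dtype=float)
    for _ in range(outer_steps):
        fast = slow.copy()
        for _ in range(k):
            fast -= inner_lr * grad(fast)   # inner-loop SGD on the fast weights
        slow += alpha * (fast - slow)       # slow-weight interpolation
    return slow
```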
no code implementations • 24 Nov 2021 • Yu Liu, Mingbo Zhao, Zhao Zhang, Haijun Zhang, Shuicheng Yan
Based on this dataset, we then propose the Arbitrary Virtual Try-On Network (AVTON) that is utilized for all-type clothes, which can synthesize realistic try-on images by preserving and trading off characteristics of the target clothes and the reference person.
14 code implementations • CVPR 2022 • Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan
Based on this observation, we hypothesize that the general architecture of the Transformers, instead of the specific token mixer module, is more essential to the model's performance.
Ranked #9 on Semantic Segmentation on DensePASS
2 code implementations • NeurIPS 2021 • Tao Wang, Jianfeng Zhang, Yujun Cai, Shuicheng Yan, Jiashi Feng
Instead of estimating 3D joint locations from costly volumetric representation or reconstructing the per-person 3D pose from multiple detected 2D poses as in previous methods, MvP directly regresses the multi-person 3D poses in a clean and efficient way, without relying on intermediate tasks.
Ranked #3 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)
1 code implementation • 9 Oct 2021 • Yifan Zhang, Bingyi Kang, Bryan Hooi, Shuicheng Yan, Jiashi Feng
Deep long-tailed learning, one of the most challenging problems in visual recognition, aims to train well-performing deep models from a large number of images that follow a long-tailed class distribution.
1 code implementation • ICCV 2021 • Tao Wang, Li Yuan, Yunpeng Chen, Jiashi Feng, Shuicheng Yan
Recently, DETR pioneered the solution of vision tasks with transformers: it directly translates the image feature map into the object detection result.
7 code implementations • 24 Jun 2021 • Li Yuan, Qibin Hou, Zihang Jiang, Jiashi Feng, Shuicheng Yan
Though the prevailing vision transformers (ViTs) have recently shown the great potential of self-attention-based models in ImageNet classification, their performance is still inferior to that of the latest SOTA CNNs if no extra data are provided.
Ranked #1 on Image Classification on VizWiz-Classification
4 code implementations • 23 Jun 2021 • Qibin Hou, Zihang Jiang, Li Yuan, Ming-Ming Cheng, Shuicheng Yan, Jiashi Feng
By realizing the importance of the positional information carried by 2D feature representations, unlike recent MLP-like models that encode the spatial information along the flattened spatial dimensions, Vision Permutator separately encodes the feature representations along the height and width dimensions with linear projections.
1 code implementation • 26 May 2021 • Si Liu, Wentao Jiang, Chen Gao, Ran He, Jiashi Feng, Bo Li, Shuicheng Yan
In this paper, we address the makeup transfer and removal tasks simultaneously, which aim to transfer the makeup from a reference image to a source image and remove the makeup from the with-makeup image respectively.
no code implementations • 24 May 2021 • Si Liu, Zitian Wang, Yulu Gao, Lejian Ren, Yue Liao, Guanghui Ren, Bo Li, Shuicheng Yan
For the above exemplar case, our HRS task produces results in the form of relation triplets <girl [left hand], hold, book> and extracts segmentation masks of the book, with which the robot can easily accomplish the grabbing task.
13 code implementations • ICCV 2021 • Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan
To overcome such limitations, we propose a new Tokens-To-Token Vision Transformer (T2T-ViT), which incorporates 1) a layer-wise Tokens-to-Token (T2T) transformation to progressively structurize the image to tokens by recursively aggregating neighboring Tokens into one Token (Tokens-to-Token), such that local structure represented by surrounding tokens can be modeled and tokens length can be reduced; 2) an efficient backbone with a deep-narrow structure for vision transformer motivated by CNN architecture design after empirical study.
Ranked #404 on Image Classification on ImageNet
no code implementations • 11 Jan 2021 • Shaofei Huang, Si Liu, Tianrui Hui, Jizhong Han, Bo Li, Jiashi Feng, Shuicheng Yan
Our ORDNet is able to extract more comprehensive context information and well adapt to complex spatial variance in scene images.
no code implementations • 31 Oct 2020 • Weidong Shi, Guanghui Ren, Yunpeng Chen, Shuicheng Yan
We observe that existing knowledge distillation models optimize the proxy tasks that force the student to mimic the teacher's behavior, instead of directly optimizing the face recognition accuracy.
no code implementations • 16 Oct 2020 • Li Yuan, Shuning Chang, Xuecheng Nie, Ziyuan Huang, Yichen Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan
In this paper, we focus on improving human pose estimation in videos of crowded scenes from the perspectives of exploiting temporal context and collecting new data.
no code implementations • 16 Oct 2020 • Li Yuan, Yichen Zhou, Shuning Chang, Ziyuan Huang, Yunpeng Chen, Xuecheng Nie, Tao Wang, Jiashi Feng, Shuicheng Yan
Prior works fail to deal with this problem in two respects: (1) they do not utilize information about the scenes; (2) they lack training data for crowded and complex scenes.
no code implementations • 16 Oct 2020 • Li Yuan, Shuning Chang, Ziyuan Huang, Yichen Zhou, Yunpeng Chen, Xuecheng Nie, Francis E. H. Tay, Jiashi Feng, Shuicheng Yan
This paper presents our solution to the ACM MM challenge on Large-scale Human-centric Video Analysis in Complex Events (Lin et al., 2020); specifically, we focus on Track 3: Crowd Pose Tracking in Complex Events.
no code implementations • 8 Sep 2020 • Yan Zhang, Zhao Zhang, Yang Wang, Zheng Zhang, Li Zhang, Shuicheng Yan, Meng Wang
Nonnegative matrix factorization is usually powerful for learning the "shallow" parts-based representation, but it clearly fails to discover deep hierarchical information within both the basis and representation spaces.
no code implementations • 25 Aug 2020 • Chuan-Xian Ren, PengFei Ge, Peiyi Yang, Shuicheng Yan
Previous UDA methods assume that the source and target domains share an identical label space, which is unrealistic in practice since the label information of the target domain is agnostic.
no code implementations • 23 Aug 2020 • Pengfei Ge, Chuan-Xian Ren, Jiashi Feng, Shuicheng Yan
By performing variational inference on the objective function of Dual-AAE, we derive a new reconstruction loss which can be optimized by training a pair of Auto-encoders.
7 code implementations • NeurIPS 2020 • Zi-Hang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan
The novel convolution heads, together with the rest self-attention heads, form a new mixed attention block that is more efficient at both global and local context learning.
no code implementations • 31 Jul 2020 • Zhao Zhang, Yan Zhang, Mingliang Xu, Li Zhang, Yi Yang, Shuicheng Yan
In this paper, we therefore survey the recent advances on CF methodologies and the potential benchmarks by categorizing and summarizing the current methods.
4 code implementations • ECCV 2020 • Zhou Daquan, Qibin Hou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan
In this paper, we rethink the necessity of such design changes and find it may bring risks of information loss and gradient confusion.
no code implementations • 2 Jun 2020 • Chen Gao, Si Liu, Ran He, Shuicheng Yan, Bo Li
LGR module utilizes body skeleton knowledge to construct a layout graph that connects all relevant part features, where graph reasoning mechanism is used to propagate information among part nodes to mine their relations.
no code implementations • 30 Mar 2020 • Dapeng Hu, Jian Liang, Qibin Hou, Hanshu Yan, Yunpeng Chen, Shuicheng Yan, Jiashi Feng
To successfully align the multi-modal data structures across domains, the following works exploit discriminative information in the adversarial training process, e.g., using multiple class-wise discriminators and introducing conditional information in the input or output of the domain discriminator.
1 code implementation • ECCV 2020 • Shang-Hua Gao, Yong-Qiang Tan, Ming-Ming Cheng, Chengze Lu, Yunpeng Chen, Shuicheng Yan
Salient object detection models often demand a considerable amount of computation cost to make precise prediction for each pixel, making them hardly applicable on low-power devices.
no code implementations • 30 Dec 2019 • Xiaojie Jin, Jiang Wang, Joshua Slocum, Ming-Hsuan Yang, Shengyang Dai, Shuicheng Yan, Jiashi Feng
In this paper, we propose the resource constrained differentiable architecture search (RC-DARTS) method to learn architectures that are significantly smaller and faster while achieving comparable accuracy.
1 code implementation • ICCV 2019 • Zongxin Yang, Jian Dong, Ping Liu, Yi Yang, Shuicheng Yan
The second challenge is how to maintain high quality in generated results, especially for multi-step generations in which generated regions are spatially far away from the initial input.
no code implementations • 26 Dec 2019 • Jiahuan Ren, Zhao Zhang, Sheng Li, Yang Wang, Guangcan Liu, Shuicheng Yan, Meng Wang
Specifically, J-RFDL performs the robust representation by DL in a factorized compressed space to eliminate the negative effects of noise and outliers on the results, which can also make the DL process efficient.
no code implementations • 25 Dec 2019 • Yu Li, Sheng Tang, Rui Zhang, Yongdong Zhang, Jintao Li, Shuicheng Yan
While in situations where two domains are asymmetric in complexity, i.e., the amount of information between two domains is different, these approaches pose problems of poor generation quality, mapping ambiguity, and model sensitivity.
no code implementations • 17 Dec 2019 • Zhao Zhang, Yulin Sun, Yang Wang, Zheng-Jun Zha, Shuicheng Yan, Meng Wang
To address this issue, we propose a novel generalized end-to-end representation learning architecture, dubbed Convolutional Dictionary Pair Learning Network (CDPL-Net) in this paper, which integrates the learning schemes of the CNN and dictionary pair learning into a unified framework.
1 code implementation • 15 Dec 2019 • Yanyan Wei, Zhao Zhang, Yang Wang, Mingliang Xu, Yi Yang, Shuicheng Yan, Meng Wang
However, in practice it is rather common to have no paired images in real deraining tasks; in such cases, removing rain streaks in an unsupervised way becomes very challenging, since the lack of constraints between images leads to low-quality recovery results.
no code implementations • 15 Dec 2019 • Zhao Zhang, Zemin Tang, Yang Wang, Haijun Zhang, Shuicheng Yan, Meng Wang
LDB is a convolutional block similar to a dense block, but it reduces the computation cost and weight size to 1/L and 2/L of the original, respectively, where L is the number of layers in the block.
no code implementations • 13 Dec 2019 • Xianzhen Li, Zhao Zhang, Yang Wang, Guangcan Liu, Shuicheng Yan, Meng Wang
In this paper, we explore the deep multi-subspace recovery problem by designing a multilayer architecture for latent LRR.
no code implementations • 10 Dec 2019 • Shoufa Chen, Yunpeng Chen, Shuicheng Yan, Jiashi Feng
We demonstrate the effectiveness of our search strategy by conducting extensive experiments.
1 code implementation • CVPR 2020 • Chen Gao, Yunpeng Chen, Si Liu, Zhenxiong Tan, Shuicheng Yan
In this paper, we propose an AdversarialNAS method specially tailored for Generative Adversarial Networks (GANs) to search for a superior generative model on the task of unconditional image generation.
no code implementations • NeurIPS 2019 • Pan Zhou, Xiao-Tong Yuan, Huan Xu, Shuicheng Yan, Jiashi Feng
We address the problem of meta-learning which learns a prior over hypothesis from a sample of meta-training tasks for fast adaptation on meta-testing tasks.
no code implementations • 20 Nov 2019 • Yulin Sun, Zhao Zhang, Weiming Jiang, Zheng Zhang, Li Zhang, Shuicheng Yan, Meng Wang
In this paper, we propose a structured Robust Adaptive Dictionary Pair Learning (RA-DPL) framework for discriminative sparse representation learning.
1 code implementation • CVPR 2020 • Wentao Jiang, Si Liu, Chen Gao, Jie Cao, Ran He, Jiashi Feng, Shuicheng Yan
In this paper, we address the makeup transfer task, which aims to transfer the makeup from a reference image to a source image.
no code implementations • 2 Sep 2019 • Zhao Zhang, Yan Zhang, Sheng Li, Guangcan Liu, Dan Zeng, Shuicheng Yan, Meng Wang
For auto-weighting, RFA-LCF jointly preserves the manifold structures in the basis concept space and new coordinate space in an adaptive manner by minimizing the reconstruction errors on clean data, anchor points and coordinates.
1 code implementation • ICCV 2019 • Xuecheng Nie, Jianfeng Zhang, Shuicheng Yan, Jiashi Feng
Based on SPR, we develop the SPM model that can directly predict structured poses for multiple persons in a single stage, and thus offer a more compact pipeline and attractive efficiency advantage over two-stage methods.
Ranked #3 on Keypoint Detection on MPII Multi-Person
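The single-person representation behind SPM can be illustrated with a toy decoding step: each person is represented by a root joint position plus per-joint displacements, so absolute joint coordinates are recovered by simple addition (the coordinates below are made up):

```python
import numpy as np

# Root joint position of one detected person (hypothetical pixel coords).
root = np.array([120.0, 80.0])

# Per-joint displacements relative to the root (hypothetical: head, hips).
offsets = np.array([[0.0, -30.0],
                    [-15.0, 10.0],
                    [15.0, 10.0]])

# Decoding is a single vectorized addition -- no grouping step needed,
# which is what makes the pipeline single-stage.
joints = root + offsets
```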
no code implementations • 11 Jun 2019 • Zhao Zhang, Jiahuan Ren, Weiming Jiang, Zheng Zhang, Richang Hong, Shuicheng Yan, Meng Wang
We propose a joint subspace recovery and enhanced locality based robust flexible label consistent dictionary learning method called Robust Flexible Discriminative Dictionary Learning (RFDDL).
no code implementations • 29 May 2019 • Zhao Zhang, Lei Jia, Mingbo Zhao, Guangcan Liu, Meng Wang, Shuicheng Yan
A Kernel-Induced Label Propagation (Kernel-LP) framework by mapping is proposed for high-dimensional data classification using the most informative patterns of data in kernel space.
no code implementations • 27 May 2019 • Zhao Zhang, Weiming Jiang, Jie Qin, Li Zhang, Fanzhang Li, Min Zhang, Shuicheng Yan
Then we compute a linear classifier based on the approximated sparse codes by an analysis mechanism to simultaneously consider the classification and representation powers.
no code implementations • 25 May 2019 • Zhao Zhang, Yan Zhang, Sheng Li, Guangcan Liu, Meng Wang, Shuicheng Yan
RFA-LCF integrates the robust flexible CF, robust sparse local-coordinate coding and the adaptive reconstruction weighting learning into a unified model.
no code implementations • 25 May 2019 • Zhao Zhang, Yan Zhang, Guangcan Liu, Jinhui Tang, Shuicheng Yan, Meng Wang
To enrich prior knowledge to enhance the discrimination, RS2ACF clearly uses class information of labeled data and more importantly propagates it to unlabeled data by jointly learning an explicit label indicator for unlabeled data.
28 code implementations • ICCV 2019 • Yunpeng Chen, Haoqi Fan, Bing Xu, Zhicheng Yan, Yannis Kalantidis, Marcus Rohrbach, Shuicheng Yan, Jiashi Feng
Similarly, the output feature maps of a convolution layer can also be seen as a mixture of information at different frequencies.
Ranked #147 on Action Classification on Kinetics-400
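The core mechanism of Octave Convolution can be sketched as follows: channels are split into a high-frequency group at full resolution and a low-frequency group at half resolution, with four information paths between them. The toy version below uses 1x1 kernels and nearest-neighbour up/down-sampling; the weights and shapes are illustrative:

```python
import numpy as np

def avg_pool2(x):        # (C, H, W) -> (C, H/2, W/2)
    C, H, W = x.shape
    return x.reshape(C, H // 2, 2, W // 2, 2).mean(axis=(2, 4))

def upsample2(x):        # nearest-neighbour (C, H, W) -> (C, 2H, 2W)
    return x.repeat(2, axis=1).repeat(2, axis=2)

def octave_conv(x_h, x_l, W_hh, W_hl, W_lh, W_ll):
    """Toy octave convolution with 1x1 kernels: high/low-frequency
    feature groups and the four paths H->H, H->L, L->H, L->L."""
    conv1x1 = lambda W, x: np.einsum('oc,chw->ohw', W, x)
    y_h = conv1x1(W_hh, x_h) + upsample2(conv1x1(W_lh, x_l))
    y_l = conv1x1(W_ll, x_l) + conv1x1(W_hl, avg_pool2(x_h))
    return y_h, y_l

rng = np.random.default_rng(0)
x_h = rng.random((4, 8, 8))   # high-frequency group, full resolution
x_l = rng.random((4, 4, 4))   # low-frequency group, half resolution
W = [rng.random((4, 4)) * 0.1 for _ in range(4)]
y_h, y_l = octave_conv(x_h, x_l, *W)
```

Storing the low-frequency group at half resolution is where the memory and FLOP savings come from.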
no code implementations • 13 Feb 2019 • Jian Zhao, Jianshu Li, Xiaoguang Tu, Fang Zhao, Yuan Xin, Junliang Xing, Hengzhu Liu, Shuicheng Yan, Jiashi Feng
In this paper, we study the challenging unconstrained set-based face recognition problem where each subject face is instantiated by a set of media (images and videos) instead of a single image.
2 code implementations • NeurIPS 2018 • Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng
Learning to capture long-range relations is fundamental to image/video recognition.
9 code implementations • CVPR 2019 • Yunpeng Chen, Marcus Rohrbach, Zhicheng Yan, Shuicheng Yan, Jiashi Feng, Yannis Kalantidis
In this work, we propose a new approach for reasoning globally in which a set of features are globally aggregated over the coordinate space and then projected to an interaction space where relational reasoning can be efficiently computed.
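A rough sketch of this global reasoning step, with random stand-ins for the learned projections (all names and shapes below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
L, C, N = 64, 16, 8        # locations, channels, interaction-space nodes
X = rng.random((L, C))     # flattened coordinate-space features

# Learned components, replaced here by random stand-ins:
B = rng.random((L, N))         # soft assignment of locations to N nodes
A = rng.random((N, N))         # node adjacency used for graph reasoning
Wg = rng.random((C, C)) * 0.1  # node-state transform

V = B.T @ X                        # project: aggregate features into node states
V = np.maximum(A @ V @ Wg, 0.0)    # one GCN-style reasoning step in interaction space
Y = X + B @ V                      # reverse-project and add back as a residual
```

Because N is much smaller than the number of locations L, relational reasoning in the interaction space is cheap compared with pairwise reasoning over the full coordinate space.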
2 code implementations • 7 Nov 2018 • Rui Zhang, Sheng Tang, Yu Li, Junbo Guo, Yongdong Zhang, Jintao Li, Shuicheng Yan
The S3-GAN consists of an encoder network, a generator network, and an adversarial network.
no code implementations • 27 Oct 2018 • Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng
Learning to capture long-range relations is fundamental to image/video recognition.
Ranked #35 on Action Recognition on UCF101
1 code implementation • 2 Sep 2018 • Jian Zhao, Yu Cheng, Yi Cheng, Yang Yang, Haochong Lan, Fang Zhao, Lin Xiong, Yan Xu, Jianshu Li, Sugiri Pranata, ShengMei Shen, Junliang Xing, Hengzhu Liu, Shuicheng Yan, Jiashi Feng
Benchmarking our model on one of the most popular unconstrained face recognition datasets IJB-C additionally verifies the promising generalizability of AIM in recognizing faces in the wild.
Ranked #1 on Age-Invariant Face Recognition on MORPH Album2
no code implementations • ECCV 2018 • Fang Zhao, Jian Zhao, Shuicheng Yan, Jiashi Feng
This paper proposes a novel Dynamic Conditional Convolutional Network (DCCN) to handle conditional few-shot learning, i.e., only a few training samples are available for each condition.
no code implementations • ECCV 2018 • Xuecheng Nie, Jiashi Feng, Shuicheng Yan
This paper presents a novel Mutual Learning to Adapt model (MuLA) for joint human parsing and pose estimation.
Ranked #11 on Semantic Segmentation on LIP val
no code implementations • ECCV 2018 • Xuecheng Nie, Jiashi Feng, Junliang Xing, Shuicheng Yan
This paper proposes a novel Pose Partition Network (PPN) to address the challenging multi-person pose estimation problem.
no code implementations • ECCV 2018 • Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng
In this paper, we aim to reduce the computational cost of spatio-temporal deep neural networks, making them run as fast as their 2D counterparts while preserving state-of-the-art accuracy on video recognition benchmarks.
Ranked #36 on Action Recognition on UCF101 (using extra training data)
1 code implementation • 7 Jun 2018 • Canyi Lu, Jiashi Feng, Zhouchen Lin, Shuicheng Yan
Specifically, we show that by solving a TNN minimization problem, the underlying tensor of size $n_1\times n_2\times n_3$ with tubal rank $r$ can be exactly recovered when the given number of Gaussian measurements is $O(r(n_1+n_2-r)n_3)$.
no code implementations • CVPR 2018 • Xuecheng Nie, Jiashi Feng, Yiming Zuo, Shuicheng Yan
Comprehensive experiments on benchmarks LIP and extended PASCAL-Person-Part show that the proposed Parsing Induced Learner can improve performance of both single- and multi-person pose estimation to new state-of-the-art.
no code implementations • CVPR 2018 • Jian Zhao, Yu Cheng, Yan Xu, Lin Xiong, Jianshu Li, Fang Zhao, Karlekar Jayashree, Sugiri Pranata, ShengMei Shen, Junliang Xing, Shuicheng Yan, Jiashi Feng
To this end, we propose a Pose Invariant Model (PIM) for face recognition in the wild, with three distinct novelties.
no code implementations • CVPR 2018 • Falong Shen, Shuicheng Yan, Gang Zeng
Recent works on style transfer typically need to train image transformation networks for every new style, and the style is encoded in the network parameters by enormous iterations of stochastic gradient descent, which lacks the generalization ability to new style in the inference stage.
no code implementations • 23 May 2018 • Canyi Lu, Jiashi Feng, Zhouchen Lin, Tao Mei, Shuicheng Yan
Second, we observe that many existing methods approximate the block diagonal representation matrix by using different structure priors, e.g., sparsity and low-rankness, which are indirect.
1 code implementation • 10 Apr 2018 • Canyi Lu, Jiashi Feng, Yudong Chen, Wei Liu, Zhouchen Lin, Shuicheng Yan
Equipped with the new tensor nuclear norm, we then solve the TRPCA problem by solving a convex program and provide the theoretical guarantee for the exact recovery.
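The tensor nuclear norm in question can be computed by taking the FFT along the third mode and summing the matrix nuclear norms of the frontal slices; a sketch, assuming the common normalization by n3:

```python
import numpy as np

def tensor_nuclear_norm(T):
    """Tensor nuclear norm of an n1 x n2 x n3 tensor: FFT along the third
    mode, then average the nuclear norms of the frontal slices in the
    Fourier domain (one common normalization convention)."""
    F = np.fft.fft(T, axis=2)
    n3 = T.shape[2]
    return sum(np.linalg.norm(F[:, :, k], ord='nuc') for k in range(n3)) / n3

# Sanity check: for n3 = 1 the TNN reduces to the matrix nuclear norm.
M = np.random.default_rng(0).random((5, 4))
tnn = tensor_nuclear_norm(M[:, :, None])
mat = np.linalg.norm(M, ord='nuc')
```

This consistency with the matrix case is what lets matrix RPCA guarantees carry over to the tensor setting.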
2 code implementations • 10 Apr 2018 • Jian Zhao, Jianshu Li, Yu Cheng, Li Zhou, Terence Sim, Shuicheng Yan, Jiashi Feng
Despite the noticeable progress in perceptual tasks like detection, instance segmentation and human parsing, computers still perform unsatisfactorily on visually understanding humans in crowded scenes, such as group behavior analysis, person re-identification and autonomous driving, etc.
Ranked #1 on Multi-Human Parsing on PASCAL-Part
1 code implementation • CVPR 2018 • Pengyuan Lyu, Cong Yao, Wenhao Wu, Shuicheng Yan, Xiang Bai
We propose to detect scene text by localizing corner points of text bounding boxes and segmenting text regions in relative positions.
Ranked #2 on Scene Text Detection on ICDAR 2017 MLT
no code implementations • 1 Feb 2018 • Si Liu, Yao Sun, Defa Zhu, Renda Bao, Wei Wang, Xiangbo Shu, Shuicheng Yan
The age discriminative network guides the synthesized face to fit the real conditional distribution.
no code implementations • ICLR 2018 • Xiaojie Jin, Yingzhen Yang, Ning Xu, Jianchao Yang, Jiashi Feng, Shuicheng Yan
We present a new approach and a novel architecture, termed WSNet, for learning compact and efficient deep neural networks.
no code implementations • 15 Dec 2017 • Guangxi Li, Jinmian Ye, Haiqin Yang, Di Chen, Shuicheng Yan, Zenglin Xu
Recently, deep neural networks (DNNs) have been regarded as the state-of-the-art classification methods in a wide range of applications, especially in image classification.
no code implementations • 8 Dec 2017 • Canyi Lu, Jiashi Feng, Zhouchen Lin, Shuicheng Yan
Experimental analysis on several real data sets verifies the effectiveness of our method.
no code implementations • 8 Dec 2017 • Yunpeng Chen, Jianshu Li, Bin Zhou, Jiashi Feng, Shuicheng Yan
For a 320x320 input with batch size 8, WeaveNet reaches 79.5% mAP on the PASCAL VOC 2007 test set at 101 fps, costing only 4 fps extra, and further improves to 79.7% mAP with more iterations.
no code implementations • NeurIPS 2017 • Jian Zhao, Lin Xiong, Panasonic Karlekar Jayashree, Jianshu Li, Fang Zhao, Zhecan Wang, Panasonic Sugiri Pranata, Panasonic Shengmei Shen, Shuicheng Yan, Jiashi Feng
In particular, we employ an off-the-shelf 3D face model as a simulator to generate profile face images with varying poses.
Ranked #1 on Face Verification on IJB-A
no code implementations • ICML 2018 • Xiaojie Jin, Yingzhen Yang, Ning Xu, Jianchao Yang, Nebojsa Jojic, Jiashi Feng, Shuicheng Yan
We present a new approach and a novel architecture, termed WSNet, for learning compact and efficient deep neural networks.
no code implementations • 26 Nov 2017 • Siyu Zhou, Weiqiang Zhao, Jiashi Feng, Hanjiang Lai, Yan Pan, Jian Yin, Shuicheng Yan
Second, we propose a new occupational-aware adversarial face aging network, which learns human aging process under different occupations.
no code implementations • 26 Nov 2017 • Xi Zhang, Siyu Zhou, Jiashi Feng, Hanjiang Lai, Bo Li, Yan Pan, Jian Yin, Shuicheng Yan
The proposed new adversarial network, HashGAN, consists of three building blocks: 1) the feature learning module to obtain feature representations, 2) the generative attention module to generate an attention mask, which is used to obtain the attended (foreground) and the unattended (background) feature representations, 3) the discriminative hash coding module to learn hash functions that preserve the similarities between different modalities.
no code implementations • 16 Nov 2017 • Jianshu Li, Shengtao Xiao, Fang Zhao, Jian Zhao, Jianan Li, Jiashi Feng, Shuicheng Yan, Terence Sim
Specifically, iFAN achieves an overall F-score of 91.15% on the Helen dataset for face parsing, a normalized mean error of 5.81% on the MTFL dataset for facial landmark localization, and an accuracy of 45.73% on the BNU dataset for emotion recognition with a single model.
no code implementations • NeurIPS 2017 • Xiaojie Jin, Huaxin Xiao, Xiaohui Shen, Jimei Yang, Zhe Lin, Yunpeng Chen, Zequn Jie, Jiashi Feng, Shuicheng Yan
The ability of predicting the future is important for intelligent systems, e.g., autonomous vehicles and robots, to plan early and make decisions accordingly.
no code implementations • 4 Oct 2017 • Xiaodan Liang, Yunchao Wei, Liang Lin, Yunpeng Chen, Xiaohui Shen, Jianchao Yang, Shuicheng Yan
An intuition on human segmentation is that when a human is moving in a video, the video-context (e.g., appearance and motion clues) may potentially infer reasonable mask information for the whole human body.
no code implementations • ICCV 2017 • Rui Zhang, Sheng Tang, Yongdong Zhang, Jintao Li, Shuicheng Yan
Through adding a new scale regression layer, we can dynamically infer the position-adaptive scale coefficients which are adopted to resize the convolutional patches.
no code implementations • ICCV 2017 • Shengtao Xiao, Jiashi Feng, Luoqi Liu, Xuecheng Nie, Wei Wang, Shuicheng Yan, Ashraf Kassim
To address these challenging issues, we introduce a novel recurrent 3D-2D dual learning model that alternatively performs 2D-based 3D face model refinement and 3D-to-2D projection based 2D landmark refinement to reliably reason about self-occluded landmarks, precisely capture the subtle landmark displacement and accurately detect landmarks even in presence of extremely large poses.
no code implementations • 25 Sep 2017 • Xi Peng, Jiashi Feng, Shijie Xiao, Jiwen Lu, Zhang Yi, Shuicheng Yan
In this paper, we present a deep extension of Sparse Subspace Clustering, termed Deep Sparse Subspace Clustering (DSSC).
1 code implementation • 13 Sep 2017 • Falong Shen, Shuicheng Yan, Gang Zeng
Recent works on style transfer typically need to train image transformation networks for every new style, and the style is encoded in the network parameters by enormous iterations of stochastic gradient descent.
no code implementations • 5 Sep 2017 • Yingzhen Yang, Feng Liang, Nebojsa Jojic, Shuicheng Yan, Jiashi Feng, Thomas S. Huang
By generalization analysis via Rademacher complexity, the generalization error bound for the kernel classifier learned from hypothetical labeling is expressed as the sum of pairwise similarity between the data from different classes, parameterized by the weights of the kernel classifier.
no code implementations • 15 Aug 2017 • Xin Li, Zequn Jie, Jiashi Feng, Changsong Liu, Shuicheng Yan
However, most of the existing CNN models only learn features through a feedforward structure and no feedback information from top to bottom layers is exploited to enable the networks to refine themselves.
no code implementations • CVPR 2016 • Canyi Lu, Jiashi Feng, Yudong Chen, Wei Liu, Zhouchen Lin, Shuicheng Yan
In this work, we prove that under suitable assumptions, we can recover both the low-rank and the sparse components exactly by simply solving a convex program whose objective is a weighted combination of the tensor nuclear norm and the $\ell_1$-norm, i.e., $\min_{\mathcal{L},\,\mathcal{E}} \|\mathcal{L}\|_* + \lambda\|\mathcal{E}\|_1, \ \text{s.t. } \mathcal{X} = \mathcal{L} + \mathcal{E}$.
no code implementations • ICCV 2017 • Xin Li, Zequn Jie, Wei Wang, Changsong Liu, Jimei Yang, Xiaohui Shen, Zhe Lin, Qiang Chen, Shuicheng Yan, Jiashi Feng
Thus, they suffer from heterogeneous object scales caused by perspective projection of cameras on actual scenes and inevitably encounter parsing failures on distant objects as well as other boundary and recognition errors.
no code implementations • ICCV 2017 • Hao Liu, Jiashi Feng, Zequn Jie, Karlekar Jayashree, Bo Zhao, Meibin Qi, Jianguo Jiang, Shuicheng Yan
We investigate the problem of person search in the wild in this work.
Ranked #4 on Person Re-Identification on CUHK-SYSU
19 code implementations • NeurIPS 2017 • Yunpeng Chen, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, Jiashi Feng
In this work, we present a simple, highly efficient and modularized Dual Path Network (DPN) for image classification which presents a new topology of connection paths internally.
1 code implementation • CVPR 2017 • Falong Shen, Rui Gan, Shuicheng Yan, Gang Zeng
The proposed joint model also employs a guidance CRF to further enhance the segmentation performance.
no code implementations • CVPR 2017 • Bo Zhao, Jiashi Feng, Xiao Wu, Shuicheng Yan
We introduce a new fashion search protocol where attribute manipulation is allowed within the interaction between users and search engines, e.g., manipulating the color attribute of the clothing from red to blue.
no code implementations • CVPR 2017 • Jianan Li, Xiaodan Liang, Yunchao Wei, Tingfa Xu, Jiashi Feng, Shuicheng Yan
In this work, we address the small object detection problem by developing a single architecture that internally lifts representations of small objects to "super-resolved" ones, achieving similar characteristics as large objects and thus more discriminative for detection.
1 code implementation • 13 Jun 2017 • Hao Liu, Zequn Jie, Karlekar Jayashree, Meibin Qi, Jianguo Jiang, Shuicheng Yan, Jiashi Feng
Video based person re-identification plays a central role in realistic security and video surveillance.
no code implementations • 4 Jun 2017 • Xiangbo Shu, Jinhui Tang, Zechao Li, Hanjiang Lai, Liyan Zhang, Shuicheng Yan
Basically, for each age group we learn an aging dictionary to reveal its aging characteristics (e.g., wrinkles). Dictionary bases with the same index in two neighboring aging dictionaries form a particular aging pattern across these two age groups, and a linear combination of all these patterns expresses a particular personalized aging process.
1 code implementation • 21 May 2017 • Xuecheng Nie, Jiashi Feng, Junliang Xing, Shuicheng Yan
This paper proposes a new Generative Partition Network (GPN) to address the challenging multi-person pose estimation problem.
Ranked #1 on Multi-Person Pose Estimation on WAF (AP metric)
2 code implementations • 19 May 2017 • Jianshu Li, Jian Zhao, Yunchao Wei, Congyan Lang, Yidong Li, Terence Sim, Shuicheng Yan, Jiashi Feng
To address the multi-human parsing problem, we introduce a new multi-human parsing (MHP) dataset and a novel multi-human parsing model named MH-Parser.
Ranked #3 on Multi-Human Parsing on MHP v1.0
no code implementations • CVPR 2017 • Xuanyi Dong, Junshi Huang, Yi Yang, Shuicheng Yan
In this paper, we present a novel and general network structure towards accelerating the inference process of convolutional neural networks, which is more complicated in network structure yet with less inference complexity.
no code implementations • CVPR 2017 • Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, Shuicheng Yan
We investigate a principle way to progressively mine discriminative object regions using classification networks to address the weakly-supervised semantic segmentation problems.
no code implementations • NeurIPS 2016 • Zequn Jie, Xiaodan Liang, Jiashi Feng, Xiaojie Jin, Wen Feng Lu, Shuicheng Yan
Therefore, Tree-RL can better cover different objects with various scales which is quite appealing in the context of object proposal.
no code implementations • CVPR 2017 • Xiaodan Liang, Liang Lin, Xiaohui Shen, Jiashi Feng, Shuicheng Yan, Eric P. Xing
Instead of learning LSTM models over the pre-fixed structures, we propose to further learn the intermediate interpretable multi-level graph structures in a progressive and stochastic way from data during the LSTM network optimization.
no code implementations • 24 Jan 2017 • Yunpeng Chen, Xiaojie Jin, Jiashi Feng, Shuicheng Yan
Learning rich and diverse representations is critical for the performance of deep convolutional neural networks (CNNs).
no code implementations • 1 Jan 2017 • Hao Liu, Zequn Jie, Karlekar Jayashree, Meibin Qi, Jianguo Jiang, Shuicheng Yan, Jiashi Feng
Video based person re-identification plays a central role in realistic security and video surveillance.
no code implementations • 27 Dec 2016 • Fang Zhao, Jiashi Feng, Jian Zhao, Wenhan Yang, Shuicheng Yan
The first one, named multi-scale spatial LSTM encoder, reads facial patches of various scales sequentially to output a latent representation, and occlusion-robustness is achieved owing to the fact that the influence of occlusion is only upon some of the patches.
no code implementations • ICCV 2017 • Xiaojie Jin, Xin Li, Huaxin Xiao, Xiaohui Shen, Zhe Lin, Jimei Yang, Yunpeng Chen, Jian Dong, Luoqi Liu, Zequn Jie, Jiashi Feng, Shuicheng Yan
In this way, the network can effectively learn to capture video dynamics and temporal context, which are critical clues for video scene parsing, without requiring extra manual annotations.
2 code implementations • CVPR 2017 • Wenhan Yang, Robby T. Tan, Jiashi Feng, Jiaying Liu, Zongming Guo, Shuicheng Yan
Based on the first model, we develop a multi-task deep learning architecture that learns the binary rain streak map, the appearance of rain streaks, and the clean background, which is our ultimate output.
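The additive decomposition behind the first model can be sketched numerically: an observed rainy image is modeled as a clean background plus streaks whose locations are given by a binary map (sizes and values below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.random((8, 8))                          # clean background
R = (rng.random((8, 8)) < 0.2).astype(float)    # binary rain-streak location map
S = 0.5 * R * rng.random((8, 8))                # streak appearance, nonzero only where R = 1
O = np.clip(B + S, 0, 1)                        # observed rainy image
```

The multi-task network predicts all three unknowns (R, S, and B) from O, so the binary map can gate where streak removal is applied.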
no code implementations • 27 Aug 2016 • Xiaojie Jin, Yunpeng Chen, Jiashi Feng, Zequn Jie, Shuicheng Yan
In this paper, we consider the scene parsing problem and propose a novel Multi-Path Feedback recurrent neural network (MPF-RNN) for parsing scene images.
no code implementations • 20 Aug 2016 • Jing Wang, Meng Wang, Xuegang Hu, Shuicheng Yan
Typically, the specific structure is assumed to be low rank, which holds for a wide range of data, such as images and videos.
no code implementations • 18 Aug 2016 • Jianan Li, Xiaodan Liang, Jianshu Li, Tingfa Xu, Jiashi Feng, Shuicheng Yan
Most of existing detection pipelines treat object proposals independently and predict bounding box locations and classification scores over them separately.
no code implementations • 24 Jul 2016 • Xiangyun Zhao, Xiaodan Liang, Luoqi Liu, Teng Li, Yugang Han, Nuno Vasconcelos, Shuicheng Yan
Objective functions for training of deep networks for face-related recognition tasks, such as facial expression recognition (FER), usually consider each sample independently.
Ranked #2 on Facial Expression Recognition (FER) on Oulu-CASIA
no code implementations • 19 Jul 2016 • Xiaojie Jin, Xiao-Tong Yuan, Jiashi Feng, Shuicheng Yan
In this paper, we propose an iterative hard thresholding (IHT) approach to train Skinny Deep Neural Networks (SDNNs).
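IHT itself is easy to illustrate outside of deep networks: alternate a gradient step with a hard-thresholding step that keeps only the k largest-magnitude weights. A sketch on a toy least-squares problem (the paper applies the same idea to network weights; the sizes below are made up):

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w, zero the rest."""
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    out[idx] = w[idx]
    return out

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
w_true = np.zeros(20)
w_true[[2, 7, 11]] = [1.5, -2.0, 0.8]           # a 3-sparse ground truth
y = A @ w_true

w, k, lr = np.zeros(20), 3, 0.005
for _ in range(500):
    grad = A.T @ (A @ w - y)                    # gradient step on dense weights
    w = hard_threshold(w - lr * grad, k)        # prune back to k weights
```

The resulting iterate is always k-sparse, which is the "skinny" property the training procedure enforces on each layer.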
no code implementations • 19 Jul 2016 • Xiaojie Jin, Yunpeng Chen, Jian Dong, Jiashi Feng, Shuicheng Yan
In this paper, we propose a layer-wise discriminative learning method to enhance the discriminative capability of a deep network by allowing its layers to work collaboratively for classification.
no code implementations • 28 Jun 2016 • Bo Zhao, Xiao Wu, Jiashi Feng, Qiang Peng, Shuicheng Yan
Fine-grained object classification is a challenging task due to the subtle inter-class difference and large intra-class variation.
no code implementations • 14 Jun 2016 • Hao Liu, Jiashi Feng, Meibin Qi, Jianguo Jiang, Shuicheng Yan
The CAN model is able to learn which parts of images are relevant for discerning persons and automatically integrates information from different parts to determine whether a pair of images belongs to the same person.
no code implementations • CVPR 2016 • Wei Wang, Zhen Cui, Yan Yan, Jiashi Feng, Shuicheng Yan, Xiangbo Shu, Nicu Sebe
Modeling the aging process of human face is important for cross-age face verification and recognition.
no code implementations • CVPR 2016 • Zhen Cui, Shengtao Xiao, Jiashi Feng, Shuicheng Yan
The produced confidence maps from the RNNs are employed to adaptively regularize the learning of discriminative correlation filters by suppressing clutter background noises while making full use of the information from reliable parts.
no code implementations • 29 Apr 2016 • Wenhan Yang, Jiashi Feng, Jianchao Yang, Fang Zhao, Jiaying Liu, Zongming Guo, Shuicheng Yan
To address this essentially ill-posed problem, we introduce a Deep Edge Guided REcurrent rEsidual~(DEGREE) network to progressively recover the high-frequency details.
no code implementations • 6 Apr 2016 • Ilija Ilievski, Shuicheng Yan, Jiashi Feng
Solving VQA problems requires techniques from both computer vision for understanding the visual contents of a presented image or video, as well as the ones from natural language processing for understanding semantics of the question and generating the answers.
no code implementations • 24 Mar 2016 • Jianan Li, Yunchao Wei, Xiaodan Liang, Jian Dong, Tingfa Xu, Jiashi Feng, Shuicheng Yan
We provide preliminary answers to these questions through developing a novel Attention to Context Convolution Neural Network (AC-CNN) based object detection model.
no code implementations • 23 Mar 2016 • Xiaodan Liang, Xiaohui Shen, Jiashi Feng, Liang Lin, Shuicheng Yan
By taking the semantic object parsing task as an exemplar application scenario, we propose the Graph Long Short-Term Memory (Graph LSTM) network, which is the generalization of LSTM from sequential data or multi-dimensional data to general graph-structured data.
no code implementations • 10 Mar 2016 • Hanjiang Lai, Pan Yan, Xiangbo Shu, Yunchao Wei, Shuicheng Yan
The instance-aware representations not only bring advantages to semantic hashing, but also can be used in category-aware hashing, in which an image is represented by multiple pieces of hash codes and each piece of code corresponds to a category.
1 code implementation • 26 Feb 2016 • Wei Han, Pooya Khorrami, Tom Le Paine, Prajit Ramachandran, Mohammad Babaeizadeh, Honghui Shi, Jianan Li, Shuicheng Yan, Thomas S. Huang
Video object detection is challenging because objects that are easily detected in one frame may be difficult to detect in another frame within the same clip.
no code implementations • 29 Jan 2016 • Guo-Sen Xie, Xu-Yao Zhang, Shuicheng Yan, Cheng-Lin Liu
Learned from a large-scale training dataset, CNN features are much more discriminative and accurate than the hand-crafted features.
no code implementations • 19 Jan 2016 • Zequn Jie, Xiaodan Liang, Jiashi Feng, Wen Feng Lu, Eng Hock Francis Tay, Shuicheng Yan
In particular, in order to improve the localization accuracy, a fully convolutional network is employed which predicts locations of object proposals for each pixel.
1 code implementation • 22 Dec 2015 • Xiaojie Jin, Chunyan Xu, Jiashi Feng, Yunchao Wei, Junjun Xiong, Shuicheng Yan
Rectified linear activation units are important components for state-of-the-art deep convolutional networks.
no code implementations • 8 Dec 2015 • Changbo Zhu, Huan Xu, Shuicheng Yan
With the success of modern internet-based platforms such as Amazon Mechanical Turk, it is now common to collect large numbers of hand-labeled samples from non-experts.
no code implementations • ICCV 2015 • Xiaodan Liang, Chunyan Xu, Xiaohui Shen, Jianchao Yang, Si Liu, Jinhui Tang, Liang Lin, Shuicheng Yan
In this work, we address the human parsing task with a novel Contextualized Convolutional Neural Network (Co-CNN) architecture, which well integrates the cross-layer context, global image-level context, within-super-pixel context and cross-super-pixel neighborhood context into a unified network.