Search Results for author: Gao Huang

Found 126 papers, 86 papers with code

Where and when to look? Spatial-temporal attention for action recognition in videos

no code implementations • ICLR 2019 • Lili Meng, Bo Zhao, Bo Chang, Gao Huang, Frederick Tung, Leonid Sigal

Our model is efficient, as it proposes a separable spatio-temporal mechanism for video attention, while being able to identify important parts of the video both spatially and temporally.

Action Recognition In Videos Temporal Action Localization +1

Exploring Text-to-Motion Generation with Human Preference

1 code implementation • 15 Apr 2024 • Jenny Sheng, Matthieu Lin, Andrew Zhao, Kevin Pruvost, Yu-Hui Wen, Yangguang Li, Gao Huang, Yong-Jin Liu

This paper presents an exploration of preference learning in text-to-motion generation.

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

2 code implementations • 18 Mar 2024 • Ruyi Xu, Yuan Yao, Zonghao Guo, Junbo Cui, Zanlin Ni, Chunjiang Ge, Tat-Seng Chua, Zhiyuan Liu, Maosong Sun, Gao Huang

To address the challenges, we present LLaVA-UHD, a large multimodal model that can efficiently perceive images in any aspect ratio and high resolution.

1,312

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation

1 code implementation • 18 Mar 2024 • Wangbo Zhao, Jiasheng Tang, Yizeng Han, Yibing Song, Kai Wang, Gao Huang, Fan Wang, Yang You

Existing parameter-efficient fine-tuning (PEFT) methods have achieved significant success in adapting vision transformers (ViTs) by improving parameter efficiency.

Semantic Segmentation Video Recognition

GRA: Detecting Oriented Objects through Group-wise Rotating and Attention

no code implementations • 17 Mar 2024 • Jiangshan Wang, Yifan Pu, Yizeng Han, Jiayi Guo, Yiru Wang, Xiu Li, Gao Huang

GRA can adaptively capture fine-grained features of objects with diverse orientations, comprising two key components: Group-wise Rotating and Group-wise Attention.

Object object-detection +2

Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering

no code implementations • 14 Mar 2024 • Zeyu Liu, Weicong Liang, Zhanhao Liang, Chong Luo, Ji Li, Gao Huang, Yuhui Yuan

Visual text rendering poses a fundamental challenge for contemporary text-to-image generation models, with the core problem lying in text encoder deficiencies.

Text-to-Image Generation

2023 Low-Power Computer Vision Challenge (LPCVC) Summary

no code implementations • 11 Mar 2024 • Leo Chen, Benjamin Boardley, Ping Hu, Yiru Wang, Yifan Pu, Xin Jin, Yongqiang Yao, Ruihao Gong, Bo Li, Gao Huang, Xianglong Liu, Zifu Wan, Xinwang Chen, Ning Liu, Ziyi Zhang, Dongping Liu, Ruijie Shan, Zhengping Che, Fachao Zhang, Xiaofeng Mou, Jian Tang, Maxim Chuprov, Ivan Malofeev, Alexander Goncharenko, Andrey Shcherbin, Arseny Yanchenko, Sergey Alyamkin, Xiao Hu, George K. Thiruvathukal, Yung Hsiang Lu

This article describes the 2023 IEEE Low-Power Computer Vision Challenge (LPCVC).

Probabilistic Contrastive Learning for Long-Tailed Visual Recognition

1 code implementation • 11 Mar 2024 • Chaoqun Du, Yulin Wang, Shiji Song, Gao Huang

To overcome this obstacle, we propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space, and samples contrastive pairs accordingly.

Ranked #8 on Long-tail Learning on iNaturalist 2018

Long-tail Learning

SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning

1 code implementation • 21 Feb 2024 • Chaoqun Du, Yizeng Han, Gao Huang

Recent advancements in semi-supervised learning have focused on a more realistic yet challenging task: addressing imbalances in labeled data while the class distribution of unlabeled data remains both unknown and potentially mismatched.

LLM Agents for Psychology: A Study on Gamified Assessments

no code implementations • 19 Feb 2024 • Qisen Yang, Zekun Wang, Honghui Chen, Shenzhi Wang, Yifan Pu, Xin Gao, Wenhao Huang, Shiji Song, Gao Huang

Psychological measurement is essential for mental health, self-understanding, and personal development.

Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels

no code implementations • 28 Dec 2023 • Rui Huang, Songyou Peng, Ayca Takmaz, Federico Tombari, Marc Pollefeys, Shiji Song, Gao Huang, Francis Engelmann

Therefore, we explore the use of image segmentation foundation models to automatically generate training labels for 3D segmentation.

Image Segmentation Scene Segmentation +1

A-SDM: Accelerating Stable Diffusion through Redundancy Removal and Performance Optimization

no code implementations • 24 Dec 2023 • Jinchao Zhu, Yuxuan Wang, Xiaobing Tu, Siyuan Pan, Pengfei Wan, Gao Huang

The Stable Diffusion Model (SDM) is a popular and efficient text-to-image (t2i) generation and image-to-image (i2i) generation model.

Quantization

Mask Grounding for Referring Image Segmentation

no code implementations • 19 Dec 2023 • Yong Xien Chng, Henry Zheng, Yizeng Han, Xuchong Qiu, Gao Huang

To tackle this challenge, we introduce a novel Mask Grounding auxiliary task that significantly improves visual grounding within language features, by explicitly teaching the model to learn fine-grained correspondence between masked textual tokens and their matching visual objects.

Ranked #2 on Referring Expression Segmentation on RefCOCO testB

Image Segmentation Referring Expression Segmentation +4

GSVA: Generalized Segmentation via Multimodal Large Language Models

no code implementations • 15 Dec 2023 • Zhuofan Xia, Dongchen Han, Yizeng Han, Xuran Pan, Shiji Song, Gao Huang

Generalized Referring Expression Segmentation (GRES) extends the scope of classic RES to expressions that refer to multiple objects, and to identifying empty targets that are absent from the image.

Generalized Referring Expression Segmentation Referring Expression +1

Agent Attention: On the Integration of Softmax and Linear Attention

2 code implementations • 14 Dec 2023 • Dongchen Han, Tianzhu Ye, Yizeng Han, Zhuofan Xia, Shiji Song, Gao Huang

Specifically, the Agent Attention, denoted as a quadruple $(Q, A, K, V)$, introduces an additional set of agent tokens $A$ into the conventional attention module.

Computational Efficiency Image Classification +4

342

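As a rough illustration of the quadruple $(Q, A, K, V)$ mentioned in the Agent Attention entry, the sketch below assumes a two-step reading: the agent tokens first act as queries that aggregate information from $K$ and $V$, and then act as keys so that each query reads from the aggregated values. This is a minimal pure-Python sketch under that assumption, not the authors' implementation; the helper names and toy matrices are hypothetical, and any additional components of the published module are omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """(n x k) @ (k x m) -> (n x m) on nested lists."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def attention(q, k, v):
    """Plain scaled softmax attention: softmax(q k^T / sqrt(d)) v."""
    d = len(q[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)


def agent_attention(q, a, k, v):
    """Two-step attention routed through a small set of agent tokens (assumed form)."""
    # Step 1: agent tokens act as queries and aggregate from K and V.
    agent_values = attention(a, k, v)
    # Step 2: agent tokens act as keys; each query reads the aggregated values.
    return attention(q, a, agent_values)


# Hypothetical toy input: 4 tokens, 2 agent tokens, feature dimension 2.
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
a = [[1.0, 0.5], [-0.5, 1.0]]

out = agent_attention(q, a, k, v)
print(len(out), len(out[0]))  # one output row per query token
```

With n tokens and m agent tokens, both steps build score matrices of size n x m or m x n rather than n x n, which is where the efficiency argument comes from when m is much smaller than n.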

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

1 code implementation • 7 Dec 2023 • Jiayi Guo, Xingqian Xu, Yifan Pu, Zanlin Ni, Chaofei Wang, Manushree Vasu, Shiji Song, Gao Huang, Humphrey Shi

Specifically, we introduce Step-wise Variation Regularization to enforce that the ratio between the variation of an arbitrary input latent and that of the output image remains constant at any diffusion training step.

257

Augmenting Unsupervised Reinforcement Learning with Self-Reference

no code implementations • 16 Nov 2023 • Andrew Zhao, Erle Zhu, Rui Lu, Matthieu Lin, Yong-Jin Liu, Gao Huang

Our approach achieves state-of-the-art results in terms of Interquartile Mean (IQM) performance and Optimality Gap reduction on the Unsupervised Reinforcement Learning Benchmark for model-free methods, recording an 86% IQM and a 16% Optimality Gap.

Attribute reinforcement-learning +1

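For readers unfamiliar with the two aggregate metrics named in the entry above, here is a small sketch of how an Interquartile Mean (IQM) and an Optimality Gap are commonly computed from normalized scores. It assumes the usual definitions (IQM as the mean of the middle 50% of sorted scores, Optimality Gap as the average shortfall below a target score of 1.0); the function names and the scores are hypothetical and are not taken from the paper.

```python
def interquartile_mean(scores):
    """Mean of the middle 50% of scores; expects a non-empty list."""
    ordered = sorted(scores)
    cut = len(ordered) // 4  # number of values dropped from each end
    middle = ordered[cut:len(ordered) - cut]
    return sum(middle) / len(middle)


def optimality_gap(scores, target=1.0):
    """Average amount by which scores fall short of the target."""
    return sum(max(target - s, 0.0) for s in scores) / len(scores)


# Hypothetical normalized scores for eight runs (illustrative only).
scores = [0.2, 0.5, 0.7, 0.8, 0.9, 0.95, 1.1, 1.3]

print(interquartile_mean(scores))  # about 0.84 for these made-up scores
print(optimality_gap(scores))      # about 0.24 for these made-up scores
```

A higher IQM and a lower Optimality Gap are both better, which is why the entry reports them together.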

Detecting Generated Images by Real Images Only

no code implementations • 2 Nov 2023 • Xiuli Bi, Bo Liu, Fan Yang, Bin Xiao, Weisheng Li, Gao Huang, Pamela C. Cosman

This paper approaches the generated image detection problem from a new perspective: Start from real images.

Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning

1 code implementation • NeurIPS 2023 • Shenzhi Wang, Qisen Yang, Jiawei Gao, Matthieu Gaetan Lin, Hao Chen, Liwei Wu, Ning Jia, Shiji Song, Gao Huang

Existing solutions tackle this problem by imposing a policy constraint on the policy improvement objective in both offline and online learning.

D4RL Reinforcement Learning (RL)

STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning

1 code implementation • NeurIPS 2023 • Weipu Zhang, Gang Wang, Jian Sun, Yetian Yuan, Gao Huang

The performance of these algorithms heavily relies on the sequence modeling and generation capabilities of the world model.

Ranked #5 on Atari Games 100k on Atari 100k

Atari Games 100k Model-based Reinforcement Learning +2

Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL

2 code implementations • NeurIPS 2023 • Yang Yue, Rui Lu, Bingyi Kang, Shiji Song, Gao Huang

We first identify a fundamental pattern, self-excitation, as the primary cause of Q-value estimation divergence in offline RL.

Attribute Offline RL

Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation

no code implementations • 2 Oct 2023 • Shenzhi Wang, Chang Liu, Zilong Zheng, Siyuan Qi, Shuo Chen, Qisen Yang, Andrew Zhao, Chaofei Wang, Shiji Song, Gao Huang

This study utilizes the intricate Avalon game as a testbed to explore LLMs' potential in deceptive environments.

Misinformation

Generalized Activation via Multivariate Projection

1 code implementation • 29 Sep 2023 • Jiayun Li, Yuxiao Cheng, Yiwen Lu, Zhuofan Xia, Yilin Mo, Gao Huang

Activation functions are essential to introduce nonlinearity into neural networks, with the Rectified Linear Unit (ReLU) often favored for its simplicity and effectiveness.

Hundreds Guide Millions: Adaptive Offline Reinforcement Learning with Expert Guidance

no code implementations • 4 Sep 2023 • Qisen Yang, Shenzhi Wang, Qihang Zhang, Gao Huang, Shiji Song

Offline reinforcement learning (RL) optimizes the policy on a previously collected dataset without any interactions with the environment, yet usually suffers from the distributional shift problem.

Offline RL reinforcement-learning +1

DAT++: Spatially Dynamic Vision Transformer with Deformable Attention

1 code implementation • 4 Sep 2023 • Zhuofan Xia, Xuran Pan, Shiji Song, Li Erran Li, Gao Huang

On the one hand, using dense attention in ViT leads to excessive memory and computational cost, and features can be influenced by irrelevant parts that are beyond the region of interest.

Ranked #4 on Object Detection on COCO 2017

Image Classification Instance Segmentation +2

694

Fine-grained Recognition with Learnable Semantic Data Augmentation

1 code implementation • 1 Sep 2023 • Yifan Pu, Yizeng Han, Yulin Wang, Junlan Feng, Chao Deng, Gao Huang

Since images belonging to the same meta-category usually share similar visual appearances, mining discriminative visual cues is the key to distinguishing fine-grained categories.

Data Augmentation Fine-Grained Image Recognition +2

Latency-aware Unified Dynamic Networks for Efficient Image Recognition

1 code implementation • 30 Aug 2023 • Yizeng Han, Zeyu Liu, Zhihang Yuan, Yifan Pu, Chaofei Wang, Shiji Song, Gao Huang

Dynamic computation has emerged as a promising avenue to enhance the inference efficiency of deep networks.

Scheduling

Computation-efficient Deep Learning for Computer Vision: A Survey

no code implementations • 27 Aug 2023 • Yulin Wang, Yizeng Han, Chaofei Wang, Shiji Song, Qi Tian, Gao Huang

Over the past decade, deep learning models have exhibited considerable advancements, reaching or even exceeding human-level performance in a range of visual perception tasks.

Autonomous Vehicles Edge-computing +1

ExpeL: LLM Agents Are Experiential Learners

1 code implementation • 20 Aug 2023 • Andrew Zhao, Daniel Huang, Quentin Xu, Matthieu Lin, Yong-Jin Liu, Gao Huang

The recent surge in research interest in applying large language models (LLMs) to decision-making tasks has flourished by leveraging the extensive world knowledge embedded in LLMs.

Decision Making Transfer Learning +1

Learning Specialized Activation Functions for Physics-informed Neural Networks

1 code implementation • 8 Aug 2023 • Honghui Wang, Lu Lu, Shiji Song, Gao Huang

To avoid the inefficient manual selection and to alleviate the optimization difficulty of PINNs, we introduce adaptive activation functions to search for the optimal function when solving different problems.

FLatten Transformer: Vision Transformer using Focused Linear Attention

1 code implementation • ICCV 2023 • Dongchen Han, Xuran Pan, Yizeng Han, Shiji Song, Gao Huang

The quadratic computation complexity of self-attention has been a persistent challenge when applying Transformer models to vision tasks.

337

Multi-Dimensional Refinement Graph Convolutional Network with Robust Decouple Loss for Fine-Grained Skeleton-Based Action Recognition

no code implementations • 27 Jun 2023 • Sheng-Lan Liu, Yu-Ning Ding, Jin-Rong Zhang, Kai-Yuan Liu, Si-Fan Zhang, Fei-Long Wang, Gao Huang

Graph convolutional networks have been widely used in skeleton-based action recognition.

Fine-grained Action Recognition Skeleton Based Action Recognition

Dynamic Perceiver for Efficient Visual Recognition

1 code implementation • ICCV 2023 • Yizeng Han, Dongchen Han, Zeyu Liu, Yulin Wang, Xuran Pan, Yifan Pu, Chao Deng, Junlan Feng, Shiji Song, Gao Huang

Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.

Action Recognition Classification +4

Decoupled Prioritized Resampling for Offline RL

2 code implementations • 8 Jun 2023 • Yang Yue, Bingyi Kang, Xiao Ma, Qisen Yang, Gao Huang, Shiji Song, Shuicheng Yan

OPER is a plug-and-play component for offline RL algorithms.

Offline RL Reinforcement Learning (RL)

ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process

1 code implementation • 8 Jun 2023 • Changyao Tian, Chenxin Tao, Jifeng Dai, Hao Li, Ziheng Li, Lewei Lu, Xiaogang Wang, Hongsheng Li, Gao Huang, Xizhou Zhu

In each denoising step, our method first decodes pixels from previous VQ tokens, then generates new VQ tokens from the decoded pixels.

Denoising Representation Learning

Boosting Offline Reinforcement Learning with Action Preference Query

no code implementations • 6 Jun 2023 • Qisen Yang, Shenzhi Wang, Matthieu Gaetan Lin, Shiji Song, Gao Huang

In particular, online fine-tuning has become a commonly used method to correct the erroneous estimates of out-of-distribution data learned in the offline training phase.

Autonomous Driving D4RL +2

Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models

1 code implementation • 25 May 2023 • Xingqian Xu, Jiayi Guo, Zhangyang Wang, Gao Huang, Irfan Essa, Humphrey Shi

Text-to-image (T2I) research has grown explosively in the past year, owing to the large-scale pre-trained diffusion models and many emerging personalization and editing approaches.

Conditional Text-to-Image Synthesis Image Generation +3

704

Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory

1 code implementation • 25 May 2023 • Xizhou Zhu, Yuntao Chen, Hao Tian, Chenxin Tao, Weijie Su, Chenyu Yang, Gao Huang, Bin Li, Lewei Lu, Xiaogang Wang, Yu Qiao, Zhaoxiang Zhang, Jifeng Dai

These agents, equipped with the logic and common sense capabilities of LLMs, can skillfully navigate complex, sparse-reward environments with text-based interactions.

Common Sense Reasoning Navigate +1

567

Slide-Transformer: Hierarchical Vision Transformer with Local Self-Attention

1 code implementation • CVPR 2023 • Xuran Pan, Tianzhu Ye, Zhuofan Xia, Shiji Song, Gao Huang

The self-attention mechanism has been a key factor in the recent progress of Vision Transformer (ViT), enabling adaptive feature extraction from global contexts.

feature selection Inductive Bias

148

Zero-shot Generative Model Adaptation via Image-specific Prompt Learning

1 code implementation • CVPR 2023 • Jiayi Guo, Chaofei Wang, You Wu, Eric Zhang, Kai Wang, Xingqian Xu, Shiji Song, Humphrey Shi, Gao Huang

Recently, CLIP-guided image synthesis has shown appealing performance on adapting a pre-trained source-domain generator to an unseen target domain.

Image Generation

Adaptive Rotated Convolution for Rotated Object Detection

1 code implementation • ICCV 2023 • Yifan Pu, Yiru Wang, Zhuofan Xia, Yizeng Han, Yulin Wang, Weihao Gan, Zidong Wang, Shiji Song, Gao Huang

In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images, and an efficient conditional computation mechanism is introduced to accommodate the large orientation variations of objects within an image.

Ranked #3 on Object Detection In Aerial Images on DOTA (using extra training data)

Object object-detection +2

Joint Representation Learning for Text and 3D Point Cloud

no code implementations • 18 Jan 2023 • Rui Huang, Xuran Pan, Henry Zheng, Haojun Jiang, Zhifeng Xie, Shiji Song, Gao Huang

During the pre-training stage, we establish the correspondence of images and point clouds based on the readily available RGB-D data and use contrastive learning to align the image and point cloud representations.

Contrastive Learning Instance Segmentation +4

Borrowing Knowledge From Pre-trained Language Model: A New Data-efficient Visual Learning Paradigm

1 code implementation • ICCV 2023 • Wenxuan Ma, Shuang Li, Jinming Zhang, Chi Harold Liu, Jingxuan Kang, Yulin Wang, Gao Huang

To address this issue, this paper presents a novel approach that seeks to leverage linguistic knowledge for data-efficient visual learning.

Domain Generalization Few-Shot Learning +1

Convolution-enhanced Evolving Attention Networks

1 code implementation • 16 Dec 2022 • Yujing Wang, Yaming Yang, Zhuo Li, Jiangang Bai, Mingliang Zhang, Xiangtai Li, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

To the best of our knowledge, this is the first work that explicitly models the layer-wise evolution of attention maps.

Image Classification Machine Translation +3

Deep Incubation: Training Large Models by Divide-and-Conquering

3 code implementations • ICCV 2023 • Zanlin Ni, Yulin Wang, Jiangwei Yu, Haojun Jiang, Yue Cao, Gao Huang

In this paper, we present Deep Incubation, a novel approach that enables the efficient and effective training of large models by dividing them into smaller sub-modules that can be trained separately and assembled seamlessly.

Image Segmentation object-detection +2

254

Boosted Dynamic Neural Networks

1 code implementation • 30 Nov 2022 • Haichao Yu, Haoxiang Li, Gang Hua, Gao Huang, Humphrey Shi

To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.

BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision

2 code implementations • CVPR 2023 • Chenyu Yang, Yuntao Chen, Hao Tian, Chenxin Tao, Xizhou Zhu, Zhaoxiang Zhang, Gao Huang, Hongyang Li, Yu Qiao, Lewei Lu, Jie Zhou, Jifeng Dai

The proposed method is verified with a wide spectrum of traditional and modern image backbones and achieves new SoTA results on the large-scale nuScenes dataset.

Ranked #5 on 3D Object Detection on Rope3D

3D Object Detection

2,870

Cross-Modal Adapter for Text-Video Retrieval

1 code implementation • 17 Nov 2022 • Haojun Jiang, Jianke Zhang, Rui Huang, Chunjiang Ge, Zanlin Ni, Jiwen Lu, Jie Zhou, Shiji Song, Gao Huang

However, as pre-trained models are scaling up, fully fine-tuning them on text-video retrieval datasets has a high risk of overfitting.

Retrieval Video Retrieval

Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information

1 code implementation • CVPR 2023 • Weijie Su, Xizhou Zhu, Chenxin Tao, Lewei Lu, Bin Li, Gao Huang, Yu Qiao, Xiaogang Wang, Jie Zhou, Jifeng Dai

It has been proved that combining multiple pre-training strategies and data from various modalities/sources can greatly boost the training of large-scale models.

Ranked #2 on Semantic Segmentation on ADE20K (using extra training data)

Image Classification Long-tailed Object Detection +3

EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones

1 code implementation • ICCV 2023 • Yulin Wang, Yang Yue, Rui Lu, Tianjiao Liu, Zhao Zhong, Shiji Song, Gao Huang

It is also effective for self-supervised learning (e.g., MAE).

Data Augmentation Self-Supervised Learning

Boosting Offline Reinforcement Learning via Data Rebalancing

no code implementations • 17 Oct 2022 • Yang Yue, Bingyi Kang, Xiao Ma, Zhongwen Xu, Gao Huang, Shuicheng Yan

Therefore, we propose a simple yet effective method to boost offline RL algorithms based on the observation that resampling a dataset keeps the distribution support unchanged.

D4RL Offline RL +2

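The observation in the entry above, that resampling a dataset keeps the distribution support unchanged, can be illustrated with a short sketch. It assumes, purely for illustration, that trajectories are re-drawn with probability proportional to their shifted return; the weighting scheme, function name, and toy data are hypothetical and should not be read as the paper's exact procedure.

```python
import random


def resample_by_return(trajectories, returns, num_samples, rng):
    """Draw trajectories with replacement, favoring higher returns."""
    lowest = min(returns)
    # Shift returns so every weight is positive (assumed weighting scheme);
    # the small constant keeps the worst trajectory selectable.
    weights = [r - lowest + 1e-6 for r in returns]
    return rng.choices(trajectories, weights=weights, k=num_samples)


# Hypothetical offline dataset: four trajectories with their returns.
trajectories = ["traj_a", "traj_b", "traj_c", "traj_d"]
returns = [1.0, 5.0, 2.0, 9.0]

rng = random.Random(0)
batch = resample_by_return(trajectories, returns, 1000, rng)

# Every sampled item already exists in the dataset, so no new
# (out-of-support) data is introduced; only the sampling frequencies change.
print(set(batch) <= set(trajectories))  # True
```

Because only the sampling frequencies change, a step like this can sit in front of an existing offline RL algorithm without altering the algorithm itself.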

Contrastive Language-Image Pre-Training with Knowledge Graphs

no code implementations • 17 Oct 2022 • Xuran Pan, Tianzhu Ye, Dongchen Han, Shiji Song, Gao Huang

Recent years have witnessed the fast development of large-scale pre-training frameworks that can extract multi-modal representations in a unified form and achieve promising performances when transferred to downstream tasks.

Knowledge Graphs

A Mixture of Surprises for Unsupervised Reinforcement Learning

1 code implementation • 13 Oct 2022 • Andrew Zhao, Matthieu Gaetan Lin, Yangguang Li, Yong-Jin Liu, Gao Huang

However, both strategies rely on a strong assumption: the entropy of the environment's dynamics is either high or low.

reinforcement-learning Reinforcement Learning (RL) +1

Efficient Knowledge Distillation from Model Checkpoints

1 code implementation • 12 Oct 2022 • Chaofei Wang, Qisen Yang, Rui Huang, Shiji Song, Gao Huang

Knowledge distillation is an effective approach to learn compact models (students) with the supervision of large and strong models (teachers).

Knowledge Distillation

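To make the student-teacher supervision in the entry above concrete, the sketch below shows the widely used temperature-softened distillation loss, where the student is trained to match the teacher's softened class probabilities. It is a generic illustration of knowledge distillation, not the checkpoint-based method of this particular paper; the function names, the temperature value, and the logits are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened probabilities, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0  # terms with zero teacher probability contribute nothing
    )
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
teacher = [2.0, 1.0, 0.1]
student_close = [2.0, 1.0, 0.1]
student_far = [0.1, 1.0, 2.0]

print(distillation_loss(student_close, teacher))  # zero when the logits match
print(distillation_loss(student_far, teacher))    # positive when they differ
```

In practice this term is usually combined with the ordinary cross-entropy loss on the ground-truth labels.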

Latency-aware Spatial-wise Dynamic Networks

2 code implementations • 12 Oct 2022 • Yizeng Han, Zhihang Yuan, Yifan Pu, Chenhao Xue, Shiji Song, Guangyu Sun, Gao Huang

The latency prediction model can efficiently estimate the inference latency of dynamic networks by simultaneously considering algorithms, scheduling strategies, and hardware properties.

Image Classification Instance Segmentation +4

AdaFocusV3: On Unified Spatial-temporal Dynamic Video Recognition

no code implementations • 27 Sep 2022 • Yulin Wang, Yang Yue, Xinhong Xu, Ali Hassani, Victor Kulikov, Nikita Orlov, Shiji Song, Humphrey Shi, Gao Huang

Recent research has revealed that reducing temporal redundancy and reducing spatial redundancy are both effective approaches to efficient video recognition, e.g., allocating the majority of computation to a task-relevant subset of frames or the most valuable image regions of each frame.

Video Recognition

ActiveNeRF: Learning where to See with Uncertainty Estimation

1 code implementation • 18 Sep 2022 • Xuran Pan, Zihang Lai, Shiji Song, Gao Huang

In this paper, we present a novel learning framework, ActiveNeRF, aiming to model a 3D scene with a constrained input budget.

Active Learning Novel View Synthesis

Learning to Weight Samples for Dynamic Early-exiting Networks

1 code implementation • 17 Sep 2022 • Yizeng Han, Yifan Pu, Zihang Lai, Chaofei Wang, Shiji Song, Junfen Cao, Wenhui Huang, Chao Deng, Gao Huang

Intuitively, easy samples, which generally exit early in the network during inference, should contribute more to training early classifiers.

Meta-Learning

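The notion of samples that "exit early" in the entry above can be sketched as confidence-thresholded inference: the input leaves at the first classifier whose top probability clears a threshold, and otherwise falls through to the last one. This is a generic sketch of early-exit inference, assuming a max-probability confidence rule; the threshold, function name, and probabilities are hypothetical, and the paper's sample-weighting scheme for training is not shown.

```python
def early_exit_predict(exit_probs, threshold=0.9):
    """Return (exit_index, predicted_class) for one sample.

    exit_probs lists the class probabilities produced by each exit,
    ordered from the shallowest classifier to the deepest. In a real
    network the deeper exits would only be computed when needed; they
    are precomputed here to keep the sketch short.
    """
    last = len(exit_probs) - 1
    for index, probs in enumerate(exit_probs):
        confidence = max(probs)
        if confidence >= threshold or index == last:
            return index, probs.index(confidence)
    return None  # only reached when exit_probs is empty


# Hypothetical outputs of three exits for an easy and a hard sample.
easy = [[0.95, 0.03, 0.02], [0.97, 0.02, 0.01], [0.99, 0.005, 0.005]]
hard = [[0.40, 0.35, 0.25], [0.55, 0.30, 0.15], [0.20, 0.70, 0.10]]

print(early_exit_predict(easy))  # (0, 0): leaves at the first exit
print(early_exit_predict(hard))  # (2, 1): falls through to the last exit
```

Under this rule the early classifiers mostly see easy samples at inference time, which is the intuition the entry states for weighting such samples more heavily when training those classifiers.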

Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning

no code implementations • 25 Jun 2022 • Yang Yue, Bingyi Kang, Zhongwen Xu, Gao Huang, Shuicheng Yan

Recently, visual representation learning has been shown to be effective and promising for boosting sample efficiency in RL.

Contrastive Learning Data Augmentation +5

DiSparse: Disentangled Sparsification for Multitask Model Compression

1 code implementation • CVPR 2022 • Xinglong Sun, Ali Hassani, Zhangyang Wang, Gao Huang, Humphrey Shi

We analyzed the pruning masks generated with DiSparse and observed strikingly similar sparse network architectures identified by each task, even before training starts.

Model Compression

Siamese Image Modeling for Self-Supervised Vision Representation Learning

2 code implementations • CVPR 2023 • Chenxin Tao, Xizhou Zhu, Weijie Su, Gao Huang, Bin Li, Jie Zhou, Yu Qiao, Xiaogang Wang, Jifeng Dai

Driven by these analyses, we propose Siamese Image Modeling (SiameseIM), which predicts the dense representations of an augmented view, based on another masked view from the same image but with different augmentations.

Representation Learning Self-Supervised Learning +1

Provable General Function Class Representation Learning in Multitask Bandits and MDPs

no code implementations • 31 May 2022 • Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang

While multitask representation learning has become a popular approach in reinforcement learning (RL) to boost the sample efficiency, the theoretical understanding of why and how it works is still limited.

Multi-Armed Bandits Reinforcement Learning (RL) +1

SePiCo: Semantic-Guided Pixel Contrast for Domain Adaptive Semantic Segmentation

1 code implementation • 19 Apr 2022 • Binhui Xie, Shuang Li, Mingjia Li, Chi Harold Liu, Gao Huang, Guoren Wang

Domain adaptive semantic segmentation attempts to make satisfactory dense predictions on an unlabeled target domain by utilizing the supervised model trained on a labeled source domain.

Ranked #4 on Semantic Segmentation on GTAV-to-Cityscapes Labels

Semantic Segmentation Synthetic-to-Real Translation

110

Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding

1 code implementation • CVPR 2022 • Haojun Jiang, Yuanze Lin, Dongchen Han, Shiji Song, Gao Huang

Our method leverages an off-the-shelf object detector to identify visual objects from unlabeled images, and then language queries for these objects are obtained in an unsupervised fashion with a pseudo-query generation module.

Language Modelling Natural Language Queries +1

137

Learn From the Past: Experience Ensemble Knowledge Distillation

no code implementations • 25 Feb 2022 • Chaofei Wang, Shaowei Zhang, Shiji Song, Gao Huang

We uniformly save a moderate number of intermediate models from the training process of the teacher model, and then integrate the knowledge of these intermediate models using an ensemble technique.

Knowledge Distillation Transfer Learning

Domain Adaptation via Prompt Learning

1 code implementation • 14 Feb 2022 • Chunjiang Ge, Rui Huang, Mixue Xie, Zihang Lai, Shiji Song, Shuang Li, Gao Huang

Unsupervised domain adaptation (UDA) aims to adapt models learned from a well-annotated source domain to a target domain, where only unlabeled samples are given.

Domain Adaptation

Glance and Focus Networks for Dynamic Visual Recognition

1 code implementation • 9 Jan 2022 • Gao Huang, Yulin Wang, Kangchen Lv, Haojun Jiang, Wenhui Huang, Pengfei Qi, Shiji Song

Spatial redundancy widely exists in visual recognition tasks, i.e., discriminative features in an image or video frame usually correspond to only a subset of pixels, while the remaining regions are irrelevant to the task at hand.

Image Classification Video Recognition

180

Vision Transformer with Deformable Attention

2 code implementations • CVPR 2022 • Zhuofan Xia, Xuran Pan, Shiji Song, Li Erran Li, Gao Huang

On the one hand, using dense attention, e.g., in ViT, leads to excessive memory and computational cost, and features can be influenced by irrelevant parts that are beyond the region of interest.

Ranked #107 on Object Detection on COCO test-dev

Image Classification Object Detection +1

694

AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition

1 code implementation • CVPR 2022 • Yulin Wang, Yang Yue, Yuanze Lin, Haojun Jiang, Zihang Lai, Victor Kulikov, Nikita Orlov, Humphrey Shi, Gao Huang

Recent works have shown that the computational efficiency of video recognition can be significantly improved by reducing the spatial redundancy.

Computational Efficiency Video Recognition

Exploring the Equivalence of Siamese Self-Supervised Learning via A Unified Gradient Framework

1 code implementation • CVPR 2022 • Chenxin Tao, Honghui Wang, Xizhou Zhu, Jiahua Dong, Shiji Song, Gao Huang, Jifeng Dai

These methods appear quite different in their loss functions, which are designed from various motivations.

Contrastive Learning Self-Supervised Learning

Searching Parameterized AP Loss for Object Detection

1 code implementation • NeurIPS 2021 • Chenxin Tao, Zizhang Li, Xizhou Zhu, Gao Huang, Yong Liu, Jifeng Dai

In this paper, we propose Parameterized AP Loss, where parameterized functions are introduced to substitute the non-differentiable components in the AP calculation.

Object object-detection +1

Assessing a Single Image in Reference-Guided Image Synthesis

no code implementations • 8 Dec 2021 • Jiayi Guo, Chaoqun Du, Jiangshan Wang, Huijuan Huang, Pengfei Wan, Gao Huang

For Reference-guided Image Synthesis (RIS) tasks, i.e., rendering a source image in the style of another reference image, where assessing the quality of a single generated image is crucial, these metrics are not applicable.

Image Generation

Temporal-Spatial Causal Interpretations for Vision-Based Reinforcement Learning

no code implementations • 6 Dec 2021 • Wenjie Shi, Gao Huang, Shiji Song, Cheng Wu

The TSCI model builds on the formulation of temporal causality, which reflects the temporal causal relations between the sequential observations and decisions of an RL agent.

Causal Discovery Decision Making +2

On the Integration of Self-Attention and Convolution

2 code implementations • CVPR 2022 • Xuran Pan, Chunjiang Ge, Rui Lu, Shiji Song, Guanfu Chen, Zeyi Huang, Gao Huang

In this paper, we show that there exists a strong underlying relation between them, in the sense that the bulk of computations of these two paradigms are in fact done with the same operation.

Representation Learning

362

Fine-Grained Few Shot Learning with Foreground Object Transformation

no code implementations • 13 Sep 2021 • Chaofei Wang, Shiji Song, Qisen Yang, Xiang Li, Gao Huang

As a data augmentation method, FOT can be conveniently applied to any existing few-shot learning algorithm and greatly improve its performance on FG-FSL tasks.

Data Augmentation Few-Shot Learning +2

CAM-loss: Towards Learning Spatially Discriminative Feature Representations

no code implementations • ICCV 2021 • Chaofei Wang, Jiayu Xiao, Yizeng Han, Qisen Yang, Shiji Song, Gao Huang

The backbone of a traditional CNN classifier is generally considered a feature extractor, followed by a linear layer that performs the classification.

Few-Shot Learning Image Classification +2

Integrating Large Circular Kernels into CNNs through Neural Architecture Search

1 code implementation • 6 Jul 2021 • Kun He, Chao Li, Yixiao Yang, Gao Huang, John E. Hopcroft

We first propose a simple yet efficient implementation of the convolution using circular kernels, and empirically show the significant advantages of large circular kernels over the counterpart square kernels.

Data Augmentation Neural Architecture Search

Paper
Code

On the Power of Multitask Representation Learning in Linear MDP

no code implementations • 15 Jun 2021 • Rui Lu, Gao Huang, Simon S. Du

We first discover a \emph{Least-Activated-Feature-Abundance} (LAFA) criterion, denoted as $\kappa$, with which we prove that a straightforward least-square algorithm learns a policy which is $\tilde{O}(H^2\sqrt{\frac{\mathcal{C}(\Phi)^2 \kappa d}{NT}+\frac{\kappa d}{n}})$ sub-optimal.

Reinforcement Learning (RL) Representation Learning

Paper
Add Code

Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning

1 code implementation • NeurIPS 2021 • Yiqin Yang, Xiaoteng Ma, Chenghao Li, Zewu Zheng, Qiyuan Zhang, Gao Huang, Jun Yang, Qianchuan Zhao

Moreover, we extend ICQ to multi-agent tasks by decomposing the joint-policy under the implicit constraint.

Multi-agent Reinforcement Learning Offline RL +5

Paper
Code

Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition

2 code implementations • NeurIPS 2021 • Yulin Wang, Rui Huang, Shiji Song, Zeyi Huang, Gao Huang

Inspired by this phenomenon, we propose a Dynamic Transformer to automatically configure a proper number of tokens for each input image.

Ranked #29 on Image Classification on CIFAR-100 (using extra training data)

Computational Efficiency Image Classification

242

Paper
Code

Adaptive Focus for Efficient Video Recognition

1 code implementation • ICCV 2021 • Yulin Wang, Zhaoxi Chen, Haojun Jiang, Shiji Song, Yizeng Han, Gao Huang

In this paper, we explore the spatial redundancy in video recognition with the aim to improve the computational efficiency.

Computational Efficiency Video Recognition

120

Paper
Code

CondenseNet V2: Sparse Feature Reactivation for Deep Networks

1 code implementation • CVPR 2021 • Le Yang, Haojun Jiang, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang, Qi Tian

Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency.

Computational Efficiency Image Classification +2

Paper
Code

AutoLoss-Zero: Searching Loss Functions from Scratch for Generic Tasks

no code implementations • CVPR 2022 • Hao Li, Tianwen Fu, Jifeng Dai, Hongsheng Li, Gao Huang, Xizhou Zhu

However, the automatic design of loss functions for generic tasks with various evaluation metrics remains under-investigated.

Paper
Add Code

Generalized Domain Conditioned Adaptation Network

1 code implementation • 23 Mar 2021 • Shuang Li, Binhui Xie, Qiuxia Lin, Chi Harold Liu, Gao Huang, Guoren Wang

Domain Adaptation (DA) attempts to transfer knowledge learned in the labeled source domain to the unlabeled but related target domain without requiring large amounts of target supervision.

Attribute Domain Adaptation

Paper
Code

Evolving Attention with Residual Convolutions

2 code implementations • 20 Feb 2021 • Yujing Wang, Yaming Yang, Jiangang Bai, Mingliang Zhang, Jing Bai, Jing Yu, Ce Zhang, Gao Huang, Yunhai Tong

In this paper, we propose a novel and generic mechanism based on evolving attention to improve the performance of transformers.

Image Classification Machine Translation +2

Paper
Code

Dynamic Neural Networks: A Survey

no code implementations • 9 Feb 2021 • Yizeng Han, Gao Huang, Shiji Song, Le Yang, Honghui Wang, Yulin Wang

Dynamic neural networks are an emerging research topic in deep learning.

Computational Efficiency Decision Making

Paper
Add Code

Revisiting Locally Supervised Learning: an Alternative to End-to-end Training

1 code implementation • 26 Jan 2021 • Yulin Wang, Zanlin Ni, Shiji Song, Le Yang, Gao Huang

Due to the need to store the intermediate activations for back-propagation, end-to-end (E2E) training of deep networks usually suffers from a high GPU memory footprint.

Paper
Code

Revisiting Locally Supervised Training of Deep Neural Networks

no code implementations • ICLR 2021 • Yulin Wang, Zanlin Ni, Shiji Song, Le Yang, Gao Huang

As InfoPro loss is difficult to compute in its original form, we derive a feasible upper bound as a surrogate optimization objective, yielding a simple but effective algorithm.

Paper
Add Code

Robust Offline Reinforcement Learning from Low-Quality Data

no code implementations • 1 Jan 2021 • Wenjie Shi, Tianchi Cai, Shiji Song, Lihong Gu, Jinjie Gu, Gao Huang

We theoretically show that AdaPT produces a tight upper bound on the distributional deviation between the learned policy and the behavior policy, and this upper bound is the minimum requirement to guarantee policy improvement at each iteration.

Continuous Control Offline RL +2

Paper
Add Code

A Unified Framework for Convolution-based Graph Neural Networks

no code implementations • 1 Jan 2021 • Xuran Pan, Shiji Song, Gao Huang

In this paper, we take a step forward to establish a unified framework for convolution-based graph neural networks, by formulating the basic graph convolution operation as an optimization problem in the graph Fourier space.

Paper
Add Code

3D Object Detection with Pointformer

1 code implementation • CVPR 2021 • Xuran Pan, Zhuofan Xia, Shiji Song, Li Erran Li, Gao Huang

In this paper, we propose Pointformer, a Transformer backbone designed for 3D point clouds to learn features effectively.

3D Object Detection Object +2

148

Paper
Code

Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving

1 code implementation • ICCV 2021 • Mu Cai, Hong Zhang, Huijuan Huang, Qichuan Geng, Yixuan Li, Gao Huang

Image-to-image translation has been revolutionized with GAN-based methods.

Image-to-Image Translation Translation

Paper
Code

Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation

1 code implementation • ICLR 2021 • Hao Li, Chenxin Tao, Xizhou Zhu, Xiaogang Wang, Gao Huang, Jifeng Dai

In this paper, we propose to automate the design of metric-specific loss functions by searching differentiable surrogate losses for each metric.

Semantic Segmentation

Paper
Code

Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification

1 code implementation • NeurIPS 2020 • Yulin Wang, Kangchen Lv, Rui Huang, Shiji Song, Le Yang, Gao Huang

The accuracy of deep convolutional neural networks (CNNs) generally improves when fueled with high resolution images.

Computational Efficiency General Classification +1

180

Paper
Code

Regularizing Deep Networks with Semantic Data Augmentation

1 code implementation • 21 Jul 2020 • Yulin Wang, Gao Huang, Shiji Song, Xuran Pan, Yitong Xia, Cheng Wu

The proposed method is inspired by the intriguing property that deep networks are effective in learning linearized features, i.e., certain directions in the deep feature space correspond to meaningful semantic transformations, e.g., changing the background or view angle of an object.

Data Augmentation

575

Paper
Code

Meta-Semi: A Meta-learning Approach for Semi-supervised Learning

no code implementations • 5 Jul 2020 • Yulin Wang, Jiayi Guo, Shiji Song, Gao Huang

In this paper, we propose a novel meta-learning based SSL algorithm (Meta-Semi) that requires tuning only one additional hyper-parameter, compared with a standard supervised deep learning algorithm, to achieve competitive performance under various conditions of SSL.

Meta-Learning

Paper
Add Code

Domain Conditioned Adaptation Network

1 code implementation • 14 May 2020 • Shuang Li, Chi Harold Liu, Qiuxia Lin, Binhui Xie, Zhengming Ding, Gao Huang, Jian Tang

Most existing deep DA models only focus on aligning feature representations of task-specific layers across domains while integrating a totally shared convolutional architecture for source and target.

Domain Adaptation

Paper
Code

Deep Residual Correction Network for Partial Domain Adaptation

1 code implementation • 10 Apr 2020 • Shuang Li, Chi Harold Liu, Qiuxia Lin, Qi Wen, Limin Su, Gao Huang, Zhengming Ding

Deep domain adaptation methods have achieved appealing performance by learning transferable representations from a well-labeled source domain to a different but related unlabeled target domain.

Partial Domain Adaptation

Paper
Code

Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation

1 code implementation • ECCV 2020 • Zhenda Xie, Zheng Zhang, Xizhou Zhu, Gao Huang, Stephen Lin

In the feature maps of CNNs, there commonly exists considerable spatial redundancy that leads to much repetitive processing.

Paper
Code

Resolution Adaptive Networks for Efficient Inference

2 code implementations • CVPR 2020 • Le Yang, Yizeng Han, Xi Chen, Shiji Song, Jifeng Dai, Gao Huang

Adaptive inference is an effective mechanism to achieve a dynamic tradeoff between accuracy and computational cost in deep networks.

143

Paper
Code

Self-Supervised Discovering of Interpretable Features for Reinforcement Learning

1 code implementation • 16 Mar 2020 • Wenjie Shi, Gao Huang, Shiji Song, Zhuoyuan Wang, Tingyu Lin, Cheng Wu

Deep reinforcement learning (RL) has recently led to many breakthroughs on a range of complex control tasks.

Atari Games Decision Making +2

Paper
Code

Tighter Bound Estimation of Sensitivity Analysis for Incremental and Decremental Data Modification

no code implementations • 6 Mar 2020 • Kaichen Zhou, Shiji Song, Gao Huang, Wu Cheng, Quan Zhou

Specifically, the proposed algorithm can be used to estimate the upper and lower bounds of the updated classifier's coefficient matrix with a low computational complexity related to the size of the updated dataset.

Incremental Learning L2 Regularization

Paper
Add Code

Cross-Iteration Batch Normalization

2 code implementations • CVPR 2021 • Zhuliang Yao, Yue Cao, Shuxin Zheng, Gao Huang, Stephen Lin

We thus compensate for the network weight changes via a proposed technique based on Taylor polynomials, so that the statistics can be accurately estimated and batch normalization can be effectively applied.

Ranked #180 on Object Detection on COCO test-dev

Image Classification object-detection +1

129

Paper
Code

FSD-10: A Dataset for Competitive Sports Content Analysis

no code implementations • 9 Feb 2020 • Shenlan Liu, Xiang Liu, Gao Huang, Lin Feng, Lianyu Hu, Dong Jiang, Aibin Zhang, Yang Liu, Hong Qiao

To promote research on action recognition from competitive sports video clips, we introduce a Figure Skating Dataset (FSD-10) for fine-grained sports content analysis.

Action Recognition Benchmarking +1

Paper
Add Code

Gated Path Selection Network for Semantic Segmentation

no code implementations • 19 Jan 2020 • Qichuan Geng, Hong Zhang, Xiaojuan Qi, Ruigang Yang, Zhong Zhou, Gao Huang

Semantic segmentation is a challenging task that needs to handle large scale variations, deformations and different viewpoints.

Segmentation Semantic Segmentation

Paper
Add Code

Convolutional Networks with Dense Connectivity

no code implementations • 8 Jan 2020 • Gao Huang, Zhuang Liu, Geoff Pleiss, Laurens van der Maaten, Kilian Q. Weinberger

Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output.

Object Recognition

Paper
Add Code

Implicit Semantic Data Augmentation for Deep Networks

1 code implementation • NeurIPS 2019 • Yulin Wang, Xuran Pan, Shiji Song, Hong Zhang, Cheng Wu, Gao Huang

Our work is motivated by the intriguing property that deep networks are surprisingly good at linearizing features, such that certain directions in the deep feature space correspond to meaningful semantic transformations, e.g., adding sunglasses or changing backgrounds.

Image Augmentation

575

Paper
Code

Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning

1 code implementation • NeurIPS 2019 • Wenjie Shi, Shiji Song, Hui Wu, Ya-Chu Hsu, Cheng Wu, Gao Huang

To tackle this problem, we propose a general acceleration method for model-free, off-policy deep RL algorithms by drawing on the idea underlying regularized Anderson acceleration (RAA), which is an effective approach to accelerating the solving of fixed point problems with perturbations.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Improved Techniques for Training Adaptive Deep Networks

2 code implementations • ICCV 2019 • Hao Li, Hong Zhang, Xiaojuan Qi, Ruigang Yang, Gao Huang

Adaptive inference is a promising technique to improve the computational efficiency of deep models at test time.

Computational Efficiency Knowledge Distillation

Paper
Code

Asymmetric Valleys: Beyond Sharp and Flat Local Minima

1 code implementation • NeurIPS 2019 • Haowei He, Gao Huang, Yang Yuan

Specifically, at a local minimum there exist many asymmetric directions such that the loss increases abruptly along one side, and slowly along the opposite side--we formally define such minima as asymmetric valleys.

Paper
Code

Gradient Boosted Feature Selection

no code implementations • 13 Jan 2019 • Zhixiang Eddie Xu, Gao Huang, Kilian Q. Weinberger, Alice X. Zheng

A feature selection algorithm should ideally satisfy four conditions: reliably extract relevant features; be able to identify non-linear feature interactions; scale linearly with the number of features and dimensions; allow the incorporation of known sparsity structure.

feature selection

Paper
Add Code

Domain-Aware SE Network for Sketch-based Image Retrieval with Multiplicative Euclidean Margin Softmax

1 code implementation • 11 Dec 2018 • Peng Lu, Gao Huang, Hangyu Lin, Wenming Yang, Guodong Guo, Yanwei Fu

This paper proposes a novel approach for Sketch-Based Image Retrieval (SBIR), for which the key is to bridge the gap between sketches and photos in terms of the data representation.

Retrieval Sketch-Based Image Retrieval

Paper
Code

Anytime Stereo Image Depth Estimation on Mobile Devices

3 code implementations • 26 Oct 2018 • Yan Wang, Zihang Lai, Gao Huang, Brian H. Wang, Laurens van der Maaten, Mark Campbell, Kilian Q. Weinberger

Many applications of stereo depth estimation in robotics require the generation of accurate disparity maps in real time under significant computational constraints.

Ranked #1 on Stereo Depth Estimation on KITTI2012

Stereo Depth Estimation

485

Paper
Code

Rethinking the Value of Network Pruning

2 code implementations • ICLR 2019 • Zhuang Liu, Ming-Jie Sun, Tinghui Zhou, Gao Huang, Trevor Darrell

Our observations are consistent for multiple network architectures, datasets, and tasks, which imply that: 1) training a large, over-parameterized model is often not necessary to obtain an efficient final model, 2) learned "important" weights of the large model are typically not useful for the small pruned model, 3) the pruned architecture itself, rather than a set of inherited "important" weights, is more crucial to the efficiency in the final model, which suggests that in some cases pruning can be useful as an architecture search paradigm.

Network Pruning Neural Architecture Search

1,496

Paper
Code

Interpretable Spatio-temporal Attention for Video Action Recognition

no code implementations • 1 Oct 2018 • Lili Meng, Bo Zhao, Bo Chang, Gao Huang, Wei Sun, Frederick Tung, Leonid Sigal

Inspired by the observation that humans are able to process videos efficiently by only paying attention where and when it is needed, we propose an interpretable and easy plug-in spatial-temporal attention mechanism for video action recognition.

Action Recognition Temporal Action Localization

Paper
Add Code

An empirical study on evaluation metrics of generative adversarial networks

4 code implementations • ICLR 2018 • Qiantong Xu, Gao Huang, Yang Yuan, Chuan Guo, Yu Sun, Felix Wu, Kilian Weinberger

Evaluating generative adversarial networks (GANs) is inherently challenging.

365

Paper
Code

Resource Aware Person Re-identification across Multiple Resolutions

1 code implementation • CVPR 2018 • Yan Wang, Lequn Wang, Yurong You, Xu Zou, Vincent Chen, Serena Li, Gao Huang, Bharath Hariharan, Kilian Q. Weinberger

Not all people are equally easy to identify: color statistics might be enough for some cases while others might require careful reasoning about high- and low-level details.

Ranked #12 on Person Re-Identification on CUHK03 detected

Person Re-Identification

Paper
Code

Horizontal Pyramid Matching for Person Re-identification

1 code implementation • 14 Apr 2018 • Yang Fu, Yunchao Wei, Yuqian Zhou, Honghui Shi, Gao Huang, Xinchao Wang, Zhiqiang Yao, Thomas Huang

Despite the remarkable recent progress, person re-identification (Re-ID) approaches are still suffering from the failure cases where the discriminative body parts are missing.

Ranked #55 on Person Re-Identification on DukeMTMC-reID

Person Re-Identification

Paper
Code

CondenseNet: An Efficient DenseNet using Learned Group Convolutions

6 code implementations • CVPR 2018 • Gao Huang, Shichen Liu, Laurens van der Maaten, Kilian Q. Weinberger

It combines dense connectivity with a novel module called learned group convolution.

2,917

Paper
Code

Learning Efficient Convolutional Networks through Network Slimming

12 code implementations • ICCV 2017 • Zhuang Liu, Jianguo Li, Zhiqiang Shen, Gao Huang, Shoumeng Yan, Chang-Shui Zhang

For VGGNet, a multi-pass version of network slimming gives a 20x reduction in model size and a 5x reduction in computing operations.

Image Classification Neural Architecture Search

2,306

Paper
Code

Memory-Efficient Implementation of DenseNets

6 code implementations • 21 Jul 2017 • Geoff Pleiss, Danlu Chen, Gao Huang, Tongcheng Li, Laurens van der Maaten, Kilian Q. Weinberger

A 264-layer DenseNet (73M parameters), which previously would have been infeasible to train, can now be trained on a single workstation with 8 NVIDIA Tesla M40 GPUs.

1,896

Paper
Code

Snapshot Ensembles: Train 1, get M for free

10 code implementations • 1 Apr 2017 • Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, John E. Hopcroft, Kilian Q. Weinberger

In this paper, we propose a method to obtain the seemingly contradictory goal of ensembling multiple neural networks at no additional training cost.

233

Paper
Code

Multi-Scale Dense Networks for Resource Efficient Image Classification

7 code implementations • ICLR 2018 • Gao Huang, Danlu Chen, Tianhong Li, Felix Wu, Laurens van der Maaten, Kilian Q. Weinberger

In this paper we investigate image classification with computational resource limits at test time.

General Classification Image Classification

2,917

Paper
Code

Supervised Word Mover's Distance

1 code implementation • NeurIPS 2016 • Gao Huang, Chuan Guo, Matt J. Kusner, Yu Sun, Fei Sha, Kilian Q. Weinberger

Accurately measuring the similarity between text documents lies at the core of many real world applications of machine learning.

Document Classification General Classification +2

Paper
Code

Densely Connected Convolutional Networks

143 code implementations • CVPR 2017 • Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger

Ranked #1 on Classification on XImageNet-12

Breast Tumour Classification Crowd Counting +8

15,442

Paper
Code

Deep Networks with Stochastic Depth

17 code implementations • 30 Mar 2016 • Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Weinberger

With stochastic depth we can increase the depth of residual networks even beyond 1200 layers and still yield meaningful improvements in test error (4.91% on CIFAR-10).

Ranked #21 on Image Classification on SVHN

Image Classification

29,758

Paper
Code
