Search Results for author: Heng Wang

Found 70 papers, 30 papers with code

Proposal-based Video Completion

no code implementations • ECCV 2020 • Yuan-Ting Hu, Heng Wang, Nicolas Ballas, Kristen Grauman, Alexander G. Schwing

Video inpainting is an important technique for a wide variety of applications from video content editing to video restoration.

Image Inpainting object-detection +4


Dance Any Beat: Blending Beats with Visuals in Dance Video Generation

no code implementations • 15 May 2024 • Xuanchen Wang, Heng Wang, Dongnan Liu, Weidong Cai

We introduce a 2D motion-music alignment score (2D-MM Align) for quantitative assessment.

Image to Video Generation Optical Flow Estimation


Boosting 3D Neuron Segmentation with 2D Vision Transformer Pre-trained on Natural Images

no code implementations • 4 May 2024 • Yik San Cheng, Runkai Zhao, Heng Wang, Hanchuan Peng, Weidong Cai

To address this limitation, we aim to distill the consensus knowledge from massive natural image data to aid the segmentation model in learning the complex neuron structures.

Segmentation


HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

no code implementations • 15 Apr 2024 • Mude Hui, Siwei Yang, Bingchen Zhao, Yichun Shi, Heng Wang, Peng Wang, Yuyin Zhou, Cihang Xie

This study introduces HQ-Edit, a high-quality instruction-based image editing dataset with around 200,000 edits.

Attribute


Digital Twin Channel for 6G: Concepts, Architectures and Potential Applications

no code implementations • 19 Mar 2024 • Heng Wang, Jianhua Zhang, Gaofeng Nie, Li Yu, Zhiqiang Yuan, Tongjie Li, Jialin Wang, Guangyi Liu

Digital twin channel (DTC) is the real-time mapping of a wireless channel from the physical world to the digital world, which is expected to provide significant performance enhancements for the sixth-generation (6G) air-interface design.


MMoE: Robust Spoiler Detection with Multi-modal Information and Domain-aware Mixture-of-Experts

no code implementations • 8 Mar 2024 • Zinan Zeng, Sen Ye, Zijian Cai, Heng Wang, YuHan Liu, Haokai Zhang, Minnan Luo

For instance, the metadata and the corresponding user's information of a review could be helpful.

Domain Generalization


Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters

no code implementations • 5 Mar 2024 • Weizhi Wang, Khalil Mrini, Linjie Yang, Sateesh Kumar, Yu Tian, Xifeng Yan, Heng Wang

Our MLM filter can generalize to different models and tasks, and be used as a drop-in replacement for CLIPScore.


Hy-DAT: A Tool to Address Hydropower Modeling Gaps Using Interdependency, Efficiency Curves, and Unit Dispatch Models

no code implementations • 28 Feb 2024 • Dewei Wang, Bhaskar Mitra, Sameer Nekkalapu, Sohom Datta, Bibi Matthew, Rounak Meyur, Heng Wang, Slaven Kincic

As the power system continues to be flooded with intermittent resources, it becomes more important to accurately assess the role of hydro and its impact on the power grid.


DELL: Generating Reactions and Explanations for LLM-Based Misinformation Detection

no code implementations • 16 Feb 2024 • Herun Wan, Shangbin Feng, Zhaoxuan Tan, Heng Wang, Yulia Tsvetkov, Minnan Luo

Large language models are limited by challenges in factuality and hallucinations to be directly employed off-the-shelf for judging the veracity of news articles, where factual accuracy is paramount.

Misinformation


Video Recognition in Portrait Mode

1 code implementation • 21 Dec 2023 • Mingfei Han, Linjie Yang, Xiaojie Jin, Jiashi Feng, Xiaojun Chang, Heng Wang

While existing datasets mainly comprise landscape mode videos, our paper seeks to introduce portrait mode videos to the research community and highlight the unique challenges associated with this video format.

Data Augmentation Video Recognition


Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos

1 code implementation • 16 Dec 2023 • Mingfei Han, Linjie Yang, Xiaojun Chang, Heng Wang

A human needs to both capture the event in every shot and associate these events with one another to understand the story behind them.

Ranked #1 on video narration captioning on Shot2Story20K

Video Captioning video narration captioning +4


Vista-LLaMA: Reliable Video Narrator via Equal Distance to Visual Tokens

no code implementations • 12 Dec 2023 • Fan Ma, Xiaojie Jin, Heng Wang, Yuchen Xian, Jiashi Feng, Yi Yang

This amplifies the effect of visual tokens on text generation, especially when the relative distance is longer between visual and text tokens.

Ranked #6 on Zero-Shot Video Question Answer on MSRVTT-QA

Hallucination Position +2


InfiMM-Eval: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models

no code implementations • 20 Nov 2023 • Xiaotian Han, Quanzeng You, Yongfei Liu, Wentao Chen, Huangjie Zheng, Khalil Mrini, Xudong Lin, Yiqi Wang, Bohan Zhai, Jianbo Yuan, Heng Wang, Hongxia Yang

To mitigate this issue, we manually curate a benchmark dataset specifically designed for MLLMs, with a focus on complex reasoning tasks.


GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks

no code implementations • 2 Nov 2023 • Xinlu Zhang, Yujie Lu, Weizhi Wang, An Yan, Jun Yan, Lianke Qin, Heng Wang, Xifeng Yan, William Yang Wang, Linda Ruth Petzold

Automatically evaluating vision-language tasks is challenging, especially when it comes to reflecting human judgments due to limitations in accounting for fine-grained details.

Image Generation


Consistent-1-to-3: Consistent Image to 3D View Synthesis via Geometry-aware Diffusion Models

no code implementations • 4 Oct 2023 • Jianglong Ye, Peng Wang, Kejie Li, Yichun Shi, Heng Wang

Specifically, we decompose the NVS task into two stages: (i) transforming observed regions to a novel view, and (ii) hallucinating unseen regions.

Image to 3D Novel View Synthesis


Resolving Knowledge Conflicts in Large Language Models

1 code implementation • 2 Oct 2023 • Yike Wang, Shangbin Feng, Heng Wang, Weijia Shi, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov

To this end, we introduce KNOWLEDGE CONFLICT, an evaluation framework for simulating contextual knowledge conflicts and quantitatively evaluating to what extent LLMs achieve these goals.


The Devil is in the Details: A Deep Dive into the Rabbit Hole of Data Filtering

no code implementations • 27 Sep 2023 • Haichao Yu, Yu Tian, Sateesh Kumar, Linjie Yang, Heng Wang

DataComp is a new benchmark dedicated to evaluating different methods for data filtering.


Advancements in 3D Lane Detection Using LiDAR Point Clouds: From Data Collection to Model Development

1 code implementation • 24 Sep 2023 • Runkai Zhao, Yuwen Heng, Heng Wang, Yuanda Gao, Shilei Liu, Changhao Yao, Jiawen Chen, Weidong Cai

Advanced Driver-Assistance Systems (ADAS) have successfully integrated learning-based techniques into vehicle perception and decision-making.

3D Lane Detection Decision Making


Dataset Condensation via Generative Model

no code implementations • 14 Sep 2023 • David Junhao Zhang, Heng Wang, Chuhui Xue, Rui Yan, Wenqing Zhang, Song Bai, Mike Zheng Shou

Dataset condensation aims to condense a large dataset with a lot of training samples into a small set.

Dataset Condensation


V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models

1 code implementation • 18 Aug 2023 • Heng Wang, Jianbo Ma, Santiago Pascual, Richard Cartwright, Weidong Cai

In this paper, we propose a lightweight solution to this problem by leveraging foundation models, specifically CLIP, CLAP, and AudioLDM.

Audio Generation


Exploring Annotation-free Image Captioning with Retrieval-augmented Pseudo Sentence Generation

1 code implementation • 27 Jul 2023 • Zhiyuan Li, Dongnan Liu, Heng Wang, Chaoyi Zhang, Weidong Cai

We further show that with a simple extension, the generated pseudo sentences can be deployed as weak supervision to boost the 1% semi-supervised image caption benchmark up to 93.4 CIDEr score (+8.9), which showcases the versatility and effectiveness of our approach.

Image Captioning Model Optimization +2


Why Is Prompt Tuning for Vision-Language Models Robust to Noisy Labels?

1 code implementation • ICCV 2023 • Cheng-En Wu, Yu Tian, Haichao Yu, Heng Wang, Pedro Morgado, Yu Hen Hu, Linjie Yang

Vision-language models such as CLIP learn a generic text-image embedding from large-scale training data.

Image Classification Language Modelling


Exploring the Role of Audio in Video Captioning

no code implementations • 21 Jun 2023 • YuHan Shen, Linjie Yang, Longyin Wen, Haichao Yu, Ehsan Elhamifar, Heng Wang

Recent focus in video captioning has been on designing architectures that can consume both video and text modalities, and using large-scale video datasets with text transcripts for pre-training, such as HowTo100M.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2


Can Language Models Solve Graph Problems in Natural Language?

2 code implementations • NeurIPS 2023 • Heng Wang, Shangbin Feng, Tianxing He, Zhaoxuan Tan, Xiaochuang Han, Yulia Tsvetkov

We then propose Build-a-Graph Prompting and Algorithmic Prompting, two instruction-based approaches to enhance LLMs in solving natural language graph problems.

In-Context Learning Knowledge Probing +2


Detecting Spoilers in Movie Reviews with External Movie Knowledge and User Networks

1 code implementation • 22 Apr 2023 • Heng Wang, Wenqian Zhang, Yuyang Bai, Zhaoxuan Tan, Shangbin Feng, Qinghua Zheng, Minnan Luo

We then propose MVSD, a novel Multi-View Spoiler Detection framework that takes into account the external knowledge about movies and user activities on movie review platforms.


Progressive Volume Distillation with Active Learning for Efficient NeRF Architecture Conversion

1 code implementation • 8 Apr 2023 • Shuangkang Fang, Yufeng Wang, Yi Yang, Weixin Xu, Heng Wang, Wenrui Ding, Shuchang Zhou

For instance, PVD-AL can distill an MLP-based model from a Hashtables-based model at a 10~20X faster speed and 0.8dB~2dB higher PSNR than training the MLP-based model from scratch.

3D Reconstruction Novel View Synthesis

182


R2Former: Unified Retrieval and Reranking Transformer for Place Recognition

no code implementations • 6 Apr 2023 • Sijie Zhu, Linjie Yang, Chen Chen, Mubarak Shah, Xiaohui Shen, Heng Wang

Visual Place Recognition (VPR) estimates the location of query images by matching them with images in a reference database.

Feature Correlation Retrieval +1


PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters

1 code implementation • CVPR 2023 • Shuhong Chen, Kevin Zhang, Yichun Shi, Heng Wang, Yiheng Zhu, Guoxian Song, Sizhe An, Janus Kristjansson, Xiao Yang, Matthias Zwicker

We propose PAniC-3D, a system to reconstruct stylized 3D character heads directly from illustrated (p)ortraits of (ani)me (c)haracters.

3D Architecture 3D Reconstruction +1

703


Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision

no code implementations • 9 Mar 2023 • Tarun Kalluri, Weiyao Wang, Heng Wang, Manmohan Chandraker, Lorenzo Torresani, Du Tran

Many top-down architectures for instance segmentation achieve significant success when trained and tested on pre-defined closed-world taxonomy.

Open-World Instance Segmentation Segmentation +1


Temporal Perceiving Video-Language Pre-training

no code implementations • 18 Jan 2023 • Fan Ma, Xiaojie Jin, Heng Wang, Jingjia Huang, Linchao Zhu, Jiashi Feng, Yi Yang

Specifically, text-video localization consists of moment retrieval, which predicts start and end boundaries in videos given the text description, and text localization which matches the subset of texts with the video features.

Contrastive Learning Moment Retrieval +7


R2Former: Unified Retrieval and Reranking Transformer for Place Recognition

1 code implementation • CVPR 2023 • Sijie Zhu, Linjie Yang, Chen Chen, Mubarak Shah, Xiaohui Shen, Heng Wang

Visual Place Recognition (VPR) estimates the location of query images by matching them with images in a reference database.

Feature Correlation Retrieval +1


One is All: Bridging the Gap Between Neural Radiance Fields Architectures with Progressive Volume Distillation

1 code implementation • 29 Nov 2022 • Shuangkang Fang, Weixin Xu, Heng Wang, Yi Yang, Yufeng Wang, Shuchang Zhou

In this paper, we propose Progressive Volume Distillation (PVD), a systematic distillation method that allows any-to-any conversions between different architectures, including MLP, sparse or low-rank tensors, hashtables and their compositions.

Ranked #1 on Novel View Synthesis on NeRF (Average PSNR metric)

3D Reconstruction Neural Rendering +1

182


PointNeuron: 3D Neuron Reconstruction via Geometry and Topology Learning of Point Clouds

1 code implementation • 15 Oct 2022 • Runkai Zhao, Heng Wang, Chaoyi Zhang, Weidong Cai

In this paper, we propose a novel framework for 3D neuron reconstruction.

Surface Reconstruction


TwiBot-22: Towards Graph-Based Twitter Bot Detection

1 code implementation • 9 Jun 2022 • Shangbin Feng, Zhaoxuan Tan, Herun Wan, Ningnan Wang, Zilong Chen, Binchi Zhang, Qinghua Zheng, Wenqian Zhang, Zhenyu Lei, Shujie Yang, Xinshun Feng, Qingyue Zhang, Hongrui Wang, YuHan Liu, Yuyang Bai, Heng Wang, Zijian Cai, Yanbo Wang, Lijing Zheng, Zihan Ma, Jundong Li, Minnan Luo

Twitter bot detection has become an increasingly important task to combat misinformation, facilitate social media moderation, and preserve the integrity of the online discourse.

Misinformation Twitter Bot Detection

140


Towards Generalisable Audio Representations for Audio-Visual Navigation

no code implementations • 1 Jun 2022 • Shunqi Mao, Chaoyi Zhang, Heng Wang, Weidong Cai

In audio-visual navigation (AVN), an intelligent agent needs to navigate to a constantly sound-making object in complex 3D environments based on its audio and visual perceptions.

Contrastive Learning Data Augmentation +2


Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds

1 code implementation • 22 Apr 2022 • Heng Wang, Chaoyi Zhang, Jianhui Yu, Weidong Cai

Dense captioning in 3D point clouds is an emerging vision-and-language task involving object-level 3D scene understanding.

3D dense captioning 3D Object Detection +7


Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity

1 code implementation • CVPR 2022 • Weiyao Wang, Matt Feiszli, Heng Wang, Jitendra Malik, Du Tran

From PA we construct a large set of pseudo-ground-truth instance masks; combined with human-annotated instance masks we train GGNs and significantly outperform the SOTA on open-world instance segmentation on various benchmarks including COCO, LVIS, ADE20K, and UVO.

Open-World Instance Segmentation Semantic Segmentation

111


Canonical Mean Filter for Almost Zero-Shot Multi-Task classification

no code implementations • 8 Apr 2022 • Yong Li, Heng Wang, Xiang Ye

Motivated by ANIL, we rethink the role of adaption in the feature extractor of CNAPs, which is a state-of-the-art representative few-shot method.


3D Medical Point Transformer: Introducing Convolution to Attention Networks for Medical Point Cloud Analysis

1 code implementation • 9 Dec 2021 • Jianhui Yu, Chaoyi Zhang, Heng Wang, Dingxin Zhang, Yang Song, Tiange Xiang, Dongnan Liu, Weidong Cai

General point clouds have been increasingly investigated for different tasks, and recently Transformer-based networks are proposed for point cloud analysis.

Ranked #1 on 3D Point Cloud Classification on IntrA

3D Part Segmentation 3D Point Cloud Classification


PyTorchVideo: A Deep Learning Library for Video Understanding

1 code implementation • 18 Nov 2021 • Haoqi Fan, Tullie Murrell, Heng Wang, Kalyan Vasudev Alwala, Yanghao Li, Yilei Li, Bo Xiong, Nikhila Ravi, Meng Li, Haichuan Yang, Jitendra Malik, Ross Girshick, Matt Feiszli, Aaron Adcock, Wan-Yen Lo, Christoph Feichtenhofer

We introduce PyTorchVideo, an open-source deep-learning library that provides a rich set of modular, efficient, and reproducible components for a variety of video understanding tasks, including classification, detection, self-supervised learning, and low-level processing.

Self-Supervised Learning Video Understanding

3,203


Searching for Two-Stream Models in Multivariate Space for Video Recognition

no code implementations • ICCV 2021 • Xinyu Gong, Heng Wang, Zheng Shou, Matt Feiszli, Zhangyang Wang, Zhicheng Yan

We design a multivariate search space, including 6 search variables to capture a wide variety of choices in designing two-stream models.

Neural Architecture Search Video Recognition +1


Voxel-wise Cross-Volume Representation Learning for 3D Neuron Reconstruction

no code implementations • 14 Aug 2021 • Heng Wang, Chaoyi Zhang, Jianhui Yu, Yang Song, SiQi Liu, Wojciech Chrzanowski, Weidong Cai

Recently, a series of deep learning based segmentation methods have been proposed to improve the quality of raw 3D optical image stacks by removing noises and restoring neuronal structures from low-contrast background.

Decoder Representation Learning +1


Towards Understanding the Effectiveness of Attention Mechanism

no code implementations • 29 Jun 2021 • Xiang Ye, Zihang He, Heng Wang, Yong Li

Instead, we verify the crucial role of feature map multiplication in attention mechanism and uncover a fundamental impact of feature map multiplication on the learned landscapes of CNNs: with the high order non-linearity brought by the feature map multiplication, it played a regularization role on CNNs, which made them learn smoother and more stable landscapes near real samples compared to vanilla CNNs.


Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation

no code implementations • ICCV 2021 • Weiyao Wang, Matt Feiszli, Heng Wang, Du Tran

Current state-of-the-art object detection and segmentation methods work well under the closed-world assumption.

Object object-detection +6


Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories

no code implementations • CVPR 2021 • Xitong Yang, Haoqi Fan, Lorenzo Torresani, Larry Davis, Heng Wang

The standard way of training video models entails sampling at each iteration a single clip from a video and optimizing the clip prediction with respect to the video-level label.

Action Detection Action Recognition +1


Is Space-Time Attention All You Need for Video Understanding?

13 code implementations • 9 Feb 2021 • Gedas Bertasius, Heng Wang, Lorenzo Torresani

We present a convolution-free approach to video classification built exclusively on self-attention over space and time.

Ranked #1 on Video Question Answering on Howto100M-QA

Action Classification Action Recognition +5

3,973


Single Neuron Segmentation using Graph-based Global Reasoning with Auxiliary Skeleton Loss from 3D Optical Microscope Images

no code implementations • 22 Jan 2021 • Heng Wang, Yang Song, Chaoyi Zhang, Jianhui Yu, SiQi Liu, Hanchuan Peng, Weidong Cai

One of the critical steps in improving accurate single neuron reconstruction from three-dimensional (3D) optical microscope images is the neuronal structure segmentation.

Segmentation


Interactive Prototype Learning for Egocentric Action Recognition

no code implementations • ICCV 2021 • Xiaohan Wang, Linchao Zhu, Heng Wang, Yi Yang

To avoid these additional costs, we propose an end-to-end Interactive Prototype Learning (IPL) framework to learn better active object representations by leveraging the motion cues from the actor.

Action Recognition Object +1


From W-Net to CDGAN: Bi-temporal Change Detection via Deep Learning Techniques

1 code implementation • 14 Mar 2020 • Bin Hou, Qingjie Liu, Heng Wang, Yunhong Wang

Traditional change detection methods usually follow the image differencing, change feature extraction and classification framework, and their performance is limited by such simple image domain differencing and also the hand-crafted features.

Change Detection Generative Adversarial Network


CJRC: A Reliable Human-Annotated Benchmark DataSet for Chinese Judicial Reading Comprehension

no code implementations • 19 Dec 2019 • Xingyi Duan, Baoxin Wang, Ziyue Wang, Wentao Ma, Yiming Cui, Dayong Wu, Shijin Wang, Ting Liu, Tianxiang Huo, Zhen Hu, Heng Wang, Zhiyuan Liu

We present a Chinese judicial reading comprehension (CJRC) dataset which contains approximately 10K documents and almost 50K questions with answers.

Machine Reading Comprehension


Region and Object based Panoptic Image Synthesis through Conditional GANs

no code implementations • 14 Dec 2019 • Heng Wang, Donghao Zhang, Yang Song, Heng Huang, Mei Chen, Weidong Cai

Our contribution consists of the proposal of a significant task worth investigating and a naive baseline of solving it.

Image-to-Image Translation Translation


CAIL2019-SCM: A Dataset of Similar Case Matching in Legal Domain

2 code implementations • 20 Nov 2019 • Chaojun Xiao, Haoxi Zhong, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Tianyang Zhang, Xianpei Han, Zhen Hu, Heng Wang, Jianfeng Xu

In this paper, we introduce CAIL2019-SCM, Chinese AI and Law 2019 Similar Case Matching dataset.

Traffic Object Detection

344


Incorporating Graph Attention Mechanism into Knowledge Graph Reasoning Based on Deep Reinforcement Learning

no code implementations • IJCNLP 2019 • Heng Wang, Shuangyin Li, Rong Pan, Mingzhi Mao

Meanwhile, a novel mechanism of reinforcement learning is proposed by forcing an agent to walk forward every step to avoid the agent stalling at the same entity node constantly.

Graph Attention reinforcement-learning +1


FASTER Recurrent Networks for Efficient Video Classification

no code implementations • 10 Jun 2019 • Linchao Zhu, Laura Sevilla-Lara, Du Tran, Matt Feiszli, Yi Yang, Heng Wang

FASTER aims to leverage the redundancy between neighboring clips and reduce the computational cost by learning to aggregate the predictions from models of different complexities.

Ranked #26 on Action Recognition on UCF101

Action Classification Action Recognition +3


Video Modeling with Correlation Networks

no code implementations • CVPR 2020 • Heng Wang, Du Tran, Lorenzo Torresani, Matt Feiszli

Motion is a salient cue to recognize actions in video.

Ranked #108 on Action Classification on Kinetics-400

Action Classification Action Recognition +1


Large-scale weakly-supervised pre-training for video action recognition

3 code implementations • CVPR 2019 • Deepti Ghadiyaram, Matt Feiszli, Du Tran, Xueting Yan, Heng Wang, Dhruv Mahajan

Second, frame-based models perform quite well on action recognition; is pre-training for good image features sufficient or is pre-training for spatio-temporal features valuable for optimal transfer learning?

Ranked #2 on Egocentric Activity Recognition on EPIC-KITCHENS-55 (Actions Top-1 (S2) metric)

Action Classification Action Recognition +3

9,317


Video Classification with Channel-Separated Convolutional Networks

7 code implementations • ICCV 2019 • Du Tran, Heng Wang, Lorenzo Torresani, Matt Feiszli

It is natural to ask: 1) if group convolution can help to alleviate the high computational cost of video classification networks; 2) what factors matter the most in 3D group convolutional networks; and 3) what are good computation/accuracy trade-offs with 3D group convolutional networks.

Ranked #1 on Action Recognition on Sports-1M

Action Classification Action Recognition +3

3,973


Defeats GAN: A Simpler Model Outperforms in Knowledge Representation Learning

no code implementations • 3 Apr 2019 • Heng Wang, Mingzhi Mao

The goal of knowledge representation learning is to embed entities and relations into a low-dimensional, continuous vector space.

Link Prediction Representation Learning


Multi-task Learning for Chinese Word Usage Errors Detection

no code implementations • 3 Apr 2019 • Jinbin Zhang, Heng Wang

Chinese word usage errors often occur in non-native Chinese learners' writing.

Multi-Task Learning POS +1


Overview of CAIL2018: Legal Judgment Prediction Competition

2 code implementations • 13 Oct 2018 • Haoxi Zhong, Chaojun Xiao, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Yansong Feng, Xianpei Han, Zhen Hu, Heng Wang, Jianfeng Xu

In this paper, we give an overview of the Legal Judgment Prediction (LJP) competition at Chinese AI and Law challenge (CAIL2018).


Scenes-Objects-Actions: A Multi-Task, Multi-Label Video Dataset

no code implementations • ECCV 2018 • Jamie Ray, Heng Wang, Du Tran, YuFei Wang, Matt Feiszli, Lorenzo Torresani, Manohar Paluri

The videos retrieved by the search engines are then verified for correctness by human annotators.

Action Recognition Temporal Action Localization +1


CAIL2018: A Large-Scale Legal Dataset for Judgment Prediction

3 code implementations • 4 Jul 2018 • Chaojun Xiao, Haoxi Zhong, Zhipeng Guo, Cunchao Tu, Zhiyuan Liu, Maosong Sun, Yansong Feng, Xianpei Han, Zhen Hu, Heng Wang, Jianfeng Xu

In this paper, we introduce the Chinese AI and Law challenge dataset (CAIL2018), the first large-scale Chinese legal dataset for judgment prediction.

Text Classification

274


Devon: Deformable Volume Network for Learning Optical Flow

no code implementations • 20 Feb 2018 • Yao Lu, Jack Valmadre, Heng Wang, Juho Kannala, Mehrtash Harandi, Philip H. S. Torr

State-of-the-art neural network models estimate large displacement optical flow in multi-resolution and use warping to propagate the estimation between two resolutions.

Optical Flow Estimation


Text Generation Based on Generative Adversarial Nets with Latent Variable

1 code implementation • 1 Dec 2017 • Heng Wang, Zengchang Qin, Tao Wan

We propose the VGAN model where the generative model is composed of recurrent neural network and VAE.

Language Modelling Text Generation


A Closer Look at Spatiotemporal Convolutions for Action Recognition

20 code implementations • CVPR 2018 • Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, Manohar Paluri

In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition.

Ranked #3 on Action Recognition on Sports-1M

Action Classification Action Recognition +1

9,317


Concept Drift Detection and Adaptation with Hierarchical Hypothesis Testing

no code implementations • 25 Jul 2017 • Shujian Yu, Zubin Abraham, Heng Wang, Mohak Shah, Yantao Wei, José C. Príncipe

A fundamental issue for statistical classification models in a streaming environment is that the joint distribution between predictor and response variables changes over time (a phenomenon also known as concept drifts), such that their classification performance deteriorates dramatically.

General Classification Two-sample testing


Face Aging Effect Simulation using Hidden Factor Analysis Joint Sparse Representation

no code implementations • 4 Nov 2015 • Hongyu Yang, Di Huang, Yunhong Wang, Heng Wang, Yuanyan Tang

Face aging simulation has received rising investigations nowadays, whereas it still remains a challenge to generate convincing and natural age-progressed face images.