Search Results for author: Wei Li

Found 361 papers, 132 papers with code

Weight Excitation: Built-in Attention Mechanisms in Convolutional Neural Networks

1 code implementation • ECCV 2020 • Niamul Quader, Md Mafijul Islam Bhuiyan, Juwei Lu, Peng Dai, Wei Li

We propose novel approaches for simultaneously identifying important weights of a convolutional neural network (ConvNet) and providing more attention to the important weights during training.

3D Action Recognition 3D Object Classification +7

《二十四史》古代汉语语义依存图库构建(Construction of Semantic Dependency Graph Bank of Ancient Chinese in twenty four histories)

no code implementations • CCL 2022 • Tian Huang, Yanqiu Shao, Wei Li

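Semantic dependency graphs are a deep analysis method for semantics in NLP that captures the semantic relations between the words of a sentence. Targeting the characteristics of Ancient Chinese, this paper first formulates an annotation specification for Ancient Chinese semantic dependency graphs and then, using the Twenty-Four Histories (《二十四史》) as the corpus source, annotates a semantic dependency graph bank of 3,000 sentences, with an annotation-agreement kappa of 78.83%. By comparison with a Modern Chinese semantic dependency graph bank, the paper compiles basic statistics on the graph bank and analyzes the semantic characteristics and regularities of Ancient Chinese. The statistics show that the semantic distribution of Ancient Chinese broadly follows Zipf's law, and that its descriptions of semantic events have strong features of historical narrative and formal style: they center on biographies of individuals, describe peripheral roles such as time and place in detail, use calm and objective narrative language, and lack modifiers expressing modality, mood, degree, and temporal state.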

针对古代经典文献的引用查找问题的数据构建与匹配方法(Data Construction and Matching Method for the Task of Ancient Classics Reference Detection)

no code implementations • CCL 2022 • Wei Li, Yanqiu Shao, Mengxi Bi

基于强化学习的古今汉语句子对齐研究(Research on Sentence Alignment of Ancient and Modern Chinese based on Reinforcement Learning)

no code implementations • CCL 2022 • Kuai Yu, Yanqiu Shao, Wei Li

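Supervised machine translation based on deep learning has achieved good results, but training requires a large amount of high-quality aligned corpora. For translation between Ancient and Modern Chinese, high-quality parallel corpora are scarce, whereas coarsely aligned document- and paragraph-level corpora are relatively easy to obtain, so corpus alignment is a valuable and necessary research topic. In prior work on sentence alignment for bilingual parallel corpora, traditional methods build a composite criterion from grammatical information such as length, vocabulary, and co-occurring characters in the two texts to measure the similarity of a sentence pair. Such methods perform well on one-to-one sentence alignment, but their ability to match sentence semantics is limited and they perform poorly on some many-to-many alignment patterns. In this paper we propose using pre-trained language models, which are developing rapidly and have strong semantic representation ability, to take the semantic information of both languages into account. Because a pre-trained language model alone captures only relatively local information, we further propose a reinforcement learning training objective based on a dynamic programming algorithm to integrate paragraph-level global information, and we train without supervision. Experiments show that the model trained with the proposed method outperforms the previous best baseline, with especially large gains on the many-to-many alignment patterns that traditional models struggle with.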

Sentence

Meta-CQG: A Meta-Learning Framework for Complex Question Generation over Knowledge Bases

no code implementations • COLING 2022 • Kun Zhang, Yunqi Qiu, Yuanzhuo Wang, Long Bai, Wei Li, Xuhui Jiang, HuaWei Shen, Xueqi Cheng

Complex question generation over knowledge bases (KB) aims to generate natural language questions involving multiple KB relations or functional constraints.

Contrastive Learning Decoder +3

Towards Efficient Coarse-to-Fine Networks for Action and Gesture Recognition

no code implementations • ECCV 2020 • Niamul Quader, Juwei Lu, Peng Dai, Wei Li

State-of-the-art approaches to video-based action and gesture recognition often employ two key concepts: First, they employ multistream processing; second, they use an ensemble of convolutional networks.

Ranked #1 on Action Classification on Jester test

3D Action Recognition Action Classification +3

SgSum: Transforming Multi-document Summarization into Sub-graph Selection

1 code implementation • EMNLP 2021 • Moye Chen, Wei Li, Jiachen Liu, Xinyan Xiao, Hua Wu, Haifeng Wang

Compared with traditional methods, our method has two main advantages: (1) the relations between sentences are captured by modeling both the graph structure of the whole document set and the candidate sub-graphs; (2) it directly outputs an integrated summary in the form of a sub-graph, which is more informative and coherent.

Document Summarization Multi-Document Summarization +1

1,694

Unsupervised Chinese Word Segmentation with BERT Oriented Probing and Transformation

1 code implementation • Findings (ACL) 2022 • Wei Li, Yuhan Song, Qi Su, Yanqiu Shao

Word segmentation is a fundamental step in understanding the Chinese language.

Chinese Word Segmentation Segmentation

基于统一模型的藏文新闻摘要(Abstractive Summarization of Tibetan News Based on Hybrid Model)

no code implementations • CCL 2020 • Xiaodong Yan, Xiaoqing Xie, Yu Zou, Wei Li

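Seq2seq neural network models have achieved good results in Chinese and English text summarization, but summarization research for low-resource languages, Tibetan in particular, is still at an exploratory stage. Moreover, there is currently no large-scale annotated corpus for summary extraction. This paper proposes a unified model for generating summaries of Tibetan news. The TextRank algorithm is used to address the shortage of annotated Tibetan training data. A two-layer bidirectional GRU network then extracts the sentences that represent the original news article, reducing redundant information. Finally, an attention-based Seq2Seq model generates the abstractive summary, and a pointer network is added to handle out-of-vocabulary words. Experiments show that the ROUGE-1 score improves by 2% over the traditional model. Keywords: text summarization; Tibetan; TextRank; pointer network; Bi-GRU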

Abstractive Text Summarization

ESP: Extro-Spective Prediction for Long-term Behavior Reasoning in Emergency Scenarios

no code implementations • 7 May 2024 • Dingrui Wang, Zheyuan Lai, Yuda Li, Yi Wu, Yuexin Ma, Johannes Betz, Ruigang Yang, Wei Li

Furthermore, a new metric named clamped temporal error (CTE) is proposed to give a more comprehensive evaluation of prediction performance, especially in time-sensitive emergency events on sub-second timescales.

FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models

no code implementations • 29 Apr 2024 • Wei Li, Ren Ma, Jiang Wu, Chenya Gu, Jiahui Peng, Jinyang Len, Songyang Zhang, Hang Yan, Dahua Lin, Conghui He

In the burgeoning field of large language models (LLMs), the assessment of fundamental knowledge remains a critical challenge, particularly for models tailored to Chinese language and culture.

Common Sense Reasoning Multiple-choice

BezierFormer: A Unified Architecture for 2D and 3D Lane Detection

no code implementations • 25 Apr 2024 • Zhiwei Dong, Xi Zhu, Xiya Cao, Ran Ding, Wei Li, Caifa Zhou, Yongliang Wang, Qiangbo Liu

BézierFormer formulates queries as Bézier control points and incorporates a novel Bézier curve attention mechanism.

3D Lane Detection

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

1 code implementation • 25 Apr 2024 • Zhe Chen, Weiyun Wang, Hao Tian, Shenglong Ye, Zhangwei Gao, Erfei Cui, Wenwen Tong, Kongzhi Hu, Jiapeng Luo, Zheng Ma, Ji Ma, Jiaqi Wang, Xiaoyi Dong, Hang Yan, Hewei Guo, Conghui He, Botian Shi, Zhenjiang Jin, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Min Dou, Lewei Lu, Xizhou Zhu, Tong Lu, Dahua Lin, Yu Qiao, Jifeng Dai, Wenhai Wang

Compared to both open-source and proprietary models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 benchmarks.

Ranked #6 on Visual Question Answering on MM-Vet

4k Language Modelling +3

1,572

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

1 code implementation • 16 Apr 2024 • Bin Ren, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, Wei Zhai, Renjing Pei, Jiaming Guo, Songcen Xu, Yang Cao, ZhengJun Zha, Yan Wang, Yi Liu, Qing Wang, Gang Zhang, Liou Zhang, Shijie Zhao, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Xin Liu, Min Yan, Menghan Zhou, Yiqiang Yan, Yixuan Liu, Wensong Chan, Dehua Tang, Dong Zhou, Li Wang, Lu Tian, Barsoum Emad, Bohan Jia, Junbo Qiao, Yunshuai Zhou, Yun Zhang, Wei Li, Shaohui Lin, Shenglong Zhou, Binbin Chen, Jincheng Liao, Suiyi Zhao, Zhao Zhang, Bo wang, Yan Luo, Yanyan Wei, Feng Li, Mingshen Wang, Yawei Li, Jinhan Guan, Dehua Hu, Jiawei Yu, Qisheng Xu, Tao Sun, Long Lan, Kele Xu, Xin Lin, Jingtong Yue, Lehan Yang, Shiyi Du, Lu Qi, Chao Ren, Zeyu Han, YuHan Wang, Chaolin Chen, Haobo Li, Mingjun Zheng, Zhongbao Yang, Lianhong Song, Xingzhuo Yan, Minghan Fu, Jingyi Zhang, Baiang Li, Qi Zhu, Xiaogang Xu, Dan Guo, Chunle Guo, Jiadi Chen, Huanhuan Long, Chunjiang Duanmu, Xiaoyan Lei, Jie Liu, Weilin Jia, Weifeng Cao, Wenlong Zhang, Yanyu Mao, Ruilong Guo, Nihao Zhang, Qian Wang, Manoj Pandey, Maksym Chernozhukov, Giang Le, Shuli Cheng, Hongyuan Wang, Ziyan Wei, Qingting Tang, Liejun Wang, Yongming Li, Yanhui Guo, Hao Xu, Akram Khatami-Rizi, Ahmad Mahmoudi-Aznaveh, Chih-Chung Hsu, Chia-Ming Lee, Yi-Shiuan Chou, Amogh Joshi, Nikhil Akalwadi, Sampada Malagi, Palani Yashaswini, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi

In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking.

Image Super-Resolution

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

2 code implementations • 9 Apr 2024 • Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Bin Wang, Linke Ouyang, Songyang Zhang, Haodong Duan, Wenwei Zhang, Yining Li, Hang Yan, Yang Gao, Zhe Chen, Xinyue Zhang, Wei Li, Jingwen Li, Wenhai Wang, Kai Chen, Conghui He, Xingcheng Zhang, Jifeng Dai, Yu Qiao, Dahua Lin, Jiaqi Wang

The Large Vision-Language Model (LVLM) field has seen significant advancements, yet its progression has been hindered by challenges in comprehending fine-grained visual content due to limited resolution.

Ranked #12 on Visual Question Answering on MM-Vet

4k Language Modelling +1

1,687

LIPT: Latency-aware Image Processing Transformer

no code implementations • 9 Apr 2024 • Junbo Qiao, Wei Li, Haizhen Xie, Hanting Chen, Yunshuai Zhou, Zhijun Tu, Jie Hu, Shaohui Lin

Extensive experiments on multiple image processing tasks (e.g., image super-resolution (SR), JPEG artifact reduction, and image denoising) demonstrate the superiority of LIPT on both latency and PSNR.

Image Denoising Image Super-Resolution

Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution

no code implementations • 3 Apr 2024 • Simiao Li, Yun Zhang, Wei Li, Hanting Chen, Wenjia Wang, BingYi Jing, Shaohui Lin, Jie Hu

Knowledge distillation (KD) is a promising yet challenging model compression technique that transfers rich learning representations from a well-performing but cumbersome teacher model to a compact student model.

Image Super-Resolution Knowledge Distillation +1

Make Continual Learning Stronger via C-Flat

no code implementations • 1 Apr 2024 • Ang Bian, Wei Li, Hangjie Yuan, Chengrong Yu, Zixiang Zhao, Mang Wang, Aojun Lu, Tao Feng

This paper presents a general framework of C-Flat applied to all CL categories and a thorough comparison with loss-minima optimizers and flat-minima-based CL approaches, showing that our method can boost CL performance in almost all cases.

Continual Learning

IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions

no code implementations • 31 Mar 2024 • Zhijun Tu, Kunpeng Du, Hanting Chen, Hailing Wang, Wei Li, Jie Hu, Yunhe Wang

Recent advances have demonstrated the powerful capability of transformer architecture in image restoration.

Deblurring Denoising +3

InternLM2 Technical Report

1 code implementation • 26 Mar 2024 • Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin

The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).

Ranked #5 on Long-Context Understanding on Ada-LEval (BestAnswer)

4k Long-Context Understanding

5,236

Distilling Semantic Priors from SAM to Efficient Image Restoration Models

no code implementations • 25 Mar 2024 • Quan Zhang, Xiaoyu Liu, Wei Li, Hanting Chen, Junchao Liu, Jie Hu, Zhiwei Xiong, Chun Yuan, Yunhe Wang

SPD uses self-distillation to distill the fused semantic priors and boost the performance of the original IR models.

Deblurring Denoising +2

CodeS: Natural Language to Code Repository via Multi-Layer Sketch

2 code implementations • 25 Mar 2024 • Daoguang Zan, Ailun Yu, Wei Liu, Dong Chen, Bo Shen, Wei Li, Yafen Yao, Yongshun Gong, Xiaolin Chen, Bei guan, Zhiguang Yang, Yongji Wang, Qianxiang Wang, Lizhen Cui

For feedback-based evaluation, we develop a VSCode plugin for CodeS and engage 30 participants in conducting empirical studies.

Benchmarking

21,204

IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection

1 code implementation • 22 Mar 2024 • Junbo Yin, Jianbing Shen, Runnan Chen, Wei Li, Ruigang Yang, Pascal Frossard, Wenguan Wang

HSF applies Point-to-Grid and Grid-to-Region transformers to capture the multimodal scene context at different granularities.

3D Object Detection Autonomous Driving +1

Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations

1 code implementation • 21 Mar 2024 • Jiaxing Sun, Weiquan Huang, Jiang Wu, Chenya Gu, Wei Li, Songyang Zhang, Hang Yan, Conghui He

We introduce CHARM, the first benchmark for comprehensive and in-depth evaluation of the commonsense reasoning ability of large language models (LLMs) in Chinese, covering both globally known and Chinese-specific commonsense.

Benchmarking Memorization

RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition

1 code implementation • 20 Mar 2024 • Ziyu Liu, Zeyi Sun, Yuhang Zang, Wei Li, Pan Zhang, Xiaoyi Dong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang

Notably, our approach demonstrates a significant improvement in performance on 5 fine-grained visual recognition benchmarks, 11 few-shot image recognition datasets, and 2 object detection datasets under the zero-shot recognition setting.

Contrastive Learning Fine-Grained Visual Recognition +3

PERL: Parameter Efficient Reinforcement Learning from Human Feedback

no code implementations • 15 Mar 2024 • Hakim Sidahmed, Samrat Phatale, Alex Hutcheson, Zhuonan Lin, Zhang Chen, Zac Yu, Jarvis Jin, Roman Komarytsia, Christiane Ahlheim, Yonghao Zhu, Simral Chaudhary, Bowen Li, Saravanan Ganesh, Bill Byrne, Jessica Hoffmann, Hassan Mansoor, Wei Li, Abhinav Rastogi, Lucas Dixon

We investigate the setup of "Parameter Efficient Reinforcement Learning" (PERL), in which we perform reward model training and reinforcement learning using LoRA.

reinforcement-learning

KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction

no code implementations • 12 Mar 2024 • Zixuan Li, Yutao Zeng, Yuxin Zuo, Weicheng Ren, Wenxuan Liu, Miao Su, Yucan Guo, Yantao Liu, Xiang Li, Zhilei Hu, Long Bai, Wei Li, Yidan Liu, Pan Yang, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng

After instruction tuning, KnowCoder further exhibits strong generalization ability on unseen schemas and achieves up to 12.5% and 21.9%, compared to SOTA baselines, under the zero-shot setting and the low-resource setting, respectively.

Code Generation Language Modelling +2

SSF-Net: Spatial-Spectral Fusion Network with Spectral Angle Awareness for Hyperspectral Object Tracking

no code implementations • 9 Mar 2024 • Hanzheng Wang, Wei Li, Xiang-Gen Xia, Qian Du, Jing Tian

Hyperspectral video (HSV) offers valuable spatial, spectral, and temporal information simultaneously, making it highly suitable for handling challenges such as background clutter and visual similarity in object tracking.

Object Object Tracking

WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset

no code implementations • 29 Feb 2024 • Jiantao Qiu, Haijun Lv, Zhenjiang Jin, Rui Wang, Wenchang Ning, JIA YU, Chaobin Zhang, Zhenxiang Li, Pei Chu, Yuan Qu, Jin Shi, Lindong Lu, Runyu Peng, Zhiyuan Zeng, Huanze Tang, Zhikai Lei, Jiawei Hong, Keyu Chen, Zhaoye Fei, Ruiliang Xu, Wei Li, Zhongying Tu, Lin Dahua, Yu Qiao, Hang Yan, Conghui He

To evaluate the quality and utility of the dataset, we trained 1B-parameter and 3B-parameter models using WanJuan-CC and another dataset, RefinedWeb.

Unlocking the Power of Large Language Models for Entity Alignment

no code implementations • 23 Feb 2024 • Xuhui Jiang, Yinghan Shen, Zhichao Shi, Chengjin Xu, Wei Li, Zixuan Li, Jian Guo, HuaWei Shen, Yuanzhuo Wang

To address the constraints of limited input KG data, ChatEA introduces a KG-code translation module that translates KG structures into a format understandable by LLMs, thereby allowing LLMs to utilize their extensive background knowledge to improve EA accuracy.

Code Translation Entity Alignment +2

An Error-Matching Exclusion Method for Accelerating Visual SLAM

no code implementations • 22 Feb 2024 • Shaojie Zhang, Yinghui Wang, Jiaxing Ma, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

In Visual SLAM, achieving accurate feature matching consumes a significant amount of time, severely impacting the real-time performance of the system.

A Feature Matching Method Based on Multi-Level Refinement Strategy

no code implementations • 21 Feb 2024 • Shaojie Zhang, Yinghui Wang, Jiaxing Ma, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

Feature matching is a fundamental and crucial process in visual SLAM, and precision has always been a challenging issue in feature matching.

A Robust Error-Resistant View Selection Method for 3D Reconstruction

no code implementations • 18 Feb 2024 • Shaojie Zhang, Yinghui Wang, Bin Nan, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

To address the issue of increased triangulation uncertainty caused by selecting views with small camera baselines in Structure from Motion (SFM) view selection, this paper proposes a robust error-resistant view selection method.

3D Reconstruction

Region Feature Descriptor Adapted to High Affine Transformations

no code implementations • 15 Feb 2024 • Shaojie Zhang, Yinghui Wang, Bin Nan, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

To address the issue of feature descriptors being ineffective in representing grayscale feature information when images undergo high affine transformations, leading to a rapid decline in feature matching accuracy, this paper proposes a region feature descriptor based on simulating affine transformations using classification.

A Highlight Removal Method for Capsule Endoscopy Images

no code implementations • 11 Feb 2024 • Shaojie Zhang, Yinghui Wang, Peixuan Liu, Wei Li, Jinlong Yang, Tao Yan, Yukai Wang, Liangyi Huang, Mingfeng Wang, Ibragim R. Atadjanov

The images captured by Wireless Capsule Endoscopy (WCE) always exhibit specular reflections, and removing highlights while preserving the color and texture in the region remains a challenge.

highlight removal

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

1 code implementation • 29 Jan 2024 • Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Bin Wang, Linke Ouyang, Xilin Wei, Songyang Zhang, Haodong Duan, Maosong Cao, Wenwei Zhang, Yining Li, Hang Yan, Yang Gao, Xinyue Zhang, Wei Li, Jingwen Li, Kai Chen, Conghui He, Xingcheng Zhang, Yu Qiao, Dahua Lin, Jiaqi Wang

We introduce InternLM-XComposer2, a cutting-edge vision-language model excelling in free-form text-image composition and comprehension.

Ranked #17 on Visual Question Answering on MM-Vet

Language Modelling Visual Question Answering

1,687

Improving Natural Language Capability of Code Large Language Model

1 code implementation • 25 Jan 2024 • Wei Li, Daoguang Zan, Bei guan, Ailun Yu, Xiaolin Chen, Yongji Wang

Code large language models (Code LLMs) have demonstrated remarkable performance in code generation.

Code Generation Language Modelling +1

UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion

no code implementations • 24 Jan 2024 • Wei Li, Xue Xu, Jiachen Liu, Xinyan Xiao

This paper presents UNIMO-G, a simple multimodal conditional diffusion framework that operates on multimodal prompts with interleaved textual and visual inputs, which demonstrates a unified ability for both text-driven and subject-driven image generation.

Conditional Image Generation Denoising +5

OMG-Seg: Is One Model Good Enough For All Segmentation?

1 code implementation • 18 Jan 2024 • Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, Wenwei Zhang, Yining Li, Kai Chen, Chen Change Loy

In this work, we address various segmentation tasks, each traditionally tackled by distinct or partially unified models.

Decoder Interactive Segmentation +4

684

A GAN-based data poisoning framework against anomaly detection in vertical federated learning

no code implementations • 17 Jan 2024 • Xiaolin Chen, Daoguang Zan, Wei Li, Bei guan, Yongji Wang

Specifically, the malicious participant initially employs semi-supervised learning to train a surrogate target model.

Anomaly Detection Data Poisoning +1

Evolutionary Alternating Direction Method of Multipliers for Constrained Multi-Objective Optimization with Unknown Constraints

no code implementations • 2 Jan 2024 • Shuang Li, Ke Li, Wei Li, Ming Yang

Constrained multi-objective optimization problems (CMOPs) pervade real-world applications in science, engineering, and design.

A Generalist FaceX via Learning Unified Facial Representation

1 code implementation • 31 Dec 2023 • Yue Han, Jiangning Zhang, Junwei Zhu, Xiangtai Li, Yanhao Ge, Wei Li, Chengjie Wang, Yong liu, Xiaoming Liu, Ying Tai

This work presents the FaceX framework, a novel facial generalist model capable of handling diverse facial tasks simultaneously.

Facial Editing

DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection

1 code implementation • 25 Dec 2023 • Li Xiang, Junbo Yin, Wei Li, Cheng-Zhong Xu, Ruigang Yang, Jianbing Shen

Specifically, DMA builds a domain-mixing 3D instance bank for the teacher and student models during training, resulting in aligned data representation.

3D Object Detection object-detection +1

Domain Similarity-Perceived Label Assignment for Domain Generalized Underwater Object Detection

no code implementations • 20 Dec 2023 • Xisheng Li, Wei Li, Pinhao Song, Mingjun Zhang, Jie zhou

The inherent characteristics and light fluctuations of water bodies give rise to large differences between layers and regions in underwater environments.

Data Augmentation object-detection +1

CGS-Mask: Making Time Series Predictions Intuitive for All

no code implementations • 15 Dec 2023 • Feng Lu, Wei Li, Yifei Sun, Cheng Song, Yufei Ren, Albert Y. Zomaya

Artificial intelligence (AI) has immense potential in time series prediction, but most explainable tools have limited capabilities in providing a systematic understanding of important features over time.

Decision Making Feature Importance +2

UINav: A Practical Approach to Train On-Device Automation Agents

no code implementations • 15 Dec 2023 • Wei Li, Fu-Lin Hsu, Will Bishop, Folawiyo Campbell-Ajala, Max Lin, Oriana Riva

Automation systems that can autonomously drive application user interfaces to complete user tasks are of great benefit, especially when users are situationally or permanently impaired.

Brain Computer Interface Technology for Future Battlefield

no code implementations • 13 Dec 2023 • Guodong Xiong, Xinyan Ma, Wei Li, Jiaqi Cao, Jian Zhong, Yicong Su

With the development of artificial intelligence and unmanned equipment, human-machine hybrid formations will be the main focus in future combat formations.

Brain Computer Interface Decision Making

CBQ: Cross-Block Quantization for Large Language Models

no code implementations • 13 Dec 2023 • Xin Ding, Xiaoyu Liu, Zhijun Tu, Yun Zhang, Wei Li, Jie Hu, Hanting Chen, Yehui Tang, Zhiwei Xiong, Baoqun Yin, Yunhe Wang

Post-training quantization (PTQ) has played a key role in compressing large language models (LLMs) with ultra-low costs.

Quantization

GenDet: Towards Good Generalizations for AI-Generated Image Detection

1 code implementation • 12 Dec 2023 • Mingjian Zhu, Hanting Chen, Mouxiao Huang, Wei Li, Hailin Hu, Jie Hu, Yunhe Wang

The misuse of AI imagery can have harmful societal effects, prompting the creation of detectors to combat issues like the spread of fake news.

Anomaly Detection

242

Knowledge Graph Driven Recommendation System Algorithm

no code implementations • 1 Dec 2023 • Chaoyang Zhang, Yanan Li, Shen Chen, Siwei Fan, Wei Li

We first use a single-layer neural network to merge individual node features in the graph, and then adjust the aggregation weights of neighboring entities by incorporating influence factors.

Unsupervised learning of site percolation based on shuffled configurations

no code implementations • 20 Nov 2023 • Dian Xu, Shanshan Wang, Feng Gao, Wei Li, Jianmin Shen

In statistical physics, machine learning has gained significant popularity and achieved remarkable results in recent studies of phase transitions. In this paper, we apply two unsupervised learning methods, Principal Component Analysis (PCA) and the Autoencoder (AE), to study the various configurations of the percolation model in equilibrium phase transitions.

FireMatch: A Semi-Supervised Video Fire Detection Network Based on Consistency and Distribution Alignment

no code implementations • 9 Nov 2023 • Qinghua Lin, Zuoyong Li, Kun Zeng, Haoyi Fan, Wei Li, Xiaoguang Zhou

Considering the limited quantity of labeled video data, we propose a semi-supervised fire detection model called FireMatch, which is based on consistency regularization and adversarial distribution alignment.

Data Augmentation Fairness +2

An invariant feature extraction for multi-modal images matching

no code implementations • 6 Nov 2023 • Chenzhong Gao, Wei Li

This paper aims to provide an effective invariant feature extraction and matching algorithm for multi-modal images, for use in multi-source data analysis.

Video-Helpful Multimodal Machine Translation

1 code implementation • 31 Oct 2023 • Yihang Li, Shuichiro Shimizu, Chenhui Chu, Sadao Kurohashi, Wei Li

In addition to the extensive training set, EVA contains a video-helpful evaluation set in which subtitles are ambiguous, and videos are guaranteed helpful for disambiguation.

Multimodal Machine Translation Translation

SALMONN: Towards Generic Hearing Abilities for Large Language Models

1 code implementation • 20 Oct 2023 • Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Hearing is arguably an essential ability of artificial intelligence (AI) agents in the physical world, which refers to the perception and understanding of general auditory information consisting of at least three types of sounds: speech, audio events, and music.

Audio captioning Automatic Speech Recognition +10

810

MERTech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model With Multi-Task Finetuning

1 code implementation • 15 Oct 2023 • Dichucheng Li, Yinghao Ma, Weixing Wei, Qiuqiang Kong, Yulun Wu, Mingjin Che, Fan Xia, Emmanouil Benetos, Wei Li

Recognizing the significance of pitch in capturing the nuances of IPTs and the importance of onset in locating IPT events, we investigate multi-task finetuning with pitch and onset detection as auxiliary tasks.

Instrument Playing Technique Detection Self-Supervised Learning

On the Convergence of Federated Averaging under Partial Participation for Over-parameterized Neural Networks

no code implementations • 9 Oct 2023 • Xin Liu, Wei Li, Dazhi Zhan, Yu Pan, Xin Ma, Yu Ding, Zhisong Pan

Federated learning (FL) is a widely employed distributed paradigm for collaboratively training machine learning models from multiple clients without sharing local data.

Federated Learning

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

2 code implementations • 9 Oct 2023 • Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Audio-visual large language models (LLM) have drawn significant attention, yet the fine-grained combination of both input streams is rather under-explored, which is challenging but necessary for LLMs to understand general video inputs.

Question Answering Video Question Answering

Cross-head mutual Mean-Teaching for semi-supervised medical image segmentation

1 code implementation • 8 Oct 2023 • Wei Li, Ruifeng Bian, Wenyi Zhao, Weijin Xu, Huihua Yang

To address these concerns, we propose a novel Cross-head mutual mean-teaching Network (CMMT-Net) that incorporates strong-weak data augmentation, thereby benefitting both self-training and consistency learning.

Data Augmentation Image Segmentation +2

A Holistic Evaluation of Piano Sound Quality

no code implementations • 7 Oct 2023 • Monan Zhou, Shangda Wu, Shaohua Ji, Zijin Li, Wei Li

Unlike previous studies that focused on the effect of piano performance techniques on sound quality, this study evaluates the inherent sound quality of different pianos.

Few-Shot Learning

Model2Scene: Learning 3D Scene Representation via Contrastive Language-CAD Models Pre-training

no code implementations • 29 Sep 2023 • Runnan Chen, Xinge Zhu, Nenglun Chen, Dawei Wang, Wei Li, Yuexin Ma, Ruigang Yang, Tongliang Liu, Wenping Wang

In this paper, we propose Model2Scene, a novel paradigm that learns free 3D scene representation from Computer-Aided Design (CAD) models and languages.

3D Semantic Segmentation Object

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition

1 code implementation • 26 Sep 2023 • Pan Zhang, Xiaoyi Dong, Bin Wang, Yuhang Cao, Chao Xu, Linke Ouyang, Zhiyuan Zhao, Haodong Duan, Songyang Zhang, Shuangrui Ding, Wenwei Zhang, Hang Yan, Xinyue Zhang, Wei Li, Jingwen Li, Kai Chen, Conghui He, Xingcheng Zhang, Yu Qiao, Dahua Lin, Jiaqi Wang

We propose InternLM-XComposer, a vision-language large model that enables advanced image-text comprehension and composition.

Ranked #9 on Visual Question Answering (VQA) on InfiMM-Eval

Image Comprehension Reading Comprehension +1

1,687

IFT: Image Fusion Transformer for Ghost-free High Dynamic Range Imaging

no code implementations • 26 Sep 2023 • Hailing Wang, Wei Li, Yuanyuan Xi, Jie Hu, Hanting Chen, Longyu Li, Yunhe Wang

By matching similar patches between frames, objects with large motion ranges in dynamic scenes can be aligned, which can effectively alleviate the generation of artifacts.

Data Upcycling Knowledge Distillation for Image Super-Resolution

1 code implementation • 25 Sep 2023 • Yun Zhang, Wei Li, Simiao Li, Hanting Chen, Zhijun Tu, Wenjia Wang, BingYi Jing, Shaohui Lin, Jie Hu

Knowledge distillation (KD) compresses deep neural networks by transferring task-related knowledge from cumbersome pre-trained teacher models to compact student models.

Ranked #22 on Image Super-Resolution on Urban100 - 4x upscaling

Image Super-Resolution Knowledge Distillation +1

Paper
Code

Connecting Speech Encoder and Large Language Model for ASR

no code implementations • 25 Sep 2023 • Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Q-Former-based LLMs can generalise well to out-of-domain datasets, where 12% relative WER reductions over the Whisper baseline ASR model were achieved on the Eval2000 test set without using any in-domain training data from Switchboard.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

WikiMT++ Dataset Card

no code implementations • 23 Sep 2023 • Monan Zhou, Shangda Wu, YuAn Wang, Wei Li

WikiMT++ is an expanded and refined version of WikiMusicText (WikiMT), featuring 1010 curated lead sheets in ABC notation.

Emotion Classification Information Retrieval +3

Paper
Add Code

MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation

1 code implementation • 22 Sep 2023 • Jiahao Xie, Wei Li, Xiangtai Li, Ziwei Liu, Yew Soon Ong, Chen Change Loy

We present MosaicFusion, a simple yet effective diffusion-based data augmentation approach for large vocabulary instance segmentation.

Data Augmentation Instance Segmentation +1

Paper
Code

MiChao-HuaFen 1.0: A Specialized Pre-trained Corpus Dataset for Domain-specific Large Models

no code implementations • 21 Sep 2023 • Yidong Liu, FuKai Shang, Fang Wang, Rui Xu, Jun Wang, Wei Li, Yao Li, Conghui He

With the advancement of deep learning technologies, general-purpose large models such as GPT-4 have demonstrated exceptional capabilities across various domains.

Paper
Add Code

SoccerNet 2023 Challenges Results

2 code implementations • 12 Sep 2023 • Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng

More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org.

Action Spotting Camera Calibration +3

Paper
Code

VIGC: Visual Instruction Generation and Correction

2 code implementations • 24 Aug 2023 • Bin Wang, Fan Wu, Xiao Han, Jiahui Peng, Huaping Zhong, Pan Zhang, Xiaoyi Dong, Weijia Li, Wei Li, Jiaqi Wang, Conghui He

A practical solution to this problem would be to utilize the available multimodal large language models (MLLMs) to generate instruction data for vision-language tasks.

Hallucination Image Captioning +1

Paper
Code

WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models

1 code implementation • 21 Aug 2023 • Conghui He, Zhenjiang Jin, Chao Xu, Jiantao Qiu, Bin Wang, Wei Li, Hang Yan, Jiaqi Wang, Dahua Lin

The rise in popularity of ChatGPT and GPT-4 has significantly accelerated the development of large models, leading to the creation of numerous impressive large language models (LLMs) and multimodal large language models (MLLMs).

Paper
Code

Improving Anomaly Segmentation with Multi-Granularity Cross-Domain Alignment

no code implementations • 16 Aug 2023 • Ji Zhang, Xiao Wu, Zhi-Qi Cheng, Qi He, Wei Li

Anomaly segmentation plays a pivotal role in identifying atypical objects in images, crucial for hazard detection in autonomous driving systems.

Autonomous Driving Contrastive Learning

Paper
Add Code

A Self-supervised SAR Image Despeckling Strategy Based on Parameter-sharing Convolutional Neural Networks

no code implementations • 11 Aug 2023 • Liang Chen, Yifei Yin, Hao Shi, Qingqing Sheng, Wei Li

The training image pairs are generated by the sub-sampler from real-world SAR images to estimate the noise distribution.

SAR Image Despeckling

Paper
Add Code

A Hybrid CNN-Transformer Architecture with Frequency Domain Contrastive Learning for Image Deraining

no code implementations • 7 Aug 2023 • Cheng Wang, Wei Li

Image deraining is a challenging task that involves restoring degraded images affected by rain streaks.

Contrastive Learning Rain Removal

Paper
Add Code

Improving Generalization in Visual Reinforcement Learning via Conflict-aware Gradient Agreement Augmentation

no code implementations • ICCV 2023 • Siao Liu, Zhaoyu Chen, Yang Liu, Yuzheng Wang, Dingkang Yang, Zhile Zhao, Ziqing Zhou, Xie Yi, Wei Li, Wenqiang Zhang, Zhongxue Gan

In particular, CG2A develops a Gradient Agreement Solver to adaptively balance the varying gradient magnitudes, and introduces a Soft Gradient Surgery strategy to alleviate the gradient conflicts.

reinforcement-learning

Paper
Add Code

Hyper-pixel-wise Contrastive Learning Augmented Segmentation Network for Old Landslide Detection through Fusing High-Resolution Remote Sensing Images and Digital Elevation Model Data

no code implementations • 2 Aug 2023 • Yiming Zhou, Yuexing Peng, Wei Li, Junchuan Yu, Daqing Ge, Wei Xiang

To extract accurate semantic features, a hyper-pixel-wise contrastive learning augmented segmentation network (HPCL-Net) is proposed, which augments the local salient feature extraction from boundaries of landslides through HPCL-Net and fuses heterogeneous information in the semantic space from high-resolution remote sensing images and digital elevation model data.

Contrastive Learning Landslide segmentation

Paper
Add Code

Adaptive Graph Convolution Networks for Traffic Flow Forecasting

1 code implementation • 7 Jul 2023 • Zhengdao Li, Wei Li, Kai Hwang

The AGC-net is constructed by the Adaptive Graph Convolution (AGC) based on a novel context attention mechanism, which consists of a set of graph wavelets with various learnable scales.

Paper
Code

NeMO: Neural Map Growing System for Spatiotemporal Fusion in Bird's-Eye-View and BDD-Map Benchmark

no code implementations • 7 Jun 2023 • Xi Zhu, Xiya Cao, Zhiwei Dong, Caifa Zhou, Qiangbo Liu, Wei Li, Yongliang Wang

We also provide a new scene-level BEV map evaluation setting along with the corresponding baseline for a more comprehensive comparison.

Autonomous Driving Time Series

Paper
Add Code

Balancing Logit Variation for Long-tailed Semantic Segmentation

1 code implementation • CVPR 2023 • Yuchao Wang, Jingjing Fei, Haochen Wang, Wei Li, Tianpeng Bao, Liwei Wu, Rui Zhao, Yujun Shen

In this way, we manage to close the gap between the feature areas of different categories, resulting in a more balanced representation.

Semantic Segmentation

Paper
Code

Contextual Object Detection with Multimodal Large Language Models

1 code implementation • 29 May 2023 • Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy

Moreover, we present ContextDET, a unified multimodal model that is capable of end-to-end differentiable modeling of visual-language contexts, so as to locate, identify, and associate visual objects with language inputs for human-AI interaction.

Cloze Test Decoder +7

Paper
Code

On the Value of Myopic Behavior in Policy Reuse

no code implementations • 28 May 2023 • Kang Xu, Chenjia Bai, Shuang Qiu, Haoran He, Bin Zhao, Zhen Wang, Wei Li, Xuelong Li

Leveraging learned strategies in unfamiliar scenarios is fundamental to human intelligence.

Paper
Add Code

Language-Guided 3D Object Detection in Point Cloud for Autonomous Driving

no code implementations • 25 May 2023 • Wenhao Cheng, Junbo Yin, Wei Li, Ruigang Yang, Jianbing Shen

In this work, we propose a new multi-modal visual grounding task, termed LiDAR Grounding.

3D Object Detection Autonomous Driving +4

Paper
Add Code

Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation

no code implementations • 23 May 2023 • Haochen Wang, Yujun Shen, Jingjing Fei, Wei Li, Liwei Wu, Yuxi Wang, Zhaoxiang Zhang

To this end, we propose T2S-DA, which we interpret as a form of pulling Target to Source for Domain Adaptation, encouraging the model in learning similar cross-domain features.

Domain Generalization Semantic Segmentation

Paper
Add Code

Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring

no code implementations • 19 May 2023 • Kaiqi Fu, Shaojun Gao, Shuju Shi, Xiaohai Tian, Wei Li, Zejun Ma

Specifically, we first pre-train the model using a reconstruction loss function, by masking phones and their durations jointly on a large amount of unlabeled speech and text prompts.

Self-Supervised Learning

Paper
Add Code

PaLM 2 Technical Report

1 code implementation • 17 May 2023 • Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu

Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.

Ranked #1 on Multi-task Language Understanding on MMLU

Code Generation Common Sense Reasoning +6

Paper
Code

Mlinear: Rethink the Linear Model for Time-series Forecasting

no code implementations • 8 May 2023 • Wei Li, Xiangxu Meng, Chuhao Chen, Jianing Chen

In this paper, we carefully examine the opposing properties of CI and CD, and raise a practical question that has not been effectively answered, e.g., "How to effectively mix the CI and CD properties of time series to achieve better predictive performance?"

Philosophy Time Series +1

Paper
Add Code

TransHP: Image Classification with Hierarchical Prompting

1 code implementation • NeurIPS 2023 • Wenhao Wang, Yifan Sun, Wei Li, Yi Yang

This paper explores a hierarchical prompting mechanism for the hierarchical image classification (HIC) task.

Classification Image Classification

Paper
Code

Adaptive Mask Sampling and Manifold to Euclidean Subspace Learning with Distance Covariance Representation for Hyperspectral Image Classification

1 code implementation • IEEE Transactions on Geoscience and Remote Sensing 2023 • Mingsong Li, Wei Li, Yikun Liu, Yuwen Huang, Gongping Yang

Subsequently, based on distance covariance descriptor, a dual channel distance covariance representation (DC-DCR) module is proposed for modeling unified spectral-spatial feature representations and exploring spectral-spatial relationships, especially linear and nonlinear interdependence in spectral domain.

Ranked #1 on Hyperspectral Image Classification on Indian Pines (OA@5%perclass metric)

Hyperspectral image analysis Hyperspectral Image Classification +1

Paper
Code

Siamese DETR

1 code implementation • CVPR 2023 • Zeren Chen, Gengshi Huang, Wei Li, Jianing Teng, Kun Wang, Jing Shao, Chen Change Loy, Lu Sheng

In this work, we present Siamese DETR, a Siamese self-supervised pretraining approach for the Transformer architecture in DETR.

MULTI-VIEW LEARNING Representation Learning

Paper
Code

MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks

no code implementations • 29 Mar 2023 • Weicheng Kuo, AJ Piergiovanni, Dahun Kim, Xiyang Luo, Ben Caine, Wei Li, Abhijit Ogale, Luowei Zhou, Andrew Dai, Zhifeng Chen, Claire Cui, Anelia Angelova

We propose a novel paradigm of training with a decoder-only model for multimodal tasks, which is surprisingly effective in jointly learning of these disparate vision-language tasks.

Ranked #1 on Video Captioning on MSVD

Cross-Modal Retrieval Decoder +8

Paper
Add Code

Frame-Level Multi-Label Playing Technique Detection Using Multi-Scale Network and Self-Attention Mechanism

1 code implementation • 23 Mar 2023 • Dichucheng Li, Mingjin Che, Wenwu Meng, Yulun Wu, Yi Yu, Fan Xia, Wei Li

Instrument playing technique (IPT) is a key element of musical presentation.

Instrument Playing Technique Detection Multi-Label Classification

Paper
Code

Correlational Image Modeling for Self-Supervised Visual Pre-Training

1 code implementation • CVPR 2023 • Wei Li, Jiahao Xie, Chen Change Loy

We introduce Correlational Image Modeling (CIM), a novel and surprisingly effective approach to self-supervised visual pre-training.

Paper
Code

SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers

no code implementations • 15 Mar 2023 • Guoqiang Jin, Fan Yang, Mingshan Sun, Ruyi Zhao, Yakun Liu, Wei Li, Tianpeng Bao, Liwei Wu, Xingyu Zeng, Rui Zhao

To this end, we propose SeqCo-DETR, a novel Sequence Consistency-based self-supervised method for object DEtection with TRansformers.

Object object-detection +2

Paper
Add Code

DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training

1 code implementation • 6 Mar 2023 • Wei Li, Linchao Zhu, Longyin Wen, Yi Yang

This decoder is both data-efficient and computation-efficient: 1) it only requires the text data for training, easing the burden on the collection of paired data.

Decoder Image Captioning +1

Paper
Code

Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation

no code implementations • CVPR 2023 • Rui Zhao, Wei Li, Zhipeng Hu, Lincheng Li, Zhengxia Zou, Zhenwei Shi, Changjie Fan

In our method, taking the power of large-scale pre-trained multi-modal CLIP and neural rendering, T2P searches both continuous facial parameters and discrete facial parameters in a unified framework.

3D Generation Face Model +3

Paper
Add Code

Efficient Masked Autoencoders with Self-Consistency

no code implementations • 28 Feb 2023 • Zhaowen Li, Yousong Zhu, Zhiyang Chen, Wei Li, Chaoyang Zhao, Liwei Wu, Rui Zhao, Ming Tang, Jinqiao Wang

However, its high random mask ratio would result in two serious problems: 1) the data are not efficiently exploited, which brings inefficient pre-training (e.g., 1600 epochs for MAE vs. 300 epochs for the supervised), and 2) the high uncertainty and inconsistency of the pre-trained model, i.e., the prediction of the same patch may be inconsistent under different mask rounds.

Language Modelling Masked Language Modeling +3

Paper
Add Code

An Iterative Classification and Semantic Segmentation Network for Old Landslide Detection Using High-Resolution Remote Sensing Images

no code implementations • 24 Feb 2023 • Zili Lu, Yuexing Peng, Wei Li, Junchuan Yu, Daqing Ge, Wei Xiang

An object-level contrastive learning (OCL) strategy is employed in the object classification sub-network featuring a siamese network to realize the global features extraction, and a sub-object-level contrastive learning (SOCL) paradigm is designed in the semantic segmentation sub-network to efficiently extract salient features from boundaries of landslides.

Classification Contrastive Learning +3

Paper
Add Code

Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

no code implementations • 21 Feb 2023 • Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

Recent studies on pronunciation scoring have explored the effect of introducing phone embeddings as reference pronunciation, but mostly in an implicit manner, i.e., addition or concatenation of reference phone embedding and actual pronunciation of the target phone as the phone-level pronunciation quality representation.

Paper
Add Code

An ASR-free Fluency Scoring Approach with Self-Supervised Learning

no code implementations • 20 Feb 2023 • Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

A typical fluency scoring system generally relies on an automatic speech recognition (ASR) system to obtain time stamps in input speech for either the subsequent calculation of fluency-related features or directly modeling speech fluency with an end-to-end approach.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

AIIR-MIX: Multi-Agent Reinforcement Learning Meets Attention Individual Intrinsic Reward Mixing Network

no code implementations • 19 Feb 2023 • Wei Li, Weiyan Liu, Shitong Shao, Shiyi Huang

The results show that AIIR-MIX can dynamically assign each agent a real-time intrinsic reward in accordance with their actual contribution.

Multi-agent Reinforcement Learning Starcraft +1

Paper
Add Code

Objective Evaluation-based High-efficiency Learning Framework for Hyperspectral Image Classification

no code implementations • 10 Jan 2023 • Xuming Zhang, Jian Yan, Jia Tian, Wei Li, Xingfa Gu, Qingjiu Tian

This framework comprises two main parts: (i) a leakage-free balanced sampling strategy, and (ii) a modified end-to-end fully convolutional network (FCN) architecture that optimizes the trade-off between accuracy and efficiency.

Hyperspectral Image Classification Vocal Bursts Intensity Prediction

Paper
Add Code

RefSR-NeRF: Towards High Fidelity and Super Resolution View Synthesis

1 code implementation • CVPR 2023 • Xudong Huang, Wei Li, Jie Hu, Hanting Chen, Yunhe Wang

We present Reference-guided Super-Resolution Neural Radiance Field (RefSR-NeRF) that extends NeRF to super resolution and photorealistic novel view synthesis.

Neural Rendering Novel View Synthesis +1

Paper
Code

WeCheck: Strong Factual Consistency Checker via Weakly Supervised Learning

1 code implementation • 20 Dec 2022 • Wenhao Wu, Wei Li, Xinyan Xiao, Jiachen Liu, Sujian Li, Yajuan Lv

As a result, they perform poorly on the real generated text and are biased heavily by their single-source upstream tasks.

Natural Language Inference Question Answering +2

Paper
Code

DCS-RISR: Dynamic Channel Splitting for Efficient Real-world Image Super-Resolution

no code implementations • 15 Dec 2022 • Junbo Qiao, Shaohui Lin, Yunlun Zhang, Wei Li, Jie Hu, Gaoqi He, Changbo Wang, Lizhuang Ma

Real-world image super-resolution (RISR) has received increased focus for improving the quality of SR images under unknown complex degradation.

Image Super-Resolution SSIM

Paper
Add Code

SSDA3D: Semi-supervised Domain Adaptation for 3D Object Detection from Point Cloud

1 code implementation • 6 Dec 2022 • Yan Wang, Junbo Yin, Wei Li, Pascal Frossard, Ruigang Yang, Jianbing Shen

However, these UDA solutions just yield unsatisfactory 3D detection results when there is a severe domain shift, e.g., from Waymo (64-beam) to nuScenes (32-beam).

3D Object Detection Autonomous Driving +5

Paper
Code

Exploring Stochastic Autoregressive Image Modeling for Visual Representation

1 code implementation • 3 Dec 2022 • Yu Qi, Fan Yang, Yousong Zhu, Yufei Liu, Liwei Wu, Rui Zhao, Wei Li

By introducing stochastic prediction and the parallel encoder-decoder, SAIM significantly improves the performance of autoregressive image modeling.

Decoder Self-Supervised Learning

Paper
Code

MIAD: A Maintenance Inspection Dataset for Unsupervised Anomaly Detection

no code implementations • 25 Nov 2022 • Tianpeng Bao, Jiadong Chen, Wei Li, Xiang Wang, Jingjing Fei, Liwei Wu, Rui Zhao, Ye Zheng

However, existing datasets for unsupervised anomaly detection are biased towards manufacturing inspection and do not consider maintenance inspection, which is usually conducted in uncontrolled outdoor environments with varying camera viewpoints, messy backgrounds, and degradation of object surfaces after long-term working.

Unsupervised Anomaly Detection

Paper
Add Code

Delving into Out-of-Distribution Detection with Vision-Language Representations

2 code implementations • 24 Nov 2022 • Yifei Ming, Ziyang Cai, Jiuxiang Gu, Yiyou Sun, Wei Li, Yixuan Li

Recognizing out-of-distribution (OOD) samples is critical for machine learning systems deployed in the open world.

Ranked #5 on Out-of-Distribution Detection on ImageNet-1k vs Places

Out-of-Distribution Detection

Paper
Code

Transformation-Equivariant 3D Object Detection for Autonomous Driving

no code implementations • 22 Nov 2022 • Hai Wu, Chenglu Wen, Wei Li, Xin Li, Ruigang Yang, Cheng Wang

However, it is difficult to apply such networks to 3D object detection in autonomous driving due to their large computation cost and slow reasoning speed.

3D Object Detection Autonomous Driving +3

Paper
Add Code

A Data-driven Case-based Reasoning in Bankruptcy Prediction

no code implementations • 2 Nov 2022 • Wei Li, Wolfgang Karl Härdle, Stefan Lessmann

In addition, we delicately examine the explainability of the CBR system in the decision-making process of bankruptcy prediction.

Decision Making

Paper
Add Code

FRSUM: Towards Faithful Abstractive Summarization via Enhancing Factual Robustness

no code implementations • 1 Nov 2022 • Wenhao Wu, Wei Li, Jiachen Liu, Xinyan Xiao, Ziqiang Cao, Sujian Li, Hua Wu

We first measure a model's factual robustness by its success rate to defend against adversarial attacks when generating factual information.

Abstractive Text Summarization

Paper
Add Code

UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance

no code implementations • 28 Oct 2022 • Wei Li, Xue Xu, Xinyan Xiao, Jiachen Liu, Hu Yang, Guohao Li, Zhanpeng Wang, Zhifan Feng, Qiaoqiao She, Yajuan Lyu, Hua Wu

Diffusion generative models have recently greatly improved the power of text-conditioned image generation.

Image Generation Image-text matching +2

Paper
Add Code

Jet tagging algorithm of graph network with HaarPooling message passing

no code implementations • 25 Oct 2022 • Fei Ma, Feiyi Liu, Wei Li

In this paper, we introduce an approach of GNNs combined with a HaarPooling operation to analyze the events, called HaarPooling Message Passing neural network (HMPNet).

Jet Tagging

Paper
Add Code

Precisely the Point: Adversarial Augmentations for Faithful and Informative Text Generation

no code implementations • 22 Oct 2022 • Wenhao Wu, Wei Li, Jiachen Liu, Xinyan Xiao, Sujian Li, Yajuan Lyu

Though model robustness has been extensively studied in language understanding, the robustness of Seq2Seq generation remains understudied.

Informativeness Text Generation

Paper
Add Code

Robot Navigation with Reinforcement Learned Path Generation and Fine-Tuned Motion Control

no code implementations • 19 Oct 2022 • Longyuan Zhang, Ziyue Hou, Ji Wang, Ziang Liu, Wei Li

Multiple predictive path points are dynamically generated by a deep Markov model optimized using an RL approach for the robot to track.

Reinforcement Learning (RL) Robot Navigation

Paper
Add Code

Zero-shot point cloud segmentation by transferring geometric primitives

no code implementations • 18 Oct 2022 • Runnan Chen, Xinge Zhu, Nenglun Chen, Wei Li, Yuexin Ma, Ruigang Yang, Wenping Wang

To this end, we propose a novel framework to learn the geometric primitives shared in seen and unseen categories' objects and employ a fine-grained alignment between language and the learned geometric primitives.

Point Cloud Segmentation Semantic Segmentation

Paper
Add Code

HiSMatch: Historical Structure Matching based Temporal Knowledge Graph Reasoning

no code implementations • 18 Oct 2022 • Zixuan Li, Zhongni Hou, Saiping Guan, Xiaolong Jin, Weihua Peng, Long Bai, Yajuan Lyu, Wei Li, Jiafeng Guo, Xueqi Cheng

This is actually a matching task between a query and candidate entities based on their historical structures, which reflect behavioral trends of the entities at different timestamps.

Relation

Paper
Add Code

Unified Vision and Language Prompt Learning

1 code implementation • 13 Oct 2022 • Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy

Prompt tuning, a parameter- and data-efficient transfer learning paradigm that tunes only a small number of parameters in a model's input space, has become a trend in the vision community since the emergence of large vision-language models like CLIP.

Domain Generalization Few-Shot Learning +2

Paper
Code

Stock Trading Volume Prediction with Dual-Process Meta-Learning

1 code implementation • 11 Oct 2022 • Ruibo Chen, Wei Li, Zhiyuan Zhang, Ruihan Bao, Keiko Harimoto, Xu Sun

Our method can model the common pattern behind different stocks with a meta-learner, while modeling the specific pattern for each stock across time spans with stock-dependent parameters.

Algorithmic Trading Meta-Learning

Paper
Code

SoccerNet 2022 Challenges Results

7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li

The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.

Action Spotting Camera Calibration +3

Paper
Code

Misaligned orientations of 4f optical neural network for image classification accuracy on various datasets

no code implementations • 5 Oct 2022 • Yanbing Liu, Wei Li, Kun Cheng, Xun Liu, Wei Yang

In order to comprehensively investigate the influence caused by the misalignment, we proposed a method for estimating the performance of a 4f-ONN in response to various misalignments in the context of the image classification task. The misalignment in numerical simulation is estimated by manipulating the optical intensity distributions in the fourth focus plane in the 4f system.

Classification Image Classification

Paper
Add Code

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

2 code implementations • 28 Sep 2022 • Zhiyang Chen, Yousong Zhu, Zhaowen Li, Fan Yang, Wei Li, Haixin Wang, Chaoyang Zhao, Liwei Wu, Rui Zhao, Jinqiao Wang, Ming Tang

Obj2Seq is able to flexibly determine input categories to satisfy customized requirements, and be easily extended to different visual tasks.

Multi-Label Classification Object +2

Paper
Code

Open-Ended Diverse Solution Discovery with Regulated Behavior Patterns for Cross-Domain Adaptation

no code implementations • 24 Sep 2022 • Kang Xu, Yan Ma, Bingsheng Wei, Wei Li

While Reinforcement Learning can achieve impressive results for complex tasks, the learned policies are generally prone to fail in downstream tasks with even minor model mismatch or unexpected perturbations.

Domain Adaptation

Paper
Add Code

Quantification before Selection: Active Dynamics Preference for Robust Reinforcement Learning

no code implementations • 23 Sep 2022 • Kang Xu, Yan Ma, Wei Li

Our key insight is that dynamic systems with different parameters provide different levels of difficulty for the policy, and the difficulty of behaving well in a system is constantly changing due to the evolution of the policy.

Informativeness reinforcement-learning +1

Paper
Add Code

GANet: Goal Area Network for Motion Forecasting

1 code implementation • 20 Sep 2022 • Mingkun Wang, Xinge Zhu, Changqian Yu, Wei Li, Yuexin Ma, Ruochun Jin, Xiaoguang Ren, Dongchun Ren, Mingxu Wang, Wenjing Yang

In view of this, we propose a new goal area-based framework, named Goal Area Network (GANet), for motion forecasting, which models goal areas rather than exact goal coordinates as preconditions for trajectory prediction, performing more robustly and accurately.

Ranked #15 on Motion Forecasting on Argoverse CVPR 2020

Motion Forecasting Trajectory Prediction

Paper
Code

Playing Technique Detection by Fusing Note Onset Information in Guzheng Performance

no code implementations • 19 Sep 2022 • Dichucheng Li, Yulun Wu, Qinyu Li, Jiahao Zhao, Yi Yu, Fan Xia, Wei Li

Because each Guzheng playing technique is applied to a note, a dedicated onset detector is trained to divide an audio recording into several notes, and its predictions are fused with frame-wise IPT predictions.

Paper
Add Code

LO-Det: Lightweight Oriented Object Detection in Remote Sensing Images

no code implementations • 16 Sep 2022 • Zhanchao Huang, Wei Li, Xiang-Gen Xia, Hao Wang, Feiran Jie, Ran Tao

Specifically, a channel separation-aggregation (CSA) structure is designed to reduce the complexity of stacked separable convolutions, and a dynamic receptive field (DRF) mechanism is developed to maintain high accuracy by customizing the convolution kernel and its perception range dynamically when reducing the network complexity.

object-detection Object Detection +1

Paper
Add Code

Learning from Future: A Novel Self-Training Framework for Semantic Segmentation

1 code implementation • 15 Sep 2022 • Ye Du, Yujun Shen, Haochen Wang, Jingjing Fei, Wei Li, Liwei Wu, Rui Zhao, Zehua Fu, Qingjie Liu

Self-training has shown great potential in semi-supervised learning.

Pseudo Label Semi-Supervised Semantic Segmentation +1

Paper
Code

Multi-Grained Angle Representation for Remote Sensing Object Detection

1 code implementation • 7 Sep 2022 • Hao Wang, Zhanchao Huang, Zhengchao Chen, Ying Song, Wei Li

The existing AOOD methods face the challenges of ambiguity and high costs in angle representation.

Object object-detection +2

611

Paper
Code

Language-aware Domain Generalization Network for Cross-Scene Hyperspectral Image Classification

no code implementations • 6 Sep 2022 • Yuxiang Zhang, Mengmeng Zhang, Wei Li, Shuai Wang, Ran Tao

Text information including extensive prior knowledge about land cover classes has been ignored in hyperspectral image classification (HSI) tasks.

Contrastive Learning Domain Generalization +1

Paper
Add Code

Task-wise Sampling Convolutions for Arbitrary-Oriented Object Detection in Aerial Images

1 code implementation • 6 Sep 2022 • Zhanchao Huang, Wei Li, Xiang-Gen Xia, Hao Wang, Ran Tao

Specifically, sampling positions of the localization convolution in TS-Conv are supervised by the oriented bounding box (OBB) prediction associated with spatial coordinates, while sampling positions and convolutional kernel of the classification convolution are designed to be adaptively adjusted according to different orientations for improving the orientation robustness of features.

object-detection Object Detection In Aerial Images +1

611

Paper
Code

Single-source Domain Expansion Network for Cross-Scene Hyperspectral Image Classification

no code implementations • 4 Sep 2022 • Yuxiang Zhang, Wei Li, Weidong Sun, Ran Tao, Qian Du

Currently, cross-scene hyperspectral image (HSI) classification has drawn increasing attention.

Contrastive Learning Decoder +3

Paper
Add Code

Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies

1 code implementation • 3 Sep 2022 • Xingrun Xing, Yangguang Li, Wei Li, Wenrui Ding, Yalong Jiang, Yufeng Wang, Jing Shao, Chunlei Liu, Xianglong Liu

Second, to improve the robustness of binary models with contextual dependencies, we compute the contextual dynamic embeddings to determine the binarization thresholds in general binary convolutional blocks.

Binarization Inductive Bias

Paper
Code

DeepInteraction: 3D Object Detection via Modality Interaction

2 code implementations • 23 Aug 2022 • Zeyu Yang, Jiaqi Chen, Zhenwei Miao, Wei Li, Xiatian Zhu, Li Zhang

Existing top-performance 3D object detectors typically rely on the multi-modal fusion strategy.

3D Object Detection Decoder +3

192

Paper
Code

Rethinking Graph Neural Networks for the Graph Coloring Problem

no code implementations • 15 Aug 2022 • Wei Li, Ruxuan Li, Yuzhe Ma, Siu On Chan, David Pan, Bei Yu

Graph coloring, a classical and critical NP-hard problem, is the problem of assigning colors to nodes so that connected nodes receive different colors wherever possible.

Paper
Add Code

Making the Best of Both Worlds: A Domain-Oriented Transformer for Unsupervised Domain Adaptation

1 code implementation • 2 Aug 2022 • Wenxuan Ma, Jinming Zhang, Shuang Li, Chi Harold Liu, Yulin Wang, Wei Li

To alleviate these issues, we propose to simultaneously conduct feature alignment in two individual spaces focusing on different domains, and create for each space a domain-oriented classifier tailored specifically for that domain.

Pseudo Label Unsupervised Domain Adaptation

Paper
Code

Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios

4 code implementations • 12 Jul 2022 • Jiashi Li, Xin Xia, Wei Li, Huixia Li, Xing Wang, Xuefeng Xiao, Rui Wang, Min Zheng, Xin Pan

Then, Next Hybrid Strategy (NHS) is designed to stack NCB and NTB in an efficient hybrid paradigm, which boosts performance in various downstream tasks.

Ranked #281 on Image Classification on ImageNet

Image Classification

29,890

Paper
Code

Positive-Negative Equal Contrastive Loss for Semantic Segmentation

no code implementations • 4 Jul 2022 • Jing Wang, Jiangyun Li, Wei Li, Lingfei Xuan, Tianxiang Zhang, Wenxuan Wang

Contextual information is critical for various computer vision tasks; previous works commonly design plug-and-play modules and structural losses to effectively extract and aggregate the global context.

Contrastive Learning Semantic Segmentation

Paper
Add Code

Graph Information Aggregation Cross-Domain Few-Shot Learning for Hyperspectral Image Classification

1 code implementation • IEEE Transactions on Neural Networks and Learning Systems 2022 • Yuxiang Zhang, Wei Li, Mengmeng Zhang, Shuai Wang, Ran Tao, Qian Du

The IDE-block is used to characterize and aggregate the intradomain nonlocal relationships, and the interdomain feature and distribution similarities are captured in the CSA-block.

Ranked #3 on Hyperspectral Image Classification on Kennedy Space Center

cross-domain few-shot learning Domain Adaptation +2

Paper
Code

SJ-HD^2R: Selective Joint High Dynamic Range and Denoising Imaging for Dynamic Scenes

no code implementations • 20 Jun 2022 • Wei Li, Shuai Xiao, Tianhong Dai, Shanxin Yuan, Tao Wang, Cheng Li, Fenglong Song

To further leverage these two paradigms, we propose a selective and joint HDR and denoising (SJ-HD$^2$R) imaging framework, utilizing scenario-specific priors to conduct the path selection with an accuracy of more than 93.3$\%$.

Denoising

Paper
Add Code

Masked Frequency Modeling for Self-Supervised Visual Pre-Training

3 code implementations • 15 Jun 2022 • Jiahao Xie, Wei Li, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy

We present Masked Frequency Modeling (MFM), a unified frequency-domain-based approach for self-supervised pre-training of visual models.

Image Classification Image Restoration +2

Paper
Code

Toward Real-world Single Image Deraining: A New Benchmark and Beyond

1 code implementation • 11 Jun 2022 • Wei Li, Qiming Zhang, Jing Zhang, Zhen Huang, Xinmei Tian, Dacheng Tao

To address these issues, we establish a new high-quality dataset named RealRain-1k, consisting of $1,120$ high-resolution paired clean and rainy images with low- and high-density rain streaks, respectively.

Domain Generalization Image Restoration +2

Paper
Code

MPANet: Multi-Patch Attention For Infrared Small Target object Detection

no code implementations • 5 Jun 2022 • Ao Wang, Wei Li, Xin Wu, Zhanchao Huang, Ran Tao

To this end, a multi-patch attention network (MPANet) based on the axial-attention encoder and the multi-scale patch branch (MSPB) structure is proposed.

object-detection Object Detection

Paper
Add Code

Benchmarking Unsupervised Anomaly Detection and Localization

no code implementations • 30 May 2022 • Ye Zheng, Xiang Wang, Yu Qi, Wei Li, Liwei Wu

Since the MVTec AD dataset was proposed, a steady stream of new methods has pushed precision on it to saturation.

Benchmarking Unsupervised Anomaly Detection

Paper
Add Code

Do We Really Need to Use Constraint Violation in Constrained Evolutionary Multi-Objective Optimization?

no code implementations • 28 May 2022 • Shuang Li, Ke Li, Wei Li

Constraint violation has been a building block to design evolutionary multi-objective optimization algorithms for solving constrained multi-objective optimization problems.

Paper
Add Code

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

no code implementations • 25 May 2022 • Eduardo Pérez-Pellitero, Sibi Catley-Chandar, Richard Shaw, Aleš Leonardis, Radu Timofte, Zexin Zhang, Cen Liu, Yunbo Peng, Yue Lin, Gaocheng Yu, Jin Zhang, Zhe Ma, Hongbin Wang, Xiangyu Chen, Xintao Wang, Haiwei Wu, Lin Liu, Chao Dong, Jiantao Zhou, Qingsen Yan, Song Zhang, Weiye Chen, Yuhang Liu, Zhen Zhang, Yanning Zhang, Javen Qinfeng Shi, Dong Gong, Dan Zhu, Mengdi Sun, Guannan Chen, Yang Hu, Haowei Li, Baozhu Zou, Zhen Liu, Wenjie Lin, Ting Jiang, Chengzhi Jiang, Xinpeng Li, Mingyan Han, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Juan Marín-Vega, Michael Sloth, Peter Schneider-Kamp, Richard Röttger, Chunyang Li, Long Bao, Gang He, Ziyao Xu, Li Xu, Gen Zhan, Ming Sun, Xing Wen, Junlin Li, Shuang Feng, Fei Lei, Rui Liu, Junxiang Ruan, Tianhong Dai, Wei Li, Zhan Lu, Hengyan Liu, Peian Huang, Guangyu Ren, Yonglin Luo, Chang Liu, Qiang Tu, Fangya Li, Ruipeng Gang, Chenghua Li, Jinjing Li, Sai Ma, Chenming Liu, Yizhen Cao, Steven Tel, Barthelemy Heyrman, Dominique Ginhac, Chul Lee, Gahyeon Kim, Seonghyun Park, An Gia Vien, Truong Thanh Nhat Mai, Howoon Yoon, Tu Vo, Alexander Holston, Sheir Zaheer, Chan Y. Park

The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e., solutions cannot exceed a given number of operations).

Image Restoration Vocal Bursts Intensity Prediction

Paper
Add Code

A Survey on Hyperspectral Image Restoration: From the View of Low-Rank Tensor Approximation

no code implementations • 18 May 2022 • Na Liu, Wei Li, Yinjian Wang, Rao Tao, Qian Du, Jocelyn Chanussot

The ability to capture fine spectral discriminative information enables hyperspectral images (HSIs) to observe, detect and identify objects with subtle spectral discrepancies.

Deblurring Denoising +2

Paper
Add Code

$(O,G)$-granular variable precision fuzzy rough sets based on overlap and grouping functions

no code implementations • 18 May 2022 • Wei Li, Bin Yang, Junsheng Qiao

In this paper, the depiction of $(O, G)$-granular variable precision fuzzy rough sets ($(O, G)$-GVPFRSs for short) is first given based on overlap and grouping functions.

Paper
Add Code

Some neighborhood-related fuzzy covering-based rough set models and their applications for decision making

no code implementations • 13 May 2022 • Gongao Qi, Bin Yang, Wei Li

To further generalize the FRS theory to more complicated data environments, we first propose four types of fuzzy neighborhood operators based on fuzzy covering by overlap functions and their implicators in this paper.

Decision Making

Paper
Add Code

On three types of $L$-fuzzy $β$-covering-based rough sets

no code implementations • 13 May 2022 • Wei Li, Bin Yang, Junsheng Qiao

In this paper, we mainly construct three types of $L$-fuzzy $\beta$-covering-based rough set models and study the axiom sets, matrix representations and interdependency of these three pairs of $L$-fuzzy $\beta$-covering-based rough approximation operators.

valid

Paper
Add Code

Answer-Me: Multi-Task Open-Vocabulary Visual Question Answering

no code implementations • 2 May 2022 • AJ Piergiovanni, Wei Li, Weicheng Kuo, Mohammad Saffar, Fred Bertsch, Anelia Angelova

We present Answer-Me, a task-aware multi-task framework which unifies a variety of question answering tasks, such as visual question answering, visual entailment, and visual reasoning.

Decoder Image Captioning +5

Paper
Add Code

HarmoF0: Logarithmic Scale Dilated Convolution For Pitch Estimation

1 code implementation • 2 May 2022 • Weixing Wei, Peilin Li, Yi Yu, Wei Li

Sounds, especially music, contain various harmonic components scattered in the frequency dimension.

Paper
Code

Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention

no code implementations • 24 Apr 2022 • Yanxiong Li, Wucheng Wang, Hao Chen, Wenchang Cao, Wei Li, Qianhua He

Although few-shot learning has attracted much attention from the fields of image and audio classification, few efforts have been made on few-shot speaker identification.

Audio Classification Few-Shot Learning +1

Paper
Add Code

Explore More Guidance: A Task-aware Instruction Network for Sign Language Translation Enhanced with Data Augmentation

1 code implementation • Findings (NAACL) 2022 • Yong Cao, Wei Li, Xianzhi Li, Min Chen, Guangyong Chen, Long Hu, Zhengdao Li, Hwang Kai

Sign language recognition and translation first uses a recognition module to generate glosses from sign language videos and then employs a translation module to translate glosses into spoken sentences.

Data Augmentation Sign Language Recognition +2

Paper
Code

A3CLNN: Spatial, Spectral and Multiscale Attention ConvLSTM Neural Network for Multisource Remote Sensing Data Classification

no code implementations • 9 Apr 2022 • Heng-Chao Li, Wen-Shuai Hu, Wei Li, Jun Li, Qian Du, Antonio Plaza

The problem of effectively exploiting the information from multiple data sources has become a relevant but challenging research topic in remote sensing.

Transfer Learning

Paper
Add Code

Faster-TAD: Towards Temporal Action Detection with Proposal Generation and Classification in a Unified Network

no code implementations • 6 Apr 2022 • Shimin Chen, Chen Chen, Wei Li, Xunqiang Tao, Yandong Guo

In this paper, we propose a unified network for TAD, termed Faster-TAD, by re-purposing a Faster-RCNN like architecture.

Action Detection Action Spotting

Paper
Add Code

SEAL: A Large-scale Video Dataset of Multi-grained Spatio-temporally Action Localization

no code implementations • 6 Apr 2022 • Shimin Chen, Wei Li, Chen Chen, Jianyang Gu, Jiaming Chu, Xunqiang Tao, Yandong Guo

SEAL consists of two kinds of annotations, SEAL Tubes and SEAL Clips.

Action Recognition Spatio-Temporal Action Localization +2

Paper
Add Code

MS-HLMO: Multi-scale Histogram of Local Main Orientation for Remote Sensing Image Registration

no code implementations • 1 Apr 2022 • Chenzhong Gao, Wei Li, Ran Tao, Qian Du

Considering the characteristics and differences of multi-source remote sensing images, a feature-based registration algorithm named Multi-scale Histogram of Local Main Orientation (MS-HLMO) is proposed.

Image Registration

Paper
Add Code

Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models

1 code implementation • CVPR 2022 • Feng Cheng, Mingze Xu, Yuanjun Xiong, Hao Chen, Xinyu Li, Wei Li, Wei Xia

We propose a memory efficient method, named Stochastic Backpropagation (SBP), for training deep neural networks on videos.

Action Detection Action Recognition

Paper
Code

MeMOT: Multi-Object Tracking with Memory

no code implementations • CVPR 2022 • Jiarui Cai, Mingze Xu, Wei Li, Yuanjun Xiong, Wei Xia, Zhuowen Tu, Stefano Soatto

We propose an online tracking algorithm that performs the object detection and data association under a common framework, capable of linking objects after a long time span.

Multi-Object Tracking Object +2

Paper
Add Code

FindIt: Generalized Localization with Natural Language Queries

no code implementations • 31 Mar 2022 • Weicheng Kuo, Fred Bertsch, Wei Li, AJ Piergiovanni, Mohammad Saffar, Anelia Angelova

We propose FindIt, a simple and versatile framework that unifies a variety of visual grounding and localization tasks including referring expression comprehension, text-based localization, and object detection.

Natural Language Queries Object +5

Paper
Add Code

SepViT: Separable Vision Transformer

2 code implementations • 29 Mar 2022 • Wei Li, Xing Wang, Xin Xia, Jie Wu, Jiashi Li, Xuefeng Xiao, Min Zheng, Shiping Wen

Vision Transformers have witnessed prevailing success in a series of vision tasks.

Instance Segmentation object-detection +1

18,106

Paper
Code

Open-Vocabulary DETR with Conditional Matching

1 code implementation • 22 Mar 2022 • Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy

To this end, we propose a novel open-vocabulary detector based on DETR -- hence the name OV-DETR -- which, once trained, can detect any object given its class name or an exemplar image.

Ranked #21 on Open Vocabulary Object Detection on MSCOCO

Language Modelling object-detection +1

194

Paper
Code

Towards 3D Scene Understanding by Referring Synthetic Models

no code implementations • 20 Mar 2022 • Runnan Chen, Xinge Zhu, Nenglun Chen, Dawei Wang, Wei Li, Yuexin Ma, Ruigang Yang, Wenping Wang

Promising performance has been achieved for visual perception on the point cloud.

Scene Understanding Transfer Learning

Paper
Add Code

UNIMO-2: End-to-End Unified Vision-Language Grounded Learning

1 code implementation • Findings (ACL) 2022 • Wei Li, Can Gao, Guocheng Niu, Xinyan Xiao, Hao Liu, Jiachen Liu, Hua Wu, Haifeng Wang

In particular, we propose to conduct grounded learning on both images and texts via a shared grounded space, which helps bridge unaligned images and texts, and align the visual and textual semantic spaces on different types of corpora.

1,694

Paper
Code

Complex Evolutional Pattern Learning for Temporal Knowledge Graph Reasoning

1 code implementation • ACL 2022 • Zixuan Li, Saiping Guan, Xiaolong Jin, Weihua Peng, Yajuan Lyu, Yong Zhu, Long Bai, Wei Li, Jiafeng Guo, Xueqi Cheng

Furthermore, these models are all trained offline, so they cannot adapt well to subsequent changes in evolutional patterns.

Paper
Code

Efficient universal shuffle attack for visual object tracking

no code implementations • 14 Mar 2022 • Siao Liu, Zhaoyu Chen, Wei Li, Jiwei Zhu, Jiafeng Wang, Wenqiang Zhang, Zhongxue Gan

Recently, adversarial attacks have been applied in visual object tracking to deceive deep trackers by injecting imperceptible perturbations into video frames.

Adversarial Attack Computational Efficiency +2

Paper
Add Code

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training

no code implementations • CVPR 2022 • Zhaowen Li, Yousong Zhu, Fan Yang, Wei Li, Chaoyang Zhao, Yingying Chen, Zhiyang Chen, Jiahao Xie, Liwei Wu, Rui Zhao, Ming Tang, Jinqiao Wang

Furthermore, our method can also exploit single-centric-object datasets such as ImageNet: it outperforms BYOL by 2.5% with the same pre-training epochs in linear probing and surpasses current self-supervised object detection methods on the COCO dataset, demonstrating its universality and potential.

Image Classification Object +4

Paper
Add Code

Faithfulness in Natural Language Generation: A Systematic Survey of Analysis, Evaluation and Optimization Methods

no code implementations • 10 Mar 2022 • Wei Li, Wenhao Wu, Moye Chen, Jiachen Liu, Xinyan Xiao, Hua Wu

In this survey, we provide a systematic overview of the research progress on the faithfulness problem of NLG, including problem analysis, evaluation metrics and optimization methods.

Abstractive Text Summarization Data-to-Text Generation +2

Paper
Add Code

Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

1 code implementation • CVPR 2022 • Yuchao Wang, Haochen Wang, Yujun Shen, Jingjing Fei, Wei Li, Guoqiang Jin, Liwei Wu, Rui Zhao, Xinyi Le

A common practice is to select the highly confident predictions as the pseudo ground-truth, but it leads to a problem that most pixels may be left unused due to their unreliability.

Ranked #3 on Semi-Supervised Semantic Segmentation on PASCAL VOC 2012 50%

Semi-Supervised Semantic Segmentation

410

Paper
Code

A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds

1 code implementation • 8 Mar 2022 • Yan Xia, Qiangqiang Wu, Wei Li, Antoni B. Chan, Uwe Stilla

Recent works on 3D single object tracking treat the task as a target-specific 3D detection task, where an off-the-shelf 3D detector is commonly employed for the tracking.

3D Single Object Tracking motion prediction +1

Paper
Code

Improving Non-native Word-level Pronunciation Scoring with Phone-level Mixup Data Augmentation and Multi-source Information

no code implementations • 1 Mar 2022 • Kaiqi Fu, Shaojun Gao, Kai Wang, Wei Li, Xiaohai Tian, Zejun Ma

Moreover, we utilize multi-source information (e.g., MFCC and deep features) to further improve the scoring system performance.

Data Augmentation Word-level pronunciation scoring

Paper
Add Code

DEEPCHORUS: A Hybrid Model of Multi-scale Convolution and Self-attention for Chorus Detection

1 code implementation • 13 Feb 2022 • Qiqi He, Xiaoheng Sun, Yi Yu, Wei Li

Chorus detection is a challenging problem in musical signal processing as the chorus often repeats more than once in popular songs, usually with rich instruments and complex rhythm forms.

Paper
Code

TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music

1 code implementation • 2 Feb 2022 • Ke Chen, Shuai Yu, Cheng-i Wang, Wei Li, Taylor Berg-Kirkpatrick, Shlomo Dubnov

In this paper, we propose TONet, a plug-and-play model that improves both tone and octave perceptions by leveraging a novel input representation and a novel network architecture.

Decoder Information Retrieval +3

Paper
Code

A Joint Morphological Profiles and Patch Tensor Change Detection for Hyperspectral Imagery

no code implementations • 20 Jan 2022 • Zengfu Hou, Wei Li

Multi-temporal hyperspectral images can be used to detect changed information, which has gradually attracted researchers' attention.

Change Detection Image Reconstruction

Paper
Add Code

DDU-Net: Dual-Decoder-U-Net for Road Extraction Using High-Resolution Remote Sensing Images

no code implementations • 18 Jan 2022 • Ying Wang, Yuexing Peng, Xinran Liu, Wei Li, George C. Alexandropoulos, Junchuan Yu, Daqing Ge, Wei Xiang

Extracting roads from high-resolution remote sensing images (HRSIs) is vital in a wide variety of applications, such as autonomous driving, path planning, and road navigation.

Autonomous Driving Decoder

Paper
Add Code

Evolutionary Action Selection for Gradient-based Policy Learning

no code implementations • 12 Jan 2022 • Yan Ma, Tianxing Liu, Bingsheng Wei, Yi Liu, Kang Xu, Wei Li

Evolutionary Algorithms (EAs) and Deep Reinforcement Learning (DRL) have recently been integrated to take advantage of both methods for better exploration and exploitation. The evolutionary part in these hybrid methods maintains a population of policy networks. However, existing methods focus on optimizing the parameters of the policy network, a space that is usually high-dimensional and tricky for EAs. In this paper, we shift the target of evolution from the high-dimensional parameter space to the low-dimensional action space. We propose Evolutionary Action Selection-Twin Delayed Deep Deterministic Policy Gradient (EAS-TD3), a novel hybrid method of EA and DRL. In EAS, we focus on optimizing the action chosen by the policy network and attempt to obtain high-quality actions to promote policy learning through an evolutionary algorithm.

Continuous Control Evolutionary Algorithms

Paper
Add Code

Generative adversarial network for super-resolution imaging through a fiber

no code implementations • 3 Jan 2022 • Wei Li, Ksenia Abrashitova, Gerwin Osnabrugge, Lyubov V. Amitonova

A multimode fiber represents the ultimate limit in miniaturization of imaging endoscopes.

Compressive Sensing Generative Adversarial Network +2

Paper
Add Code

PPDL: Predicate Probability Distribution Based Loss for Unbiased Scene Graph Generation

no code implementations • CVPR 2022 • Wei Li, Haiwei Zhang, Qijie Bai, Guoqing Zhao, Ning Jiang, Xiaojie Yuan

However, the application value of SG on downstream tasks is severely limited by the predicate classification bias, which is caused by long-tailed data and presented as semantic bias of predicted relation predicates.

Graph Generation Predicate Classification +1

Paper
Add Code

Large-Scale Video Panoptic Segmentation in the Wild: A Benchmark

1 code implementation • CVPR 2022 • Jiaxu Miao, Xiaohan Wang, Yu Wu, Wei Li, Xu Zhang, Yunchao Wei, Yi Yang

In contrast, our large-scale VIdeo Panoptic Segmentation in the Wild (VIPSeg) dataset provides 3,536 videos and 84,750 frames with pixel-level panoptic annotations, covering a wide range of real-world scenarios and categories.

Segmentation Video Panoptic Segmentation

120

Paper
Code

An Intelligent Self-driving Truck System For Highway Transportation

no code implementations • 31 Dec 2021 • Dawei Wang, Lingping Gao, Ziquan Lan, Wei Li, Jiaping Ren, Jiahui Zhang, Peng Zhang, Pei Zhou, Shengao Wang, Jia Pan, Dinesh Manocha, Ruigang Yang

Recently, there have been many advances in the autonomous driving community, attracting a lot of attention from academia and industry.

Autonomous Driving Decision Making

Paper
Add Code

Transfer learning of phase transitions in percolation and directed percolation

no code implementations • 31 Dec 2021 • Jianmin Shen, Feiyi Liu, Shiyang Chen, Dian Xu, Xiangna Chen, Shengfeng Deng, Wei Li, Gabor Papp, Chunbin Yang

With the DANN, only a small fraction of input configurations (2d images) need to be labeled, and these are chosen automatically, in order to capture the critical point.

Transfer Learning

Paper
Add Code

Improving the Transferability of Adversarial Examples with Resized-Diverse-Inputs, Diversity-Ensemble and Region Fitting

2 code implementations • ECCV 2020 • Junhua Zou, Zhisong Pan, Junyang Qiu, Xin Liu, Ting Rui, Wei Li

RDIM and region fitting do not require extra running time and these three steps can be well integrated into other attacks.

143

Paper
Code

Variational Autoencoder with CCA for Audio-Visual Cross-Modal Retrieval

no code implementations • 5 Dec 2021 • Jiwei Zhang, Yi Yu, Suhua Tang, Jianming Wu, Wei Li

On the one hand, audio encoder and visual encoder separately encode audio data and visual data into two different latent spaces.

Cross-Modal Retrieval Information Retrieval +1

Paper
Add Code

Exploiting Both Domain-specific and Invariant Knowledge via a Win-win Transformer for Unsupervised Domain Adaptation

no code implementations • 25 Nov 2021 • Wenxuan Ma, Jinming Zhang, Shuang Li, Chi Harold Liu, Yulin Wang, Wei Li

Unsupervised Domain Adaptation (UDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain.

Transfer Learning Unsupervised Domain Adaptation

Paper
Add Code

FastFlow: Unsupervised Anomaly Detection and Localization via 2D Normalizing Flows

5 code implementations • 15 Nov 2021 • Jiawei Yu, Ye Zheng, Xiang Wang, Wei Li, Yushuang Wu, Rui Zhao, Liwei Wu

However, current methods cannot effectively map image features to a tractable base distribution, and they ignore the relationship between local and global features, which is important for identifying anomalies.

Ranked #20 on Anomaly Detection on MVTec AD

Unsupervised Anomaly Detection Weakly Supervised Defect Detection

2,789

Paper
Code

SgSum: Transforming Multi-document Summarization into Sub-graph Selection

1 code implementation • 25 Oct 2021 • Moye Chen, Wei Li, Jiachen Liu, Xinyan Xiao, Hua Wu, Haifeng Wang

Document Summarization Multi-Document Summarization +1

1,694

Paper
Code

Deep Learning for UAV-based Object Detection and Tracking: A Survey

no code implementations • 25 Oct 2021 • Xin Wu, Wei Li, Danfeng Hong, Ran Tao, Qian Du

Owing to their effective and flexible data acquisition, unmanned aerial vehicles (UAVs) have recently become a hot topic across the fields of computer vision (CV) and remote sensing (RS).

Management Object +3

Paper
Add Code

Learning UI Navigation through Demonstrations composed of Macro Actions

no code implementations • 16 Oct 2021 • Wei Li

The action space is restricted to the UI elements plus a few global actions.

Optical Character Recognition (OCR)

Paper
Add Code

MC-LCR: Multi-modal contrastive classification by locally correlated representations for effective face forgery detection

no code implementations • 7 Oct 2021 • Gaojian Wang, Qian Jiang, Xin Jin, Wei Li, Xiaohui Cui

Moreover, we make a key observation that subtle forgery artifacts can be further exposed in the patch-wise phase and amplitude spectrum and exhibit different clues.

Paper
Add Code

Referring Self-supervised Learning on 3D Point Cloud

no code implementations • 29 Sep 2021 • Runnan Chen, Xinge Zhu, Nenglun Chen, Dawei Wang, Wei Li, Yuexin Ma, Ruigang Yang, Wenping Wang

In this paper, we study a new problem named Referring Self-supervised Learning (RSL) on 3D scene understanding: Given the 3D synthetic models with labels and the unlabeled 3D real scene scans, our goal is to distinguish the identical semantic objects on an unseen scene according to the referring synthetic 3D models.

Scene Understanding Self-Supervised Learning

Paper
Add Code

A General Gaussian Heatmap Label Assignment for Arbitrary-Oriented Object Detection

1 code implementation • 27 Sep 2021 • Zhanchao Huang, Wei Li, Xiang-Gen Xia, Ran Tao

Specifically, an anchor-free object-adaptation label assignment (OLA) strategy is presented to define the positive candidates based on two-dimensional (2-D) oriented Gaussian heatmaps, which reflect the shape and direction features of arbitrary-oriented objects.

Ranked #31 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection Object Detection In Aerial Images +1

611

Paper
Code

CENN: Conservative energy method based on neural networks with subdomains for solving variational problems involving heterogeneous and complex geometries

1 code implementation • 25 Sep 2021 • Yizheng Wang, Jia Sun, Wei Li, Zaiyuan Lu, Yinghua Liu

The advantages of the proposed method are higher efficiency, higher accuracy, and fewer hyperparameters than the strong-form PINN with subdomains.

Paper
Code

Tied & Reduced RNN-T Decoder

no code implementations • 15 Sep 2021 • Rami Botros, Tara N. Sainath, Robert David, Emmanuel Guzman, Wei Li, Yanzhang He

Previous works on the Recurrent Neural Network-Transducer (RNN-T) models have shown that, under some conditions, it is possible to simplify its prediction network with little or no loss in recognition accuracy (arXiv:2003.07705 [eess.AS], [2], arXiv:2012.06749 [cs.CL]).

Decoder Language Modelling

Paper
Add Code

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception

1 code implementation • 12 Sep 2021 • Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Wei Li, Yuexin Ma, Hongsheng Li, Ruigang Yang, Dahua Lin

In this paper, we benchmark our model on these three tasks.

Panoptic Segmentation Segmentation

812

Paper
Code

Musical Tempo Estimation Using a Multi-scale Network

no code implementations • 3 Sep 2021 • Xiaoheng Sun, Qiqi He, Yongwei Gao, Wei Li

Recently, some single-step systems without onset detection have shown their effectiveness in automatic musical tempo estimation.

Paper
Add Code

LUAI Challenge 2021 on Learning to Understand Aerial Images

1 code implementation • 30 Aug 2021 • Gui-Song Xia, Jian Ding, Ming Qian, Nan Xue, Jiaming Han, Xiang Bai, Michael Ying Yang, Shengyang Li, Serge Belongie, Jiebo Luo, Mihai Datcu, Marcello Pelillo, Liangpei Zhang, Qiang Zhou, Chao-hui Yu, Kaixuan Hu, Yingjia Bu, Wenming Tan, Zhe Yang, Wei Li, Shang Liu, Jiaxuan Zhao, Tianzhi Ma, Zi-han Gao, Lingqi Wang, Yi Zuo, Licheng Jiao, Chang Meng, Hao Wang, Jiahao Wang, Yiming Hui, Zhuojun Dong, Jie Zhang, Qianyue Bao, Zixiao Zhang, Fang Liu

This report summarizes the results of the Learning to Understand Aerial Images (LUAI) 2021 challenge held at ICCV 2021, which focuses on object detection and semantic segmentation in aerial images.

Object object-detection +4

259

Paper
Code

Long-term, Short-term and Sudden Event: Trading Volume Movement Prediction with Graph-based Multi-view Modeling

1 code implementation • 23 Aug 2021 • Liang Zhao, Wei Li, Ruihan Bao, Keiko Harimoto, Yunfang Wu, Xu Sun

Trading volume movement prediction is the key in a variety of financial applications.

Paper
Code

ASAT: Adaptively Scaled Adversarial Training in Time Series

no code implementations • 20 Aug 2021 • Zhiyuan Zhang, Wei Li, Ruihan Bao, Keiko Harimoto, Yunfang Wu, Xu Sun

Besides the security concerns of potential adversarial examples, adversarial training can also improve the generalization ability of neural networks, train robust neural networks, and provide interpretability for neural networks.

Adversarial Robustness Time Series +1

Paper
Add Code

Multi defect detection and analysis of electron microscopy images with deep learning

no code implementations • 19 Aug 2021 • Mingren Shen, Guanzhao Li, Dongxia Wu, YuHan Liu, Jacob Greaves, Wei Hao, Nathaniel J. Krakauer, Leah Krudy, Jacob Perez, Varun Sreenivasan, Bryan Sanchez, Oigimer Torres, Wei Li, Kevin Field, Dane Morgan

Electron microscopy is widely used to explore defects in crystal structures, but manual detection of defects is often time-consuming, error-prone, and unreliable, and is not scalable to large numbers of images or real-time analysis.

Defect Detection

Paper
Add Code
