Search Results for author: Lingpeng Kong

Found 102 papers, 66 papers with code

A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

1 code implementation • 21 Mar 2024 • Qiushi Sun, Zhirui Chen, Fangzhi Xu, Kanzhi Cheng, Chang Ma, Zhangyue Yin, Jianing Wang, Chengcheng Han, Renyu Zhu, Shuai Yuan, Qipeng Guo, Xipeng Qiu, Pengcheng Yin, XiaoLi Li, Fei Yuan, Lingpeng Kong, Xiang Li, Zhiyong Wu

Building on our examination of the developmental trajectories, we further investigate the emerging synergies between code intelligence and broader machine intelligence, uncovering new cross-domain opportunities and illustrating the substantial influence of code intelligence across various domains.

163

ImgTrojan: Jailbreaking Vision-Language Models with ONE Image

1 code implementation • 5 Mar 2024 • Xijia Tao, Shuai Zhong, Lei LI, Qi Liu, Lingpeng Kong

In this paper, we propose a novel jailbreaking attack against VLMs, aiming to bypass their safety barrier when a user inputs harmful instructions.

Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models

no code implementations • 1 Mar 2024 • Lei LI, Yuqi Wang, Runxin Xu, Peiyi Wang, Xiachong Feng, Lingpeng Kong, Qi Liu

To fill this gap, we introduce Multimodal ArXiv, consisting of ArXivCap and ArXivQA, for enhancing LVLMs' scientific comprehension.

Benchmarking Mathematical Reasoning +1

GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers

1 code implementation • 29 Feb 2024 • Qintong Li, Leyang Cui, Xueliang Zhao, Lingpeng Kong, Wei Bi

Large language models (LLMs) have achieved impressive performance across various mathematical reasoning benchmarks.

GSM8K Math +1

Training-Free Long-Context Scaling of Large Language Models

1 code implementation • 27 Feb 2024 • Chenxin An, Fei Huang, Jun Zhang, Shansan Gong, Xipeng Qiu, Chang Zhou, Lingpeng Kong

The ability of Large Language Models (LLMs) to process and generate coherent text is markedly weakened when the number of input tokens exceeds their pretraining length.

16k

216

LoRA Meets Dropout under a Unified Framework

no code implementations • 25 Feb 2024 • Sheng Wang, Liheng Chen, Jiyue Jiang, Boyang Xue, Lingpeng Kong, Chuan Wu

Hence, a possible contradiction arises between the negligible number of trainable parameters in LoRA and the effectiveness of previous dropout methods, which has been largely overlooked.

PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA

no code implementations • 24 Feb 2024 • Sheng Wang, Boyang Xue, Jiacheng Ye, Jiyue Jiang, Liheng Chen, Lingpeng Kong, Chuan Wu

Hopefully, the conspicuously higher parameter efficiency can establish PRoLoRA as a resource-friendly alternative to LoRA.

Empowering Large Language Model Agents through Action Learning

1 code implementation • 24 Feb 2024 • Haiteng Zhao, Chang Ma, Guoyin Wang, Jing Su, Lingpeng Kong, Jingjing Xu, Zhi-Hong Deng, Hongxia Yang

Large Language Model (LLM) agents have recently garnered increasing interest, yet they are limited in their ability to learn from trial and error, a key element of intelligent behavior.

Language Modelling Large Language Model

BBA: Bi-Modal Behavioral Alignment for Reasoning with Large Vision-Language Models

no code implementations • 21 Feb 2024 • Xueliang Zhao, Xinting Huang, Tingchen Fu, Qintong Li, Shansan Gong, Lemao Liu, Wei Bi, Lingpeng Kong

Multimodal reasoning stands as a pivotal capability for large vision-language models (LVLMs).

Geometry Problem Solving Molecular Property Prediction +2

OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

1 code implementation • 12 Feb 2024 • Zhiyong Wu, Chengcheng Han, Zichen Ding, Zhenmin Weng, Zhoumianze Liu, Shunyu Yao, Tao Yu, Lingpeng Kong

Autonomous interaction with the computer has been a longstanding challenge with great potential, and the recent proliferation of large language models (LLMs) has markedly accelerated progress in building digital agents.

1,000

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models

1 code implementation • 12 Feb 2024 • Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Zhenguo Li, Wei Bi, Lingpeng Kong

This work explores the integration of diffusion models and Chain-of-Thought (CoT), a well-established technique to improve the reasoning ability in autoregressive language models.

Math

AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents

2 code implementations • 24 Jan 2024 • Chang Ma, Junlei Zhang, Zhihao Zhu, Cheng Yang, Yujiu Yang, Yaohui Jin, Zhenzhong Lan, Lingpeng Kong, Junxian He

Evaluating large language models (LLMs) as general-purpose agents is essential for understanding their capabilities and facilitating their integration into practical applications.

Benchmarking

186

G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model

1 code implementation • 18 Dec 2023 • Jiahui Gao, Renjie Pi, Jipeng Zhang, Jiacheng Ye, Wanjun Zhong, YuFei Wang, Lanqing Hong, Jianhua Han, Hang Xu, Zhenguo Li, Lingpeng Kong

We first analyze the limitations of current Multimodal Large Language Models (MLLMs) in this area: they struggle to accurately comprehend basic geometric elements and their relationships.

Language Modelling Large Language Model

103

Linear Attention via Orthogonal Memory

no code implementations • 18 Dec 2023 • Jun Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong

Given that orthogonal memory compresses global information, we further dissect the context to amplify fine-grained local information.

Causal Language Modeling Computational Efficiency +1

Silkie: Preference Distillation for Large Visual Language Models

no code implementations • 17 Dec 2023 • Lei LI, Zhihui Xie, Mukai Li, Shunian Chen, Peiyi Wang, Liang Chen, Yazheng Yang, Benyou Wang, Lingpeng Kong

This paper explores preference distillation for large vision language models (LVLMs), improving their ability to generate helpful and faithful responses anchored in the visual context.

Ranked #18 on Visual Question Answering on MM-Vet

Hallucination Visual Question Answering

Self-Infilling Code Generation

1 code implementation • 29 Nov 2023 • Lin Zheng, Jianbo Yuan, Zhi Zhang, Hongxia Yang, Lingpeng Kong

This work introduces self-infilling code generation, a general framework that incorporates infilling operations into auto-regressive decoding.

Code Generation

Collaborative Evaluation: Exploring the Synergy of Large Language Models and Humans for Open-ended Generation Evaluation

1 code implementation • 30 Oct 2023 • Qintong Li, Leyang Cui, Lingpeng Kong, Wei Bi

To explore the synergy between humans and LLM-based evaluators and to address inconsistent evaluation criteria in open-ended NLG tasks, we propose CoEval, a collaborative evaluation pipeline that involves designing a checklist of task-specific criteria and evaluating texts in detail: the LLM generates the initial ideation, and humans then scrutinize it.

Text Generation

SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving

no code implementations • 19 Oct 2023 • Xueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong

Large Language Models (LLMs) have driven substantial progress in artificial intelligence in recent years, exhibiting impressive capabilities across a wide range of tasks, including mathematical problem-solving.

GSM8K Math

Attentive Multi-Layer Perceptron for Non-autoregressive Generation

1 code implementation • 14 Oct 2023 • Shuyang Jiang, Jun Zhang, Jiangtao Feng, Lin Zheng, Lingpeng Kong

Furthermore, we marry AMLP with popular NAR models, deriving a highly efficient NAR-AMLP architecture with linear time and space complexity.

Machine Translation Speech Synthesis +1

Lemur: Harmonizing Natural Language and Code for Language Agents

1 code implementation • 10 Oct 2023 • Yiheng Xu, Hongjin Su, Chen Xing, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu

We introduce Lemur and Lemur-Chat, openly accessible language models optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents.

518

DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models

1 code implementation • 9 Oct 2023 • Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong

Diffusion models have gained prominence in generating high-quality sequences of text.

673

Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration

1 code implementation • 30 Sep 2023 • Qiushi Sun, Zhangyue Yin, Xiang Li, Zhiyong Wu, Xipeng Qiu, Lingpeng Kong

Large Language Models (LLMs) are evolving at an unprecedented pace and have exhibited considerable capability in the realm of natural language processing (NLP) with world knowledge.

World Knowledge

Extrapolating Large Language Models to Non-English by Aligning Languages

2 code implementations • 9 Aug 2023 • Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen, Lei LI

We start by targeting individual languages, performing cross-lingual instruction-tuning (CoIT) on LLaMA, i.e., tuning it with translation task data and cross-lingual general task data to obtain cross-lingual models (x-LLaMAs), and we formulate the underlying scaling laws to investigate the advantages of using scalable translation data.

Translation

L-Eval: Instituting Standardized Evaluation for Long Context Language Models

3 code implementations • 20 Jul 2023 • Chenxin An, Shansan Gong, Ming Zhong, Xingjian Zhao, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu

Recently, there has been growing interest in extending the context length of large language models (LLMs), aiming to effectively process long inputs of one turn or conversations with more extensive histories.

Instruction Following

11,159

Linearized Relative Positional Encoding

no code implementations • 18 Jul 2023 • Zhen Qin, Weixuan Sun, Kaiyue Lu, Hui Deng, Dongxu Li, Xiaodong Han, Yuchao Dai, Lingpeng Kong, Yiran Zhong

Meanwhile, it emphasizes a general paradigm for designing a broader range of relative positional encoding methods that are applicable to linear transformers.

Image Classification Language Modelling +2

Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability

1 code implementation • 11 Jun 2023 • Jiacheng Ye, Xijia Tao, Lingpeng Kong

First, does multilingual transfer ability exist in English-centric models and how does it compare with multilingual pretrained models?

INK: Injecting kNN Knowledge in Nearest Neighbor Machine Translation

1 code implementation • 10 Jun 2023 • Wenhao Zhu, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen

We propose an effective training framework INK to directly smooth the representation space via adjusting representations of kNN neighbors with a small number of new parameters.

Machine Translation Translation

M³IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning

no code implementations • 7 Jun 2023 • Lei LI, Yuwei Yin, Shicheng Li, Liang Chen, Peiyi Wang, Shuhuai Ren, Mukai Li, Yazheng Yang, Jingjing Xu, Xu sun, Lingpeng Kong, Qi Liu

To tackle this challenge and promote research in the vision-language field, we introduce the Multi-Modal, Multilingual Instruction Tuning (M³IT) dataset, designed to optimize VLM alignment with human instructions.

World Knowledge

GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning

1 code implementation • NeurIPS 2023 • Haiteng Zhao, Shengchao Liu, Chang Ma, Hannan Xu, Jie Fu, Zhi-Hong Deng, Lingpeng Kong, Qi Liu

We pretrain GIMLET on the molecule tasks along with instructions, enabling the model to transfer effectively to a broad range of tasks.

Property Prediction Zero-Shot Learning

Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving

1 code implementation • 25 May 2023 • Xueliang Zhao, Wenda Li, Lingpeng Kong

Large language models (LLMs) present an intriguing avenue of exploration in the domain of formal theorem proving.

Ranked #3 on Automated Theorem Proving on miniF2F-test (Pass@100 metric)

Automated Theorem Proving

Can Language Models Understand Physical Concepts?

1 code implementation • 23 May 2023 • Lei LI, Jingjing Xu, Qingxiu Dong, Ce Zheng, Qi Liu, Lingpeng Kong, Xu sun

Language models (LMs) are gradually becoming general-purpose interfaces in the interactive and embodied world, where the understanding of physical concepts is an essential prerequisite.

Optimizing Non-Autoregressive Transformers with Contrastive Learning

no code implementations • 23 May 2023 • Chenxin An, Jiangtao Feng, Fei Huang, Xipeng Qiu, Lingpeng Kong

In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.

Contrastive Learning Machine Translation +2

DetGPT: Detect What You Need via Reasoning

1 code implementation • 23 May 2023 • Renjie Pi, Jiahui Gao, Shizhe Diao, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang

Overall, our proposed paradigm and DetGPT demonstrate the potential for more sophisticated and intuitive interactions between humans and machines.

Autonomous Driving Object +2

721

Generating Data for Symbolic Language with Large Language Models

1 code implementation • 23 May 2023 • Jiacheng Ye, Chengzu Li, Lingpeng Kong, Tao Yu

However, such an approach has primarily been applied to natural language tasks and has not yet been explored for symbolic language tasks with complex structured outputs (e.g., semantic parsing and code generation).

Code Generation Semantic Parsing

A Cognitive Stimulation Dialogue System with Multi-source Knowledge Fusion for Elders with Cognitive Impairment

no code implementations • 14 May 2023 • Jiyue Jiang, Sheng Wang, Qintong Li, Lingpeng Kong, Chuan Wu

In this paper, we propose a multi-source knowledge fusion method for CS dialogue (CSD), to generate open-ended responses guided by the CS principle and emotional support strategy.

Decoder

Toeplitz Neural Network for Sequence Modeling

2 code implementations • 8 May 2023 • Zhen Qin, Xiaodong Han, Weixuan Sun, Bowen He, Dong Li, Dongxu Li, Yuchao Dai, Lingpeng Kong, Yiran Zhong

Sequence modeling has important applications in natural language processing and computer vision.

Language Modelling Position

TTIDA: Controllable Generative Data Augmentation via Text-to-Text and Text-to-Image Models

1 code implementation • 18 Apr 2023 • Yuwei Yin, Jean Kaddour, Xiang Zhang, Yixin Nie, Zhenguang Liu, Lingpeng Kong, Qi Liu

In addition, generative data augmentation (GDA) has been shown to produce more diverse and flexible data.

Data Augmentation domain classification +1

Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis

2 code implementations • 10 Apr 2023 • Wenhao Zhu, Hongyi Liu, Qingxiu Dong, Jingjing Xu, ShuJian Huang, Lingpeng Kong, Jiajun Chen, Lei LI

Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT).

Machine Translation Translation

Fine-grained Audible Video Description

1 code implementation • CVPR 2023 • Xuyang Shen, Dong Li, Jinxing Zhou, Zhen Qin, Bowen He, Xiaodong Han, Aixuan Li, Yuchao Dai, Lingpeng Kong, Meng Wang, Yu Qiao, Yiran Zhong

We explore a new task for audio-visual-language modeling called fine-grained audible video description (FAVD).

Language Modelling Masked Language Modeling +5

A Challenging Benchmark for Low-Resource Learning

1 code implementation • 7 Mar 2023 • Yudong Wang, Chang Ma, Qingxiu Dong, Lingpeng Kong, Jingjing Xu

Experiments on a wide range of models show that neural networks, even pre-trained language models, suffer sharp performance drops on our benchmark, demonstrating its effectiveness in evaluating the weaknesses of neural networks.

Retrieved Sequence Augmentation for Protein Representation Learning

1 code implementation • 24 Feb 2023 • Chang Ma, Haiteng Zhao, Lin Zheng, Jiayi Xin, Qintong Li, Lijun Wu, Zhihong Deng, Yang Lu, Qi Liu, Lingpeng Kong

RSA links query protein sequences to a set of sequences with similar structures or properties in the database and combines these sequences for downstream prediction.

Property Prediction Representation Learning +1

A Reparameterized Discrete Diffusion Model for Text Generation

1 code implementation • 11 Feb 2023 • Lin Zheng, Jianbo Yuan, Lei Yu, Lingpeng Kong

This work studies discrete diffusion probabilistic models with applications to natural language generation.

Text Generation

Compositional Exemplars for In-context Learning

1 code implementation • 11 Feb 2023 • Jiacheng Ye, Zhiyong Wu, Jiangtao Feng, Tao Yu, Lingpeng Kong

The performance of ICL is highly dominated by the quality of the selected in-context examples.

Code Generation Contrastive Learning +6

In-Context Learning with Many Demonstration Examples

1 code implementation • 9 Feb 2023 • Mukai Li, Shansan Gong, Jiangtao Feng, Yiheng Xu, Jun Zhang, Zhiyong Wu, Lingpeng Kong

Based on EVALM, we scale up the size of examples efficiently in both instruction tuning and in-context learning to explore the boundary of the benefits from more annotated data.

16k 8k +2

Efficient Attention via Control Variates

1 code implementation • 9 Feb 2023 • Lin Zheng, Jianbo Yuan, Chong Wang, Lingpeng Kong

Built upon previous progress of RFA, we characterize this gap through the lens of control variates and show that RFA can be decomposed into a sum of multiple control variate estimators for each element in the sequence.

Audio-Visual Segmentation with Semantics

1 code implementation • 30 Jan 2023 • Jinxing Zhou, Xuyang Shen, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong

To deal with these problems, we propose a new baseline method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process.

Segmentation Semantic Segmentation +1

431

Lego-MT: Learning Detachable Models for Massively Multilingual Machine Translation

1 code implementation • 20 Dec 2022 • Fei Yuan, Yinquan Lu, Wenhao Zhu, Lingpeng Kong, Lei LI, Yu Qiao, Jingjing Xu

To address the needs of learning representations for all languages in a unified space, we propose a novel efficient training recipe, upon which we build an effective detachable model, Lego-MT.

Machine Translation Translation

Self-Adaptive In-Context Learning: An Information Compression Perspective for In-Context Example Selection and Ordering

1 code implementation • 20 Dec 2022 • Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong

Despite the surprising few-shot performance of in-context learning (ICL), it is still a common practice to randomly sample examples to serve as context.

In-Context Learning

Explanation Regeneration via Information Bottleneck

1 code implementation • 19 Dec 2022 • Qintong Li, Zhiyong Wu, Lingpeng Kong, Wei Bi

Explaining the black-box predictions of NLP models naturally and accurately is an important open problem in natural language generation.

Explanation Generation Language Modelling +2

Unsupervised Explanation Generation via Correct Instantiations

no code implementations • 21 Nov 2022 • Sijie Cheng, Zhiyong Wu, Jiangjie Chen, Zhixing Li, Yang Liu, Lingpeng Kong

The major difficulty is finding the conflict point, where the statement contradicts our real world.

Explanation Generation

An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks

1 code implementation • 24 Oct 2022 • Changlong Yu, Tianyi Xiao, Lingpeng Kong, Yangqiu Song, Wilfred Ng

Though linguistic knowledge emerges during large-scale language model pretraining, recent work attempts to explicitly incorporate human-defined linguistic priors into task-specific fine-tuning.

Language Modelling

ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback

2 code implementations • 22 Oct 2022 • Jiacheng Ye, Jiahui Gao, Jiangtao Feng, Zhiyong Wu, Tao Yu, Lingpeng Kong

To improve the quality of dataset synthesis, we propose a progressive zero-shot dataset generation framework, ProGen, which leverages the feedback from the task-specific model to guide the generation of new training data via in-context examples.

Informativeness text-classification +2

The Devil in Linear Transformer

1 code implementation • 19 Oct 2022 • Zhen Qin, Xiaodong Han, Weixuan Sun, Dongxu Li, Lingpeng Kong, Nick Barnes, Yiran Zhong

In this paper, we examine existing kernel-based linear transformers and identify two key issues that lead to such performance gaps: 1) unbounded gradients in the attention computation adversely impact the convergence of linear transformer models; 2) attention dilution, which trivially distributes attention scores over long sequences while neglecting neighbouring structures.

Language Modelling Text Classification

DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models

1 code implementation • 17 Oct 2022 • Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong

Bringing together theoretical analysis and empirical evidence, we demonstrate the great potential of diffusion models in complex conditional language generation tasks.

Text Generation

673

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling

1 code implementation • 14 Oct 2022 • Jun Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong

In this paper, we propose the Comprehensive Attention Benchmark (CAB) under a fine-grained attention taxonomy with four distinguishable attention patterns, namely, noncausal self, causal self, noncausal cross, and causal cross attention.

Benchmarking Long-range modeling

Audio-Visual Segmentation

1 code implementation • 11 Jul 2022 • Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong

To deal with the AVS problem, we propose a novel method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process.

Segmentation

431

Vicinity Vision Transformer

1 code implementation • 21 Jun 2022 • Weixuan Sun, Zhen Qin, Hui Deng, Jianyuan Wang, Yi Zhang, Kaihao Zhang, Nick Barnes, Stan Birchfield, Lingpeng Kong, Yiran Zhong

Based on this observation, we present Vicinity Attention, which introduces a locality bias to vision transformers with linear complexity.

Image Classification

CoNT: Contrastive Neural Text Generation

2 code implementations • 29 May 2022 • Chenxin An, Jiangtao Feng, Kai Lv, Lingpeng Kong, Xipeng Qiu, Xuanjing Huang

We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation.

Code Comment Generation Comment Generation +4

420

Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning

2 code implementations • 25 May 2022 • Jiahui Gao, Renjie Pi, Yong Lin, Hang Xu, Jiacheng Ye, Zhiyong Wu, Weizhong Zhang, Xiaodan Liang, Zhenguo Li, Lingpeng Kong

In this paradigm, the synthesized data from the PLM acts as the carrier of knowledge, which is used to train a task-specific model with orders of magnitude fewer parameters than the PLM, achieving both higher performance and efficiency than prompt-based zero-shot learning methods on PLMs.

text-classification Text Classification +1

Language Models Can See: Plugging Visual Controls in Text Generation

1 code implementation • 5 May 2022 • Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lingpeng Kong, Nigel Collier

MAGIC is a flexible framework and is theoretically compatible with any text generation task that incorporates image grounding.

Image Captioning Image-text matching +3

251

Lexical Knowledge Internalization for Neural Dialog Generation

1 code implementation • ACL 2022 • Zhiyong Wu, Wei Bi, Xiang Li, Lingpeng Kong, Ben Kao

We propose knowledge internalization (KI), which aims to integrate lexical knowledge into neural dialog models.

Contrastive Learning

Event Transition Planning for Open-ended Text Generation

1 code implementation • Findings (ACL) 2022 • Qintong Li, Piji Li, Wei Bi, Zhaochun Ren, Yuxuan Lai, Lingpeng Kong

Open-ended text generation tasks, such as dialogue generation and story completion, require models to generate a coherent continuation given limited preceding context.

Dialogue Generation Story Completion

Linear Complexity Randomized Self-attention Mechanism

1 code implementation • 10 Apr 2022 • Lin Zheng, Chong Wang, Lingpeng Kong

By combining the expressiveness in RA and the efficiency in RFA, we develop a novel linear complexity self-attention mechanism called linear randomized attention (LARA).

cosFormer: Rethinking Softmax in Attention

3 code implementations • ICLR 2022 • Zhen Qin, Weixuan Sun, Hui Deng, Dongxu Li, Yunshen Wei, Baohong Lv, Junjie Yan, Lingpeng Kong, Yiran Zhong

As one of its core components, the softmax attention helps to capture long-range dependencies yet prohibits its scale-up due to the quadratic space and time complexity to the sequence length.

Ranked #4 on Offline RL on D4RL

D4RL Language Modelling +1

173

Revisiting Over-smoothing in BERT from the Perspective of Graph

no code implementations • ICLR 2022 • Han Shi, Jiahui Gao, Hang Xu, Xiaodan Liang, Zhenguo Li, Lingpeng Kong, Stephen M. S. Lee, James T. Kwok

Recently, the over-smoothing phenomenon of Transformer-based models has been observed in both vision and language fields.

ZeroGen: Efficient Zero-shot Learning via Dataset Generation

3 code implementations • 16 Feb 2022 • Jiacheng Ye, Jiahui Gao, Qintong Li, Hang Xu, Jiangtao Feng, Zhiyong Wu, Tao Yu, Lingpeng Kong

There is a growing interest in dataset generation recently due to the superior generative capacity of large pre-trained language models (PLMs).

Knowledge Distillation Natural Language Inference +5

A Contrastive Framework for Neural Text Generation

2 code implementations • 13 Feb 2022 • Yixuan Su, Tian Lan, Yan Wang, Dani Yogatama, Lingpeng Kong, Nigel Collier

Text generation is of great importance to many natural language processing applications.

Text Generation

445

SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples

1 code implementation • 16 Jan 2022 • Hao Wang, Yangguang Li, Zhen Huang, Yong Dou, Lingpeng Kong, Jing Shao

To alleviate feature suppression, we propose contrastive learning for unsupervised sentence embedding with soft negative samples (SNCSE).

Contrastive Learning Data Augmentation +7

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

1 code implementation • 16 Jan 2022 • Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu

Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases.

Ranked #1 on Task-Oriented Dialogue Systems on KVRET

Few-Shot Learning Question Answering +3

531

Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling

1 code implementation • NAACL 2022 • Jakob Prange, Nathan Schneider, Lingpeng Kong

We examine the extent to which, in principle, linguistic graph representations can complement and improve neural language modeling.

Language Modelling

Ripple Attention for Visual Perception with Sub-quadratic Complexity

no code implementations • 6 Oct 2021 • Lin Zheng, Huijie Pan, Lingpeng Kong

Transformer architectures are now central to sequence modeling tasks.

ABC: Attention with Bounded-memory Control

no code implementations • ACL 2022 • Hao Peng, Jungo Kasai, Nikolaos Pappas, Dani Yogatama, Zhaofeng Wu, Lingpeng Kong, Roy Schwartz, Noah A. Smith

One way to improve the efficiency is to bound the memory size.

Language Modelling Machine Translation

Cascaded Head-colliding Attention

1 code implementation • ACL 2021 • Lin Zheng, Zhiyong Wu, Lingpeng Kong

Transformers have advanced the field of natural language processing (NLP) on a variety of important tasks.

Language Modelling Machine Translation +1

Good for Misconceived Reasons: An Empirical Revisiting on the Need for Visual Context in Multimodal Machine Translation

no code implementations • ACL 2021 • Zhiyong Wu, Lingpeng Kong, Wei Bi, Xiang Li, Ben Kao

A neural multimodal machine translation (MMT) system is one that aims to perform better translation by extending conventional text-only translation models with multimodal information.

Multimodal Machine Translation Translation

Random Feature Attention

no code implementations • ICLR 2021 • Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah A. Smith, Lingpeng Kong

RFA can be used as a drop-in replacement for conventional softmax attention and offers a straightforward way of learning with recency bias through an optional gating mechanism.

Ranked #27 on Machine Translation on IWSLT2014 German-English

Language Modelling Machine Translation +3

Adaptive Semiparametric Language Models

no code implementations • 4 Feb 2021 • Dani Yogatama, Cyprien de Masson d'Autume, Lingpeng Kong

We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an integrated architecture.

Language Modelling

Good for Misconceived Reasons: Revisiting Neural Multimodal Machine Translation

no code implementations • 1 Jan 2021 • Zhiyong Wu, Lingpeng Kong, Ben Kao

A neural multimodal machine translation (MMT) system is one that aims to perform better translation by extending conventional text-only translation models with multimodal information.

Multimodal Machine Translation Translation

Syntactic Structure Distillation Pretraining For Bidirectional Encoders

no code implementations • 27 May 2020 • Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Textual representation learners trained on large amounts of data have achieved notable success on downstream tasks; intriguingly, they have also performed well on challenging tests of syntactic competence.

Knowledge Distillation Language Modelling +3

A Mutual Information Maximization Perspective of Language Representation Learning

no code implementations • ICLR 2020 • Lingpeng Kong, Cyprien de Masson d'Autume, Wang Ling, Lei Yu, Zihang Dai, Dani Yogatama

We show state-of-the-art word representation learning methods maximize an objective function that is a lower bound on the mutual information between different parts of a word sequence (i.e., a sentence).

Representation Learning Sentence

Better Document-Level Machine Translation with Bayes' Rule

no code implementations • TACL 2020 • Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

We show that Bayes' rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents, a compelling benefit as parallel documents are not always available.

Document Level Machine Translation Document Translation +4

Relative Pixel Prediction For Autoregressive Image Generation

no code implementations • 25 Sep 2019 • Wang Ling, Chris Dyer, Lei Yu, Lingpeng Kong, Dani Yogatama, Susannah Young

In natural images, transitions between adjacent pixels tend to be smooth and gradual, a fact that has long been exploited in image compression models based on predictive coding.

Colorization Image Colorization +4

Putting Machine Translation in Context with the Noisy Channel Model

no code implementations • 25 Sep 2019 • Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

We show that Bayes' rule provides a compelling mechanism for controlling unconditional document language models, using the long-standing challenge of effectively leveraging document context in machine translation.

Document Translation Language Modelling +3

Episodic Memory in Lifelong Language Learning

2 code implementations • NeurIPS 2019 • Cyprien de Masson d'Autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama

We introduce a lifelong language learning setup where a model needs to learn from a stream of text examples without any dataset identifier.

Continual Learning General Classification +3

Learning and Evaluating General Linguistic Intelligence

no code implementations • 31 Jan 2019 • Dani Yogatama, Cyprien de Masson d'Autume, Jerome Connor, Tomas Kocisky, Mike Chrzanowski, Lingpeng Kong, Angeliki Lazaridou, Wang Ling, Lei Yu, Chris Dyer, Phil Blunsom

We define general linguistic intelligence as the ability to reuse previously acquired knowledge about a language's lexicon, syntax, semantics, and pragmatic conventions to adapt to new tasks quickly.

Natural Language Understanding Question Answering

Variational Smoothing in Recurrent Neural Network Language Models

no code implementations • ICLR 2019 • Lingpeng Kong, Gabor Melis, Wang Ling, Lei Yu, Dani Yogatama

We present a new theoretical perspective of data noising in recurrent neural network language models (Xie et al., 2017).

Language Modelling

Sentence Encoding with Tree-constrained Relation Networks

no code implementations • 26 Nov 2018 • Lei Yu, Cyprien de Masson d'Autume, Chris Dyer, Phil Blunsom, Lingpeng Kong, Wang Ling

The meaning of a sentence is a function of the relations that hold between its words.

General Classification Machine Translation +5

Neural Phrase-to-Phrase Machine Translation

no code implementations • 6 Nov 2018 • Jiangtao Feng, Lingpeng Kong, Po-Sen Huang, Chong Wang, Da Huang, Jiayuan Mao, Kan Qiao, Dengyong Zhou

We also design an efficient dynamic programming algorithm to decode segments that allows the model to be trained faster than the existing neural phrase-based machine translation method by Huang et al. (2018).

Decoder Machine Translation +1

End-to-End Neural Segmental Models for Speech Recognition

no code implementations • 1 Aug 2017 • Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals

Segmental models are an alternative to frame-based models for sequence prediction, where hypothesized path weights are based on entire segment scores rather than a single frame at a time.

Decoder speech-recognition +1

SyntaxNet Models for the CoNLL 2017 Shared Task

no code implementations • 15 Mar 2017 • Chris Alberti, Daniel Andor, Ivan Bogatyy, Michael Collins, Dan Gillick, Lingpeng Kong, Terry Koo, Ji Ma, Mark Omernick, Slav Petrov, Chayut Thanapirom, Zora Tung, David Weiss

We describe a baseline dependency parsing system for the CoNLL2017 Shared Task.

Dependency Parsing

DRAGNN: A Transition-based Framework for Dynamically Connected Neural Networks

1 code implementation • 13 Mar 2017 • Lingpeng Kong, Chris Alberti, Daniel Andor, Ivan Bogatyy, David Weiss

In this work, we present a compact, modular framework for constructing novel recurrent neural architectures.

Decoder Dependency Parsing +2

Multitask Learning with CTC and Segmental CRF for Speech Recognition

no code implementations • 21 Feb 2017 • Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith

Segmental conditional random fields (SCRFs) and connectionist temporal classification (CTC) are two sequence labeling methods used for end-to-end training of speech recognition models.

speech-recognition Speech Recognition

DyNet: The Dynamic Neural Network Toolkit

4 code implementations • 15 Jan 2017 • Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.

graph construction

What Do Recurrent Neural Network Grammars Learn About Syntax?

1 code implementation • EACL 2017 • Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Graham Neubig, Noah A. Smith

We investigate what information recurrent neural network grammars learn, from a linguistic perspective, through various ablations to the model and the data, and by augmenting the model with an attention mechanism (GA-RNNG) to enable closer inspection.

Ranked #20 on Constituency Parsing on Penn Treebank

Constituency Parsing Dependency Parsing +1

Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser

1 code implementation • EMNLP 2016 • Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Noah A. Smith

We introduce two first-order graph-based dependency parsers achieving a new state of the art.

Ranked #17 on Dependency Parsing on Penn Treebank

Dependency Parsing

Segmental Recurrent Neural Networks for End-to-end Speech Recognition

no code implementations • 1 Mar 2016 • Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith, Steve Renals

This model connects the segmental conditional random field (CRF) with a recurrent neural network (RNN) used for feature extraction.

Ranked #16 on Speech Recognition on TIMIT

Acoustic Modelling Language Modelling +2

Segmental Recurrent Neural Networks

2 code implementations • 18 Nov 2015 • Lingpeng Kong, Chris Dyer, Noah A. Smith

Representations of the input segments (i.e., contiguous subsequences of the input) are computed by encoding their constituent tokens using bidirectional recurrent neural nets, and these "segment embeddings" are used to define compatibility scores with output labels.

Chinese Word Segmentation Handwriting Recognition +2

Document Context Language Models

1 code implementation • 12 Nov 2015 • Yangfeng Ji, Trevor Cohn, Lingpeng Kong, Chris Dyer, Jacob Eisenstein

Text documents are structured on multiple levels of detail: individual words are related by syntax, but larger units of text are related by discourse structure.

Sentence

ACBiMA: Advanced Chinese Bi-Character Word Morphological Analyzer

no code implementations • WS 2015 • Ting-Hao Huang, Yun-Nung Chen, Lingpeng Kong

Morphological Analysis Sentiment Analysis

Transforming Dependencies into Phrase Structures

1 code implementation • HLT 2015 • Noah A. Smith, Alexander M. Rush, Lingpeng Kong

Dependency Parsing Sentence +1

Dependency Parsing for Weibo: An Efficient Probabilistic Logic Programming Approach

no code implementations • EMNLP 2014 • William Yang Wang, Lingpeng Kong, Kathryn Mazaitis, William W. Cohen

Dependency Parsing Machine Translation +2

A Dependency Parser for Tweets

no code implementations • EMNLP 2014 • Lingpeng Kong, Nathan Schneider, Swabha Swayamdipta, Archna Bhatia, Chris Dyer, Noah A. Smith

Dependency Parsing Domain Adaptation +1

An Empirical Comparison of Parsing Methods for Stanford Dependencies

no code implementations • 16 Apr 2014 • Lingpeng Kong, Noah A. Smith

Stanford typed dependencies are a widely desired representation of natural language sentences, but parsing is one of the major computational bottlenecks in text analysis systems.

Dependency Parsing
