Search Results for author: Pengjun Xie

Found 59 papers, 40 papers with code

A Fine-Grained Domain Adaption Model for Joint Word Segmentation and POS Tagging

1 code implementation • EMNLP 2021 • Peijie Jiang, Dingkun Long, Yueheng Sun, Meishan Zhang, Guangwei Xu, Pengjun Xie

Self-training is one promising solution for it, which struggles to construct a set of high-quality pseudo training instances for the target domain.

Domain Adaptation POS +3


Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning

1 code implementation • 17 Apr 2024 • Xiao Li, Yong Jiang, Shen Huang, Pengjun Xie, Gong Cheng, Fei Huang

Our objective is to train a generative model that can simultaneously provide a score indicating the presence of shared key point between a pair of arguments and generate the shared key point.

Argument Mining graph partitioning +2


Chinese Sequence Labeling with Semi-Supervised Boundary-Aware Language Model Pre-training

2 code implementations • 8 Apr 2024 • Longhui Zhang, Dingkun Long, Meishan Zhang, Yanzhao Zhang, Pengjun Xie, Min Zhang

Experimental results on Chinese sequence labeling datasets demonstrate that the improved BABERT variant outperforms the vanilla version, not only on these tasks but also more broadly across a range of Chinese natural language understanding tasks.

Language Modelling Natural Language Understanding


Improving Retrieval Augmented Open-Domain Question-Answering with Vectorized Contexts

no code implementations • 2 Apr 2024 • Zhuo Chen, Xinyu Wang, Yong Jiang, Pengjun Xie, Fei Huang, Kewei Tu

With our method, the original language models can cover several times longer contexts while keeping the computing requirements close to the baseline.

In-Context Learning Language Modelling +2


Let LLMs Take on the Latest Challenges! A Chinese Dynamic Question Answering Benchmark

1 code implementation • 29 Feb 2024 • Zhikun Xu, Yinghui Li, Ruixue Ding, Xinyu Wang, Boli Chen, Yong Jiang, Hai-Tao Zheng, Wenlian Lu, Pengjun Xie, Fei Huang

To promote the improvement of Chinese LLMs' ability to answer dynamic questions, in this paper, we introduce CDQA, a Chinese Dynamic QA benchmark containing question-answer pairs related to the latest news on the Chinese Internet.

Question Answering


A Comprehensive Study of Knowledge Editing for Large Language Models

2 code implementations • 2 Jan 2024 • Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen

In this paper, we first define the knowledge editing problem and then provide a comprehensive review of cutting-edge approaches.

Ranked #1 on knowledge editing on zsRE (using extra training data)

knowledge editing


EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models with Semi-structured Data

no code implementations • 25 Dec 2023 • Shirong Ma, Shen Huang, Shulin Huang, Xiaobin Wang, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang

Experimental results demonstrate the effectiveness of continual pre-training of E-commerce LLMs and the efficacy of our devised data mixing strategy.

In-Context Learning


TSRankLLM: A Two-Stage Adaptation of LLMs for Text Ranking

1 code implementation • 28 Nov 2023 • Longhui Zhang, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Min Zhang

Text ranking is a critical task in various information retrieval applications, and the recent success of pre-trained language models (PLMs), especially large language models (LLMs), has sparked interest in their application to text ranking.

Decoder Information Retrieval +1


Text Representation Distillation via Information Bottleneck Principle

1 code implementation • 9 Nov 2023 • Yanzhao Zhang, Dingkun Long, Zehan Li, Pengjun Xie

Pre-trained language models (PLMs) have recently shown great success in the text representation field.

Knowledge Distillation Retrieval +1


Language Models are Universal Embedders

1 code implementation • 12 Oct 2023 • Xin Zhang, Zehan Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Min Zhang

As such cases span from English to other natural or programming languages, from retrieval to classification and beyond, it is desirable to build a unified embedding model rather than dedicated ones for each scenario.

Code Search Language Modelling +2


Editing Personality for Large Language Models

1 code implementation • 3 Oct 2023 • Shengyu Mao, Xiaohan Wang, Mengru Wang, Yong Jiang, Pengjun Xie, Fei Huang, Ningyu Zhang

This task seeks to adjust the models' responses to opinion-related questions on specified topics since an individual's personality often manifests in the form of their expressed opinions, thereby showcasing different personality traits.


Do PLMs Know and Understand Ontological Knowledge?

1 code implementation • 12 Sep 2023 • Weiqi Wu, Chengyue Jiang, Yong Jiang, Pengjun Xie, Kewei Tu

In this paper, we focus on probing whether PLMs store ontological knowledge and have a semantic understanding of the knowledge rather than rote memorization of the surface form.

Logical Reasoning Memorization +1


Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking

1 code implementation • 4 Sep 2023 • Yong Cao, Ruixue Ding, Boli Chen, Xianzhi Li, Min Chen, Daniel Hershcovich, Pengjun Xie, Fei Huang

The Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates, which is crucial for location-related services such as navigation maps.

Chunking Multi-Task Learning +1


Hybrid Retrieval and Multi-stage Text Ranking Solution at TREC 2022 Deep Learning Track

no code implementations • 23 Aug 2023 • Guangwei Xu, Yangzhao Zhang, Longhui Zhang, Dingkun Long, Pengjun Xie, Ruijie Guo

Large-scale text retrieval technology has been widely used in various practical business scenarios.

Document Ranking Language Modelling +3


SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding

1 code implementation • 21 Aug 2023 • Tianyu Yu, Chengyue Jiang, Chao Lou, Shen Huang, Xiaobin Wang, Wei Liu, Jiong Cai, Yangning Li, Yinghui Li, Kewei Tu, Hai-Tao Zheng, Ningyu Zhang, Pengjun Xie, Fei Huang, Yong Jiang

However, LLMs are sometimes too footloose for natural language understanding (NLU) tasks which always have restricted output and input format.

Entity Typing Event Extraction +3


EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce

1 code implementation • 14 Aug 2023 • Yangning Li, Shirong Ma, Xiaobin Wang, Shen Huang, Chengyue Jiang, Hai-Tao Zheng, Pengjun Xie, Fei Huang, Yong Jiang

EcomInstruct scales up the data size and task diversity by constructing atomic tasks with basic E-commerce data types, such as product information and user reviews.

Instruction Following Language Modelling +2


Towards General Text Embeddings with Multi-stage Contrastive Learning

no code implementations • 7 Aug 2023 • Zehan Li, Xin Zhang, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang

We present GTE, a general-purpose text embedding model trained with multi-stage contrastive learning.

Contrastive Learning Unsupervised Pre-training


Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model

no code implementations • 1 Jul 2023 • Jiong Cai, Yong Jiang, Yue Zhang, Chengyue Jiang, Ke Yu, Jianhui Ji, Rong Xiao, Haihong Tang, Tao Wang, Zhongqiang Huang, Pengjun Xie, Fei Huang, Kewei Tu

We also show that pretraining the QE module with auto-generated QE data from user logs can further improve the overall performance.

Text Matching


Bidirectional End-to-End Learning of Retriever-Reader Paradigm for Entity Linking

1 code implementation • 21 Jun 2023 • Yinghui Li, Yong Jiang, Yangning Li, Xingyu Lu, Pengjun Xie, Ying Shen, Hai-Tao Zheng

Entity Linking (EL) is a fundamental task for Information Extraction and Knowledge Graphs.

Entity Linking Entity Retrieval +3


Exploring Lottery Prompts for Pre-trained Language Models

no code implementations • 31 May 2023 • Yulin Chen, Ning Ding, Xiaobin Wang, Shengding Hu, Hai-Tao Zheng, Zhiyuan Liu, Pengjun Xie

Consistently scaling pre-trained language models (PLMs) imposes substantial burdens on model adaptation, necessitating more efficient alternatives to conventional fine-tuning.


Challenging Decoder helps in Masked Auto-Encoder Pre-training for Dense Passage Retrieval

no code implementations • 22 May 2023 • Zehan Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie

Recently, various studies have been directed towards exploring dense passage retrieval techniques employing pre-trained language models, among which the masked auto-encoder (MAE) pre-training architecture has emerged as the most promising.

Decoder Passage Retrieval +1


GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark

no code implementations • 11 May 2023 • Dongyang Li, Ruixue Ding, Qiang Zhang, Zheng Li, Boli Chen, Pengjun Xie, Yao Xu, Xin Li, Ning Guo, Fei Huang, Xiaofeng He

With the fast pace of development of geographic applications, automatable and intelligent models must be designed to handle the large volume of information.

Entity Alignment Natural Language Understanding


DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition

1 code implementation • 5 May 2023 • Zeqi Tan, Shen Huang, Zixia Jia, Jiong Cai, Yinghui Li, Weiming Lu, Yueting Zhuang, Kewei Tu, Pengjun Xie, Fei Huang, Yong Jiang

Also, we discover that the limited context length causes the retrieval knowledge to be invisible to the model.

Multilingual Named Entity Recognition named-entity-recognition +4


Zero-Shot Information Extraction via Chatting with ChatGPT

1 code implementation • 20 Feb 2023 • Xiang Wei, Xingyu Cui, Ning Cheng, Xiaobin Wang, Xin Zhang, Shen Huang, Pengjun Xie, Jinan Xu, Yufeng Chen, Meishan Zhang, Yong Jiang, Wenjuan Han

Zero-shot information extraction (IE) aims to build IE systems from the unannotated text.

Event Extraction named-entity-recognition +3


COMBO: A Complete Benchmark for Open KG Canonicalization

1 code implementation • 8 Feb 2023 • Chengyue Jiang, Yong Jiang, Weiqi Wu, Yuting Zheng, Pengjun Xie, Kewei Tu

The subject and object noun phrases and the relation in open KG have severe redundancy and ambiguity and need to be canonicalized.

Open Knowledge Graph Canonicalization Relation


MGeo: Multi-Modal Geographic Pre-Training Method

1 code implementation • 11 Jan 2023 • Ruixue Ding, Boli Chen, Pengjun Xie, Fei Huang, Xin Li, Qiang Zhang, Yao Xu

Single-modal PTMs can barely make use of the important GC and therefore have limited performance.

Language Modelling


Recall, Expand and Multi-Candidate Cross-Encode: Fast and Accurate Ultra-Fine Entity Typing

1 code implementation • 18 Dec 2022 • Chengyue Jiang, Wenyang Hui, Yong Jiang, Xiaobin Wang, Pengjun Xie, Kewei Tu

We also found MCCE is very effective in fine-grained (130 types) and coarse-grained (9 types) entity typing.

Ranked #2 on Entity Typing on Open Entity

Entity Typing Language Modelling +2


Modeling Label Correlations for Ultra-Fine Entity Typing with Neural Pairwise Conditional Random Field

1 code implementation • 3 Dec 2022 • Chengyue Jiang, Yong Jiang, Weiqi Wu, Pengjun Xie, Kewei Tu

We use mean-field variational inference for efficient type inference on very large type sets and unfold it as a neural network module to enable end-to-end training.

Ranked #3 on Entity Typing on Open Entity

Entity Typing Sentence +2


Named Entity and Relation Extraction with Multi-Modal Retrieval

1 code implementation • 3 Dec 2022 • Xinyu Wang, Jiong Cai, Yong Jiang, Pengjun Xie, Kewei Tu, Wei Lu

MoRe contains a text retrieval module and an image-based retrieval module, which retrieve related knowledge of the input text and image in the knowledge corpus respectively.

Ranked #1 on Multi-modal Named Entity Recognition on SNAP (MNER)

Multi-modal Named Entity Recognition Named Entity Recognition +4


Few-shot Classification with Hypersphere Modeling of Prototypes

no code implementations • 10 Nov 2022 • Ning Ding, Yulin Chen, Ganqu Cui, Xiaobin Wang, Hai-Tao Zheng, Zhiyuan Liu, Pengjun Xie

Moreover, it is more convenient to perform metric-based classification with hypersphere prototypes than statistical modeling, as we only need to calculate the distance from a data point to the surface of the hypersphere.

Classification Few-Shot Learning +1


Retrieval Oriented Masking Pre-training Language Model for Dense Passage Retrieval

1 code implementation • 27 Oct 2022 • Dingkun Long, Yanzhao Zhang, Guangwei Xu, Pengjun Xie

Pre-trained language models (PTMs) have been shown to yield powerful text representations for the dense passage retrieval task.

Language Modelling Masked Language Modeling +2


Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling

2 code implementations • 27 Oct 2022 • Peijie Jiang, Dingkun Long, Yanzhao Zhang, Pengjun Xie, Meishan Zhang, Min Zhang

We apply BABERT for feature induction of Chinese sequence labeling tasks.

Ranked #1 on Chinese Word Segmentation on MSRA

Chinese Named Entity Recognition Chinese Word Segmentation +3


Entity-to-Text based Data Augmentation for various Named Entity Recognition Tasks

no code implementations • 19 Oct 2022 • Xuming Hu, Yong Jiang, Aiwei Liu, Zhongqiang Huang, Pengjun Xie, Fei Huang, Lijie Wen, Philip S. Yu

Data augmentation techniques have been used to alleviate the problem of scarce labeled data in various NER tasks (flat, nested, and discontinuous NER tasks).

Data Augmentation named-entity-recognition +3


Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning

2 code implementations • 19 Oct 2022 • Hongqiu Wu, Ruixue Ding, Hai Zhao, Boli Chen, Pengjun Xie, Fei Huang, Min Zhang

Multiple pre-training objectives fill the vacancy of the understanding capability of single-objective language modeling, which serves the ultimate purpose of pre-trained language models (PrLMs), generalizing well on a mass of scenarios.

Language Modelling Meta-Learning


DAMO-NLP at NLPCC-2022 Task 2: Knowledge Enhanced Robust NER for Speech Entity Linking

1 code implementation • 27 Sep 2022 • Shen Huang, Yuchen Zhai, Xinwei Long, Yong Jiang, Xiaobin Wang, Yin Zhang, Pengjun Xie

Speech Entity Linking aims to recognize and disambiguate named entities in spoken languages.

Entity Linking named-entity-recognition +5


Domain-Specific NER via Retrieving Correlated Samples

1 code implementation • COLING 2022 • Xin Zhang, Yong Jiang, Xiaobin Wang, Xuming Hu, Yueheng Sun, Pengjun Xie, Meishan Zhang

Successful machine-learning-based Named Entity Recognition models can fail on texts from some special domains, for instance Chinese addresses and e-commerce titles, which require adequate background knowledge.

Named Entity Recognition


Adversarial Self-Attention for Language Understanding

1 code implementation • 25 Jun 2022 • Hongqiu Wu, Ruixue Ding, Hai Zhao, Pengjun Xie, Fei Huang, Min Zhang

Deep neural models (e.g., Transformer) naturally learn spurious features, which create a "shortcut" between the labels and inputs, thus impairing the generalization and robustness.

Ranked #1 on Machine Reading Comprehension on DREAM

Machine Reading Comprehension Named Entity Recognition (NER) +4


HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking

1 code implementation • 21 May 2022 • Yanzhao Zhang, Dingkun Long, Guangwei Xu, Pengjun Xie

Existing text retrieval systems with state-of-the-art performance usually adopt a retrieve-then-reranking architecture due to the high computational cost of pre-trained language models and the large corpus size.

Ranked #1 on Passage Re-Ranking on MS MARCO

Passage Ranking Passage Re-Ranking +2


Robust Self-Augmentation for Named Entity Recognition with Meta Reweighting

1 code implementation • NAACL 2022 • Linzhi Wu, Pengjun Xie, Jie Zhou, Meishan Zhang, Chunping Ma, Guangwei Xu, Min Zhang

Prior research has mainly resorted to heuristic rule-based constraints to reduce the noise for specific self-augmentation methods individually.

named-entity-recognition Named Entity Recognition +1


Parallel Instance Query Network for Named Entity Recognition

1 code implementation • ACL 2022 • Yongliang Shen, Xiaobin Wang, Zeqi Tan, Guangwei Xu, Pengjun Xie, Fei Huang, Weiming Lu, Yueting Zhuang

Each instance query predicts one entity, and by feeding all instance queries simultaneously, we can query all entities in parallel.

Ranked #1 on Nested Named Entity Recognition on GENIA

Chinese Named Entity Recognition named-entity-recognition +5


Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval

1 code implementation • 7 Mar 2022 • Dingkun Long, Qiong Gao, Kuan Zou, Guangwei Xu, Pengjun Xie, Ruijie Guo, Jian Xu, Guanjun Jiang, Luxi Xing, Ping Yang

We find that the performance of retrieval models trained on general-domain datasets inevitably decreases on specific domains.

Passage Retrieval Retrieval


DAMO-NLP at SemEval-2022 Task 11: A Knowledge-based System for Multilingual Named Entity Recognition

1 code implementation • SemEval (NAACL) 2022 • Xinyu Wang, Yongliang Shen, Jiong Cai, Tao Wang, Xiaobin Wang, Pengjun Xie, Fei Huang, Weiming Lu, Yueting Zhuang, Kewei Tu, Wei Lu, Yong Jiang

Our system wins 10 out of 13 tracks in the MultiCoNER shared task.

Multilingual Named Entity Recognition Named Entity Recognition +1


AISHELL-NER: Named Entity Recognition from Chinese Speech

1 code implementation • 17 Feb 2022 • Boli Chen, Guangwei Xu, Xiaobin Wang, Pengjun Xie, Meishan Zhang, Fei Huang

Named Entity Recognition (NER) from speech is among Spoken Language Understanding (SLU) tasks, aiming to extract semantic information from the speech signal.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5


Few-shot Learning with Big Prototypes

no code implementations • 29 Sep 2021 • Ning Ding, Yulin Chen, Xiaobin Wang, Hai-Tao Zheng, Zhiyuan Liu, Pengjun Xie

A big prototype could be effectively modeled by two sets of learnable parameters: one is the center of the hypersphere, which is an embedding with the same dimension as the training examples.

Few-Shot Learning


Prompt-Learning for Fine-Grained Entity Typing

no code implementations • 24 Aug 2021 • Ning Ding, Yulin Chen, Xu Han, Guangwei Xu, Pengjun Xie, Hai-Tao Zheng, Zhiyuan Liu, Juanzi Li, Hong-Gee Kim

In this work, we investigate the application of prompt-learning on fine-grained entity typing in fully supervised, few-shot and zero-shot scenarios.

Entity Typing Knowledge Probing +5


Counterfactual Inference for Text Classification Debiasing

1 code implementation • ACL 2021 • Chen Qian, Fuli Feng, Lijie Wen, Chunping Ma, Pengjun Xie

In inference, given a factual input document, Corsair imagines its two counterfactual counterparts to distill and mitigate the two biases captured by the poisonous model.

counterfactual Counterfactual Inference +3


Crowdsourcing Learning as Domain Adaptation: A Case Study on Named Entity Recognition

1 code implementation • ACL 2021 • Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Pengjun Xie

Crowdsourcing is regarded as one prospective solution for effective supervised learning, aiming to build large-scale annotated training data by crowd workers.

Domain Adaptation named-entity-recognition +3


Few-NERD: A Few-Shot Named Entity Recognition Dataset

7 code implementations • ACL 2021 • Ning Ding, Guangwei Xu, Yulin Chen, Xiaobin Wang, Xu Han, Pengjun Xie, Hai-Tao Zheng, Zhiyuan Liu

In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types.

Ranked #5 on Named Entity Recognition (NER) on Few-NERD (SUP)

Few-shot NER Named Entity Recognition


Probing BERT in Hyperbolic Spaces

1 code implementation • ICLR 2021 • Boli Chen, Yao Fu, Guangwei Xu, Pengjun Xie, Chuanqi Tan, Mosha Chen, Liping Jing

We introduce a Poincare probe, a structural probe projecting these embeddings into a Poincare subspace with explicitly defined hierarchies.

Word Embeddings


Prototypical Representation Learning for Relation Extraction

1 code implementation • ICLR 2021 • Ning Ding, Xiaobin Wang, Yao Fu, Guangwei Xu, Rui Wang, Pengjun Xie, Ying Shen, Fei Huang, Hai-Tao Zheng, Rui Zhang

This approach allows us to learn meaningful, interpretable prototypes for the final classification.

Few-Shot Learning Relation +3


Keyphrase Extraction with Dynamic Graph Convolutional Networks and Diversified Inference

no code implementations • 24 Oct 2020 • Haoyu Zhang, Dingkun Long, Guangwei Xu, Pengjun Xie, Fei Huang, Ji Wang

Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.

Decoder Keyphrase Extraction +1


Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation

1 code implementation • ACL 2020 • Ning Ding, Dingkun Long, Guangwei Xu, Muhua Zhu, Pengjun Xie, Xiaobin Wang, Hai-Tao Zheng

In order to simultaneously alleviate these two issues, this paper proposes to couple distant annotation and adversarial training for cross-domain CWS.

Chinese Word Segmentation Sentence


Hierarchy-Aware Global Model for Hierarchical Text Classification

no code implementations • ACL 2020 • Jie Zhou, Chunping Ma, Dingkun Long, Guangwei Xu, Ning Ding, Haoyu Zhang, Pengjun Xie, Gongshen Liu

Hierarchical text classification is an essential yet challenging subtask of multi-label text classification with a taxonomic hierarchy.

General Classification Hierarchical Multi-label Classification +3


A Neural Multi-digraph Model for Chinese NER with Gazetteers

1 code implementation • ACL 2019 • Ruixue Ding, Pengjun Xie, Xiaoyan Zhang, Wei Lu, Linlin Li, Luo Si

Gazetteers were shown to be useful resources for named entity recognition (NER).

named-entity-recognition Named Entity Recognition +1


Neural Chinese Address Parsing

no code implementations • NAACL 2019 • Hao Li, Wei Lu, Pengjun Xie, Linlin Li

This paper introduces a new task, Chinese address parsing: the task of mapping Chinese addresses into semantically meaningful chunks.

Structured Prediction


Better Modeling of Incomplete Annotations for Named Entity Recognition

no code implementations • NAACL 2019 • Zhanming Jie, Pengjun Xie, Wei Lu, Ruixue Ding, Linlin Li

Supervised approaches to named entity recognition (NER) are largely developed based on the assumption that the training data is fully annotated with named entity information.

named-entity-recognition Named Entity Recognition +1


DM_NLP at SemEval-2018 Task 12: A Pipeline System for Toponym Resolution

no code implementations • SEMEVAL 2019 • Xiaobin Wang, Chunping Ma, Huafei Zheng, Chu Liu, Pengjun Xie, Linlin Li, Luo Si

This paper describes DM-NLP's system for the toponym resolution task at SemEval 2019.

Toponym Resolution


DM_NLP at SemEval-2018 Task 8: neural sequence labeling with linguistic features

no code implementations • SEMEVAL 2018 • Chunping Ma, Huafei Zheng, Pengjun Xie, Chen Li, Linlin Li, Luo Si

This paper describes our submissions for SemEval-2018 Task 8: Semantic Extraction from CybersecUrity REports using NLP.

Chinese Word Segmentation Chunking +5


Alibaba at IJCNLP-2017 Task 1: Embedding Grammatical Features into LSTMs for Chinese Grammatical Error Diagnosis Task

no code implementations • IJCNLP 2017 • Yi Yang, Pengjun Xie, Jun Tao, Guangwei Xu, Linlin Li, Luo Si

This paper introduces Alibaba NLP team system on IJCNLP 2017 shared task No.

Ranked #1 on 2D Human Pose Estimation on Alibaba Cluster Trace (using extra training data)

2D Human Pose Estimation Position

