Joint Entity and Relation Extraction from Scientific Documents: Role of Linguistic Information and Entity Types

Scientific articles contain various types of domain-specific entities and relations between them. These entities and relations succinctly capture important information about the topic of a document, and hence they are crucial to the understanding and automatic analysis of such documents. In this paper, we aim to automatically extract entities and relations from a scientific abstract using a deep neural model. Given an input sentence, we use a pretrained transformer to produce contextual embeddings of the tokens, which are then enriched with embeddings of their part-of-speech (POS) tags. A sequence of enriched token representations forms a span, and entities and relations are jointly learned over spans. Entity logits predicted by the entity classifier are used as features in the relation classifier. Our proposed model improves upon competitive baselines in the literature for entity and relation extraction on the SciERC and ADE datasets.
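The data flow described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: random vectors stand in for the pretrained transformer output, random weights stand in for trained classifiers, max-pooling over a span is an assumed choice the abstract does not specify, and the label sets, dimensions, and all function names are hypothetical.

```python
import random

random.seed(0)

TOKEN_DIM, POS_DIM = 8, 4                    # toy sizes, assumed for illustration
ENTITY_TYPES = ["None", "Task", "Method"]    # illustrative subset of labels
RELATION_TYPES = ["None", "Used-for"]        # illustrative subset of labels
POS_TAGS = ["PRON", "VERB", "NOUN", "ADP"]


def random_vector(dim):
    return [random.uniform(-1.0, 1.0) for _ in range(dim)]


def linear(in_dim, out_dim):
    """Random weight matrix standing in for a trained linear layer."""
    return [random_vector(in_dim) for _ in range(out_dim)]


def apply_linear(weights, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


# One embedding per POS tag (learned in a real model, random here).
pos_table = {tag: random_vector(POS_DIM) for tag in POS_TAGS}


def enrich(token_vecs, pos_tags):
    """Concatenate each contextual token vector with its POS-tag embedding."""
    return [vec + pos_table[tag] for vec, tag in zip(token_vecs, pos_tags)]


def span_repr(enriched, start, end):
    """Max-pool the enriched tokens in [start, end) (assumed pooling choice)."""
    return [max(column) for column in zip(*enriched[start:end])]


SPAN_DIM = TOKEN_DIM + POS_DIM
entity_weights = linear(SPAN_DIM, len(ENTITY_TYPES))
relation_weights = linear(2 * (SPAN_DIM + len(ENTITY_TYPES)), len(RELATION_TYPES))


def entity_logits(span_vec):
    return apply_linear(entity_weights, span_vec)


def relation_logits(head_vec, tail_vec):
    """Relation classifier that also sees the entity logits of both spans."""
    features = (head_vec + entity_logits(head_vec)
                + tail_vec + entity_logits(tail_vec))
    return apply_linear(relation_weights, features)


tokens = ["We", "use", "transformers", "for", "extraction"]
pos_tags = ["PRON", "VERB", "NOUN", "ADP", "NOUN"]
token_vecs = [random_vector(TOKEN_DIM) for _ in tokens]  # stand-in for transformer output

enriched = enrich(token_vecs, pos_tags)
head = span_repr(enriched, 2, 3)   # span covering "transformers"
tail = span_repr(enriched, 4, 5)   # span covering "extraction"
head_entity = entity_logits(head)
pair_relation = relation_logits(head, tail)
```

The point of the sketch is the wiring: POS embeddings are appended before spans are formed, and the relation classifier receives the entity logits of both candidate spans in addition to their span representations.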

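| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Relation Extraction | Adverse Drug Events (ADE) Corpus | SpERT.PL (without overlap and BioBERT) | RE+ Macro F1 | 82.39 | # 5 |
| Relation Extraction | Adverse Drug Events (ADE) Corpus | SpERT.PL (without overlap and BioBERT) | NER Macro F1 | 91.14 | # 5 |
| Relation Extraction | Adverse Drug Events (ADE) Corpus | SpERT.PL (with overlap and BioBERT) | RE+ Macro F1 | 82.03 | # 7 |
| Relation Extraction | Adverse Drug Events (ADE) Corpus | SpERT.PL (with overlap and BioBERT) | NER Macro F1 | 91.17 | # 4 |
| Joint Entity and Relation Extraction | SciERC | SpERT.PL (SciBERT) | Entity F1 | 70.53 | # 1 |
| Joint Entity and Relation Extraction | SciERC | SpERT.PL (SciBERT) | Relation F1 | 51.25 | # 3 |
| Joint Entity and Relation Extraction | SciERC | SpERT.PL (SciBERT) | Cross Sentence | No | # 1 |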
