1 code implementation • 26 Oct 2023 • Kaiser Sun, Adina Williams, Dieuwke Hupkes
NLP models have progressed drastically in recent years, according to the numerous datasets proposed to evaluate their performance.
1 code implementation • 19 Dec 2022 • Kaiser Sun, Peng Qi, Yuhao Zhang, Lan Liu, William Yang Wang, Zhiheng Huang
We show that, with consistent tokenization, the model performs better on both in-domain and out-of-domain datasets, with a notable average gain of +1.7 F2 when a BART model is trained on SQuAD and evaluated on 8 QA datasets.
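A minimal sketch of the inconsistency this paper targets, assuming a Hugging Face BPE tokenizer (the `facebook/bart-base` checkpoint and the example sentence are illustrative, not taken from the paper): encoding an answer string in isolation can produce different subword ids than the same span carries inside the context, whereas slicing the context encoding by character offsets keeps the target consistent with the input.

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/bart-base")

context = "The capital of France is Paris."
answer = "Paris"

# Inconsistent: encode the answer on its own, dropping the leading space
# it has inside the context, so BPE may merge it differently.
isolated_ids = tok(answer, add_special_tokens=False).input_ids

# Consistent: encode the full context once, then keep the ids whose
# character offsets fall inside the answer span.
enc = tok(context, add_special_tokens=False, return_offsets_mapping=True)
start = context.index(answer)
end = start + len(answer)
span_ids = [i for i, (s, e) in zip(enc.input_ids, enc.offset_mapping)
            if s >= start and e <= end]

print(isolated_ids)  # ids for 'Paris' encoded with no preceding space
print(span_ids)      # ids the model actually sees inside the context
```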
no code implementations • 6 Oct 2022 • Dieuwke Hupkes, Mario Giulianelli, Verna Dankers, Mikel Artetxe, Yanai Elazar, Tiago Pimentel, Christos Christodoulopoulos, Karim Lasri, Naomi Saphra, Arabella Sinclair, Dennis Ulmer, Florian Schottmann, Khuyagbaatar Batsuren, Kaiser Sun, Koustuv Sinha, Leila Khalatbari, Maria Ryskina, Rita Frieske, Ryan Cotterell, Zhijing Jin
We present a taxonomy for characterising and understanding generalisation research in NLP.
1 code implementation • Findings (ACL) 2021 • Kaiser Sun, Ana Marasović
An attention matrix of a transformer self-attention sublayer can provably be decomposed into two components, and only one of them (effective attention) contributes to the model output.
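For intuition, a minimal numpy sketch of that decomposition, assuming the definition used in this line of work: the component of each attention row that lies in the left null space of the value matrix V is annihilated by V, so only the projection onto the column space of V (effective attention) reaches the output. The dimensions and random matrices below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 4                           # seq length > value dim, so V has a nontrivial left null space
A = rng.random((n, n))
A /= A.sum(axis=1, keepdims=True)     # row-stochastic attention matrix
V = rng.standard_normal((n, d))       # value matrix

P = V @ np.linalg.pinv(V)             # orthogonal projector onto col(V)
A_eff = A @ P                         # effective attention
A_null = A - A_eff                    # rows lie in the left null space of V

assert np.allclose(A_null @ V, 0)     # this component never reaches the output
assert np.allclose(A_eff @ V, A @ V)  # effective attention alone reproduces A @ V
```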