no code implementations • Findings (EMNLP) 2021 • Tianda Li, Ahmad Rashid, Aref Jafari, Pranav Sharma, Ali Ghodsi, Mehdi Rezagholizadeh
Knowledge Distillation (KD) is a model compression algorithm that helps transfer the knowledge of a large neural network into a smaller one.
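Several of the entries below build on this basic setup, so here is a minimal sketch of a standard Hinton-style KD loss (not the specific method of any paper listed here); the temperature and weighting values are illustrative hyperparameters.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Blend the usual cross-entropy with a soft-target term from the teacher."""
    # Soften both distributions with the temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between the softened teacher and student predictions.
    distill = F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    # Ordinary supervised loss on the hard labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1.0 - alpha) * ce
```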
1 code implementation • 31 Jul 2023 • Ehsan Kamalloo, Aref Jafari, Xinyu Zhang, Nandan Thakur, Jimmy Lin
In this paper, we introduce a new dataset, HAGRID (Human-in-the-loop Attributable Generative Retrieval for Information-seeking Dataset) for building end-to-end generative information-seeking models that are capable of retrieving candidate quotes and generating attributed explanations.
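As a rough illustration of the task the dataset targets, an example might pair a query with retrieved candidate quotes and an answer that cites them; the field names below are hypothetical and are not the actual HAGRID schema.

```python
# Illustrative record for attributable generative retrieval (hypothetical fields).
example = {
    "query": "When was the Eiffel Tower completed?",
    "quotes": [  # candidate quotes retrieved from the corpus
        {"id": "doc42#3", "text": "The Eiffel Tower was completed in 1889 ..."},
    ],
    # Attributed explanation: the generated answer cites the supporting quote.
    "answer": "The Eiffel Tower was completed in 1889 [doc42#3].",
}
```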
no code implementations • 27 Jan 2023 • Aref Jafari, Mehdi Rezagholizadeh, Ali Ghodsi
Augmenting the training set by adding this auxiliary data improves the performance of KD significantly and leads to a closer match between the student and the teacher.
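A minimal sketch of the general idea, assuming a simple augmentation scheme (interpolating existing inputs and labelling them with the teacher's soft predictions); this is only an illustration, not the exact auxiliary-data construction used in the paper.

```python
import torch

def build_auxiliary_batch(x, teacher, lam=0.5):
    # Mix each input with a shuffled partner to create new, in-between inputs.
    perm = torch.randperm(x.size(0))
    x_aux = lam * x + (1.0 - lam) * x[perm]
    with torch.no_grad():
        # The teacher labels the auxiliary inputs with soft targets.
        soft_targets = torch.softmax(teacher(x_aux), dim=-1)
    return x_aux, soft_targets

# The augmented set (original + auxiliary) is then trained with the usual KD
# objective, so the student also matches the teacher on inputs it would not
# otherwise see.
```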
no code implementations • 12 Dec 2022 • Aref Jafari, Ivan Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart, Ali Ghodsi
Knowledge Distillation (KD) has been extensively used for natural language understanding (NLU) tasks to improve a small model's (a student) generalization by transferring the knowledge from a larger model (a teacher).
no code implementations • 25 May 2022 • Ivan Kobyzev, Aref Jafari, Mehdi Rezagholizadeh, Tianda Li, Alan Do-Omri, Peng Lu, Pascal Poupart, Ali Ghodsi
Knowledge Distillation (KD) is a prominent neural model compression technique that heavily relies on teacher network predictions to guide the training of a student model.
no code implementations • COLING 2022 • Mehdi Rezagholizadeh, Aref Jafari, Puneeth Salad, Pranav Sharma, Ali Saheb Pasand, Ali Ghodsi
A case in point is that the best-performing checkpoint of the teacher might not necessarily be the best teacher for training the student in KD.
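One hedged way to read this observation in code: compare student dev scores after distilling from several teacher checkpoints instead of assuming the best-performing one distills best. The `distill` and `evaluate` helpers below are hypothetical stand-ins for a full training loop, not the paper's procedure.

```python
def select_teacher_checkpoint(checkpoints, student_init, train_data, dev_data):
    """Pick the teacher checkpoint whose distilled student scores highest on dev."""
    best_ckpt, best_score = None, float("-inf")
    for ckpt in checkpoints:
        student = distill(student_init, teacher=ckpt, data=train_data)  # short KD run
        score = evaluate(student, dev_data)
        if score > best_score:
            best_ckpt, best_score = ckpt, score
    return best_ckpt
```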
1 code implementation • 13 Sep 2021 • Tianda Li, Ahmad Rashid, Aref Jafari, Pranav Sharma, Ali Ghodsi, Mehdi Rezagholizadeh
Knowledge Distillation (KD) is a model compression algorithm that helps transfer the knowledge of a large neural network into a smaller one.
1 code implementation • EACL 2021 • Aref Jafari, Mehdi Rezagholizadeh, Pranav Sharma, Ali Ghodsi
Knowledge distillation (KD) is a prominent model compression technique for deep neural networks in which the knowledge of a trained large teacher model is transferred to a smaller student model.
no code implementations • 30 Jun 2020 • Aref Jafari, Ali Ghodsi
This is accomplished by defining an embedding method for the positions of all members of a coreference cluster in a document and resolving all of them for a given mention.
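A minimal sketch of one way such a position embedding could look, assuming a fixed maximum document length and a binary occupancy mask over the tokens covered by the cluster's mentions; this is an illustration, not the paper's exact method.

```python
import torch
import torch.nn as nn

class ClusterPositionEmbedding(nn.Module):
    """Encode the positions of all mentions in one coreference cluster as a single vector."""

    def __init__(self, max_doc_len=512, dim=128):
        super().__init__()
        self.proj = nn.Linear(max_doc_len, dim)

    def forward(self, mention_spans):
        # mention_spans: list of (start, end) token indices, one per cluster member.
        mask = torch.zeros(self.proj.in_features)
        for start, end in mention_spans:
            mask[start:end] = 1.0  # mark tokens covered by each mention
        return self.proj(mask)  # one embedding summarising the whole cluster's positions
```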