Search Results for author: Zhan Su

Found 6 papers, 5 papers with code

Towards Modular LLMs by Building and Reusing a Library of LoRAs

no code implementations • 18 May 2024 • Oleksiy Ostapenko, Zhan Su, Edoardo Maria Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni

The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks.
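As a concrete illustration, here is a minimal sketch of reusing such a library: score each stored adapter against a new input and apply the best match. The `Library` class and its norm-based scoring rule are assumptions of this sketch, not the routing method proposed in the paper.

```python
import torch

class Library:
    """A library of trained LoRA adapters stored as low-rank (A, B) factors."""

    def __init__(self, adapters):
        # adapters: dict mapping a task name to {"A": (r, d), "B": (d, r)}
        self.adapters = adapters

    def route(self, x):
        # Score each adapter by the magnitude of the update it would apply
        # to x, i.e. ||B @ A @ x||: a crude relevance proxy assumed for this
        # sketch, not the paper's routing rule.
        scores = {
            name: (ad["B"] @ (ad["A"] @ x)).norm().item()
            for name, ad in self.adapters.items()
        }
        return max(scores, key=scores.get)

# Usage: route an input among three rank-4 adapters for a 16-dim layer.
d, r = 16, 4
lib = Library({
    f"task_{i}": {"A": torch.randn(r, d), "B": torch.randn(d, r)}
    for i in range(3)
})
print(lib.route(torch.randn(d)))  # name of the selected adapter
```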

Language Modeling Using Tensor Trains

1 code implementation • 7 May 2024 • Zhan Su, Yuqin Zhou, Fengran Mo, Jakob Grue Simonsen

We propose a novel tensor network language model based on the simplest tensor network (i.e., tensor trains), called the "Tensor Train Language Model" (TTLM).

Language Modelling
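A minimal sketch of the tensor-train idea behind TTLM, under simplifying assumptions (a single shared core, random boundary vectors, illustrative dimensions that are not the exact TTLM parameterization): a sequence is scored by contracting a small "bond" vector through one core slice per token.

```python
import torch

vocab_size, bond_dim = 10, 8
core = torch.randn(bond_dim, vocab_size, bond_dim) * 0.1  # TT core: (left, token, right)
left = torch.randn(bond_dim)    # left boundary vector
right = torch.randn(bond_dim)   # right boundary vector

def tt_score(token_ids):
    """Contract the train left to right; returns a scalar sequence score."""
    h = left
    for t in token_ids:
        # h_new[j] = sum_i h[i] * core[i, t, j]
        h = h @ core[:, t, :]
    return (h @ right).item()

print(tt_score([1, 4, 2, 7]))
```

The memory cost is linear in sequence length and quadratic in the bond dimension, which is what makes the tensor-train format the simplest of the tensor networks.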

History-Aware Conversational Dense Retrieval

1 code implementation • 30 Jan 2024 • Fengran Mo, Chen Qu, Kelong Mao, Tianyu Zhu, Zhan Su, Kaiyu Huang, Jian-Yun Nie

To address the aforementioned issues, we propose a History-Aware Conversational Dense Retrieval (HAConvDR) system, which incorporates two ideas: context-denoised query reformulation and automatic mining of supervision signals based on the actual impact of historical turns.

Conversational Search, Information Retrieval +1
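As a toy illustration of context-denoised reformulation (not the HAConvDR system itself), one can keep only the historical turns whose embedding is close enough to the current query before concatenating; here `embed` is a hashing stand-in for a real dense encoder, and the threshold is an arbitrary assumption.

```python
import hashlib
import torch
import torch.nn.functional as F

def embed(text, dim=64):
    # Deterministic toy embedding via hashing; a stand-in for a dense encoder.
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "little") % (2**63)
    g = torch.Generator().manual_seed(seed)
    return F.normalize(torch.randn(dim, generator=g), dim=0)

def reformulate(query, history, threshold=0.0):
    """Drop history turns dissimilar to the query, then concatenate."""
    q = embed(query)
    kept = [turn for turn in history if (embed(turn) @ q).item() > threshold]
    return " ".join(kept + [query])

history = ["how do tensor trains work", "what about their memory cost"]
print(reformulate("and their use in language modeling", history))
```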

Multi-Head Adapter Routing for Cross-Task Generalization

1 code implementation • NeurIPS 2023 • Lucas Caccia, Edoardo Ponti, Zhan Su, Matheus Pereira, Nicolas Le Roux, Alessandro Sordoni

We find that routing is most beneficial during multi-task pre-training rather than during few-shot adaptation and propose $\texttt{MHR}$-$\mu$, which discards routing and fine-tunes the average of the pre-trained adapters on each downstream task.
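A minimal sketch of the "average, then fine-tune" initialization the abstract describes, assuming adapters stored as simple name-to-tensor dicts (a hypothetical format):

```python
import torch

def average_adapters(adapters):
    """Uniformly average adapters given as name -> tensor state dicts."""
    keys = adapters[0].keys()
    return {k: torch.stack([a[k] for a in adapters]).mean(dim=0) for k in keys}

# Usage: three rank-4 LoRA adapters for a 16-dim linear layer; the averaged
# weights would then serve as the initialization for per-task fine-tuning.
d, r = 16, 4
adapters = [
    {"lora_A": torch.randn(r, d), "lora_B": torch.randn(d, r)}
    for _ in range(3)
]
init = average_adapters(adapters)
```

Averaging parameters directly is only meaningful when the adapters share the same architecture and shapes, which joint multi-task pre-training ensures here.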

A Generalized Language Model in Tensor Space

1 code implementation • 31 Jan 2019 • Lipeng Zhang, Peng Zhang, Xindian Ma, Shuqin Gu, Zhan Su, Dawei Song

Theoretically, we prove that such tensor representation is a generalization of the n-gram language model.

Language Modelling, Tensor Decomposition +1
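In illustrative notation (an assumption of this sketch, not necessarily the paper's exact formulation), the relationship can be stated as follows: arrange all length-$n$ sequence probabilities as an order-$n$ tensor over the vocabulary $V$,

```latex
p(w_1, \dots, w_n) = \mathcal{T}_{w_1, \dots, w_n},
\qquad \mathcal{T} \in \mathbb{R}^{|V| \times |V| \times \cdots \times |V|}
```

An n-gram model stores the entries of $\mathcal{T}$ directly, while a tensor-space model represents the same tensor in decomposed (e.g., low-rank) form; the n-gram model is then recovered as the special case with no decomposition constraint.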

A Quantum Many-body Wave Function Inspired Language Modeling Approach

1 code implementation • 28 Aug 2018 • Peng Zhang, Zhan Su, Lipeng Zhang, Benyou Wang, Dawei Song

The recently proposed quantum language model (QLM) aims at a principled approach to modeling term dependency by applying quantum probability theory.

Language Modelling, Question Answering +2
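A minimal sketch of the quantum-probability idea QLM builds on, with random placeholder vectors standing in for real term representations: a text is modeled as a density matrix (a mixture of projectors onto unit-norm term vectors), and the probability assigned to an event is the trace of the density matrix times that event's projector.

```python
import torch

def density_matrix(term_vecs, probs):
    """rho = sum_i p_i |t_i><t_i| over unit-norm term vectors t_i."""
    dim = term_vecs.shape[1]
    rho = torch.zeros(dim, dim)
    for v, p in zip(term_vecs, probs):
        v = v / v.norm()
        rho += p * torch.outer(v, v)
    return rho

def quantum_prob(rho, v):
    """Pr(event) = tr(rho |v><v|) = <v|rho|v> for a unit vector v."""
    v = v / v.norm()
    return (v @ rho @ v).item()

terms = torch.randn(3, 8)                  # three term vectors of dimension 8
rho = density_matrix(terms, [0.5, 0.3, 0.2])
print(quantum_prob(rho, torch.randn(8)))   # probability of a new "event"
```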
