1 code implementation • EMNLP 2020 • Guangtao Zeng, Wenmian Yang, Zeqian Ju, Yue Yang, Sicheng Wang, Ruisi Zhang, Meng Zhou, Jiaqi Zeng, Xiangyu Dong, Ruoyu Zhang, Hongchao Fang, Penghui Zhu, Shu Chen, Pengtao Xie
We also study the transferability of models trained on MedDialog to low-resource medical dialogue generation tasks.
2 code implementations • 4 Apr 2024 • Longxu Dou, Qian Liu, Guangtao Zeng, Jia Guo, Jiahui Zhou, Wei Lu, Min Lin
We present Sailor, a family of open language models ranging from 0.5B to 7B parameters, tailored for South-East Asian (SEA) languages.
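A minimal usage sketch with the Hugging Face `transformers` pipeline; the repository id `sail/Sailor-0.5B` and the Indonesian prompt are assumptions for illustration, not details taken from the excerpt above.

```python
# Minimal sketch, assuming the Sailor checkpoints are hosted on the
# Hugging Face Hub (the repo id "sail/Sailor-0.5B" is an assumption).
from transformers import pipeline

generator = pipeline("text-generation", model="sail/Sailor-0.5B")

# Indonesian prompt: "The capital of Indonesia is"
print(generator("Ibu kota Indonesia adalah", max_new_tokens=20)[0]["generated_text"])
```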
2 code implementations • 4 Jan 2024 • Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu
We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs.
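A minimal loading-and-generation sketch, assuming the chat checkpoint published on the Hugging Face Hub as `TinyLlama/TinyLlama-1.1B-Chat-v1.0`:

```python
# Minimal sketch: load TinyLlama in fp16 and generate a short completion.
# The checkpoint name is an assumption based on the public Hub release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16)

ids = tok("The capital of Singapore is", return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```

At 1.1B parameters the fp16 weights occupy roughly 2.2 GB, which is what makes the model practical on a single consumer GPU.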
1 code implementation • 23 Oct 2023 • Yifan Hou, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut, Mrinmaya Sachan
We show that MechanisticProbe can recover the reasoning tree from the model's attentions for most examples, suggesting that in many cases the LM does carry out a multi-step reasoning process within its architecture.
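For context, such a probe is trained on the model's raw attention maps. The sketch below (not the paper's MechanisticProbe itself, and using GPT-2 purely as a stand-in model) shows how those per-layer attention tensors are extracted with `transformers`:

```python
# Minimal sketch of pulling per-layer attention maps, the signal an
# attention-based probe would consume. Model choice is illustrative.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_attentions=True)

inputs = tok("If A implies B and B implies C, then A implies C.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions: one (batch, heads, seq, seq) tensor per layer.
for layer, attn in enumerate(out.attentions):
    print(f"layer {layer}: attention shape {tuple(attn.shape)}")
```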
1 code implementation • 28 May 2023 • Guangtao Zeng, Peiyuan Zhang, Wei Lu
Fine-tuning pre-trained language models for multiple tasks tends to be expensive in storage, since each task typically keeps its own full copy of the model's parameters.
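A back-of-envelope sketch of that storage cost: full per-task copies versus one shared backbone plus a small per-task module. All concrete numbers below are illustrative assumptions, not figures from the paper.

```python
# Illustrative storage arithmetic (assumed numbers, not the paper's).
params = 1.1e9        # parameters in a ~1B model
bytes_per_param = 2   # fp16
tasks = 20
adapter_frac = 0.005  # assumed per-task trainable share (0.5%)

full_gb = params * bytes_per_param * tasks / 1e9
shared_gb = params * bytes_per_param * (1 + tasks * adapter_frac) / 1e9
print(f"{tasks} full fine-tuned copies: {full_gb:.0f} GB")
print(f"shared backbone + per-task modules: {shared_gb:.1f} GB")
```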
1 code implementation • 23 Oct 2022 • Guangtao Zeng, Wei Lu
Training a good deep learning model requires substantial data and computing resources, making the resulting neural model valuable intellectual property.
1 code implementation • ACL 2021 • Meng Zhou, Zechen Li, Bowen Tan, Guangtao Zeng, Wenmian Yang, Xuehai He, Zeqian Ju, Subrato Chakravorty, Shu Chen, Xingyi Yang, Yichen Zhang, Qingyang Wu, Zhou Yu, Kun Xu, Eric Xing, Pengtao Xie
Training complex dialog generation models on small datasets bears a high risk of overfitting.
1 code implementation • 11 May 2020 • Wenmian Yang, Guangtao Zeng, Bowen Tan, Zeqian Ju, Subrato Chakravorty, Xuehai He, Shu Chen, Xingyi Yang, Qingyang Wu, Zhou Yu, Eric Xing, Pengtao Xie
On these two datasets, we train several dialogue generation models based on Transformer, GPT, and BERT-GPT.
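To make the training setup concrete, here is a minimal single-step sketch of fine-tuning a GPT-style decoder on a dialogue exchange; GPT-2 and the toy patient/doctor turn are stand-ins, not the paper's exact data or configuration.

```python
# Minimal sketch: one causal-LM training step on a toy dialogue turn.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

dialogue = "Patient: I have a sore throat. Doctor: How long has it lasted?"
batch = tok(dialogue, return_tensors="pt")

# Standard causal-LM objective: predict each next token of the dialogue.
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()
print(f"training loss: {loss.item():.3f}")
```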