1 code implementation • 12 Apr 2024 • Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou
The quadratic complexity and weak length extrapolation of Transformers limit their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy.
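The complexity gap is easy to see in code. Below is a minimal sketch (illustrative only, not the Megalodon architecture from this paper) contrasting softmax attention, which materializes an n×n score matrix, with kernelized linear attention, which reassociates the matrix product to run in time linear in sequence length:

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: the (n, n) score matrix makes this O(n^2 * d).
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Kernelized attention: computing phi(K)^T V first (a d x d matrix)
    # avoids the n x n score matrix entirely -- O(n * d^2) overall.
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                                  # (d, d)
    Z = Qp @ Kp.sum(axis=0, keepdims=True).T       # (n, 1) normalizer
    return (Qp @ KV) / Z

n, d = 4096, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (4096, 64), no (n, n) buffer
```

The reassociation is exact only for the chosen feature map, which is the trade-off the abstract points to: such approximations have historically lagged softmax attention in quality.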
no code implementations • 27 Sep 2023 • Emanuele Aiello, Lili Yu, Yixin Nie, Armen Aghajanyan, Barlas Oguz
In recent years, advances in the large-scale pretraining of language and text-to-image models have revolutionized the field of machine learning.
1 code implementation • 5 Sep 2023 • Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, Luke Zettlemoyer, Armen Aghajanyan
It is also a general-purpose model that can do both text-to-image and image-to-text generation, allowing us to introduce self-contained contrastive decoding methods that produce high-quality outputs.
Ranked #2 on Text-to-Image Generation on MS COCO
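As a rough illustration of what contrastive decoding can look like when a single model serves both roles, here is a generic guidance-style reweighting of logits; the formula and weight are illustrative, not this paper's exact method:

```python
import numpy as np

def contrastive_scores(cond_logits, uncond_logits, alpha=1.0):
    """Boost tokens the conditioned pass prefers over the unconditioned one.

    Both inputs are per-token logits from the SAME model, run with and
    without the conditioning prefix (hence "self-contained"); alpha is
    an illustrative guidance weight.
    """
    return cond_logits + alpha * (cond_logits - uncond_logits)

vocab_size = 8
cond = np.random.randn(vocab_size)
uncond = np.random.randn(vocab_size)
next_token = int(np.argmax(contrastive_scores(cond, uncond, alpha=1.5)))
```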
5 code implementations • NeurIPS 2023 • Chunting Zhou, PengFei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy
Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large-scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences.
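Schematically, the two stages share the same next-token objective and differ mainly in the data and the loss masking. The sketch below is illustrative; the model, optimizer, and batch fields are hypothetical stand-ins, not the paper's training code:

```python
import torch
import torch.nn.functional as F

def pretrain_step(model, batch, optimizer):
    # Stage 1: next-token prediction on raw text.
    logits = model(batch["input_ids"][:, :-1])
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        batch["input_ids"][:, 1:].reshape(-1),
    )
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

def instruction_tune_step(model, batch, optimizer):
    # Stage 2: same objective on (prompt, response) pairs, with the
    # loss masked to response tokens only.
    logits = model(batch["input_ids"][:, :-1])
    labels = batch["input_ids"][:, 1:].clone()
    labels[~batch["response_mask"][:, 1:]] = -100   # ignore prompt tokens
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        labels.reshape(-1),
        ignore_index=-100,
    )
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```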
no code implementations • NeurIPS 2023 • Lili Yu, Dániel Simig, Colin Flaherty, Armen Aghajanyan, Luke Zettlemoyer, Mike Lewis
Autoregressive transformers are spectacular models for short sequences but scale poorly to long sequences such as high-resolution images, podcasts, code, or books.
no code implementations • 4 May 2023 • Xilun Chen, Lili Yu, Wenhan Xiong, Barlas Oğuz, Yashar Mehdad, Wen-tau Yih
We propose a new two-stage pre-training framework for video-to-text generation tasks such as video captioning and video question answering: A generative encoder-decoder model is first jointly pre-trained on massive image-text data to learn fundamental vision-language concepts, and then adapted to video data in an intermediate video-text pre-training stage to learn video-specific skills such as spatio-temporal reasoning.
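The curriculum can be summarized in a few lines. This is a schematic sketch only; the model API and data loaders are hypothetical stand-ins for the paper's actual framework:

```python
def pretrain_video_to_text(model, image_text_data, video_text_data, opt):
    # Stage 1: image-text pretraining teaches core vision-language
    # grounding; each image is treated as a single-frame "clip".
    for images, texts in image_text_data:
        loss = model.generation_loss(frames=images.unsqueeze(1), text=texts)
        loss.backward()
        opt.step()
        opt.zero_grad()

    # Stage 2: intermediate video-text pretraining adds video-specific
    # skills (e.g. spatio-temporal reasoning) on multi-frame clips.
    for clips, texts in video_text_data:   # clips: (B, T, C, H, W)
        loss = model.generation_loss(frames=clips, text=texts)
        loss.backward()
        opt.step()
        opt.zero_grad()
```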
no code implementations • 10 Jan 2023 • Armen Aghajanyan, Lili Yu, Alexis Conneau, Wei-Ning Hsu, Karen Hambardzumyan, Susan Zhang, Stephen Roller, Naman Goyal, Omer Levy, Luke Zettlemoyer
To better understand the scaling properties of such mixed-modal models, we conducted over 250 experiments using seven different modalities and model sizes ranging from 8 million to 30 billion parameters, trained on 5-100 billion tokens.
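Sweeps like this are typically summarized by fitting power laws of loss against model size. A toy example of such a fit, with fabricated losses (the paper's actual measurements and functional form may differ):

```python
import numpy as np

params = np.array([8e6, 1e8, 1e9, 6e9, 30e9])   # model sizes (parameters)
loss   = np.array([4.1, 3.3, 2.8, 2.5, 2.3])    # fictional eval losses

# Fit log L = intercept + slope * log N, i.e. L(N) ~ a * N^slope
# (ignoring token count and any irreducible loss term, for brevity).
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
print(f"L(N) ≈ {np.exp(intercept):.2f} * N^({slope:.3f})")
```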
no code implementations • 19 Dec 2022 • Asish Ghoshal, Arash Einolghozati, Ankit Arun, Haoran Li, Lili Yu, Vera Gor, Yashar Mehdad, Scott Wen-tau Yih, Asli Celikyilmaz
Lack of factual correctness is an issue that still plagues state-of-the-art summarization systems despite their impressive progress on generating seemingly fluent summaries.
no code implementations • NAACL 2021 • Darsh Shah, Lili Yu, Tao Lei, Regina Barzilay
We present a method for generating comparative summaries that highlight similarities and contradictions in input documents.
2 code implementations • 8 Apr 2021 • Darsh J Shah, Lili Yu, Tao Lei, Regina Barzilay
We present a method for generating comparative summaries that highlight similarities and contradictions in input documents.
1 code implementation • 22 Mar 2021 • Darsh J Shah, Lili Yu, Tao Lei, Regina Barzilay
We introduce Nutri-bullets, a multi-document summarization task for health and nutrition.
1 code implementation • ACL 2020 • Kyle Swanson, Lili Yu, Tao Lei
Selecting input features of top relevance has become a popular method for building self-explaining models.
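In its simplest form, this style of self-explanation keeps only the k highest-relevance input tokens as the rationale. A toy sketch of that general approach (the token scores are fabricated, and this is not the paper's method):

```python
import numpy as np

def top_k_rationale(tokens, scores, k=3):
    # Keep the k highest-scoring tokens, in original order, as the
    # "explanation" for the model's prediction.
    idx = np.argsort(scores)[-k:]
    return [tokens[i] for i in sorted(idx)]

tokens = "the food was cold but the staff was lovely".split()
scores = np.array([0.1, 0.7, 0.2, 0.9, 0.3, 0.1, 0.6, 0.2, 0.95])
print(top_k_rationale(tokens, scores))  # ['food', 'cold', 'lovely']
```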
1 code implementation • ACL 2020 • Lili Yu, Howard Chen, Sida Wang, Tao Lei, Yoav Artzi
We study the potential for interaction in natural language classification.
no code implementations • WS 2019 • Kyle Swanson, Lili Yu, Christopher Fox, Jeremy Wohlwend, Tao Lei
Response suggestion is an important task for building human-computer conversation systems.
no code implementations • 10 Sep 2015 • Liang Liu, Lili Yu
The primary goal of this study is to provide quantitative evidence of the evolutionary linkages, with emphasis on character usage, among different period genres of classical Chinese poetry.