1 code implementation • 12 Apr 2024 • Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou
The quadratic complexity and weak length extrapolation of Transformers limit their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy.
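The complexity gap is easy to see in code. Below is a minimal sketch (illustrative only, not the Megalodon architecture from this paper) contrasting softmax attention, which materializes an n×n score matrix, with kernelized linear attention, which reassociates the matrix product to run in time linear in sequence length:

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: the (n, n) score matrix makes this O(n^2 * d).
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Kernelized attention: computing phi(K)^T V first (a d x d matrix)
    # avoids the n x n score matrix entirely -- O(n * d^2) overall.
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                                  # (d, d)
    Z = Qp @ Kp.sum(axis=0, keepdims=True).T       # (n, 1) normalizer
    return (Qp @ KV) / Z

n, d = 4096, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (4096, 64), no (n, n) buffer
```

The reassociation is exact only for the chosen feature map, which is the trade-off the abstract points to: such approximations have historically lagged softmax attention in quality.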
no code implementations • 27 Sep 2023 • Emanuele Aiello, Lili Yu, Yixin Nie, Armen Aghajanyan, Barlas Oguz
In recent years, advances in the large-scale pretraining of language and text-to-image models have revolutionized the field of machine learning.
1 code implementation • 5 Sep 2023 • Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, Luke Zettlemoyer, Armen Aghajanyan
It is also a general-purpose model that can do both text-to-image and image-to-text generation, allowing us to introduce self-contained contrastive decoding methods that produce high-quality outputs.
Ranked #2 on Text-to-Image Generation on MS COCO
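As a rough illustration of what contrastive decoding can look like when a single model serves both roles, here is a generic guidance-style reweighting of logits; the formula and weight are illustrative, not this paper's exact method:

```python
import numpy as np

def contrastive_scores(cond_logits, uncond_logits, alpha=1.0):
    """Boost tokens the conditioned pass prefers over the unconditioned one.

    Both inputs are per-token logits from the SAME model, run with and
    without the conditioning prefix (hence "self-contained"); alpha is
    an illustrative guidance weight.
    """
    return cond_logits + alpha * (cond_logits - uncond_logits)

vocab_size = 8
cond = np.random.randn(vocab_size)
uncond = np.random.randn(vocab_size)
next_token = int(np.argmax(contrastive_scores(cond, uncond, alpha=1.5)))
```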
5 code implementations • NeurIPS 2023 • Chunting Zhou, PengFei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy
Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large-scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences.
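Schematically, the two stages share the same next-token objective and differ mainly in the data and the loss masking. The sketch below is illustrative; the model, optimizer, and batch fields are hypothetical stand-ins, not the paper's training code:

```python
import torch
import torch.nn.functional as F

def pretrain_step(model, batch, optimizer):
    # Stage 1: next-token prediction on raw text.
    logits = model(batch["input_ids"][:, :-1])
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        batch["input_ids"][:, 1:].reshape(-1),
    )
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

def instruction_tune_step(model, batch, optimizer):
    # Stage 2: same objective on (prompt, response) pairs, with the
    # loss masked to response tokens only.
    logits = model(batch["input_ids"][:, :-1])
    labels = batch["input_ids"][:, 1:].clone()
    labels[~batch["response_mask"][:, 1:]] = -100   # ignore prompt tokens
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        labels.reshape(-1),
        ignore_index=-100,
    )
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```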
no code implementations • NeurIPS 2023 • Lili Yu, Dániel Simig, Colin Flaherty, Armen Aghajanyan, Luke Zettlemoyer, Mike Lewis
Autoregressive transformers are spectacular models for short sequences but scale poorly to long sequences such as high-resolution images, podcasts, code, or books.
no code implementations • 4 May 2023 • Xilun Chen, Lili Yu, Wenhan Xiong, Barlas Oğuz, Yashar Mehdad, Wen-tau Yih
We propose a new two-stage pre-training framework for video-to-text generation tasks such as video captioning and video question answering: A generative encoder-decoder model is first jointly pre-trained on massive image-text data to learn fundamental vision-language concepts, and then adapted to video data in an intermediate video-text pre-training stage to learn video-specific skills such as spatio-temporal reasoning.
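The curriculum can be summarized in a few lines. This is a schematic sketch only; the model API and data loaders are hypothetical stand-ins for the paper's actual framework:

```python
def pretrain_video_to_text(model, image_text_data, video_text_data, opt):
    # Stage 1: image-text pretraining teaches core vision-language
    # grounding; each image is treated as a single-frame "clip".
    for images, texts in image_text_data:
        loss = model.generation_loss(frames=images.unsqueeze(1), text=texts)
        loss.backward()
        opt.step()
        opt.zero_grad()

    # Stage 2: intermediate video-text pretraining adds video-specific
    # skills (e.g. spatio-temporal reasoning) on multi-frame clips.
    for clips, texts in video_text_data:   # clips: (B, T, C, H, W)
        loss = model.generation_loss(frames=clips, text=texts)
        loss.backward()
        opt.step()
        opt.zero_grad()
```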
no code implementations • 10 Jan 2023 • Armen Aghajanyan, Lili Yu, Alexis Conneau, Wei-Ning Hsu, Karen Hambardzumyan, Susan Zhang, Stephen Roller, Naman Goyal, Omer Levy, Luke Zettlemoyer
To better understand the scaling properties of such mixed-modal models, we conducted over 250 experiments using seven different modalities and model sizes ranging from 8 million to 30 billion parameters, trained on 5-100 billion tokens.
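Sweeps like this are typically summarized by fitting power laws of loss against model size. A toy example of such a fit, with fabricated losses (the paper's actual measurements and functional form may differ):

```python
import numpy as np

params = np.array([8e6, 1e8, 1e9, 6e9, 30e9])   # model sizes (parameters)
loss   = np.array([4.1, 3.3, 2.8, 2.5, 2.3])    # fictional eval losses

# Fit log L = intercept + slope * log N, i.e. L(N) ~ a * N^slope
# (ignoring token count and any irreducible loss term, for brevity).
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
print(f"L(N) ≈ {np.exp(intercept):.2f} * N^({slope:.3f})")
```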
no code implementations • 19 Dec 2022 • Asish Ghoshal, Arash Einolghozati, Ankit Arun, Haoran Li, Lili Yu, Vera Gor, Yashar Mehdad, Scott Wen-tau Yih, Asli Celikyilmaz
Lack of factual correctness is an issue that still plagues state-of-the-art summarization systems despite their impressive progress on generating seemingly fluent summaries.
no code implementations • NAACL 2021 • Darsh Shah, Lili Yu, Tao Lei, Regina Barzilay
We present a method for generating comparative summaries that highlight similarities and contradictions in input documents.
2 code implementations • 8 Apr 2021 • Darsh J Shah, Lili Yu, Tao Lei, Regina Barzilay
We present a method for generating comparative summaries that highlight similarities and contradictions in input documents.
1 code implementation • 22 Mar 2021 • Darsh J Shah, Lili Yu, Tao Lei, Regina Barzilay
We introduce Nutri-bullets, a multi-document summarization task for health and nutrition.
1 code implementation • ACL 2020 • Kyle Swanson, Lili Yu, Tao Lei
Selecting input features of top relevance has become a popular method for building self-explaining models.
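In its simplest form, this style of self-explanation keeps only the k highest-relevance input tokens as the rationale. A toy sketch of that general approach (the token scores are fabricated, and this is not the paper's method):

```python
import numpy as np

def top_k_rationale(tokens, scores, k=3):
    # Keep the k highest-scoring tokens, in original order, as the
    # "explanation" for the model's prediction.
    idx = np.argsort(scores)[-k:]
    return [tokens[i] for i in sorted(idx)]

tokens = "the food was cold but the staff was lovely".split()
scores = np.array([0.1, 0.7, 0.2, 0.9, 0.3, 0.1, 0.6, 0.2, 0.95])
print(top_k_rationale(tokens, scores))  # ['food', 'cold', 'lovely']
```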
1 code implementation • ACL 2020 • Lili Yu, Howard Chen, Sida Wang, Tao Lei, Yoav Artzi
We study the potential for interaction in natural language classification.
no code implementations • WS 2019 • Kyle Swanson, Lili Yu, Christopher Fox, Jeremy Wohlwend, Tao Lei
Response suggestion is an important task for building human-computer conversation systems.
no code implementations • 10 Sep 2015 • Liang Liu, Lili Yu
The primary goal of this study is to provide quantitative evidence of the evolutionary linkages, with emphasis on character usage, among different period genres of classical Chinese poetry.