no code implementations • 1 Apr 2024 • Deqing Fu, Ghazal Khalighinejad, Ollie Liu, Bhuwan Dhingra, Dani Yogatama, Robin Jia, Willie Neiswanger
Current foundation models exhibit impressive capabilities when prompted either with text only or with both image and text inputs.
no code implementations • 4 Feb 2024 • Ollie Liu, Deqing Fu, Dani Yogatama, Willie Neiswanger
Large language models (LLMs) are increasingly used across society, including in domains like business, engineering, and medicine.
no code implementations • 16 Nov 2023 • Ting-Rui Chiang, Xinyan Velocity Yu, Joshua Robinson, Ollie Liu, Isabelle Lee, Dani Yogatama
Augmenting a language model (LM) with $k$-nearest-neighbor ($k$NN) retrieval over its own training data can decrease its perplexity, though the underlying reasons for this remain elusive.
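As a rough illustration of the retrieval-augmentation setup this entry studies, here is a toy $k$NN-LM-style interpolation: the LM's next-token distribution is mixed with a distribution built from the $k$ nearest stored (hidden-state key, next-token value) pairs. Function names, the grid of defaults, and the distance choice are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def knn_lm_next_token_probs(p_lm, query, keys, values, vocab_size,
                            k=3, temp=1.0, lam=0.25):
    """Toy kNN-LM-style sketch (hypothetical names/defaults):
    interpolate an LM's next-token distribution with a kNN distribution
    over (hidden-state key, next-token value) pairs from training data."""
    # Euclidean distances from the query hidden state to all stored keys
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # Softmax over negative distances of the k nearest neighbors
    logits = -dists[nearest] / temp
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Aggregate neighbor weights onto their recorded next tokens
    p_knn = np.zeros(vocab_size)
    for w, idx in zip(weights, nearest):
        p_knn[values[idx]] += w
    # Final distribution: (1 - lam) * LM + lam * kNN
    return (1 - lam) * p_lm + lam * p_knn
```

The interpolation weight `lam` controls how much probability mass the retrieval component contributes; the paper's question is why this mixture helps at all.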
1 code implementation • 12 Oct 2023 • Xianghao Kong, Ollie Liu, Han Li, Dani Yogatama, Greg Ver Steeg
For diffusion models, we show that a natural non-negative decomposition of mutual information emerges, allowing us to quantify informative relationships between words and pixels in an image.
1 code implementation • 3 May 2023 • Ghazal Khalighinejad, Ollie Liu, Sam Wiseman
We investigate the ability of transformer models to approximate the CKY algorithm, using them to directly predict a sentence's parse and thus avoid the CKY algorithm's cubic dependence on sentence length.
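For context on the cubic cost this entry mentions, a minimal CKY recognizer for a grammar in Chomsky normal form is sketched below: the dynamic program loops over span length, span start, and split point, giving $O(n^3)$ in sentence length $n$. The grammar encoding here is an illustrative assumption, not the paper's setup.

```python
from collections import defaultdict

def cky_recognize(words, lexicon, binary_rules, start="S"):
    """Toy CKY recognition for a CNF grammar (illustrative encoding).
    The triple nested loop over (span length, start, split) is the
    O(n^3) dependence on sentence length that motivates the paper."""
    n = len(words)
    # chart[(i, j)]: set of nonterminals that can span words[i:j]
    chart = defaultdict(set)
    for i, w in enumerate(words):
        chart[(i, i + 1)] |= lexicon.get(w, set())
    for span in range(2, n + 1):          # span length
        for i in range(n - span + 1):     # span start
            j = i + span
            for k in range(i + 1, j):     # split point
                for (B, C), heads in binary_rules.items():
                    if B in chart[(i, k)] and C in chart[(k, j)]:
                        chart[(i, j)] |= heads
    return start in chart[(0, n)]
```

A transformer that predicts the parse directly, as the entry describes, avoids filling this chart explicitly.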
1 code implementation • NeurIPS 2023 • Michael Hanna, Ollie Liu, Alexandre Variengien
Concretely, we use mechanistic interpretability techniques to explain the (limited) mathematical abilities of GPT-2 small.